Data Collection – Types, Methods and Errors
By the time you’ve finished reading this blog post, you’ll know the different types of data, and why and how to collect them. You’ll also know the potential pitfalls along the way, and how to collect, store and remove personal data with Brosix. Let’s dig in.
Data collection definition
Data collection is the process of collecting and assessing vital information for your business, operation or research. Some of it is unavoidable, and some is voluntary, carried out as part of a goal-oriented task.
In the healthcare industry, for example, you have to collect and store various personal data of your clients in order to provide the best healthcare service possible. The same applies to finance, where you have to have a social security number in order to open a bank account.
On the other hand, you could be collecting data in order to learn more about your clients’ behavior, predict trends in the market and test those predictions. For example, you might want to know how many clients have downloaded your banking app, how many are actually using it, and whether that figure is close to your target.
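As a minimal sketch of the banking app check, the comparison between downloads, active use and a target could look like this in Python. Every figure here (client count, downloads, active users, target) is invented for illustration, and `rate` is a hypothetical helper, not part of any product.

```python
# Hypothetical figures, invented purely for illustration.
total_clients = 20_000      # clients who could use the banking app
downloads = 8_000           # clients who downloaded it
active_users = 5_000        # clients who actually use it
target_active_rate = 0.30   # assumed target share of clients who are active


def rate(part: int, whole: int) -> float:
    """Return part/whole as a fraction, or 0.0 if whole is zero."""
    return part / whole if whole else 0.0


download_rate = rate(downloads, total_clients)   # share who downloaded
active_rate = rate(active_users, total_clients)  # share who are active
retention = rate(active_users, downloads)        # active among downloaders
gap = active_rate - target_active_rate           # negative means below target

print(f"Downloaded: {download_rate:.1%}")
print(f"Active: {active_rate:.1%} (target {target_active_rate:.0%})")
print(f"Active among downloaders: {retention:.1%}")
print(f"Gap to target: {gap:+.1%}")
```

With these made-up numbers, a quarter of clients are active against a 30% target, which is the kind of gap that would prompt further research.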
You can also carry out data collection as part of a prevention strategy. In medicine, this would be reflected in patient monitoring with the goal of preventing further health issues.
Either way, data collection is happening and in order for data to safely flow from one end to the other, it must be collected in a way that doesn’t harm the people whose information is shared online.
Thanks to Brosix’s safety features like peer-to-peer file transfer and heavy-duty data encryption, companies can share client information internally without fear of a third-party data breach. Even Brosix can’t access the data, as it’s encrypted on the sender’s end and decrypted on the receiver’s end.
Data collection and data security with Brosix
As far as user data is concerned, Brosix never stores user data on its servers unless explicitly requested. Also, users can delete or change their information at any time. Basically, you can have a Brosix account with zero personal information on it. That’s why we don’t purge unused accounts.
One of our data security features is our Web Control Panel that allows you to restrict access to specific users’ data. Another feature is the end user authentication process which makes it impossible for third party users to hack into your network.
Data collection process
Regardless of what data you want to collect and what data collection procedures you use, the process itself is always the same:
- Setting a goal for data collection (example: find out how many Gen Xers and baby boomers use your banking app)
- Setting a schedule for data collection
- Settling on data collection methods (example: will you only conduct a survey or will you also do in-person interviews)
- Collecting and analyzing the data
Before you learn about each step in more detail, let’s set up some ground rules.
Data collection regulations
More and more regulations are being enforced over data collection, and rightly so. As a result, healthcare and pharmaceutical companies in the US have to adhere to the HIPAA (Health Insurance Portability and Accountability Act) federal law, while businesses that handle the personal data of people in the EU have to adhere to the GDPR (General Data Protection Regulation) when collecting data.
HIPAA is, as stated by the Centers for Disease Control and Prevention, “a federal law that required the creation of national standards to protect sensitive patient health information from being disclosed without the patient’s consent or knowledge.”
GDPR is a regulation that gives individuals more control over how much personal information they disclose online. At the same time, it holds companies accountable for data collection malpractice: the EU’s data protection authorities can fine them up to €20 million or 4% of their annual worldwide turnover, whichever is higher.
Now that you know a thing or two about how data collection is regulated, let’s learn how to conduct it.
Real-life examples of data
Data collected by healthcare practitioners on a daily basis: medications and prescriptions administered to patients, operations data, encounter and discharge forms
Data that financial institutions typically collect: assets, liabilities, equity, cash flow, income and expenses
Data that is collected in real estate: purpose, value and ownership of a property, municipal changes in the area of the property
Having seen some real-life examples, let’s now explore the technical side of data segmentation.
Two main types of data
Even though there are many ways to segment data, the two main types are quantitative data and qualitative data. The former gives you raw information, while the latter gives you context.
Questions usually asked when collecting quantitative data are closed-ended questions that have to do with amounts of things and frequencies of events (e.g. have you downloaded our banking app?).
Quantitative data is context-free and raw. You’ll need to accompany it with qualitative data that can shed some light on the gathered information. Otherwise, you might end up believing that eating ice cream causes sunburn (the legendary “correlation doesn’t imply causation” argument that depicts how a lack of data can lead to false conclusions).
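The ice cream and sunburn example can be sketched in Python. The daily figures below are invented for illustration, and both series are deliberately generated from temperature, the hidden common cause. `pearson` is a small helper written for this sketch.

```python
from math import sqrt

# Invented daily observations: temperature drives both series.
temperature_c = [18, 21, 24, 27, 30, 33, 36]
ice_cream_sales = [20 + 5 * t for t in temperature_c]  # rises with heat
sunburn_cases = [2 * t - 30 for t in temperature_c]    # also rises with heat


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson(ice_cream_sales, sunburn_cases)
print(f"ice cream vs sunburn: r = {r:.2f}")
```

The two series are almost perfectly correlated even though neither causes the other. Only the context (hot days) explains the numbers, which is the gap qualitative data is meant to fill.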
Qualitative data tells you why the results of a quantitative data collection are the way they are. For instance, after gathering survey results about the number of clients actively using your banking app, you can set yourself a mission to find out why the number is lower than expected. That mission is qualitative data collection.
Collecting qualitative data gets you closer to understanding the reasons behind the numbers. In the case of our imaginary underused banking app, you might conclude that the demographic that is underperforming has low trust in technology when it comes to business transactions.
This conclusion might’ve begun as an assumption that you decided to test out. To test your theory, you’ve devised an interview that clearly addresses the possibility of your clients’ distrust of online transactions.
Both quantitative and qualitative data can be collected either by you during your research or by another organization at an earlier time. Depending on who conducted the data collection, we can divide the data into primary and secondary.
Primary data and secondary data
As we said, when you classify data according to how it was collected or better yet, who collected it, there are two types – primary and secondary. All the data you gathered is primary data, and all the data that someone else gathered earlier is secondary data. Both sets have their place in your research.
Primary data is first-hand data collected in real time. This entire blog post more or less revolves around it. It is costlier, takes longer to gather, and requires greater involvement and a lot of planning. However, it’s also more accurate and reliable, and more specific and relevant to your research goals.
While primary data helps you achieve your research goals or solve a problem, secondary data is very important for keeping your research within budget. That’s because some of the information you need might already be publicly available, which is the case more often than not. It can even be data from your own earlier research.
The most common sources of secondary data are government statistics, publications from trade associations, journals, articles, university research reports, financial and sales reports.
Now that you and data are on a first-name basis, let’s learn the ways you can obtain it.
Data collection method roundup
The five major data collection methods are:
- Document review
- Questionnaires and surveys
- Interviews
- Focus groups
- Observations
We can group them according to the type of data we need:
- Quantitative data (document review, questionnaires & surveys)
- Qualitative data (interviews, focus groups and observations)
Although data collection is undoubtedly valuable for business analytics, keep in mind that it can be unwanted and stressful for your current loyal customers. You should use the methods listed below tactfully and carefully.
Quantitative data collection methods
Raw ingredients are the starting point of any recipe. The same goes for data collection, and quantitative data is very much a raw ingredient. Below are the best methods for collecting it.
Document review is the process of assessing secondary data you obtained from sources like government statistics or a publicly available independent study. This should be the first step you take when conducting your research because you don’t want to waste resources doing work that someone else has already done.
During the document review, you’ll learn the objective of the study, how the data was collected, what the population of the study was and what response categories were used in questionnaires (e.g. strongly agree, agree, not sure, disagree, strongly disagree).
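To show how response categories like these turn into quantitative data, here is a minimal Python sketch that tallies answers to a single questionnaire item. The list of responses is hypothetical. Only the five category names come from the text.

```python
from collections import Counter

# Response categories as listed in the text, in scale order.
CATEGORIES = ["strongly agree", "agree", "not sure", "disagree", "strongly disagree"]

# Hypothetical answers to one questionnaire item.
responses = [
    "agree", "strongly agree", "agree", "not sure", "disagree",
    "agree", "strongly disagree", "agree", "not sure", "strongly agree",
]

counts = Counter(responses)
total = len(responses)

# Share of respondents per category, keeping the order of the scale.
shares = {category: counts[category] / total for category in CATEGORIES}

for category in CATEGORIES:
    print(f"{category:<18} {counts[category]:>2}  {shares[category]:.0%}")
```

Knowing the categories in advance matters when reviewing someone else's study: two studies are only comparable if their scales line up.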
Questionnaires and surveys
Think of questionnaires and surveys as LEGO blocks and LEGO figures.
Questionnaires are the LEGO blocks. When you create forms with closed-ended questions and send them to your target audience, who fill them out and return them, you’re left with a lot of blocks that you might not know what to do with.
Surveys are the LEGO figures. They give the “blocks” of questionnaires meaning, and depending on the quality of the data, your survey “figure” could look amazing. A survey is the entire process of handling the raw data gathered from questionnaires.
Finally, questionnaires and surveys might produce qualitative data as well. All it takes is asking open-ended questions.
Qualitative data collection methods
Collecting qualitative data gives you more insight into the quantitative data and opens up opportunities for further research. Here’s what to do:
Interviews are probably the first thing that springs to mind when you think about data collection, and for good reason. A thoughtfully planned interview can reveal a wealth of precious information about a problem you’re trying to solve or a goal you’re trying to achieve.
There are two types of interviews, or two stages if you like: structured and unstructured.
Both can be carried out as in-person interviews or through alternative methods (phone, IM chat, video meetings).
You would conduct structured interviews in the initial stages of interviewing job candidates and unstructured ones in the later stages. Let’s see why.
Structured interviews are no-nonsense: you ask interviewees the same set of closed-ended questions in the same order, with no additional ones.
They’re great because they’re easily quantifiable and quick to conduct, and therefore easily applicable to a broader population. Another name for them is formal interviews, and they are categorized as a quantitative method because they provide so little context.
Unstructured interviews consist of open-ended questions you would ask a candidate you’re seriously considering hiring for a job, such as “what would you change in our current marketing campaign?” or “what sustainability policies do you believe our company could improve on?”. In addition to asking open-ended questions, during an unstructured interview you can steer the conversation wherever you think you’d gain more context and learn more about the interviewee.
The problem with unstructured interviews is that only trained interviewers can conduct them successfully, so you’ll need an interviewing expert in your team if you’re not already one.
Devise focus group interviews
Focus group interviews are a great tool for market research or for gaining a deeper understanding of social issues. The ingredients of a good focus group are five to eight participants from a deliberately selected group (e.g. Gen Zers) and a moderator.
You can think of focus groups as discussion panels. Everybody, including the moderator, has knowledge on the selected topic. The moderator’s job is to keep the conversation on topic and to politely manage participants who are dominating the conversation and motivate the ones that are yet to contribute their points of view.
Focus groups are best conducted as in-person interviews because a lot of observational data can be missed during a video conference, which brings us to the next data collection method – observation.
Observation is another method of collecting qualitative data. It yields information that goes beyond statistics, but it depends heavily on the researcher’s ability to interpret what was observed, so it’s susceptible to bias. However, when conducted correctly, it’s a great way to determine pain points within your target group.
Two common examples of observational data collection are measurable observations and direct observations.
An example of measurable observations is when data scientists use sensors to measure noise levels at an airport with the goal of determining which areas need to be soundproofed.
On the other hand, direct observation would be when Superintendent Chalmers watches Edna teach a class to Bart Simpson and other kids from Springfield Elementary to determine how well she delivers the curriculum.
Being aware of potential pitfalls can help to steer you towards success. Here are some common data collection problems you might face during your research.
Faulty data collection practices – five common mistakes
Data collection is hard, but accurate data collection is even harder. Your goals and understanding of your audience/target market have to be clear. If not, your data collection plan could well be riddled with errors. The five most common are:
- Population specification error
- Sample frame error
- Selection error
- Non-response error
- Measurement (observational) error
Population specification error
This happens when you wrongly assume a certain group is the target of your research. For instance, asking moms to rate ice cream flavors and designing your ice cream “lineup” for the following summer with their feedback in mind could drag your company into bankruptcy. Why? Because it’s not the moms who eat the most ice cream, it’s the kids that do.
Sample frame error
This error occurs when the list you draw your sample from (the sample frame) leaves out part of the target population, so the sample covers less of the target demographic than the research needs to be accurate.
A good example of this is a national survey conducted only via landlines. You’d be missing out on all the mobile users, leading to very skewed results.
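The landline example can be sketched in Python to show how far a bad sample frame can pull an estimate. The population below is entirely invented. It assumes, for illustration only, that mobile-only people use a banking app more often than landline owners.

```python
# Invented population for illustration: (has_landline, uses_banking_app).
population = (
    [(True, True)] * 100      # landline owners who use the app
    + [(True, False)] * 300   # landline owners who do not
    + [(False, True)] * 450   # mobile-only people who use the app
    + [(False, False)] * 150  # mobile-only people who do not
)


def app_usage_rate(people):
    """Share of people in the list who use the app."""
    return sum(uses_app for _, uses_app in people) / len(people)


# The faulty sample frame: only people reachable via landline.
landline_frame = [person for person in population if person[0]]

true_rate = app_usage_rate(population)       # what the survey should estimate
frame_rate = app_usage_rate(landline_frame)  # what a landline-only frame yields

print(f"Whole population: {true_rate:.0%}")
print(f"Landline-only frame: {frame_rate:.0%}")
```

In this made-up population, 55% of people use the app, but a landline-only frame would report 25%. Surveying more landline owners would not help, because the missing group never had a chance of being selected.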
Selection error is more of an on-field error that interviewers are susceptible to. It often happens during mall intercept interviews, where untrained interviewers tend to approach people belonging to a group they’re comfortable communicating with, rather than approaching a wide range of passersby at random.
Non-response error happens when a significant number of participants fail to fill out the questionnaires and send them back. It could also mean that you haven’t sent the questionnaires to 100% of your sample group.
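A quick way to spot non-response problems is to track how many questionnaires were sent and how many came back. The Python sketch below uses hypothetical counts. The variable names are illustrative, not a standard.

```python
# Hypothetical counts for one questionnaire round.
sample_size = 500  # people in the sample group
sent = 480         # questionnaires actually sent out
returned = 312     # completed questionnaires received

coverage = sent / sample_size            # share of the sample that was reached
response_rate = returned / sent          # share of recipients who answered
nonresponse_rate = 1 - response_rate     # share of recipients who did not
effective_rate = returned / sample_size  # answers relative to the whole sample

print(f"Coverage of sample: {coverage:.1%}")
print(f"Response rate: {response_rate:.1%}")
print(f"Non-response rate: {nonresponse_rate:.1%}")
print(f"Answers vs. full sample: {effective_rate:.1%}")
```

Separating coverage from response rate shows which of the two causes is at play: questionnaires that never went out, or recipients who never replied.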
Measurement errors can be systematic or random. There are four types of systematic error: instrumental, environmental, observational and theoretical. Let’s use the earlier example of measuring noise levels at the airport to explain them, and finish with an example of a random error.
- An instrumental error would mean the airport was using faulty sensors to measure noise levels
- An environmental error would be that noise levels were measured both during airport downtime and at active times
- An observational error would be that airport staff had drawn the wrong conclusions from the sensors’ data
- A theoretical error would be that the airport conducted the research believing that noise had caused the drop in air travel during the last two quarters. In fact, there were fewer travelers in those quarters due to the pandemic, so treating noise as the main factor would be a theoretical error (the wrong theory behind the assumption)
- An example of a random error would be airport staff experiencing burnout from being understaffed, which increases human error
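The difference between a systematic and a random error in the airport example can be sketched with a small Python simulation. The true noise level, the sensor bias and the size of the random fluctuations are all assumed values chosen for illustration.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

TRUE_NOISE_DB = 70.0  # assumed true noise level
SENSOR_BIAS_DB = 5.0  # assumed miscalibration (systematic error)
JITTER_SD_DB = 2.0    # assumed spread of random fluctuations
READINGS = 10_000


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Sensor with random error only: readings scatter around the true value.
random_only = [random.gauss(TRUE_NOISE_DB, JITTER_SD_DB) for _ in range(READINGS)]

# Faulty sensor: the same kind of scatter plus a constant offset.
biased = [
    random.gauss(TRUE_NOISE_DB + SENSOR_BIAS_DB, JITTER_SD_DB)
    for _ in range(READINGS)
]

print(f"Mean error, random only: {mean(random_only) - TRUE_NOISE_DB:+.2f} dB")
print(f"Mean error, faulty sensor: {mean(biased) - TRUE_NOISE_DB:+.2f} dB")
```

Averaging many readings pushes the random error towards zero, while the systematic offset stays in the result no matter how many readings are taken. That is why systematic errors have to be fixed at the source, for example by recalibrating or replacing the sensor.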
Data collection is a one thousand-piece jigsaw puzzle of a painting from the impressionist era and each puzzle piece represents a valuable piece of information. It takes planning, separating the puzzle pieces into groups, assembling patches and connecting them to slowly reveal the big picture.
Essentially, the better you prepare your research, the better the outcome of your data collection will be. If you set your goals well and target interviews, surveys and observations accordingly and avoid bias, the data you’ll get in return will be able to answer your questions, evaluate your hypothesis and even successfully forecast trends.
What is data collection?
Data collection definition: systematized procedure of gathering, measuring and analyzing data using standard methods.
Data collection example: A supermarket observing walking patterns of customers to determine where to build a new pathway.
What are the four types of data collection?
The four types of data collection are observational, experimental, simulation and derived.
Observational data collection is conducted through human observation, open-ended surveys and the use of data collection instruments.
Experimental data collection is conducted in order to determine the relationship between two variables, for example, to what degree a new ointment reduces muscle fatigue in sports professionals.
Simulation data collection is a research technique in which you reproduce actual events under test conditions, for example, an F1 simulator used by Formula One drivers to prepare for races.
Derived data collection is the process of pulling existing data from various sources. Researchers do this during the secondary data collection and document review stages. An example of this is aggregating data from a nationwide survey.
What are the 5 methods of data collection?
The five methods of collecting data are:
- Document review
- Questionnaires and surveys
- Interviews
- Focus groups
- Observations