Data Collection

Our Institute believes effective data collection is essential for evidence-based decision making.
Our approach to data collection follows these steps:

1. Defining the goal

Defining the goal is a crucial first step. We engage relevant stakeholders and team members in an iterative and collaborative process to establish clear goals. It’s important that projects start with the identification of key questions and desired outcomes to ensure we focus our efforts on gathering the right information. 

We start by understanding the purpose of the project: what problem are our clients trying to solve, or what change do they want to bring about? We think about the project’s potential outcomes and obstacles and try to anticipate what kind of data would be useful in these scenarios. We consider which of our clients will be using the data we collect and what data would be most valuable to them. We think about the long-term effects of the project and how they can be measured over time. Lastly, we leverage any historical data from previous projects to help our clients refine key questions that may have been overlooked previously.

2. Identifying the data sources

The crucial next step in the research process is determining the potential data sources. Essentially, there are two main data types to choose from: primary and secondary.

  • Primary data is the information one can collect directly from first-hand engagements. It’s gathered specifically for the research at hand and tailored to the research questions. Primary data collection methods can range from surveys and interviews to focus groups and observations. Because we design the data collection process, primary data can offer precise, context-specific information directly related to the research objectives.
  • Secondary data, on the other hand, is derived from resources that already exist. This can include information gathered for other research projects, administrative records, historical documents, statistical databases, and more. While not originally collected for the specific study, secondary data can offer valuable insights and background information that complement the primary data.

3. Choosing the data collection method

We have many options at our disposal when choosing the data collection method. Depending on what suits the project at hand, we employ quantitative or qualitative surveys. Data can be collected by administering structured questionnaires online, by phone, or in person. It can also be collected through less structured interview guides, as in Focus Group Discussions (FGDs) and Key Informant Interviews (KIIs).

4. Determining the sampling method

Once we establish our data collection goals and how we’ll collect the data, the next step is deciding whom to collect the data from. Sampling involves carefully selecting a representative group from a larger population. Choosing the right sampling method is crucial for gathering representative and relevant data that aligns with the data collection goal.

We consider the following guidelines to choose the appropriate sampling method for the research goal and data collection method:

  • Understand the Target Population: Start by conducting thorough research of the target population. Understand who they are, their characteristics, and subgroups within the population.
  • Anticipate and Minimize Biases: Anticipate and address potential biases within the target population to help minimize their impact on the data. For example, will the sampling method accurately reflect all ages, genders, cultures, etc., of the target population? Are there barriers to participation for any subgroups? The sampling method should allow us to capture the most accurate representation of the target population.
  • Maintain Cost-Effective Practices: Consider the cost implications of the chosen sampling methods. Some sampling methods will require more resources, time, and effort. The chosen sampling method should balance the cost factors with the ability to collect data effectively and accurately. 
  • Consider the Project’s Objectives: Tailor the sampling method to the project’s specific objectives and constraints. For example, M&E teams may require real-time impact data, while researchers need representative samples for statistical analysis.

By adhering to these guidelines, we can make informed choices when selecting a sampling method, maximizing the quality and relevance of the data collection efforts.
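
As one illustration of how these guidelines can translate into practice, the sketch below draws a proportionally allocated stratified random sample, so that each subgroup appears in the sample roughly in proportion to its share of the target population. It is a minimal Python example under stated assumptions, not a description of our actual tooling: the function name `stratified_sample`, the `region` field, and the toy population are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_key, sample_size, seed=None):
    """Draw a proportionally allocated stratified random sample.

    population  -- list of dicts, one per person (hypothetical structure)
    stratum_key -- name of the field that defines the subgroups
    sample_size -- approximate total number of people to select
    """
    rng = random.Random(seed)

    # Group the population by subgroup (stratum).
    strata = defaultdict(list)
    for person in population:
        strata[person[stratum_key]].append(person)

    sample = []
    for members in strata.values():
        # Allocate slots in proportion to the subgroup's share, with at least
        # one slot so that small subgroups are not left out entirely.
        share = len(members) / len(population)
        n = max(1, round(sample_size * share))
        n = min(n, len(members))
        sample.extend(rng.sample(members, n))
    return sample


# Toy population: 80 urban and 20 rural residents (illustrative data only).
population = (
    [{"id": i, "region": "urban"} for i in range(80)]
    + [{"id": i, "region": "rural"} for i in range(80, 100)]
)
sample = stratified_sample(population, "region", sample_size=10, seed=1)
```

Because of rounding and the one-slot minimum, the total drawn can differ slightly from `sample_size` when there are many small subgroups. Whether proportional allocation is the right choice depends on the project's objectives and budget.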

5. Identify and train the data collection team

Not every data collection use case requires data collectors, but training individuals responsible for data collection becomes crucial in scenarios involving field presence.

Whether we’re hiring and training data collectors, utilizing an existing team, or training existing field staff, we offer comprehensive guidance and the right tools to ensure effective data collection practices. Here are some common training approaches for data collectors:

  • In-Class Training: Comprehensive sessions covering protocols, survey instruments, and best practices empower data collectors with skills and knowledge.
  • Tests and Assessments: Assessments evaluate collectors’ understanding and competence, highlighting areas where additional support is needed.
  • Mock Interviews: Simulated interviews refine collectors’ techniques and communication skills.
  • Pre-Recorded Training Sessions: Accessible reinforcement and self-paced learning to refresh and stay updated.

Training data collectors is vital for successful data collection. The training should focus on proper instrument usage and effective interaction with respondents, including communication skills, cultural literacy, and ethical considerations. We understand training is an ongoing process. Knowledge gaps and issues may arise in the field, necessitating further training.

6. Design and test the survey tools

Designing effective data collection instruments like surveys and questionnaires is key. It’s crucial to prioritize respondent consent and privacy to ensure the integrity of the research. Thoughtful design and careful testing of survey questions are essential for optimizing research insights. Other critical considerations are: 

  • Clear and Unbiased Question Wording: Crafting unambiguous, neutral questions free from bias to gather accurate and meaningful data is crucial.
  • Logical Ordering and Appropriate Response Format: Arrange questions logically and choose response formats (such as multiple-choice, Likert scale, or open-ended) that suit the nature of the data we seek to collect.
  • Coverage of Relevant Topics: We ensure that our instrument covers all topics pertinent to the data collection goals while respecting cultural and social sensitivities. We make sure our instrument avoids assumptions, stereotypes, and language or topics that could be considered offensive or taboo in certain contexts. The goal is to avoid marginalizing or offending respondents based on their social or cultural background.
  • Collect Only Necessary Data: We design survey instruments that focus solely on gathering the data required for the research objectives, avoiding unnecessary information.
  • Language(s) of the Respondent Population: We tailor our instruments to accommodate the languages the target respondents speak, offering translated versions if needed. Similarly, we take into account accessibility for respondents who can’t read by offering alternative formats like images in place of text.
  • Desired Length of Time for Completion: Respect respondents’ time by designing instruments that can be completed within a reasonable timeframe, balancing thoroughness with engagement. Knowing roughly how long a response should take also helps us weed out bad responses. For example, a response completed well outside the expected timeframe, such as one that was rushed, may need to be excluded.
  • Collecting and Documenting Respondents’ Consent and Privacy: We ensure a robust consent process, transparent data usage communication, and privacy protection throughout data collection.
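
The completion-time check mentioned in the list above could look something like the following Python sketch. It is only an illustration: the field names, the function `flag_by_duration`, and the 3-to-30-minute window are hypothetical, and a real window would be set from pilot testing of the specific instrument.

```python
def flag_by_duration(responses, min_seconds, max_seconds):
    """Split responses into those inside and outside the expected time window."""
    kept, flagged = [], []
    for response in responses:
        if min_seconds <= response["duration_seconds"] <= max_seconds:
            kept.append(response)
        else:
            flagged.append(response)
    return kept, flagged


# Illustrative data only: durations are in seconds.
responses = [
    {"respondent_id": "R1", "duration_seconds": 540},
    {"respondent_id": "R2", "duration_seconds": 45},    # possibly rushed
    {"respondent_id": "R3", "duration_seconds": 7200},  # possibly left open
]
kept, flagged = flag_by_duration(responses, min_seconds=180, max_seconds=1800)
```

Flagged responses are best treated as candidates for review rather than automatic exclusions, since an unusual duration can have a legitimate explanation.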

Put the Instrument to the Test

Through rigorous testing, we uncover flaws, ensure reliability, maximize accuracy, and validate the instrument’s performance. This can be achieved by:

  • Conducting pilot testing to enhance the reliability and effectiveness of data collection. Administer the instrument, identify difficulties, gather feedback, and assess performance in real-world conditions.
  • Making revisions based on pilot testing to enhance clarity, accuracy, usability, and participant satisfaction. Refine questions, instructions, and format for effective data collection.
  • Continuously iterating and refining the instrument based on feedback and real-world testing. This ensures reliable, accurate, and audience-aligned methods of data collection. Additionally, this ensures the instrument adapts to changes, incorporates insights, and maintains ongoing effectiveness.

7. Collect the data

Now that we have a well-designed survey, set of interview questions, observation plan, or form, it’s time to implement it and gather the needed data. Data collection is not a one-and-done deal; it’s an ongoing process that demands attention to detail. Imagine spending weeks collecting data, only to discover later that a significant portion is unusable due to incomplete responses, improper collection methods, or falsified responses. To avoid such setbacks, we adopt an iterative approach.

We leverage data collection tools with real-time monitoring to proactively identify outliers and issues. We take immediate action by fine-tuning the instruments, optimizing the data collection process, addressing concerns like additional training, or reevaluating personnel responsible for inaccurate data (for example, a field worker who sits in a coffee shop entering fake responses rather than doing the work of knocking on doors).
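
One simple monitoring check of this kind, sketched here in Python under stated assumptions, compares each data collector's typical interview duration with the overall figure and flags collectors whose interviews are unusually short. The function `flag_collectors`, the field names, and the 0.5 ratio are hypothetical choices for illustration and do not describe any particular monitoring tool.

```python
from collections import defaultdict
from statistics import median


def flag_collectors(submissions, ratio=0.5):
    """Return IDs of collectors whose median interview duration is below
    `ratio` times the median across all submissions."""
    overall = median(s["duration_minutes"] for s in submissions)

    # Collect the durations recorded by each data collector.
    by_collector = defaultdict(list)
    for s in submissions:
        by_collector[s["collector_id"]].append(s["duration_minutes"])

    return sorted(
        collector
        for collector, durations in by_collector.items()
        if median(durations) < ratio * overall
    )


# Illustrative submissions only.
submissions = [
    {"collector_id": "A", "duration_minutes": 22},
    {"collector_id": "A", "duration_minutes": 25},
    {"collector_id": "B", "duration_minutes": 20},
    {"collector_id": "B", "duration_minutes": 24},
    {"collector_id": "C", "duration_minutes": 3},
    {"collector_id": "C", "duration_minutes": 4},
]
flagged_collectors = flag_collectors(submissions)
```

A flag like this is a prompt for follow-up, such as a supervisor check or refresher training, and not proof of falsified data on its own.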

8. Clean and organize the data

After data collection, the next step is to clean and organize the data to ensure its integrity and usability.

  • Data Cleaning: This stage involves sifting through the data to identify and rectify any errors, inconsistencies, or missing values. It’s essential to maintain the accuracy of the data and ensure that it’s reliable for further analysis. Data cleaning can uncover duplicates, outliers, and gaps that could skew the results if left unchecked. With real-time data monitoring, this continuous cleaning process keeps the data precise and current throughout the data collection period. Similarly, review and corrections workflows allow us to monitor the quality of the incoming data.
  • Organizing the Data: Post-cleaning, it’s time to organize the data for efficient analysis and interpretation. Labeling the data using appropriate codes or categorizations can simplify navigation and streamline the extraction of insights. When we use a survey or form, labeling the data is often not necessary because we can design the instrument to collect in the right categories or return the right codes. An organized dataset is easier to manage, analyze, and interpret, ensuring that our collection efforts are not wasted but lead to valuable, actionable insights.
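
The cleaning step described above can be sketched in a few lines of Python. This minimal example, which is an assumption-laden illustration and not our actual workflow, drops duplicate submissions and sets aside records with missing required values for review. The function `clean_records` and the `id`, `age`, and `district` fields are hypothetical.

```python
def clean_records(records, id_field, required_fields):
    """Drop duplicate IDs (keeping the first) and set aside incomplete records."""
    seen = set()
    clean, incomplete = [], []
    for record in records:
        record_id = record[id_field]
        if record_id in seen:
            continue  # duplicate submission; keep only the first occurrence
        seen.add(record_id)

        # Treat None and empty strings as missing values.
        missing = [f for f in required_fields if record.get(f) in (None, "")]
        if missing:
            incomplete.append({"record": record, "missing": missing})
        else:
            clean.append(record)
    return clean, incomplete


# Illustrative records only.
records = [
    {"id": 1, "age": 34, "district": "North"},
    {"id": 2, "age": None, "district": "South"},
    {"id": 1, "age": 34, "district": "North"},  # duplicate of the first record
    {"id": 3, "age": 51, "district": ""},
]
clean, incomplete = clean_records(records, "id", ["age", "district"])
```

Incomplete records are kept in a separate list so they can go through a review and correction step instead of being silently discarded.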

Each stage of the data collection process, from design to cleaning, is iterative and interconnected. By diligently cleaning and organizing the data, we are setting the stage for robust, meaningful analysis that can inform our clients’ data-driven decisions and actions.

9. Safely store and handle data

Throughout the data collection process, and after the data has been collected, it is vital that we follow best practices for storing and handling data to ensure the integrity of the research. While the specifics of how best to store and handle data will depend on the project, here are some important guidelines we keep in mind.

  • We use cloud storage to hold our data if possible, since this is safer than storing data on hard drives and keeps it more accessible.
  • We periodically back up our data and purge old data from our system, since it’s safer not to retain data longer than necessary.
  • When we use mobile devices or tablets to collect and store data, we use options for private, internal, app-specific storage if and when possible.
  • We restrict access to stored data to only those who need to work with that data.
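
As a rough illustration of the purge guideline above, the Python sketch below separates records that are still within a retention window from those that are due for removal. The function `split_by_retention`, the `collected_at` field, and the 365-day window are hypothetical. The appropriate retention period depends on the project and any applicable requirements.

```python
from datetime import datetime, timedelta, timezone


def split_by_retention(records, retention_days, now=None):
    """Separate records to keep from records older than the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    keep = [r for r in records if r["collected_at"] >= cutoff]
    purge = [r for r in records if r["collected_at"] < cutoff]
    return keep, purge


# Illustrative records only; timestamps are timezone-aware (UTC).
stored = [
    {"id": "a", "collected_at": datetime(2024, 3, 15, tzinfo=timezone.utc)},
    {"id": "b", "collected_at": datetime(2022, 1, 10, tzinfo=timezone.utc)},
]
keep, purge = split_by_retention(
    stored,
    retention_days=365,
    now=datetime(2024, 6, 1, tzinfo=timezone.utc),
)
```

In practice, records identified for purging would be backed up or archived as the project requires before being deleted.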

We uphold ethical standards in interpreting and reporting our data. Clear communication, respectful handling of sensitive information, and adhering to confidentiality and privacy rights are all essential to fostering trust, promoting transparency, and bolstering our work’s credibility.