At the start of my career in Market Research, I was put through what amounted to a sampling boot camp before touching a project or interacting with clients. Some (cough cough) years later, I’m still drawing on nuggets of gold from that training. This was still the phone heyday, mostly before online surveys became popular, and the brilliant Linda Piekarski and others would put us through our paces. We were taught US geography, telephone exchanges, and incidence rate calculations, as the data collection call centers required. I learned the nuances among the varieties of incidence rate formulas and how they impacted production rates. That training was later reinforced many times over through the countless telephone and online data collection projects I’ve worked on since.
Given that context, it’s striking that we seem to operate in an era where incidence formulas and production rates are not nearly as aligned as they were back then, especially given how vital this single metric is to determining both feasibility and CPI for online panels. The buyers and sellers of online sample often hold very different business views of these two metrics and how to calculate them. In a partner blog with Brookmark Research, I touched on this when we articulated how “conversion rates” serve as the more operative/relevant formula for online panel firms who tend to manage their supply programmatically, particularly in the US. Among OR’s clients, incidence can be defined in a variety of ways. So, which is “right”? Well, that depends on what guidelines you follow and, more importantly, how you interpret them in your study. Let’s review a few approaches.
Definitions
As we explore available industry definitions, we find some similarities in how Incidence is worded.
From ESOMAR: Incidence (aka Strike Rate) is the proportion of respondents contacted in a survey who qualify for the survey.
From the Insights Association (the merged entity that was CASRO and MRA): Any figure referring to the percentage of people in a category.
From the Insights Association (the second definition in the same glossary): The proportion of respondents contacted in a survey who qualify for the survey.
From SSI: Survey Incidence - This gives us information on what proportion of participants will qualify and how much we should charge per complete for the study.
The high-level themes are common. But as we start applying these definitions to practical situations, the nuances in the approaches become much more apparent.
Formulas
Since Incidence is as much a production metric as it is an analytical one, you might expect the formula for calculating IR on a project to be a given. However, the standards are not set. Here are a few examples of how these calculations can vary.
1. Survey Incidence = [the number of completes/(the number of completes + screenouts)]
2. Incidence = # of people who qualify / (# of people who qualify + # of people who do not qualify)
3. Incidence = # of completes / (# of people who completed + # of people who screened-out including quota full terms)
4. Incidence = (# of completes + # of qualified incompletes + # of cheaters) / (# of people who completed + # of people who screened-out including quota full terms + # of qualified incompletes + # of cheaters)
The common thread across all of the above formulas is the inclusion of complete counts and terminates (screenouts). In scenario 2, the wording is less specific about the terminate category: someone who “does not qualify” could be broader than Terms and could include cheaters or even early incompletes. In the third scenario, respondents who qualify but fall into an already closed quota are counted against the IR. Scenario 4 tries to represent almost all survey clicks by including the qualified incompletes and Cheaters equally in the numerator and denominator of the formula. The logical next step is to analyze, on a theoretical project, how this may impact the reported IR.
In Practice
So, let’s run these through a typical project scenario. Let’s say an online survey project produced a disposition file that looked like this:
Completes: 1,000
Terms: 500
Demo OQs: 200
Category OQs: 300
QC Kickouts (automated): 100
Dropouts: 100 (50 in screener, 50 after)
Here’s how these align with the four formulas above:
1. 1000 completes / (1000 completes + 500 terms) = 67% IR
2. 1000 / (1000 + 500 terms + 100 “Cheaters”) = 62.5% IR
3. 1000 / (1000 + 500 terms + 500 OQs) = 50% IR
4. (1000 completes + 50 qual incompletes + 100 cheaters) / (1000 completes + 500 terms + 500 OQs + 50 qual incompletes + 100 “cheaters”) = 53% IR
If we switched to a “conversion rate” instead, the formula becomes a simple:
5. Completes / Starts = 45% CR
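For readers who want to check the arithmetic, here is a minimal Python sketch of the four incidence formulas and the conversion rate applied to the disposition file above. The class and function names are hypothetical, chosen for illustration only. It follows the bucket assignments used in the worked lines above (for example, treating QC kickouts as the “cheaters” in scenario 2), and it assumes that Starts is simply the sum of every disposition, which is consistent with the 45% figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dispositions:
    """Counts from a disposition file (field names are illustrative)."""
    completes: int
    terms: int                # screenouts
    over_quotas: int          # demo OQs + category OQs combined
    qc_kickouts: int          # automated QC removals ("cheaters")
    screener_dropouts: int    # abandoned before qualifying
    qualified_dropouts: int   # abandoned after qualifying

    @property
    def starts(self) -> int:
        # Assumption: every start ends up in exactly one bucket.
        return (self.completes + self.terms + self.over_quotas
                + self.qc_kickouts + self.screener_dropouts
                + self.qualified_dropouts)


def ir_1(d: Dispositions) -> float:
    """Completes over completes plus screenouts."""
    return d.completes / (d.completes + d.terms)


def ir_2(d: Dispositions) -> float:
    """'Does not qualify' read here as terms plus cheaters (an assumption)."""
    return d.completes / (d.completes + d.terms + d.qc_kickouts)


def ir_3(d: Dispositions) -> float:
    """Over-quotas counted against incidence alongside terms."""
    return d.completes / (d.completes + d.terms + d.over_quotas)


def ir_4(d: Dispositions) -> float:
    """Qualified incompletes and cheaters added to both sides."""
    qualified = d.completes + d.qualified_dropouts + d.qc_kickouts
    return qualified / (qualified + d.terms + d.over_quotas)


def conversion_rate(d: Dispositions) -> float:
    """Completes over all starts."""
    return d.completes / d.starts


project = Dispositions(
    completes=1000, terms=500, over_quotas=200 + 300,
    qc_kickouts=100, screener_dropouts=50, qualified_dropouts=50,
)

for label, fn in [("IR 1", ir_1), ("IR 2", ir_2), ("IR 3", ir_3),
                  ("IR 4", ir_4), ("CR", conversion_rate)]:
    print(f"{label}: {fn(project):.1%}")
# IR 1: 66.7%
# IR 2: 62.5%
# IR 3: 50.0%
# IR 4: 53.5%
# CR: 45.5%
```

Keeping each formula as its own small function makes it easy to see exactly which buckets a buyer or seller is counting, which is usually where the disagreement lies.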
Which is “right”?
I don’t think there is one correct answer. Instead, think of this as a call to align your thinking of incidence with the partner(s) you choose. These differences in interpretation can have a cost impact and drive one’s overall understanding of relative difficulty on a project. Over Quotas are a perfect demonstration. Methodological allowances (or assumed sampling capability) on targeting and/or profiling will impact project outcomes in this area.
Many legacy assumptions remain that one of the most significant benefits of double opt-in panels is the depth of profiling. But what if that profiling isn’t there like it once was? OQs are where this gap shows up. In light of this and the current age of sample tech, we may be entering an era where IR becomes irrelevant. But that’s a topic for another day! There is no one perfect approach to IR. It depends on the sample design and strategy. But this area is ripe for manipulation by the buyer or seller of sample if it’s not deeply understood. Sometimes a price is too good to be true, while other times, a client spec is too good to be true.