Can Occur if the Sampling Frame Excludes Certain Members of the Population

Abstract

Background

In this paper, the basic elements related to the selection of participants for health research are discussed. Sample representativeness, sample frame, types of sampling, as well as the impact that nonrespondents may have on the results of a study are described. The whole discussion is supported by practical examples to facilitate the reader's understanding.

Objective

To introduce readers to issues related to sampling.

Keywords: Dermatology, Epidemiology and biostatistics, Epidemiologic studies, Sample size, Sampling studies

INTRODUCTION

The essential topics related to the selection of participants for health research are: 1) whether to work with samples or include the whole reference population in the study (census); 2) the sample frame; 3) the sampling process; and 4) the potential effects nonrespondents might have on study results. We will address each of these aspects with theoretical and practical examples in the sections that follow.

TO SAMPLE OR NOT TO SAMPLE

In a previous paper, we discussed the parameters needed to estimate the sample size.¹ We define a sample as a finite part or subset of participants drawn from the target population. In turn, the target population corresponds to the entire set of subjects whose characteristics are of interest to the research team. Based on results obtained from a sample, researchers may draw conclusions about the target population with a certain level of confidence, following a process called statistical inference. When the sample contains fewer individuals than the minimum necessary, but representativeness is preserved, statistical inference may be compromised in terms of precision (prevalence studies) and/or statistical power to detect the associations of interest.¹ On the other hand, samples without representativeness may not be a reliable source from which to draw conclusions about the reference population (i.e., statistical inference is not deemed possible), even if the sample size reaches the required number of participants. Lack of representativeness can occur as a result of flawed selection procedures (sampling bias) or when the probability of refusal/non-participation in the study is related to the object of research (nonresponse bias).¹,²

Although most studies are performed using samples, whether or not they represent any target population, census-based estimates should be preferred whenever possible.³,⁴ For instance, if all cases of melanoma are available in a national or regional database, and information on the potential risk factors is also available, it would be preferable to conduct a census instead of investigating a sample.

However, there are several theoretical and practical reasons that prevent us from carrying out census-based surveys, including:

Ethical issues: it is unethical to include a greater number of individuals than that effectively required;
Budgetary limitations: the high costs of a census survey often limit its use as a strategy to select participants for a study;
Logistics: censuses often impose great challenges in terms of the staff, equipment, etc. required to conduct the study;
Time restrictions: the amount of time needed to plan and conduct a census-based survey may be excessive; and,
Unknown target population size: if the study objective is to investigate the presence of premalignant skin lesions in illicit drug users, lack of information on all existing users makes it impossible to conduct a census-based study.

All these reasons explain why samples are more frequently used. However, researchers must be aware that sample results can be affected by random error (or sampling error).³ To exemplify this concept, we will consider a research study aiming to estimate the prevalence of premalignant skin lesions (outcome) among individuals >18 years residing in a specific city (target population). The city has a total population of 4,000 adults, but the investigator decided to collect data on a representative sample of 400 participants, detecting an 8% prevalence of premalignant skin lesions. A week later, the researcher selects another sample of 400 participants from the same target population to confirm the results, but this time observes a 12% prevalence of premalignant skin lesions. Based on these findings, is it possible to assume that the prevalence of lesions increased from the first to the second week? The answer is probably not. Each time we select a new sample, we are very likely to obtain a different result. These fluctuations are attributed to "random error." They occur because the individuals composing different samples are not the same, even though they were selected from the same target population. Therefore, the parameters of interest may vary randomly from one sample to another. Despite this fluctuation, if it were possible to obtain 100 different samples of the same population, approximately 95 of them would provide prevalence estimates very close to the true value in the target population - the value that we would observe if we investigated all the 4,000 adults residing in the city. Thus, during the sample size estimation the investigator must specify in advance the maximum acceptable random error in the study. Most population-based studies use a random error ranging from 2 to 5 percentage points. Nevertheless, the researcher should be aware that the smaller the random error considered in the study, the larger the required sample size.¹
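The sample-to-sample fluctuation described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the city of 4,000 adults and the sample size of 400 come from the example in the text, but the true prevalence of 10% (400 affected adults) and the 3-percentage-point tolerance are hypothetical values chosen for illustration.

```python
import random

def simulate_sample_prevalences(population_size=4000, true_cases=400,
                                sample_size=400, n_samples=100, seed=1):
    """Draw repeated simple random samples and return each sample's prevalence."""
    rng = random.Random(seed)
    # Hypothetical population: 1 = has a premalignant lesion, 0 = does not
    population = [1] * true_cases + [0] * (population_size - true_cases)
    prevalences = []
    for _ in range(n_samples):
        sample = rng.sample(population, sample_size)  # drawn without replacement
        prevalences.append(sum(sample) / sample_size)
    return prevalences

prevalences = simulate_sample_prevalences()
true_prevalence = 400 / 4000  # assumed value, not taken from the text

# Count the samples whose estimate lies within 3 percentage points of the truth
within_3_points = sum(abs(p - true_prevalence) <= 0.03 for p in prevalences)

print(f"lowest estimate: {min(prevalences):.3f}, highest: {max(prevalences):.3f}")
print(f"{within_3_points} of {len(prevalences)} samples are within 3 percentage points")
```

Every sample is drawn from the same unchanged population, so any difference between two estimates in this sketch is due to random error alone.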

SAMPLE FRAME

The sample frame is the group of individuals that can be selected from the target population given the sampling process used in the study. For example, to identify cases of cutaneous melanoma the researcher may consider using the national cancer registry system or the anatomopathological records of skin biopsies as the sample frame. Given that the sample frame may represent only a portion of the target population, the researcher needs to examine carefully whether the selected sample frame fits the study objectives or hypotheses, and especially whether there are strategies to overcome the sample frame limitations (see Chart 1 for examples and possible limitations).

Chart 1

Examples of sample frames and potential limitations as regards representativeness

Sample frames	Limitations
Population census	• If the census was not conducted in recent years, areas with high migration might be outdated
	• Homeless or itinerant people cannot be represented

Hospital or health services records	• Usually include only data of affected people (this is a limitation, depending on the study objectives)
	• Depending on the service, data may be incomplete and/or outdated
	• If the lists are from public units, results may differ from those who seek private services

School lists	• School lists are currently available only in the public sector
	• Children/teenagers not attending school will not be represented
	• Lists are rapidly outdated
	• There will be problems in areas with a high percentage of school absenteeism

List of phone numbers	• Several population groups are not represented: individuals with no phone line at home (low-income families, young people who use only cell phones), those who spend less time at home, etc.

Mailing lists	• Individuals with multiple email addresses have a higher chance of selection than individuals with only one address
	• Individuals without an email address may be different from those who have one, according to age, education, etc.

SAMPLING

Sampling can be defined as the process through which individuals or sampling units are selected from the sample frame. The sampling strategy needs to be specified in advance, given that the sampling method may affect the sample size estimation.¹,⁵ Without a rigorous sampling plan the estimates derived from the study may be biased (selection bias).³

TYPES OF SAMPLING

In Figure 1, we summarize the main sampling types. There are two major sampling types: probabilistic and nonprobabilistic.

Figure 1

Sampling types used in scientific studies

NONPROBABILISTIC SAMPLING

In the context of nonprobabilistic sampling, the likelihood of selecting some individuals from the target population is null. This type of sampling does not render a representative sample; therefore, the observed results are usually not generalizable to the target population. Still, unrepresentative samples may be useful for some specific research objectives, and may help answer particular research questions, as well as contribute to the generation of new hypotheses.⁴ The different types of nonprobabilistic sampling are detailed below.

Convenience sampling: the participants are consecutively selected in order of appearance according to their convenient accessibility (also known as consecutive sampling). The sampling process comes to an end when the total number of participants (sample saturation) and/or the time limit (time saturation) is reached. Randomized clinical trials are usually based on convenience sampling. After sampling, participants are usually randomly allocated to the intervention or control group (randomization).³ Although randomization is a probabilistic process to obtain two comparable groups (treatment and control), the samples used in these studies are generally not representative of the target population.

Purposive sampling: this is used when a diverse sample is necessary or the opinion of experts in a particular field is the topic of interest. This technique was used in the study by Roubille et al., in which recommendations for the treatment of comorbidities in patients with rheumatoid arthritis, psoriasis, and psoriatic arthritis were made based on the opinion of a group of experts.⁶

Quota sampling: according to this sampling technique, the population is first classified by characteristics such as gender, age, etc. Afterwards, sampling units are selected to complete each quota. For example, in the study by Larkin et al., the combination of vemurafenib and cobimetinib versus placebo was tested in patients with locally-advanced melanoma, stage IIIC or IV, with BRAF mutation.⁷ The study recruited 495 patients from 135 health centers located in several countries. In this type of study, each center has a "quota" of patients.

"Snowball" sampling: in this case, the researcher selects an initial group of individuals. Then, these participants indicate other potential members with similar characteristics to take part in the study. This is often used in studies investigating special populations, for example, those including illicit drug users, as was the case of the study by Gonçalves et al., which assessed 27 users of cocaine and crack in combination with marijuana.⁸

PROBABILISTIC SAMPLING

In the context of probabilistic sampling, all units of the target population have a nonzero probability of taking part in the study. If all participants are equally likely to be selected, equiprobabilistic sampling is being used, and the probability of being selected by the research team may be expressed by the formula: P = 1/N, where P equals the probability of taking part in the study and N corresponds to the size of the target population (this is the probability on a single draw; when n units are drawn without replacement, each unit's overall probability of inclusion is n/N). The main types of probabilistic sampling are described below.

Simple random sampling: in this case, we have a full list of sample units or participants (sample frame), and we randomly select individuals using a table of random numbers. An example is the study by Pimenta et al., in which the authors obtained a listing from the Health Department of all elderly enrolled in the Family Health Strategy and, by simple random sampling, selected a sample of 449 participants.⁹
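As an illustration, simple random sampling from a list can be sketched in a few lines. This is a minimal sketch, not the procedure used by Pimenta et al.: the frame of 5,000 ID numbers and the function name `simple_random_sample` are hypothetical, and a pseudorandom number generator stands in for the table of random numbers. Only the sample size of 449 is taken from the text.

```python
import random

def simple_random_sample(sample_frame, n, seed=None):
    """Select n units from the frame without replacement, all equally likely."""
    rng = random.Random(seed)
    return rng.sample(sample_frame, n)

# Hypothetical sample frame: ID numbers of 5,000 enrolled individuals
frame = list(range(1, 5001))

selected = simple_random_sample(frame, 449, seed=42)
print(len(selected), "participants selected")
```

Fixing the seed makes the selection reproducible, which helps when the sampling procedure has to be documented or audited.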

Systematic random sampling: in this case, participants are selected at fixed, previously defined intervals from a ranked list of participants. For example, in the study of Kelbore et al., children who were assisted at the Pediatric Dermatology Service were selected to evaluate factors associated with atopic dermatitis, always selecting every second child in order of consultation.¹⁰
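A minimal sketch of systematic selection follows, assuming a hypothetical ordered list of 20 children and a random starting point within the first interval. The interval of 2 mirrors the "every second child" rule in the example; the names and the list itself are invented for illustration.

```python
import random

def systematic_sample(ranked_list, interval, seed=None):
    """Pick a random start within the first interval, then every interval-th unit."""
    if interval < 1:
        raise ValueError("interval must be a positive integer")
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random position in 0 .. interval - 1
    return ranked_list[start::interval]

# Hypothetical list of children in order of consultation
children = [f"child_{i:02d}" for i in range(1, 21)]

selected = systematic_sample(children, interval=2, seed=0)
print(selected)
```

The random start gives every child in the list the same chance of selection (1 in 2 here). If the list order follows a periodic pattern, a fixed interval can produce a biased sample.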

Stratified sampling: in this type of sampling, the target population is first divided into separate strata. Then, samples are selected within each stratum, either through simple or systematic sampling. The total number of individuals to be selected in each stratum can be fixed or proportional to the size of each stratum. With the proportional method, each individual may be equally likely to be selected to participate in the study. However, the fixed method usually involves the use of sampling weights in the statistical analysis (inverse of the probability of selection, or 1/P). An example is the study conducted in South Australia to investigate factors associated with vitamin D deficiency in preschool children. Using the national census as the sample frame, households were randomly selected in each stratum and all children in the age group of interest identified in the selected houses were investigated.¹¹
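The role of sampling weights under the fixed method can be sketched as follows. The two strata ("urban" with 8,000 units and "rural" with 2,000 units) and the fixed sample of 100 per stratum are hypothetical figures used only for illustration.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a fixed-size simple random sample in each stratum and compute weights."""
    rng = random.Random(seed)
    result = {}
    for name, units in strata.items():
        n = min(n_per_stratum, len(units))
        # Weight = 1/P, where P = n / len(units) is the probability of selection
        weight = len(units) / n
        result[name] = {"sample": rng.sample(units, n), "weight": weight}
    return result

# Hypothetical strata of very different sizes
strata = {
    "urban": list(range(0, 8000)),
    "rural": list(range(8000, 10000)),
}

design = stratified_sample(strata, n_per_stratum=100, seed=7)
for name, info in design.items():
    print(name, "sample size:", len(info["sample"]), "weight:", info["weight"])
```

With a fixed sample size per stratum, a selected urban unit stands for more people than a selected rural unit. This is why the weights have to be carried into the statistical analysis.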

Cluster sampling: in this type of probabilistic sampling, groups such as health facilities, schools, etc., are sampled. In the above-mentioned study, the selection of households is an example of cluster sampling.¹¹

Complex or multi-stage sampling: this probabilistic sampling method combines different strategies in the selection of the sample units. An example is the study of Duquia et al. to assess the prevalence and factors associated with the use of sunscreen in adults. The sampling process included two stages.¹² Using the 2000 Brazilian demographic census as sampling frame, all 404 census tracts from Pelotas (Southern Brazil) were listed in ascending order of family income. A sample of 120 tracts was systematically selected (first sampling stage units). In the second stage, 12 households in each of these census tracts (second sampling stage units) were systematically drawn. All adult residents in these households were included in the study (third sampling stage units). All these stages have to be considered in the statistical analysis to provide correct estimates.
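The two selection stages described above can be sketched as follows. This is a sketch under assumptions, not the procedure actually used by Duquia et al.: the tract incomes and household lists are randomly generated hypothetical data, and the fractional-interval rule in `systematic_select` is one common way to select 120 of 404 units systematically. Only the numbers 404, 120 and 12 come from the text.

```python
import random

def systematic_select(units, n, rng):
    """Systematically select n units using a fractional interval and a random start."""
    step = len(units) / n
    start = rng.uniform(0, step)
    last = len(units) - 1
    # min() guards against a floating-point overshoot on the final position
    return [units[min(int(start + k * step), last)] for k in range(n)]

def two_stage_sample(tracts, n_tracts=120, n_households=12, seed=None):
    """Stage 1: tracts ranked by income; stage 2: households within chosen tracts."""
    rng = random.Random(seed)
    ranked = sorted(tracts, key=lambda t: t["income"])
    chosen = systematic_select(ranked, n_tracts, rng)
    return {t["id"]: systematic_select(t["households"], n_households, rng)
            for t in chosen}

# Hypothetical data: 404 census tracts, each with 200 to 400 households
data_rng = random.Random(0)
tracts = [
    {
        "id": i,
        "income": data_rng.uniform(500, 5000),
        "households": [f"tract{i}_house{j}" for j in range(data_rng.randint(200, 400))],
    }
    for i in range(404)
]

sample = two_stage_sample(tracts, seed=1)
print(len(sample), "tracts,", sum(len(h) for h in sample.values()), "households")
```

Ranking the tracts by income before the systematic selection spreads the chosen tracts across the whole income range, which works as an implicit stratification.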

NONRESPONDENTS

Frequently, sample sizes are increased by 10% to compensate for potential nonresponses (refusals/losses).¹ Let us imagine that in a study to assess the prevalence of premalignant skin lesions there is a higher percentage of nonrespondents among men (10%) than among women (1%). If the higher percentage of nonresponse occurs because these men are not at home during the scheduled visits, and these participants are more likely to be exposed to the sun, the number of skin lesions will be underestimated. For this reason, it is strongly recommended to collect and describe some basic characteristics of nonrespondents (sex, age, etc.) so they can be compared to the respondents to evaluate whether the results may have been affected by this systematic error.
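The arithmetic behind this underestimation can be made explicit with a small worked example. The nonresponse percentages (10% of men, 1% of women) come from the text; the group sizes and the lesion prevalences in each group are hypothetical values chosen only to show the direction of the bias.

```python
def prevalence(groups):
    """Overall prevalence as a size-weighted average of group prevalences."""
    total = sum(g["n"] for g in groups)
    cases = sum(g["n"] * g["prevalence"] for g in groups)
    return cases / total

# Hypothetical sample of 2,000 men and 2,000 women.
# Nonresponding men are assumed to be more sun-exposed, hence the higher prevalence.
groups = [
    {"sex": "men",   "responded": True,  "n": 1800, "prevalence": 0.10},
    {"sex": "men",   "responded": False, "n": 200,  "prevalence": 0.30},
    {"sex": "women", "responded": True,  "n": 1980, "prevalence": 0.08},
    {"sex": "women", "responded": False, "n": 20,   "prevalence": 0.08},
]

true_prevalence = prevalence(groups)
observed_prevalence = prevalence([g for g in groups if g["responded"]])

print(f"prevalence in the full sample:   {true_prevalence:.4f}")
print(f"prevalence among respondents:    {observed_prevalence:.4f}")
```

Under these assumptions the estimate based on respondents alone is lower than the value for the full sample, because the group with the most lesions is also the group most often missed.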

Often, in study protocols, refusal to participate or to sign the informed consent is considered an "exclusion criterion". However, this is not correct, as these individuals are eligible for the study and need to be reported as "nonrespondents".

SAMPLING METHOD ACCORDING TO THE TYPE OF STUDY

In general, clinical trials aim to obtain a homogeneous sample which is not necessarily representative of any target population. Clinical trials often recruit those participants who are most likely to benefit from the intervention.³ Thus, the stricter criteria for inclusion and exclusion of subjects in clinical trials often make it difficult to locate participants: after verification of the eligibility criteria, only one out of ten possible candidates will enter the study. Therefore, the results of clinical trials usually cannot be generalized to the entire population of patients with the disease, but only to those with characteristics similar to the sample included in the study. These peculiarities of clinical trials justify the need to conduct multicenter and/or global studies to accelerate the recruitment rate and to reach, in a shorter time, the number of patients required for the study.¹³

In turn, in observational studies, building a solid sampling plan is important because of the great heterogeneity usually observed in the target population. This heterogeneity also has to be reflected in the sample. A cross-sectional population-based study aiming to assess disease estimates or identify risk factors often uses complex probabilistic sampling, because sample representativeness is crucial. However, in a case-control study, we face the challenge of selecting two different samples for the same study. One sample is formed by the cases, which are identified based on the diagnosis of the disease of interest. The other consists of controls, which need to be representative of the population that originated the cases. Improper selection of control individuals may introduce selection bias in the results. Thus, the concern with representativeness in this type of study is established based on the relationship between cases and controls (comparability).

In cohort studies, individuals are recruited based on the exposure (exposed and unexposed subjects), and they are followed over time to evaluate the occurrence of the outcome of interest. At baseline, the sample can be selected from a representative sample (population-based cohort studies) or a non-representative sample. However, in the successive follow-ups of the cohort members, study participants must be a representative sample of those included at baseline.¹⁴,¹⁵ In this type of study, losses over time may cause follow-up bias.

CONCLUSION

Researchers need to decide during the planning stage of the study whether they will work with the entire target population or a sample. Working with a sample involves different steps, including sample size estimation, identification of the sample frame, and selection of the sampling method to be adopted.

Footnotes

Financial Support: None.

* Study performed at Faculdade Meridional - Escola de Medicina (IMED) - Passo Fundo (RS), Brazil.

REFERENCES

1. Martínez-Mesa J, González-Chica DA, Bastos JL, Bonamigo RR, Duquia RP. Sample size: how many participants do I need in my research? An Bras Dermatol. 2014;89:609–615.

2. Röhrig B, du Prel JB, Wachtlin D, Kwiecien R, Blettner M. Sample size calculation in clinical trials: part 13 of a series on evaluation of scientific publications. Dtsch Arztebl Int. 2010;107:552–556.

3. Suresh K, Thomas SV, Suresh G. Design, data analysis and sampling techniques for clinical research. Ann Indian Acad Neurol. 2011;14:287–290.

4. Rothman KJ, Gallacher JE, Hatch EE. Why representativeness should be avoided. Int J Epidemiol. 2013;42:1012–1014.

5. Krause MS, Lutz W, Boehnke JR. The role of sampling in clinical trial design. Psychother Res. 2011;21:243–251.

6. Roubille C, Richer V, Starnino T, McCourt C, McFarlane A, Fleming P, et al. Evidence-based Recommendations for the Management of Comorbidities in Rheumatoid Arthritis, Psoriasis, and Psoriatic Arthritis: Expert Opinion of the Canadian Dermatology-Rheumatology Comorbidity Initiative. J Rheumatol. 2015;42:1767–1780.

7. Larkin J, Ascierto PA, Dréno B, Atkinson V, Liszkay G, Maio M, et al. Combined vemurafenib and cobimetinib in BRAF-mutated melanoma. N Engl J Med. 2014;371:1867–1876.

8. Goncalves JR, Nappo SA. Factors that lead to the use of crack cocaine in combination with marijuana in Brazil: a qualitative study. BMC Public Health. 2015;15:706–706.

9. Pimenta FB, Pinho L, Silveira MF, Botelho AC. Factors associated with chronic diseases among the elderly receiving treatment under the Family Health Strategy. Cien Saude Colet. 2015;20:2489–2498.

10. Kelbore AG, Alemu W, Shumye A, Getachew S. Magnitude and associated factors of Atopic dermatitis among children in Ayder referral hospital, Mekelle, Ethiopia. BMC Dermatol. 2015;15:15–15.

11. Zhou SJ, Skeaff M, Makrides M, Gibson R. Vitamin D status and its predictors among pre-school children in Adelaide. J Paediatr Child Health. 2015;51:614–619.

12. Duquia RP, Menezes AM, Almeida HL, Jr, Reichert FF, Santos Ida S, Haack RL, et al. Prevalence of sun exposure and its associated factors in southern Brazil: a population-based study. An Bras Dermatol. 2013;88:554–561.

13. Barrios CH, Werutsky G, Martinez-Mesa J. The global conduct of cancer clinical trials: challenges and opportunities. Am Soc Clin Oncol Educ Book. 2015:e132–e139.

14. Victora CG, Barros FC. Cohort profile: the 1982 Pelotas (Brazil) birth cohort study. Int J Epidemiol. 2006;35:237–242.

15. Boing AC, Peres KG, Boing AF, Hallal PC, Silva NN, Peres MA. EpiFloripa Health Survey: the methodological and operational aspects behind the scenes. Rev Bras Epidemiol. 2014;17:147–162.


Source: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4938277/