In research, some units of analysis are more difficult to study than others. In survey research, where people are the units of analysis, some people (or, more generally, groups of people with certain characteristics) are more difficult to study than others, and the term “hard-to-reach” has been used to evoke the resources needed to sample, recruit, and secure participation from individuals with these characteristics. This may be due to their geographical or physical location, or to their social and economic situation; these populations have also been described as “vulnerable” or “hidden” (although these terms have been critiqued).
1.1. Hard-to-Reach Populations
Lambert and Wiebel note that hard-to-reach populations might benefit most from knowledge about them. For example, racial and ethnic minorities and people of low socioeconomic status (SES), who have been described as hard-to-reach, experience poorer health status and poorer educational achievement than non-minorities and people of higher SES.
Several factors may account for the difficulty of conducting research on these populations. For remote or geographically isolated populations, traveling to conduct the research, coupled with possible language and cultural barriers, may be prohibitively expensive. However, even research using easily identified local populations in the United States (e.g., inner-city adolescents) can be difficult.
The result is missing data: despite the best intentions of researchers, missing data are common in survey research involving hard-to-reach populations. Missingness may take either (or both) of two forms: (a) lack of participation by cases identified in the sampling frame (or that would have been identified had the sampling frame been known) and targeted for recruitment; and (b) participant dropout in longitudinal studies. (A third form of missing data, failure to answer specific questions in a survey, will not be addressed here; indeed, others argue that this form of missing data is “little more than a nuisance”.) Although they are typically discussed as different issues, (a) and (b) are two sides of the same coin: failure to obtain data from targeted research participants either at the point of initial recruitment or at the point of longitudinal follow-up. Missing data do not, by themselves, pose a threat to the validity of study results. However, when reasons for missingness are correlated with variables to be studied, selection bias results and threatens validity.
Consider, for example, how missing data might occur in a study of inner-city adolescents. Inner-city neighborhoods are, by definition, impoverished; and given the overlap of poverty and race/ethnicity in the U.S., they are heavily populated by racial and ethnic minorities. Some have studied the “hard-to-reach” nature of the Black American research experience, and several studies have demonstrated lower general levels of trust among people of low SES and racial/ethnic minorities, including distrust of researchers. Thus, response rates may be low in research conducted in inner-city neighborhoods.
Impoverished households are also more prone to residential mobility  , making it more difficult to identify an accurate sampling frame when the research population is geographically defined, and to find research participants (either at enrollment or follow up) once a sampling frame has been established. Perhaps even more problematic is the violence and victimization that is often common in inner-city neighborhoods   ; this may lead researchers to simply forego studying inner-city residents, or target neighborhoods with less violence, given the risk of injury to members of the research team. Alternatively, researchers may choose to study inner-city residents in safe environments (e.g., schools, clinics) or using safer research venues (e.g., telephone or on-line surveys); yet, each of these approaches risks missing significant and important segments of the population (e.g., students prone to chronic absenteeism or who have dropped out of school; adults who do not have working phones).
Beyond the risks of selection bias in studies that focus on hard-to-reach populations, a more insidious problem involving exclusion of underrepresented populations has been noted. The National Institutes of Health requires underrepresented populations (i.e., women and people of color) to be included in studies it funds. While this acknowledges the problem, it is unclear whether it solves the problem. Vulnerability is nested, particularly within hard-to-reach populations. As noted, poverty tends to be nested within race, as are high residential mobility rates. Poor health is further nested within race, largely because of poverty. Social isolation is likely nested within all of these conditions as well. Thus, a mandate to include underrepresented populations does not guarantee that the most vulnerable members of these disadvantaged populations will be included in research. As a result, even studies of disadvantage may be biased, with important negative consequences.
1.2. Missing Data: The Problem and Approaches to Address It
Of course, the easiest way to address missing data is to avoid it. Community-based participatory research, participatory action research, and similar approaches have been suggested as ways to accomplish this. These designs involve the community of interest from the point of study inception (including choice of research questions and wording of survey items), and the buy-in generated among members of the community makes their participation more likely. However, this approach is limited by the fact that many study designs cannot be changed to reflect community input; moreover, even with high levels of community buy-in, missing data are inevitable.
An alternative approach involves data analysis. While a number of data analysis procedures accommodate missing data (e.g., multiple imputation, maximum likelihood), they all assume that missing data are ignorable. Missing data are non-ignorable if missingness (R) depends on the unobserved values of the outcome variable (Y); to the extent that missing data are non-ignorable (also termed Missing Not at Random, MNAR), results are potentially biased. In contrast, missing data are ignorable if R depends only on the observed predictor variable(s) X and not on Y itself (Missing at Random, MAR), or if R depends on neither X nor Y (Missing Completely at Random, MCAR). Essentially, if a random sample is strictly representative of the population that generated it, any unsampled cases (cases that are identified and recruited but that do not participate) are either MAR or MCAR and therefore ignorable. On the other hand, if these cases are MNAR, the sample is not representative. The concept of ignorability has been used most typically for dropout in longitudinal studies, but it is equally applicable to enrollment and representativeness (e.g., in a representative sample, missingness is ignorable).
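These mechanisms, and their consequences, can be illustrated with a small simulation. The sketch below (Python with NumPy; the variable names and the roughly 50% missingness rates are our own illustrative choices, not part of the MYS analysis) generates an outcome Y that depends on a predictor X, deletes cases under MCAR, MAR, and MNAR mechanisms, and fits a complete-case regression of Y on X. The true slope is 1.0; MCAR and MAR leave the complete-case estimate essentially unbiased, while selection on Y itself (MNAR) attenuates it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)       # predictor X
y = x + rng.normal(size=n)   # outcome Y (true slope on X is 1.0)

# Missingness indicators R: True = observed, False = missing
mechanisms = {
    "MCAR": rng.random(n) < 0.5,                   # unrelated to X and Y
    "MAR":  rng.random(n) < 0.2 + 0.6 * (x > 0),   # depends only on observed X
    "MNAR": rng.random(n) < 0.2 + 0.6 * (y > 0),   # depends on Y itself
}

# Complete-case regression of Y on X under each mechanism
slopes = {}
for label, r in mechanisms.items():
    slopes[label] = np.polyfit(x[r], y[r], 1)[0]
    print(f"{label}: complete-case slope = {slopes[label]:.3f}")
```

With a sample this large, the MCAR and MAR slopes sit within sampling error of 1.0, while the MNAR slope is visibly attenuated, which is the sense in which MNAR missingness biases results that assume ignorability.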
Of course, it is difficult (if not impossible) to determine definitively whether the outcome or predictor variables are related to missingness, since by definition they are not measured for missing cases. While empirical means of exploring whether missing data are ignorable have been developed, they are strictly exploratory; they often rely on tenuous assumptions about the missing observations; they are conceptually and mathematically inaccessible to most researchers; and they have not been incorporated into any statistical analysis software. Additional papers have developed procedures to analyze datasets with non-ignorable data; however, these typically also make assumptions about the data and are not available through data analysis software. As an alternative, many papers that use multiple imputation or maximum likelihood to accommodate missing data justify that decision based on papers that have logically (with some empirical support) examined whether the ignorability assumption is plausible. The paper most often used for this purpose is by Graham and colleagues, who argue that missing data in school-based studies are seldom MNAR; others are not so sanguine.
Graham and colleagues use a combination of data analysis and thought experiment to reach their conclusion. The empirical component of their paper follows a common practice of estimating the impact of missing data by using temporally proximal observations as proxy measures of unobserved data; with this approach, they found little evidence that the missing data in their study were MNAR. While such proxies are perhaps the best available substitute for the true values of unobserved data, they are not the same, particularly for vulnerable populations, where the vagaries of life reduce consistency of attitudes, beliefs, and behaviors across time. Graham and colleagues’ thought experiment considers multiple random factors that might affect school attendance, and hence participation in a school-based survey. While their assumptions very likely hold for non-poverty samples, they are less likely to hold for inner-city samples, where the effects of poverty (e.g., residential mobility, household stress, poor health) may influence school attendance. These consequences of poverty are not universal within this population, however; nor are they constant across time for any given case. Thus, the effects of poverty may manifest themselves for any given student at any given time as risk behavior and missing data. That is, the greater the poverty, and the harder-to-reach the population, the greater the threat to ignorability of missing data.
None of this means that the conclusion reached by Graham and colleagues  is necessarily wrong when applied to hard-to-reach populations. But as the vulnerability of any given population increases, the applicability of their conclusion is increasingly questionable and should not, by itself, be accepted as support for treating missing data as ignorable. Moreover, the conclusion reached by Graham and colleagues is limited to school-based studies and does not necessarily apply to community-based studies. All of these issues raise a critical question: Can research be conducted with hard-to-reach populations without selection bias?
2. The Present Study
The present study addresses this question through analysis of data from the Mobile Youth Survey (MYS), a community-based longitudinal study of poverty and adolescent risk conducted with an inner-city sample in Mobile, Alabama. As is typical of studies of this type, the sampling frame was poorly defined and rates of missing data were high. However, the study was able to access complete public school records for both study participants and potential participants. Thus, auxiliary data were available for both observed and missing cases in each year of the study. This information is important in two ways. First, it extends the findings of Graham and colleagues to community-based studies of hard-to-reach populations; in doing so, it provides researchers conducting similar studies with additional results that can be used in considering whether missing data in their studies are ignorable. Second, this information can be used by researchers without access to auxiliary data to support assumptions that must be made concerning whether or not missing data are ignorable.
The aim of this study is to determine whether systematic differences exist between the students who (a) enrolled in the MYS (E) and those who did not; (b) participated in the MYS during any given year (P) and those who did not; and (c) were retained as MYS participants during consecutive waves of data collection (R) and those who were not. In the following notation, i denotes a person (or case), while j and k denote years (or waves) in the longitudinal data collection sequence. We define participation as a matrix P, where pij = 1 if person i participated in the MYS during year j, and pij = 0 otherwise. We define enrollment as a vector e, where ei = 1 if pij = 1 for any j, and ei = 0 if pij = 0 for every j. Generally speaking, E allows us to examine the representativeness of the sample, while P allows us to examine within-case year-to-year variability in representativeness. We define wave-to-wave retention as a matrix R (which varies by adjacent data waves j and k), where rijk = 1 if pij = 1 and pik = 1; rijk = 0 if pij = 1 and pik = 0; and rijk is undefined if pij = 0 or if i fails to meet the inclusion criteria (see the Methods section) for either year j or year k. R allows us to examine whether missing data due to dropout are informative. In conducting these analyses, we examine how both demographic (i.e., age, sex, race) and functional (i.e., cognitive and behavioral) variables are related to missingness.
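Under these definitions, e and R are fully determined by P and the eligibility criteria. A minimal sketch (Python/NumPy, using a hypothetical 3-person, 4-wave participation matrix; for simplicity, eligibility is set to all-eligible here, whereas in the MYS it comes from the age and neighborhood inclusion criteria):

```python
import numpy as np

# Toy participation matrix P: rows = persons i, columns = years j
# P[i, j] = 1 if person i completed the survey in year j
P = np.array([
    [1, 1, 0, 1],   # participated in years 0, 1, and 3
    [0, 0, 0, 0],   # never participated
    [0, 1, 1, 0],   # participated in years 1 and 2
])

# Enrollment vector e: e[i] = 1 if person i ever participated
e = (P.sum(axis=1) > 0).astype(int)

# Eligibility matrix (all-eligible placeholder for this sketch)
eligible = np.ones_like(P)

# Retention for adjacent waves (j, k = j + 1): 1 if participated in both,
# 0 if participated in j but not k, undefined (NaN) if not a participant
# in year j or ineligible in either year
R = np.full((P.shape[0], P.shape[1] - 1), np.nan)
for j in range(P.shape[1] - 1):
    defined = (P[:, j] == 1) & (eligible[:, j] == 1) & (eligible[:, j + 1] == 1)
    R[defined, j] = P[defined, j + 1]

print(e)   # enrollment indicator per person
print(R)   # wave-to-wave retention, NaN where undefined
```

Note that, as in the text, a case contributes a defined retention value only for pairs of years in which it participated in the first year and met the inclusion criteria in both.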
3.1. Mobile Youth Survey
The Mobile metropolitan statistical area (MSA) has a population of 540,258. The largest city within this MSA is Mobile, which has a population of approximately 200,000. In 2000, 46.1% of the residents of Mobile were Black American, 22.4% of the population lived in poverty, and the median household income was $31,445. Prichard, the second largest city in the Mobile MSA, borders Mobile to the north. In 2000, 83.3% of its residents were Black American and 44.1% of the population lived in poverty, with a median household income of $19,544.
3.2. Study Sample
A total of 8708 adolescents enrolled in the MYS between 1998 and 2007; this constitutes the core sample for the study, although, as noted in the Inclusion Criteria section, the final sample size is somewhat smaller. In 1998, the initial wave of MYS data collection was conducted in the 13 most impoverished neighborhoods in the Mobile MSA (98% of respondents reported that they were Black). In 1990, the population in these neighborhoods was over 95% Black; the median household income was $5190 and the poverty rate was 73%.
When it began, the sampling frame for the MYS was 10- to 18-year-old adolescents who lived in these neighborhoods, with the goal of recruiting the entire population of eligible adolescents. This goal was pursued using both an active (random) and a passive (non-random) recruitment strategy. Since a large number of MYS participants were not contacted through active recruitment and the final sample fell well short of the full population, the MYS sample cannot be viewed as randomly selected.
In 1998, 1771 respondents completed the MYS. The response rate is difficult to calculate because a definitive census of the sampling frame was not available. However, a conservative estimate of a response rate among those actively recruited is 60%  . During each subsequent year, previous participants who remained age-eligible were actively recruited to participate, even if residential mobility took respondents beyond one of the 13 target neighborhoods. By 2005, 38.8% of the MYS participants lived outside of the original target neighborhoods. Residential relocation tended to be clustered, leading to the identification of “expansion neighborhoods”.
Research participants were paid for their time each year ($10 between 1998 and 2004; $15 after 2004). The MYS typically required between 60 and 90 minutes to complete.
3.3. Inclusion Criteria
The first criterion for inclusion in this study is that youths must have been students in the Mobile County Public School System (MCPSS). MCPSS records were used in this study as an auxiliary dataset to assess the characteristics of cases with missing data (including those who were eligible to participate but were never enrolled in the study); therefore, only respondents who could be matched to school records were selected for analysis. Ninety percent of all MYS participants were matched to school records; these were supplemented with non-MYS participants who lived in MYS neighborhoods and attended the same schools as MYS participants. Among the 10% of enrollees who were not matched to school records, approximately half could not be verified by any other source (i.e., records from the Mobile County Juvenile Court, the Mobile Housing Authority, the Mobile Police Department’s Family Intervention Team program for at-risk youths, and the LexisNexis public records database), and we assume that these cases are bogus (some adolescents may have given fake names so as to be allowed to participate multiple times and receive multiple cash incentives). Thus, we obtain an effective match to MCPSS records of approximately 95%. An analysis of Family Intervention Team program records shows that only five of 656 (0.7%) MYS enrollees in the program attended non-public schools, and none were home schooled. This is consistent with national statistics showing that (a) fewer than 5% of Black American youths living in households with annual incomes less than $20,000 attend private schools, and (b) the homeschooling rate for youths living in poverty is 0.7%; the lower private and home school estimates for the MYS sample may reflect the extreme levels of poverty in which MYS enrollees live.
The second inclusion criterion involves age. Public school records provide more-or-less complete data for adolescents living in the MYS neighborhoods, with one major caveat: Alabama law allows students to drop out of school when they turn 16. Thus, age is used as an exclusionary factor in this study. While age was limited by the MYS design (participant eligibility was limited to ages 10 through 18), in the current study, data are limited to students aged 10 through 15. Because students are legally required to attend school through age 15, a neighborhood sample of youths under age 16 (adjusted for students who attend private schools or are home schooled) and the MCPSS census of students under age 16 are coincident; because there is little home-schooling and private-school attendance in these neighborhoods, this adjustment makes little difference. Beyond age 15, however, the neighborhood youth census and the MCPSS census begin to deviate, due to the possibility of school dropout; since dropouts are more likely to engage in risk behaviors and to hold deviant attitudes or beliefs than non-dropouts, including youths aged 16 through 18 in the study would introduce bias. Age eligibility was calculated from date of birth in MCPSS records. Students were eligible if they turned 10 by August 15 of the school year, and were not eligible if they turned 16 before May 16th of the school year.
The third inclusion criterion involves neighborhood. Addresses of MYS participants were geocoded using GIS software, and MYS neighborhoods were identified as geographical areas, bounded by major man-made (streets, railroad tracks) or natural (bodies of water) barriers, in which MYS participants were clustered. Schools serving these neighborhoods that were attended by five or more MYS participants were selected for study. Addresses of all appropriately aged (i.e., under 16) students attending these schools were geocoded, and those living in the MYS neighborhoods were included in this study (both MYS participants and non-participants). Through the use of this inclusion criterion, we eliminated geographical outliers, who may also have been statistical outliers in terms of their characteristics.
Based on these inclusion criteria, the final sample consisted of 7142 MYS enrollees and 25,442 students who were not MYS enrollees but who lived in MYS neighborhoods.
3.4. Demographic Characteristics
Demographic variables examined in this study include sex, race, free lunch eligibility status, and grade, all based on MCPSS records. Sex is straightforward, and for analysis purposes, girl = 1 and boy = 0. Race/ethnicity was identified in these records as Black, White, Asian, Native American, Hispanic, and unspecified. Overall, 99.8% of MYS enrollees were classified as Black or White; this is consistent with MYS self-reported race, where 99.8% also reported themselves to be Black or White. We therefore treated cases that were coded by MCPSS as Asian, Native American, Hispanic, or unspecified as missing. For analysis purposes, Black = 1 and White = 0. School lunch status was coded in the MCPSS records as free, reduced-cost, and paid, with the vast majority of MCPSS participants receiving free lunches; for convenience, and because of the small number of MYS participants who were in the other two categories, the categories were combined to yield a dichotomous measure: “Free” and “Not Free” status. For analysis purposes, free lunch = 1 and paid lunch = 0. Free lunch status was a proxy for vulnerability associated with low SES. For ease of interpretation, grade was centered for analysis purposes at 3rd grade = 0. Finally, 48 MYS neighborhoods were identified; these were divided into two groups: original target neighborhoods and expansion neighborhoods.
3.5. Functional Characteristics
In this study, functional characteristics include school achievement, as indicated by Stanford Achievement Test (SAT) percentile ranks, and violations of the school code of conduct; each is a component of MCPSS student records.
Student achievement in Alabama was assessed annually using the SAT. The SAT, 9th edition, was completed by 3rd through 11th grade students through the 2001-2002 academic year. Beginning in spring 2003, the SAT, 10th edition, was administered annually to 3rd through 8th grade students. While results from the 9th edition are not directly comparable to results from the 10th edition, this study compares students within year, and our use of standardized percentile scores makes cross-year comparisons less problematic. The reading and math subscales of the SAT are used in these analyses.
The MCPSS records violations of the school code of conduct. Violations were classified as A, B, C, D, or E violations, in order of severity, with “A” violations indicating minor infractions (e.g., not wearing clothes conforming to a school’s color code) and “E” violations indicating major infractions (e.g., bringing a gun to school). As an overall measure, violations were weighted for severity (A = 1, B = 2, C = 3, D = 4, E = 5) and summed for each student each year.
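The weighted severity score can be computed directly from the A-E classifications; a minimal sketch (the function name is ours, not MCPSS terminology):

```python
# Severity weights for code-of-conduct violation classes, from the text
WEIGHTS = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}

def weighted_violation_score(violations):
    """Sum severity weights over a student's violations for one year."""
    return sum(WEIGHTS[v] for v in violations)

# e.g., two minor class-A infractions plus one class-C violation
print(weighted_violation_score(["A", "A", "C"]))  # → 5
```

A student with no recorded violations scores 0, and a single class-E violation alone outweighs four class-A infractions, which is the intended effect of the severity weighting.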
3.6. Analysis Plan
In this study, we conduct statistical analyses to determine whether demographic and functional characteristics affect enrollment, participation, and retention. We also examine the magnitude of each observed effect to determine how much it might potentially have biased the MYS sample. Each of the ten waves of MYS data was paired to the subsequent school year (e.g., 1998 MYS (wave 1) corresponds to the 1998-1999 school year). While such comparisons cannot indicate whether the missing data are ignorable or non-ignorable (no analysis can strictly demonstrate this), they can provide an indication of whether the MYS sample is representative of the larger neighborhood population from which it was drawn on indicators that reflect potential sources of bias.
Strictly speaking, a sample is representative only if its characteristics do not differ from those of the population on the specific study variables. This requirement can be relaxed to some extent, although conclusions about representativeness are stronger as the characteristics used to assess representativeness align more closely with the study variables. Thus, while representativeness does not guarantee MAR or MCAR, those conditions do guarantee representativeness; and as the characteristics used to assess representativeness more closely approximate the study variables, any loss of correspondence between representativeness and MAR or MCAR becomes less important and can more safely be ignored.
To assess whether missingness in the MYS data was affected by demographic and functional characteristics (i.e., whether the MYS sample is representative of the population on these variables), models were estimated in a Generalized Estimating Equations (GEE) framework, as implemented in SAS PROC GENMOD. Because all of the outcome variables (E, P, and R) were dichotomous, the models were estimated using a logit link function and a binomial error distribution. We specified the working longitudinal correlation structure as first-order autoregressive. For each outcome variable, two analyses were conducted: first (Model 1), the outcome variable was regressed on the demographic characteristics; second (Model 2), the functional characteristics were added to the previous model. Since neighborhood is completely nested within neighborhood type, we included that nesting in the models. Because of the large number of neighborhoods, we treat neighborhood as a statistical control, reporting only omnibus p values derived from a Type 3 test for each outcome variable.
There is no general agreement on how to calculate effect sizes for GEE models, so we calculated means for different levels of each characteristic as a proxy. For categorical predictors, these are least squares estimates of the probability of the outcome (e.g., enrollment) for each category of the predictor variable. For continuous predictors, these are probability estimates of the outcome variable when the predictor is one standard deviation below and above the mean, with all other variables held at their means (or modal category).
Because differences between raw probabilities are difficult to interpret (e.g., a difference [Δp] of 0.05 is more meaningful in the tail of a distribution than at its center), we converted these differences into a pseudo-measure (h′) of effect size, using the arcsine-transformation procedure suggested by Cohen: h′ = |2 arcsin(√p₁) − 2 arcsin(√p₂)|.
Note that h′ is not equivalent to Cohen’s measure of effect size (h) for the simple test of proportions, since h does not take into account the complexity of repeated measures or multiple covariates, nor is it meant to compare two points in a continuous distribution (e.g., M ± S). Nonetheless, it provides a guide for interpreting the magnitude of reported effects. Cohen specifies small effect sizes in the range of h = 0.2, medium effect sizes in the range of h = 0.5, and large effect sizes in the range of h = 0.8.
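Cohen's arcsine transformation for two proportions takes only a few lines of Python; applied to the enrollment probabilities by race reported in the Results (p = 0.218 versus p = 0.083), it reproduces the h′ = 0.387 reported for that contrast:

```python
import math

def cohens_h(p1, p2):
    """Effect size for a difference between two proportions,
    via the arcsine transformation phi = 2 * arcsin(sqrt(p))."""
    return abs(2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2)))

# Enrollment probabilities by race from Table 2: Black vs. White students
print(round(cohens_h(0.218, 0.083), 3))  # → 0.387
```

The transformation stretches differences near 0 and 1 relative to differences near 0.5, which is exactly the variance-stabilizing property motivating its use here over raw Δp.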
Table 1 provides descriptive statistics for the variables that structure the analyses reported in this section: enrollment, participation, and retention. Table 1(a) shows the frequency distributions for categorical variables and the means and standard deviations for continuous variables for (a) the entire sample; (b) students enrolled in the MYS; and (c) students not enrolled in the MYS. The first thing to note is that, for the total sample, 107,689 observations are nested within 32,584 students. Observations consist of annual slices of data derived from school records for each student during the years they were age-eligible for inclusion in the study. New students entered the sample each year and remained in the sample until they moved to a non-MYS neighborhood or aged out of the study; thus, students in the sample had an average of 3.30 observations. Second, observations are clearly not independent, so the statistics reported in the table should be interpreted cautiously. They nonetheless show that, overall, the students in the sample were overwhelmingly Black; they were evenly split between girls and boys; they were living in poverty; their performance on the SAT was well below average; and they were caught violating the MCPSS Code of Conduct with some frequency. Table 1(a) also provides a comparison of demographic and functional characteristics of MYS enrollees and students living in MYS neighborhoods who were not enrolled in the MYS. It is important to recognize that MYS enrollees remain in that classification for each year for which MCPSS records are available for them, even if they did not actually participate in the MYS that year.
Table 1(b) shows statistics for the demographic and functional variables as a function of participation status. Panel A is the same as Panel A in Table 1(a), since the entire sample can be divided into participatory groupings. However, unlike enrollment, a student enrolled in the MYS does not automatically retain MYS participant status each year: during the years in which they participated in the MYS, students were classified as MYS participants, but during the years in which they did not, they were classified as non-participants. This is reflected in the lower number of observations for MYS participants (14,448) than for MYS enrollees (27,761).
Table 1. (a) Descriptive statistics by MYS enrollment status. (b) Descriptive statistics by MYS participation status. (c) Descriptive statistics by MYS retention status.
(a) aObservations indicate the number of discrete data points obtained over multiple years; because each person potentially contributed data during multiple years, Observations ≥ N. bMean. cStandard Deviation.
(b) aObservations indicate the number of discrete data points obtained over multiple years. bMean. cStandard Deviation.
(c) aObservations indicate the number of discrete data points obtained over multiple years. bMean. cStandard Deviation.
Table 1(c) shows statistics for the demographic and functional variables as a function of retention status. This table only includes data for MYS enrollees who were eligible to participate in the MYS during at least one pair of consecutive years. Observations reflect MCPSS records for the second of each pair of eligible years. Students who participated during the second year were classified as retained for that year, and students who did not participate during the second year were classified as not retained; thus, the same student could be classified in each of the two groups in different years. The number of observations is considerably smaller for retention than for enrollment and participation (10,691 total observations for retention versus 101,874 for enrollment and participation). As with Table 1(a) and Table 1(b), the statistics reported here are based on non-independent observations and should be interpreted cautiously.
Table 2 shows results for enrollment. For Model 1, race, lunch status, and grade level were all statistically significant predictors of enrollment. With respect to race, Black Americans were more likely to enroll in the MYS than White students (p = 0.218 versus p = 0.083). Free lunch recipients were more likely to enroll than those who did not receive free lunch (p = 0.143 versus p = 0.128). Grade level was positively related to MYS enrollment, with p = 0.221 for students one standard deviation below the mean (M − S) and p = 0.238 for students one standard deviation above the mean (M + S).
Table 2. Determinants of MYS enrollment and associated probabilities. Model 1: Ŷ = b0 + b1 race (white) + b2 sex (boy) + b3 lunch (free) + b4 grade + b5 neighborhood; Model 2: Ŷ = b0 + b1 race (white) + b2 sex (boy) + b3 lunch (free) + b4 grade + b5 neighborhood + b6 SAT (R) + b7 SAT (M) + b8 violations. aStandard Error. bLeast Squares means controlling for other variables in the equation; these have been converted to probabilities. cMean. dStandard Deviation.
Model 2 shows that, after controlling for demographic factors, both reading and math SAT percentiles and school violations statistically affect the probability of MYS enrollment. For the SAT percentile scores, the effect was negative: for reading, the probability of enrollment decreases from p = 0.251 for students one standard deviation below the reading mean (M − S) to p = 0.227 for students one standard deviation above it (M + S); for math, the probability decreases from p = 0.244 to p = 0.233. In contrast, weighted school violations have a positive effect on enrollment, with the probability increasing from p = 0.234 (M − S) to p = 0.244 (M + S).
Table 2 also reports the least squares means for the categorical variables (in this case, the probability of enrollment in the MYS given a particular characteristic, controlling for the other characteristics in the model); the probability of enrolling in the MYS when a given continuous characteristic is one standard deviation below and one standard deviation above its mean; and the effect size (measured as h′). Of particular note is the small magnitude of most of the effect sizes (recall that Cohen specifies small, medium, and large effect sizes as 0.20, 0.50, and 0.80). Only race (h′ = 0.387) even approaches a moderate effect size; the effect sizes of the other significant predictors range between miniscule and very small (all h′ < 0.06).
In addition to the terms reported in Table 2, a supplemental analysis was run adding 12 interaction terms (four demographic variables by three functional variables) to the main effects in Model 2. Of these, two were statistically significant predictors of enrollment: sex × SAT reading percentile (Z = 1.98, p = 0.048) and grade × school violations (Z = 3.03, p = 0.002). In the interest of brevity, these interaction results are not included in Table 2.
By definition, participation rates are lower than enrollment rates (unless retention equals unity, which was not the case in the MYS, and rarely occurs in a community-based study). Table 3 shows that in Model 1, race, free lunch status, and grade were all statistically significant predictors of participation. Black students had higher participation rates than White students (p = 0.195 versus p = 0.015). Students receiving free lunch were more likely to participate than students who did not (p = 0.048 versus p = 0.030). And students in higher grades were more likely to participate than those in lower grades (p = 0.119 for students one standard deviation below the grade mean versus p = 0.159 for students one standard deviation above it).
Model 2 shows the effects of functional characteristics on participation controlling for demographic characteristics. As was the case with enrollment, both reading and math SAT percentile scores were negatively associated with participation. For reading, the probability decreases from p = 0.150 for students one standard deviation below the reading mean to p = 0.122 for students one standard deviation above the reading mean; for math, it decreases from p = 0.144 to p = 0.127. Weighted school violations had a positive effect on participation, with the probability increasing from p = 0.119 (M − S) to p = 0.154 (M + S).
Table 3. Determinants of MYS participation and associated probabilities. Model 1: Ŷ = b0 + b1 race (white) + b2 sex (boy) + b3 lunch (free) + b4 grade + b5 neighborhood; Model 2: Ŷ = b0 + b1 race (white) + b2 sex (boy) + b3 lunch (free) + b4 grade + b5 neighborhood + b6 SAT (R) + b7 SAT (M) + b8 violations.
aStandard Error. bLeast Squares means controlling for other variables in the equation; these have been converted to probabilities. cMean. dStandard Deviation.
Table 3 also reports the estimated probability of annual participation given a particular characteristic as well as the effect sizes for these characteristics. As with enrollment, race is the only variable that approaches a moderate effect size (h’ = 0.381). All other effect sizes are very small (all h’s < 0.12).
As was the case with enrollment, a supplemental version of Model 2 was also run with the twelve demographic × functional interaction terms. Two were statistically significant predictors of participation: school lunch status × SAT reading percentile (Z = −2.53, p = 0.011) and grade × school violations (Z = 2.55, p = 0.011). These results are not reported in Table 3.
Table 4 shows results for retention. In Model 1, race and school lunch status were statistically significant. Retention for Black students was higher than for White students (p = 0.570 versus p = 0.351), and higher for students who received free lunch than for those who did not (p = 0.523 versus p = 0.396). Model 2 shows results for functional characteristics controlling for demographic characteristics. Here, only SAT reading percentile was a statistically significant predictor of retention: for students with an SAT reading percentile score one standard deviation below the reading mean, p = 0.715, compared with p = 0.665 for students with a score one standard deviation above the mean.
Table 4. Determinants of MYS retention and associated probabilities. Model 1: Ŷ = b0 + b1 race (white) + b2 sex (boy) + b3 lunch (free) + b4 grade + b5 neighborhood; Model 2: Ŷ = b0 + b1 race (white) + b2 sex (boy) + b3 lunch (free) + b4 grade + b5 neighborhood + b6 SAT (R) + b7 SAT (M) + b8 violations.
aStandard Error. bLeast Squares means controlling for other variables in the equation; these have been converted to probabilities. cMean. dStandard Deviation.
Table 4 also reports the estimated probability of annual retention given a particular characteristic as well as the effect sizes for these characteristics. Here, race approaches a moderate effect size (h’ = 0.443), and the effect size for free lunch status falls in the small-to-moderate range (h’ = 0.256). All other effect sizes range between miniscule and very small (all h’s < 0.11).
None of the 12 demographic × functional terms included in the supplemental analysis were statistically significant.
The study of minority youths living in poverty has proven difficult, resulting in non-representative samples and high attrition rates (see   ); the result is missing data that cannot be ignored     . Even when response rates are high, the possibility of bias, whether or not overtly detectable, is not eliminated   .
In some studies, such as those involving impoverished populations, random sampling strategies may not be successful due to factors such as high residential mobility   . In studies like the MYS, where we attempted to study the population of adolescents living in impoverished neighborhoods, random sampling is arguably irrelevant; however, as the participation rate decreases from unity, this strategy increasingly generates a convenience sample (i.e., those who chose to participate). Thus, it is important to examine missing data mechanisms that may lead to non-representativeness in all research studies, especially those involving hard-to-reach populations, because when important characteristics in the sample differ from the population, missing data are not ignorable. We developed a protocol for doing this with the MYS, an important dataset that has been used in over 60 publications to date.
While matching sample demographic characteristics to those of the population is an important step in any study, it is seldom sufficient to demonstrate representativeness; the reason is largely due to the previously-discussed nesting of vulnerability that is particularly evident in hard-to-reach populations. When research questions involve beliefs, attitudes, and behaviors, it is also important to determine whether the sample is representative of the population in terms of functional characteristics like cognitive abilities and behaviors. This study assesses the extent to which the MYS sample deviates from the population of adolescents living in MYS neighborhoods, in terms of their enrollment, year-to-year participation, and retention, and therefore, the extent to which missing data may be nonignorable.
Results suggest that demographically, the MYS sample falls short of being representative of the population in three key areas: grade, race, and school lunch status. Sex, by contrast, had no effect on any of the outcome variables, as measured by both statistical significance and effect size. In terms of grade, statistically different rates of MYS enrollment and participation were evident; for example, students one standard deviation below the grade mean were less likely to enroll than students one standard deviation above the mean (p = 0.221 versus p = 0.238). The difference, while statistically significant, is quite small (Δp = 0.017; h’ = 0.04). The effect for participation is also significant and also small (Δp = 0.040; h’ = 0.116). The statistical significance coupled with the small effect size is likely due to the very large sample size used for these analyses (e.g., for the enrollment demographic analysis, N = 107,689 observations across 10 years), which reduces the standard error of the estimate dramatically. We see this same pattern (a statistically significant estimate coupled with a small effect size) throughout the results. The differences for grade, although small, may nonetheless suggest that parents of younger students were more likely to withhold consent for MYS enrollment or participation; they may also suggest that younger students were less likely to venture out of their homes and walk to the survey administration site.
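The point that a very large N yields statistical significance even for tiny differences can be illustrated with the usual standard-error formula for a sample proportion, SE = √(p(1 − p)/n). A rough sketch (an illustration, not the paper's actual significance test, which was model-based):

```python
import math

def se_proportion(p, n):
    """Standard error of a sample proportion."""
    return math.sqrt(p * (1 - p) / n)

# With N = 107,689 observations, the standard error around an
# enrollment probability of roughly 0.23 is tiny...
se = se_proportion(0.23, 107689)
print(round(se, 5))  # → 0.00128

# ...so even a difference as small as the grade effect (Δp = 0.017)
# spans many standard errors and is easily statistically significant.
print(round(0.017 / se, 1))  # → 13.3
```

With a sample in the hundreds rather than the hundreds of thousands, the same Δp would fall well within sampling noise, which is why effect sizes rather than p-values carry the interpretive weight here.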
Race has a significant effect on both enrollment and participation; but here, effect sizes border on robust (Δp = 0.135, h’ = 0.387 for enrollment; Δp = 0.08, h’ = 0.381 for participation). Perhaps because very few White students lived in the MYS neighborhoods (4.4%), the perception may have developed that the MYS was a study of Black adolescents, and White adolescents largely self-selected out of the study. Alternatively, demographic distributions may not be geographically uniform within any given neighborhood, and there is evidence  that rates of enrollment and participation were higher in certain sectors of each neighborhood than in others. This was particularly the case for expanded neighborhoods, where MYS households tended to cluster on selected street blocks. We recruited most heavily on these blocks, and previous non-participants were more aware of the study than adolescents living at some distance from the selected blocks. School lunch status had a significant effect on all three outcomes; effect sizes were small for enrollment (Δp = 0.015, h’ = 0.044) and participation (Δp = 0.018, h’ = 0.094), but larger for retention (Δp = 0.127, h’ = 0.256).
In the introduction, we suggested that missing data is a bigger problem in studying vulnerable and hard-to-reach populations (e.g., racial minorities, those living in poverty) than in studying populations that are not vulnerable or hard to reach. Even though our reported results suggest that the MYS sample may not be strictly representative of the population, those for race and school lunch status run contrary to what we would expect in studying hard-to-reach populations: racial minorities and impoverished youths were more likely to enroll and participate than their non-minority and less-impoverished counterparts. Thus, since the purpose of the MYS was to study the most vulnerable youths, these differences suggest that it largely succeeded in its mission. We should note, however, that the neighborhoods were overwhelmingly Black and impoverished, so these differences may be less important than their probabilistic magnitude would suggest. One additional explanation might argue that since MYS participants were paid for their time, those living in greater poverty (i.e., receiving free lunches) would be more likely to participate. While this explanation may be partially valid, its importance is undermined by three factors: a) the size of the payments ($10 or $15, depending on the year) was small; b) the neighborhoods were impoverished, and even those students who did not qualify for free lunch did not live in wealthy households; and c) Blacks were overrepresented in the MYS even controlling for free lunch status, which suggests that the most vulnerable adolescents in the neighborhoods chose to participate in the MYS.
Even though SES is related to cognitive ability  and behavior   , we nonetheless expected to see a wide range of cognitive abilities and behaviors in the neighborhoods we studied; indeed, this was the case. SAT reading and math scores both ranged between the 1st percentile and the 99th percentile, and weighted school violations ranged between zero and 81 in a given year. But we found that after controlling for demographic factors, neither standardized school test scores nor behavioral violations of school codes of conduct differed in a meaningful way by enrollment, participation, or retention. To be sure, a number of these differences were statistically significant; for example, SAT reading scores had a statistically significant effect on enrollment. However, the effect size was quite small (Δp = 0.024; h’ = 0.056); the largest effect size among all of the functional characteristics was for SAT reading on retention (Δp = 0.05, h’ = 0.108). All differences for functional characteristics showed that the most vulnerable segments of the population (i.e., those with lower cognitive abilities and those whose behaviors violated school codes of conduct and/or resulted in disciplinary action) were oversampled.
Overall, demographic variables did not interact with functional variables to affect outcomes. Of the 36 variable pairs tested across three outcomes, only four results achieved statistical significance, and only one pair achieved significance across two different outcomes: the grade by school violations interaction was a significant predictor of both enrollment and participation, but not of retention (p = 0.241). Further examination showed that the effect of school violations on both enrollment and participation became increasingly positive as grade increased. The MYS was not conducted in schools during the school year, so school discipline (e.g., suspension) as a response to violations would not explain the effect. However, younger adolescents are more likely than older adolescents to be subject to parental monitoring and restrictive rules―rules that might prevent them from MYS participation. Thus, adolescents who were “in trouble” (e.g., who had been disciplined for violating school rules) may well have come under even greater parental monitoring and restrictive rules, and their MYS enrollment and participation may have suffered as a consequence. But overall, demographic and functional characteristics did not interact in predicting outcomes, and it is reasonable to treat them as separate characteristics in the analysis.
One methodological note is of importance. The rates of enrollment (0.258) and annual participation (0.134) are quite low, although the annual retention rate (0.729) is quite reasonable in a study of this type. In the 13 initial target neighborhoods, the enrollment and participation rates were much higher than these estimates suggest; for example, in the largest of the initial target neighborhoods, the 1998 participation rate was approximately 0.50, and the enrollment rate for that neighborhood was approximately 0.75  . The overall much lower enrollment and participation rates for the study occur because expansion neighborhoods were added as the study progressed. Thus, the results reported here are based on all students (of appropriate ages; see the Method section for inclusion criteria) in all original and expansion neighborhoods, even though some of these neighborhoods had very few if any MYS participants during the study’s early years. It should also be noted that the expansion neighborhoods, though still more impoverished and with higher concentrations of minorities than the median neighborhood in the Mobile MSA, were, by definition, not as poor as the initial target neighborhoods, and had greater racial heterogeneity.
The general conclusion, then, is that, with the exception of race (and to a lesser extent, school lunch status), the demographic and functional characteristics of the MYS sample were very similar to those of students living in MYS neighborhoods who were not enrolled in the study. Even the racial differences in enrollment, participation, and retention and the school lunch difference in retention indicate that missing data were least common in the most vulnerable segment of this population. Thus, the hardest-to-reach segment of this hard-to-reach population was not only as likely but actually more likely to enroll, participate, and be retained at follow-up than other, less-vulnerable segments of the population. In terms of the question posed earlier, we apparently were able to study this population without sampling or retention bias. It is, then, possible to reach hard-to-reach populations, in this case low-income minority adolescents.
We should briefly describe how we conducted the study, because that potentially influenced the results we were able to obtain. First, written parental consent was obtained for all enrollees. But we explicitly asked for consent for each adolescent to participate each year until he or she turned 19 (when the participant aged out). This allowed each enrollee to participate each year, whether or not direct contact with him or her or a parent/guardian occurred during that year. Word spread quickly that the MYS was in progress each year, and it was well regarded in each of the neighborhoods (to the point where study participants were overheard bragging to each other about how many times they had participated). Thus, each year many adolescents participated even though they were not individually and/or directly recruited.
How did the MYS establish a positive reputation in these neighborhoods? We can only speculate, but several factors may be relevant. First, it was a community-based survey, and the research team spent considerable person hours in each neighborhood each year knocking on doors and talking with both adults and youths. In other words, the MYS had a very visible presence in each neighborhood, and the survey became a special event in neighborhoods where special events are rare. Certainly, the fact that participants were paid for their time was important; but this rate of pay was considerably lower than what is often paid in similar studies. Surveys were also administered in the neighborhoods where each participant lived―or if that was inconvenient for the participant, in his or her home. Arguably, then, the fact that the research team was willing to go into their neighborhoods (which many of them recognized as potentially dangerous places) and their homes earned a great deal of respect.
Finally, a word about the members of the research team is warranted. Each year, an internship was offered, at first to college students in Alabama, then increasingly to students nationally. Small stipends were provided for the interns, to cover their living expenses. But in deciding whom to select for the internship (typically, applicants outnumbered available positions by a factor of three to four), priority was given to applicants with (a) research experience (preferably in the field) and (b) an appreciation for the effects of poverty and a strong desire to better understand how it affects people’s lives. Thus, the interns were both very respectful of the people with whom they interacted and very good listeners, learning as much from their day-to-day experiences in the neighborhoods and interactions with neighborhood residents as from the actual data they collected.
Strengths and Limitations
The commitment of the research team, coupled with the time it spent in each neighborhood recruiting and surveying adolescents, allowed it to earn the respect and trust of neighborhood residents. While undoubtedly a strength of the MYS, this came at a monetary cost: approximately $200,000 per year to survey between 2000 and 3000 respondents. Not all research endeavors will have this type of budget. Moreover, the fact that the MYS research team returned year after year also contributed to the trust and respect that was established (the smallest sample was obtained during Year 1 and increased nearly each year thereafter); this long-term commitment also increased the overall budget for the study. So, the conclusions about studying a hard-to-reach population without selection bias may be limited by budgetary constraints. Second, the auxiliary dataset we used to establish the representativeness of the study was age-limited. Within the MYS neighborhoods, nearly all youths attend public schools; but they are only required to do so through age 15, since they can legally drop out at age 16. School dropout was particularly important in the MYS neighborhoods, where high-school graduation rates were as low as 30%. Thus, we limited the analysis of MYS participants and non-participants to youths aged 10 through 15. We are therefore not able to draw any conclusions about the representativeness of the older segment (aged 16 - 18) of the MYS sample. However, given the findings for the younger MYS participants (aged 10 - 15), and without any theoretical reason to believe that they should be different for older youths, this may be a relatively minor limitation. Finally, measures of functional characteristics used in this study do not directly correspond to risk behaviors in the MYS; therefore, we cannot say with certainty that the representativeness we found extends to all cognitive, attitudinal, and behavioral domains.
However, even the limited set of measures we used goes well beyond what most studies have available to address the issues of representativeness and missing data in hard-to-reach populations.
This study suggests that researchers can conduct studies of impoverished adolescents without bias, so long as they carefully attend to issues of establishing legitimacy, respect, and trust in the communities they study. This study may also generalize to other hard-to-reach populations, although more research is needed to make this leap. Unfortunately, we do not know definitively how to establish legitimacy, respect, and trust in vulnerable communities. Our previous discussion suggests ways that this may be accomplished, and other papers (e.g.,    ) explore effective ways of conducting research with hard-to-reach populations; but again, more research is needed, particularly in community-based (rather than clinical or school-based) studies.
Over 60 papers have been published using the MYS data. This study benefits those papers (and future papers that use MYS data) by suggesting that missing data in the MYS sample are largely ignorable. Our results indicate that while demographically, the MYS sample (ages 10 through 15) is not strictly representative of the population, the deviations suggest that those who were eligible but did not participate did so for largely ignorable reasons. Moreover, the survey research literature shows lower response rates for minorities and people living in poverty. The fact that we find higher rates of enrollment and year-by-year participation for Black adolescents and those who are eligible for free lunch suggests that the strategy of focusing on the most at-risk segments of the MYS neighborhoods was successful, and further supports the idea that the neighborhood per se is an inappropriate sampling frame. Further, results show that functionally, while there are significant effects for reading and math scores and for school violations, these differences are quite small and show higher enrollment, participation, and retention rates for the most vulnerable segment of the population. Thus, the results provide support for treating missing data in the MYS as ignorable.
Finally, and perhaps most important, this research extends results and theoretical arguments by others (e.g.,  ) suggesting that missing data do not create bias in studies of adolescent risk behavior. This is important because it allows researchers to use analytic approaches that accommodate missing data (e.g., maximum likelihood estimation, multiple imputation) so long as missing data are ignorable. The results presented here may encourage more researchers in the future to study vulnerable populations without fear that their results will be rejected because they cannot demonstrate that missing data do not bias their results.
The research reported here was partially supported by the National Institutes of Health Office for Research on Minority Health through a cooperative agreement administered by the National Institute for Child Health and Human Development (HD30060); a grant from the Center for Substance Abuse Treatment, Substance Abuse and Mental Health Services Administration (TI13340); a grant from the National Institute on Drug Abuse (DA017428); a grant from the Centers for Disease Control and Prevention (CE000191); a grant from the National Institute for Child Health and Human Development (HD058857); The University of Alabama; the cities of Mobile and Prichard; the Mobile Housing Board; and the Mobile County Health Department.