Physical activity (PA) has been defined as any voluntary movement that results in energy expenditure (Caspersen, Powell, & Christenson, 1985). The plethora of benefits that PA imparts to health and wellbeing are well documented; however, physical inactivity remains a global public health concern stretching across high-, middle- and low-income countries (Heath et al., 2012). The World Health Organisation (WHO) estimates that 1.9 million deaths, 21% - 25% of breast and rectal cancer, 27% of diabetes and 30% of ischemic heart disease are attributable to physical inactivity. Additionally, at least 2.6 million deaths worldwide are a result of complications associated with being overweight or obese (WHO, 2018). A growing range of interventions has been developed and delivered to increase PA in various populations; however, their impact on behaviour has often been small to moderate, and the longevity of this effect is inconsistent, so further research is needed (Bauman et al., 2012).
Previous research has shown that a diverse range of personal and environmental factors influence PA. Therefore, the key to understanding PA across different populations and developing effective interventions is to identify and investigate the underlying factors which influence PA behaviours (Lubans, Foster, & Biddle, 2008) . Further, it is important to distinguish between factors which serve as correlates, mediators and/or determinants of behaviour within specific groups (Biddle, Whitehead, O’Donovan, & Nevill, 2005) . It follows that understanding the role of factors such as demographics, health status, self-efficacy, access, and socioeconomic status within smaller and more specific groups is an important prerequisite to effective intervention (Trost et al., 2002) . Cross sectional research has also highlighted particular subgroups of society that are largely inactive. These groups include: ethnic minorities, lower socioeconomic groups, those with disabilities, adolescents, and females (Banks-Wallace, 2000) .
Globally, females are less physically active than their male counterparts and therefore more susceptible to the health problems associated with sedentarism (Butcher, Sallis, Mayer, & Woodruff, 2008) . Multiple disciplines claim that single-sex programs specially tailored to meet the unique needs of females are imperative yet inadequately researched (Wiese-Bjornstal & Lavoi, 2007) . For example, Biddle, Braithwaite and Pearson’s (2014) meta-analysis investigating the effectiveness of interventions to increase PA among girls aged 5 - 11 concluded that interventions aimed at young females are most effective when they are single sex, multi-component (e.g. combine PA with diet education), relatively short, and atheoretical but of high quality. In particular, multi-component school-based interventions for young females have been most effective when they are supported by curriculum physical education (PE), as they specifically address the needs of young females and attenuate some of the common barriers they face (Camacho-Miñano, LaVoi, & Barr-Anderson, 2011) . However, explicit consideration of how interventions attend to and measure influences which impact young females’ participation in PA is lacking from literature.
Levels of physical inactivity within the female population vary, and age is considered a significant correlate in the prevalence of sufficient PA (Butcher et al., 2008; Pate et al., 2009) . According to the UK Department of Health, individuals should engage in at least 60 minutes of moderate to vigorous PA every day up to the age of 18, whilst those 19 - 64 years old should engage in the equivalent of 30 minutes of moderate PA on at least 5 days a week (Department of Health, 2016). Pre-adolescent children (approx. 10 - 13 years) are reported to be the most active segment of society in the UK, with 34% of pre-adolescent girls meeting the recommended levels of PA. However, PA participation declines with age for young females, and strikingly, drops to 0% for adolescent females (Townsend et al., 2012) . The decline found in PA is inclusive of competitive sports, and several studies report that by the age of fifteen the drop-out rate seen in young females is significantly higher than their male counter-parts (De Knop, Engström, Skirstad, & Weiss, 1996; Fraser-Thomas, Côté, & Deakin, 2008; Enoksen, 2011) . The NHS’ National Clinical Director for Children, Young People and Transitions to Adulthood has emphasised the importance of supporting and providing services to young people up to the age of twenty-five, noting the transitions they experience until this age and the culmination of brain development (Cornish, 2015) . The disparity between the recommended and actual PA levels, along with the more pronounced decline in PA among females, identifies young females as a high priority group for PA promotion and intervention. Furthermore, the age-related drop-off apparent by the age of fifteen as well as the developmental implications up to the age of twenty-five, highlights a crucial opportunity to intervene to establish long-term PA habits (Keating, Guan, Piñero & Bridges, 2005) .
Early PA interventions have typically focused on intrapersonal factors as predictors of PA participation (Biddle et al., 2005; Giles-Corti, Timperio, Bull, & Pikora, 2005). Social psychological theories such as Theory of planned behaviour (Ajzen, 1985, 1991), Social cognitive theory (Bandura, 1986, 2004) and Self-determination theory (Deci & Ryan, 1985, 1991) provide an understanding of the individual-level factors that influence participation in PA (Craike, Symons & Zimmermann, 2009). Although such approaches have proven useful in predicting and explaining human behaviour and can contribute to the development of interventions that target changes beyond the intrapersonal level, their focus is understanding individual-level factors (Vohs & Baumeister, 2016). In contrast, Bronfenbrenner’s (1977) ecological systems theory is concerned with how individuals are influenced by multiple levels and systems of interaction. The theory consists of four levels: Microsystems, which are concerned with the developing person; Mesosystems, which are concerned with interrelations; Exosystems, which are concerned with social structures and institutions; and Macrosystems, which involve institutional structures such as political, legal and economic systems. Health and exercise research has begun to adopt a more multi-level approach and to focus on the broader interaction between individuals and their social and physical environment in order to investigate the wider determinants that shape exercise behaviour (Bauman et al., 2012; Pan et al., 2009). Based on Bronfenbrenner’s (1977) theory, it has been suggested that participation in PA is influenced by the relationships within and between the intrapersonal, interpersonal, organisational/environmental, and policy and legislative levels (Pearson et al., 2014). These interactions have been shown to affect the impact of PA promotion (Biddle et al., 2014).
The Ecological Model of Health Behaviour synthesises these and other interactions and is concerned with understanding behavioural influences that take place at multiple levels, meaning it is able to inform the development of interventions that focus on these broader interactions (Banks-Wallace, 2000). It emphasises the policy and environmental contexts of behaviour, while incorporating social and psychological influences (Sallis, Owen, & Fisher, 2008). Through explicit consideration of multi-level influences, the ecological model can guide the development of more comprehensive and context-specific interventions.
Interventions targeting health behaviours such as smoking or self-management of diseases (e.g. HIV) appear to be most effective when they influence several levels of the ecological framework (Wilson et al., 2012; Grau et al., 2017) . Several PA reviews suggest that considering the multiple levels of influence would be a promising direction for further research (Pearson et al., 2014; Biddle et al., 2005; Biddle et al., 2014) . For example, one study successfully engaged females in PA in a school setting with an intervention that took into account intrapersonal, interpersonal and environmental factors such as self-efficacy, teacher role and equipment (Goodyear, Casey, & Kirk, 2014) . There is currently a gap for a literature review, which collates intervention studies and investigates the factors and measures through a multi-level approach such as the ecological model.
Several reviews have summarized evidence regarding PA promotion in young people, and among various female populations (Salmon et al., 2007; Shaya, Flores, Gbarayor, & Wang, 2008; Van Sluijs, McMinn, & Griffin, 2007) . Although PA intervention literature is abundant, it has been suggested that further studies in the area must pay closer attention to factors that impact young females’ participation in various settings (Cengiz & Ince, 2014) . Some research has explored specific cultural factors (Sharma, 2010) , and others have explored specific intervention settings (Shaya et al., 2008) and demographic factors (Pate et al., 2009) . However, to understand the specific relationships between the factors influencing engagement and retention of young females in PA, a contextualised multi-level theoretical approach to reviewing the evidence is required, and the ecological model can provide a framework for this. Currently in the UK, increasing female PA is high on government agenda (Public Health England, 2018) . A multi-million pound nationally funded campaign “This girl can” has been developed by Sport England to encourage female participation and lessen the disparity between male and female participation. Therefore, in light of the ecological model of health behaviour, this review aims to synthesise the evidence pertaining to the characteristics of PA interventions aimed at UK-based females, to explore the measures used to evaluate intervention impact, and to make recommendations for future PA intervention research through an ecological perspective.
The current study adopted the methodology of a narrative systematic review in order to answer a range of questions and include a range of evidence types (Snilstveit, Oliver, & Vojtkova, 2012). Qualitative synthesis in the form of thematic summaries was used to categorise studies into relevant conceptual framework groupings (intrapersonal, interpersonal, organisational and environmental, and policy and legislative). The Preferred Reporting Items for Systematic reviews and Meta-Analyses guidelines (PRISMA; Liberati et al., 2009), recommended for reliable methodology and clear reporting, were adhered to where possible. The review was pre-registered in the National Institute for Health Research’s international database PROSPERO to avoid duplication and help reduce reporting bias (registration number: CRD42018039427).
2.1. Search Strategy
An electronic literature search was conducted using five databases: SPORTDiscus, PsycARTICLES®, Web of Science™, Scopus®, and Medline® (complete search strategy for Scopus shown in Appendix). The search identified all peer-reviewed articles published up to May 2016. Given the research question, the search was built around three groups of key terms: population, behaviour and intervention. The following synonyms were then used to conduct the search: [“women” OR “woman” OR “girl” OR “female”] AND [“physical activity” OR “exercise” OR “sport” OR “fitness”] AND [“intervention” OR “program” OR “uptake” OR “adherence”]. The terms were entered into each database, filters were applied so that only peer-reviewed articles published in English remained, and where possible, a limiter for UK studies was applied; 7584 articles were exported into the reference management program (Mendeley Ltd. v1.16.1) where duplicates were removed (see Figure 1 “identification”).
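The structure of the search string described above (synonyms joined by OR within each group, and the three groups joined by AND) can be illustrated with a minimal sketch. The function name `build_query` and the dictionary layout are hypothetical, introduced only for illustration; the exact field codes, quoting and limiters differ between databases and are not reproduced here.

```python
# Term groups as listed in the search strategy (population, behaviour, intervention).
TERM_GROUPS = {
    "population": ["women", "woman", "girl", "female"],
    "behaviour": ["physical activity", "exercise", "sport", "fitness"],
    "intervention": ["intervention", "program", "uptake", "adherence"],
}


def build_query(groups):
    """Join synonyms with OR inside each group, then join the groups with AND."""
    blocks = []
    for terms in groups.values():
        quoted = " OR ".join(f'"{term}"' for term in terms)
        blocks.append(f"[{quoted}]")
    return " AND ".join(blocks)


# Hypothetical helper output: a generic Boolean string, not database-specific syntax.
query = build_query(TERM_GROUPS)
print(query)
```

Keeping the three groups separate in this way makes it straightforward to add or remove a synonym in one group without disturbing the rest of the string.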
Figure 1. PRISMA flow chart of the study selection process. Each step was performed by two independent reviewers.
2.2. Inclusion Criteria
The following criteria were used for the inclusion of publications in the narrative systematic review: 1) study reported an intervention with at least two points of contact; 2) physical activity was included in the intervention, including exercise, which is a sub-set of PA (e.g., Caspersen et al., 1985; Aguilar-Cordero et al., 2018); 3) studies recruited participants free from chronic health conditions except for obesity; 4) participants were female only or the study presented an analysis segregated by gender; 5) the mean age reported fell within the range of 14 to 25 years; 6) the intervention was UK-based; 7) studies were reported in the English language.
2.3. Identification of Studies
Articles were selected by two independent reviewers (RH, RO), who screened potential articles by 1) title, 2) abstract, and 3) full text, and discussed discrepancies until they agreed on whether articles met the inclusion criteria specified above (see Figure 1). Prior to discussion of the discrepancies, the reviewers agreed on 81% of papers (17 out of 21). Where papers fit the inclusion criteria but the analysis did not separate males and females or the target age group, corresponding authors were contacted and asked for additional data. In one case, we received additional data, which allowed for the article’s inclusion.
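The agreement figure reported above is a simple percent agreement, which can be checked with the short sketch below. The function name `percent_agreement` is hypothetical. Note that percent agreement does not correct for chance; a chance-corrected statistic such as Cohen's kappa would require the reviewers' full include/exclude decisions, which are not reported here.

```python
def percent_agreement(agreed, total):
    """Return the share of items on which two reviewers agreed, as a percentage."""
    if total <= 0:
        raise ValueError("total must be a positive number of items")
    return 100.0 * agreed / total


# Values taken from the text: the reviewers agreed on 17 of 21 papers.
agreement = percent_agreement(17, 21)
print(f"{agreement:.0f}%")  # prints "81%"
```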
2.4. Study Characteristics Table
A study characteristics table, with topics agreed upon by all four authors, was designed specifically for this study to extract relevant information from the articles included in the review. The table provides a summary of the characteristics of each article, and includes the following topics: references, theoretical framework where stated, recruitment details, participant details (e.g. age, gender), intervention details (e.g. setting, length, activity), outcome measures, design, main findings, and respective ecological level(s). The 21 individual studies included in this review were tabulated, analysed for differences, divergent findings and patterns, and then synthesised according to their corresponding ecological level(s) and therefore thematic group (see Table 1).
2.5. Quality Assessment
Table 2 shows the items used for the quality assessment of the studies. The National Heart, Lung and Blood Institute instrument designed especially for intervention studies was used (NHLBI, 2018). The instrument requires two independent assessors to assess each article based on 14 items (see Table 2). The articles were then rated good, fair or poor (see Table 3).
Table 1. Summary of studies included in the review. NR = Not reported, V = Voluntary, C = Compulsory.
Table 3. Quality assessment ratings for the included studies. Studies were discussed and rated “good”, “fair” or “poor” by two reviewers based on the articles’ fulfilment of the NHLBI quality assessment items. Y = Yes, N = No, NR = Not reported, NA = Not applicable, CD = Cannot determine.
3.1. Descriptive Characteristics
The study selection process identified 21 studies, which met the inclusion criteria and were therefore included in this review. The publication period of the included articles spanned between 1995 and 2016. Of the 19 studies that took quantitative measures, 14 followed a pre-test and post-test within-subject design; two studies took measures of cardiorespiratory fitness (e.g. maximal oxygen uptake capacity: VO2max) and affective responses throughout the PA sessions; and three studies did not specify the timeline of their measurements. Only two studies conducted follow-up measurements (Chatzisarantis & Hagger, 2009; Epton et al., 2014) . The average length of interventions was 19.4 weeks (SD = 22.4), however the median length was 10 weeks. The age range of participants was 11 to 25 years. Only three studies mentioned the ethnicity of their participants, which they reported as “primarily white” or “all Caucasian” samples (Brooks & Magnusson, 2006; Hamlyn-Williams, Freeman, & Parfitt, 2014; Goodyear, Casey, & Kirk, 2014) . Although most studies reported no health conditions present within their population, one study was based on general practitioners’ referrals with 65% of participants overweight (Hanson, Allin, Ellis, & Dodd-Reynolds, 2013) . The average number of participants within each study was 187.6 (SD = 372.7), with a median of 54 and a large range of 10 to 892 participants. Of the 21 articles included in the review 11 did not state geographical location, but of the 10 that did five took place in rural areas and five took place in urban areas (see Table 1).
3.2. Quality Assessment
The quality of the included studies was assessed as good, fair or poor using a quality-rating tool proposed by NHLBI (2018). The majority of the studies were rated either good or fair (95%), of which 43% were deemed good. Of the 21 studies, 95% had valid, reliable and consistently implemented measures. One study did not use reliable measures and failed to report predicted outcomes before analysis; it was the only study included in this review that was considered to be of poor quality (Brooks & Magnusson, 2006). Only 9% of included interventions reported a sample size that was sufficiently large to detect differences in their main outcomes, and only 4% of studies described a high adherence to intervention protocols across treatment groups.
3.3. Reported Methodologies
Out of the 21 studies, 19 used a variety of quantitative methods to evaluate their intervention outcome measures. Most frequently, these consisted of statistical analysis of VO2max, heart rate, body composition and psychological measures using questionnaires (details provided later). Only four studies utilized qualitative methods (Moon et al., 1999; Cooke et al., 2013; Brooks & Magnusson, 2006; Goodyear et al., 2014). One study led focus groups with a purposive sample of 31 self-identified formerly PE-averse individuals after a modified PE program. The students, recruited due to their withdrawal from curriculum PE, attributed their disengagement to the environment, perceiving it as one that placed an emphasis on winning, masculinity and physical prowess (Brooks & Magnusson, 2006). Two studies used mixed quantitative and qualitative methods (Moon et al., 1999; Cooke et al., 2013). Alongside self-administered questionnaires, which measured pupils’ health-related knowledge, attitudes and behaviours, Moon et al. (1999) used semi-structured interviews to collate attitudes and perceptions of staff, parents and school governors from eleven intervention and five control schools. Cooke et al. (2013) recruited 136 medical students (control n = 66) to wear pedometers for 4 weeks with the target of increasing their daily step count. Alongside statistical analysis of pedometer data, which found a greater mean change in the daily step count of the intervention group, 26 volunteers (13 control & 13 intervention) of the 136 then participated in mixed focus groups. Participants discussed their experiences regarding PA measurement and their views on health promotion. Five themes were identified: walking and exercise, barriers to PA, doctors as role models, confidence in counselling, and primary care.
Most of the interventions (17 of 21) took place within educational institutions with seven set within universities, eight in schools, and two in colleges. Other settings for interventions were workplaces (Brock & Legg, 1997; Bray et al., 2001) and local gyms (Delextrat & Neupert, 2016; Hanson et al., 2013) . All university and college-based interventions (n = 9) recruited volunteers for study participation (Beauchamp, Welch, & Hulley, 2007; Boreham et al., 2005; Boreham, Wallace, & Nevill, 2000; Bray et al., 2001; Cooke et al., 2013; Engels, Bowen, & Wirth, 1995; Epton et al., 2014; Stear et al., 2003; Tully & Cupples, 2011) . In contrast, the eight interventions that took place within school settings were generally compulsory programmes, in which schools partnered with external providers to target sub groups or individuals to substitute curriculum PE. A third strategy was used for a contemporary dance intervention. Classes were offered to nine high schools, which either made them mandatory or allowed students to choose between the dance intervention and traditional PE classes (Connolly, Quinn, & Redding, 2011) . Although the authors acknowledge this may have impacted measures such as intrinsic motivation, the two groups were not analysed separately to examine whether making the activity mandatory or a choice made a difference. School-based interventions included in this review, which were mostly mandatory, were more effective than university-based interventions, as five of eight school-based interventions reported significant increases in PA-related affect such as body image and self-esteem.
The types of activities used in this population have been limited, and within interventions, participants were generally not given the option to choose the activities in which they participated. Most of the interventions reported instructor-led PA such as gym and PE classes (n = 7; Beauchamp et al., 2007; Bray et al., 2001; Brooks & Magnusson, 2006; Chatzisarantis & Hagger, 2009; Epton et al., 2014; Hanson et al., 2013; Moon et al., 1999). Dance and exercise to music were also frequently offered to this population (n = 5; Burgess et al., 2006; Connolly et al., 2011; Delextrat & Neupert, 2016; Engels et al., 1995; Stear, Prentice, Jones, & Cole, 2003). There were also some interventions using walking or running (n = 4; Cooke et al., 2013; Hamlyn-Williams et al., 2014; Ho et al., 2013; Tully & Cupples, 2011), stair climbing (n = 2; Boreham et al., 2005; Boreham et al., 2000), or basketball and cycling (n = 2; Goodyear et al., 2014; McPhee, Williams, Degens, & Jones, 2010). Finally, Brock and Legg (1997) used a variety of physical activities as part of a 6-week physical fitness programme for British female army recruits with a mean age of 19.2 (SD = 1.4). It included a variety of compulsory activities: swimming, games, endurance training, personal training sessions and obstacle courses. Although participants were not given a choice of activity, the interventions which allowed other types of choices showed positive impacts on PA-related affect such as autonomy (Chatzisarantis & Hagger, 2009), affective responses measured by an affect valence scale (Hamlyn-Williams et al., 2014), and lower ratings of perceived exertion in the absence of intensity differences (Delextrat & Neupert, 2016). One study compared prescribed intensity versus self-selected intensity of running on a treadmill (Hamlyn-Williams et al., 2014).
Another study compared energy expenditure during an instructor-led exercise class versus when following an exercise DVD (Delextrat & Neupert, 2016). A third study compared students’ intentions and self-reported exercise following autonomy-supportive sessions versus standard sessions (Chatzisarantis & Hagger, 2009).
3.5. Ecological Levels
Only seven studies (33%) explored multiple levels of influence on PA participation (see Table 1). Three of these extracted measures to evaluate factors from both intrapersonal and interpersonal levels of the ecological model (Beauchamp et al., 2007; Bray et al., 2001; Chatzisarantis & Hagger, 2009). For example, Bray et al. (2001) investigated the relationship between self-efficacy, instructor-efficacy, and exercise attendance. They recruited 127 volunteers from a university campus and enrolled them in 10 weeks of structured group fitness classes. The study found that the combination of the intrapersonal and interpersonal level factors accounted for 34% of the variance in exercise class attendance, with self-efficacy in relation to scheduling, barrier and exercise accounting for 22% and fitness instructor efficacy accounting for 12% of variance explained. Three articles looked at the intrapersonal, interpersonal, and organisational and environmental levels (Brooks & Magnusson, 2006; Cooke et al., 2013; Goodyear et al., 2014), whilst one considered all four levels (Moon et al., 1999). These are explained in more detail later in this review when each level of the ecological model is explored (see Figure 2 for adapted ecological model and overview of factors).
Intrapersonal. The most frequently investigated factors are from the intrapersonal level of the ecological model. All studies measured one or more intrapersonal factors; these included: aerobic fitness (n = 9), measures of body composition (n = 6), self-efficacy (n = 5), intervention compliance (n = 11) and self-reported PA levels (n = 5). Some studies looked only at physiological (n = 8) or psychological (n = 10) changes. Only three studies investigated both the physiological and psychological impact of interventions (Connolly et al., 2011; Delextrat & Neupert, 2016; Epton et al., 2014).
Figure 2. Illustration of ecological levels. Examples at each level are factors drawn from the results of this review. (Figure was designed by the authors and based on Bronfenbrenner’s (1977) Ecological systems theory).
The most frequently reported physiological and health-related measurement was aerobic fitness (n = 9). Of these, seven studies used a variety of methods (e.g., VO2 maximal tests, VO2 sub-maximal tests, incremental running tests, NATO cycling tests) to assess aerobic fitness, while heart rate and oxygen uptake were used in two studies to monitor the physical intensity of sessions (Delextrat & Neupert, 2016; Hamlyn-Williams et al., 2014). Six interventions, which all took pre and post measurements, showed a significant improvement of aerobic fitness (Boreham et al., 2005; Boreham et al., 2000; Brock & Legg, 1997; Connolly et al., 2011; Engels et al., 1995; McPhee et al., 2010). The remaining study, a 6-week walking intervention, which asked participants to accumulate 10,000 steps per day and report their pedometer step counts in a diary, saw no change between their pre and post aerobic fitness measured by a multi-stage shuttle run test (Tully & Cupples, 2011). The six interventions that reported favourable effects on aerobic fitness were comparatively more structured in delivery in that they paid close attention to intensity and timing of activities, ensured progression throughout the programme, and had supervised sessions.
Different measures of body composition were used in six studies. Although the interventions varied in length, components, activities and participant demographics, the four studies which used Body Mass Index (BMI) found no significant difference between pre and post intervention (Boreham et al., 2005; Epton et al., 2014; Ho et al., 2013; Tully & Cupples, 2011) . The two studies that found a statistically significant increase in fat-free mass used more precise methods: skin folds (Brock & Legg, 1997) and underwater weighing accompanied with body fat estimation (Engels et al., 1995) . It is important to note that BMI was considered a secondary outcome in the earlier four, which focussed on engagement, whilst body fat percentage in the latter two studies was considered a primary outcome measure. Both of these studies conclude that female recruit training in the British army and low impact aerobic dance are effective in terms of changing body composition of young females (Brock & Legg, 1997; Engels et al., 1995) .
The most frequently reported psychological and behavioural measure was exercise-related self-efficacy (n = 5). Validated questionnaires were used to identify and extract different types of self-efficacy. For example, Beauchamp et al. (2007) used a six-item in-class self-efficacy scale, a four-item barrier self-efficacy scale, and a ten-item scheduling self-efficacy scale to examine the relationship between leadership types and the self-efficacy of participants enrolled in a 10-week group exercise intervention. Similarly, Bray et al. (2001) used an exercise self-efficacy questionnaire alongside a scheduling efficacy questionnaire, to examine the relationship between self-efficacy, instructor-efficacy and exercise class attendance. Both studies concluded that leadership styles and instructor-efficacy had stronger correlations with self-efficacy for exercise initiates than experienced exercisers. The remaining three studies found that their various interventions increased the exercise-related self-efficacy of participants post intervention (Brooks & Magnusson, 2006; Burgess et al., 2006; Connolly et al., 2011) .
About half of the studies reported measures of intervention compliance. Attendance, adherence, and/or attrition were reported by eleven studies (Boreham et al., 2005; Boreham et al., 2000; Bray et al., 2001; Brock & Legg, 1997; Chatzisarantis & Hagger, 2009; Engels et al., 1995; Epton et al., 2014; Hanson et al., 2013; McPhee et al., 2010; Stear et al., 2003; Tully & Cupples, 2011). Three studies reported attendance to intervention classes (Bray et al., 2001; Engels et al., 1995; Stear et al., 2003; 63%, 92%, 36%, respectively), but the high attendance rate in Engels et al. (1995) was calculated after removing three of the 20-participant sample because they had not met the attendance requirements. One pedometer-based study reported adherence as the percentage of days participants recorded data in their training diaries over 6 weeks (85%; Tully & Cupples, 2011). Nine studies reported attrition rates as how many participants were excluded from the analysis compared with how many had begun the intervention. Among those nine, there was a mean attrition of 20.2% (SD = 15.1; Boreham et al., 2005; Boreham et al., 2000; Brock & Legg, 1997; Chatzisarantis & Hagger, 2009; Engels et al., 1995; Epton et al., 2014; Hanson et al., 2013; McPhee et al., 2010; Stear et al., 2003). Overall, the study with the highest level of attrition (57.1%; Hanson et al., 2013) and the study with the lowest class attendance (36%; Stear et al., 2003) were two of the longest running interventions, at 24 and 67 weeks respectively, compared with the 16.3 weeks (SD = 20.3) of the remaining review sample.
Only three of the 21 interventions included a combination of physiological and psychological measures to evaluate their interventions (Connolly et al., 2011; Epton et al., 2014; Delextrat & Neupert, 2016) . Connolly et al. (2011) measured self-esteem (Rosenberg scale) and intrinsic motivation (attitudes towards PA) as well as muscular strength (dynamometer), flexibility (sit and reach), and aerobic capacity (20m shuttle run). This study found an increase in self-esteem, aerobic capacity, and upper body strength following a dance intervention but found no significant differences in flexibility or intrinsic motivation. Epton et al. (2014) used the theory of planned behaviour in a multi-layered online intervention with three evaluation points. They looked at four different health behaviours: PA, fruit and vegetable consumption, alcohol consumption, and smoking status (alcohol and smoke consumption were evaluated through biochemical measures taken from a hair sample). They also looked at BMI, and social-cognitive variables for each health behaviour. After 6 months, there were significantly fewer smokers in the intervention group but no other intervention effects. Interventions included in this review had favourable effects on psychological measures more frequently than on physiological ones. Of the interventions that took psychological measures, 92% reported an improvement in factors such as self-confidence, body image, physical self-perception and depressive symptoms, and of the interventions which took physiological measures, 73% reported an improvement in factors such as increased aerobic capacity, fat free mass and strength.
Interpersonal. Seven studies (33%) utilized measures associated with the interpersonal level of the ecological model by investigating social and/or interactional factors such as leadership styles, perception of instructors, subjective norms and instructor-efficacy. Importantly, all seven of these studies investigated multiple levels of the ecological model. One study looked at all four levels of the ecological model (Moon et al., 1999), three looked at the intrapersonal, and organisational and environmental levels in addition to the interpersonal level (Brooks & Magnusson, 2006; Cooke et al., 2013; Goodyear et al., 2014), and a further three studies looked at the intrapersonal and interpersonal levels of the ecological model (Beauchamp et al., 2007; Bray et al., 2001; Chatzisarantis & Hagger, 2009). For example, Chatzisarantis and Hagger (2009) used an intervention where pupils were taught by autonomy-supportive teachers or controls. They measured learning climate, motivational orientations, and leisure time activity to investigate how this intervention based on self-determination theory affected self-reported leisure time activity. They reported stronger intentions to exercise, and more frequent participation in PA in the autonomy-supported group than in the control group.
Organisational and environmental. Only four studies investigated organisational and environmental level factors; these studies considered at least three of the four levels suggested by the ecological model (Brooks & Magnusson, 2006; Cooke et al., 2013; Goodyear et al., 2014; Moon et al., 1999). Cooke et al. (2013) and Goodyear et al. (2014) are two of the five studies which utilised theoretical frameworks: the theory of planned behaviour and cooperative learning theory, respectively. Cooke et al. (2013) enrolled fourth-year medical students to look at the effect of a pedometer and goal-setting intervention on PA behaviour and on intentions to promote PA in future practice. The intervention primarily addressed the organisational aspect of this level, as it was integrated into a 4-week course in primary care. On the other hand, Goodyear et al. (2014) mostly addressed the environmental aspect of this level, as the school which hosted the intervention held specialist sports college status. The study looked at how the teaching of PE could be reconceptualised to give young females responsibility and ownership of their learning by adapting the learning climate. The environment allowed the lead researcher, a teacher at the school, to pilot a Cooperative Model with the use of flip cameras and role-play in an eight-week block of basketball. The remaining two studies at this level were both organisational and environmental. In one study, eleven intervention schools made organisational changes to their environment in pursuit of the Wessex Healthy Schools Award (Moon et al., 1999). The schools took part in an audit which included a pupils’ health questionnaire and semi-structured interviews with key members of the schooling community. In the other (Brooks & Magnusson, 2006), a newly appointed head of PE, supported by the head teacher, trained staff and introduced a modified PE program to encourage greater participation. Modifications were made to the form of provision by employing part-time staff and increasing activity options, investments were made in equipment and decoration of the sports facilities, and students redesigned the PE uniform.
Policy and legislation. Only one study included in this review looked at factors within the policy and legislation level of the ecological model (Moon et al., 1999). Although more articles were identified in the selection process, only one met the inclusion criteria. Moon et al. (1999) considered all levels of the ecological model and was also one of the two studies which utilised mixed methods. The study was framed by the requirements of the Wessex Healthy Schools Award (WHSA), which is validated by the government agency OFSTED (Office for Standards in Education), and aimed to change health promotion policy and practice within intervention schools (intervention schools n = 11; control schools n = 5). This mixed-gender study found that females aged 14 - 16 made the greatest progress in all aspects, including health-related knowledge and behavioural attitude. The school audit also found an increase in widening the community, improvements in the environment and a 10% rise in respondents who felt well informed regarding the WHSA. These findings illustrate that multi-level approaches to intervention may have positive cumulative effects, which are observable at each level of the ecological model.
The purpose of this review was to synthesise evidence pertaining to the characteristics of PA interventions aimed at UK-based females aged 14 - 25, to explore the measures used to evaluate intervention impact, and to formulate recommendations for future PA intervention research through an ecological perspective. To our knowledge, this is the first study that has systematically reviewed interventions in this setting and population. The results generated several points of discussion, namely around the predominance of intrapersonal factors and quantitative methods in intervention research; and the setting, recruitment and activity options that constitute those interventions.
The results show that the measures utilised to assess intervention effectiveness predominantly fall under the intrapersonal level of the ecological model. These measures are central to understanding how individuals or groups have been impacted physiologically and/or psychologically by an intervention. However, this approach provides very little insight into other effects or factors influencing the outcomes of the intervention. For example, interpersonal factors such as the influence of instructor-participant and participant-participant relationships, or environmental and organisational factors such as facilities and infrastructure, and their effects on PA adherence are not considered. A recent review supports this observation (Camacho-Miñano et al., 2011). It identified two high-quality comprehensive interventions developed in line with ecological approaches (Pate et al., 2005; Webber et al., 2008). Pate et al. (2005) increased levels of regular PA among high school-aged females by implementing a girls-only PE program. The intervention was accompanied by a supportive school environment, school health services, staff health promotion and both family and community-based activities.
The vast majority of the studies included in this review used quantitative methods to assess the impact of interventions. The literature suggests that this has historically been the case; such studies can successfully assess the direction and strength of trends but are unable to explain uptake, maintenance and attrition (Allender, Cowburn, & Foster, 2006). Importantly, this review also identified a lack of qualitative research, which has the potential to highlight young UK-based females’ experiences of engaging in PA interventions. A large number of systematic reviews which have focused on promoting PA in young female populations have excluded qualitative studies (e.g. Biddle et al., 2014; Camacho-Miñano et al., 2011; Pearson et al., 2014; Sallis et al., 2000), thus precluding recognition of the potential benefits of qualitative methodologies in understanding this population. Furthermore, the main focus has been on quantifying benefits in the short term (from pre- to post-intervention), but in order to produce sustainable behaviour change a holistic approach that includes mixed methods may be needed to capture PA habits over time (Bauman et al., 2012). As per our findings, mixed-methods studies are also sparse, but they potentially offer a richer representation of intervention effects because quantitative results can be better explained with concomitant qualitative data. For the same reason, these studies may be less likely to go unpublished due to a lack of significant results (Rothstein, Sutton, & Borenstein, 2006).
A substantial proportion of the interventions reviewed took place in educational settings, which implies that research within other settings is lacking and that female populations such as stay-at-home mothers and young professionals are excluded. Universities and schools were the most common settings. There are almost 160,000 female university students in the UK aged 17 - 25 (Department of Education, 2016), making universities a good setting for evaluating interventions. However, these young females are also thought to be some of the most active people in society and the least affected by potential barriers to PA associated with young populations (Maas et al., 2006). On the other hand, schools have been acknowledged as an ideal setting for the promotion of positive health behaviours, and it is argued that schools should assume a leadership role in ensuring that young people engage in sufficient PA each day (Kahn et al., 2002; Pate et al., 2006). However, a dependency on schools as primary providers of PA is likely to contribute to the dramatic drop reported in female participation rates when participation in PA moves from an adult-managed to a participant-led activity (Sallis et al., 2000). There has been a call for an increase in community-based interventions as opposed to school-based interventions. The main arguments are that the reduction in PE curriculum time and the traditionally low importance attributed to PE may limit the potential of school-based interventions to influence students, and young females in particular (Trudeau & Shepherd, 2005; Pate et al., 2007). Adequate PE in terms of time and resources, which attends to gender disparities, would attenuate the challenges that physical educators face in relation to girls’ PA. Interventions consisting of modified PE programs have had positive effects on PE engagement, and should therefore be considered when attempting to tackle specific barriers faced by this population (e.g. Chatzisarantis & Hagger, 2009; Goodyear et al., 2014; Brooks & Magnusson, 2006). Additional evidence suggests that school-based interventions have been successful in increasing factors such as self-confidence, which may be used as a platform to acquire the psychological resources needed to engage in community-based activities (Brooks & Magnusson, 2006).
This review found that school-based interventions are the most effective in influencing positive short-term (from pre- to post-intervention) psychological outcomes. However, the compulsory nature of enrolment and participation adopted within most schools may play a large role in influencing outcome and effectiveness measures but does not guarantee continuation. Psychological factors improved more frequently and more strongly than physiological factors. On the one hand, interventions may not have been of sufficient frequency and duration for participants to accrue physiological benefits (Chatzisarantis & Hagger, 2009). On the other hand, the psychological benefits they did accrue, such as self-efficacy and intention to exercise, have been shown to be determinants of exercise adherence, which over time has the potential to produce physiological benefits (Biddle et al., 2005).
The activity options available to participants were not only very limited, but also largely restricted to one activity per intervention. A previous literature review suggests that interventions should increase choice and offer a wide range of non-competitive and innovative activities that promote enjoyment, especially in females (Camacho-Miñano et al., 2011). Barbeau et al. (2007) add that in developing such interventions, the characteristics of the population must be carefully considered. Kasser and Lytle (2005) go further to suggest that inclusive PA has many benefits, especially within a community setting, and advocate for activities which appeal to and engage participants of different ages and backgrounds, as this is believed to foster a sense of community, belonging and acceptance. Most relevant to community-based interventions are factors such as religion, socio-economic status and race (Barbeau et al., 2007). Culturally tailored interventions should acknowledge and be built upon cultural beliefs and practices, and should include culturally appropriate activities (Martinez, 2009). In sum, the focus of the reviewed studies has made it difficult for the interventions to attend to factors across the levels of the ecological model. However, interventions that consider intrapersonal, interpersonal and environmental factors, and that expose participants to a variety of activities over the intervention period with viable exit routes into community settings, may be effective in producing long-term behaviour change among females.
The ecological model has been successfully used to guide interventions in a variety of health behaviours including disease management, nutrition interventions and smoking cessation (Sallis, Owen, & Fisher, 2008). Although not extensively researched in the context of PA, several studies in the area have concluded that interventions benefit from the consideration of multi-level factors captured by the ecological model. For example, in a German preschool setting De Bock, Genser, Raat, Fischer and Renz-Polster (2013) found that community-based approaches led by parents were able to promote physical activity and reduce sedentary behaviours in pre-schoolers. Additionally, Brown et al.’s (2009) study of American pre-schoolers and Sallis, Bauman and Pratt’s (1998) review of adults in industrialised countries concluded that environment and policy play an important role in increasing PA levels. Similarly, Wilbur, Chandler, Dancy, Choi and Plonczynski (2002) conducted focus groups with forty-eight women who identified policy and environmental factors as influential in their PA behaviours and decisions. Culture was also identified as important, and the authors concluded that environments and policies need to be culturally and socially sensitive to positively impact PA within specific female sub-groups. In combination, these studies demonstrate the successful implementation of interventions that consider factors across ecological levels, and they also illustrate the need to consider multi-level factors during intervention development and intervention research.
The limitations of the current review must be acknowledged. Firstly, the variability between studies in terms of measures, duration, and intervention type made statistical comparisons and grouping challenging; however, by opting for a narrative systematic review we were able to include more studies and extract patterns across different methodologies. Secondly, the decision to include only UK-based interventions in this population limits the generalisability of our findings. Many interventions were excluded because they did not report results separately by gender in their analyses, or because they took place outside the UK. However, it was important for us to define the context of the interventions because we were looking for patterns of methodology and results within each of the levels of the ecological model.
The current review offers a detailed analysis of UK-based PA interventions aimed at females aged 14 - 25 years old. We have been able to highlight some patterns in the literature and make meaningful suggestions for design of future interventions and research: 1) To gain detailed understanding of the factors involved in behaviour and lifestyle change, study designs should implement diverse strategies that correspond with multiple levels of the ecological model. 2) More mixed methods, triangulation studies, and longitudinal evaluations would provide better evidence and lead to understanding of sustained intervention effects. 3) Future research should aim to investigate diverse female sub-groups including cultures and demographics so that interventions can be appropriately tailored. 4) Interventions should offer a wider range of activities and enhance participant input to increase enjoyment, and potentiate adherence and long-term behaviour change.
In conclusion, PA interventions designed considering the influence and dynamic interplay of multi-level factors suggested by the ecological model are likely to be valuable in promoting sustainable PA; such interventions aimed at young females are currently lacking. Future research should employ a variety of methodologies to evaluate intervention effectiveness.
PRISMA item 8
Search: Present full electronic search strategy for at least one database, including any limits used, such that it could be repeated.
Electronic literature searches were conducted using five databases: SPORTDiscus, Medline®, PsycARTICLES®, Scopus®, and Web of Science™.
Search strategy: Scopus®
1) Fields: Title/keywords/abstract
6) 1 or 2 or 3 or 4
7) Physical activity
11) 6 or 7 or 8 or 9
12) 5 and 10
17) 12 or 13 or 14 or 15
18) 5 and 10 and 16
19) Limit to: Journal articles
20) Limit to: English language