The global incidence rate of female breast cancer (BCa) has been rising since the 1970s, even in countries in Asia and Africa that had previously reported low rates. BCa is the most common invasive cancer in women, accounting for over 25% of all cancer cases and affecting about one in eight women during their lives. The WHO has concluded that life expectancy, urbanization and Western lifestyles are the major risk factors for BCa.
BCa is a disease with a genetic background, but genetics may explain only 5% - 10% of all cases. Most BCa cases occur due to mutations caused by the interaction between an environmental factor and a genetically susceptible host.
Ageing, which may influence carcinogenesis, has been regarded as a prime contributing factor to BCa. Tobacco smoking has long been postulated as one of the environmental factors that cause BCa. For instance, the risk of BCa may be increased from 35% to 50% in female smokers. Anti-smoking campaigns have reduced the rate of smoking among women in the developed world, but the BCa incidence rate there remains much greater than in the developing world and continues to rise.
Several alternative hypotheses about the relationships between BCa and contributing environmental factors have been explored in the past decades. Relaxed natural selection has been postulated to increase BCa risk through the accumulation of potential BCa genes or mutations. Decreasing physical activity has been associated with increased BCa risk, although the mechanism by which exercise acts is not fully established. Consistent with this, the Lancet Physical Activity Series Working Group reported that BCa risk may be reduced when females become physically active. High-fat and/or high-alcohol diet patterns have also been related to BCa risk.
The incidence of BCa varies greatly around the world. The WHO has attributed regional variations to differences in socio-economic level between country groups. Genetic differences between ethnic groups have also been implicated in the genesis of regional variations. Relaxed natural selection may be involved as well: advances in modern medicine allow patients with early-onset BCa to survive, which lets BCa genes or mutations be passed on to the next generation and accumulate.
Female reproductive behavior was first postulated to be associated with BCa risk about 300 years ago, when the risk was observed to be greater among nulliparous Catholic nuns. Specifically, the postulation that childbearing reduces BCa risk was advanced in the 1920s and confirmed in the 1970s. The proposed underlying mechanism is that pregnancy interrupts menstrual cycles, which reduces breast exposure to estrogen. Studies have identified that estrogen may cause DNA damage and thus the initiation of BCa. Recent studies have shown that estrogen receptor (ER) positive BCa may make up approximately 70% of all BCa and that childbearing may decrease the risk of developing BCa by up to 50%.
Professionals and laypeople remain intrigued by the physiological mechanisms through which physical activity, diet patterns, genetic background and reproductive behaviour contribute to BCa. However, a number of publications have reported that females at higher socioeconomic levels may be subject to a higher risk of BCa. Furthermore, the WHO, as the directing and coordinating authority for health within the United Nations system, and its cancer research agency, the International Agency for Research on Cancer (IARC), also support the theory that a female's socioeconomic level is associated with BCa risk. Therefore, females in the developed world and those at high socioeconomic levels in developing countries may have wondered: what is wrong with being at a higher socioeconomic level?
The present study starts with measures of proximal causes of BCa, analysing how the BCa incidence rate relates to birth rate, socio-economic factors, urbanization, overweight and ageing. It then assesses which of these underlying factors account for significant proximal risks and overall BCa incidence. Finally, it shows that birth rate plays the determining role in regional variations of the BCa incidence rate.
2. Materials and Methods
2.1. Data Sources
The country specific variables were collected for this ecological study.
2.1.1. The GLOBOCAN 2012 Estimates of Incidence rate of Female BCa (C50) 
BCa incidence rate indicates the number per 100,000 females who were diagnosed with BCa in 2012. The rate was age-standardized using the World standard population to increase the comparability.
GLOBOCAN is a project conducted by the International Agency for Research on Cancer (IARC) of the WHO. This project provides contemporary population-level estimates by cancer site and sex using the best available data in each population and nine comprehensive methods of estimation.
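As a rough illustration of what age-standardization does, the sketch below applies direct standardization: each age-specific rate is weighted by the share of a standard population in that age band. The function name, the three age bands and all numbers are hypothetical and are not the World standard population or GLOBOCAN data.

```python
def age_standardized_rate(age_specific_rates, standard_weights):
    """Directly age-standardized rate, in the same units as the input rates.

    age_specific_rates: cases per 100,000 females in each age band.
    standard_weights:   size of the standard population in the same bands.
    """
    if len(age_specific_rates) != len(standard_weights):
        raise ValueError("rates and weights must cover the same age bands")
    total_weight = sum(standard_weights)
    weighted_sum = sum(r * w for r, w in zip(age_specific_rates, standard_weights))
    return weighted_sum / total_weight


# Hypothetical three-band example (made-up numbers, for illustration only).
rates = [5.0, 80.0, 250.0]            # per 100,000 in young, middle and older bands
weights = [60_000, 30_000, 10_000]    # made-up standard population
asr = age_standardized_rate(rates, weights)
```

Because every country is weighted by the same standard population, differences in the resulting rates are not driven by differences in national age structure.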
2.1.2. The World Bank Published Data on Birth Rate, GDP and Urbanization
Birth rate indicates the number of live births occurring during the year 1992 per 1000 population estimated at midyear.
Socio-economic level, measured with GDP, has been related to BCa incidence rate. GDP is used as the index of socio-economic level and is expressed per capita in purchasing power parity (PPP, current international $) for 2010.
Urbanization is expressed as the percentage of the total population living in urban areas in 2010. Urbanization, representing a major demographic shift, entails lifestyle changes, including a diet with more energy-dense components, such as high fat and high alcohol intake, and less physical exercise. Therefore, urbanization has been postulated as a major BCa predictor.
2.1.3. The United Nations Statistics Division Estimates of the Life Expectancy
Life expectancy, used as the index of ageing in this study, has been considered a contributing factor to BCa. Women aged 50 and over typically enter menopause, which leads to a fall in estrogen levels. Therefore, life expectancy at age 50 (e50, 2005-2010) was extracted from the abridged life tables (1950-2100) published online by the United Nations.
The WHO Global Health Observatory (GHO) has published data on the estimated prevalence of overweight among women. The overweight prevalence is expressed as the percentage of the population (2010) aged 18+ with a BMI ≥ 25 kg/m². Being overweight has also been postulated as a risk factor for BCa.
2.2. Data Selection
We used country-specific BCa incidence rate, birth rate, GDP (index of socio-economic level), urbanization, overweight prevalence (Western lifestyle) and life expectancy (ageing) for all countries where data were available. We matched BCa incidence rates and birth rates by country and obtained a data set consisting of 179 countries.
Each country was treated as an individual subject in the analysis. The number of countries included in the analysis of relationships with other variables differed somewhat because not all information was available for every country.
2.3. Data Analysis
Various statistical analysis methods were applied in this study to explore the correlation between birth rate and BCa incidence rate.
Data robustness and variable distributions check
Scatter plots were used to explore and visualize the correlations between birth rate and BCa incidence rate. The strength and form of the relationship between BCa and birth rate were analysed using actual values of the two variables. For other analyses, variable values were logarithmically transformed to bring their distributions closer to normality.
To examine the correlation between birth rate and BCa incidence, the underlying contributing factors of BCa risk and the determining role of birth rate in regional variation, the analysis proceeded in four steps:
Pearson’s r and nonparametric correlations were used to evaluate the strength and direction of the correlation between all the variables.
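The bivariate step can be sketched in a few lines of standard-library Python. This is only an illustration of the technique, not the SPSS procedure used in the study; the function names and the six data points are made up. Pearson's r is computed on log-transformed values and Spearman's rho on ranks of the raw values.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Made-up country values, for illustration only.
birth_rate = [45.0, 38.0, 30.0, 22.0, 15.0, 11.0]   # per 1000 population
bca_rate = [20.0, 24.0, 31.0, 45.0, 70.0, 85.0]     # per 100,000 females

r = pearson_r([math.log(v) for v in birth_rate], [math.log(v) for v in bca_rate])
rho = spearman_rho(birth_rate, bca_rate)
```

Spearman's rho depends only on the ordering of the values, so it is unaffected by the logarithmic transformation.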
The independent relationships between BCa and each of the five independent variables were explored with Pearson's product-moment partial correlation while controlling for the other four variables. This allows the strongest correlation to be identified and its independence to be assessed.
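One way to picture a partial correlation is the standard recursion that removes one control variable at a time. The sketch below is a minimal illustration under that approach; the function names and data are hypothetical, and it does not reproduce the SPSS output reported in this study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(x, y, controls):
    """Correlation between x and y with every list in `controls` held constant.

    Uses the recursion
        r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)),
    where each r on the right is itself a partial correlation given the
    remaining controls.
    """
    if not controls:
        return pearson_r(x, y)
    z, rest = controls[0], controls[1:]
    r_xy = partial_r(x, y, rest)
    r_xz = partial_r(x, z, rest)
    r_yz = partial_r(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Made-up values for six countries, for illustration only.
log_birth = [math.log(v) for v in [45, 38, 30, 22, 15, 11]]
log_gdp = [math.log(v) for v in [1500, 3000, 2500, 12000, 30000, 45000]]
log_bca = [math.log(v) for v in [20, 24, 31, 45, 70, 85]]

# Correlation between BCa incidence and birth rate with GDP held constant.
r_birth_given_gdp = partial_r(log_bca, log_birth, [log_gdp])
```

Passing four control lists instead of one gives the fourth-order partial correlation of the kind described above.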
Standard multiple linear regression (enter) was performed to describe the relationships between the outcome variable and the explanatory variables. In order to highlight that birth rate is the major population-level contributor to BCa incidence, standard multiple linear regression (enter) was also conducted to calculate the correlation between BCa incidence and the risk factors when birth rate is included and excluded respectively.
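The include/exclude comparison can be sketched as two ordinary least squares fits on z-scored variables, which yield standardized coefficients (β). This is a simplified stand-in for the SPSS "enter" method: it omits significance tests, and the helper names and the eight data points are made up.

```python
import math


def zscore(values):
    """Centre to mean 0 and scale to sample standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def standardized_betas(y, predictors):
    """Standardized OLS coefficients, one per predictor list.

    All variables are z-scored, so the intercept is zero and is omitted.
    """
    zy = zscore(y)
    zx = [zscore(p) for p in predictors]
    k = len(zx)
    xtx = [[sum(a * b for a, b in zip(zx[i], zx[j])) for j in range(k)] for i in range(k)]
    xty = [sum(a * b for a, b in zip(zx[i], zy)) for i in range(k)]
    return solve(xtx, xty)


# Made-up values for eight countries, for illustration only.
log_birth = [math.log(v) for v in [45, 40, 33, 27, 21, 16, 12, 10]]
log_gdp = [math.log(v) for v in [1200, 2500, 2000, 6000, 9000, 22000, 30000, 48000]]
log_age = [math.log(v) for v in [24, 25, 27, 28, 30, 32, 34, 35]]
log_bca = [math.log(v) for v in [19, 23, 30, 36, 48, 62, 80, 88]]

betas_without_birth = standardized_betas(log_bca, [log_gdp, log_age])
betas_with_birth = standardized_betas(log_bca, [log_birth, log_gdp, log_age])
```

Comparing the GDP and ageing coefficients between the two fits shows how much of their apparent effect is absorbed once birth rate enters the model.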
The equation of the best fitting trendline (polynomial) displayed in the scatter plots analysis of relationship between birth rate and BCa incidence was used to calculate and remove the contributing effect of birth rate on BCa incidence rate, which allowed the creation of a new dependent variable, “Residual of BCa standardised on birth rate”. Means of the BCa incidence rate and the “Residuals of BCa standardised on birth rate” of all the countries were calculated for mean difference comparison. Countries were categorized as per the UN common practice of defining more developed and developing countries and WHO regions for investigating the regional variations based on mean difference.
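The residual variable can be illustrated as follows: fit a polynomial trendline of BCa incidence on birth rate, then subtract the fitted value from each observed value. The order of the polynomial is not stated here, so the sketch assumes a second-order fit; the function names and data are hypothetical, and the study itself obtained the trendline in Excel.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def polyfit(x, y, degree):
    """Least-squares polynomial coefficients [c0, c1, ..., c_degree]."""
    k = degree + 1
    a = [[sum(xi ** (i + j) for xi in x) for j in range(k)] for i in range(k)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(k)]
    return solve(a, b)


def residuals(x, y, coeffs):
    """Observed value minus the value predicted by the fitted polynomial."""
    return [yi - sum(c * xi ** p for p, c in enumerate(coeffs)) for xi, yi in zip(x, y)]


# Made-up country values, for illustration only.
birth_rate = [45.0, 38.0, 30.0, 22.0, 15.0, 11.0]   # per 1000 population
bca_rate = [20.0, 24.0, 31.0, 45.0, 70.0, 85.0]     # per 100,000 females

coeffs = polyfit(birth_rate, bca_rate, degree=2)    # assumed second-order trendline
bca_residual = residuals(birth_rate, bca_rate, coeffs)
```

A positive residual marks a country whose BCa incidence is higher than its birth rate alone would predict, and a negative residual marks one where it is lower.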
An independent-samples t-test was conducted to compare the means of the two BCa incidence variables between the two UN country groupings. One-way ANOVA with post hoc Scheffe testing was performed to compare the differences between the means of the six WHO regions.
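The multiple-mean comparison can be sketched as a one-way ANOVA followed by a Scheffe statistic for each pair of groups. The code below only computes the test statistics; p-values are omitted because they require the F distribution, so each statistic would be compared against a tabulated critical value of F with (k − 1, N − k) degrees of freedom. The function names and the three small groups are made up.

```python
def one_way_anova(groups):
    """Return (F statistic, within-group mean square, df between, df within)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = k - 1, n_total - k
    ms_within = ss_within / df_within
    f_stat = (ss_between / df_between) / ms_within
    return f_stat, ms_within, df_between, df_within


def scheffe_pair(groups, i, j):
    """Scheffe F statistic for the difference between the means of groups i and j."""
    _, ms_within, df_between, _ = one_way_anova(groups)
    mean_i = sum(groups[i]) / len(groups[i])
    mean_j = sum(groups[j]) / len(groups[j])
    standard_error_sq = ms_within * (1 / len(groups[i]) + 1 / len(groups[j]))
    return (mean_i - mean_j) ** 2 / (standard_error_sq * df_between)


# Three made-up regions with three countries each, for illustration only.
regions = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]
f_overall, ms_within, df_between, df_within = one_way_anova(regions)
f_first_vs_third = scheffe_pair(regions, 0, 2)
```

Running the same two functions on the residual variable instead of the raw incidence rate mirrors the comparison described above.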
Scatter plots and calculation of means were performed in Excel® (Microsoft 2016). Pearson and partial correlations, multiple linear regression analysis, the independent-samples t-test and post hoc Scheffe mean comparisons were conducted using SPSS v. 22 (SPSS Inc., Chicago, IL, USA). The original data were used for scatter plots and for calculating the means of the BCa incidence rate and the “Residual of BCa standardised on birth rate”. To increase the homoscedasticity of the data distributions, log-transformed variables were used for correlation analyses. Significance was set at the 0.05 level, but the 0.01 and 0.001 levels are also reported. Standard multiple linear regression analysis criteria were set at probability of F to enter ≤ 0.05 and probability of F to remove ≥ 0.10.
3. Results
Figure 1 shows that the relationship between birth rate and BCa incidence rate is polynomial with a strong negative correlation (R² = 0.5024).
Figure 1. The relationship between birth rate and breast cancer incidence rate.
The non-linear relationship between birth rate and BCa incidence variables identified in the scatterplots shows the strong correlation between birth rate and BCa incidence. This relationship was confirmed by the subsequent analyses of log-transformed data and in nonparametric analysis.
Worldwide, birth rate was significantly correlated with BCa incidence in both Pearson (r = −0.680, p < 0.001) and non-parametric (rho = −0.723, p < 0.001) analyses (Table 1).
Table 1 shows that not only birth rate, but also GDP, urbanization, overweight prevalence and ageing correlate significantly with BCa incidence rates in both Pearson and non-parametric analyses.
There was a strong and highly significant correlation between GDP and birth rate in both Pearson (r = −0.760, p < 0.001) and non-parametric (rho = −0.797, p < 0.001) analyses.
The relationship between the dependent variable (BCa) and each independent variable (birth rate, GDP, urbanization, overweight and ageing) was examined by controlling for the other four variables in a partial correlation analysis. Birth rate was the only independent variable to show a significant correlation with BCa (r = −0.330, p < 0.001) independent of the other four variables (Table 2). None of GDP, urbanization, overweight or ageing showed a correlation with BCa incidence once the remaining four variables were controlled for, even though each of them had a strong and significant correlation with BCa incidence in the Pearson and non-parametric analyses. This suggests that birth rate is the independent determinant of the secondary association between BCa incidence and environmental factors.
Standard multiple linear regression (enter) analysis was applied to further predict BCa incidence when birth rate, GDP, urbanization, overweight and ageing were used as the independent variables.
Table 1. Pearson r (above the diagonal) and nonparametric (below the diagonal) correlation between all variables.
The table describes the bivariate correlation between all the variables. ***p < 0.001; Country number: 171 - 179. Breast cancer incidence rate is from the International Agency for Research on Cancer. Birth rate, GDP and urbanization are from the World Bank. Ageing expressed as life expectancy (e50) is from the United Nations. Overweight prevalence is from the World Health Organization.
Table 2. Comparison of partial correlation coefficients between breast cancer incidence and each variable when the other four variables are controlled for.
The table describes the partial correlation between breast cancer incidence and each variable while the other four variables are controlled for. Controlled variable. Breast cancer incidence rate is from the International Agency for Research on Cancer. Birth rate, GDP and urbanization are from the World Bank. Ageing expressed as life expectancy (e50) is from the United Nations. Overweight prevalence is from the World Health Organization.
When birth rate was excluded from the independent variables, GDP (β = 0.401, p < 0.001) and ageing (β = 0.300, p < 0.001) were the two significant predictors of BCa incidence. However, when birth rate was included as an independent variable, the associations of BCa incidence with both GDP and ageing became very weak and no longer reached statistical significance (Table 3). This supports the suggestion from the partial correlation analysis that birth rate is the principal and independent determinant of BCa incidence.
Table 3. Independent predictors of breast cancer incidence rate based on multiple linear regression modeling.
The table describes the multiple linear regression analysis results including and excluding birth rate as a predictor of breast cancer. df = 167; excluded variable. Breast cancer incidence rate is from the International Agency for Research on Cancer. Birth rate, GDP and urbanization are from the World Bank. Ageing expressed as life expectancy (e50) is from the United Nations. Overweight prevalence is from the World Health Organization.
Table 4 shows that the mean BCa incidence rate was lowest in South-East Asia (26.31) and highest in Europe (63.60). The means of BCa in the other four regions are Africa (26.99), Eastern Mediterranean (40.77), Western Pacific (43.03) and Americas (46.98). A post hoc Scheffe analysis conducted on the multiple mean comparisons revealed that there were a number of significant mean differences in BCa incidence rates between different WHO regions (Table 4). Mean of BCa incidence in Africa was significantly lower than that in Americas, Europe and Western Pacific. Mean of BCa incidence in the Americas was significantly lower than that in Europe and Western Pacific. Mean of BCa incidence in Eastern Mediterranean was significantly lower than that in Europe. The mean BCa incidence in South-East Asia was significantly lower than that in Americas, Europe and Western Pacific whilst the mean BCa incidence in Western Pacific was significantly lower than that in Europe.
A subsequent ANOVA with post hoc Scheffe procedure performed on the means of “Residual of BCa standardised on birth rate” in different WHO regions showed no significant differences among and between regions (Table 4).
Interestingly, mean BCa incidence in the developed regions was significantly greater than that in the developing regions (mean difference = 9.75, p < 0.001). However, the difference between the means of the “Residual of BCa standardised on birth rate” in the developed region and developing region is weak and does not reach statistical significance (Table 4).
The results from post hoc Scheffe tests conducted on mean comparison between the WHO regions suggest that regional variations of BCa incidence may only reach statistically significant levels if the contribution of their respective birth rates is included. In other words, except for birth rate, the contribution of the other BCa predicting factors to BCa incidence may not be sufficient for the difference in mean rates to reach significance. This result is supported by the findings identified in our previous partial correlation (Table 2) and multiple linear regression (Table 3) that birth rate is the critical risk factor of BCa.
4. Discussion
The worldwide trend of increased BCa incidence may have multiple aetiologies, which may act through multiple mechanisms. Our ecological analysis suggests that birth rate may be the determining factor of BCa incidence at the population level. This study also reveals that the effect of birth rate on BCa incidence is independent of the effects of socio-economic factors, urbanization, overweight and ageing.
The results of this study show that a country with a greater birth rate may have a lower BCa incidence. This supports the observation from previous studies, based on observational approaches, that higher parity is associated with a decreased risk of BCa. This study used the ecological approach, which has an advantage over observational studies in that more variables can be obtained for data analysis. For instance, we were able to use five variables, which allowed us to control for four variables at a time, including the socio-economic factor (GDP), which has been used by the WHO to interpret the regional variations of BCa incidence.
Table 4. Mean difference between WHO regions, and between UN developed and developing regions.
Mean difference comparison results are reported. *p < 0.05, **p < 0.01, ***p < 0.001. Breast cancer incidence rate is from the International Agency for Research on Cancer. Birth rate, GDP and urbanization are from the World Bank. Ageing expressed as life expectancy (e50) is from the United Nations. Overweight prevalence is from the World Health Organization. Abbreviations: AF, Africa; AM, Americas; EM, Eastern Mediterranean; EU, Europe; SEA, South-East Asia; WP, Western Pacific.
The prevalent explanation for why a greater birth rate protects against female BCa is that the interruption of the normal menstrual cycle during pregnancy and subsequent breastfeeding interrupts the normal cyclical production of oestrogen but increases oxytocin. The public have been extensively educated for decades that oestrogen contributes to BCa as it fuels the growth of most breast cancer tumours.
Oxytocin, produced during pregnancy, delivery and breastfeeding, may have a role in controlling mammary cell growth and inhibiting the proliferation of human BCa cells, which may offer options for BCa prevention and treatment. These findings have driven a hypothesis that oxytocin may have therapeutic effects on cancer. Similarly, Misra et al. (2012) reported that greater parity may reduce long-term BCa risk because multiple hormones released during pregnancy generate genetic changes in the mammary glands that decrease BCa risk in mature breast cells.
The WHO and its agency, the IARC, have endorsed the paradigm that BCa incidence is lower in less-developed countries and greater in more-developed countries, and this has been widely cited in a large body of literature to describe regional variations of BCa incidence. This may lead to the impression that GDP is the main risk factor for BCa. However, this paradigm is not supported by the results of the three statistical analyses in this study. Firstly, birth rate, rather than GDP, is the only predicting factor that is correlated with BCa incidence independent of the other four confounders in partial correlation analysis. Secondly, once the effects of birth rate are considered in multiple linear regression analysis, the association between BCa incidence and GDP and the other variables disappears. Finally, it is the contribution of birth rate, not GDP, that accounts for the statistically significant regional variations.
In this study, birth rate was the principal determining predictor of BCa incidence and it may explain the correlation between GDP and BCa incidence. GDP shows a significant and strong correlation with birth rate in both Pearson r and non-parametric correlation analysis. This relationship is consistent with the theory of the demographic transition, which proposes that a country or region may move from a high to a lower birth rate as it transforms into an industrialized economic system.
There are several caveats to this study, including the one conceptualized as the ecological fallacy.
Firstly, each country is considered as a subject in this study. The country-specific data included in this study were aggregated, which differs from data collected from individual patients. Therefore, relationships between risk-modifying factors and BCa observed at the country level may not hold true for the risk of individuals developing BCa.
Secondly, data aggregated and/or collected by the UN and its agencies (WHO, IARC and the World Bank) may include some random errors arising from methods of reporting the incidence of BCa, the reliability of diagnoses and possible administrative errors. For instance, the quality of the BCa incidence data depends upon the quality and amount of the information available for each country. In general, data from developing countries are less complete than those from developed countries.
Finally, there are around 20 sub-types of BCa, such as ductal carcinomas and lobular carcinomas. This study only focuses on the hormone receptor-positive BCa. Recently, scientists at Boston University found that high parity was associated with an increased risk of estrogen and progesterone receptor-negative (ER-/PR-) BCa. This suggests that high parity has a dual effect on BCa, which our data analysis may not be able to explain.
Birth rate, instead of socioeconomic level, plays a determining role in worldwide BCa incidence rate and regional variations. Current BCa projection methods may estimate future rates of BCa poorly if they fail to incorporate the impact of birth rate.
WY conceived the idea, and IS, FR and MH consolidated the hypothesis. WY and MH conducted data analysis. All authors interpreted the data analysis result and provided suggestions for the manuscript writing. All authors reviewed, edited and approved the final manuscript.
Conflict of Interest
We wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome. We confirm that the manuscript has been read and approved by all named authors.
This research was supported by the Mäxi Foundation, Zurich, Switzerland.
All the aforementioned data were freely available from the official websites of the UN agencies. No ethical approval or written informed consent for participation was required.