Cotton is known as white gold and is the principal source of foreign exchange earnings in many countries. It contributes to the oil and seed sectors as well as the fiber industry. Pakistan’s economy relies heavily on cotton production, directly or indirectly. Among the four cultivated species of cotton, Gossypium arboreum, also called “true cotton”, is a species native to the Indian subcontinent and is presently used in introgressive breeding as a donor species for improving resistance to insect pests and diseases, especially cotton leaf curl disease (CLCuD), in upland cotton. This crop has been present in Pakistan in a domesticated form since about 6000 BC, and most of the biodiversity of this species occurs in this country. G. arboreum is still grown on a small acreage in Pakistan by small landholding farmers.
Genetic diversity, which underpins the sustainability and genetic improvement of a crop, can be defined as “the amount of genetic variability among individuals of a population”. As information about genetic diversity within G. arboreum L. is scanty, it is an area of interest. G. arboreum L. has many favorable traits, such as drought tolerance and resistance to insect pests, which are not found in G. hirsutum L. Understanding the genetic variation within and between populations of G. arboreum L. is valuable not only for creating a theoretical base for conserving the Asiatic cotton germplasm resources, but also for identifying and improving desirable traits, such as fiber quality and unique plant features, that can be exploited for modern cotton production.
Pakistan is the fourth largest producer of cotton, and its cotton has good fiber quality. However, owing to limiting factors including both biotic and abiotic stresses, production has remained stagnant for the last two decades. Among these, cotton leaf curl disease (CLCuD) is one of the major limiting factors, causing huge yield losses. Breeders have made many efforts to overcome this problem by developing resistant/tolerant varieties through conventional breeding approaches, but these varieties become susceptible after two to three years because of mutations in the viral strains causing CLCuD. Therefore, sufficient genetic variability, proper exploitation of the existing varieties through hybridization, polyploidy creation and the introduction of new exotic germplasm are very important.
The extent of genetic variation can be estimated using various biometrical techniques such as principal component analysis, cluster analysis or principal coordinate analysis. Among these biometrical procedures, the main advantage of principal component analysis (PCA) is that each genotype can be assigned to only one group; it also reflects the importance of the largest contributor to the total variability at each axis of differentiation.
The present research was therefore conducted to estimate the genetic diversity among exotic lines of G. arboreum L. imported from the USA using correlation and principal component analysis, in order to identify suitable genotypes for resistance/tolerance against CLCuD as well as good morphological, yield and fiber related traits. A further aim was to determine the relationships among these traits and to identify superior genotypes for use in future breeding programs.
2. Materials and Methods
2.1. Breeding Materials
The present research was carried out at the experimental farm of the Central Cotton Research Institute (CCRI), Multan, Pakistan, located at 30.1978˚N latitude and 71.4697˚E longitude, during the 2011-2012 cotton growing season. One hundred and nineteen accessions imported from the USA were evaluated. Sowing was done on June 14, 2011, which is a very late sowing time under the agro-ecological conditions of Multan. Each genotype was sown in two rows 2.67 m long and 75 cm apart, with a plant-to-plant distance of 30 cm. Normal cultural and agronomic practices were applied as recommended, and data were recorded on ten consecutive, undamaged and representative plants for each entry. The measured traits were number of monopodial branches, number of sympodial branches, node to first monopodial/sympodial branch, plant height (cm), number of bolls plant−1, boll weight (g), seed cotton yield (g plant−1), lint %age, staple length (mm), fiber strength (g tex−1) and micronaire value (μg inch−1). For lint %age, each dry and clean seed cotton sample was weighed and ginned separately with a 10-saw ginning machine, and lint %age was calculated from the lint obtained from these samples. The incidence of CLCuD as a percentage was calculated using a previously suggested formula, based on the disease rating scale in Table 1.
CLCuD incidence (%) = (sum of all disease ratings / total number of plants) × 25.
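As a worked illustration of the incidence formula above, the following minimal Python sketch computes the percentage for a single entry. It assumes that the rating scale in Table 1 runs from 0 to 4, so that the factor 25 maps the highest rating to 100%; this scale, the function name and the ten example ratings are assumptions for illustration and are not taken from the study.

```python
def clcud_incidence(ratings, factor=25):
    """CLCuD incidence (%) = (sum of disease ratings / number of plants) * factor.

    Assumption: ratings follow a 0-4 scale, so factor=25 gives 100% when
    every plant receives the highest rating.
    """
    if not ratings:
        raise ValueError("at least one plant rating is required")
    return factor * sum(ratings) / len(ratings)


# Hypothetical ratings for the ten plants scored in one entry
example_ratings = [0, 0, 1, 0, 2, 0, 0, 1, 0, 0]
print(clcud_incidence(example_ratings))  # 10.0
```

Under this assumption, an entry in which all plants are symptomless scores 0% and an entry in which all plants receive the top rating scores 100%.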
2.2. Statistical Analysis
The recorded data were averaged and analysed for simple statistics, i.e. mean, variance, range, frequency distribution, coefficient of variation and standard deviation, using MS-Excel 2007. Correlation and principal component analysis (PCA) were performed on the recorded data for quantitative traits. Before the correlation analysis and PCA, the mean of each parameter was standardized to avoid effects of differences in scale. Euclidean distance coefficients were calculated for all pairs of accessions, and the resulting Euclidean dissimilarity matrices were used to estimate the associations among the germplasm with NTSYS-pc v. 2.1.
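The standardization and distance steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the study used NTSYS-pc, and the exact standardization option is not reported, so the z-score with the sample standard deviation used here is an assumption, and the trait values and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def standardize(traits):
    """Z-score every trait: subtract its mean and divide by its sample SD."""
    scaled = {}
    for name, values in traits.items():
        m, s = mean(values), stdev(values)
        scaled[name] = [(v - m) / s for v in values]
    return scaled


def euclidean_matrix(traits):
    """Pairwise Euclidean distances between accessions (rows) over all traits."""
    names = list(traits)
    n = len(traits[names[0]])
    rows = [[traits[t][i] for t in names] for i in range(n)]
    return [
        [sqrt(sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical means for three accessions and two traits on very different scales
data = {"plant_height_cm": [100, 150, 200], "boll_weight_g": [1.5, 2.0, 2.5]}
for row in euclidean_matrix(standardize(data)):
    print([round(d, 3) for d in row])
```

Without the standardization step, the distance between accessions in this example would be determined almost entirely by plant height, simply because it is measured on a larger numerical scale than boll weight.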
Basic descriptive statistics (mean, variance, range and standard deviation) of all the genotypes for morphological, yield and fiber traits are given in Table 2. The maximum variation (1258.74) was recorded for plant height, which ranged from 48 to 210 cm, followed by seed cotton yield (658.91) and number of bolls plant−1, with a range of 15 - 150 g and 6.71, respectively. Lint %age showed a variance of 46.08, with a range of 4.2 to 44.5%. Similarly, number of sympodial branches and fiber strength also showed considerable variation (31.22 and 15.62), ranging from 14 - 41 and 16.7 - 39.9 g tex−1, respectively. Staple length (8.76), first sympodial node number (7.81), number of monopodial branches (5.50), micronaire value (1.08) and boll weight (0.23) showed comparatively less variation among the genotypes (Table 2).
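The descriptive statistics reported above can be reproduced for any single trait with a few lines of Python. The sketch below is illustrative only: the five plant heights are hypothetical, and the function name is not from the study. It also returns the coefficient of variation, which is unit-free, whereas variances carry squared units (cm², g²) and are therefore not directly comparable across traits measured on different scales.

```python
from statistics import mean, stdev, variance


def describe(values):
    """Mean, sample variance, sample SD, range and CV (%) for one trait."""
    m = mean(values)
    sd = stdev(values)
    return {
        "mean": m,
        "variance": variance(values),
        "sd": sd,
        "range": (min(values), max(values)),
        "cv_percent": sd / m * 100,
    }


# Hypothetical plant heights (cm) for five accessions
summary = describe([90, 120, 150, 180, 210])
for key, value in summary.items():
    print(key, value)
```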
3.1. Correlation Studies
Phenotypic correlation analysis showed significant associations among the 12 studied traits of the 119 genotypes (Table 3). Seed cotton yield was highly significantly and positively correlated with boll weight (0.490**) and number of bolls plant−1 (0.784**). Seed cotton yield was also positively but non-significantly correlated with plant height, sympodial branches, first sympodial node, micronaire and fiber strength, whereas it was negatively and non-significantly correlated with monopodial branches and staple length. Plant height showed highly significant positive correlations with sympodial branches (0.616**) and lint percentage (0.321**), and a significant positive correlation with micronaire value (0.197*). The association of monopodial branches with sympodial branches (0.211*) and fiber fineness (0.190*) was also positive and significant. Lint percentage was highly significantly and positively correlated with plant height (0.321**), sympodial branches (0.263**) and micronaire value (0.355**). A significant negative relationship was found between micronaire value and fiber strength (−0.221*).
Table 1. Symptoms and rating scale of cotton leaf curl virus disease (CLCuD).
Table 2. Descriptive statistics for various traits of interest in G. arboreum accessions.
Table 3. Simple correlation coefficients for morpho-yield traits, fiber traits and cotton leaf curl virus disease.
PL = Plant height, MB = Monopodial branches, SB = Sympodial branches, FSN = First sympodial node, BW = Boll weight, BPP = Bolls per plant, SCY = Seed cotton yield, LINT % AGE = Ginning out turn, SL = Staple length, MIC = Micronaire, FS = Fiber strength and CLCUD = Cotton leaf curl disease. * and ** are significant at the 5% and 1% levels of probability, respectively.
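The significance levels attached to the correlation coefficients can be checked with a short Python sketch. It assumes the usual two-tailed t-test for a Pearson coefficient, t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom, which the paper does not state explicitly. The critical values 1.980 and 2.619 are approximate table values for 117 degrees of freedom (n = 119), and the function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero with n - 2 degrees of freedom."""
    if abs(r) >= 1:
        return float("inf") if r > 0 else float("-inf")
    return r * sqrt((n - 2) / (1 - r * r))


def significance_stars(r, n, t_05=1.980, t_01=2.619):
    """'**' at the 1% level, '*' at the 5% level, '' otherwise.

    Assumption: the default critical values are approximate two-tailed
    t values for df = 117 and must be changed for other sample sizes.
    """
    t = abs(t_statistic(r, n))
    if t >= t_01:
        return "**"
    if t >= t_05:
        return "*"
    return ""


# Coefficients quoted in the text, tested with n = 119 genotypes
for r in (0.784, 0.490, 0.197, -0.221):
    print(f"{r:+.3f}{significance_stars(r, 119)}")
```

With n = 119, these thresholds correspond to |r| of roughly 0.18 (5%) and 0.24 (1%), which is consistent with the single and double asterisks attached to the coefficients quoted in the text.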
3.2. Principal Component Analysis
Based on twelve traits (the morpho-yield and fiber traits together with CLCuD rating), the 119 genotypes were subjected to principal component analysis. Among the 11 principal components (PCs), only four had an eigenvalue greater than 1; together they accounted for about 65.88% of the total phenotypic variation among the studied G. arboreum L. genotypes (Table 4). The remaining 34.12% of the variation was accounted for by the other seven components. PC1 accounted for the most variability (26.08%), followed by PC2 (16.41%), PC3 (14.57%) and PC4 (8.83%) (Table 4). The scatter diagram of the PCA for these exotic G. arboreum L. lines (Figure 1) shows a considerable level of genetic diversity among the studied genotypes.
The contribution of individual traits to the variability in each PC showed that the only positive loading in PC1 was that of staple length (0.135) (Figure 1), while the remaining 11 traits had negative loadings in the same PC (Table 4). PC2 accounted for 16.40% of the overall variation; in this PC, seed cotton yield (0.625) had the largest positive loading, followed by number of bolls plant−1 (0.577), staple length (0.373), first sympodial node (0.360), boll weight (0.348), fiber strength (0.272) and CLCuD (0.087), while negative loadings were observed for plant height (−0.398), number of monopodial (−0.401) and sympodial branches (−0.392), lint %age (−0.379) and micronaire value (−0.392). PC3 accounted for 14.57% of the total diversity among the genotypes, with positive loadings for micronaire value (0.553), boll weight (0.391), seed cotton yield (0.300), number of bolls plant−1 (0.124), lint %age (0.038) and number of monopodial branches (0.036), and negative loadings for staple length (−0.711), fiber strength (−0.575), plant height (−0.423), number of sympodial branches (−0.349), CLCuD (−0.200) and first sympodial node number (−0.044). PC4 accounted for 8.83% of the agro-morphological variation. The largest positive loading was that of boll weight (0.518), with smaller positive loadings for number of monopodial (0.267) and sympodial branches (0.184), seed cotton yield (0.120), plant height (0.079), CLCuD (0.023) and staple length (0.010). Negative loadings were found for number of bolls plant−1 (−0.144), micronaire value (−0.369), lint percentage (−0.369) and first sympodial node (−0.604) (Table 4).
Table 4. Principal component analysis of CLCuD and other quality traits in some G. arboreum genotypes of cotton.
Figure 1. Scatter diagram of principal component analysis.
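The way eigenvalues, percentage variability and the "eigenvalue greater than 1" rule fit together can be illustrated with a small, dependency-free Python sketch. It performs PCA as an eigen-decomposition of a trait correlation matrix using cyclic Jacobi rotations. This is an assumption made for illustration: the paper does not describe the numerical routine used, a numerical library would normally be used in practice, and the 3 × 3 correlation matrix and function names below are hypothetical.

```python
from math import sqrt


def jacobi_eigen(matrix, max_sweeps=50, tol=1e-20):
    """Eigenvalues and eigenvectors of a symmetric matrix (cyclic Jacobi)."""
    n = len(matrix)
    a = [[float(x) for x in row] for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                theta = (a[q][q] - a[p][p]) / (2 * a[p][q])
                sign = 1.0 if theta >= 0 else -1.0
                t = sign / (abs(theta) + sqrt(theta * theta + 1))
                c = 1 / sqrt(t * t + 1)
                s = t * c
                for m in (a, v):  # right-multiply a and v by the rotation
                    for k in range(n):
                        mkp, mkq = m[k][p], m[k][q]
                        m[k][p], m[k][q] = c * mkp - s * mkq, s * mkp + c * mkq
                for k in range(n):  # left-multiply a by the transposed rotation
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return [a[i][i] for i in range(n)], v


def pca_from_correlation(r):
    """Components sorted by eigenvalue, with % variance and retention flag."""
    values, vectors = jacobi_eigen(r)
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    total = sum(values)  # equals the number of traits for a correlation matrix
    components, cumulative = [], 0.0
    for rank, i in enumerate(order, start=1):
        percent = values[i] / total * 100
        cumulative += percent
        components.append({
            "pc": rank,
            "eigenvalue": values[i],
            "percent": percent,
            "cumulative": cumulative,
            "retained": values[i] > 1.0,  # eigenvalue-greater-than-1 rule
            "loadings": [row[i] for row in vectors],  # eigenvector coefficients
        })
    return components


# Hypothetical correlation matrix for three traits
corr = [[1.0, 0.6, 0.3], [0.6, 1.0, 0.2], [0.3, 0.2, 1.0]]
for comp in pca_from_correlation(corr):
    print(comp["pc"], round(comp["eigenvalue"], 3), round(comp["percent"], 2), comp["retained"])
```

Because the eigenvalues of a correlation matrix sum to the number of traits, each percentage is simply the eigenvalue divided by that number. The sign of a whole eigenvector is arbitrary, so only the relative signs of the loadings within one component are meaningful.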
For any successful breeding program, the selection of better genotypes having desirable traits, and significant positive associations among these traits, are of utmost importance. The highly significant positive correlations of seed cotton yield with boll weight and number of bolls plant−1 in the present study indicate a direct impact of these two traits on seed cotton yield in cotton genotypes. These results are in agreement with previous findings of positive associations among yield and yield components in exotic lines of G. hirsutum. The positive relationship of lint %age with micronaire value and the significant but negative association of lint %age with staple length in this study are also in agreement with previous findings in imported U.S. upland cotton germplasm.
No associations with CLCuD were observed for any of the studied traits, as all the G. arboreum lines evaluated were scored resistant to CLCuD. As a species, G. arboreum L. is highly resistant to CLCuD. This could be due to the presence of a lower palisade layer, densely arranged midrib cortex cells and a relatively greater distance between the lower epidermis of the midrib and the phloem. These results contrast with previous findings of a significant negative association of seed cotton yield with CLCuD, but this is likely due to the difference in the species studied: the present study used G. arboreum L., whereas the earlier research used G. hirsutum L. lines. As a species, G. hirsutum L. is highly susceptible to CLCuD.
Principal component analysis (PCA) is an important tool for partitioning the total variability into its components, and for the preservation and utilization of genetic resources. The partitioning of cotton genotypes into different principal components (PCs) in this study was due to morphological differences and not to the geographical origin of these genotypes. In the present study, four of the 11 PCs accounted for a substantial share of the variability (65.88%) and genetic diversity within and between the exotic lines of G. arboreum L. Similar results were reported in previous work on 79 exotic accessions of G. hirsutum L., in which four PCs out of 12 accounted for 65.4% of the variability among genotypes screened for CLCuD, yield and fiber traits. It is commonly accepted that maximum heterotic effects occur through maximum variability. Maximum contributions for these traits were found among the genotypes GS-4, GS-9, GS-8, GS-55 and GS-50. The presence of sufficient variability in these cotton genotypes offers considerable scope for selection and utilization to improve G. arboreum in future breeding programs. Moreover, these lines would be good candidates for an interspecific hybridization program designed to improve tolerance to CLCuD and drought in G. hirsutum.
The results indicate that G. arboreum L. genotypes possess sufficient genetic diversity and significant positive associations among most of the yield related traits. Based on principal component analysis, the first four components contributed 65.88% of the total variation among the genotypes. The genotypes GS-4, GS-9, GS-8, GS-55 and GS-50 could be utilized in breeding programs on the basis of their high positive loading for staple length (0.135) in PC1 and for seed cotton yield (0.625), number of bolls plant−1, boll weight, first sympodial node, staple length and fiber strength in PC2. The correlation and principal component analyses enabled the recognition of genotypes with high yield potential and unique fiber characteristics. These can be used not only in interspecific hybridization programs to transfer CLCuD resistance to G. hirsutum varieties, but could also potentially be selected and developed into G. arboreum (Desi) varieties for production under the low input cultivation conditions common among smallholder farmers.
This study was supported in part by the “Pak-US cotton productivity enhancement program”, CCRI Multan Component ID 1198-02, of the International Center for Agricultural Research in the Dry Areas (ICARDA), funded by the United States Department of Agriculture (USDA), Agricultural Research Service (ARS), under agreement No. 58-6402-0-178F. Any opinions, findings, conclusions or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of the USDA or ICARDA. Furthermore, the authors thank Dr. J. Scheffler for critical review of the manuscript.
 Khan, N.U. (2013) Diallel Analysis of Cotton Leaf Curl Virus (CLCuV) Disease, Earliness, Yield and Fiber Traits under CLCuV Infestation in Upland Cotton. Australian Journal of Crop Science, 7, 1955-1966.
 Khan, N.U. and Hassan, G. (2011) Genetic Effects on Morphological and Yield Traits in Cotton (G. hirsutum L.). Spanish Journal of Agricultural Research, 9, 460-472.
 Kulkarni, V.N. (2002) Hirsutization of G. arboreum Cotton and Genetic Emendation of G. hirsutum for Sucking Pest Resistance. Ph.D. Thesis Submitted to University of Agricultural Sciences, Dharwad, India.
 Ansingkar, A.S., Kadke, P.P., Borekar, S.T. and Bosle, S.S. (2004) Altering G. hirsutum Cotton to Cellule Level to Impart Multiple Sucking Pests’ Resistance through Interspecific Hybridization. Proceedings of International Symposium on “Strategies for Sustainable Cotton Production: A Global Vision”, Crop Improvement, 23-25 November 2004, University of Agricultural Sciences, Dharwad, Karnataka, 101-103.
 Mehetre, S.S., Aher, A.R. and Gawande, V.L. (2003) Induced Polyploidy in Gossypium: A Tool to Overcome Inter-Specific Incompatibility of Cultivated Tetra-Ploid and Diploid Cottons. Current Science, 84, 1510-1512.
 Rehman, M., Yasmeen, T., Tabbasam, N., Ullah, I., Asif, M. and Zaffar, Y. (2008) Studying the Extent of Genetic Diversity among Gossypium arboreum Genotypes Using DNA Finger Printing. Genetic Resources and Crop Evolution, 55, 331-339.
 Verma, S.K., Siwach, P. and Sethi, K. (2014) Genetic Improvement of Gossypium arboreum L. Using Molecular Markers: Status and Development Needs. African Journal of Agricultural Research, 9, 2238-2249.
 Saeed, F., Farooq, J., Mahmood, A., Malik, T.H., Riaz, M. and Ahmad, S. (2014) Genetic Diversity in Upland Cotton for Cotton Leaf Curl Virus Disease, Earliness and Fiber Quality. Pakistan Journal of Agricultural Research, 27, 226-236.
 Farooq, A., Farooq, J., Mahmood, A., Batool, A., Rehman, A., Shakeel, A., Riaz, M., Shahid, M.T.H. and Mehboob, S. (2011) An Overview of Cotton Leaf Curl Virus Disease (CLCuD) a Serious Threat to Cotton Productivity. Australian Journal of Crop Science, 5, 1823-1831.
 Esmail, R.M., Zhang, J.F. and Abdel-Hamid, A.M. (2008) Genetic Diversity in Elite Cotton Germplasm Lines Using Field Performance and RAPD Markers. World Journal of Agricultural Sciences, 4, 369-375.
 Li, Z., Wang, X., Yan, Z., Guiyin, Z., Wu, L., Jina, C. and Ma, Z. (2008) Assessment of Genetic Diversity in Glandless Cotton Germplasm Resources by Using Agronomic Traits and Molecular Markers. Frontiers of Agriculture in China, 2, 245-252.
 Bajracharya, J., Steele, K.A., Jarvis, D.I., Sthapit, B.R. and Witcombe, J.R. (2006) Rice Landrace Diversity in Nepal: Variability of Agro-Morphological Traits and SSR Markers in Landraces from a High-Altitude Site. Field Crops Research, 95, 327-335.
 Brown-Guedira, G.L. (2000) Evaluation of Genetic Diversity of Soybean Introductions and North American Ancestors Using RAPD and SSR Markers. Crop Science, 40, 815-823.
 Akhtar, K.P., Haider, S., Khan, M.K.R., Ahmad, M., Sarwar, N., Murtaza, M.A. and Aslam, M. (2010) Evaluation of Gossypium Species for Resistance to Leaf Curl Burewala Virus. Annals of Applied Biology, 157, 135-147.
 Ali, M.A., Nawab, N.N., Abbas, A., Zulkiffal, M. and Sajjad, M. (2009) Evaluation of Selection Criteria in Cicer arietinum L. Using Correlation Coefficients and Path Analysis. Australian Journal of Crop Science, 3, 65-70.
 Méndez-Natera, J.R., Rondón, A., Hernández, J. and Merazo-Pinto, J.F. (2012) Genetic Studies in Upland Cotton. III. Genetic Parameters, Correlation and Path Analysis. SABRAO Journal of Breeding and Genetics, 44, 112-128.
 Pecetti, L. and Damania, A.B. (1996) Geographic Variation in Tetraploid Wheat (Triticum turgidum ssp. turgidum convar. durum) Landraces from Two Provinces in Ethiopia. Genetic Resources and Crop Evolution, 43, 395-407.
 Malik, W., Iqbal, M.Z., Khan, A.A., Noor, E., Qayyum, A. and Hanif, M. (2011) Genetic Basis of Variation for Seedling Traits in Gossypium hirsutum L. African Journal of Biotechnology, 10, 1099-1105.
 Ashokkumar, K. and Ravi-Kesavan, R. (2011) Morphological Diversity and Per Se Performance in Upland Cotton (Gossypium hirsutum L.). Journal of Agricultural Science, 3, 107-113.