With the expansion of China’s opening up and the acceleration of its internationalization, especially the development of the “Belt and Road” initiative, a large number of international students are being attracted to China for medical education, most of them choosing an English-medium Bachelor of Medicine & Bachelor of Surgery (MBBS) program. In recent years, major concerns have been raised about the education quality of MBBS students. The Ministry of Education of China has issued quality standards and adjusted annual enrolment plans to improve the education quality of MBBS students; in addition, researchers have proposed many strategies for quality improvement. However, there is no authoritative quality evaluation system for MBBS students, especially for graduating students. It is unknown whether MBBS graduates have achieved a high quality of education or met international quality standards. To assess the quality of MBBS students, this research constructs an index system of quality evaluation for international MBBS graduates by applying a modified Delphi method, drawing on important domestic and international standards of medical education and on core competences for clinicians, with the purpose of providing a reference for evaluating the quality of MBBS graduates in China.
2. Materials and Methods
2.1. Selection of Expert Panel
According to the research needs and the basic requirements of the Delphi method, 20 experts were selected for consultation from colleges and universities that recruit MBBS students. A total of 18 experts completed both rounds of consultation. The criteria for selecting experts were: 1) having a certain understanding of, or having conducted research on, MBBS students; 2) having more than 10 years of working experience in the management or teaching of MBBS students; 3) holding at least a middle-rank professional title and a bachelor’s degree; 4) having a strong interest in this research and the time available to participate in the whole consultation process.
2.2. Literature Review
Related literature was collected widely and searched systematically in several online databases to determine the principles, methods, and main basis and references for constructing the index system for quality evaluation of MBBS graduates. A two-round modified Delphi method was applied to establish the index system in accordance with the basic principles of scientificity, systematicness, feasibility and comprehensiveness. The initial index system, consisting of 4 primary indicators and 29 secondary indicators, was drawn up with reference to essential domestic and international standards of medical education, namely “basic medical education WFME global standards for quality improvement (the 2015 revision)”, “WHO guidelines for quality assurance of basic medical education in the western pacific region”, “global minimum essential requirements (GMER)”, “medical education standard of Chinese undergraduate—clinical medicine (2016)”, “quality standards of higher education for international students in China (Trial)” and “interim provision for quality control standards in undergraduate medical education” in English for international students (2007), as well as other competence requirements for clinicians. The questionnaire for the first round of expert consultation was then designed in three main parts: demographic questions such as name, age, gender and years of working experience; the initial index system; and the scoring instructions and marking sheet.
2.3. Delphi Method
Research shows that the more rounds of expert consultation there are, the lower the response rate, and another study shows that two-round Delphi consultations obtain more accurate responses, because respondents tend to drop out after two iterations. Therefore, two rounds of expert consultation were planned and carried out online and in person from February to May 2019. In the first round, experts were asked to provide their demographic information and to rate each indicator on importance, judgment basis and familiarity according to the following rules: the importance of each indicator was rated on a 5-point Likert scale (1 = least important, 5 = most important); the judgment basis was scored in four categories, namely practical experience, theoretical basis, domestic and foreign literature, and intuition, valued at 0.4, 0.3, 0.2 and 0.1, respectively; and familiarity with each indicator was rated on five levels scored from 0.1 to 0.9 (0.1 = least familiar, 0.9 = most familiar). Experts were also invited to comment on deleting, adding and modifying indicators. In the second round, an anonymous summary of the first-round results was presented to the experts so that they could compare it with their own views. They were asked to score the importance, judgment basis and familiarity of the modified indicators under the same rules and to make suggestions on particular indicators where needed.
2.4. Data Analysis
The data from the two rounds of consultation were entered into EpiData 3.1 by two individuals and analyzed with SPSS 20. The expert response rate and the authority coefficient were calculated. In addition, the interquartile range (IQR), mean (M), standard deviation (SD), coefficient of variation (CV) and full score ratio (FSR) of each indicator’s importance scores were used to analyse consensus and to screen indicators.
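The per-indicator statistics named above can be illustrated with a minimal Python sketch. It is not the SPSS procedure used in the study. It assumes that the CV is the sample standard deviation divided by the mean and that the FSR is the share of experts giving the full score of 5; the function name `describe_indicator` and the ratings are hypothetical.

```python
from statistics import mean, stdev

def describe_indicator(scores, full_score=5):
    """Summarize one indicator's importance scores (1-5 Likert ratings)."""
    m = mean(scores)
    sd = stdev(scores)  # sample standard deviation (n - 1 denominator)
    cv = sd / m         # coefficient of variation (assumed definition: SD / M)
    # Full score ratio (assumed definition): share of experts giving the top score
    fsr = sum(1 for s in scores if s == full_score) / len(scores)
    return {"M": m, "SD": sd, "CV": cv, "FSR": fsr}

# Hypothetical ratings from 18 experts for a single indicator (not study data)
scores = [5, 5, 4, 5, 5, 4, 5, 5, 5, 3, 5, 4, 5, 5, 5, 4, 5, 5]
summary = describe_indicator(scores)
print(summary)
```

A lower CV and a higher FSR both point to stronger agreement that the indicator is important.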
3. Results
3.1. Panel of Experts
A total of 18 experts completed both rounds of consultation. Ten of them were male, and most were aged 40 to 59. Over 90% of the experts held a senior professional title and had studied abroad. Seventeen held a master’s or doctoral degree, and 11 had taught MBBS students in English. More details are given in Table 1.
3.2. Response Rate
Normally, the expert response rate can be represented by the recovery rate of questionnaires; the higher the rate, the more active and supportive the experts are. In the first round, a total of 20 questionnaires were handed out and 18 effective responses were received, a recovery rate of 90%. In the second round, 18 questionnaires were sent out and all were returned, a recovery rate of 100%, suggesting that the experts showed great interest in and support for this project.
Table 1. Characteristics of the Delphi panel (n = 18).
3.3. Expert Authority Coefficient
The expert authority coefficient is closely related to the experts’ judgment basis and familiarity; the higher the coefficient, the more trustworthy and reliable the consultation results. Another study shows that research results can be considered highly authoritative if the coefficient exceeds 70%. In this study, the expert authority coefficient was 91% in round 1 and 93% in round 2, demonstrating that the experts had high authority in this field and that the consultation results were trustworthy.
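The text does not give the formula for the authority coefficient. One formulation commonly used in Delphi studies is Cr = (Ca + Cs) / 2, where Ca is the judgment-basis score and Cs is the familiarity score. The sketch below adopts that formula as an assumption. It further assumes that each expert's Ca is the sum of the values (from Section 2.3) of the bases he or she relied on. The expert data and the names `BASIS_VALUES` and `authority_coefficient` are hypothetical.

```python
from statistics import mean

# Judgment-basis values as listed in the scoring rules of Section 2.3
BASIS_VALUES = {
    "practical_experience": 0.4,
    "theoretical_basis": 0.3,
    "literature": 0.2,
    "intuition": 0.1,
}

def authority_coefficient(bases_used, familiarity):
    """Assumed formula: Cr = (Ca + Cs) / 2 for one expert."""
    ca = sum(BASIS_VALUES[b] for b in bases_used)  # assumed: sum of bases relied on
    return (ca + familiarity) / 2

# Hypothetical experts: (bases relied on, self-rated familiarity from 0.1 to 0.9)
panel = [
    (["practical_experience", "theoretical_basis", "literature"], 0.9),
    (["practical_experience", "theoretical_basis"], 0.7),
    (["practical_experience", "theoretical_basis", "literature", "intuition"], 0.9),
]

# Panel-level coefficient taken as the mean of the individual coefficients
panel_cr = mean(authority_coefficient(bases, fam) for bases, fam in panel)
print(round(panel_cr, 2))
```

Under these assumptions, a panel value above 0.70 would meet the threshold cited in the text.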
The interquartile range (IQR) has been described as the most reliable way of defining consensus, and an IQR of no more than 20% of the rating scale is considered a good level of consensus. Because a 5-point Likert scale was used in this study, an IQR of no more than 1 was required for good consensus (0 ≤ IQR ≤ 1, with 0 corresponding to the strongest consensus; the closer the value is to 1, the lower the consensus). The IQR values of the primary indicators were all below 1 in both rounds. The IQR values of the secondary indicators were also all below 1, and the proportion of secondary indicators with an IQR of 0 was higher in the second round (86%) than in the first round (79%). This was taken to indicate that greater consensus had been achieved in the second round and that a third round was unnecessary.
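The consensus rule can be sketched as follows. The threshold is 20% of the 5-point scale, that is, 1. Quartile conventions differ between software packages, so the `inclusive` method of Python's `statistics.quantiles` used here is an assumption and may not match the SPSS output exactly. The indicator names and ratings are hypothetical.

```python
from statistics import quantiles

def iqr(scores):
    """Interquartile range (Q3 - Q1) of one indicator's ratings."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return q3 - q1

def consensus_summary(ratings_by_indicator, scale_points=5, threshold_share=0.20):
    """Check the IQR rule and report the share of indicators with IQR = 0."""
    limit = threshold_share * scale_points  # 20% of a 5-point scale = 1
    iqrs = {name: iqr(s) for name, s in ratings_by_indicator.items()}
    return {
        "all_within_limit": all(v <= limit + 1e-9 for v in iqrs.values()),
        "share_iqr_zero": sum(1 for v in iqrs.values() if v == 0) / len(iqrs),
    }

# Hypothetical ratings from 18 experts for two indicators (not study data)
ratings = {
    "indicator_A": [5] * 16 + [4] * 2,  # near-unanimous ratings
    "indicator_B": [5] * 9 + [4] * 9,   # ratings split between 4 and 5
}
result = consensus_summary(ratings)
print(result)
```

Comparing `share_iqr_zero` between rounds mirrors the 79% versus 86% comparison reported in the text.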
3.5. Indicators Screening
The critical value method was used to screen indicators, and the boundary values of the CV and FSR were calculated from the importance scores of each indicator. Any indicator with a CV greater than the CV boundary value or an FSR less than the FSR boundary value was to be excluded from the index system. Since the experts saw no need to delete or add primary indicators, the main adjustments concerned the secondary indicators. In the first round, 6 indicators met the exclusion criteria according to the CV and FSR values (Table 2) and the boundary values (Table 3), but after comprehensive consideration of the experts’ opinions, only 1 indicator, “overview of Chinese traditional medicine”, was actually excluded. In the second round, 2 indicators, “self-adjustment and adaptability” and “scientific research capabilities”, were removed on the basis of the statistical data and the experts’ advice. After two rounds of expert consultation, the index system for quality evaluation of MBBS graduates was finally established, consisting of 4 primary indicators and 26 secondary indicators.
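The text does not state how the boundary values were computed. A convention often seen in Delphi screening is mean plus one standard deviation for the CV (an upper bound) and mean minus one standard deviation for the FSR (a lower bound). The sketch below uses that convention purely as an assumption. The CV and FSR values and the names `boundary_values` and `flag_for_review` are hypothetical. Flagged indicators are only candidates for removal, since the study also weighed expert opinion before excluding anything.

```python
from statistics import mean, stdev

def boundary_values(cvs, fsrs):
    """Assumed boundaries: mean + SD for the CV, mean - SD for the FSR."""
    cv_bound = mean(cvs) + stdev(cvs)     # upper bound on the CV
    fsr_bound = mean(fsrs) - stdev(fsrs)  # lower bound on the FSR
    return cv_bound, fsr_bound

def flag_for_review(indicators):
    """Return names of indicators that fail either screening criterion.

    indicators maps an indicator name to its (CV, FSR) pair.
    """
    cvs = [cv for cv, _ in indicators.values()]
    fsrs = [fsr for _, fsr in indicators.values()]
    cv_bound, fsr_bound = boundary_values(cvs, fsrs)
    return [
        name
        for name, (cv, fsr) in indicators.items()
        if cv > cv_bound or fsr < fsr_bound
    ]

# Hypothetical (CV, FSR) pairs, not the values reported in Table 2
sample = {
    "indicator_A": (0.08, 0.80),
    "indicator_B": (0.10, 0.75),
    "indicator_C": (0.09, 0.78),
    "indicator_D": (0.30, 0.30),
    "indicator_E": (0.11, 0.72),
}
flagged = flag_for_review(sample)
print(flagged)
```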
3.6. Index Weight
The weights of the primary and secondary indicators were derived from the experts’ importance scores in the second round. The weights of the four primary indicators were 0.2544, 0.2513, 0.2544 and 0.2399, respectively, and the weights of the secondary indicators are shown in Table 4.
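The weighting procedure is not spelled out in the text. One simple possibility, shown here only as an assumption, is to normalize the mean importance scores of the indicators at the same level so that their weights sum to 1. The scores and the name `normalized_weights` are hypothetical, and the sketch is not intended to reproduce the reported weights.

```python
def normalized_weights(mean_scores):
    """Assumed weighting: each indicator's mean score divided by the level total."""
    total = sum(mean_scores.values())
    return {name: score / total for name, score in mean_scores.items()}

# Hypothetical second-round mean importance scores for four primary indicators
primary_means = {
    "primary_1": 4.8,
    "primary_2": 4.7,
    "primary_3": 4.8,
    "primary_4": 4.5,
}
weights = normalized_weights(primary_means)
for name, w in weights.items():
    print(name, round(w, 4))
```

Under this scheme, indicators with equal mean scores receive equal weights, which is consistent with the first and third primary indicators sharing the weight 0.2544 in the text.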
Table 2. The values of CV and FSR of scoring importance on secondary indicators.
In this research, the Delphi technique was used to establish an index system. The key to implementing this method successfully is the selection of experts, in terms of both number and quality. A total of 18 experts participated in both rounds of consultation, as suggested. They had over 10 years of working experience in the relevant field and held high professional titles and academic degrees, which suggests that the panel was authoritative and the results reliable. In addition, the response rate and the expert authority coefficient showed strong support and professionalism among the experts, and the IQR values demonstrated that they reached agreement on every indicator, indicating that the research results are scientific and reliable.
Table 3. Boundary value table for indicators screening.
Table 4. Indicators for quality evaluation of international MBBS graduates and their weights.
Universities and colleges should set teaching objectives for international MBBS students in accordance with the training objectives and requirements for medical undergraduates in China, and should carry out convergence training so that these students meet the basic requirements for Chinese medical undergraduates on graduation, as required by “interim provision for quality control standards in undergraduate medical education” in English for international students (2007). However, there are no authoritative evaluation standards or systems for determining whether MBBS graduates have met those requirements. With the increasing number of international medical students coming to China and the growing attention paid to the quality of their education, it is essential to build an evaluation system for international medical students for the purposes of quality evaluation and improvement. In this study, a scientific and reliable index system for quality evaluation of international MBBS graduates in China has been established, comprising 4 primary indicators and 26 secondary indicators, all of which were weighted. It can serve as a reference and a practical tool for evaluating the quality of MBBS graduates, and may inform the reform of teaching methods and the improvement of teaching quality for MBBS students.
The indicators of this system will be studied further, and the system will then be applied in field experiments at several medical universities and colleges, with the aim of improving and refining it for use in actual quality evaluation.
The index system for quality evaluation of international MBBS graduates in China has been formed by applying a modified Delphi method with two rounds of consultation. It comprises 4 primary indicators and 26 secondary indicators, has a certain degree of scientificity and reliability, and can provide a reference for medical universities and colleges and for third-party educational institutions when evaluating the quality of MBBS students at graduation.
Keeney, S., Hasson, F. and McKenna, H.P. (2001) A Critical Review of the Delphi Technique as a Research Methodology for Nursing. International Journal of Nursing Studies, 38, 195-200.
Mertens, A.C., Cotter, K.L., Foster, B.M., Zebrack, B.J., Hudson, M.M., Eshelman, D., Loftis, L., Sozio, M. and Oeffinger, K.C. (2004) Improving Health Care for Adult Survivors of Childhood Cancer: Recommendations from a Delphi Panel of Health Policy Experts. Health Policy, 69, 169-178.
Han, B., Yang, G.R. and Huang, G.Y. (2016) Research on the Construction of Performance Appraisal System of Head Nurses in Tertiary Hospitals Based on Principal Basis Dual Method. Nursing Research, 30, 1456-1460.
Murphy, M., Black, N., Lamping, D., McKee, C., Sanderson, C., Askham, J. and Marteau, T. (1998) Consensus Development Methods, and Their Use in Clinical Guideline Development. Health Technology Assessment, 2.
Rayens, M.K. and Hahn, E.J. (2000) Building Consensus Using the Policy Delphi Method. Policy, Politics, & Nursing Practice, 1, 308-315.