Experts are now the main evaluators of the teaching abilities of college faculty, and their evaluation results are widely used in teaching management. A common observation is that experts differ considerably in rating different teachers’ performance, and sometimes they give different scores even to the same teacher. Some researchers attribute this phenomenon to the interaction of evaluation subjects, scales and external factors, and hold that the emotions of expert raters affect the evaluation results. In teaching evaluation, the data collected on teachers’ classroom performance are analyzed, explained and classified according to the corresponding scoring standards. This elicits a cognitive appraisal from the experts, which subsequently develops into emotions. However, experts’ emotions are not expressed explicitly but are communicated indirectly through different Emotion Display Rules for different examinees. Building on this research, the study addresses two questions: what kinds of emotions are produced in the evaluation process, and how do they affect expert evaluation scores?
A review of the literature shows that most studies focus on the effect of teachers’ emotions on their teaching performance and teaching effects. So far, no research has been found on the effect of expert emotions on evaluation scores. According to Appraisal theories, the process of expert emotion evocation can be presented as “perception-appraisal (attribution)-emotion induction-emotion display-behavior”. Specific settings and incidents evoke various emotions. Individuals adjust and regulate their emotions according to standards recognized and accepted by other people or by society, and express their emotions through proper social rules, which leads to behavior in accordance with those rules. Using interviews, this research studies the effect of expert emotions on expert rating results, explores the particular way in which experts fulfill their evaluation duty, and reveals the psychological attributes behind the evaluation scores, with the aim of improving the application of expert rating in practical teaching management.
1.1. Interview Design
A review of the literature on the effect of emotions on decisions shows that no research has studied the occurrence of emotions and their effect on expert evaluation scores in teaching evaluations. Interviewees’ emotions have been divided into positive and negative emotions. Variance analysis shows that teachers’ qualifications moderate the relation between expert emotions and decisions. Drawing on the Emotion Elicitation theory and the Emotion Display Rules applied by the experts, this research studies the effect of teachers’ teaching abilities on expert emotions and the role of Emotion Display Rules in regulating expert emotions and scoring.
1.1.1. Selection of Interviewees and Interview Design
The interviewees are 6 expert raters with senior technical titles who are experienced in teaching ability assessment. The interview consists of six open questions: 1) What is the highest score you ever gave to a teacher and how did you feel after attending his/her lecture? 2) Was it possible that you would give him/her a higher or a lower score? 3) What is the lowest score you ever gave to a teacher and how did you feel after attending his/her lecture? 4) Was it possible that you would give him/her a lower or a higher score? 5) Do you still remember the classroom details given by teachers of average teaching ability and how did you feel about these teachers’ classroom performance? 6) What was brought to your mind when you were scoring these average teachers?
1.1.2. Categorization of Interview Content and Coding Collection
To distinguish positive emotions from negative emotions, the revised Chinese version of PANAS is employed. This scale has been verified as applicable to the measurement of Chinese people’s emotions. The raters’ emotions are classified by their properties, based on the 9 positive emotions and 9 negative emotions that the scale uses to describe emotions.
After the interviews, coding and grounded theory were used to code and summarize the interview content. Part of one interview and its coding is reported as follows: The highest score I ever gave is 95 points. I think this teacher gave an excellent performance and he/she showed great teaching ability, was well prepared and worked very hard (a14). I felt encouraged (a11) and I felt he/she was respectable and praiseworthy (a12, a13). In fact, I could have given him/her a higher score, but other people may not give the same high score as I did. So I thought 95 was high enough (c1) to separate him/her from the others. The lowest score I ever gave was 80 points. In fact, I thought it should be much lower (c3). But I thought a score that was too low would be very discouraging (a33) because it is not so easy to give a good performance after all (a32) and the difficulty of a course should also be taken into consideration (a34). If one prepares well, the results may not be so bad, with the exception of some irresponsible teachers (a31). But the score would not be too low (c3), which implies you are not doing well enough actually (b3). Most of the teachers possess average teaching ability, good enough generally speaking (a21). They are all serious and responsible but unfortunately without distinguishing characteristics (a22), for various reasons (a14, a34). It is very hard to tell who is better and who is worse, with external factors quite influential (b2) and high uncertainties (c2).
All the coding is summarized in Table 1 as follows.
2. Analysis of Coding of the Effect of Expert Emotions on Expert Rating Results
2.1. Different Teaching Abilities Induce Different Expert Emotions
Table 1. The Summary of the Coding.
Specific emotions consist of various emotion appraisal cognitions, and individuals with different features evoke different emotions. The summary of the coding shows that teachers of high teaching ability often stimulate positive emotions in experts, such as feeling greatly encouraged (a11), respect (a12), praise (a13) and internal attribution (a14). These emotions are categorized as “admiration”. Teachers of low teaching ability evoke various negative emotions in experts, but with explicit affect valence, mainly including criticism (a31), sympathy (a32), target orientation and acceptance (a33) and external attribution (a34). All of these are categorized as “peace”. Teachers of medium level induce positive emotions in experts, including praise (a13) and approval (a21), while both internal attribution (a14) and external attribution (a34) are employed. There are also slight negative emotions such as regret (a22). All of these are categorized as “recognition”.
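The coding scheme described in this section can be written down as a small lookup structure. The sketch below is only an illustration: the code labels and category memberships are taken from the text above, while the helper name `code_overlap` and the idea of counting overlaps are assumptions introduced here, not part of the original analysis.

```python
# Appraisal codes and their labels, as named in Section 2.1.
CODE_LABELS = {
    "a11": "greatly encouraged",
    "a12": "respect",
    "a13": "praise",
    "a14": "internal attribution",
    "a21": "approval",
    "a22": "regret",
    "a31": "criticism",
    "a32": "sympathy",
    "a33": "target orientation and acceptance",
    "a34": "external attribution",
}

# Emotion categories and the codes assigned to each, per Section 2.1.
CATEGORY_CODES = {
    "admiration": {"a11", "a12", "a13", "a14"},
    "recognition": {"a13", "a14", "a21", "a22", "a34"},
    "peace": {"a31", "a32", "a33", "a34"},
}


def code_overlap(segment_codes):
    """Count, per emotion category, how many of a segment's codes belong to it.

    Hypothetical helper: the study assigns categories by qualitative judgment,
    not by counting.
    """
    codes = set(segment_codes)
    unknown = codes - set(CODE_LABELS)
    if unknown:
        raise ValueError(f"unknown codes: {sorted(unknown)}")
    return {name: len(codes & members) for name, members in CATEGORY_CODES.items()}


if __name__ == "__main__":
    # Codes attached to the highest-score passage of the interview excerpt.
    print(code_overlap(["a14", "a11", "a12", "a13"]))
```

Because some codes (a13, a14, a34) appear in more than one category, the overlap counts only suggest the dominant category for a coded passage.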
2.2. Moderation of Emotion Display Rules on the Relation between Expert Emotions and Rating Results
Experts employed different emotion display rules with different evaluation objects. For “admiration”, experts displayed their emotions by “restraining positive emotions” (b1): they set a “maximum score” based on the actual evaluation, which resulted in a score lower than the one consistent with the teacher’s actual teaching ability (c1). For “peace”, experts expressed their emotions by “restraining negative emotions” (b3): they set a “minimum score” based on the actual evaluation, which resulted in a score higher than the one consistent with the teacher’s actual teaching ability (c3). For “recognition”, experts expressed their emotions by “uncertainty” (b2); that is, it is uncertain whether the score reflects the teacher’s actual teaching ability (c2).
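The three display rules can be illustrated with a toy model in which a rater caps and floors the score a teacher would otherwise merit. This is a minimal sketch under stated assumptions: the notion of a numeric "merited" score, the function `apply_display_rule`, and the default bounds of 95 and 80 (borrowed from the single interview excerpt quoted earlier) are illustrative choices, not quantities estimated in this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisplayedScore:
    score: float   # score the rater actually records
    rule: str      # emotion display rule applied (b1, b2 or b3)
    effect: str    # resulting effect on the score (c1, c2 or c3)


def apply_display_rule(merited, floor=80.0, ceiling=95.0):
    """Toy model: cap high scores and floor low scores.

    `merited`, `floor` and `ceiling` are hypothetical quantities; the default
    bounds are illustrative values, not findings.
    """
    if floor > ceiling:
        raise ValueError("floor must not exceed ceiling")
    if merited > ceiling:
        # b1: positive emotions are restrained, so the score is capped.
        return DisplayedScore(ceiling, "b1: restraining positive emotions",
                              "c1: lower than merited")
    if merited < floor:
        # b3: negative emotions are restrained, so the score is raised.
        return DisplayedScore(floor, "b3: restraining negative emotions",
                              "c3: higher than merited")
    # b2: in the middle range the score may or may not track ability.
    return DisplayedScore(merited, "b2: uncertainty",
                          "c2: uncertain reflection of ability")


if __name__ == "__main__":
    for merited in (98.0, 88.0, 70.0):
        print(merited, apply_display_rule(merited))
```

A hard cap and floor is the simplest possible reading of the "maximum score" and "minimum score" described above; a softer compression of scores near the bounds would be equally compatible with the interview data.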
2.3. Effect of Disparity between Teachers’ Abilities, Expert Emotions and Expert Emotion Display Rules on Rating Results
Different levels of teaching ability induce different expert emotions, while Emotion Display Rules regulate the relation between expert emotions and the evaluation scores. We summarize this relation as “ability disparity-emotion-emotion display rules-score disparity”. In the high score group, with the emotion display rule “restraining positive emotions” (b1), it is very hard for a teacher to earn a higher score, so a one-point difference in this group corresponds to a greater disparity in teaching ability. In the low score group, with the emotion display rule “restraining negative emotions” (b3), it is easier for a teacher to earn a higher score, so one point in this group represents less actual ability disparity than in the high score group. In the medium score group, with the emotion display rule “uncertainty” (b2), a one-point difference between two teachers carries greater uncertainty about their teaching abilities. As a result, the division into three score groups represents the actual ability disparity between individuals, whereas the exact score of a teacher is affected by expert emotions and emotion display rules.
From the coding and analysis of the interviews on the effect of expert emotions on expert rating results, the following conclusions can be drawn:
3.1. Experts Display Specific Emotions in the Evaluation Process
As the research shows, the induction of expert emotions relies on particular objects. In the evaluation process, experts develop three different emotions, “admiration”, “recognition” and “peace”, in response to different levels of teaching ability. Each emotion involves 4 types of emotion evaluation cognitions, which shows that the specific expert emotion relates not only to experts’ personal preference and external factors but also to individual teaching ability. Under the influence of Chinese culture, experts apply different emotion display rules in the evaluation process. Raters often adopt three emotion display rules: “restraining positive emotions”, “uncertainty” and “restraining negative emotions”. Consequently, experts generally display positive emotions and are cautious in displaying negative emotions.
3.2. Expert Emotions Affect the Meanings of Different Score Groups
The impact of expert emotions on the evaluation scores varies with the difference between teachers’ teaching abilities. Compared with the low score group, a one-point difference in the high score group indicates a greater disparity between teaching abilities. In other words, a smaller gain in teaching ability is needed for a one-point increase in the low score group, whereas a greater gain is needed for a one-point increase in the high score group. At the same time, it is easier to move from the low-score group into the medium-score group than from the medium-score group into the high-score group. In the interviews conducted in this research, experts adopted different emotion display rules in grading teachers of different teaching abilities. As a result, the scores of most teachers fall under 90 points, while experts also avoid giving scores that are too low. The scores of most teachers therefore fall within a certain score group, where a one-point difference represents a different degree of ability disparity, which experts use to show their implicit judgment of individual teaching ability.
3.3. Rating Results Reflect Different Teaching Abilities and Expert Emotions
According to Surveying Principles, under the hypothesis that experts’ personal preference has no influence on the evaluation results, the higher the original score, the greater the teaching ability shown by the teacher being evaluated. This research shows that expert emotions have an impact on the rating results and that the effect varies for teachers of different teaching abilities. Expert emotions are nevertheless generally positive. Both positive and negative emotions are restrained, as exemplified by the maximum score and minimum score that experts set unconsciously. That is why there are few very high or very low scores, and most scores fall into a certain score group with very little difference between them. The scores thus reflect teachers’ teaching abilities as affected by expert emotions and their emotion display rules.
4. Implications for Teaching Management
Based on these conclusions, the following suggestions are put forward for the application of evaluation results in teaching management:
4.1. Verification of the Validity of Expert Rating
This research shows that expert emotions induced in the evaluation process affect evaluation scores, which therefore represent not only teachers’ abilities but also the influence of expert emotions. If expert rating is to be taken as a reference for praise or punishment decisions in teaching management, the validity of the rating should first be verified. Item Response theory, popularized in Educational Testing, can be borrowed to adjust for or even remove the impact of expert emotions on the rating results.
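One member of the item response family that models rater effects explicitly is the many-facet Rasch model, the model named in the title of Sheng and Yu (2015). Its standard rating-scale form is given below as an illustration only. This study does not fit such a model, and the assignment of facets to teachers, criteria and expert raters is an assumption made for this example.

```latex
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
```

Here $P_{nijk}$ is the probability that teacher $n$ receives category $k$ on criterion $i$ from rater $j$, $B_n$ is the teacher's ability, $D_i$ is the difficulty of the criterion, $C_j$ is the severity of the rater, and $F_k$ is the difficulty of the step from category $k-1$ to category $k$. Estimating $C_j$ separately from $B_n$ is what allows rater-specific tendencies, such as a reluctance to award very high or very low scores, to be distinguished from teaching ability.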
4.2. Train Experts for Rating Criteria
Evaluation experts are the main actors in conducting teaching evaluation. Experts can be trained in how to rate in order to improve the validity of grading. At present, most experts are selected from front-line teachers. They are well versed in their fields but have had no special training, and they lack knowledge of how to understand and use the rating criteria in actual scoring practice. Nor do they know how to weaken the influence of their own emotions on the rating results. It is therefore necessary to train experts in the rating criteria. The major measures may include: reminding experts of the emotions that different teachers may induce; analyzing the effect of emotions on evaluation scores; helping experts overcome the interference of uncertain factors in the expression of emotions; building a visual bank of simulated training tests; and strengthening training of and feedback from experts.
Two weaknesses of this research remain. First, only six subjects were interviewed, so the analysis of the coding lacks certainty and fails to cover all cases. Second, the small sample makes it impossible to carry out statistical analysis of the effect of emotions on expert rating results. Future research should cover more subjects and enlarge the sample size. Empirical studies using statistical methods should also examine the influence of expert emotions on expert rating results, to provide empirical evidence for the application of expert rating results in practical teaching management.
This research uses interviews, coding and Grounded Theory to analyze the influence of expert emotions on rating results. Analysis of the coding demonstrates that different teaching abilities evoke different emotions in the expert raters, and that emotion display rules regulate the relation between expert emotions and the rating results. Differences in teachers’ teaching abilities, expert emotions and emotion display rules all have an impact on the rating results. Based on this analysis, we conclude that the expert emotions evoked in the evaluation process are mostly positive, and that expert emotions affect the teaching abilities reflected by different score groups. The rating results reflect both teachers’ teaching abilities and the expert raters’ emotions. Thus, when the rating results are to be applied in teaching management, it is necessary to verify their validity and to train experts in how to rate.
This study is supported by Hubei Education Science Program “Evaluation and Promotion of College Teachers’ Classroom Teaching Ability” (2015GB040).
 Sheng, Y.Y. and Yu, Q.S. (2015) An Optimization Research on the Rating Scale of College Teachers’ Classroom Teaching Abilities Based on Many-Facet Rasch Model. Higher Education Exploration, 9, 83-89.
 Baron, R.A. (1993) Interviewers’ Moods and Evaluations of Job Applicants: The Role of Applicant Qualifications. Journal of Applied Social Psychology, 23, 253-271.
 Chen, C.C., Chen, H.W. and Lin, Y.Y. (2013) The Boundaries of Effects on the Relationship between Interviewer Moods and Hiring Recommendation. Applied Psychology, 62, 678-700.
 Watson, D., Clark, L.A. and Tellegen, A. (1988) Development and Validation of Brief Measures of Positive and Negative Affect: The PANAS Scales. Journal of Personality and Social Psychology, 54, 1063-1070.
 Posthuma, R.A., Morgeson, F.P. and Campion, M.A. (2002) Beyond Employment Interview Validity: A Comprehensive Narrative Review of Recent Research and Trends over Time. Personnel Psychology, 55, 1-81.
 Uggerslev, K.L. and Sulsky, L.M. (2008) Using Frame-of-Reference Training to Understand the Implications of Rater Idiosyncrasy for Rating Accuracy. Journal of Applied Psychology, 93, 711-719.