What does moral judgment development look like in college students? What does it look like in students who hold important roles within the campus community, such as Resident Assistants (RAs)? In 1967,  proposed that residence halls were places that could facilitate learning outside of the classroom. Educating students outside of the classroom is typically an important mission of Housing and Residence Life departments. Part of making a residence hall a living-learning center falls on the shoulders of the RAs, and the challenges of being an RA are ever-changing. RAs are often the first persons in the residence-hall setting who become aware of concerns, issues, or dilemmas on the floor they supervise or in the building in which they work. These concerns and issues can range from students discussing roommate conflicts they are experiencing to students discussing their depression and desire to harm themselves   . How RAs respond to the myriad interactions they encounter can affect the dynamics and growth of the floor community  . Students who become RAs face moral decisions and are encouraged to be good ethical role models to students on their floor. Selecting current students to assume the RA role is one way that housing officers help meet the multitude of challenges within the residence halls  . If a training course for RAs can help them develop moral judgment, and thus face moral decisions effectively and serve as ethical role models to students on their floor, it can help lead the way to a stronger, more ethical community   .
Over the past 25 years, multiple studies have focused on moral judgment development in college students. Researchers  completed a large review of more than 2600 studies. Based on that review, they suggested there is notable evidence that moral development occurs during an individual’s college years. In 2002,  reviewed 172 studies completed between 1980 and 2001. Of these 172 studies, 170 reported a relationship between higher education and moral reasoning development, or differences in moral reasoning based on higher education level. In most, if not all, of these studies, the Defining Issues Test (DIT) was used to examine the relationship between participation in higher education and the development of moral reasoning. Rest and colleagues published the DIT based on Kohlberg’s stages of moral development  . The DIT is a multiple-choice instrument in which participants read a dilemma and rate 12 related items on a 5-point scale of importance; they then rank-order the rated items in terms of their importance to the dilemma. According to  , an average of 70 to 100 studies using the DIT or DIT-2 have been conducted annually over the past decade. The DIT-2, created in 1999, provides both a P score and an N2 score. The P score represents the Principled level of reasoning  . The N2 index, a continuous index score obtained from the ranking data, helps researchers assess whether participant responses are inauthentic.
Few studies have examined the moral judgment development of a specific college population, Resident Assistants (RAs). As residence-hall student staff members, RAs are a vital part of the university community. They are often the first to become aware of critical issues and ethical dilemmas that involve residents on their floors and in their buildings. The everyday dilemmas that RAs face    are among the more difficult situations for traditional college-aged students to handle. For RAs, interactions with students can range from discussions about roommate conflicts to discussions about depression and students’ desire to harm themselves   . How RAs think about and handle these issues and dilemmas is of great importance to the individual residents, the community in the residence halls, and the university as a whole.
RAs are encouraged to be good ethical role models to students on their floor. Even applying, let alone proceeding through the selection process and becoming an RA, requires a special type of student. According to  , RAs are employees of the institution who are likely “overworked and underpaid.” Finding and training the right individuals to fulfill the responsibilities of these positions is an annual task for university housing officers.
The mental health of college-age students is drastically different from what it was a few decades ago. Students are going to college counseling centers with more severe mental health issues   . For instance, the 2013 National College Health Assessment found that in the previous year 31% of college students reported being depressed and experienced difficulty functioning, 7.4% seriously considered suicide, and 1.5% attempted suicide  . In this context, how RAs respond to the myriad interactions and concerns they address can affect the dynamics and growth of the floor community. RAs are often the first persons in the residence-hall setting to become aware of these mental health concerns and issues, which call for professionalism and moral judgment.
Selecting and training appropriate students to assume the RA role is one way housing officers meet the multitude of challenges within the residence halls  . Housing department staff want to foster ethical development in residents and professionally prepare RAs to lead the way to a stronger, more ethical community   . Yet to date, few studies have examined the outcome of RA training as it relates to moral judgment.
Almost thirty years ago,  completed a study that examined the impact of RA training on the moral reasoning of university students. The study, with a quasi-experimental, nonequivalent group design, utilized the pool of students enrolled in RA training courses (those selected to proceed through the RA selection process) and students not enrolled (those not selected to proceed through the selection process). The researcher  utilized the DIT to perform a pretest and posttest on both the control group and experimental group. His study suggested a key finding: Male students completing a residence-hall staff training course (as part of the experimental group) experienced an increase in their mean P score, from 40.317 to 42.519. However, females who completed the residence-hall staff training course (as part of the experimental group) did not show any increase in moral reasoning skills; there was a decrease in their mean P score, from 45.534 to 45.123.
The purpose of the current study, taken from dissertation research  , was to replicate the McKelfresh study. More specifically, the researcher in the current study examined the impact of an RA training course on students’ moral reasoning development when compared to the moral reasoning development of students who did not complete the course.
In this study, the researcher examined whether participation by students in an RA staff training course had an impact on their moral judgment development when compared to the moral judgment development of students who did not participate in the course. The research questions addressed were as follows:
1) Did moral judgment development pretest scores for students participating in the RA selection course differ from the scores of students not enrolled in the RA selection course, as measured by the Defining Issues Test-2 (DIT-2)?
2) Upon their completion of the RA selection course, to what extent did students participating in the course differ from students not enrolled in the course in terms of their growth in moral judgment development skills, as measured by the DIT-2?
3) Did pretest moral judgment development scores of male students participating in the RA selection course differ from the pretest moral judgment development scores of female students enrolled in the RA selection course, as measured by the DIT-2?
4) Did students’ gender, class standing, and enrollment status in the RA selection course interact in their effect on moral judgment development scores (posttest scores for the experimental group and single-administration scores for the control group)?
2.1. Participants and Procedure
Using the DIT-2, the researcher collected data from all candidates who were proceeding through the RA selection course, first during the fall 2012 semester and then again at the conclusion of the course in the spring 2013 semester. The Department of Housing and Dining Services at the university provided the list of students who were participating in the selection process. The control group consisted of a randomly selected sample from a list of 1000 freshman, sophomore, junior, and senior students, which was provided by the Executive Director of Research and Assessment within the Division of Student Affairs at the university. These students were not enrolled in the course and had never been RAs. They completed the DIT-2 at the same time the experimental group completed the posttest.
The experimental group consisted of 43 students, nine male and 34 female, who were enrolled in the RA selection course and who had completed both the pretest and posttest via an email with a link to the Survey Monkey website. These students were all full-time students who were proceeding through the selection process to become RAs. They ranged in age from 17 years to 21 years, with an average age of 18.7 years. Of the 43 participants, 39 identified as Caucasian.
The control group consisted of 45 students, 15 male and 30 female, who responded to the email with the request to take the DIT-2 online using Survey Monkey during the middle of the spring 2013 semester. All of the control-group participants were full-time students who had never been RAs and were not enrolled in the RA selection course. Participant age for the control group ranged from 17 years to 44 years, with an average age of 21.1 years. Of the 45 participants, 38 identified as Caucasian. Table 1 displays the complete demographic frequencies and percentages by group.
Table 1. Demographic frequencies and percentages by group.
In an effort to replicate the  study yet use the more current instrument, the researcher used the DIT-2, which has been shown to be easier for the participants to complete and yet has the same level of reliability as the DIT  . A research team  validated the DIT-2 by administering both the first and second versions of the instrument to the same participants, thus “balancing the order of presentation” (p. 648). The samples included students ranging from ninth grade to professional school. The researchers determined that the DIT was “highly correlated with DIT-2 (r = 0.79), and the 11 stories of DIT plus DIT-2 show[ed] a very high degree of internal consistency (Cronbach’s alpha 0.90)” (  p. 657).
The DIT-2 comprises five scenarios, each followed by questions that respondents answer to produce an N2 score. Each scenario contains three main questions. The first question of each scenario asks what the person in the scenario should do and whether the respondent favors that person’s actions. This question uses a 3-point Likert scale with items ranging from one extreme on the left, such as “Should give Mrs. Bennett an increased dosage to make her die,” to the other extreme on the right, such as “Should not give her an increased dosage.” The second question asks the respondent to use a 5-point Likert scale to rate a series of 12 issues in terms of importance, with items ranging from Great on the left to No on the right. The final question asks the participant to rank the issues from the second question in order of importance. This question uses a ranking system of Most Important to Fourth Most Important, so not all of the issues from the second question can be selected. This process continues for all five scenarios  .
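The response format described above can be made concrete with a small data-structure sketch. The Python below only checks that one respondent's answers have the shape the text describes (five scenarios, a 3-point action choice, 12 ratings on a 5-point scale, and four ranked issues). All class, field, and function names are hypothetical, the numeric coding of the scale points is an assumption, and the sketch does not compute P or N2 scores or reproduce any official scoring procedure.

```python
from dataclasses import dataclass
from typing import List

# Counts taken from the description above; every name here is hypothetical.
ACTION_CHOICES = 3    # first question: 3-point action choice
ITEMS_PER_STORY = 12  # second question: 12 issues to rate
RATING_LEVELS = 5     # ratings coded 1 (Great) to 5 (No); coding is an assumption
RANKS_PER_STORY = 4   # third question: Most Important .. Fourth Most Important
STORIES = 5


@dataclass
class StoryResponse:
    action_choice: int     # 1..ACTION_CHOICES
    ratings: List[int]     # one rating per issue, each 1..RATING_LEVELS
    top_ranked: List[int]  # issue numbers (1..ITEMS_PER_STORY), most important first

    def problems(self) -> List[str]:
        """Return a list of format problems; an empty list means well-formed."""
        found = []
        if not 1 <= self.action_choice <= ACTION_CHOICES:
            found.append("action choice out of range")
        if len(self.ratings) != ITEMS_PER_STORY:
            found.append("expected one rating per issue")
        if any(not 1 <= r <= RATING_LEVELS for r in self.ratings):
            found.append("rating out of range")
        if len(self.top_ranked) != RANKS_PER_STORY:
            found.append("expected four ranked issues")
        if len(set(self.top_ranked)) != len(self.top_ranked):
            found.append("an issue was ranked more than once")
        if any(not 1 <= i <= ITEMS_PER_STORY for i in self.top_ranked):
            found.append("ranked issue number out of range")
        return found


def is_complete(responses: List[StoryResponse]) -> bool:
    """True when all five stories are present and each is well-formed."""
    return len(responses) == STORIES and all(not r.problems() for r in responses)


example = StoryResponse(action_choice=2, ratings=[3] * 12, top_ranked=[4, 7, 1, 12])
print(example.problems())
print(is_complete([example] * 5))
```

A check of this kind would run before any scoring, so that incomplete or malformed protocols are flagged rather than silently scored.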
2.3. Internal Validity
The researcher took several steps to address internal validity  . First, the control group was drawn from a randomly selected list of students, who then self-identified for participation. Members of the experimental group also self-identified based on the condition that all participants were proceeding through the RA selection process. Although this approach was not random selection in the purest experimental terms, the fact that the students elected to proceed through the selection process of their own accord and were not required to do so was, in practical terms, close to random selection. Second, to address mortality, the researcher selected 66% of the control group to mirror the gender makeup of the experimental group. The experimental group of the current study was slightly larger than the experimental group of the previous study by  , which consisted of 35 students. Third, to combat the potential diffusion of treatment in the current study, the experimental group did not receive the email invitation that the control group received. Also, given the size of the institution, the sample sizes involved, and the confidentiality features built into the survey process, the chances of participants knowingly interacting with each other were very low. Finally, the researcher addressed concerns related to repeated testing through the timing of the pretest and posttest. The delivery of the pretest and posttest included a gap of approximately three and a half months for the experimental group. The control group took the test only once (posttest), so testing was not a threat to internal validity for this group.
3. Analysis and Results
Table 2 shows the breakdown of the research questions, the independent and dependent variable(s) associated with each question, and the statistical analysis for each question in the study. Using both descriptive and inferential statistical methods, the researcher analyzed the data by testing the four research questions, which are discussed in the Results section.
To determine whether there were any statistically significant outcomes for the research questions, the researcher performed independent-samples t-tests, paired-sample t-tests, and univariate analyses of variance (ANOVAs). The results for each research question follow.
3.1. Research Question 1
For the first research question, Table 3 shows that the moral judgment development pretest N2 scores of students enrolled in the RA selection course (the experimental group) were significantly different from the scores of those students who were not enrolled in the RA selection course (p = 0.045). Inspection of the two group means indicated that the average pretest N2 score for students enrolled in the RA selection course (M = 40.48) was significantly higher than the score for students not enrolled in the RA selection course (M = 34.38). The difference between the means was 6.1 points on a 95-point scale. The effect size d was 0.44, which is slightly lower than a typical effect size in the behavioral sciences.
The N2 score for the pretest had significant results. There was a significant difference between the moral judgment development pretest N2 scores for students who completed the one-semester RA selection course and the scores of students not enrolled in the RA selection course.
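As a rough consistency check on the figures reported for Research Question 1, the sketch below backs out the pooled standard deviation implied by the reported means and effect size, then computes the corresponding equal-variance t statistic from the group sizes. The pooled SD and t value are derived here and are not reported in the study; the function names are illustrative, and the calculation assumes that d was computed as the mean difference divided by a pooled SD.

```python
import math


def pooled_sd_from_d(mean_a: float, mean_b: float, d: float) -> float:
    """Back out the pooled SD implied by a reported Cohen's d."""
    return (mean_a - mean_b) / d


def t_from_summary(mean_a: float, mean_b: float, pooled_sd: float,
                   n_a: int, n_b: int) -> float:
    """Independent-samples t statistic, assuming equal variances."""
    standard_error = pooled_sd * math.sqrt(1 / n_a + 1 / n_b)
    return (mean_a - mean_b) / standard_error


# Values reported for Research Question 1.
mean_enrolled, mean_not_enrolled = 40.48, 34.38
n_enrolled, n_not_enrolled = 43, 45
reported_d = 0.44

implied_sd = pooled_sd_from_d(mean_enrolled, mean_not_enrolled, reported_d)
t_stat = t_from_summary(mean_enrolled, mean_not_enrolled, implied_sd,
                        n_enrolled, n_not_enrolled)
df = n_enrolled + n_not_enrolled - 2

print(f"implied pooled SD = {implied_sd:.2f}")
print(f"t({df}) = {t_stat:.2f}")
```

Under these assumptions the implied pooled SD is close to 14 points and the t statistic is a little above 2 on 86 degrees of freedom, which is in line with a two-tailed p value near the 0.05 threshold, as reported.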
Table 2. Research questions, variables, and appropriate statistics for analysis.
Note. RQ = Research Question; IV = Independent Variable; DV = Dependent Variable.
Table 3. Comparison of the pretest scores of students enrolled in the RA selection course and students not enrolled in the RA selection course (N = 43 Enrolled and N = 45 Not Enrolled).
3.2. Research Question 2
Table 4 shows the moral judgment development N2 pretest and posttest scores of students enrolled in the RA selection course, and of those students not enrolled in the RA selection course. As a note, there was no difference in scores of the control group because that group took the DIT-2 only one time. That single score was used twice, in place of both the pretest and posttest scores. Significance was shown in both the N2 pretest and N2 posttest scores (p = 0.046 and p = 0.033). Effect size for the pretest N2 scores was d = 0.44, which is slightly less than a medium or typical effect. Effect size for the posttest N2 scores was d = 0.65, which is between a medium or typical and a large or larger-than-typical effect.
Table 4 also shows that the intervention had a positive impact on the experimental group compared to the control group with regard to the changes in N2 scores between the pretest and posttest. Specifically, the N2 posttest comparison was statistically significant (p = 0.033), with a medium-to-large effect size (d = 0.65).
The N2 score for the pretest and posttest had significant results. There was a significant difference in the amount of growth in moral judgment development as measured by the N2 scores for students who completed the one-semester RA selection course and the scores of students not enrolled in the RA selection course.
3.3. Research Question 3
Table 5 shows that the moral judgment development pretest N2 scores of male students enrolled in the RA selection course were not significantly different from the pretest N2 scores of the female students who were also enrolled in the course (p = 0.068). Inspection of the two group means indicated that the average pretest N2 score for male students enrolled in the RA selection course (M = 42.12) was not significantly higher than the average pretest N2 score for female students enrolled in the course (M = 40.04). The difference between the means was 2.08 points on a 95-point scale. The effect size d was 0.15, which is smaller than a typical effect size in the behavioral sciences.
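The verbal labels attached to the effect sizes in Research Questions 1 through 3 can be checked against Cohen's conventional benchmarks (0.2 small, 0.5 medium, 0.8 large). The sketch below is only an illustration of that mapping; the function name and label wording are assumptions, and the study itself describes effects relative to a "typical" size rather than with these exact terms.

```python
# Cohen's conventional benchmarks for d (small, medium, large).
BENCHMARKS = [(0.2, "small"), (0.5, "medium"), (0.8, "large")]


def describe_effect(d: float) -> str:
    """Place |d| relative to the conventional benchmarks."""
    size = abs(d)
    if size < BENCHMARKS[0][0]:
        return "below small"
    for (low, low_name), (high, high_name) in zip(BENCHMARKS, BENCHMARKS[1:]):
        if low <= size < high:
            return f"{low_name} to {high_name}"
    return "large or above"


# Effect sizes reported in the text for each comparison.
reported = [
    ("pretest, enrolled vs. not enrolled", 0.44),
    ("posttest, enrolled vs. not enrolled", 0.65),
    ("pretest, male vs. female enrolled", 0.15),
]
for label, d in reported:
    print(f"{label}: d = {d:.2f} -> {describe_effect(d)}")
```

Read this way, the pretest group difference falls just short of a medium effect, the posttest difference falls between medium and large, and the gender difference is below the small benchmark, which matches the descriptions given in the text.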
3.4. Research Question 4
Table 4. Comparison of the pretest and posttest scores of students enrolled in the RA selection course and students not enrolled in the RA selection course (N = 43 Enrolled and N = 45 Not Enrolled).
Table 5. Comparison of the pretest scores of male students enrolled in the RA selection course and female students enrolled in the RA selection course (N = 9 Male and N = 34 Female).
To assess whether gender, class standing, or a student’s enrollment in the RA selection course seemed to have an effect on an individual’s N2 score, the researcher conducted a univariate ANOVA. Table 6 shows the means and standard deviations for the N2 scores of the two genders for moral judgment development and for education level, and based on whether a student was proceeding through the RA selection course.
Table 7 shows there was no significant interaction between gender and education level on the N2 moral judgment development scores of study participants (p = 0.98). There also was no significant interaction between genders and whether or not a student was proceeding through the RA selection course in terms of the impact of those variables on the N2 scores for moral judgment development (p = 0.49).
Although the interaction between gender and whether or not a student was proceeding through the RA selection course was not statistically significant, the means plot in Figure 1 shows a visible interaction between the two variables, which the medium-to-large effect size (d = 0.65) also illustrates  . There also was no significant interaction between students’ educational level and whether or not they were proceeding through the RA selection course in terms of N2 moral judgment development scores (p = 0.43). Finally, there was no significant three-way interaction between gender, educational level, and whether or not students were proceeding through the RA selection course (p = 0.68). There was, however, a significant main effect of the intervention (whether or not a student was proceeding through the RA selection course) on moral judgment development scores, F(1, 73) = 3.97, p = 0.05.
Table 6. Means, standard deviations, and n for N2 moral judgment development scores as a function of gender, education level, and RA selection course.
Table 7. Analysis of variance for N2 moral judgment development scores as a function of gender, education level, and RA selection course status.
The results of the N2 scores relative to this research question had the same nonsignificant outcomes. For these study participants, there was no interaction between gender, class standing, RA selection-course enrollment, and moral judgment development scores as measured by the DIT-2.
Figure 1. Means plot of N2 scores.
As noted, the purpose of this study was to replicate a previous study  and to examine whether there was a difference in the moral judgment development of students who had enrolled in and completed a one-semester RA training class when compared to that of a similar group of students who did not participate in the RA training class. Although much research has focused on the general college student’s moral judgment development   , little research has been completed on the ethical behavior  or moral judgment development of RAs in particular.
The results of the study indicate two statistically significant outcomes, for Research Questions 1 and 2. First, there was a significant difference between the moral judgment development pretest N2 scores of students enrolled in the RA selection course for one semester and the scores of students not enrolled in the RA selection course. Second, there was a significant difference between the moral judgment development pretest and posttest scores of those students enrolled in the RA selection course for one semester and the scores of the students not enrolled in the RA selection course. These findings suggest that students enrolled in the RA selection course have a predisposition for a higher level of moral judgment than students not enrolled in the course. They also suggest that the RA course has a positive impact on the moral judgment development of the students who complete it compared to those who are not enrolled in it. With respect to Research Question 1, no prior research has specifically examined the normative moral judgment development scores of RAs. The findings for Research Question 3 conflict with the previous study’s  findings, which did show differences in gains between the genders. Results from the current study suggested no significant differences in gains between males and females who enrolled in and completed the RA training. Findings also did not show a significant difference between genders in moral judgment development posttest scores. Both of these findings, which show no statistically significant difference between genders as measured by the DIT and DIT-2, are in line with research showing that gender and an individual’s moral judgment development are not dependent on one another     . Finally, the outcomes for Research Question 4 indicated no significant interactions among gender, class standing, and RA selection course enrollment in terms of moral judgment scores, for both N2 and P scores.
These findings are supported by  and  , whose research showed no differences between the moral judgment development scores of freshmen and seniors.
This research had two limitations. First, the training as established by the course syllabus was specific to the research site. Currently, there are no general training guidelines or manuals for RA training to which all institutions subscribe. Second, the focus of this study was on the moral judgment development of students within the RA selection course at one institution. One aspect that makes higher education so special is the diversity of campuses across the country, and because of that diversity, findings from a single institution may not generalize to others.
5. Conclusions and Future Research
Leaders within higher education have been concerned with the moral development of students since colonial times  . The days of house moms, curfews, and in loco parentis have evolved into coed, suite-style living with thematic housing opportunities and well-trained student staff who are responsible for the day-to-day management of a floor of residential students. “Although emphasis on moral development in relation to intellectual development [has] fluctuated over the years, current leaders stress the importance of moral development in today’s colleges and universities” (  p. 97).
This study adds to the limited body of knowledge on RA training courses and provides some insight into how a training course can affect an individual’s moral judgment development. The results indicate a significant difference in the moral judgment development of students who were enrolled in and completed an RA training course when compared to those students who were not enrolled in an RA training course. This finding suggests that, at minimum, students who enroll in and complete the RA training may demonstrate a higher level of postconventional moral thinking.
There are multiple possibilities for future research. A similar study could be completed at other colleges and universities of varying size and Carnegie Classification. Researchers also might conduct a similar study utilizing a mixed-method approach to gain a better understanding of exactly how individuals’ moral judgment development increases. Listening to RAs’ stories and explanations about how their training prepared them to be ethical professionals could be of interest, add valuable insights into RA training, and suggest how to foster ethical reasoning and behavior in RAs. Another study could examine the impact of a universal syllabus for RA training that includes more than a 3-hour presentation and discussion of professional ethical issues as well as the importance of moral judgment to fulfilling the ethical obligations of the role. For instance, a recent study by researchers  suggests that having an ethics course could influence students’ beliefs about exhibiting ethical behavior. Gaining support for such a project may be a challenge, but it could be achieved by bringing together a group of professionals from across the country who are committed to addressing the issue of ethics to design the syllabus. Working through a national organization such as the Association of College and University Housing Officers-International (ACUHO-I) might be one way to get more buy-in for such a venture. Finally, RA training directors across the country could consider using the DIT-2 to assess the impact of training, which would facilitate the establishment of a norm N2 score. Having this N2 norm score would allow researchers and higher-education administrators to examine the moral judgment development of their RAs and to assess the impact of their training on these young professionals. Partnering with the Center for the Study of Ethical Development at the University of Alabama would be important to establish a norm N2 score for RAs across the country because the Center is the clearinghouse for all aspects of the DIT-2.
 Deluga, R. and Winters Jr., J. (1991) Why the Aggravation? Reasons Students Become Resident Assistants, Interpersonal Stress, and Job Satisfaction. Journal of College Student Development, 32, 546-552.
 King, P. and Mayhew, M. (2002) Moral Judgement Development in Higher Education: Insights from the Defining Issues Test. Journal of Moral Education, 31, 247-270.
 Bailey, C.D. (2011) Does the Defining Issues Test Measure Ethical Judgment Ability or Political Position? The Journal of Social Psychology, 151, 314-330.
 Benton, S.A. (2006) The Scope and Context of the Problem. In: Benton, S.A. and Benton, S.L., Eds., College Student Mental Health: Effective Services and Strategies across Campus, National Association of Student Personnel Administrators, Washington DC, 1-15.
 American College Health Association (ACHA) (2013) ACHA-NCHA II American College Health Association National College Health Assessment. Reference Group Executive Summary, Spring 2013.
 Author (2014) Revisiting the Impact of a Residence Hall Staff Training Class on the Moral Judgment Development of College Students. Doctoral Dissertation, Colorado State University, Fort Collins, Colorado.
 Rest, J., Narvaez, D., Thoma, S. and Bebeau, M. (1999) DIT2: Devising and Testing a Revised Instrument of Moral Judgement. Journal of Educational Psychology, 91, 644-659.
 Derryberry, W. and Thoma, S. (2005) Moral Judgment, Self-Understanding, and Moral Actions: The Role of Multiple Constructs. Merrill-Palmer Quarterly, 51, 67-92.
 Finger, W., Borduin, C. and Baumstark, K. (1992) Correlates of Moral Judgment Development in College Students. The Journal of Genetic Psychology, 153, 221-223.
 Chung, C., Bebeau, M., You, D. and Thoma, S. (2009) DIT-2: Moral Schema Norms—The Updates from Recent Data Sets. American Educational Research Association Annual Conference, San Diego, 30 March 2009.
 Thoma, S. (2009) The Link between Moral Judgment Development, Voting Behavior and Political Attitudes across US Presidential Elections from Carter to Obama. Association for Moral Education Annual Conference.
 Asim, M., Chambers, C., González, R.O., Morote, E.S. and Walter, R.J. (2015) A Study about the Academic Integrity of Second-Year Aviation Students in US Higher Education. Journal of College and Character, 16, 169-179.