This project aims to teach students health knowledge in sun protection, elicit and increase students’ interest in sun protection, and, more specifically, influence their health behavior.
Risky behavior is known to be part of normal development in childhood and adolescence, for example as a way to test one’s own limits or to express affiliation with a peer group (Lohaus, 1993; Petersen, 2016).
Primary preventive programmes start here and try to change health behavior positively before damage to health has occurred. These programmes can be divided into three classes according to the target group: 1) mass media programmes, 2) community-oriented programmes and 3) school programmes. Mass media programmes, such as “Slip! Slop! Slap!” (“Slip on a shirt, slop on sunscreen and slap on a hat”; Cancer Council Victoria, 2018), which was launched in Australia in the 1980s and has since been expanded to include “seek shade and slide on sunglasses”, offer the advantage that a large population group can be reached. A variety of formats, such as leaflets, TV/video or radio spots, shirts, stickers, comic books or flyers, are used to raise awareness of the dangers, to impart health knowledge and to encourage as many people as possible to adopt appropriate protective behavior. Community-oriented programmes form the second class. They usually take place directly in the part of the living environment where the risk arises. Sun protection programmes, for example, take place on the beach or in the open air. The third class is school health promotion programmes. Health promotion at school offers the great advantage that very young people can be taught about health-promoting behavior long before long-term damage to health occurs (Eid & Schwenkmezger, 1997; Petersen, 2016).
As a part of health promotion, health education in school is a cross-curricular task and can be anchored in science education in elementary and secondary school. It covers many different topics, such as nutrition, drugs, or sun protection. Most school health programmes in Germany deal with nutrition and physical activity (Hintze, 2016; Kaiser & Albers, 2015) and fewer with sun protection, although between 2 and 3 million non-melanoma skin cancers and 132,000 melanoma skin cancers occur globally each year (World Health Organization, 2019).
However, the biggest challenge in health education is to influence health behavior, which raises the question of how behavior can be changed positively and sustainably. The basis of health behavior is health knowledge, i.e. knowledge about what is beneficial to health. Measures that work exclusively with deterrents are known to be less effective (Eid & Schwenkmezger, 1997). Instead, a creative approach to fostering students’ health literacy may be innovative and beneficial (Kaiser & Albers, 2015). Programmes lasting several hours have also been shown to be effective (Eid & Schwenkmezger, 1997). Because incorrect health behavior during early childhood can have severe consequences and forms behavioral routines, it is essential to start health education as early as possible and to continue it steadily.
2. Theoretical Framework
The study is based on two theoretical components: health education, using the example of sun protection, and humor studies.
2.1. Health Education—Sun Protection
High-risk health behavior is an increasing problem in our society and is observed especially often in childhood and adolescence (Lohaus, 1993; Robert Koch-Institut, 2018; World Health Organization, 2017). Since the basis for adult behavior and behavioral routines develops in early childhood, it is necessary to start health promotion as early as possible, for example in elementary school (Lohaus, 1993), and to continue it steadily.
School as an institution offers the opportunity to reach many young people. In school, health education as part of health promotion can be tailored to the target group. This means that teachers can use specific and motivating methods to foster students’ health literacy and to encourage them to adopt health-conscious behavior (Eid & Schwenkmezger, 1997).
Giest (2016) developed a plan for health education in (elementary) school which is well suited for use in this study due to its ease of implementation and verifiability. The basis of that plan for health-conscious behavior is to teach students health knowledge.
Creative concepts may be a useful and innovative option for this (Kaiser & Albers, 2015).
2.2. Humor in School
One promising creative method is to integrate humor into learning materials. Students and teachers attach great importance to humor in the classroom (Dickhäuser, 2015). The use of humor in the classroom is associated with a variety of positive consequences: humor can reduce stress and anxiety, evoke positive feelings, increase attention, motivation, and interest, and positively influence learning performance (Markiewicz, 1972, 1974; Powell & Andresen, 1985).
For this purpose, a special kind of humor called subject-specific humor (SSH) was developed so that humor can be integrated into the classroom in a plannable way (Dickhäuser, 2015; Petersen, 2016). Subject-specific humor rests on two theories of humor, pedagogical humor (Kassner, 2002) and incongruity humor (Koestler, 1964). SSH emphasizes the cognitive component of humor and consists of two reference systems (RS) linked incongruently: 1) RS I, a content of the curriculum, and 2) RS II, a common situation which is related to the content of RS I (Figure 1). It serves as the basis for cartoons from the fields of health or science.
These cartoons can be used in many different ways for teaching. For example, they can be used at the beginning of a lesson to arouse interest in a topic or to create a cognitive conflict. They can also be inserted at the end of a lesson to reflect on a topic.
Another possibility is to combine the cartoons with accompanying texts. In this way, seven different cartoons on several aspects of sun protection (e.g. Figure 2) were combined with texts to create self-learning material.
Figure 1. Concept of subject-specific humor.
Figure 2. Cartoon with subject-specific humor.
3. Design and Methods
For our study, we used these picture-text self-learning materials on the topic of sun protection. The evaluation centered on the development of learning gain. The study focused on answering the following research question:
RQ: Does self-learning material on sun protection for fourth and sixth grade differ in effectiveness depending on whether it includes subject-specific humor?
In a first step, self-learning material and test instruments were developed on the basis of Petersen (2016). Subsequently, with the help of experts in elementary school teaching (N = 2), it was decided which of the seven newly created materials were suitable for use in fourth grade and which, in addition to these, could be used in sixth grade. Afterwards, the material was optimized for use in grade four in terms of content, wording, and form together with the experts.
A pilot study was conducted to check the material and the test instruments. It consisted of a quantitative and a qualitative part. In the quantitative pilot study, the material and test instruments were tested in two fourth and two sixth grades (N = 72, mean age 10.4 years, 48.6% female). It can be assumed that children at the age of 10 are capable of resolving incongruities, so we chose test persons who were on average about ten years old. It was reviewed whether the material and the test instruments were suitable for the intervention in the two grades. The study design intended for the main study, an intervention study in a pre-post-follow-up test design, was also tested here. Additional attention was paid to whether the supplementary self-learning materials were suitable for the sixth grade and whether, despite the additional materials, the test time could remain unchanged. An additional qualitative pilot study was carried out in a sixth grade of a Montessori secondary school. This served to optimize the content and form of the self-learning material, taking into account the students’ perspective. The students, who are taught according to Montessori pedagogy, are trained in the use of self-learning material. They can therefore be regarded as experts in working with self-learning materials and can provide information on the suitability of the material from the students’ point of view. The results of both pilot studies allowed further optimization of the self-learning material as well as the test instruments for the main study.
In the main study (N = 258, mean age 10.9 years, 46.9% female), the optimized self-learning material and test instruments were used in grades four and six. The self-learning material was administered in two versions: one that included subject-specific humor and one that did not. We answered the RQ by focusing on the differences between the experimental group and the control group. To this end, repeated-measures ANOVAs were calculated.
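The repeated-measures ANOVA named above can be illustrated with a minimal sketch. It assumes a one-way design (one group, several measurement points) and uses invented person abilities; the function name `rm_anova` and the demo data are hypothetical and are not the authors' analysis, which was presumably run in a statistics package.

```python
from statistics import mean


def rm_anova(scores):
    """One-way repeated-measures ANOVA.

    scores: list of per-student lists, one value per measurement point.
    Returns (F, df_effect, df_error).
    """
    n = len(scores)       # number of students
    k = len(scores[0])    # number of measurement points
    grand = mean(v for row in scores for v in row)
    time_means = [mean(row[j] for row in scores) for j in range(k)]
    subject_means = [mean(row) for row in scores]

    # Partition the total variation into time, subject and residual parts.
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_time = n * sum((m - grand) ** 2 for m in time_means)
    ss_subject = k * sum((m - grand) ** 2 for m in subject_means)
    ss_error = ss_total - ss_time - ss_subject

    df_time = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_time / df_time) / (ss_error / df_error)
    return f_value, df_time, df_error


# Hypothetical person abilities (logits) at pre-test, post-test and follow-up.
demo = [
    [-1.0, 0.5, 0.0],
    [-0.5, 0.8, 0.4],
    [-1.2, 0.1, -0.3],
    [-0.2, 1.0, 0.5],
    [-0.8, 0.6, 0.2],
]
f_value, df1, df2 = rm_anova(demo)
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

With three measurement points and 258 students, the same degrees-of-freedom formula gives 2 and (3 − 1) × (258 − 1) = 514, which matches the values reported for the total sample.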
3.1. Description of the Sample
Since it is likely from the point of view of humor research (Wicki, 2000) that children at the age of 10 are capable of resolving incongruities, students of this age were chosen as test persons. In total, data were collected from 352 students from seven fourth grades of elementary school and seven sixth grades of secondary school in North Rhine-Westphalia. Because of the four test times (Table A1 in Appendix), sample attrition was relatively high. After the data set was adjusted, 258 complete data sets remained, which are included in the following analyses.
Of these 258 students, 110 attended grade four and 148 grade six. Female students made up 46.9% of this sample and male students 53.1%. German was the mother tongue of 70.5% of the surveyed students, 18.6% had spoken German for more than six years, and only 10.9% had spoken German for less than six years.
The experimental/control group design tested in the quantitative pilot study was retained. The students were randomly assigned to the experimental group or control group. In order to avoid the formation of a mixed group, which would receive the experimental material in the first intervention and the control material in the second, the students were assigned to the experimental or control group either via the seating order or the class list.
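As a quick check of the sample figures above, the following sketch computes how many data sets were lost between data collection and analysis. All numbers come from the text; the variable names are illustrative only.

```python
collected = 352                    # students from whom data were collected
complete = 258                     # complete data sets used in the analyses
grade_four, grade_six = 110, 148   # split of the complete data sets by grade

# The two grades must add up to the analysed sample.
assert grade_four + grade_six == complete

lost = collected - complete
attrition_rate = lost / collected
print(f"{lost} incomplete data sets ({attrition_rate:.1%} of {collected})")
```

Roughly a quarter of the collected data sets were therefore not available for the analyses.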
The study included four points of measurement. The dependent variables of content knowledge, interest, and attitude towards health behavior were collected. In addition, some demographic data and a test on verbal skills (KFT V3) were administered as control variables at the time of the pre-test (Table A1 in Appendix).
Following each self-learning material, the appropriate test instrument for the sense of humor and humor appreciation was used for control. The pretest took place one week before the first intervention. The second intervention followed a week after the first. Post-tests were carried out directly at the end of both interventions. Since the distance between post-testing and follow-up testing was shown to be sufficient in the pilot study, follow-up testing was again conducted four to six weeks after the second intervention.
3.2. Test Instrument
The content knowledge test consists of a total of 23 items for both grades (four and six) and seven further items for grade six (Table 1). 19 of the 23 items corresponded to the test instrument of the pilot study, 15 were the same in all tests and are therefore the anchor items. Four items were used in one part of the sample at the pre- and post-test time in the old version. In another part of the sample, the optimized items marked N (new) were used. At the follow-up test time, the entire sample could be tested with the new items.
Table 1. Assignment of the items of the content knowledge test to the individual self-learning materials.
SLM = self-learning material.
In order to be able to include all items in the analysis, the evaluation was carried out based on the Rasch model (Adams & Wu, 2010). The Rasch model made it possible to place all the values collected for content knowledge on one scale and to use these values to calculate a person ability for each test person. In order to estimate the difficulty of each item, the values of all three test points were included in the calculation of a common model. From a statistical point of view, this makes it possible to compare the three measurement points with each other. It also allows a statement about the suitability of self-learning materials four and seven as additional material for sixth grade.
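How person ability and item difficulty share one logit scale can be sketched with the standard dichotomous Rasch model, in which the probability of a correct answer depends only on the difference between the two. The function name and the example values are hypothetical; the estimation of abilities and difficulties itself is not shown.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct answer in the dichotomous Rasch model.

    Both arguments are in logits; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical abilities for an item of average difficulty (0 logits).
for ability in (-1.0, 0.0, 1.0):
    p = rasch_probability(ability, difficulty=0.0)
    print(f"ability {ability:+.1f} logits -> P(correct) = {p:.2f}")
```

A student whose ability equals the item difficulty has a 50% chance of answering correctly, which is why 0 on the logit scale serves as a natural reference point for both items and persons.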
All items of the content knowledge test are multiple-choice items in single-select format. There is one attractor and two distractors each to keep the reading load low, especially for students in grade four. If one of the two distractors is ticked, the item is scored 0.
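The scoring rule can be sketched as follows. Scoring the attractor as 1 is the natural counterpart to the rule stated above and is assumed here; treating an unanswered item as missing is likewise an assumption of this sketch, and the answer key and responses are invented.

```python
def score_item(ticked, attractor):
    """Return 1 for the attractor, 0 for a distractor, None if nothing was ticked.

    Treating an unanswered item as missing (None) is an assumption of this
    sketch; the text only specifies the score for a ticked distractor.
    """
    if ticked is None:
        return None
    return 1 if ticked == attractor else 0


# Hypothetical answer key and one student's responses (options a, b, c).
answer_key = {"CK1": "b", "CK2": "a", "CK3": "c"}
responses = {"CK1": "b", "CK2": "c", "CK3": None}

scores = {item: score_item(responses[item], key) for item, key in answer_key.items()}
print(scores)
```

With three options per item, a student who guesses at random has a one-in-three chance of ticking the attractor.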
The content knowledge test was used at all measurement points. All items were arranged in random order in a test booklet and administered at each test time. During the pre-test, prior knowledge of sun protection was assessed by means of the content knowledge test. In the post-test, which was divided into two parts because of the two interventions, the newly acquired knowledge was assessed. Following each intervention, the content knowledge items matching the self-learning materials were administered. The follow-up test was intended to determine how much of the knowledge acquired through the interventions was still available after a few weeks.
Data from the Rasch model (Table A2 in Appendix) were used to determine whether the test quality criteria were met. The items used to calculate the persons’ abilities have an acceptable Infit MNSQ below 1.28 (Linacre & Wright, 1993) and a discrimination above 0.250 (Adams & Wu, 2002). They thus meet the specified values and are suitable for estimating the persons’ abilities. An exception is item CK7. Despite its poor item discrimination (0.1038), it was included in the analysis because it covers important content that is not addressed by any other item. The item reliability of 0.98 is very good and the person reliability of 0.64 is acceptable. A better person reliability would be desirable. However, it could not be achieved by excluding items or by forming three separate scales by subcategory (Table 1). It can be assumed that the reliability could be increased by constructing additional items or adding one distractor per item.
The Wright map (Figure A1 in Appendix) shows that the content knowledge items are evenly spread along the variable. Theoretically, logit values can range from minus infinity to plus infinity, but they usually lie between −3 and 3 (Linacre & Wright, 1993). There are a sufficient number of easy content knowledge items below 0 on the logit scale and also sufficiently difficult items above 0.
The additional items for sixth grade (items CK20 to CK26) are, as expected, all at or above 0 on the logit scale, with the exception of item CK20. These items are therefore more difficult, which confirms the decision to use them as supplements for the older students.
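The two quality thresholds can be applied mechanically, as in the following sketch. Only the two cut-offs and the discrimination of CK7 (0.1038) are taken from the text; every other number, including the Infit value shown for CK7, is hypothetical and stands in for the values in Table A2.

```python
INFIT_MAX = 1.28           # Infit MNSQ should lie below this value
DISCRIMINATION_MIN = 0.25  # item discrimination should lie above this value


def flag_items(items):
    """Return the names of items that miss at least one threshold."""
    return [
        name
        for name, (infit, discrimination) in items.items()
        if infit >= INFIT_MAX or discrimination <= DISCRIMINATION_MIN
    ]


# name: (Infit MNSQ, discrimination). Only CK7's discrimination is from the text;
# all other values are hypothetical placeholders.
items = {
    "CK6": (1.02, 0.41),
    "CK7": (1.10, 0.1038),
    "CK8": (0.95, 0.33),
}
print(flag_items(items))
```

A flagged item is a candidate for exclusion rather than an automatic one; as described above, CK7 was kept for content reasons.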
4. Results
The focus of the reported results will be on the development of content knowledge. First, the development of expertise in the entire group will be considered. Afterwards, the development of expertise in the experimental group and the control group as well as the fourth and sixth grade will be examined and compared.
To classify the results, effect sizes are reported. Values of η² are graded as follows (Bortz & Döring, 2002):
Small effect: η² ≥ 0.01
Medium effect: η² ≥ 0.06
Large effect: η² ≥ 0.14
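The grading above can be written as a small helper. The label "negligible" for values below 0.01 is an assumption of this sketch, since the text does not name that range.

```python
def classify_eta_squared(eta_squared):
    """Grade an eta-squared value using the thresholds listed above."""
    if eta_squared >= 0.14:
        return "large"
    if eta_squared >= 0.06:
        return "medium"
    if eta_squared >= 0.01:
        return "small"
    return "negligible"  # label assumed here; not named in the text


for value in (0.380, 0.053, 0.002):
    print(value, classify_eta_squared(value))
```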
4.1. Examination of the Entire Group
The first step was to analyze whether the material in general was suitable for increasing content knowledge. To this end, the development of content knowledge throughout the whole group was considered. Since the development of content knowledge is reported here over three measurement points (pre, post and follow-up) in one group, an analysis of variance (ANOVA) with repeated measurement is suitable for evaluating the individual values (Braunecker, 2016).
The analysis showed a significant learning gain with a large effect over all three measurement points (N = 258, F(2, 514) = 157.265, p < 0.001, η² = 0.380).
The mean values and standard deviations illustrate that the students have only low knowledge in the field of sun protection (Table 2).
Knowledge increases significantly (p < 0.001) after the intervention and is then just above the mean. From the post-test to the follow-up test, the students also forget a significant amount (p < 0.001). However, the expertise at the follow-up test time is still slightly above 0 on the logit scale and, in comparison to the pre-test time, highly significantly above the prior knowledge with a large effect.
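For readers who want to relate the F statistic to the effect size, the following sketch recovers η² from F and its degrees of freedom. It assumes that the reported η² is the partial eta squared that common statistics packages print for this design; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values reported for the total sample: F(2, 514) = 157.265.
eta = partial_eta_squared(157.265, 2, 514)
print(f"partial eta squared = {eta:.3f}")
```

Under this assumption, the values for the total sample reproduce the reported effect size of 0.380.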
Table 2. Mean values and standard deviations of content knowledge of the total sample.
CK = content knowledge.
4.2. Comparison of the Grades
The second step is to analyze the development of content knowledge over the three measurement points within a grade to find out if the age of the students has an influence on their learning with the material. Therefore, an ANOVA with repeated measurement was carried out.
The development of content knowledge in grade four (n = 110, F(2, 218) = 27.411, p < 0.001, η² = 0.418) shows a significant learning gain with a large effect. The mean values and standard deviations show that the students in grade four had less knowledge of sun protection before the intervention (Table 3). After the intervention, the content knowledge increases above logit 0 and then falls just below it at the follow-up test time. The comparison of the individual measurement times illustrates that the students in grade four learned significantly both from pre-test to post-test time (p < 0.001) and from pre-test to follow-up test time (p < 0.001). From the post-test time to the follow-up test time, there was a significant decline in content knowledge (p < 0.001).
The analysis for grade six over all three measurement points (n = 148, F(2, 294) = 21.999, p < 0.001, η² = 0.352) also shows a significant learning gain with a large effect. The mean values and standard deviations show that prior knowledge of sun protection can also be classified as rather low among sixth-grade students. After the intervention, they have a solid knowledge of sun protection (Table 3). The content knowledge increased significantly from the pre-test to the post-test (p < 0.001) and from the pre-test to the follow-up test (p < 0.001). A significant decrease in content knowledge (p < 0.001) was observed from post-test to follow-up test time.
A multivariate analysis of variance (MANOVA) was used to compare the respective test times between the grades, since it also provides post-hoc tests at the individual measurement points. The comparison of the two grades across the individual test times shows a significant difference with a small effect (V = 0.053, F(3, 254) = 4.703, p = 0.003, η² = 0.053). A closer look at the individual measurement points reveals that the students in grade four have significantly lower prior knowledge (pre-test time) than the students in grade six (p < 0.001), with a small effect (η² = 0.043). This difference is no longer found at either the post-test or the follow-up test time, which means that the expertise of the students of both grades after the intervention is at a comparable level.
Table 3. Mean values and standard deviations of 4th and 6th grade.
CK = content knowledge.
4.3. Comparison of Intervention Group and Control Group
Finally, it is interesting to see whether it makes a difference with which material the students learn, i.e., whether the experimental or control material has an advantage. For this purpose, the results of the experimental and control group are evaluated separately and also compared.
As in Section 4.2, the first step is to analyze the development of content knowledge over the three measurement points within a group by calculating an ANOVA.
The analysis in the experimental group over all three measurement points (n = 141, F(1.928, 269.922) = 94.015, p < 0.001, η² = 0.402) shows a significant learning gain with a large effect.
The examination of the mean values and standard deviations shows a low level of prior knowledge and an increase in content knowledge to a basic knowledge after the intervention, which is still in the sufficient range even after a few weeks (Table 4). There was also a significant increase in learning from pre-test to post-test (p < 0.001) and from pre-test to follow-up test (p < 0.001). However, the comparison of post- and follow-up measurement indicates a significant decrease in expertise (p < 0.001).
The evaluation of the content knowledge over all three measurement points in the control group (n = 117, F(2, 232) = 64.006, p < 0.001, η² = 0.356) also shows a significant learning gain with a large effect. The mean values and standard deviations confirm a level of knowledge similar to that of the experimental group (Table 4). The learning gain from pre-test to post-test (p < 0.001) is significant. Here, too, the comparison of post-test and follow-up test time (p < 0.001) shows a significant decrease in content knowledge.
There is no significant difference between the two groups, as statistically confirmed by a MANOVA (V = 0.001, F(3, 254) = 0.045, p = 0.987, η² = 0.002). None of the three measurement points shows significant differences or effects, either.
Table 4. Mean values and standard deviations of experimental group and control group.
CK = content knowledge.
The analyses show that the self-learning material is suitable for the retention of relevant content knowledge on sun protection.
It makes no difference whether the students learned with the experimental material or with the control material as the comparison of the experimental group’s and the control group’s development in learning gain showed. With regard to RQ, it must therefore be noted that no group differences can be measured.
The comparison of the two grades underlines the assumptions made before the test. Only the students in grade six had some small prior knowledge, whereas those in grade four had significantly lower prior knowledge. This small prior knowledge underlines again how important it is for students to obtain health knowledge, especially on the topic of sun protection, remembering that health knowledge is the basis for health-conscious behavior. It is interesting to note that the content knowledge of the students of both grades no longer differed significantly at the time of the post- and follow-up test. Students in grade four were thus able to reach the level of grade six and remain at this level.
6. Discussion and Outlook
In conclusion, the developed material is suitable for teaching students of grades four and six knowledge in the field of sun protection. Material with and without subject-specific humor is equally suitable for the intervention. Contrary to our assumption, learning with material that includes subject-specific humor does not lead to a greater and, above all, more lasting learning gain. As the results of the comparison of the grades showed, it would be desirable to develop further material for the older students to enable them to increase their knowledge. In addition, it should be noted that only material for sun protection and not for other health education topics has been tested.
Further studies should test supplementary material for younger or older students. In addition, material on other health education topics, such as nutrition, can be developed and evaluated. It would also be interesting to ask students whether the material with subject-specific humor is more attractive to them and whether they would therefore rather learn with this material than with material without humor. First results of further studies already indicate that students prefer to learn with humorous material rather than with the control material.
One way of further enhancing the measured positive effect of self-learning materials for sun protection is to extend the intervention to further intervention points or to develop and test training in subject-specific humor. This training could explain the concept of subject-specific humor by using examples. It can make it easier for the students to understand the cartoons with subject-specific humor so that it can be assumed that disinterest or cognitive conflict are not caused by a lack of understanding of the cartoons. A test instrument for understanding subject-specific humor should also be developed. This will test whether the students understand the subject-specific humor and whether the understanding of the cartoons has an influence on the learning success.
Since the concept of subject-specific humor is a relatively new method for school education in Germany, there are very few research projects dealing with the concept, its effects and benefits. Research in this area should therefore be continued and expanded, as this work has already shown that subject-specific humor is suitable for intervention in health education in school and supplementary studies show that students would like to see more humor in school.
Table A1. Plan of the study.
Table A2. Characteristics to scale content knowledge.
CK = content knowledge, N = new.
Figure A1. Wright-Map of content knowledge scale.
 Cancer Council Victoria (2018). Slip! Slop! Slap! Original SunSmart Campaign.
 Kaiser, A., & Albers, S. (2015). Empirical Effectiveness Testing of Teaching Units on the Content of “Nutrition” in Elementary School Subject Instruction. Widerstreit—Sachunterricht, No. 21, 1-9.
 Kassner, D. (2002). Humor in Teaching: Meaning Influence Effects: Can School Performance and Vocational Qualifications Be Improved by Pedagogical Humor? Baltmannsweiler: Schneider-Verl. Hohengehren.
 World Health Organization (2019). Ultraviolet Radiation (UV)—Skin Cancers.