The scoring system used in a game can have considerable influence on the satisfaction of players during gameplay. Scoring acts as a type of positive feedback and reward system capable of spurring players on toward greater challenges (Shneiderman, 1992). Game designers have traditionally tended to adopt quantitative scoring systems as a means of enhancing the enjoyment of participants in their gameplay. However, scoring systems are becoming increasingly diverse, and the attitudes of game players toward the systems employed can strongly influence the degree of satisfaction they feel toward the game as a whole. Clarifying the relationship between scoring systems and satisfaction in gameplay requires an understanding of the design aspects and functions of scoring systems.
Scoring often serves as a bridge between games and players, and thereby provides an indication of the degree to which game players are intent on achieving the objectives of the game (Schell, 2008). In other words, scoring is a way of measuring success (Rollings & Adams, 2003). Because scoring stimulates players to act, game designers frequently design scoring systems as a means of guiding players through the game (Adams & Dormans, 2012). Burgun (2012) and Bates (2004) indicated that games require scoring systems to increase the likelihood that a game will be played repeatedly. In this way, scoring could be seen as a means of prolonging the life of games. The methods used in the presentation of scores can be obvious or subtle. In interactive story games, scoring is not perceived by players, despite the fact that the total score ultimately determines the outcome of the game (Crawford, 2013).
Designers seeking to use a scoring system to connect players with games must possess a clear understanding of how the system can influence the satisfaction of the players. Such an understanding makes it possible to adjust the scoring system for optimal effects. Howard and Sheth (1969) determined that an outcome is considered acceptable only if it exceeds the value of the opportunity cost. Evans (1976) indicated that a decline in one’s sense of satisfaction reduces one’s willingness to use a system. Scoring systems are generally used as instruments of self-assessment and comparison, and can sometimes indirectly influence gameplay (Wang & Sun, 2011). According to Malone (1981), score-keeping is an important part of what makes playing a game a pleasurable experience. Therefore, it stands to reason that improving the design of scoring systems would lead to players feeling more satisfied with a game.
Increased diversity in scoring methods is making it increasingly difficult to identify the specific aspects of scoring that have the greatest impact on player satisfaction. Without the means to classify scoring systems, game designers are forced to implement scoring based largely on personal experience. This study was an attempt to categorize the scoring systems used in commercially distributed and relatively well-known games according to their functionality. Specifically, the process involved three steps:
1) Description of the functions of scoring systems.
2) Compilation of these functions into a questionnaire enabling the assessment of scoring systems by avid game players and game designers.
3) Conversion of the results using multidimensional scaling (MDS) to determine the potential psychological dimensions of the assessed scoring systems.
In the first step, analysis of scoring systems was conducted from the perspectives of game designers as well as game players. Experts in the field of gaming as well as individuals who frequent forums on game design were consulted to compile a database of descriptive characteristics related to scoring systems. In the second step, the resulting concepts were then used to produce a questionnaire, which was administered to game designers and avid game players with the aim of assessing the scoring systems in several representative games. In the final step, the design aspects of scoring systems were then categorized using MDS.
MDS is a dimension reduction technique that converts data on the distances (or dissimilarities) between pairs of individuals in a group into a configuration of the same individuals in space (a perceptual map). This is achieved while maintaining, as much as possible, the relative relationships within the original data (in the form of distance or dissimilarity matrices). Measures of proximity other than distance can also serve as inputs for MDS once they are converted into similarities or dissimilarities; a correlation coefficient matrix, for example, uses the degree of correlation as a representation of similarity. The benefit of using MDS lies in its capacity to convert high-dimensional data related to the scoring systems into a low-dimensional configuration in space (perceptual map). Reducing the number of dimensions makes it possible to represent core design aspects of the scoring systems. Compared to principal components or factor analysis, both of which can also reduce dimensionality, MDS usually yields a model with fewer dimensions, and its spatial configuration is generally easier to interpret. Furthermore, even when a linear relationship between distances and dissimilarities cannot be assumed, MDS nevertheless provides a simple dimensional model that is easy to grasp.
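For readers who prefer a formula, one common way to express how well a configuration preserves the original relationships is Kruskal's Stress-1. It is given here only as an illustration; the symbols are notation introduced for this sketch, and the software used in this study may optimize a different criterion.

```latex
% x_i       : coordinates of item i in the low-dimensional configuration X
% d_{ij}(X) : Euclidean distance between items i and j in that configuration
% \hat{d}_{ij} : disparity, i.e. the (possibly transformed) input dissimilarity
\[
  \text{Stress-1}(X) =
  \sqrt{\frac{\sum_{i<j} \bigl(d_{ij}(X) - \hat{d}_{ij}\bigr)^{2}}
             {\sum_{i<j} d_{ij}(X)^{2}}},
  \qquad
  d_{ij}(X) = \lVert x_i - x_j \rVert .
\]
```

A value close to zero indicates that the distances in the perceptual map reproduce the input dissimilarities closely.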
This study evaluated questionnaire data on 12 scoring systems. Collectively these systems comprise 20 different scoring functions. These functions were converted into dissimilarity matrices. The mean scores related to system function, as provided by game designers and avid game players, were used as original data, with each function represented as a dimension. Thus, the data are multidimensional. The researchers then used SPSS (IBM Corp., 2012) to derive correlation coefficient matrices for the scoring systems. Following conversion, the correlation coefficient matrices were input into MDS for processing. The researchers observed the elbow on the RSQ function (see Figure 3) and the stress scree plot (see Figure 4) to determine how many dimensions to use in interpreting the scoring system. The researchers named the dimensions, each of which represents a design function of a scoring system, after comparing the differences between their two most extreme points.
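The step from mean function ratings to a matrix comparing scoring systems can be sketched as follows. This is a minimal illustration rather than the SPSS procedure itself: the ratings are hypothetical, and converting a correlation r into a dissimilarity as 1 - r is an assumption made for the example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def dissimilarity_matrix(profiles):
    """profiles maps a system name to its list of mean function ratings.

    Returns (names, matrix) with matrix[i][j] = 1 - r(system_i, system_j).
    The 1 - r conversion is an assumption of this sketch.
    """
    names = list(profiles)
    matrix = [[0.0 if a == b else 1.0 - pearson(profiles[a], profiles[b])
               for b in names] for a in names]
    return names, matrix

# Hypothetical ratings for three systems over four functions (not the study's data).
profiles = {
    "Top score": [90.0, 80.0, 20.0, 10.0],
    "Timer": [85.0, 75.0, 25.0, 15.0],
    "Trade": [10.0, 20.0, 80.0, 90.0],
}
names, dissim = dissimilarity_matrix(profiles)
```

Systems with similar function profiles end up close to 0, while systems with opposite profiles approach 2.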
This categorization scheme provides valuable insight into the distribution of scoring systems among the various types of games. After identifying the trend of each axis, all scoring systems can be categorized by type, which makes it possible to identify the design elements of scoring systems for different types of games. The scoring design of highly satisfying games can also be examined for regularities, enabling an exploration of the relationship between the core aspects of scoring systems and gaming satisfaction. Once the regularities of scoring systems for different game types are understood, it becomes possible to test whether gaming satisfaction could be improved by adjusting scoring systems. Regularities identified in the design of scoring systems among the various types of games can be used to improve scoring systems or serve as a reference for designers aiming to deviate from current norms and develop new gaming systems.
This study adopted metric MDS to enhance the objectivity of our exploration of the design aspects of scoring systems. Our investigation was conducted in two stages: 1) identifying the intended functions of various scoring systems and 2) assessing these functions in the context of existing games using a questionnaire. The results were then analyzed with the aim of categorizing the issues that must be considered in the design of a scoring system.
2.1. Rationale of MDS
This study employed multidimensional scaling (MDS), a dimension reduction technique that describes complex data using a minimal number of dimensions. Beginning with a similarity matrix, MDS can be used to uncover the hidden configurations (in other words, the dimensions) within a group of data. A color study serves as an example of how similarity/dissimilarity matrices can be used to identify these configurations. A group of participants was asked to evaluate the similarity of 14 different types of spectral color (Ekman, 1954). Each color was labeled according to its wavelength in nanometers (W434, ..., W674). The participants were then asked to compare the “qualitative similarities” between pairs of colors. Table 1 shows a modified version of the similarity matrix provided by Ekman. It has been transformed into an SPSS data set suitable for use by ALSCAL. The matrix contains the averaged judgments of 31 participants regarding the similarities of 14 colors (wavelengths). The higher the score, the more similar these colors appeared to the observers. The matrix was then input into SPSS for dimension reduction using MDS.
Adapted from “Dimensions of color vision” by Ekman, 1954, Journal of Psychology, 38(2), 467-474.
The resulting scree plot (Figure 1) shows that the elbow falls at two dimensions, which means that spectral colors can be adequately described using two dimensions. The perceptual map (Figure 2) illustrates the distribution of the 14 colors among the two dimensions, with a configuration resembling a color wheel. The differences between the extreme points of the two dimensions show that in Dimension 1, colors approaching W610 are closer to cyan, and colors closer to W490 are similar to magenta. In Dimension 2, colors approaching W555 are closer to yellow, and colors closer to W434 resemble blue.
Scoring systems are extremely complex. The researchers must first objectively ascertain the attributes of scoring systems, and then have game designers or players rate how strongly each attribute applies to each system. The distance between scoring systems is used to build the dissimilarity matrix. Next, MDS is used to configure each scoring system using a minimal number of dimensions. Lastly, the researchers evaluate the meaningfulness of each dimension by comparing the difference between its two most extreme points.
2.2. Functions of Game Scoring Systems
Most scoring systems are designed with a number of functions in mind, and many scoring methods are subtle in their effects. This made it exceedingly difficult to identify a set of scoring functions directly from the literature or game manuals. After discussion with a professor of Digital Media at the Georgia Institute of Technology who has over 20 years of industry and academic experience in game design, the researchers selected 35 commercially distributed and relatively well-known classic games (Table 2) based on game types and studied the functions of their scoring systems. In order to avoid biasing the game selection towards any particular region, the researchers chose iconic games that are extremely popular in both Asia and the U.S. Despite slight differences between these two regions, their gaming interests are very similar. After the initial analysis, the researchers eliminated games with highly similar scoring systems and identified a list of 15 functions. The researchers then posted these findings on a well-known website dedicated to game development, GameDev.net, asking participants to comment on them and make suggestions. The four participants in the discussion comprised three game designers (with 5, 25, and 27 years of industry experience, respectively) and one game reviewer (who had posted over 7000 reviews at GameDev.net). Inputs from these participants were integrated into the final list of scoring functions. Following a final revision and confirmation from the professor, our analysis provided a total of 20 scoring functions, as shown in Table 3.
2.3. Rating of Scoring Systems
・ Participants in questionnaire survey
Our participants were avid game players or designers, who were required to have a certain level of existing knowledge about these games. All participants were required to have more than five years of experience playing games, and game designers were required to have at least one year of experience in game design. These criteria required us to employ purposive rather than random sampling. The researchers posted an invitation on the PTT Game Design board to recruit game designers and avid game players to fill out an online questionnaire survey (Table 4). PTT is a well-known bulletin board system (BBS) in Taiwan, with over 100 game designers participating in discussions on its Game Design board. To overcome the difficulties in recruiting willing participants, the researchers also invited some game designers from Softstar Entertainment Inc. (where the first author previously worked), and other game developers to take part in the survey. The researchers obtained data from a total of 34 participants, including eight avid game players and twenty-six game designers, eleven of whom had more than five years of experience and fifteen of whom had between one and five years of experience. All of the participants had more than five years of experience playing games.
Table 2. 35 classic games.
Table 3. Functions of scoring system.
2.4. Classification of Scoring Systems
・ Online questionnaire
The questionnaire shown in Table 4 was used for evaluating the 12 scoring systems. The questionnaire was compiled on the online questionnaire system, my Survey. Each item of this questionnaire is a five-point Likert scale on which the participant evaluated one of the scoring system functions listed in Table 3. The participants were asked to estimate the proportions of the 20 functions in the 12 scoring systems, ranging from “does not include” to “strongly includes”. Representative examples were provided to prevent participants from being confused with regard to the subtleties of various scoring systems. The researchers also added links to videos for a number of the example games to illustrate the form and function of the various scoring systems.
Table 4. The questionnaire used for rating.
・ Multidimensional scaling
This study employed the built-in MDS function (ALSCAL) of SPSS for the processing of the data in Table 5 into distance matrices showing dissimilarities among the various scoring systems. This made it possible to proceed with dimension reduction and the computation of perceptual maps.
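To make the dimension reduction step concrete, the following sketch embeds items from a distance matrix using classical (Torgerson) scaling with power iteration. This technique is swapped in for illustration only; it is not the ALSCAL algorithm used in this study, and the function name, parameters, and sample points are all hypothetical.

```python
import random
from math import dist, sqrt

def classical_mds(d, k, iterations=500, seed=0):
    """Embed n items in k dimensions from an n x n distance matrix.

    Classical scaling: double-centre the squared distances, then take the
    k leading eigenvectors (found by power iteration with deflation).
    Assumes the leading eigenvalues are positive, as for Euclidean input.
    """
    n = len(d)
    sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    def times_b(v):
        return [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]

    rng = random.Random(seed)
    coords = [[0.0] * k for _ in range(n)]
    for axis in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iterations):
            w = times_b(v)
            norm = sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to explain on this axis
                break
            v = [x / norm for x in w]
        norm_v = sqrt(sum(x * x for x in v))
        v = [x / norm_v for x in v]
        eigenvalue = sum(x * y for x, y in zip(v, times_b(v)))
        scale = sqrt(max(eigenvalue, 0.0))
        for i in range(n):
            coords[i][axis] = v[i] * scale
        # Deflate so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigenvalue * v[i] * v[j]
    return coords

# Hypothetical example: four corners of a 3 x 4 rectangle.
points = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (0.0, 4.0)]
distances = [[dist(p, q) for q in points] for p in points]
embedding = classical_mds(distances, 2)
```

The recovered coordinates may be rotated or reflected relative to the original points, but the pairwise distances are preserved, which is the property a perceptual map relies on.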
Table 5. Assessment results of game scoring system. A: Top score system, B: Health system, C: Evaluation system, D: Experience point system, E: Abilities system, F: Talents system, G: Resources system, H: Moral-calculus system, I: Trade system, J: Plot scoring system, K: Pong scoring system, L: Timer system.
3. Results and Analysis
3.1. Assessment Results of Game Scoring Systems
Table 5 presents the assessments made by the 34 participants with regard to the 20 scoring system functions and how they pertain to the twelve game scoring systems (A to L). Each cell contains the mean score awarded by the participants for one function associated with one scoring system. For example, when “Does not include” is selected, the Goal function of the Top score system earns 0 points. In contrast, the assessments of “Maybe includes”, “Slightly includes”, “Includes”, and “Strongly includes” receive 25, 50, 75, and 100 points, respectively. The total score awarded by participants for the Goal function of the Top score system was then divided by 34, resulting in 87.65, which represents the weight of the Goal function in the Top score system. The first column lists the functions of the scoring system, with the scoring systems listed across the top.
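The conversion from Likert responses to a cell value can be sketched as follows. The point mapping follows the description above; the function name and the four sample responses are hypothetical and are not the study's data.

```python
# Point values for the five response options, as described in the text.
LIKERT_POINTS = {
    "Does not include": 0,
    "Maybe includes": 25,
    "Slightly includes": 50,
    "Includes": 75,
    "Strongly includes": 100,
}

def function_weight(responses):
    """Mean point value of one function for one scoring system."""
    return sum(LIKERT_POINTS[r] for r in responses) / len(responses)

# Hypothetical responses from four participants.
responses = [
    "Strongly includes",
    "Includes",
    "Strongly includes",
    "Slightly includes",
]
weight = function_weight(responses)  # (100 + 75 + 100 + 50) / 4
```

In the study the same averaging was applied across all 34 participants for each of the 20 functions and 12 systems.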
3.2. MDS Data Analysis
・ Scree Plot
The MDS process requires that the user determine a reasonable number of dimensions for the perceptual maps, based on the patterns displayed in the scree plots. Figure 3 and Figure 4 present the scree plots based on RSQ and stress. As the stress scree plot does not show a clearly distinguishable elbow, the researchers used the RSQ plot to determine the number of axes. The researchers found that three axes could be used to explain approximately 90% of variance. Any benefit to be gained from using more dimensions would be overshadowed by the increased complexity in interpreting data. Therefore, the researchers employed three axes to explain the scoring systems in order to provide sufficient explanatory power.
Figure 3. RSQ scree plot.
Figure 4. Stress scree plot.
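The choice of dimensionality described above can be sketched in code. Here RSQ is taken to be the squared correlation between the input dissimilarities and the distances in the fitted configuration, and the 0.90 cut-off is an assumption that mirrors the "approximately 90%" figure; the per-dimension RSQ values are hypothetical and are not read from Figure 3.

```python
from math import dist

def rsq(dissim, coords):
    """Squared Pearson correlation between input dissimilarities and the
    distances reproduced by a configuration (upper triangle only)."""
    n = len(dissim)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    x = [dissim[i][j] for i, j in pairs]
    y = [dist(coords[i], coords[j]) for i, j in pairs]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov * cov / (var_x * var_y)

def smallest_adequate_dimension(rsq_by_dim, threshold=0.90):
    """Smallest number of dimensions whose RSQ reaches the threshold,
    falling back to the largest dimensionality that was tried."""
    for k in sorted(rsq_by_dim):
        if rsq_by_dim[k] >= threshold:
            return k
    return max(rsq_by_dim)

# Hypothetical RSQ values for solutions with one to five dimensions.
rsq_by_dim = {1: 0.55, 2: 0.78, 3: 0.91, 4: 0.94, 5: 0.96}
chosen = smallest_adequate_dimension(rsq_by_dim)
```

A fixed threshold is only a stand-in for visual inspection of the elbow, which remains a judgment call.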
・ Perceptual maps
In Figure 5 and Figure 6, the perceptual maps present the distributions of the scoring systems in three-dimensional space (Table 6). For the sake of convenience, the researchers deconstructed the maps into two-dimensional figures: first axis-second axis, first axis-third axis, and second axis-third axis (X, Y, and Z). The axes were designated according to their two most extreme points. A group of analysts, each with five or more years of experience in game development, was then assembled to propose names for the axes according to the differences between the two extreme points. Analyst 1 is the author of this paper and has experience developing multiple MMORPGs at Softstar Entertainment. Analyst 2 is a game designer at Interserv International Corporation, and has developed Internet community games as well as game apps. Analyst 3 is a game designer at IGS and has developed many types of arcade games. Following in-depth discussion, a consensus was reached. The two most extreme points on the first axis were the plot scoring system and the Pong scoring system, which differed most in perceivability. The two most extreme points on the second axis were the trade system and the timer system, which differed most in controllability. The two most extreme points on the third axis were the Top score system and the health system, which differed most in achievement. Thus, following a group discussion, the three axes were named Perceivability, Controllability, and Relation to Achievement.
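Finding the two most extreme systems on each axis is a simple lookup once coordinates are available. The sketch below uses hypothetical coordinates chosen only to mirror the pairs named above; they are not the values of Table 6, and the sign of each axis is arbitrary.

```python
def extreme_points(coords, axis):
    """Return the (lowest, highest) system names along one axis."""
    low = min(coords, key=lambda name: coords[name][axis])
    high = max(coords, key=lambda name: coords[name][axis])
    return low, high

# Hypothetical 3D coordinates for six of the twelve systems.
coords = {
    "Plot scoring": (-2.1, 0.3, 0.2),
    "Pong scoring": (1.9, -0.2, 0.1),
    "Trade": (0.2, 1.8, -0.3),
    "Timer": (0.4, -1.7, 0.5),
    "Top score": (0.9, -0.5, 1.6),
    "Health": (0.1, 0.2, -1.9),
}
extremes = [extreme_points(coords, axis) for axis in range(3)]
```

Naming an axis still requires human judgment about what most distinguishes the two systems at its ends, which is why a panel of analysts was used.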
Perceivability indicates the level of awareness players have of their scores. Controllability refers to the degree of control assigned to players with regard to the scores they receive. Finally, relation to achievement refers to the importance of the score to the players. Each type of scoring system provides specific means by which players can connect with the game. The three dimensions are analyzed in detail in the following Discussion section.
Table 6. The 3D coordinates of all scoring systems used in this study.
Figure 5. Perceptual map for Dimension 1 and 2.
Figure 6. Perceptual map for Dimension 1 and 3.
4. Discussion
This study used multidimensional scaling (MDS) to identify the following three aspects of scoring systems that should be considered in the design of games: perceivability, controllability, and relation to achievement.
4.1. Perceivability
Perceivability refers to the extent to which players are aware of the existence of the scoring system. This affects how immersed players become in the game and what gaming strategies they develop. Highly perceivable scoring systems are usually used in games that require decision-making strategy. Players make decisions that reduce or increase their scores in order to maneuver themselves into more advantageous positions. Games with strong story appeal, on the other hand, employ less perceivable scoring systems to prevent players from shifting their focus from the storyline.
・ Highly perceivable
A perceivable scoring system means that players can see and refer to their scores, which are usually displayed as numerical values on screen. Players can adjust their behavior based on their cumulative scores, which indicate their performance. Score-oriented games usually have highly visible scoring systems. In baseball games, for instance, the final victory is determined by the scores of each team. Players must understand their scores in order to devise offensive or defensive strategy. Puzzle games like Tetris (Pajitnov & Pokhilko, 1984) are also designed to encourage players to pursue higher scores. Current and personal best scores are displayed on-screen so that players can comprehend their position at a glance.
・ Barely perceivable
Games with this rating have scoring systems that are not easily visible to players. This approach is mainly used to prevent concern over scores interfering with the experience of the game. It is intended that players make decisions based on what is shown on screen, with each choice having a corresponding point value. Scores are then tallied in the background and the results used to determine the progression of the player through the game.
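A barely perceivable scoring system of this kind can be sketched as a tally that is never displayed and is consulted only when the game needs to branch. The class name, thresholds, and ending labels below are hypothetical and do not describe any particular game.

```python
class HiddenScore:
    """Tallies choice values in the background; nothing is shown to the player."""

    def __init__(self, thresholds, default):
        # thresholds: (minimum_total, outcome) pairs, checked from highest down.
        self._total = 0
        self._thresholds = sorted(thresholds, reverse=True)
        self._default = default

    def record_choice(self, points):
        """Add the point value attached to an on-screen choice."""
        self._total += points

    def outcome(self):
        """Pick the outcome for the highest threshold the tally has reached."""
        for minimum, name in self._thresholds:
            if self._total >= minimum:
                return name
        return self._default

# Hypothetical story with three possible endings.
score = HiddenScore([(5, "good ending"), (0, "neutral ending")], "bad ending")
for points in (3, -1, 4):
    score.record_choice(points)
ending = score.outcome()
```

Because the total is private to the game logic, players can only infer it from how the story responds to their choices.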
In the interactive narrative Heavy Rain (Sony, 2010), no scoring system is explicitly explained or blatantly obvious. Rather, the scores are tallied in the background according to the decisions made by players throughout the game. Even though the scores are not easily perceived by players, the total score plays a crucial role in determining the outcome of the story. Open world games, in which players have scope to engage in destructive behavior, often have an embedded, albeit invisible, ethics system to encourage players to take responsibility for their actions. The development of the player through the game is based on the ethics score, which is increased by constructive behavior and decreased by destructive behavior. Many role-playing and virtual romantic games have non-perceivable scoring systems based on intimacy and attraction. The game is designed to encourage the player to observe and react to changes in the behavior of the other virtual characters without being able to view the attraction score.
4.2. Controllability
Scoring systems can also be categorized according to the amount of control players can exercise with regard to their scores. Controllability affects the freedom of players to manipulate their own score and whether they can employ multiple game strategies. In highly controllable scoring systems, scores represent a quantity of resources that can be converted into other resources of an equivalent value. Scoring systems with low controllability usually have fixed feedback mechanisms; although there is limited scope for players to change or convert scores, the feedback indicates whether an objective has been achieved.
・ High controllability
Adams and Dormans (2012) listed four functions of economies: production, consumption, transfer, and conversion of resources. Scores can be used to indicate a quantity of economic resources or can be converted from one resource into another. In strategy games such as StarCraft (Blizzard, 1998), scores represent quantities of resources in the form of units. Players are able to combine resource units to produce other resources of an equivalent value. This concept is also applied in The Sims (Maxis, 2000), in which currency can be converted into furniture of equal value.
The skill points earned by players in some RPGs, such as World of Warcraft (Blizzard, 2004), can be freely distributed by players in order to influence the professional development of their character, which can have a significant influence on their capabilities later in the game. This method is also common in role-playing sports games such as the MLB 2K (2K, 2005) series. Following each game, the system rewards players for good performance by endowing them with skill points, which can be applied by the player to enhance the skills they seek. Bartering systems in games such as The Sims can also be considered a type of point distribution system, in which players determine the means by which to allocate their money in the purchase of virtual products.
・ Low controllability
Some aspects of scoring allow limited participant control. These are generally determined by the designers who implemented them, such that the players must passively accept these factors as a predetermined mechanism. For instance, players in a basketball game can only score between one and three points for each shot. In speed-based games, timers present fixed values that apply to all players, meaning that players are unable to manipulate time at will. In puzzle games such as Tetris, different numbers of tiles correspond to different point values. Players have less opportunity to control these scores as they are generated through predefined feedback mechanisms.
4.3. Relation to Achievement
Scoring systems can be divided according to their objective meaning to players. These objectives may be the goals of a game or the psychological objectives of the players. This dimension affects the lifespan of the game. A player who no longer feels challenged to achieve something in a game is significantly less willing to continue playing the game. The greater the level of achievement offered by the scoring system, the greater the level of challenge.
・ Highly correlated to achievement
Some scoring systems do not influence the progress of the game, but rather indicate the personal achievement of a player. For example, gaining a high score is not the primary goal in Tetris; however, many players attach greater importance to gaining a high score than to achieving the objectives of the game. The desire to obtain a higher score sometimes presents a challenge that players cannot resist. The rating system in Dance Dance Revolution (Konami, 1998) evaluates the dance moves made by players during the game. Each move corresponds to a certain number of points, and players can receive higher scores by adjusting their movements and the precision of their timing.
Many pay-per-use gaming systems use this type of scoring system to encourage players to play a game repeatedly in order to obtain a higher score. This concept is also implemented in gaming consoles to extend the lifespan of games that would otherwise be played only once or twice. Players spend more time trying to break their own records or the scores of others and thereby gain a sense of achievement. This characteristic of encouraging participants to play repeatedly can also be found in other types of scoring systems. One example is the skill points in RPGs, which encourage players to try out characters that feature different skills.
・ Barely correlated to achievement
Scoring systems that have a low relation to achievement are generally binary in nature, such as the health system in Street Fighter (Capcom, 1987), in which a player ends up either alive or dead. Achievement related to these scoring systems is not viewed as an objective target. For example, the resources in StarCraft can only be converted into other valuable resources. Accumulating these resources is necessary; however, the sense of achievement is experienced as a secondary benefit.
4.4. Three Dimensions in Scoring System Design
The results above are meant to clarify for game designers the aspects of scoring systems that should be considered in the design of games. The researchers also present the distribution of various types of scoring system in the most common gaming categories. For instance, the scoring system in RPGs is based mainly on controllability and relation to achievement, whereas interactive narrative games adopt a more subtle scoring approach that is less perceivable to players but exerts significant influence on the outcome of the game. Racing games use a system of timing, which presents low controllability but higher relation to achievement and perceivability. Clearly, scoring systems differ in their design aspects. The researchers believe that the weight of each of these design aspects within a scoring system influences the satisfaction of players. By optimizing the proportion of each of these aspects, designers can increase the value of a scoring system.
Scoring systems largely determine how long players stay in a game and represent the most obvious form of feedback with regard to the choices made by a player or the player’s performance. As such, the scoring mechanism has an essential impact on player satisfaction. Scores can be tangible or intangible, and exist in any form within a game. They can be presented in the form of numbers, text, or images or be entirely hidden from the players. The feedback provided by a scoring system also varies from game to game. Regardless, the purpose of scoring is to quantify the performance or status of players. If the scores that are awarded fall short of player expectations, players can feel disconnected from the game, which undermines player satisfaction. In contrast, when players are able to connect the scores they receive with the values that the scores represent, they are more likely to continue challenging themselves in the game. Unfortunately, game designers are often over-dependent on the scoring system, which frequently leads to an excessive number of scoring systems in a game, some of which are neglected by players even though designers still hope that they can prolong the duration of gameplay. In this case, it is even more crucial to consider what a scoring system means to players, whether players can connect with the game through the scoring system, and how satisfied players are with the scoring system. Excessive numbers of scoring systems can confuse players, such that the scores lose the meaning that designers had hoped to achieve. When game designers consider the relationship between scoring systems and player satisfaction, they generally focus on the balance among game parameters. Receiving rewards for their actions helps to increase the satisfaction felt by players; however, the researchers believe that the purpose and presentation of scoring systems also affect the gaming experience of players.
Game designers must understand the feelings of players towards scoring systems to enhance the feeling of connectedness with the game.
With regard to research limitations, the researchers were unable to employ random sampling (purposive sampling was used instead), due to the requirement that participants have a certain level of existing knowledge about the games. Also, the range of our study does not cover all types of scoring system used in contemporary games. It is difficult to include more than 15 types of stimuli in MDS. More stimuli require participants to evaluate more items, which affects the quality of research. Therefore, the researchers focused on globally popular, commercially marketed games when selecting scoring systems.
Our results provide a valuable reference for game developers in the design of scoring systems, allowing them to consider beforehand the experiences they want to convey via the scoring system. The results show that the various scoring systems are evenly distributed among different types of games, which means that scoring systems do not simply give feedback to players but also have their own substantive uniqueness in different types of games. As for the niches that various scoring system combinations may have in the functional implications of different types of games, this requires further investigation. Future researchers could delve into the distribution of these design aspects in the scoring systems of games that bring greater satisfaction in order to determine the correlation between player satisfaction and various scoring systems. This could also help to reveal patterns in the scoring systems used in particular types of games. The exploration of scoring systems from these aspects could further our understanding of how scoring systems influence players, whether they have significant interaction effects with game type, and whether certain game types can use scoring systems with certain dimensions.
We would like to thank Prof. Celia Pearce for her collaboration during preliminary investigations at Georgia Institute of Technology.