This study aims to test the validity of the Morpheme Congruency Hypothesis (Jiang, Novokshanova, Masuda, & Wang, 2011) and the Failed Functional Features Hypothesis (Hawkins & Chan, 1997). Briefly put, when learning a second language (L2), the ability to ultimately attain a morphological category depends on whether the category also exists and functions in a similar capacity within the first language (L1). For instance, if the L1 and L2 both encode gender or number at the morphological level, then learners of the L2 will likely be able to fully acquire the L2 morphemes (e.g., L1 French & L2 Spanish). In contrast, if the L1 lacks the L2 morphological category, then deficits in its acquisition are expected. A classic example is the number agreement errors produced by L1 Chinese and Japanese learners of L2 English (e.g., the plural -S & the 3rd person -S). According to this account, although both languages have plural morphemes (e.g., -men in Chinese & -tati in Japanese), neither has a robust plural marking system, and so ultimate attainment of the plural marking system in English should not be possible.
While there is evidence supporting this claim (Jiang, 2004, 2007; Jiang et al., 2011), other findings (Song, 2015; Wen, Miyao, Takeda, Chen, & Schwartz, 2010) cast doubt on it. Accordingly, the current investigation aims to determine to what degree native-speaking Japanese learners of English in Japan are sensitive to number agreement errors. It is our argument that Japanese learners of English have a greater degree of difficulty detecting violations in number agreement for the null morpheme in comparison to the overt plural -S morpheme.
1.1. Inflectional Morphology
The English language is classified as being mostly analytic, with little morphology in comparison to languages with richer inflectional morphology such as the agglutinative language Japanese. Although Japanese makes more regular use of morphosyntax, both English and Japanese encode nominal plurality using inflectional suffixes. However, English plural morphology is obligatory for regular nominal marking, while in Japanese the morphological suffixes -tati/ra are optional, are reserved to denote + HUMAN or + ANIMATE, and are more restrictive than English plural morphemes (Hosoi, 2005; Kurafuji, 2004). As seen in examples (1a) and (1b) below, English numerical inflections differ from Japanese in that the English null form denotes singularity, whereas the Japanese null form is unmarked for singularity and plurality.
(1a) English Numerical Inflections
singular = student -∅ = student
plural = student -s = students
(1b) Japanese Numerical Inflections
singular/plural = gakusei -∅ = gakusei
plural = gakusei -tati = gakuseitati
Not only do the languages differ regarding morpheme usage, they also differ in their rules for number agreement between feature-checking dependencies. In English, when a nominal expression is in the scope of a numerical quantifier, the two dependencies must agree in number. In Japanese, on the other hand, this is not exactly the case. Interestingly, when a noun is in the scope of a plural quantifier, the plural morpheme is typically considered to be not required, or even not allowed (Kurafuji, 2004). However, the two languages do share one specific property: neither allows a plural morpheme to be suffixed to a noun within the scope of a quantifier denoting singularity. In summary, not only are English and Japanese incongruent in their usage of plural morphology, they are also incongruent for feature-checking.
(2a) English Number Agreement
This(singular) student(singular) This(singular) *students(plural)
These(plural) students(plural) These(plural) *student(singular)
(2b) Japanese Number Agreement
Hitori-no‘one’(singular) gakusei‘student’(singular) Hitori-no(singular) *gakuseitati(plural)
Futari-no‘two’(plural)? gakuseitati‘students’(plural) Futari-no(plural) gakusei(plural)
Sono‘that’ gakusei‘students’(singular/plural) Sono gakuseitati(plural)
1.2. Second Language Acquisition Theory
For second language acquisition (SLA), while it is agreed that morpheme acquisition is an imposing task for L2 learners, there are two opposing arguments for L2 morphosyntactic acquisition. On one side, some researchers consider that L2 learners are limited in their inherent ability to acquire and process a second language, and as a result, ultimate attainment of L2 morphosyntax is unlikely. For instance, the Fundamental Difference Hypothesis (Bley-Vroman, 1990) states that while L1 acquisition is guided by Universal Grammar (UG), adult L2 acquisition is not, which imposes difficulties on L2 acquisition and leaves learners with an incomplete L2 grammar (Clahsen, Felser, Neubauer, Sato, & Silva, 2010; Meisel, 1991; Schachter, 1988).
Similarly, the Shallow Structure Hypothesis (Clahsen & Felser, 2006a, 2006b) states that L2 learners are limited in their capability to perform feature-checks between dependencies, retrieve and integrate a dependency within working memory, and process syntactic movement for non-local arguments. In other words, L2 learners have an impaired morphosyntactic representation and instead must rely on semantics to understand an L2 sentence. More specifically, both the Morpheme Congruency Hypothesis (Jiang et al., 2011) and the Failed Functional Features Hypothesis (Hawkins & Chan, 1997) predict that L2 learners are unable to acquire a morphosyntactic category not found in the L1 and thus would have an incomplete representation of said morpheme.
While the above arguments are based in differing frameworks, they agree that L2 learners have an impaired representation of morphosyntax which prevents learners from achieving native-like processing. As such, these arguments make the claim that errors and lack of sensitivity to ungrammatical forms result from issues of L2 linguistic competence rather than issues of real-time performance (Jiang, 2004).
In contrast to the impaired-representation accounts, others (Montrul & Slabakova, 2003; Schwartz & Sprouse, 1996; White & Genesee, 1996) instead argue that adult L2 learning is UG-mediated and that ultimate attainment of a novel morphological category is achievable. A major claim of this position is that production errors merely reflect deficits in performance and that an L2 learner has, more or less, an unimpaired representation of morphological features and their agreement rules on the abstract level.
Prévost and White (2000) posit that the errors elicited are often systematic in nature. For instance, while Prévost and White (1999) found that L2 learners of French and German overproduced non-finite verb forms in finite contexts, finite verb forms in non-finite contexts were almost absent. As such, their Missing Surface Inflection Hypothesis (Prévost & White, 2000) suggests that L2 learners have greater difficulties in the mapping of morphological features under task pressure (i.e., spontaneous production), thus causing L2 learners to rely on default forms. This in turn creates increased variability for non-default morphemes. Importantly, however, they argue that this increased variability does not necessarily reflect the abstract representation of morphological forms, and thus L2 learners should perform better during offline tasks. Additional support for performance deficits is derived from studies observing greater limitations in working memory for L2 learners (Keating, 2010; McDonald, 2006). Consequently, errors and insensitivity to grammatical violations might better represent learners’ performance limitations rather than their competence or lack thereof.
Moving on to the acquisition of morphological categories, previous studies (Bailey, Madden, & Krashen, 1974; Dulay & Burt, 1974a, 1974b; Goldschneider & DeKeyser, 2001; Krashen, 1977; Krashen, Sferlazza, Feldman, & Fathman, 1976) have outlined the sequential pattern of morphological categories for both child and adult L2 learners and revealed a similar pattern, which has been successfully applied to many languages. Importantly, the plural -S morpheme is regarded as being acquired relatively early in the developmental process. While Krashen (1977) claimed that this order is not subject to L1 interference, that claim is now regarded as inaccurate: the morphological acquisition order is influenced by a learner’s L1 and exposure to the target language (see Ellis, 2002; Larsen-Freeman, 1976).
In a similar vein, some researchers make the argument that for particular L1 groups, observed difficulties or benefits during acquisition can be attributed to L1 transfer or interlingual differences/similarities (Andersen, 1984; Bryant, 1984; Chen, Shu, Liu, Zhao, & Li, 2007; Luk & Shirai, 2009; Yip & Matthews, 2000). Due to the absence of a robust plural morphological system, L1 Chinese/Japanese speakers learning English appear to have additional difficulty acquiring the plural -S morpheme. English as a second language (ESL) learners can have difficulty correctly using plural -S even after years of living in an English-speaking country (Schmidt, 1983). Accordingly, Japanese learners of English might be incorrectly using their L1 grammar when processing L2 English. This is expected to make the acquisition of the English plural -S morpheme more difficult, but not unattainable.
Evidence of Japanese learners’ increased difficulty with acquiring the English plural morpheme comes from Tono (2000, 2009), who revealed that Japanese ESL learners’ distribution of errors suggests that their developmental stage of acquisition is delayed for the plural morpheme in comparison to other languages and morphological categories. However, it might be difficult to tease apart whether this delayed acquisition results from morphological incongruence, language transfer or both.
The distribution of errors revealed by Tono (2000, 2009), however, might have further relevance to the lack of sensitivity to violations and errors produced by L2 learners. Returning to the argument that learners’ errors are a result of variability and real-time mapping difficulties (Prévost & White, 2000), Hopp (2010, 2012, 2016) has demonstrated that lexical and grammatical variability can impair both L1 and L2 speakers’ predictive use of morphological agreement during processing. Using the visual-world paradigm of eye-tracking, Hopp (2016) observed that when violations of gender agreement were introduced in German, native speakers relied less on morphological gender cues to predict a noun, which mirrored how non-native speakers are typically observed to process L2 German. Accordingly, an additional argument can be made that L2 learners’ lack of sensitivity to violations in agreement might also partially result from their high exposure to L2 errors, which reinforces lexical and grammatical variability in the processing of an L2. Consequently, Tono’s (2000, 2009) finding would suggest that Japanese learners of English would have a greater amount of variability for the plural -S morpheme.
In the following section, several case studies in relation to agreement will be discussed. We highlight relevant studies on the processing of the plural -S morpheme in English by speakers from an incongruent language.
1.3. Previous Studies
Jiang et al. (2011) set out to replicate the previous claims (Clahsen & Felser, 2006a, 2006b; Clahsen et al., 2010; Hawkins & Liszka, 2003; Jiang, 2004, 2007) that there are limitations imposed on learners when acquiring and processing a new language. Specifically, they aimed to demonstrate that in order to acquire an L2 morpheme, the L1 must have a congruent morpheme. To validate this claim, they investigated the processing of the English plural morpheme by Japanese and Russian learners of English. Importantly, Russian has a congruent number morpheme while Japanese lacks a similar category. See below for an example of their stimuli.
(3a) Grammatical Plural Agreement
She picked a few of her dresses and left quickly.
(3b) Ungrammatical Plural Agreement
She picked a few of her *dress and left quickly.
Following the design of Jiang (2004, 2007) and using self-paced reading, Jiang et al. (2011) found that their Russian group demonstrated native-like sensitivity to the violation of (3b) while their Japanese group failed to reveal any difference. As such, they concluded that morphological incongruence restricts the acquisition of a novel L2 morphological category.
While Jiang and colleagues maintain that morphological incongruence is a roadblock to full acquisition, others instead argue that it merely hinders or provides no additional benefit to acquisition. Emphasizing this issue are two studies by Gillon-Dowens, Vergara, Barber, and Carreiras (2010) and Gillon-Dowens, Guo, Guo, Barber, and Carreiras (2011), who investigated the processing of L2 Spanish by advanced L1 English and Chinese speakers respectively. The importance of these two L1 groups is that while English is morphologically congruent with Spanish in the domain of numerical morphology, it is incongruent regarding gender agreement. Chinese, in contrast, is incongruent for both morphological aspects. Using electroencephalography (EEG) to measure the event-related potentials (ERPs) of these learners, they found that L1 English speakers had increased sensitivity to violations of number agreement in comparison to gender agreement, while L1 Chinese learners of Spanish had equal sensitivity to the two violation types. Consequently, while morphological incongruency did not impair sensitivity to violations of a new morphological category, congruency was shown to increase sensitivity where the languages agreed with one another. Thus, congruency can be considered a boon to acquisition rather than incongruency a hindrance.
Returning to the issue of morphological congruency for the plural -S morpheme in English, Wen et al. (2010) identified several possible issues with Jiang (2004, 2007), on which Jiang et al. (2011) was based. First, they noted that the method used to measure the participants’ English proficiency might have been problematic, such that Jiang’s participants might not actually have been advanced English speakers. Wen et al. (2010) speculated that the issue of number agreement sensitivity in English by Chinese and Japanese ESL learners might be limited to advanced learners of English. When classifying the participants’ English ability, Jiang used TOEFL scores and self-assessments of English knowledge to categorize participants into an advanced group. Wen et al. (2010) instead insisted that internal tests should be used to classify participants by their English proficiency because they are a more accurate representation of English knowledge than self-assessment. Proficiency is without a doubt an important factor when determining if second language learners will show sensitivity to violations in the L2 grammar; therefore, it is necessary to confirm the proficiency of the ESL learners. Sagarra and Herschensohn (2011) demonstrated the effects of proficiency among Spanish language learners coming from a non-gender language background. They found that intermediate learners were sensitive to violations of gender in Spanish, while beginners showed no such effect. Consequently, having accurate categories of group proficiency might have relevance to the investigation of morphological incongruence. Furthermore, Wen et al.’s (2010) claim would suggest that only advanced learners of English from a morphologically incongruent language can acquire the plural -S morpheme in English.
The second issue that Wen et al. (2010) note about Jiang (2004) is the lack of adjacency between the quantifier and target noun in the partitive structure. As such, they believed that the lack of significance observed in earlier studies might reflect the limitations in working memory of L2 speakers rather than a lack of grammatical sensitivity. The difficulties of long-distance dependencies for L2 speakers are well known. For example, Keating (2009), using eye-tracking, found that nonadjacent gender dependencies are not easily noticed by learners of Spanish, whereas adjacent gender disagreements result in inflated reading times. Moreover, as argued by the Shallow Structure Hypothesis (Clahsen & Felser, 2006a, 2006b), dependencies that are structurally distant cannot be fully processed by second language learners. In fact, even for native speakers, a nonlocal dependency is thought to incur a processing cost in relation to a more local dependency (Gibson, 2000). Accordingly, a reasonable argument would be that number agreement between long-distance dependencies might be especially difficult for L2 speakers.
Wen et al. (2010) addressed both of these issues in their self-paced reading study, which investigated Chinese and Japanese ESL learners’ sensitivity to the plural -S morpheme. Instead of relying on TOEFL scores and self-assessment ratings, Wen et al. classified their ESL participants into intermediate and advanced groups using a C-test. Moreover, the structural and linear distance between the noun and its quantifier was reduced, thus providing a better opportunity for ESL speakers to notice the violations. Rather than using partitive structures, Wen et al. (2010) relied on simple demonstrative phrases. In contrast to Jiang (2004, 2007) and Jiang et al. (2011), they also investigated conditions with an ungrammatical plural -S as in (4b).
(4a) Jill sold this beautiful house to her niece every evening.
(4b) *Jill sold this beautiful houses to her niece every evening.
(4c) Jill sold these beautiful houses to her niece every evening.
(4d) *Jill sold these beautiful house to her niece every evening.
Wen et al. (2010) found that their advanced Chinese and Japanese ESL learners had increased response times for the violations in number agreement for sentences (4b) and (4d) in comparison to their grammatical counterparts, i.e., native-like behavior. In contrast, their intermediate group was observed to be insensitive to the ungrammatical conditions. Their finding reveals that Chinese and Japanese speakers learning English can acquire plural morphemes in English on an implicit level, which directly challenges the claims of Jiang (2004, 2007) and Jiang et al. (2011). This would suggest that a Japanese ESL learner would show sensitivity to plural -S once they have acquired the morpheme rule. As a result, any error in spontaneous speech by these advanced learners might reflect a performance deficiency.
Song (2015) reinvestigated the issue by combining the designs of Jiang et al. (2011) and Wen et al. (2010). Using both partitive structures and demonstratives, Song tested advanced Korean ESL learners’ sensitivity to the plural -S morpheme in English. However, unlike Wen et al. (2010), Song did not test intermediate learners and did not use items containing an ungrammatical plural morpheme.
(6a) Simple Plural-Grammatical
Kevin memorized those long Latin words in just ten seconds.
(6b) Simple Singular-Ungrammatical
*Kevin memorized those long Latin word in just ten seconds.
(6c) Partitive Plural-Grammatical
Mary donated many of her books to the public library.
(6d) Partitive Singular-Ungrammatical
*Mary donated many of her book to the public library.
Also using self-paced reading, Song found that these Korean ESL learners were sensitive to violations in number agreement for both structure types, similar to the native English baseline. A key aspect of this study was that the items also contained non-adjacent dependencies between the quantifier and target noun. As such, Song’s study revealed that for learners whose L1 has an incongruent morpheme to the target L2 language, native-like processing is feasible even for nonlocal dependencies.
We, however, take issue with the above studies in that they either support or refute the Morpheme Congruency Hypothesis using participants they argue to be advanced ESL learners. In particular, Wen et al.’s (2010) and Song’s (2015) claim that intermediate ESL learners should not be sensitive to number morphemes in English might be problematic. Succinctly put, neither study’s results necessarily generalize to the broader population from which their participants were drawn. This is because the average learner would likely be a non-advanced English learner and would have had limited exposure to natural or native-like English. While it is important to validate whether ultimate attainment of an incongruent morpheme is feasible, and advanced ESL learners would surely provide the best opportunity to investigate this issue, we believe that investigating intermediate learners in Japan would have intrinsic benefits for understanding the process of acquisition for a morphologically incongruent category in the L2. This is because many Japanese learners are intermediate, especially among university students in Japan. Simply put, it is now known that despite morphological incongruence, learners can potentially acquire the new morpheme category, and that having a morphologically congruent category benefits the acquisition of the L2 morpheme. In other words, while morphological incongruence may impose greater limitations for a specific learner group, it does not necessarily prevent acquisition. However, previous studies are limited in that their results are not applicable to the English as a foreign language (EFL) context. Consequently, it is empirically important to retest this issue using an intermediate EFL group to determine whether advanced proficiency and native-like exposure are prerequisites for attaining an incongruent morpheme in the L2, which is the purpose of the current study.
1.4. Current Study
The current study reinvestigates the issue of morphological incongruence between L1 and L2 morphological categories, specifically, whether JEFL learners are sensitive to the null and plural number morphemes in English. While previous studies have relied heavily on the self-paced reading method, the current study instead utilizes a new and emerging method to capture online sentence processing, the Lexical Maze Task (Forster, 2010). In contrast with previous studies, the current study does not aim to investigate the processing by advanced ESL speakers and instead aims to reveal sensitivity to violations in number agreement by typical EFL speakers in Japan.
Fifty-three native-speaking Japanese learners of English were recruited from Nagoya University in Japan. However, one participant was eliminated from the analysis for not following the task procedure (N = 52, Female = 19). All participants were students of the university at the time of the study. All signed informed consent prior to the experiment and were compensated for their participation. Because most of the participants were first-year undergraduate students at the same university, the group was relatively homogeneous. Ages ranged from 18.3 to 26.3 years, with a mean of 19.4 years. The period of English learning ranged from approximately 6 to 15 years, with a mean of 7.7 years. Only 5 of the 52 participants had lived in an English-speaking country for longer than 1 month, and none of these stays exceeded 6 months. Given these averages, the participants are best described as Japanese learners of English as a foreign language (JEFL) rather than Japanese learners of English as a second language (JESL).
At national universities in Japan, the entrance examinations include a section on English. All participants recruited had passed this examination and were not currently enrolled in remedial English classes. It is important to note that despite the high value placed on English education in Japan, this is often limited to English education for the purpose of entering a university or job-hunting rather than for communicative proficiency. As such, we estimate that these participants at the very least had basic knowledge of the English language and were not likely to have high proficiency in English outside the scope of grammar and reading. Because most of the first-year students at Nagoya University had not taken the TOEIC or the TOEFL test prior to the experiment, these measures were not available and were therefore not recorded. Instead, we conducted a short online English test to gauge participants’ basic English lexical knowledge. It is important to note that this test only measured approximate English knowledge. The online test was composed of 25 questions, and each question was worth a single point. The mean score was 15.2 points, ranging from 8 points to 19 points. Because no participant scored higher than 19 points and all had limited exposure to native-like English, we estimate that the majority of the participants had intermediate L2 English ability.
In comparison with Jiang (2004, 2007), Jiang et al. (2011), Song (2015) and Wen et al. (2010), this study does not have a concrete advanced learner group and is not an ESL study. Instead, these learners might be closer to an approximate intermediate EFL learner group. As such, these learners are closer to the general population of JEFL learners in comparison to advanced JESL learners.
The experimental stimuli were created in a 2 (Grammaticality: Grammatical vs. Ungrammatical) x 2 (Number: Singular vs. Plural) design: this dog, these *dog, these dogs, this *dogs. Importantly, the items were not classified by the numerical quantifier (i.e., the demonstrative) preceding the noun such that all nouns following “this” would be denoted as singular or plural for “these”. Instead, the numerical value of the noun was determined by the presence or absence of the plural -S morpheme, and grammaticality was determined by whether the numerical value of the morpheme agreed with the demonstrative. See below for an example of the four conditions for a single stimulus item.
(7a) Grammatical Singular (Null)
The chef bought this apple for the pie.
(7b) Ungrammatical Singular (Null)
The chef bought these apple for the pie.
(7c) Grammatical Plural (-S)
The chef bought these apples for the pie.
(7d) Ungrammatical Plural (-S)
The chef bought this apples for the pie.
In total, 40 experimental items were created in a counter-balanced design such that each participant would only see one version of each item in their experimental session. Additionally, half of the items contained the target noun at the fifth region (i.e., word) of the sentence and the other half contained it at the second region. This was to ensure participants could not easily predict the locus of ungrammaticality. The length of the sentences was controlled by this designation. For items containing the target noun at the second region, six words were used. For sentences with the target at the fifth region, eight words were used. All target nouns had regular morphology. Target noun frequency and length were both controlled. The frequencies of the experimental target words were taken from The SUBTL Word Frequency Database (Brysbaert & New, 2009). For the lexical maze task, all nonwords were taken from the ARC Nonword Database (Rastle, Harrington, & Coltheart, 2002), and all nonwords matched the length of the target word. For plural target nouns, an additional -S was added to the nonword. Prior to the experiment proper, four practice items were given. All practice items were grammatical sentences.
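The counterbalancing described above can be illustrated with a short sketch. The source states only that the design was counter-balanced and that each participant saw one version of each item; the Latin-square rotation, the number of presentation lists (four), and the condition labels below are assumptions made for illustration, not a description of the authors' actual list construction.

```python
# Hypothetical labels for the four cells of the 2 x 2 design.
CONDITIONS = ("gram_null", "ungram_null", "gram_plural", "ungram_plural")


def build_presentation_lists(n_items=40, conditions=CONDITIONS):
    """Return one {item_index: condition} mapping per presentation list.

    Assumes a Latin-square rotation: item i appears in condition
    (i + list_index) mod 4, so every list contains each item exactly once
    and each item appears in every condition across the lists.
    """
    n_lists = len(conditions)
    return [
        {item: conditions[(item + list_index) % n_lists] for item in range(n_items)}
        for list_index in range(n_lists)
    ]


lists = build_presentation_lists()
# Count how many items the first list assigns to each condition.
first_list_counts = {c: list(lists[0].values()).count(c) for c in CONDITIONS}
print(first_list_counts)
```

Under these assumptions, each list would hold 10 items per condition, and a participant assigned to one list never sees two versions of the same item.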
The above items differ from Jiang et al. (2011) and Song (2015) such that no partitive constructions were used (i.e., many of the apples). While the current study utilized the strategy of using the demonstrative “this” and “these” to denote quantity, the current study differs from Song (2015) and Wen et al. (2010) such that the target noun was directly adjacent to the quantifier region. This was to ensure the best possible chance for the JEFL learners to detect the error in number agreement.
2.3. Apparatus and Procedure
Participants were seated in front of a 15.6-inch LED laptop screen (60 Hz) connected to a CHRONOS response box (Psychology Software Tools, Pittsburgh, PA). Stimuli were presented, and response times recorded, using E-PRIME 3.0 (Psychology Software Tools, Pittsburgh, PA). Text was displayed on a white screen in size 40 Courier New font.
Participants were instructed that they would be participating in a lexical maze task, an enhanced lexical decision task which also measures the incremental processing of a sentence (Forster, 2010). This task is thought to improve on the self-paced reading method as it allows for a more localized measure of processing and does not require the added use of comprehension questions. Thus, participants read a sentence via the successful completion of a series of lexical decisions. Two items were displayed on the screen at a time, one at the center-left and the other at the center-right (internally randomized). However, only one was a permissible English word, with the other being a nonword. Each word of the sentence was displayed in sequential order and each word was paired with a nonword; the first word of the sentence, however, was always paired with ++++ to indicate the start of a new trial item. To complete the task and form a sentence, the participant was required to correctly select each word of the trial by pressing the corresponding button on the button box. If they made a mistake at any point in the trial, the trial was immediately stopped, and the next trial began after an onscreen message which displayed this information. See Figure 1 for an example of the procedure.
Before starting the experiment proper, four practice items were provided to familiarize the participant with the experiment protocol. After the completion of the practice, participants were then given the opportunity to ask questions concerning the procedure. The experiment took approximately 15 to 20 minutes to complete. After the completion of the experiment, participants took a rest before starting the online English test, which took approximately 10 to 20 minutes to complete.
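The trial logic described above can be sketched as follows. This is a minimal illustration only, not the E-PRIME script used in the study: the function names, the frame representation, and the example nonwords are all hypothetical (the nonwords are invented here and are not drawn from the ARC Nonword Database).

```python
import random

START_FOIL = "++++"  # the source pairs the first word of each sentence with this marker


def build_frames(words, nonwords, rng):
    """Pair each word with a foil and choose the word's screen side at random.

    `nonwords` supplies one foil for every word after the first.
    """
    frames = []
    for position, word in enumerate(words):
        foil = START_FOIL if position == 0 else nonwords[position - 1]
        side = rng.choice(["left", "right"])
        frames.append({"word": word, "foil": foil, "correct_side": side})
    return frames


def run_trial(frames, responses):
    """Return (completed, words_selected); the trial aborts at the first wrong choice."""
    selected = 0
    for frame, response in zip(frames, responses):
        if response != frame["correct_side"]:
            return False, selected
        selected += 1
    return selected == len(frames), selected


# Example with the words of item (7c); the foils are invented placeholders,
# length-matched to each word, with -s added for the plural target.
words = "The chef bought these apples for the pie".split()
foils = ["plib", "drenth", "flomp", "grobes", "zub", "nuv", "kep"]
frames = build_frames(words, foils, random.Random(1))
all_correct = [frame["correct_side"] for frame in frames]
print(run_trial(frames, all_correct))
```

In this sketch a trial counts as completed only when every word is selected in order, which mirrors the rule that an error at any point ends the trial.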
Figure 1. Lexical maze task procedure.
2.4. Analytical Methods
The results were analyzed using linear mixed effect (LME) modelling (Baayen, Davidson, & Bates, 2008) within R (R Core Team, 2017). The lme4 package (Bates, Mächler, Bolker, & Walker, 2014) was used for the LME analysis, the lmerTest package (Kuznetsova, Brockhoff, & Christensen, 2017) was used to provide models with p-values using Satterthwaite’s approximation for the degrees of freedom, the package LMERConvenienceFunctions was used for the within-model data trimming, the package effects (Fox, Weisberg, Friendly, Hong, Andersen, Firth, & Taylor, 2018) was used for the plots, and the package emmeans (Lenth, Singmann, Love, Buerkner, & Herve, 2018) was used to calculate the estimated adjusted means of the LME model.
The fixed effects comprised the factors Grammaticality, Morpheme, Proficiency, Trial Order, Item Frequency and Item Length. The random effects comprised Subject and Item, with both random intercepts and slopes. Sum contrast coding was used to obtain the main effects for each factor. For Grammaticality, the condition Grammatical was coded as −0.5 and Ungrammatical was coded as 0.5. Similarly, for Morpheme, Null (i.e., singular) was coded as −0.5 and Plural (i.e., plural -S) was coded as 0.5. Proficiency, Trial Order, Item Frequency and Item Length were all treated as continuous predictors; each was transformed using the natural logarithm, centered and then standardized prior to the analysis. A Box-Cox analysis indicated that the natural logarithm transformation was ideal for the response times.
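The predictor coding described above can be illustrated with a small sketch. The analysis itself was run in R; the Python below only mirrors the arithmetic, and the function names, the use of the sample standard deviation (n − 1), and the example scores are assumptions for illustration.

```python
import math
import statistics

# Contrast codes as stated in the text.
GRAMMATICALITY = {"Grammatical": -0.5, "Ungrammatical": 0.5}
MORPHEME = {"Null": -0.5, "Plural": 0.5}


def code_trial(grammaticality, morpheme):
    """Return the numeric codes for one trial, including the interaction term."""
    g = GRAMMATICALITY[grammaticality]
    m = MORPHEME[morpheme]
    return {"Grammaticality": g, "Morpheme": m, "Interaction": g * m}


def log_center_standardize(values):
    """Natural-log transform, then center and scale to unit SD.

    Assumption: the sample standard deviation (n - 1) is used for scaling.
    """
    logged = [math.log(v) for v in values]
    mean = statistics.mean(logged)
    sd = statistics.stdev(logged)
    return [(x - mean) / sd for x in logged]


print(code_trial("Ungrammatical", "Plural"))
# Hypothetical proficiency scores, used only to show the transformation.
print(log_center_standardize([8, 12, 15, 19]))
```

With ±0.5 codes, each factor's coefficient can be read as the difference between its two levels, evaluated at the midpoint of the other coded predictor rather than at a reference level, which is why this coding yields main effects.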
Prior to the analysis, any item with an incorrect response prior to the target item was eliminated from the data set. Following this, all items with impossible response times for the target item were eliminated: below 200 ms and above 5000 ms. This collectively removed 142 data points (6.82%) from the set. Data outliers were trimmed based upon ±2.5 standard deviations of the estimates from the model. This eliminated 33 data points (1.70%) from the set.
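The two trimming stages described above can be sketched as follows. The source reports absolute cutoffs followed by model-based trimming at ±2.5 standard deviations; the sketch assumes that the second stage operates on standardized model residuals, and the function names and inputs are hypothetical rather than taken from the authors' R script.

```python
import statistics


def drop_impossible_rts(rts, low=200, high=5000):
    """Stage 1: remove response times below `low` ms or above `high` ms."""
    return [rt for rt in rts if low <= rt <= high]


def keep_within_criterion(observed, fitted, criterion=2.5):
    """Stage 2: return indices whose standardized residual is within +/- criterion.

    Assumption: residuals are observed minus model-fitted values, standardized
    with their own mean and sample standard deviation.
    """
    residuals = [o - f for o, f in zip(observed, fitted)]
    mean = statistics.mean(residuals)
    sd = statistics.stdev(residuals)
    return [i for i, r in enumerate(residuals) if abs((r - mean) / sd) <= criterion]


print(drop_impossible_rts([150, 200, 800, 5000, 5001]))
```

In practice the second stage would be applied to the residuals of the fitted mixed model, after which the model is refitted on the retained data.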
Considering that the participants performed near ceiling level and errors were distributed approximately equally among conditions, an analysis of item accuracy was not conducted. As such, only response times at the target noun were explored.
Table 1. Estimated means.
Note. Estimated means are generated from the LME model.
Table 2. Linear mixed effect model.
Figure 2. Grammaticality and Morpheme effect charts.
Figure 3. Grammaticality: Morpheme effect plot.
Figure 4. Grammaticality: Proficiency.z effect plot.
The results revealed that for the condition Grammaticality, ungrammatical target items (i.e., this *dogs & these *dog) had significantly longer response times in comparison to grammatical ones (i.e., this dog & these dogs), p = 0.0235. For the condition Morpheme, items containing the plural -S morpheme took longer to respond to than those with the singular null morpheme, p = 0.028. It is important to note, however, that this difference is unlikely to stem from the increased character length alone, as Length.z was included in the model to adjust the estimated means and was itself significant, p = 0.005. One possibility is that the overt plural morpheme entails additional processing work. This might include the morphological decomposition of the morpheme and the processing of the root word, or the triggering of a check on number agreement between the morpheme and the preceding quantifier.
Figure 5. Grammaticality: Morpheme: Proficiency.z effect plot.
The Frequency.z of the target word was found to significantly alter response times such that items with higher frequencies had shorter response times than those with low frequencies, p = 0.002. The fixed factor Trial.z was also shown to have significant differences in response time such that participants became faster in their responses as their experimental session continued, p = 0.004. The factor Proficiency.z also revealed significant differences among participants, p = 0.002.
Despite the simplicity of the online test, participants who answered more accurately were able to respond faster at the target word than those with lower scores.
Moving on to interaction effects, the interaction between the Grammaticality and Morpheme factors was not significant, p = 0.413. Thus, the grammaticality effect was comparable across the two morphemes, i.e., the difference between “this dog” and “these *dog” was similar to that between “these dogs” and “this *dogs”. Also, the interaction between Morpheme and Proficiency.z was not significant, p = 0.909. Therefore, regardless of English proficiency, participants had comparable behavioral reactions to the presence and absence of the English plural -S morpheme.
Importantly, the interaction between Grammaticality and Proficiency.z was significant, p = 0.024. Looking at Figure 4, response time differences within Grammaticality only manifested when participants had higher English proficiency. Accordingly, this interaction effect demonstrates that at some level, having a high English proficiency is crucial for the sensitivity to violations in number agreement.
However, this interpretation might be problematic as the three-way interaction of Grammaticality, Morpheme and Proficiency.z was also significant, p = 0.006, which revealed counterintuitive findings. Looking at the plot in Figure 5, for items with the plural -S suffixed, the above finding was still accurate. Specifically, participants with higher English proficiency were sensitive to the violation of “this *dogs”, and participants with lower English proficiency did not reveal substantial differences between conditions. In contrast, when looking at the null morpheme conditions, the opposite pattern was observed. For the items “this dog” and “these *dog”, participants with low English proficiency were sensitive to the ungrammatical usage of the null morpheme in “these *dog”. Yet, this sensitivity diminished as proficiency increased. Interestingly, in Jiang et al. (2011) and Song (2015), the only ungrammatical items used were those whose target noun ungrammaticality utilized the null morpheme.
This study set out to investigate the sensitivity to the English null and plural number morphemes among Japanese learners of English living in Japan. Specifically, we aimed to determine the representation of numerical inflectional morphology that JEFL learners have during online processing. The results indicated that JEFL learners are sensitive to violations of number agreement in feature-checking dependencies, supporting the findings of Song (2015) and Wen et al. (2010) and refuting the claims of the morphological congruency hypothesis (Jiang et al., 2011). Furthermore, the results demonstrated that having an advanced English proficiency is not necessarily a prerequisite for native-like performance. For both the ungrammatical singular (i.e., these *dog) and plural (i.e., this *dogs) items, the JEFL participants revealed increased response times in comparison to their grammatical counterparts. However, further investigation revealed that this effect was modulated by proficiency such that participants with higher proficiency were more sensitive to ungrammatical inflections. Yet, this was shown to be only true for ungrammatical plural nouns, and in contrast, participants with lower English proficiency had greater sensitivity to the ungrammatical null inflection. These points will be discussed in the following subsections.
3.1. The Processing of the Morphemes
The results of this study effectively demonstrated that JEFL learners showed sensitivity to the plural -S morpheme. The plural morphology, however, elicited a processing cost despite the response times being adjusted within the model, thus indicating that there is a behavioral response to plural morphology regardless of its grammaticality. This is similar to Jiang (2004), who also found that grammatical plural nouns can elicit increased response times using self-paced reading. As Jiang noted, while there is an obvious difference in length between a grammatical plural item and its ungrammatical singular counterpart, this effect might not originate from item length. While Wen et al. (2010) and Song (2015) both used residual response times factoring out word length (see Ferreira & Clifton, 1986), it does not appear that Jiang (2004) took such an approach. As such, this difference might not have been found if adjusted or residual response times had instead been used in Jiang’s study.
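For readers unfamiliar with residual response times, the general idea can be sketched as below: response times are regressed on word length, typically separately for each participant, and the residual (observed minus predicted) serves as the length-adjusted measure. This is a generic Python illustration with hypothetical data and function names; it is not the procedure of any cited study, and the present study instead entered Length.z as a covariate in the LME model.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def residual_rts(lengths, rts_ms):
    """Observed response time minus the time predicted from word length alone."""
    intercept, slope = fit_line(lengths, rts_ms)
    return [rt - (intercept + slope * n) for n, rt in zip(lengths, rts_ms)]


# Hypothetical word lengths and response times for a single participant.
lengths = [3, 4, 5, 6, 4, 3]
rts = [410.0, 455.0, 500.0, 560.0, 470.0, 400.0]
residuals = residual_rts(lengths, rts)
```

A positive residual marks a word that was read more slowly than its length alone would predict, which is why a plural cost that survives such an adjustment cannot be attributed to the extra character.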
Instead of discounting the differences found in length, we contend that these JEFL learners might have been undergoing the processing of morphological decomposition. Therefore, participants processed the morpheme separately from the root noun, i.e., two separate units were being processed. If these Japanese participants utilized a mechanism to process a plural morpheme separately and incrementally in English, an extra unit of morphology should take longer to process than a null value.
It is possible that at this stage of L2 acquisition, the learners could not take full advantage of morphological decomposition (Clahsen & Felser, 2006a, 2006b; Baayen, Dijkstra, & Schreuder, 1997), which resulted in additional processing costs for the inflectional morphology prior to the checking of number agreement. Portin and Laine (2001) and Portin, Lehtonen, and Laine (2007) argued that their Swedish-Finnish bilinguals had developed a reading strategy that caused inflectional morphology to be processed longer than a full-set reading. Portin and Laine (2001) proposed that other bilinguals might use a similar strategy. Because we contend that the effect of the plural morpheme does not originate from length alone, this would be a more favorable account of the inflectional morphology processing by JEFL learners in the current study.
For the processing of a null morpheme, L2 learners might have deficits detecting a null morpheme due to the limited perceptual salience of the phonologically absent morpheme. Accordingly, it should be challenging for JEFL learners to process it and assign a numerical value based on it (Goldschneider & DeKeyser, 2001; Jiang, 2007). As stated, in English the null form denotes singularity for regular nouns whereas it is undifferentiated between plurality and singularity in Japanese. Accordingly, it would not be unreasonable to assume that JEFL learners might be improperly applying L1 Japanese morphosyntactic rules for null morphology onto L2 English morphemes.
As a result, there might be two probable scenarios for their processing of null morphology. They either (1) process the null morpheme but fail to correctly assign it as singular due to L1 transfer or (2) fail to detect it. However, the latter possibility seems to be inaccurate because participants revealed sensitivity to an ungrammatical null morpheme (these *dog). Despite this sensitivity, the first possibility might nonetheless be partially accurate. Looking at Figure 3, the difference between the grammatical null condition and the ungrammatical null condition appears to be less than the difference between the grammatical null condition and the ungrammatical plural. A similar numerical difference was also found in Wen et al. (2010) for the same conditions. Considering that the intended numerical assignment for the ungrammatical plural would be a null morpheme as indicated by the determiner, it appears that detecting the violation in number agreement might be easier for JEFL learners when there is overt morphology.
Accounts of SLA theory that would agree with the above results are lexical variability (Hopp, 2016) and the missing surface inflection hypothesis (Prévost & White, 2000). Because learners might suffer from real-time processing demands, they might have ultimately over-relied on the default null form when moving from the quantifier to the ungrammatical target noun. In tandem, because these learners in general would be suspected to have greater lexical variability for plural nouns (i.e., these dogs & these *dog are relatively acceptable due to their prevalence in L2 speech), there would be less predictability for a plural target appearing after a plural quantifier. However, Prévost and White’s (2000) hypothesis would posit that these L2 learners are not suspected to have lexical variability for singular nouns (e.g., this dog). As such, there should be greater predictability for a nominal with null number marking.
In fact, the ungrammatical plural noun (e.g., this *dogs) having the longest response times agrees with aspects of the morphological congruency hypothesis (Jiang et al., 2011) as well. While the hypothesis is incorrect in arguing that ultimate attainment is not possible, it might find support in the observation that the congruent feature shared between the languages (i.e., a singular noun within the scope of a singular quantifier is not permitted to have a plural morpheme) evoked the greatest response among the conditions. This finding is congruent with the studies of Gillon-Dowens et al. (2010) and Gillon-Dowens et al. (2011), who found increased sensitivity among L1 English learners of L2 Spanish for number agreement in comparison to gender agreement and L1 Chinese learners of L2 Spanish. These collective results indicate that when an aspect of a morphological category is congruent, there should be enhanced sensitivity to its violation. However, one counterargument to this is that the plural morpheme is perceptually salient while the null morpheme is not. As such, the enhanced sensitivity to this violation may instead reflect the degree of salience.
In summary, regardless of the approach, it is possible that detecting violations in agreement between null and a quantifier is difficult for JEFL learners. However, this argument becomes problematic when considering the interaction of proficiency with the grammaticality of the null morpheme, which will be discussed below. In the following section, we will describe how the above findings relate to English proficiency and the English learning context.
3.2. ESL vs. EFL and Proficiency
Looking at Figure 4, it can be seen that participants with increased English ability not only had faster response times but ultimately revealed differences between Grammatical and Ungrammatical conditions. This finding would support the previous claims of Song (2015) and Wen et al. (2010) that sensitivity to an incongruent morpheme might be limited to those with higher L2 ability. While this is a reasonable and attractive argument, the argument begins to fail once the interaction between grammaticality, morpheme and proficiency is taken into account.
Interestingly, looking at Figure 5, the more proficient participants in this study did not reveal native-like behavior for ungrammatical null inflections denoting singularity. In fact, it seems participants with lower proficiency were more native-like in this regard. While we cannot offer a conclusive rationale for this, we do provide several possibilities.
One possibility is that as proficiency increased for JEFL learners, their lexical variability also increased. Lexical variability has been addressed as a crucial factor for both L1 and L2 processing (Hopp, 2016; Prévost & White, 2000). Succinctly put, as L2 speakers are exposed to more errors in feature-checking dependencies, their sensitivity to these errors is attenuated. This is explained as their lexicon accounting for the variability in the exposure to and production of ungrammatical morphemes. According to learners’ error corpora (Tono, 2000, 2009), Japanese learners of English produce a substantial number of null-inflection errors (e.g., these *dog). Accordingly, it may be a reasonable inference that learners more engrossed in the English learning process would both produce and be exposed to these errors at a higher rate than learners with low English proficiency. While this is speculative, it might be the case that JEFL learners lose their sensitivity to ungrammatical null morphology over their English learning process in Japan.
This line of reasoning also provides further insight into previous studies. In both Wen et al. (2010) and Song (2015), advanced ESL participants were recruited. Moreover, in Wen et al. (2010), the advanced group had native-like behavior while the intermediate group did not. Considering that advanced ESL learners would be more likely to have higher exposure to native English, the distribution of grammatical and ungrammatical input might readjust itself to a native-like setting, thus allowing the sensitivity to errors to be observed. However, to confirm this suspicion, future studies should explore the longitudinal performance of L1 Chinese, Korean or Japanese speakers starting from an EFL environment and moving to an ESL setting to determine the degree to which proficiency and ESL exposure benefit the online processing of incongruent morphological categories.
Despite Japanese learners of English having been previously thought to be unable to acquire the plural -S morpheme in English, this study effectively reveals that these learners have sensitivity to violations in number agreement for both plural -S and the singular null form, replicating recent research. Importantly, because these learners had limited exposure to native-like English, this study demonstrates that neither living in an English-speaking country nor having high or communicable proficiency is required for a behavioral response to an ungrammatical incongruent morpheme. Thus, this study casts doubt on the morphological congruency hypothesis in the specific regard that sensitivity to these errors should not be observed. However, we do believe that morphological congruency plays an important role in the acquisition of a morphological category in a second language such that congruency may benefit the acquisition process. Overall, this study supports the missing surface inflection hypothesis (Prévost & White, 2000) in regard to JEFL learners being able to acquire the plural inflections in English. Regarding language instruction, while instructors should be aware that errors are not necessarily a representation of a learner’s grammaticized knowledge, errors produced can increase the lexical variability among learners, thus exacerbating the issue. As such, we believe that these errors should not be ignored, and that native-like exposure can facilitate the process of eliminating these performance errors.
This study was funded in part by the Japan Society for the Promotion of Science (JSPS) Grant-in-Aid for Overseas JSPS post-doctoral research fellows granted to Michael P. Mansbridge, Grant Number P18004. We would like to thank Dr. Rinus G. Verdonschot of Hiroshima University, Hiroshima, Japan for his advice during the programming of the E-Prime experiment.
 Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-Effects Modeling with Crossed Random Effects for Subjects and Items. Journal of Memory and Language, 59, 390-412.
 Baayen, R. H., Dijkstra, T., & Schreuder, R. (1997). Singulars and Plurals in Dutch: Evidence for a Parallel Dual-Route Model. Journal of Memory and Language, 37, 94-117.
 Bailey, N., Madden, C., & Krashen, S. D. (1974). Is There a “Natural Sequence” in Adult Second Language Learning? Language Learning, 24, 235-243.
 Brysbaert, M., & New, B. (2009). Moving beyond Kucera and Francis: A Critical Evaluation of Current Word Frequency Norms and the Introduction of a New and Improved Word Frequency Measure for American English. Behavior Research Methods, 41, 977-990.
 Chen, L., Shu, H., Liu, Y., Zhao, J., & Li, P. (2007). ERP Signatures of Subject-Verb Agreement in L2 Learning. Bilingualism: Language and Cognition, 10, 161-174.
 Clahsen, H., Felser, C., Neubauer, K., Sato, M., & Silva, R. (2010). Morphological Structure in Native and Nonnative Language Processing. Language Learning, 60, 21-43.
 Ellis, N. C. (2002). Frequency Effects in Language Processing: A Review with Implications for Theories of Implicit and Explicit Language Acquisition. Studies in Second Language Acquisition, 24, 143-188.
 Fox, J., Weisberg, S., Friendly, M., Hong, J., Andersen, R., Firth, D., & Taylor, S. (2018). Package “Effects”. R Package Version 4.0-3.
 Gibson, E. (2000). The Dependency Locality Theory: A Distance-Based Theory of Linguistic Complexity. In A. Marantz, Y. Miyasita, & W. O’Neil (Eds.), Image, Language, Brain: Papers from the First Mind Articulation Project Symposium (pp. 95-126). Cambridge, MA: MIT Press.
 Gillon-Dowens, M., Guo, T., Guo, J., Barber, H., & Carreiras, M. (2011). Gender and Number Processing in Chinese Learners of Spanish: Evidence from Event Related Potentials. Neuropsychologia, 49, 1651-1659.
 Gillon-Dowens, M., Vergara, M., Barber, H. A., & Carreiras, M. (2010). Morphosyntactic Processing in Late Second-Language Learners. Journal of Cognitive Neuroscience, 22, 1870-1887.
 Goldschneider, J. M., & DeKeyser, R. M. (2001). Explaining the “Natural Order of L2 Morpheme Acquisition” in English: A Meta-Analysis of Multiple Determinants. Language Learning, 51, 1-50.
 Hawkins, R., & Chan, C. Y. H. (1997). The Partial Availability of Universal Grammar in Second Language Acquisition: The “Failed Functional Features Hypothesis”. Second Language Research, 13, 187-226.
 Hawkins, R., & Liszka, S. (2003). Locating the Source of Defective Past Tense Marking in Advanced L2 English Speakers. In R. Van Hout, A. Hulk, F. Kuiken, & R. J. Towell (Eds.), Language Acquisition and Language Disorders (Volume 30, pp. 21-44). Amsterdam: John Benjamins Publishing Company.
 Hosoi, H. (2005). Japanese-Tachi Plurals. In R. T. Cover, & Y. Kim (Eds.), Proceedings of the Annual Meeting of the Berkeley Linguistics Society (Volume 31, pp. 157-168). Berkeley: Berkeley Linguistics Society.
 Jiang, N., Novokshanova, E., Masuda, K., & Wang, X. (2011). Morphological Congruency and the Acquisition of L2 Morphemes. Language Learning, 61, 940-967.
 Keating, G. D. (2009). Sensitivity to Violations of Gender Agreement in Native and Nonnative Spanish: An Eye-Movement Investigation. Language Learning, 59, 503-535.
 Keating, G. D. (2010). The Effects of Linear Distance and Working Memory on the Processing of Gender Agreement in Spanish. In B. VanPatten, & J. Jegerski (Eds.), Research in Second Language Processing and Parsing (pp. 113-134). Amsterdam: John Benjamins.
 Krashen, S. D., Sferlazza, V., Feldman, L., & Fathman, A. K. (1976). Adult Performance on the SLOPE Test: More Evidence for a Natural Sequence in Adult Second Language Acquisition. Language Learning, 26, 145-151.
 Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2017). lmerTest Package: Tests in Linear Mixed Effects Models. Journal of Statistical Software, 82.
 Lenth, R., Singmann, H., Love, J., Buerkner, P., & Herve, M. (2018). Package “Emmeans”. R Package Version 4.0-3.
 Luk, Z. P. S., & Shirai, Y. (2009). Is the Acquisition Order of Grammatical Morphemes Impervious to L1 Knowledge? Evidence from the Acquisition of Plural-s, Articles, and Possessive’s. Language Learning, 59, 721-754.
 McDonald, J. L. (2006). Beyond the Critical Period: Processing-Based Explanations for Poor Grammaticality Judgment Performance by Late Second Language Learners. Journal of Memory and Language, 55, 381-401.
 Meisel, J. (1991). Principles of Universal Grammar and Strategies of Language Learning: Some Similarities and Differences between First and Second Language Acquisition. In L. Eubank (Ed.), Point Counterpoint: Universal Grammar in the Second Language (pp. 231-276). Amsterdam: John Benjamins.
 Montrul, S., & Slabakova, R. (2003). Competence Similarities between Native and Near-Native Speakers. Studies in Second Language Acquisition, 25, 351-398.
 Portin, M., & Laine, M. (2001). Processing Cost Associated with Inflectional Morphology in Bilingual Speakers. Bilingualism: Language and Cognition, 4, 55-62.
 Prévost, P., & White, L. (1999). Accounting for Morphological Variation in Second Language Acquisition: Truncation or Missing Inflection? In M.-A. Friedemann, & L. Rizzi (Eds.), The Acquisition of Syntax (pp. 202-235). London: Routledge.
 Prévost, P., & White, L. (2000). Missing Surface Inflection or Impairment in Second Language Acquisition? Evidence from Tense and Agreement. Second Language Research, 16, 103-133.
 R Core Team (2017). R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing.
 Rastle, K., Harrington, J., & Coltheart, M. (2002). 358,534 Nonwords: The ARC Nonword Database. Quarterly Journal of Experimental Psychology, 55, 1339-1362.
 Sagarra, N., & Herschensohn, J. (2011). Proficiency and Animacy Effects on L2 Gender Agreement Processes during Comprehension. Language Learning, 61, 80-116.
 Schmidt, R. W. (1983). Interaction, Acculturation and the Acquisition of Communicative Competence. In N. Wolfson, & E. Judd (Eds.), Sociolinguistics and Language Acquisition (pp. 137-174). Rowley, MA: Newbury House.
 Tono, Y. (2000). A Computer Learner Corpus Based Analysis of the Acquisition Order of English Grammatical Morphemes. In L. Burnard, & T. McEnery (Eds.), Rethinking Language Pedagogy from a Corpus Perspective (pp. 123-132). Frankfurt: Peter Lang.
 Tono, Y. (2009). Corpus-Based Research and Its Implications for Second Language Acquisition and English Language Teaching (pp. 155-173). A New Look at Language Teaching and Testing English as Subject and Vehicle.
 Wen, Z., Miyao, M., Takeda, A., Chu, W., & Schwartz, B. D. (2010). Proficiency Effects and Distance Effects in Nonnative Processing of English Number Agreement. In K. Franich, K. M. Iserman, & L. L. Keil (Eds.), Proceedings of the 34th Boston University Conference on Language Development (pp. 445-456). Somerville, MA: Cascadilla Press.
 White, L., & Genesee, F. (1996). How Native Is Near-Native? The Issue of Ultimate Attainment in Adult Second Language Acquisition. Second Language Research, 12, 233-265.