It is hard to imagine living without numbers. A number can express multiple meanings, such as physical size, quantity, or degree. The information that a single digit conveys along these different dimensions is complex, so people need to focus on the dimensions that are most practical and relevant.
After the classic word-color Stroop paradigm was introduced into cognitive experiments (Stroop, 1935), numerical versions followed. In numerical Stroop tasks (Dadon & Henik, 2017; Henik & Tzelgov, 1982; Windes, 1968; Zhou et al., 2007), participants are typically required to decide which of two numbers is physically or numerically larger (or smaller) while ignoring the comparison on the other dimension. If the two numbers stand in the same relationship in both physical and numerical magnitude, the trial is termed congruent; otherwise it is termed incongruent. In a congruent trial, the numerical comparison and the physical-size comparison agree (e.g., 6 - 8, with the numerically smaller number presented in the smaller physical size); in an incongruent trial, they disagree (e.g., 6 - 8, with the numerically smaller number presented in the larger physical size).
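The congruency rule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `classify_trial` and the use of font size in points as the physical dimension are assumptions made for the example.

```python
def classify_trial(value_a, size_a, value_b, size_b):
    """Label a two-digit trial as 'congruent' or 'incongruent'.

    value_* are numerical values; size_* are physical sizes on any ordered
    scale (font size in points is assumed here purely for illustration).
    """
    if value_a == value_b or size_a == size_b:
        raise ValueError("the two digits must differ on both dimensions")
    a_is_numerically_larger = value_a > value_b
    a_is_physically_larger = size_a > size_b
    # Congruent: both comparisons point to the same digit.
    if a_is_numerically_larger == a_is_physically_larger:
        return "congruent"
    return "incongruent"


# The 6 - 8 example from the text, with hypothetical font sizes.
print(classify_trial(6, 20, 8, 30))  # 6 shown small, 8 shown large
print(classify_trial(6, 30, 8, 20))  # 6 shown large, 8 shown small
```

The first call corresponds to the congruent example and the second to the incongruent one.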
Participants typically fail to avoid interference from the irrelevant dimension, which leads to slower responses and more errors in incongruent trials. These results indicate that the numerical value and the physical size of a digit are closely connected. As to the underlying cause of this effect, some researchers hold that numerical and non-numerical information share the same mental representation during number processing (Buijsman & Tirado, 2019; Dehaene, Bossini, & Giraux, 1993; Dehaene, Dupoux, & Mehler, 1990), which makes it hard to focus on a single dimension.
Others prefer the theory of separate but interactive systems. When numerical value and physical size are processed, the two processes may be separate while their outputs interfere, or the system may fail to keep the two apart (Henik & Tzelgov, 1982).
Numerical Stroop paradigms have been widely applied in perception, which means that the semantic and physical information was attended rather than held in memory. It remains unresolved whether attended numerical information is automatically encoded and maintained in working memory.
To address this question, we designed a working memory version of the numerical Stroop task. Participants were required to hold a number in memory for a short interval. When a second number then appeared, they were asked to decide which of the two was physically or numerically larger.
According to the hypothesis of shared representation, or holistic processing (Buijsman & Tirado, 2019; Dehaene, Bossini, & Giraux, 1993), if a shared representation is used when numerical and non-numerical information is encoded, each dimension should interfere with the other. A Stroop effect should therefore appear in both physical size and numerical value judgments, even in working memory tasks.
Under the separate but interactive representation theory, by contrast, either numerical meaning or physical size may dominate. If one of the two comparisons, numerical or physical, were superior to the other, it is quite possible that no Stroop effect would be found in working memory.
The current research aimed to determine whether numerical information interferes with physical judgments and whether physical information interferes with numerical judgments. In Experiment 1, participants were required to choose the physically larger number. If numerical information affects physical magnitude judgments, responses should be faster and more accurate in congruent than in incongruent trials. In addition, we manipulated the duration of the delay to test whether this effect was stable: if so, it should remain significant even after 4000 ms. In Experiment 2, participants were required to choose the numerically larger number. If physical information affects numerical magnitude judgments, responses should be faster and more accurate in congruent than in incongruent trials, and if this effect is stable, it should remain significant even after 4000 ms.
2. Experiment 1
A group of participants (N = 15) was recruited from Hangzhou Normal University and provided informed consent. They received payment for participating and were naïve to the purpose of the experiment. All of them had normal or corrected-to-normal vision and no attention deficits. They were required to choose the physically larger of two digits.
2.1.2. Design and Procedure
Stimuli were presented using E-Prime 2.0 (Psychology Software Tools, Pittsburgh, PA) on a 17-inch CRT monitor with a resolution of 1024 × 768, and a refresh rate of 100 Hz. The viewing distance was approximately 50 cm.
The stimuli were eight Arabic digits (1, 2, 3, 4, 6, 7, 8, and 9), the same as those employed by Dadon and Henik (2017; see also Kadosh, Henik, & Rubinsten, 2008; Leibovich, Diesendruck, Rubinsten, & Henik, 2013). We combined these digits using three numerical distances (1, 2, and 5) and three physical distances. Each numerical distance included two number pairs, and each physical distance corresponded to one font pair (see Table 1).
The same stimuli were used in the congruent and incongruent conditions. In the congruent condition, the two numbers of a pair were presented in succession and stood in the same relationship in both numerical and physical size. In the incongruent condition, the numerical and physical comparisons between the two numbers pointed in opposite directions.
Table 1. The different combinations of the numbers and physical sizes according to distance.
The interstimulus interval (ISI) had three levels: 1000 ms, 2000 ms, or 4000 ms. Participants were instructed to respond as quickly and accurately as possible by pressing one of two keys (“F” when the memory cue was physically bigger, “J” when the test item was physically bigger).
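As a rough sketch of the design just described, the following Python snippet enumerates the six cells of the 2 (congruency) × 3 (ISI) design and maps a physical-size comparison onto the two response keys. The names `CELLS` and `correct_key` and the font sizes in the usage lines are hypothetical; the actual font pairs are those listed in Table 1.

```python
from itertools import product

CONGRUENCY = ("congruent", "incongruent")
ISI_MS = (1000, 2000, 4000)

# The six within-participant cells of the 2 x 3 design.
CELLS = list(product(CONGRUENCY, ISI_MS))


def correct_key(cue_size, test_size):
    """Return the correct key for a physical-size judgment (Experiment 1).

    'F' = the memory cue is physically bigger,
    'J' = the test item is physically bigger.
    Sizes can be on any ordered scale (hypothetical font sizes here).
    """
    if cue_size == test_size:
        raise ValueError("cue and test item must differ in physical size")
    return "F" if cue_size > test_size else "J"


for congruency, isi in CELLS:
    print(congruency, isi)

print(correct_key(30, 20))  # cue bigger
print(correct_key(20, 30))  # test item bigger
```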
Table 2. Means of (individual) median reaction time (ms) and accuracy (%). Values in parentheses are standard errors of the mean.
2.2.1. Accuracy Analysis
Participants’ mean accuracy was 84.2%. A 2 (congruency: congruent vs. incongruent) × 3 (ISI: 1000 ms, 2000 ms, and 4000 ms) repeated-measures ANOVA was performed on accuracy. As predicted, accuracy was higher in the congruent than in the incongruent condition [F(1,14) = 51.74, p < 0.001, ηp² = 0.787].
In addition, a main effect of ISI was observed [F(2,14) = 6.804, p = 0.004, ηp² = 0.327]. Specifically, accuracy was higher at the 1000 ms ISI (M = 0.861, SD = 0.020) than at the 4000 ms ISI (M = 0.819, SD = 0.018), p < 0.001.
However, the interaction between congruency and ISI did not reach significance [F(2,14) = 0.519, p = 0.601, ηp² = 0.036].
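The third value reported with each F statistic appears to be a partial eta squared effect size. Under that assumption, it can be recovered from the F value and its degrees of freedom with the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The helper name `partial_eta_squared` below is illustrative only.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Congruency effect on accuracy in Experiment 1: F(1,14) = 51.74
print(round(partial_eta_squared(51.74, 1, 14), 3))  # 0.787
```

The result agrees with the effect size reported for the congruency effect on accuracy.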
2.2.2. Reaction Time Analysis
Trials with wrong responses were removed.
We conducted a 2 (congruency: congruent vs. incongruent) × 3 (ISI: 1000 ms, 2000 ms, and 4000 ms) repeated-measures ANOVA on median reaction times. Responses were faster in the congruent than in the incongruent condition [F(1,14) = 6.611, p = 0.022, ηp² = 0.321].
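The preprocessing implied here (dropping error trials, taking each participant's median reaction time per condition, then averaging those medians across participants) could look roughly like the sketch below. The trial format, the field names, and both function names are assumptions for illustration, not the authors' analysis script.

```python
from collections import defaultdict
from statistics import mean, median


def median_rt_by_condition(trials):
    """Median RT of correct trials per (participant, congruency, isi).

    `trials` is an iterable of dicts with the hypothetical keys
    'participant', 'congruency', 'isi', 'rt' (ms), and 'correct' (bool).
    """
    grouped = defaultdict(list)
    for trial in trials:
        if trial["correct"]:  # trials with wrong responses are removed
            key = (trial["participant"], trial["congruency"], trial["isi"])
            grouped[key].append(trial["rt"])
    return {key: median(rts) for key, rts in grouped.items()}


def condition_means(medians):
    """Average the individual medians across participants per condition."""
    by_condition = defaultdict(list)
    for (_participant, congruency, isi), value in medians.items():
        by_condition[(congruency, isi)].append(value)
    return {cond: mean(values) for cond, values in by_condition.items()}
```

The per-participant medians returned by `median_rt_by_condition` would be the input to the repeated-measures ANOVA, and `condition_means` gives the kind of summary values shown in a table of means.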
A main effect of ISI was also observed [F(2,13) = 12.955, p < 0.001, ηp² = 0.481]. The difference between 1000 ms (M = 785.033, SD = 37.383) and 2000 ms (M = 745.217, SD = 38.916) was not significant (p = 0.079). However, participants responded faster in the 4000 ms condition (M = 712.350, SD = 35.849) than in the 1000 ms condition (p = 0.002) and the 2000 ms condition (p = 0.013).
The interaction between congruency and ISI again did not reach significance [F(2,13) = 0.467, p = 0.637, ηp² = 0.067].
The results of Experiment 1 indicated that the semantic information of a number was attended and maintained in working memory (WM), even after a delay of four seconds.
3. Experiment 2
Another group of participants (N = 15) was recruited. All procedures were identical to those of Experiment 1, except that participants were required to choose the numerically larger number.
Median RT and accuracy for all conditions in Experiment 2 are reported in Table 3.
3.2.1. Accuracy Analysis
Participants’ mean accuracy was 96.27%.
Accuracy did not differ significantly between the congruent and incongruent conditions [F(1,14) = 0.006, p = 0.940, ηp² < 0.001]. Neither the main effect of ISI [F(2,14) = 0.018, p = 0.982, ηp² = 0.003] nor the interaction between congruency and ISI [F(2,13) = 1.025, p = 0.386, ηp² = 0.136] approached significance.
3.2.2. Reaction Time Analysis
A 2 (congruency: congruent vs. incongruent) × 3 (ISI: 1000 ms, 2000 ms, and 4000 ms) repeated-measures ANOVA was conducted on median reaction times.
RT varied significantly across ISIs [F(2,13) = 48.051, p < 0.001, ηp² = 0.881]. However, neither the effect of congruency [F(1,14) = 0.006, p = 0.940, ηp² < 0.001] nor the interaction [F(2,13) = 0.746, p = 0.493, ηp² = 0.103] reached significance.
More specifically, participants responded fastest at the 4000 ms ISI (M = 565.383, SD = 46.023) and slowest at the 1000 ms ISI (M = 662.300, SD = 45.611). The difference between the 2000 ms (M = 598.817, SD = 44.408) and 4000 ms conditions was marginally significant, p = 0.056, whereas the differences between 1000 ms and 2000 ms and between 1000 ms and 4000 ms were both significant, ps < 0.001.
Table 3. Means of (individual) median reaction time (ms) and accuracy (%). Values in parentheses are standard errors of the mean.
The results of Experiment 2 showed that physical information did not affect numerical judgments in WM. However, this could be because the physical appearance of the digit was not attended.
4. General Discussion
These experiments provided converging evidence that a numerical Stroop effect exists in working memory and that it is caused by interference from semantic value rather than from physical information.
We found this Stroop effect in WM tasks when participants made physical size judgments, which is consistent with previous studies. Its presence in WM tasks indicates that number processing in working memory is automatic. Given the asymmetry between the influence of numerical magnitude on physical judgments and that of physical magnitude on numerical judgments, we infer that the semantic information of a number takes precedence over its physical appearance during processing in working memory.
4.1. Evidence for the Separation between Attention and Working Memory
According to Chen and colleagues (Chen, Swan, & Wyble, 2016; Chen & Wyble, 2015a, 2015b, 2016), attended information may fail to be encoded into working memory even when it is salient and simple, such as a color or a number. Our studies examined digit stimuli in more depth, focusing on the possible Stroop effects arising from the physical and numerical dimensions of a number.
The results showed that attended physical information was not automatically encoded into WM and did not interfere with judgments of numerical value. Attended semantic information, by contrast, seemed more powerful, since it resisted interference from physical size.
In this way, our studies may be considered further evidence for the separation of attention and working memory.
4.2. Evidence for the Hypothesis of Separate but Interactive Representation
If numerical and physical information shared a mental representation, bidirectional Stroop effects should exist. The current studies indicated otherwise.
It should be noted that, in our paradigm, no numerical Stroop effect caused by physical information was observed. This may indicate that the effect was too weak, relative to that of numerical information, to be detected, rather than that it does not exist.
In this case, separate but interactive representations could explain the asymmetric influence of the semantic and physical dimensions on each other. If individuals apply separate mental representations during number processing and semantic magnitude is superior to physical magnitude, it is reasonable that numerical value interferes with physical size but not vice versa. Therefore, based on our findings, we prefer the hypothesis of separate representations.
4.3. Possible Lists of Important Attributes from Different Dimensions
Location information of a visual target is automatically encoded even when the participant does not expect to report it, whereas salient attribute information, such as color or unmasked digit identity, is poorly encoded into memory even when it is made task-relevant (Chen & Wyble, 2015a, 2015b, 2016). It can be inferred that location information is more fundamental and important than the color and identity of a visual cue.
Based on our findings, semantic value seems more powerful than the physical size of a digit. We therefore suppose that the attributes of a visual stimulus may be ordered by their significance or superiority during working memory processing. Location, color, identity, numerical value, and physical size could all be on such a list (Chen et al., 2016; Chen & Wyble, 2015a, 2015b, 2016). However, the rank of these attributes is still unclear, so more research is needed on the priority of attributes during processing.
According to object-based encoding theory (Z. Gao, Zhang, Shen, Zhao, & Tang, 2013; Zaifeng Gao et al., 2016; Rees, Kreiman, & Koch, 2002; Shen, Tang, Wu, Shui, & Gao, 2013), many attributes of a number can be held in working memory at the same time, although they differ in processing priority and influence. Numerical value interfered with physical judgments even after 4000 ms, whereas physical information did not influence numerical judgments, which is consistent with the notion that number processing is automatic (Giraux, 2014; Gutiérrez-Martínez, Ramos-Ortega, & Vila-Chaves, 2018). Future studies could investigate the brain structures involved in numerical working memory.
 Henik, A. and Tzelgov, J. (1982) Is Three Greater than Five: The Relation between Physical and Semantic Size in Comparison Tasks. Memory & Cognition, 10, 389-395. https://doi.org/10.3758/BF03202431
 Zhou, X., Chen, Y., Chen, C., Jiang, T., Zhang, H. and Dong, Q. (2007) Chinese Kindergartners’ Automatic Processing of Numerical Magnitude in Stroop-Like Tasks. Memory & Cognition, 35, 464-470. https://doi.org/10.3758/BF03193286
 Buijsman, S. and Tirado, C. (2019) Spatial-Numerical Associations: Shared Symbolic and Non-Symbolic Numerical Representations. Quarterly Journal of Experimental Psychology, 72, 2423-2436. https://doi.org/10.1177/1747021819844503
 Dehaene, S., Bossini, S. and Giraux, P. (1993) The Mental Representation of Parity and Number Magnitude Access to Parity and Magnitude Knowledge during Number Processing. Journal of Experimental Psychology: General, 122, 371-396. https://doi.org/10.1037/0096-3445.122.3.371
 Dehaene, S., Dupoux, E. and Mehler, J. (1990) Is Numerical Comparison Digital? Analogical and Symbolic Effects in Two-Digit Number Comparison. Journal of Experimental Psychology: Human Perception and Performance, 16, 626-641. https://doi.org/10.1037/0096-1523.16.3.626
 Chen, H., Swan, G. and Wyble, B. (2016) Prolonged Focal Attention without Binding: Tracking a Ball for Half a Minute without Remembering Its Color. Cognition, 147, 144-148. https://doi.org/10.1016/j.cognition.2015.11.014
 Chen, H. and Wyble, B. (2015) Amnesia for Object Attributes: Failure to Report Attended Information That Had Just Reached Conscious Awareness. Psychological Science, 26, 203-210. https://doi.org/10.1177/0956797614560648
 Chen, H. and Wyble, B. (2015) The Location But Not the Attributes of Visual Cues Are Automatically Encoded into Working Memory. Vision Research, 107, 76-85. https://doi.org/10.1016/j.visres.2014.11.010
 Chen, H. and Wyble, B. (2016) Attribute Amnesia Reflects a Lack of Memory Consolidation for Attended Information. Journal of Experimental Psychology: Human Perception and Performance, 42, 225-234. https://doi.org/10.1037/xhp0000133
 Gao, Z., Zhang, Q., Shen, M., Zhao, G. and Tang, N. (2013) Object-Based Encoding in Visual Working Memory: A Life Span Study. Journal of Vision, 13, Article No. 11. https://doi.org/10.1167/13.10.11
 Gao, Zaifeng, Yu, S., Zhu, C., Shui, R., Weng, X., Li, P. and Shen, M. (2016) Object-Based Encoding in Visual Working Memory: Evidence from Memory-Driven Attentional Capture. Scientific Reports, 6, Article No. 22822. https://doi.org/10.1038/srep22822
 Gutiérrez-Martínez, F., Ramos-Ortega, M. and Vila-Chaves, J. Ó. (2018) Executive Efficacy on Stroop Type Interference Tasks. A Validation Study of a Numerical and Manual Version (CANUM). Annals of Psychology, 34, 184-196. https://doi.org/10.6018/analesps.34.1.263431