OJML Vol.11 No.5, October 2021
A Study on the Use of Hesitation Markers in Varied-Level EFL Learners’ L2 Speaking Process
Abstract: This article investigates the hesitation markers used by EFL learners at different proficiency levels in the L2 speaking process, taking examinees in the IELTS Speaking Test as examples. The hesitation markers in the transcribed speech data, such as silent and filled pauses and other lexical fillers (e.g., repetitions, self-repairs, smallwords, and reformulations), are analyzed qualitatively. By comparing the performance of low-proficiency and high-proficiency speakers, this study aims to offer some insights to language educators and researchers engaged in teaching and learning, and to help students improve their oral proficiency. In foreign language teaching, the teaching of smallwords should be strengthened, so as to help students achieve coherence and fluency.

1. Introduction

Spoken language, in its purest form, is unrehearsed and spontaneous (see e.g., Burns & Joyce, 1997). Interactants build speech as they go along, in a process of on-line planning. As a result, there are times in a conversation when the speaker is inevitably hesitant and does not know what to say next (or how to express it). A hesitation marker is a linguistic form that appears in environments in which speakers have difficulties in retrieving lexical information during speech production.

Wiese (1984), for example, mentions filled pauses (e.g., uh, mhm), repetitions, corrections and drawls. But the literature also recognizes a number of smallwords which, among other functions, allow the speaker to “buy time” (e.g., well, I mean or vague words such as stuff or things like that).

However, since vocalizations, false starts, repetitions and other “smallwords” such as well or I mean “do not contribute essentially to the message itself” (Hasselgren, 2002: p. 150), they tend to be disregarded. In fact, they are disregarded not only by hearers but also by specialists of language (Maclay & Osgood, 1959). Talking about smallwords, Hasselgren (2002: p. 168) refers to an “essential but hitherto largely neglected body of language”.

Moreover, the development of academic speaking abilities is an area of concern to course organizers, teachers, and students, and there is a need for research into the topic, especially at the discourse level.

Accordingly, this paper aims to explore the hesitation markers used by English learners at different proficiency levels in the IELTS Speaking Test. The study adopts a very broad definition of hesitation markers, which covers (silent and filled) pauses, drawls, truncated words and repetitions, as well as a representative selection of commonly occurring smallwords of hesitation. The following research questions are addressed: What are the characteristics of the hesitation markers used by English learners at different levels? Are IELTS examinees’ speaking bands related to their use of hesitation markers? Based on detailed transcriptions of authentic speech, it is possible to study hesitation phenomena with a precision and reliability that were practically unattainable before.

2. Literature Review

2.1. The Pragmatic Function of Hesitation in Speech

The function of hesitation is crucial as a conversational strategy. Since speech is dialogic in nature (Burns & Joyce, 1997: p. 13), it is important that a speaker should indicate that s/he needs a moment’s reflection, but is still “in control” of his/her turn. Hesitation markers, by signalling a small delay, ensure that the speaker can keep his/her turn in the conversation and is not interrupted by the other participants. Even silent pauses have been demonstrated to play a part in the structure of the message and to contribute to its internal cohesion (Romero Trillo, 1994). For foreign (or second) language learners, hesitation is even more crucial. In their search for a formulation which is acceptable in the foreign language, they are likely to experience many planning problems and, therefore, need techniques that enable them to gain time while they are trying to solve these problems (Chambers, 1997; Temple, 2000).

In short, hesitation markers can buy time for a speaker’s speech to catch up with his/her thoughts, or to fish out the right word for a situation. They do not just benefit the speaker: a filled pause also lets listeners know that an important word is on the way. Hesitation markers direct the flow of conversation, and some studies suggest that conscientious speakers use more of these phrases to ensure that everyone is being heard and understood. For example, starting a sentence with “Look…” can indicate the speaker’s attitude and help gauge the listener’s agreement. “I mean” can signal that the speaker is about to elaborate on something. “Like” can perform many functions, such as establishing a loose connection between thoughts, or introducing someone else’s words or actions.

These markers give listeners a real-time view into the speaker’s thought process and help them follow, interpret, and predict what the speaker is trying to say. They are not just useful for understanding language; they help us learn it, too. For adolescents and adults learning a second language, filled pauses smooth out awkward early conversations. Once they are more confident, second-language learners can signal their newfound fluency by using the appropriate hesitation phenomena, because, contrary to popular belief, the use of filled pauses does not decrease with mastery of a language.

2.2. The Function of Hesitation Markers in Measuring Fluency

In research on language learning and use, hesitation markers are often used as measures of fluency, either through the length of pauses (Raupach, 1987; Mehnert, 1998; Kormos & Dénes, 2004) or through the number of pauses per c-unit, per t-unit, or per minute (Mehnert, 1998; Skehan, 2001; Bygate, 2001; Tavakoli & Skehan, 2005). Another way to look at pauses is by their location. Butterworth (1980) believes that micro-planning for lexical selection takes place at clause boundaries, which leads to juncture pauses. Beattie (1980) suggests that temporal cycles of hesitant and fluent phases emerge in speech.

Hesitation markers such as repetitions and self-repairs are also considered when measuring fluency. Skehan (2001) takes reformulation, replacement, false starts, and repetition as disfluency indicators. However, Shehadeh (1999) sees these markers as the result of output difficulties and self-initiated clarification attempts rather than as disfluency. Some research focuses on self-corrections or repairs in learners’ language for purposes other than measuring disfluency, such as examining differences between proficiency levels, problem-solving mechanisms, and self-repairs through task repetition (Lynch & Maclean, 2001).

Hesitations also indirectly influence other measures of fluency. Some research includes hesitation markers (filled pauses, repetitions, and self-repairs) in the number of words/syllables to measure fluency (Towell et al., 1996; Kormos & Dénes, 2004), while other research excludes them (Mehnert, 1998; Foster, 2000; Skehan, 2001; Bygate, 2001; Tavakoli & Skehan, 2005).
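The difference between the two counting practices can be illustrated with a minimal sketch. It is not the procedure of any of the studies cited above: the filler inventory, the function names, and the 6-second duration are assumptions made for illustration only, and the sample utterance is adapted from Excerpt 2.

```python
import re

# Hypothetical filler inventory (an assumption; the cited studies do not fix one).
FILLED_PAUSES = {"um", "umh", "uh", "er", "erm", "em"}

def tokenize(utterance):
    """Return lowercase word tokens, ignoring drawl colons and '+' pause symbols."""
    cleaned = utterance.lower().replace(":", "").replace("+", " ")
    return re.findall(r"[a-z]+(?:['’][a-z]+)?", cleaned)

def prune(tokens):
    """Drop filled pauses and immediate word repetitions (e.g. 'to to')."""
    pruned = []
    for tok in tokens:
        if tok in FILLED_PAUSES:
            continue  # filled pause: excluded from the word count
        if pruned and pruned[-1] == tok:
            continue  # immediate repetition: counted once
        pruned.append(tok)
    return pruned

def words_per_minute(n_words, seconds):
    """Speech rate in words per minute."""
    return n_words * 60.0 / seconds

sample = "they don't know umh: what to to umh: there is a spare times"
duration_s = 6.0  # hypothetical duration, for illustration only

all_tokens = tokenize(sample)
kept_tokens = prune(all_tokens)
print(words_per_minute(len(all_tokens), duration_s))   # markers included
print(words_per_minute(len(kept_tokens), duration_s))  # markers excluded
```

The sketch only catches immediate repetitions; restarts such as "if you umh: if you" would need a more careful treatment of self-repairs.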

The various ways of looking at language use seem to have caused this divergence; as Foster et al. (2000) note, “different researchers may wish to deal with actual linguistic material within the false starts and corrections in different ways, depending on their interests” (p. 368). When language ability is seen as language knowledge, these hesitations in learners’ language use are typically seen as time-creating devices used to search for words or forms that are not yet automated. Therefore, these hesitations, such as repeated words or parts of words, incomplete words before repair, and filled pauses, are not counted as part of the articulated words.

On the other hand, the use of time-creating devices can be considered to be a type of communication strategy. Dörnyei (1995: p. 71) claims: “… we were particularly interested in one aspect, the ability to fill the time with talk, which contrasts with a characteristic feature of L2 speech… in which the learner keeps grinding to halt, pauses for lengthy periods, and often gets so lost that the interlocutor loses patience, or a complete communication breakdown occurs. In measuring speech rate, fillers, lexicalized hesitations…and repetitions are considered to be part of fluent speech…”.

Dörnyei examined how teaching communication strategies affects speech rate, which includes these features as fluency markers. A study comparing the results obtained when these markers are included in, and excluded from, the number of words/syllables used to examine fluency is needed. Whether or not these hesitation markers are counted as part of words/syllables, they are likely to be treated uniformly as either positive or negative features of fluency. Hesitations, however, could have various functions in communication.

Another way of seeing fluency might be by how many accurate and target-like collocations an L2 speaker can produce through on-line planning. Using chunks reduces an L2 speaker’s cognitive burden from paying attention to both the form and meaning of language. Foster (2001) reports that nonnative speakers are processing language more through rules than routines, when compared with native speakers. The use of prefabricated chunks (one type of collocations) as a time-creating device instead of hesitations would be a sign of fluency (Dörnyei, 1995).

By contrast, Fulcher (2003) has doubts about counting constituents to measure fluency. He argues that the initial problem with counting pauses or repetitions is that the number of pauses does not automatically translate into a perception of reduced fluency. Fulcher (2003: pp. 100-101) examined speech data through discourse analysis and found several types of pauses in different situations: “end-of-turn pauses, content-planning hesitation, grammatical-planning hesitation, and addition of examples, counter examples, or reasons to support a point of view”. Fulcher related the use of these types of pauses to examinees’ proficiency levels. Low-proficiency examinees do not always produce more pauses than higher-proficiency examinees: examinees at both level 1 and level 5 used end-of-turn pauses, though for different reasons; content-planning hesitation increases up to level 4 but at level 5 falls below even level 1; and levels 2 and 3 show both grammatical-planning hesitation and the addition of examples, counter-examples, or reasons to support a point of view, although the latter is unlikely to occur in low-proficiency learners’ language. Simply counting pauses or pausing time is therefore unlikely to be a reliable way to measure fluency.

Hesitation strategies appear in speech in the form of filled or unfilled pauses, paralinguistic markers such as nervous laughter or coughing, or signals that prepare the listener for upcoming units of the utterance which the speaker is struggling to produce. The main functions of these hesitation strategies have been associated with speech planning or with difficulties in accessing speech.

Previous studies on the hesitation strategies used by beginner and advanced L2 learners have revealed that beginners mostly leave their hesitation pauses unfilled, which causes their speech to sound disfluent, whereas advanced learners tend to use various fillers in order to sound like native speakers.

2.3. Categories of Hesitation Markers

According to Gilquin (2008), the hesitation markers can be divided into three main categories, namely (silent and filled) pauses, smallwords and a miscellaneous category.

Silent pauses, which are defined as gaps in the utterance, are probably the most basic way of dealing with problems of formulation. Not knowing what to say, the speaker just remains silent. As pointed out by Fillmore (1979), silent pauses are multifunctional, since they both have a rhetorical function and serve as a marker of disfluency. Because it is almost impossible to identify with any certainty cases where the pause merely has a rhetorical function, however, all silent pauses were taken into account in the analysis.

Alternatively, pauses can be filled by vocalizations, sounds such as er or erm. Discussing uh and um (the American spelling variants of er and erm, respectively), Clark and Fox Tree (2002: p. 75) note that they are “characteristically associated with planning problems”, being used by speakers to announce a delay in speaking.

A number of smallwords can also be used to signal hesitation, such as kind of. Other examples include well, defined by Fuller as “a delay device when the speaker is not sure how to respond”, I mean, which can be used “when pausing to think about what you are going to say next” (Longman Dictionary of Contemporary English), and vague words such as stuff and or something, which can be used to fill knowledge gaps or lexical gaps. Most of these words are multifunctional (Schiffrin, 1987; Aijmer, 2002). You know, for instance, may be used “when you need to keep someone’s attention, but cannot think of what to say next” (Summers, 1995: p. 781), but it may also function as “a speaker appeal for hearer cooperation in a discourse task” (Schiffrin, 1987: p. 63). Often, several pragmatic functions are performed simultaneously by one and the same word and it may prove extremely difficult to disentangle them, despite the help of the surrounding context. By contrast, some smallwords have, next to their pragmatic function(s), a non-pragmatic meaning which can clearly be identified. Kind of, for example, may also be used with a nonpragmatic function, as a synonym of “type of”. Whenever a smallword was used with a non-pragmatic meaning, hence ruling out the function of hesitation, it was discarded.

The miscellaneous category includes drawls (i.e., syllable lengthening), truncated words and repetitions, which are all signals that the speaker is hesitating (see, for example, Fox Tree & Clark, 1997 on drawls; Temple, 2000 on truncations; and Wiese, 1984 on repetitions).

3. Methodology

The IELTS speaking test was chosen as the data source. IELTS is developed to provide a fair and accurate assessment of English language proficiency. Test questions are developed by language specialists from Australia, Canada, New Zealand, the UK and the USA. IELTS test content reflects everyday situations. It is unbiased and fair to test takers from all backgrounds. IELTS Official has uploaded a series of videos of the IELTS speaking test; with detailed transcriptions of this authentic speech, it is possible to study hesitation phenomena with a precision and reliability that were practically unattainable before. The data were collected from YouTube. Eight IELTS speaking test samples uploaded by IELTS Official were selected: two at band 6.0, two at band 7.0, two at band 8.0, one at band 8.5, and one at band 9.0. We chose two test takers at each of bands 6.0, 7.0 and 8.0 to make the data more representative. The online videos were transcribed. We paused or replayed specific clips from time to time to make sure that the transcribed data were correct. After transcription, the hesitation markers were identified and classified, and the patterns of hesitation markers used in the IELTS speaking test were explored through conversation analysis.

The IELTS speaking test takes between 11 and 14 minutes and has three parts. In the first section (4 to 6 minutes) candidates are invited to talk about themselves and their interests and to answer questions on familiar topic areas. In the second section (3 to 4 minutes) the candidates talk about a topic suggested on a cue card. The candidate must speak for between one and one and a half minutes with a few examiner questions at the end. In the third section (4 to 5 minutes), the candidate has the opportunity to discuss issues of a more abstract nature. These issues or topics are thematically linked to part two. For example, if the part two question asks for a description of a favorite teacher, then part three will be a discussion of issues related to education. If part two is concerned with a holiday or interesting place, then part three will also be related to travel or tourism, and so on. A wide range of speaking skills are assessed, including: the ability to communicate opinions and information on everyday topics and common experiences and situations by answering a range of questions; the ability to speak at length on a given topic using appropriate language and organizing ideas coherently; the ability to express and justify opinions and to analyze, discuss and speculate about issues.

In this study, we focus on Part 3, because it is relatively more difficult than Part 1 and Part 2. This makes it easier to find differences between the hesitation markers used by low-proficiency and high-proficiency test takers.

The scores on the IELTS speaking test are arranged in bands from 0 to 9, with 9 being the highest. Detailed information will be illustrated in Figure 1, which was downloaded from

Figure 1. IELTS speaking band descriptors.

4. Results

The following are analyses of hesitations in the transcribed speech data. Eight samples from examinees with different band scores are closely examined and analyzed.

Note: T stands for the examiner, and S stands for the examinee. + means short silent pause; ++ means long silent pause; shadow means filled pause; means drawls; Underline means smallwords; bold letters means repetitions; bold italics means self-repairs.
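As a rough aid to reading the excerpts below, the plain-text symbols of this notation can be tallied automatically. The following is a minimal sketch and not part of the study's procedure: the filler spellings are an assumed list taken from the excerpts, and treating one or more colons after a word (as in "a::") as a drawl is likewise an assumption. Markers shown only by typography (shading, underline, bold) cannot be recovered from plain text and are not counted.

```python
import re
from collections import Counter

# Assumed filler spellings, drawn from the excerpts; not a definitive inventory.
FILLED_PAUSE_FORMS = ("umh", "um", "uh", "erm", "er", "em")
FILLER_RE = re.compile(r"\b(?:" + "|".join(FILLED_PAUSE_FORMS) + r")\b", re.IGNORECASE)

def count_markers(turn):
    """Tally hesitation markers in one transcribed turn."""
    # Remove the speaker label ("T:" or "S:") so its colon is not read as a drawl.
    turn = re.sub(r"^\s*[TS]:\s*", "", turn)
    counts = Counter()
    # "++" marks a long silent pause; a lone "+" marks a short one.
    counts["long_silent_pause"] = len(re.findall(r"\+\+", turn))
    counts["short_silent_pause"] = len(re.findall(r"(?<!\+)\+(?!\+)", turn))
    counts["filled_pause"] = len(FILLER_RE.findall(turn))
    # Assumption: colons directly after a word signal lengthening (a drawl).
    counts["drawl"] = len(re.findall(r"[A-Za-z]+:+", turn))
    return counts

example = "S: Yeah. + Sure. + Cause if you umh: if you don't know what to do"
print(count_markers(example))
```

Under these assumptions a lengthened filler such as "umh:" is counted both as a filled pause and as a drawl; whether that double counting is desirable depends on the coding scheme adopted.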

4.1. Example 1 Li Band 6

Li, who is from China, talks about “hobbies”.

Excerpt 1

T: We’ve been talking about an interest that you enjoy and I’d like to discuss with you one or two more general questions related to this. Let’s consider, first of all, the social benefits of hobbies. What are some of the ways that having a hobby is good for a person’s social life?

S: um::, I think + sometimes + umh: people umh: need umh: are some casual social life that if they have a hobby, actually they could probably umh: for example, umh: connect stamps. They could use these to make new friends and: em:: could share the feeling with them umh: and help them to make new friends. umh: I think+ probably in this way, umh: it could increase his social life.

Excerpt 1 shows that Li produces many short silent pauses and filled pauses, and that she also makes some repetitions. However, she does not use smallwords or self-repairs. She is able to keep going and is willing to give long answers, but coherence is occasionally lost through hesitation while she searches for words and ideas. She uses a good range of connecting words and markers (actually; in this way; I think the most important reason; as an example; as we know).

Vocabulary is the strongest feature of her performance. She is able to discuss topics at length and demonstrates some awareness of style and collocation (contemporary society; casual activities; temporarily forget; a moment just for yourself; time and resources). While she does make errors, these do not interfere with communication (for your healthy).

Her grammatical control is less strong, although she does produce some complex structures, such as subordinate clauses, accurately. Her control of verb tenses is variable and she has recurring difficulty with subject/verb agreement (you shouldn’t to be too addict; they’re too focusing on; he need to). Despite these errors, her meaning is usually clear.

She uses a range of pronunciation features but with variable control. Her rhythm is at times affected by syllable-timing but stress and intonation are used to some good effect (our life is not just for working—we should enjoy our lives as well). Some individual words and sounds are mispronounced, particularly “th”, but this has no significant impact on intelligibility and she can generally be understood without effort.

4.2. Example 2 Stephen Band 6

Stephen is also from China, and talks about “hobbies”.

Excerpt 2

T: What about hobbies um:: + that are very they don’t need other people? What about hobbies that are quite solitary?

S: It takes a spare time I guess er: because sometimes umh: like watching movies or:: playing computer games, it’s both of them are good ways to kill the time.

T: But do you think it improves their social involvement?

S: Yeah. + Sure. + Cause if you umh: if you don’t know what to do, maybe you’ll make trouble for the, to the society once you have got something to do. Maybe you just stay at home umh: because umh: I don’t know that there are some people + they don’t know umh: what to to umh: there is a spare times + they go out and drink and then + make troubles + for the society.

Stephen uses many silent/filled pauses and repetitions, and he also uses some self-repairs. The only smallword that occurs in this excerpt is I guess. He is willing to speak at length, but there are moments when coherence is lost as a result of repetition, self-correction and hesitation, and he is unable to answer the question about why people need a hobby. He is able to use a variety of markers to link his ideas (first of all; I guess; like; it depends; at least; so), although these are not always used appropriately. Limitations in his performance are evident when he falls back on fillers such as how to say; how do you say.

He has a wide enough vocabulary to discuss topics at length (China opening up to the world; cut down the working shifts; more work opportunities), but while he uses some natural colloquial expressions (some other guys; that’s sweet), there are also some collocation errors (broaden your friendship; kill the spare time; in the past times; make more troubles). These rarely cause comprehension problems.

He produces a mix of short and complex sentence forms with a variety of grammatical structures. However, his overall grammatical control is variable and errors recur (you are make trouble to the society; people like spend; in the past…people work more…there is a period; may go travel round; we have also get), although these do not impede communication.

His pronunciation is generally clear and he divides the flow of his speech into meaningful word groups with good use of stress and intonation (normally we work eight hours a day, five days a week—that’s forty hours in total). Generally, he can be understood, but occasionally some words are hard to catch because of mispronunciation of sounds (bose for “both”; yoursels for “yourself”; cupper years for “couple of years”; zen for “then”; word for “world”).

4.3. Example 3 Alexandra Band 7

Alexandra, who is from Colombia, talks about “famous people”.

Excerpt 3

T: Do you think of a celebrity accepts those endorsements? Do you think they lose their honesty?

S: +I don’t think that they will lose like honesty, but um: maybe it will change them. They will have to + accept the environment that they are living into right now. So they if get famous and + they are not selling their soul to the devil. I don’t think that they are the best I think that human and, if they like it better for them.

We can see that Alexandra uses fewer filled pauses than Li and Stephen. She tends to use more short silent pauses. She seldom uses drawls. In this excerpt, she does not use smallwords or self-repairs.

She speaks quite fluently and gives appropriate and extended responses. She makes good use of a range of markers and linking words (first; actually; I think so; for example; in a lot of ways; that’s why). There is some hesitation, but it is mainly content-related as she seeks to clarify her ideas before expressing them. Coherence is not affected by these slight pauses.

Vocabulary is a strong feature of her performance and she uses a wide range, including some less common, idiomatic and colloquial items (lose your privacy; selling their soul to the devil; getting dumped; it depends on the target; we need a rest from the serious stuff). However, there are also a few examples of error and inappropriate word use (a small news; end of the relax evening; free dresses).

Her grammar displays a good range of both simple and complex structures that are used flexibly and a number of her sentences are error-free. However, there are some noticeable errors in areas such as articles, prepositions, subject/verb agreement and verb tense (if someone recognize you; if people follows; you will like them fail; it won’t be happen like this).

Although she has a noticeable accent, her pronunciation is generally clear and easy to follow. Stress and intonation are used well to enhance meaning (You don’t have to pay for a lot of stuff. They will give free dresses and free stays in the hotels). She has a tendency to use syllable-timing, which prevents her sustaining appropriate rhythm over longer utterances. She also has occasional problems with sounds (jung for “young”), but this has only minimal effect on intelligibility.

4.4. Example 4 Hendrik Band 7

Hendrik is from Germany, and he also talks about “famous people”.

Excerpt 4

T: How about movie stars, are they also famous in Germany?

S: umh: Yeah, they are also famous + but not the German movie stars or music stars. We don’t have a:: big + yeah at the branch of this more American first at the embassy yeah American movie stars. But it’s hard to + yeah: um: to think that they are our Walmart or anything like that they are living on the other side of the leg and yeah, + the are just in the television

Hendrik uses yeah very frequently in his speech, and he also produces many silent/filled pauses and drawls. He sometimes repeats himself. However, he uses the smallword or anything like that, which is often used by native speakers.

He can maintain the flow of speech without noticeable effort and there is no loss of coherence. He uses a variety of linking words and markers (I would say; that’s a good question; as I said; as long as), but he overuses the filler (yeah) and sometimes referencing is inaccurate (for the one or the other reasons).

He uses a wide range of vocabulary, including some less common and idiomatic items and effective collocation (easy to blame; global warming; financial crisis; he stands for something; can’t stand the pressure). However, sometimes he lacks precision in his choice of words and expressions (Greek instead of “Greece”; on the other side of the lake; environmentally people/things; a big branch).

His grammar displays a good range of both simple and complex structures. Many of his sentences are error-free but he makes some mistakes in subject/verb agreement (people who wants; the people who admires him), articles (the normal person) and relative pronouns (everything what happens).

His pronunciation is clear and easy to follow. He uses both sentence stress and intonation effectively to convey meaning (you can’t blame a soccer player but it’s easy to blame the politicians). He does have a noticeable accent, however, and his mispronunciation of a few words results in occasional loss of clarity (wole model for “role model”; wong for “wrong”; serf the planet for “serve the planet”).

4.5. Example 5 Khush Band 8

Khush is from India, and talks about “famous people”.

Excerpt 5

T: Very often the media reports on the most trivial things in important issue. But people are fascinated by those things. Why are people fascinated by?

S: Because these people like to listen to gossips and trivial things. They’re not interested um: in the bigger part of what’s happening, they have a simple life, they would like to live with their way. They are not bothered what is happening within the country. There are only few people you see are really interest in knowing about the country and you know, getting all the information and reacting.

Khush seldom uses pauses or repetitions, and she uses you know in her speech, which sounds more native-like. She speaks fluently and is able to give quite long and detailed responses without any loss of coherence. Hesitation is usually content-related and only occasionally to search for language. She uses fillers (you know; I mean) to cover this. Linking words and markers are used very naturally (that’s not the case; I’m fine with that).

Her vocabulary resource is wide and it allows her to talk about a range of topics with some flexibility and precision. There are plenty of examples of stylistically appropriate language (political pressure; into corruption; today’s world; offensive; promote the product; a money-making business) with only occasional inaccuracies (do a meeting; in a right/wrong manner).

She uses a wide range of structures with a high level of accuracy. She makes only occasional minor errors. She uses pronunciation well to reinforce meaning, with rhythm, stress and intonation all used appropriately (at times I do like). There are only occasional lapses in word stress and in the formation of “th”.

4.6. Example 6 Kopi Band 8

Kopi from Botswana also talks about “famous people”.

Excerpt 6

T: How are famous people used in advertising?

S: + Well, + same music, it reaches out to people. So the government and other companies and other bodies in the country are trying to use music to reach out to people and pass the messages. And that regards that I think music is playing a part in that way. So.

Kopi uses some smallwords, such as well, in his speech. There are no filled pauses in this excerpt. He speaks fluently but rather slowly, with occasional hesitation as he engages with the topics. He is able to give quite complex and detailed responses without any loss of coherence, drawing on a range of markers to introduce his ideas (in that way; in some way; we have a situation in our country; in that regards; in every respect).

He skillfully uses his wide vocabulary in a sophisticated way to express himself precisely and accurately (reaching out through music; I wouldn’t put it past them; significant level; growing trend; a ripple effect), although there are a few inappropriate word forms and choices (old generation instead of “older generation”; have a long way instead of “have a long way to go”; a step back instead of “a backward step”).

He uses a full range of sentence forms and grammatical structures naturally, accurately and appropriately.

He is easy to understand throughout the test, in spite of a slight accent. Occasional misplaced stress and vowel formation (misicians for “musicians”; Bread Pitt for “Brad Pitt”) only minimally affect intelligibility.

4.7. Example 7 Kenn Band 8.5

Kenn from Singapore talks about “famous people”.

Excerpt 7

T: No what about in the past would you say that the kinds of people famous in the past are different?

S: I think there is a lot that was going. Number of the television celebrities mainly because that television industry has thought of maturity and Singapore but other than that I think that is pretty much the same.

Kenn speaks very fast and there are almost no pauses in his speech. He also uses some smallwords, like other than that. He speaks fluently for most of the time and develops topics coherently and appropriately, with only slight content-related hesitations as he engages with the topics.

His vocabulary is precise and sophisticated throughout this part of the test (prominent businessmen; emulate; a growing number of television celebrities; to promote charitable causes; endorsing a cause; negative repercussions; conscious of body image; susceptible to; prevalent).

He uses a wide range of grammatical structures naturally and accurately, with no noticeable error. He also uses a full range of pronunciation features to convey precise and subtle meaning such as emphatic stress (one example that comes to mind is celebrities) and contrastive stress (it’s not necessarily for causes… it’s also for celebrity behaviors). He sustains this flexible use of features of connected speech throughout and is effortless to understand.

4.8. Example 8 Anuradha Band 9

Anuradha from Malaysia talks about “famous people”.

Excerpt 8

T: Because of many cultures quite the opposite has happened where politicians used to be quite well-known. Whereas nowadays movie stars, television stars are more well-known. What do you think about in the future? Will this going to continue? Politicians will continue to be?

S: I think definitely in the future because the world is becoming more globalized Malaysians would have, I think, have a tendency to be exposed to more international programs and they know more international celebrities compared to the local actors and actresses or local politicians. So we would follow international politics, maybe American and British politics or even the models or actresses internationally.

Anuradha talks smoothly. Hesitation markers can hardly be found in her speech. She speaks fluently, with only rare repetition or self-correction. Any hesitation reflects a search for ideas rather than a search for language. Her speech is coherent, with fully appropriate cohesive features (if you’re talking about; other than that; I think it’s more; as you can see).

She uses vocabulary with full flexibility and precision in all topics with a wide range of idiomatic language (have a tendency; be exposed to; the world is becoming more globalized; the norm; strikes a chord; communication tool; actors that sponsor; materialistically; cool gadgets; grasp of people’s mindset).

Her grammatical structures are precise and accurate at all times. She uses a full and natural range of structures and sentence types and makes no noticeable errors. She uses a full range of phonological features with precision and subtlety. The rhythm of her language is sustained throughout and stress and intonation are invariably used to good effect. This and her very clear production of individual words and sounds result in her being effortless to understand.

5. Discussion

Low-proficiency speakers are more likely to hesitate than high-proficiency speakers. This is because, in addition to deciding what to say next (“conceptualization”), speakers have to work out how to say it (“formulation”). Given that the language in which learners express themselves is not their mother tongue but a foreign or second language, usually imperfectly acquired, this second stage normally involves more difficulties for them. The above analysis, however, has shown that not all categories of hesitation markers are overused by learners, as one might have expected. Low-proficiency speakers overuse pauses and other such non-lexical devices, but smallwords, on the other hand, tend to be significantly underrepresented in learner speech, well being a notable exception. One may wonder whether these differences between high-proficiency and low-proficiency speakers’ use of the hesitation function are just that, differences, or whether they should instead be viewed as pragmatic deficiencies that should somehow be remedied. This is the issue addressed in this section.

Pragmatic differences have been given considerable attention in the literature on English as a Lingua Franca (ELF), i.e. English as a means of communication between speakers with different mother tongues (see e.g., Seidlhofer, 2005). According to the advocates of ELF, only those features which cause misunderstanding should be eradicated. Features which differ from native English but allow mutual intelligibility, on the other hand, are tolerated (or even promoted). In this context many cross-cultural encounters are claimed to be successful, and according to Aston, “interlanguage pragmatics should operate with a difference hypothesis rather than a deficit hypothesis”. Hesitation phenomena such as those investigated here do not normally lead to misunderstanding or communicative breakdown. They are at best “‘ripples’ on the pragmatic surface” (Seidlhofer, 2001: p. 147). As such, they should not qualify for the label of “deficiencies”, but should instead be considered as mere differences, which are “non-fatal” (Jordan & Fuller, 1975) to the conversation. In what follows, however, we would like to argue that markers of hesitation may have a role to play in the success (or otherwise) of interactions, and that it is precisely those markers that are overused by learners which may be detrimental to the conversation, whereas the markers they underuse help make the pragmatic “ripples” smoother.

Let us consider silent pauses. Not only do they fulfill the function of hesitation, but they may also indicate that the speaker has finished his/her turn and that the floor is now open. Silences, therefore, may be misinterpreted, and the learner who overuses them runs the risk of losing his/her turn when s/he was just trying to gain some time. This is especially true of long pauses (three seconds or more), where speaking may be “declared to have stopped rather than merely paused” (Griffiths, 1991: p. 346). Pauses of one second or less are comparatively well tolerated, one second being, according to Jefferson (1989), the “standard maximum silence” in interactions. It should also be noted that the position of the silent pause (not examined here) may be relevant, as Lennon (1990: p. 393) points out, with pauses occurring at major syntactic boundaries being more easily accepted (and, we would add, less likely to be misinterpreted) than pauses occurring within syntactic units. Yet, whatever their length or position, silent pauses have a feature, shared by other non-lexical markers of hesitation such as fillers or drawls, which makes them undesirable in interactions, especially when they are overrepresented: they are, in Möhle’s (1984: p. 36) words, “communicatively disturbing”. More precisely, these markers, often referred to as “temporal variables” (Grosjean, 1980), have been shown to contribute to the impression of non-fluency among EFL speakers (see Lennon, 1990). In comparison, the (native-like) use of smallwords of hesitation enables the speaker to hold the floor and stall for time, but in addition, gives an impression of fluency, as convincingly demonstrated by Hasselgren (2002).
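The two duration thresholds mentioned above lend themselves to a simple operationalization, should pause lengths be measured in a future quantitative analysis. The following is only a minimal sketch in Python under stated assumptions: the function names, the category labels, and in particular the treatment of pauses between one and three seconds as a separate “intermediate” band are illustrative choices that are not taken from the studies cited, and the durations in the usage note are invented.

```python
# Thresholds taken from the discussion above; everything else
# (names, labels, the 1-3 second band) is an illustrative assumption.
STANDARD_MAX_SILENCE = 1.0  # seconds, after Jefferson (1989)
LONG_PAUSE = 3.0            # seconds, after Griffiths (1991)


def classify_pause(duration: float) -> str:
    """Assign a silent pause (in seconds) to one of three bands."""
    if duration < 0:
        raise ValueError("pause duration cannot be negative")
    if duration <= STANDARD_MAX_SILENCE:
        # One second or less: comparatively well tolerated.
        return "tolerated"
    if duration >= LONG_PAUSE:
        # Three seconds or more: may be heard as the end of the turn.
        return "risk_of_turn_loss"
    # Between the two thresholds: not characterized in the text.
    return "intermediate"


def summarize_pauses(durations: list[float]) -> dict[str, int]:
    """Count how many pauses fall into each band."""
    counts = {"tolerated": 0, "intermediate": 0, "risk_of_turn_loss": 0}
    for duration in durations:
        counts[classify_pause(duration)] += 1
    return counts
```

For instance, calling summarize_pauses([0.4, 1.5, 3.2, 0.9]) on these hypothetical durations would count two tolerated pauses, one intermediate pause and one pause long enough to risk losing the turn. Such counts say nothing about the position of the pause, which, as noted above, may matter as much as its length.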

The key issue here seems to be fluency, that is, “the ability to contribute to what a listener, proficient in the language, would normally perceive as coherent speech, which can be understood without undue strain, and is carried out at a comfortable pace, not being disjointed, or disrupted by excessive hesitation” (Hasselgren, 2002: p. 148). Although fluency would not be considered as one of the “core” features of ELF, since it is not crucial to intelligibility (it just helps to be “understood without undue strain”), it is nonetheless an important aspect of oral language. As Lennon (1990: pp. 391-392) explains, “fluency reflects the speaker’s ability to focus the listener’s attention on his or her message by presenting a finished product rather than inviting the listener to focus on the working of the production mechanisms”. In other words, it makes it possible for the listener to concentrate on what should be central to an utterance, namely its content. Fluency is crucial in the acquisition of a foreign/second language, as witnessed for example by the Common European Framework of Reference for Languages, which lists spoken fluency as one of the two generic qualitative factors determining the functional success of the learner (the other one being propositional precision) and requires that learners at the C2 level be able to “express [themselves] at length with a natural, effortless, unhesitating flow”. To the question of whether learners’ use of the hesitation function is “deficient” or merely “different”, we would therefore argue for the former. If learners are to achieve native-like proficiency, which, despite the claims made by ELF, is still a goal pursued by many of them (Mukherjee, 2005) and one that is pedagogically sound (Kuo, 2006), they have to learn how to deal with hesitation (which is part and parcel of any unplanned spoken interaction) in a way which does not impair fluency.

It therefore seems important to incorporate the function of hesitation into the (advanced) ELT curriculum, at least in the form of awareness-raising activities. Until recently, students were invariably presented with “aseptic” spoken texts (both in reading and listening comprehension tasks), from which all hesitation markers had been removed. Lately, mainly under the impetus of corpus linguistics and the ensuing wave of “authenticity”, hesitation markers have started to creep into textbooks. Because of the lack of salience of such markers, however, simple exposure is not enough to raise students’ consciousness. Hesitation markers need to somehow “emerge” and be brought to students’ attention by means of appropriate activities. This could involve addressing issues such as the non-universality of fillers, the variety of hesitation markers, the multifunctionality of smallwords or the role of hesitation as a politeness strategy. Learners should be taught to rely less on pauses and other non-lexical devices, which are overused and “communicatively disturbing”, and to have recourse, instead, to smallwords, since these are less disruptive and “oil… the wheels of verbal interaction” (Stubbe & Holmes, 1995: p. 63).

6. Conclusion

Hesitation phenomena are inherent in spontaneous speech, whether produced by low-proficiency or high-proficiency speakers. As noted by Lennon (1990), it is therefore not the presence vs. absence of such features that distinguishes between low-proficiency and high-proficiency speakers’ performance, but their frequency and distribution. Advanced learners of English overuse pauses and other non-lexical devices, but tend to underuse smallwords such as like, I mean or you know. This is quite unfortunate since non-lexical hesitation markers are precisely those that give an impression of non-fluency, whereas smallwords “keep our speech flowing” (Hasselgren, 2002: p. 150). For this reason, learners’ use of the hesitation function has been described as “deficient”, rather than just “different”, and it has been suggested that the function deserves a place in the (advanced) FLT curriculum. The idea is not to eliminate hesitations, which are inseparable from spontaneous speech, but to equip learners with techniques of hesitation that are less disruptive to the interaction.

Fox Tree & Clark (1997: pp. 166-167) note that “…spontaneous speech is replete with signals about the actual process of production”. They add that “…any model of production will be incomplete until it accounts for these signals, including how they are planned and produced on the fly”. This paper has gone some way towards accounting for such aspects in non-native speech. Many more avenues need to be explored, however. To give but two examples, one could examine whether any influence of the mother tongue is noticeable in learners’ use of hesitation markers, or whether interviewers react differently to native and non-native speakers’ ways of hesitating.

There are some limitations in this study. Firstly, the sample size is small because of limited time; more videos of the IELTS Speaking Test will be collected in the future to improve this study. Secondly, the IELTS Speaking Test has four categories of band descriptors: Fluency and Coherence, Lexical Resource, Grammatical Range and Accuracy, and Pronunciation. We focus only on Fluency and Coherence, so the overall band is not an accurate criterion for our purposes. Finally, a mixed-methods approach should be used in this research: quantitative analysis should be added in the future to make the results more convincing.

Cite this paper: Wang, Y. (2021) A Study on the Use of Hesitation Markers in Varied-Level EFL Learners’ L2 Speaking Process. Open Journal of Modern Linguistics, 11, 823-840. doi: 10.4236/ojml.2021.115063.

[1]   Aijmer, K. (2002). English Discourse Particles. Evidence from a Corpus. John Benjamins Publishing Company.

[2]   Beattie, G. W. (1980). The Role of Language Production Processes in the Organization of Behaviour in Face-to-Face Interaction. In B. Butterworth (Ed.), Language Production: Speech and Talk (pp. 69-107). Academic Press.

[3]   Burns, A., & Joyce, H. (1997). Focus on Speaking. National Centre for English Language Teaching and Research.

[4]   Butterworth, B. (1980). Evidence from Pauses in Speech. In B. Butterworth (Ed.), Language Production: Speech and Talk (pp. 155-176). Academic Press.

[5]   Bygate, M. (2001). Effects of Task Repetition on the Structure and Control of Oral Language. In M. Bygate, P. Skehan, & M. Swain (Eds.), Researching Pedagogic Tasks: Second Language Learning, Teaching and Testing (pp. 23-48). Pearson Education.

[6]   Chambers, F. (1997). What Do We Mean by Fluency? System, 25, 535-544.

[7]   Clark, H. H., & Fox Tree, J. E. (2002). Using uh and um in Spontaneous Speaking. Cognition, 84, 73-111.

[8]   Dörnyei, Z. (1995). On the Teachability of Communication Strategies. TESOL Quarterly, 29, 55-85.

[9]   Fillmore, C. J. (1979). On Fluency. In C. J. Fillmore, D. Kempler, & W. S. Y. Wang (Eds.), Individual Differences in Language Ability and Language Behavior (pp. 85-101). Academic Press.

[10]   Foster, P. (2001). Rules and Routines: A Consideration of Their Role in the Task-Based Language Production of Native and Non-Native Speakers. In M. Bygate, P. Skehan, & M. Swain (Eds.), Researching Pedagogic Tasks: Second Language Learning, Teaching and Testing (pp. 75-93). Pearson Education.

[11]   Foster, P., Tonkyn, A., & Wigglesworth, G. (2000). Measuring Spoken Language: A Unit for All Reasons. Applied Linguistics, 21, 354-375.

[12]   Fox Tree, J. E., & Clark, H. H. (1997). Pronouncing “the” as “thee” to Signal Problems in Speaking. Cognition, 62, 151-167.

[13]   Fulcher, G. (2003). Testing Second Language Speaking. Pearson Education.

[14]   Gilquin, G. (2008). Hesitation Markers among EFL Learners: Pragmatic Deficiency or Difference? In J. Romero-Trillo (Ed.), Pragmatics and Corpus Linguistics: A Mutualistic Entente (pp. 119-149). Mouton de Gruyter.

[15]   Griffiths, R. (1991). Pausological Research in an L2 Context. A Rationale, and Review of Selected Studies. Applied Linguistics, 12, 345-364.

[16]   Grosjean, F. (1980). Temporal Variables within and between Languages. In H. W. Dechert, & M. Raupach (Eds.), Towards a Cross-Linguistic Assessment of Speech Production (pp. 39-53). Peter Lang.

[17]   Hasselgren, A. (2002). Learner Corpora and Language Testing. Smallwords as Markers of Learner Fluency. In S. Granger, J. Hung, & S. Petch-Tyson (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching (pp. 143-173). John Benjamins Publishing Company.

[18]   Jefferson, G. (1989). Preliminary Notes on a Possible Metric Which Provides for a “Standard Maximum” Silence of Approximately One Second in Conversation. In D. Roger, & P. Bull (Eds.), Conversation: An Interdisciplinary Perspective (pp. 166-196). Multilingual Matters.

[19]   Jordan, B., & Fuller, N. (1975). On the Non-Fatal Nature of Trouble: Sense-Making and Trouble-Managing in Lingua Franca Talk. Semiotica, 13, 11-31.

[20]   Kormos, J., & Dénes, M. (2004). Exploring Measures and Perceptions of Fluency in the Speech of Second Language Learners. System, 32, 145-164.

[21]   Kuo, I.-C. (Vicky) (2006). Addressing the Issue of Teaching English as a Lingua Franca. ELT Journal, 60, 213-221.

[22]   Lennon, P. (1990). Investigating Fluency in EFL: A Quantitative Approach. Language Learning, 40, 387-417.

[23]   Lynch, T., & Maclean, J. (2001). A Case of Exercising: Effects of Immediate Task Repetition on Learners’ Performance. In M. Bygate, P. Skehan, & M. Swain (Eds.), Researching Pedagogic Tasks: Second Language Learning, Teaching and Testing (pp. 141-162). Pearson Education.

[24]   Maclay, H., & Osgood, C. E. (1959). Hesitation Phenomena in Spontaneous English Speech. Word, 15, 19-44.

[25]   Mehnert, U. (1998). The Effects of Different Lengths of Time for Planning on Second Language Performance. Studies in Second Language Acquisition, 20, 83-108.

[26]   Möhle, D. (1984). A Comparison of the Second Language Speech Production of Different Native Speakers. In H. W. Dechert, D. Möhle, & M. Raupach (Eds.), Second Language Productions (pp. 26-49). Gunter Narr Verlag.

[27]   Mukherjee, J. (2005). The Native Speaker Is Alive and Kicking: Linguistic and Language Pedagogical Perspectives. Anglistik, 16, 7-23.

[28]   Raupach, M. (1987). Procedural Learning in Advanced Learners of a Foreign Language. In J. Coleman, & R. Towell (Eds.), The Advanced Language Learner (pp. 123-156). CILTR.

[29]   Romero Trillo, J. (1994). Ahm, ehm, You Call It Theme? … A Thematic Approach to Spoken English. Journal of Pragmatics, 22, 495-509.

[30]   Schiffrin, D. (1987). Discourse Markers. Cambridge University Press.

[31]   Seidlhofer, B. (2001). Closing a Conceptual Gap: The Case for a Description of English as a Lingua Franca. International Journal of Applied Linguistics, 11, 133-158.

[32]   Seidlhofer, B. (2005). English as a Lingua Franca. ELT Journal, 59, 339-341.

[33]   Shehadeh, A. (1999). Non-Native Speakers’ Production of Modified Comprehensible Output and Second Language Learning. Language Learning, 49, 627-675.

[34]   Skehan, P. (2001). Tasks and Language Performance Assessment. In M. Bygate, P. Skehan, & M. Swain (Eds.), Researching Pedagogic Tasks: Second Language Learning, Teaching and Testing (pp. 167-185). Pearson Education.

[35]   Stubbe, M., & Holmes, J. (1995). You Know, eh and Other “Exasperating Expressions”: An Analysis of Social and Stylistic Variation in the Use of Pragmatic Devices in a Sample of New Zealand English. Language & Communication, 15, 63-88.

[36]   Summers, D. (Ed.) (1995). Longman Dictionary of Contemporary English (3rd ed.). Longman Group Ltd.

[37]   Tavakoli, P., & Skehan, P. (2005). Strategic Planning, Task Structure, and Performance Testing. In R. Ellis (Ed.), Planning and Task Performance in a Second Language (pp. 239-273). John Benjamins.

[38]   Temple, L. (2000). Second Language Learner Speech Production. Studia Linguistica, 54, 288-297.

[39]   Towell, R., Hawkins, R., & Bazergui, N. (1996). The Development of Fluency in Advanced Learners of French. Applied Linguistics, 17, 84-119.

[40]   Wiese, R. (1984). Language Production in Foreign and Native Languages: Same or Different? In H. W. Dechert, D. Möhle, & M. Raupach (Eds.), Second Language Productions (pp. 11-25). Gunter Narr Verlag.