This blog is dedicated to criticism. It offers

  1. Critical suggestions and resources for those doing post graduate courses in teaching English as a foreign language.
  2. A critical appraisal of what’s happening in the world of English Language Teaching.

The commercialisation of the ELT industry (estimated to be worth more than $20 billion) and the corresponding weakening of genuinely educational concerns mean that today most teachers are forced to teach in a way that shows scant regard for their worth, their training, their opinions, their job satisfaction, or the use of appropriate methods and materials. The biggest single reason for this sorry state of affairs, and the biggest single obstacle to good ELT, is the coursebook.

Using a coursebook entails teachers leading students through successive units of a book. Each unit concentrates on a certain topic, where isolated bits of grammar and vocabulary are dealt with on the assumption that students will learn them in the order in which they’re presented. Such an approach to ELT flies in the face of research which suggests that SLA is a process whereby the learner’s interlanguage (a dynamic, idiosyncratic, evolving linguistic system approximating to the target language) develops as a result of communicating in the target language, and is impervious to attempts to impose the sequences found in coursebooks.

The publishing companies that produce coursebooks spend enormous sums of money on marketing, aimed at persuading stakeholders that coursebooks represent the best practical way to manage ELT. As an example, key players in the British ELT establishment (the British Council, the Cambridge Examination Boards, and the Cambridge CELTA and DELTA teacher training bodies among them) accept the coursebook as central to ELT practice. Worse still, TESOL and IATEFL, bodies that are supposed to represent teachers’ interests, have also succumbed to the influence of the big publishers, as their annual conferences make clear. So the coursebook rules, at the expense of teachers, of good educational practice, and of language learners.

By critically assessing the published views of those in the ELT establishment who promote coursebook-driven ELT, this blog hopes to lend support to those who fight for a less commercial, less centralised, more egalitarian, more learner-centred approach to ELT.

Materials Evaluation


Here’s a vocabulary exercise I found while browsing through material that Gerry Sweeny, a one-time colleague at ESADE Idiomas, gave me.

Vocabulary in Context

The following sentences contain nonsense words. Can you make sense of them?

  1. The sentence was written on a piece of drurb.
  2. Most drurb, like snow, is osgrave.
  3. Cats are domestic ningles.
  4. Polar bears, which are osgrave ningles, live where there is cridlington.
  5. If you set fire to drurb, it firtles.
  6. If you pour narg on firtling drurb, the flames go out.
  7. If you put cridlington into hot narg, it frumes.
  8. Cridlington frumes at a bazoota over 0° C.
  9. Narg boobles at a bazoota of 100° C.
  10. We frize bazootas with a nast.

What do you think the nonsense words mean in the above sentences?

  1. drurb
  2. osgrave
  3. ningles
  4. cridlington
  5. firtles
  6. narg
  7. frumes
  8. bazoota
  9. boobles
  10. frize
  11. nast



I’m currently looking through material available to members of the Cooperativa de Serveis Linguistics de Barcelona, with the idea of getting a materials bank together which would help members to avoid using coursebooks. While there’s an abundance of ELT materials available online, it’s difficult to quickly find material that satisfies a few basic criteria, such as relevance, quality, usability and legality. Neill McMillan and I met recently, and we reckon that we need to assemble a lot of material which satisfies these criteria, or rather, well-considered criteria that we can all agree on, and then classify it according to fields such as, off the top of my head, level, topic, media, grammar point, and skill. The idea is to give members access to a database of materials where they can find written and spoken texts, with accompanying worksheets, at a certain level, on a certain topic, etc., so that they can easily confect everything from an ESP course with appropriate tasks, to lesson plans, to fillers. Maybe you’re only looking for a text; maybe you’re looking for a text plus worksheet; maybe you’re looking for a fresh approach to practising a function; maybe you need a good clear explanation of some grammar point; maybe you’re trying to get together a proposal for a 50-hour course aimed at auditors, and so on. I should add that I have a particular interest in developing a process syllabus, which I’ve discussed in a previous post and which relies on a materials bank.


So we see the challenges of this project as being to decide on the criteria for any bit of material, to decide on how the collection of individual bits of material is organised in the database, and to indicate links among them.

Looking at the worksheet above, what to do? Supposing that it were well presented, and that there were no copyright issues, does it warrant inclusion? Is its openness a good thing (allowing teachers to exploit it in their own way), or does it need some lead-in and some further work? Is it useful, anyway? More generally, how do we judge its worth? If you look at most of the literature on materials evaluation, you’ll be hard put to apply the frameworks to this, because most frameworks are, either explicitly or implicitly, geared to coursebooks. Rather than indulge in a rant, I invite you to give your opinion. If you were getting a materials bank together, would you include this?

Dumb bells in the Language Gym


The Language Gym follows the classic self-help format: I’ll tell you the answers to all your worries and fears (about language teaching) but you need to park your critical faculties at the front door. The posts are stridently prescriptive, shamelessly self-promotional, and dumbly dogmatic, with titles like these:

  • 10 commonly made mistakes in vocabulary instruction
  • Eight motivational theories and their implications for the classroom
  • Six ‘useless’ things foreign language teachers do


The author of this blog is Gianfranco Conti, who never tires of selling himself and his terrible book. A few examples from recent posts:

  • But I do have a teacher-training background, a PhD in Applied Linguistics and an MA in TEFL on top of 25 years language teaching experience.
  • As professor Macaro, former Head of the Oxford University Education Department, wrote in his excellent review of our book ‘The Language Toolkit’ (click here) …
  • I have had to adopt feedback-to-writing strategies that are not aligned with my espoused theory of L2 learning and current research wisdom – despite having a PhD in error correction in second language writing.
  • Since posting my three articles on listening … I have been flooded with e-mail, Twitter and Facebook messages from teachers worldwide
  • My students conjugate verbs every day on the http://www.language-gym conjugator… often scoring 90 -100%

Every post has references to his book, and ends with a plug for it.

Well, “no harm done” you might reasonably say, and maybe none is. Still, in his two most recent posts, Dr. Conti says a few things that I think need commenting on.


1. Principled Teaching

In his latest post Conti argues that ELT must be grounded in a deep understanding (like his) of SLA. He says that teachers need to ask themselves these 3 questions:

  1. How are foreign languages learnt?
  2. What are the implications of the answer to question (1) for language teaching and learning?
  3. Is the answer to (2) truly reflected in your own teaching practice?

We’ll skip all the preamble, where Conti explains how his abundant qualifications and experience make him more ready than most teachers to be a “reflective practitioner”, and look at his answer to Question 1. He says this:

Cognitive models of language acquisition (especially Skill-based theories and Connectionism) provided the basis for my espoused theory of learning and shaped much of what you read in my blogs and of what I have been doing in the classroom for the last 20 years.

I couldn’t find anything about Connectionism in the gym, but there are certainly quite a few posts where we’re told how learners’ brains work, and how getting things from their working memory into their long-term memory is the secret of all teaching and learning. So let’s have a look at the theory which provides the basis for Conti’s principled teaching.


Skill Acquisition Theory

As a general learning theory, skill acquisition theory argues that when you start learning something, you do so through largely explicit processes; then, through subsequent practice and exposure, you move into implicit processes. So you go from declarative knowledge to procedural knowledge and the automatization this brings. Declarative knowledge involves explicit learning or processes; learners obtain rules explicitly and have some type of conscious awareness of those rules. The automatization of procedural knowledge entails implicit learning or processes; learners proceduralise their explicit knowledge, and through suitable practice and use, the behaviour becomes automatic.

Quite a few objections have been raised to this theory. First, the lack of an operational definition undermines the various versions of skill acquisition theory that Conti has referred to: there is no agreed operational definition for the constructs “skill”, “practice”, or “automatization”. Partly as a result, but also because of methodological issues (see, for example, DeKeyser, 2007), the theory is under-researched; there is almost no empirical support for it.

Second, skill acquisition theory is in the “strong-interface” camp with regard to the vexed issue of the roles of explicit and implicit learning in SLA. It holds that explicit knowledge is transformed into implicit knowledge through the process of automatization as a result of practice. Many, including perhaps most famously Krashen, dispute this claim, and many more point to the fact that the theory does not take into account the role played by affective factors in the process of learning. Practice, after all, does not always make perfect.

Third, the practice emphasized in this theory is effective only for learning similar tasks: it doesn’t transfer to dissimilar tasks. Therefore, many claim that the theory disregards the role that creative thinking and behaviour plays in SLA.

Fourth, to suggest that the acquisition of all L2 features starts with declarative knowledge is to ignore the fact that a great deal of vocabulary and grammar acquisition in an L2 involves incidental learning where no declarative stage is involved.

In my opinion, the most important weakness of skill acquisition theory is that it fails to deal with the sequences of acquisition which have been the subject of hundreds of studies in the last 50 years, all of them supporting the construct of interlanguages.

We may conclude that while there are some interesting aspects of skill acquisition theory, it is both poorly constructed and incomplete. Given the current state of SLA theory, and given the essentially unscientific nature of the craft of language teaching, the strident claims made by Conti are unwarranted. Insofar as he gives the impression that he knows how people learn a foreign language, and that he knows how to use this knowledge to build the best methodology for ELT, Conti is as deluded as those who use their websites to peddle homeopathic pills.


Planting Seeds

The limitations of Conti’s understanding of SLA are evident in his previous post “The seed-planting technique …”, where he says:

effective teaching and learning cannot happen without effective curriculum design…… A well-designed language curriculum plans out effectively when, where and how each seed should be sown and the frequency and manner of its recycling with one objective in mind : that by the end of the academic year the course’s core language items are comprehended/produced effectively across all four language skills under real life conditions.

This amounts to what Breen (1987) calls a “Product” syllabus, what White calls a “Type A” syllabus and what Long (2011 and 2015) calls a “Synthetic” syllabus. The key characteristic of Conti’s “effective curriculum” is that it concentrates on WHAT is to be learned. The designer decides on the content, which is divided up into bits of lexis and grammar that are presented and practiced in a pre-determined order (planting “seeds” which precede the scheduled main presentation and subsequent recycling). The syllabus is external to the learner, determined by authority. The teacher is the decision maker, and assessment of success and failure is done in terms of achievement or mastery.

The problem with Conti’s curriculum is that he relies on skill acquisition theory, which makes two false assumptions. First, it assumes that declarative knowledge is a necessary precursor to procedural knowledge, and second, it assumes that learners learn what teachers teach them, an assumption undermined by all the evidence from interlanguage studies. We know that learners, not teachers, have most control over their language development. As Long (2011) says:

Students do not – in fact, cannot – learn (as opposed to learn about) target forms and structures on demand, when and how a teacher or a coursebook decree that they should, but only when they are developmentally ready to do so. Instruction can facilitate development, but needs to be provided with respect for, and in harmony with, the learner’s powerful cognitive contribution to the acquisition process.

Even when presented with, and drilled in, target-language forms and structures, even when errors are routinely corrected, and even when the bits and pieces are “seeded” and recycled in various ways, learners’ acquisition of newly-presented forms and structures is rarely either categorical or complete, and it is thus futile to plan the curriculum of an academic year on the assumption that the course’s “core language items” will be “comprehended/produced effectively” by the end of the year. Acquisition of grammatical structures and sub-systems like negation or relative clause formation is typically gradual, incremental and slow, sometimes taking years to accomplish. Development of the L2 exhibits plateaus, occasional movement away from, not toward, the L2, and U-shaped or zigzag trajectories rather than smooth, linear contours. No matter what the order or manner in which target-language structures and vocabulary are presented to them by teachers, learners analyze the input and come up with their own interim grammars, the product broadly conforming to developmental sequences observed in naturalistic settings. They master the structures in roughly the same manner and order whether learning in classrooms, on the street, or both. This led Pienemann to formulate his learnability hypothesis and teachability hypothesis: what is processable by students at any time determines what is learnable, and, thereby, what is teachable (Pienemann, 1984, 1989).

Once again, the hyped-up sales pitch turns out to be unwarranted. The carefully-planned, “principled” curriculum Conti showcases is nothing more than an old-fashioned product syllabus, with a few bells and whistles, or rather dumbbells and seeds, thrown in.



Breen, M. (1987) Learner contributions to task design. In C. Candlin and D. Murphy (eds.), Language Learning Tasks. Englewood Cliffs, N.J.: Prentice Hall. 23-46.

DeKeyser, R. (2007) Skill acquisition theory. In B. VanPatten & J. Williams (eds.), Theories in Second Language Acquisition: An Introduction (pp. 97-113). New Jersey: Lawrence Erlbaum Associates.

Long, M. (2011) Language teaching. In Doughty, C. and Long, M. (eds.), Handbook of Language Teaching. New York: Routledge.

Long, M. (2015) SLA and TBLT. New York: Routledge.

White, R.V. (1988) The ELT Curriculum: Design, Innovation and Management. Oxford: Basil Blackwell.

Selivan explains Trump speech similarities. No pigs seen flying


Leo Selivan has a rather off-hand way of treating research findings in corpus linguistics: he often uses undefined terms and blurred summaries to support his own particular view of ELT, which, let’s not forget, includes the breathtaking injunction “Never teach single words”. In his most recent post, Selivan repeatedly uses the term “chunks” without defining it, misrepresents Pawley and Syder’s 1983 paper, and then examines an excerpt from Melania Trump’s recent speech to the Republican National Convention in order to demonstrate that “chunks”, not blatant plagiarism, explain the similarities with M. Obama’s 2008 speech.


1.  “Chunks”

Selivan says:

corpus research …. has shown that language is highly formulaic, i.e. consisting of recurring strings of words, otherwise known as “chunks”. What makes them chunks is the fact that they are stored in and retrieved from memory as ‘wholes’ rather than generated on a word-by-word basis at the moment of language production. 

Two comments are in order.

a)  What makes recurring strings of words “chunks” is not how they’re memorised, but rather their form.

b) It is not “a fact” that chunks are stored in and retrieved from memory as ‘wholes’. The hypothesis suggested by Pawley and Syder is that certain types of strings of words are memorised and recalled in a certain carefully described way. By definition, a hypothesis is not a fact: it is a tentative explanation which attempts to account for a problem.


2. Pawley and Syder

Selivan says:

The formulaic nature of language was first brought to the fore in a seminal paper by Australian linguist Andrew Pawley and his colleague Frances Syder, who pointed out that competent language users have at their disposal hundreds of thousands of ready-made phrases (Pawley and Syder 1983).

Pawley and Syder’s paper was a lot more nuanced than Selivan suggests. They argued that control of a language entails knowledge of more than just a generative grammar, and that ‘memorized sentences’ and ‘lexicalized sentence stems’ (not “ready-made phrases”) were important additional parts of linguistic competence, useful in explaining the two puzzles of “nativelike selection” and fluency. As they say:

The terms refer to two distinct but interrelated classes of units, and it will be suggested that a store of these two unit types is among the additional ingredients required for native control (Pawley and Syder, 1983, p. 204).

When discussing ‘lexicalized sentence stems’, Pawley and Syder make it clear that these stems often include parts which can be transformed in various ways. They also admit that there are many problems in the treatment of lexicalized sentence stems.

How is a lexicalized sentence stem defined? How do you tell it apart from non-lexicalized sequences? There is no simple operation for doing this. The problem is essentially the same as in distinguishing any morphologically complex lexical item from other sequences; the question is what is ‘lexicalization’? What makes something a lexeme? ….  An expression may be more or less a standard designation for a concept, more or less clearly analysable into morphemes, more or less fixed in form, more or less capable of being transformed without change of meaning or status as a standard usage, and the concept denoted by the expression may be familiar and culturally recognized to varying degrees. Nor is there a sharp boundary between the units termed here ‘sentence stems’ and other phraseological units of a lower order (Pawley and Syder, 1983, p. 207).


3. The Speech

With regard to Melania Trump’s speech, Selivan looks at one of the copied parts and comments on the common uses of “impress upon”, and the ubiquity of the phrases “work hard” and “keep promise” (sic). As a clincher, Selivan says:

Looking at “treat people with respect” which is supposedly copied from Michelle Obama’s “treat people with dignity and respect”, you will see that “dignity” and “respect” are two of the very highly likely collocates here.

From this carefully assembled evidence, Selivan concludes:

If Melania’s faux pas indeed constitutes plagiarism, the text of her speech was no more plagiarized than an academic paper containing “Recent research has shown that” or “The results are consistent with data obtained in…”

Apart from the sentence being very badly constructed, and the claim being a ridiculous non-sequitur, can you imagine anybody seriously saying that the use of “Recent research has shown that” or “The results are consistent with data obtained in…” by an academic in a published paper constitutes plagiarism? Likewise, who but Selivan, with his Humpty-Dumpty use of “chunks”, could seriously offer the analogy in order to defend Melania Trump from the accusation of plagiarism?

Here’s an extract from the recent speech:

M Trump: Because we want our children in this nation to know that the only limit to your achievements is the strength of your dreams and your willingness to work for them.

And here’s an extract from the 2008 speech:

M. Obama: Because we want our children — and all children in this nation — to know that the only limit to the height of your achievements is the reach of your dreams and your willingness to work for them.

To attempt to explain the “similarities” between the two texts by appealing to “recurring sequences” is an indication of how far a little knowledge can lead one astray.


Pawley, A. and Syder, F.H. (1983) Two puzzles for linguistic theory: nativelike selection and nativelike fluency. In Richards, J.C. and Schmidt, R.W. (eds.), Language and Communication. London; New York: Longman, pp. 191-225. *

*As Selivan usefully points out, this article is available online at



Mura Nava’s “Quick Cups of COCA”


Mura’s blog EFL Notes is an excellent source of up-to-date, well-considered information on using corpora in ELT. Mura uses his elegant blog to talk to teachers about how to use concordancers in their jobs, and he’s recently published “Quick Cups of COCA”, which I thoroughly recommend. You can download this gem from his website, and you should do it today.

Using a concordancer to search corpora for information about the English language is a rewarding activity for anybody involved in ELT. It’s fascinating, absorbing and revealing, and it helps us to see the limitations of the explanations of grammatical forms, lexis, and lexical chunks that are offered by current coursebook writers, including those who claim to be implementing a lexical approach.

A concordancer helps you to examine these questions:

  • What words occur in the corpus (a body of texts)?
  • How often does each word occur? (Frequency counts)
  • In how many different types of text (different subject areas, different modes, different mediums) does the word appear?
  • Are there any significant subsets? (For example, in English, the 700 most frequent words account for 70% of all text.)
  • What are the collocations of the target item?
  • What are the contexts in which the word appears?

Taking a word as the search item, a concordancer will list all the occurrences of the word in a text, count how often the word occurs, indicate what type of text the word appears in, and display the instances of the word in context in a variety of formats, the most usual being the Key Word In Context (KWIC) format, which lists all occurrences of the word in a one-line context.
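To make the KWIC idea concrete, here’s a minimal sketch in Python. It’s a toy illustration of my own (not the code behind COCA or any real concordancer): it counts token frequencies and lists each occurrence of a search word with a fixed window of left and right context, keyword aligned in the middle.

```python
import re
from collections import Counter

def frequencies(text):
    """Frequency count of word tokens (lower-cased)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def kwic(text, keyword, width=25):
    """List every occurrence of keyword with `width` characters of
    left and right context, the keyword aligned in the middle."""
    lines = []
    pattern = r"\b" + re.escape(keyword) + r"\b"
    for m in re.finditer(pattern, text, re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()].rjust(width)
        right = text[m.end():m.end() + width].ljust(width)
        lines.append(f"{left} {m.group()} {right}")
    return lines

corpus = ("I lived in Holland during the war. "
          "I haven't seen Jim for two months. "
          "She stayed with us during the summer holidays.")

print(frequencies(corpus).most_common(3))
for line in kwic(corpus, "during"):
    print(line)
```

A real concordancer adds text-type metadata, collocate statistics and sorting options on top of this core, but the underlying operation is just this: find, count, and display in context.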


Tim Johns was among the first to suggest that a concordancer could be used in the classroom, either as a “silent resource” (just waiting until somebody asked a question it could help with), or as a means of making materials. Mura continues Tim’s work, and he does it splendidly. He uses one of the very best corpora available for free consultation (which is accompanied by a very user-friendly concordancer), namely COCA, a corpus containing more than 520 million words of American English text: 20 million words each year 1990-2015, equally divided among spoken, fiction, popular magazines, newspapers, and academic texts.

Mura’s Quick Cups of COCA, which you can download from his site, is clear as a bell, uncluttered, interesting and thought-provoking. These are the tasks which he outlines:

  1. Using the wildcard asterisk to explore the difference between unmotivated and demotivated.
  2. How to look for synonyms.
  3. Variations of “bring to the boil”.
  4. Relative clauses.
  5. Lemmas (in this case benefit) and parts of speech.
  6. Compound words.
  7. Comparing words (in this case rotate and revolve).
  8. Clauses (in this case the verb claim: claim to have, claim to be, claim to know, etc.).
  9. Miscellaneous: possessives; past regular & irregular; progressive auxiliaries; passives.

Notice the breadth of the tasks. Mura has, I’m sure deliberately, chosen tasks that illustrate how broad is the sweep of questions that you can ask.

So many questions come to mind about the use of concordancers in ELT. There’s so much to discuss here, and, somewhat typically, Mura leaves us to muse for ourselves. I did my MA dissertation on concordancers (contact me if you’d like a copy) and I worked with Tim Johns and others to produce Microconcord, a concordancer published by OUP and still available (Google it). Here’s an example of a worksheet that I wrote 20 long years ago to accompany the Microconcord software:


Activity:  Examine the different ways that for and during are used.

Warm Up

How do you think the two words above are used? Here are two examples:

  I haven’t seen Jim for two months.

  I lived in Holland during the war.

A common mistake is:

  x I haven’t seen Jim during two months. x

As a preliminary description, we can say that for is used to say how long something lasts, and during is used to say when something happened, but only in reference to a given stretch of time, like the Second World War or the summer holidays, for example. For is much more common than during, and it is used in a wider variety of ways.

Write down a sentence of your own for each word.

Now we will see what the concordancer can find.


Note: Have the BASIC INSTRUCTIONS sheet with you, so that you can follow the steps.

  1. Load the program.
  2. Type in: during\for as the search words.
  3. Hit RETURN.
  4. You see the texts that the program is sorting through, and a running total of the number of examples it has found. When the total is 100, hit the Esc key.
  5. You see at the bottom of the screen a report on how many examples it found, and their frequency. Hit RETURN.
  6. You see the examples of the 2 words in the middle of lines of text. They are sorted with 1st Right as first priority, and Search Word as second priority.
  7. Use the arrow keys to look through the examples.
  8. Use the arrow keys to go to the examples of during.

QUESTION: What words occur after during? Are there any examples that surprise you? Write down some examples of words that come after during.

9. Now look at the examples of for. There are a lot, and the word is used in different ways.

QUESTION: How many of the examples refer to how long something lasts? Write down 5 examples.

QUESTION: Can you identify other ways that for is used? Try to find different categories. Write down 5 sentences that interest you.

QUESTION: What would you add to the explanation at the beginning of the exercise?
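If you don’t have the old DOS program to hand, the heart of the worksheet’s search (find both words, then sort the hits with 1st Right as first priority and the search word as second) can be imitated in a few lines of Python. This is a rough sketch of my own on a toy text, not the Microconcord implementation:

```python
import re

def search_and_sort(text, *keywords):
    """Find every occurrence of each keyword, then sort the hits by the
    word immediately to the right (1st Right) and by the search word
    itself (second priority), as in step 6 of the worksheet."""
    hits = []
    for kw in keywords:
        pattern = r"\b(" + re.escape(kw) + r")\b\s*(\w*)"
        for m in re.finditer(pattern, text, re.IGNORECASE):
            # sort key: (first word to the right, search word, hit)
            hits.append((m.group(2).lower(), m.group(1).lower(), m.group(0)))
    return [hit for *_, hit in sorted(hits)]

text = ("I lived in Holland during the war. I haven't seen Jim for two months. "
        "She waited for an hour during the storm.")

# prints: for an / during the / during the / for two
for hit in search_and_sort(text, "during", "for"):
    print(hit)
```

Sorting on the right-hand collocate is what makes patterns jump out: all the during the lines cluster together, and the different uses of for separate themselves by what follows.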

Well, there it is. I won’t bother to comment on the worksheet, which has many weaknesses, save to note that it attempts to engage learners in an exploration, rather than simply telling them “the answer”.

For the moment I urge you to get a copy of Quick Cups of COCA, after which I hope you’ll talk about Mura’s work here, at his blog, and to all those who care about ELT.

Summer Reading


Dan Brown doesn’t do it for you?  Jeremy Harmer’s greatest hits leave you unquenched? Try these:

Best Fiction of 2016 so far


Julian Barnes: The Noise of Time.

A real tour de force by Barnes, who does a fine job of transmitting the true horror of Stalinist Russia’s denial of free expression, the awful results of the absolute and fickle control of philistines over culture, and the constant fear under which everybody lived their lives. This is a very powerful book, “a condensed masterpiece that traces the lifelong battle of one man’s conscience, one man’s art, with the insupportable exigencies of totalitarianism”, as Guardian critic Alex Preston says in his review. Not the most relaxing poolside read, not easy, not light, but it’s a compelling story and it left me with a re-kindled fear of totalitarian regimes and a grudging gratitude for living in the West.

Best Fiction I’ve read in 2016 so far


Edward St Aubyn: The Melrose Novels. I don’t know why it took me so long to find these 5 novels, but I’m so glad that I finally had the chance to enjoy them. From the opening lines of the first book – Never Mind – to the last lines of book 5 – At Last – St Aubyn dazzles with his quite extraordinary writing. He tells a harrowing tale, but he tells it with verve, sparkle, wit and honesty; he doesn’t flinch, he doesn’t hold back, and there’s not a trace of bathos or self-pity. I don’t think I’ve ever been so immediately impressed with a novelist’s style. The 5 books rip along – you can read the lot in a week. The story is awful, starting with how he was consistently raped by his father. It’s frightening, it’s magnificent, it’s funny, it’s appalling, it’s heroic, it’s witty; it’s tragic, it’s inspiring. St Aubyn says that writing these books saved his life, and it’s obvious that they’re cathartic. You have to read them all, but if you only have time for one, then I recommend Mother’s Milk. If you think Sylvia Plath was scathing about her dad, read what St Aubyn has to say about his mum – the mum who did nothing to protect him from his dad’s abuse.

Best Non-Fiction books of 2016 so far


Yanis Varoufakis: And the Weak Suffer What They Must?

The former Greek Finance Minister takes us on a compelling ride through the eurozone, from post-Second World War attempts at recovery to the inevitable collapse in 2008 and beyond. This is a fresh, persuasive narrative which argues that “the weakest citizens of the weakest nations have paid the price for bankers’ mistakes” and that “the principle of the greatest austerity for those suffering the greatest recessions has led to a resurgence of racist extremism.” Well-written and well-informed, with perhaps just a tad too much reliance on fiscal and monetary shenanigans to explain the fundamental flaws in the EU, this is a real poolside page-turner; no really: it is.


Michael Greger: How Not To Die.

The best guide to healthy eating ever written. All the top causes of premature death – heart disease, various cancers, diabetes and many more – can be beaten by “nutritional and lifestyle interventions”. Well, I’ll grant you that that isn’t the best phrase ever written, but the book is wonderfully clear and very practical. We really must stop eating red meat and processed food. Unprocessed plant foods – beans, berries, other fruits, cruciferous vegetables, greens, other veg., flaxseeds, nuts, spices, whole grains – plus lots of teas and water, are what you need. Greger argues his case very forcefully, but he’s not a zealot. Here’s a sample:

Whenever I’m asked whether a certain food is healthy or not, I reply “Compared to what?” For example, are eggs healthy? Compared to oatmeal, definitely not. But compared to the sausage links next to them on the breakfast platter? Yes.  

Best New Book on SLA so far in 2016


Stefano Rastelli: Discontinuity in SLA.

I’ve already given Mike Long’s review of this book, so suffice it to say that it’s a must-read. Tired of bullshit from the likes of Larsen-Freeman? Read this. Stefano will deliver a paper on Intra language at the upcoming SLRF conference in September. Stand by! If you’re poolside, get in the shade, put down that drink and read Rastelli’s book. It’s invigorating. Mike Long has already questioned bits of it, and I await the verdicts of Kevin Gregg, Nick Ellis, Peter Robinson, William O’Grady and others. I wonder what Scott Thornbury will make of it.

Best Book on SLA I’ve read in 2016 so far


William O'Grady: How Children Learn Language. Kevin Gregg chastised me for not having already read this book. It's superb. The clarity of O'Grady's writing is supreme, and the force of his argument is daunting. All those who fumble and stumble in their criticisms of Chomsky's UG should read O'Grady's splendid work. It's one of the best books on language learning I've ever read. It's accessible, it's persuasive, it's a model of coherence and cohesion. It should, in my opinion, be required reading on any ELT course.

Best Book on ELT so far in 2016


Brian Tomlinson (ed): SLA Research and Materials Development for Language Learning. I'm a bit wary about recommending this book because I haven't finished reading it, but it looks good. It has Tomlinson's hand all over it, and it's uneven, but still, it has some good chapters in it, including some that slam the use of coursebooks and give a much more considered view of how lexical chunks should be dealt with than that provided by the usual suspects, who give so little evidence of scholarship.

And if you don’t like the sound of any of the above books, may I recommend Thomas Pynchon’s V – the best novel I’ve ever read.


Have a great summer.

Rastelli’s Discontinuity Hypothesis: a new challenge for SLA researchers


Mike Long's review of Stefano Rastelli's Discontinuity in Second Language Acquisition: The Switch between Statistical and Grammatical Learning (Multilingual Matters, 2014) appeared recently in the Applied Linguistics journal's online Advance Access. Here's a brief summary of the review. I've taken gross liberties cutting Long's text, but that's about all I've done: some of it appears below verbatim and the rest is as Long wrote it, but with big bits lopped off. I share Long's view that this is an important book which deserves our attention, but I also think that it highlights the weaknesses of the attempts made by Larsen-Freeman, Thornbury, Hoey and others to use a garbled version of emergentism to support their views. Rastelli's hypothesis represents the beginnings of a research programme that could pose a real challenge to Processability Theory, currently the theory most often adopted in attempts to explain SLA.

Rastelli’s book is part of the growing research interest in the potential of statistical learning and usage-based accounts of SLA by adults. The general idea is that learners can detect absolute frequencies, probabilistic patterns, and co-occurrences of items in the linguistic environment, and use the resulting information to bootstrap their way into the L2. Statistical learning (SL) is a general learning theory which relies on the construct of a domain-general capacity that operates incidentally, results in implicit knowledge, and functions for all linguistic sub-systems, from phonology, through word learning, morphology and syntax, to pragmatics.
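The co-occurrence tracking that SL accounts appeal to can be made concrete with a toy example. The Python sketch below is my own illustration, not anything from Rastelli or Long, and the miniature corpus is invented: it computes the forward transition probabilities between adjacent words, the kind of distributional statistic that learners are assumed to track implicitly.

```python
# A toy sketch of statistical learning over adjacent words: the learner is
# modelled as tracking how predictable word w2 is, given that w1 just occurred.
from collections import Counter

# Invented miniature "input" corpus (Italian, for illustration only).
corpus = [
    "elena è arrivata",
    "maria è arrivata",
    "elena ha parlato",
    "maria ha parlato",
]

unigrams = Counter()
bigrams = Counter()
for sentence in corpus:
    words = sentence.split()
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))  # adjacent co-occurrences

def transition_probability(w1, w2):
    """P(w2 | w1): the forward transition probability from w1 to w2."""
    return bigrams[(w1, w2)] / unigrams[w1] if unigrams[w1] else 0.0

# 'arrivata' always follows 'è' in this corpus, so the chunk is maximally
# predictable -- the sort of regularity that could consolidate in memory.
print(transition_probability("è", "arrivata"))  # 1.0
print(transition_probability("elena", "è"))     # 0.5
```

On an SL account, highly predictable pairs like the first one are candidate chunks; how the learner then moves beyond such statistics is exactly what Rastelli's discontinuity hypothesis addresses.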

Long says that Stefano Rastelli's book (henceforth, Discontinuity) is remarkable for three things:

  1. Its coherence.
  2. The breadth and depth of Rastelli's knowledge of current theory and research in linguistics, cognitive psychology, neurolinguistics and SLA, and his ability to synthesize and integrate work from all four.
  3. The originality of his perspective.

Rastelli claims that SL is the initial way learners handle combinatorial grammar, i.e., regular co-occurrence relationships between audible or visible forms that are overt in the input and the meanings and functions of those forms. Because they are audible or visible and regular, the patterns are frequency-driven and countable, which is what SL requires in order to operate. Combinatorial grammar comprises recurrent combinations of adjacent and non-adjacent whole words and morphemes. The form-function pairs can be stored and retrieved first as wholes, and then broken down into their component parts in order to be computed by abstract rules.

Combinatorial grammar is learned twice, Rastelli claims, first by SL, and then by grammatical learning (GL). This is the meaning of ‘discontinuity’ in his hypothesis. SL prepares the ground for GL: “Statistics provides the L2 grammar the ‘environment’ to grow and develop” (2014: 220). SL involves first a computation over transition probabilities and subsequently bottom-up category formation; GL is achieved through computation over symbolic abstract rules and top-down category formation. GL happens when learners recognize (implicitly) not just regularities in the ways certain words co-occur, but why they co-occur. At that point, they can move beyond statistically based patterns and induce productive combinatorial rules. They can abstract away from particular exemplars that contain regular markings for number, tense, case, etc., now understanding (implicitly, again) that these properties can be applied to new exemplars.

The shift to GL is an abrupt, qualitative change — a rupture, not simply the next stage in a single continuous developmental process. This is one of several places where Rastelli departs from received wisdom in the field. He likens the SLA process to learning to swim, ride a bicycle or ski: progress is initially slow, tentative and uneven, with many failures, not a gradual succession of gradient states, until suddenly, the child (or adult) can swim, ride or ski unaided. This, he claims, is because SLA is quantized. Learners need to encounter a statistically critical number of instances of a form or structure. Once that threshold is crossed, they are able to perceive regularities in the features those instances share and to conceptualize the motivation behind those regularities, in order to apply a rule over novel instances. The formation of grammatical categories is what triggers discontinuity — sudden quantum leaps from SL to GL.

Crucially, the new grammatical representations do not displace previously acquired statistical rule(s). Rather, the sudden shift to GL is marked by gemination: dual statistical and grammatical representation of an item or structure at two cognitive levels in underlying competence. The two learning processes, SL and GL, and the two mental representations for the same L2 phenomena, statistical rules and grammatical categories, continue to exist side by side.

The continued co-existence of SL and GL has at least two possible neurophysiological explanations. First, implicit and explicit knowledge of the same item coexist, remain independent, and can be accessed independently by speakers (Paradis 2009: 15). Second, although independent, declarative and procedural memory compete and cooperate with one another across a learner’s lifespan. Some parts of the temporal lobe serve as a repository for already proceduralised knowledge, while some areas of the prefrontal cortex are activated when knowledge stored in declarative memory is selected and retrieved. There is also evidence of a direct anatomical connection between the medial temporal lobe and the striatum, that is, the caudate nucleus and putamen in the basal ganglia (Poldrack and Packard 2003: 4), which, says Rastelli, is why Ullman and colleagues believe L2 acquirers can learn the same items by exploiting the resources of both declarative and procedural memory.

The use of ‘quantum’ and ‘quantized’ is deliberate. Rastelli notes that the idea of abrupt discontinuity in SLA parallels the trajectory identified for many phenomena in the natural sciences, and above all in quantum physics and quantum probability theory. A classic example is the finding in quantum physics that electrons do not change their orbit around a nucleus gradually along a continuous gradient-like energy scale with change in proportion to increased energy, but instead ‘jump’ from one energy level to another at the precise moment that the energy supplied is sufficient to reach the threshold required to trigger the change. In just the same way, SLA is quantized; there is no straightforward relationship between increased L2 exposure and L2 development.

So much for combinatorial grammar. Non-combinatorial grammar, in contrast, pertains to invisible features, such as null subjects, filler-gap dependencies, and island constraints on wh- extraction, and phenomena at the discourse-syntax and syntax-semantics interfaces. This means there is nothing overt in the input to combine, and frequency is therefore irrelevant. SL is no use here because learners cannot categorize over absences (empty categories or displaced items). Such items are computed and represented only mentally. Thus, non-combinatorial grammar cannot be acquired via SL.

Rastelli predicts, for example, that adult learners of Italian will have more trouble with null subjects than with auxiliaries in compound tenses, not due to differences in their frequency, but because SL can support the procedure for concatenation of co-occurring items (auxiliaries and main verbs), but not for computation of absent items (missing subject pronouns). In the sentence Elena è arrivata (Elena is arrived), è + arrivata is a chunk that may consolidate in a learner's memory over time and eventually constitute the basis for a productive rule for auxiliary selection. Conversely, the absent pronoun in Elena è arrivata ma _ non ha parlato (Elena arrived but [she] did not talk) provides nothing the learner can remember and re-use in similar situations. SL allows the need for some form of the auxiliary verb 'to be' eventually to become predictable every time 'arrived' appears (and later, other verbs of movement), whereas the presence or absence of a subject pronoun cannot be predicted and must be computed each time. Gemination will occur in the former case, but not in the latter, where GL alone will be pressed into service. If the non-/combinatorial distinction turns out to be valid, Rastelli suggests, it is presumably one of the reasons missing features are problematic and often never acquired by some adult L2ers. Instead, non-combinatorial grammar must be handled by GL, and the capacity for GL differs at the individual level and is more subject than SL to age effects.

After a discussion of Rastelli’s position on age effects, Long moves to the differences between Rastelli’s hypothesis and other theories of SLA. Rastelli notes how ‘discontinuity’ differentiates his position from that of ‘continuity’ theories, such as Processability Theory, the norm in most SLA theorizing. As should be clear by now, he rejects the notion that L2 development is continuous, a series of incremental shifts (developmental stages) as a result of increased exposure to L2 input, without fractures or leaps:

The core idea of discontinuity is that the process of adult acquisition of L2 grammar is not uniform and incremental but differentiated and redundant. To learn a second language, adults apply two different procedures to the same linguistic materials: redundancy means that the same language items may happen to be learned twice.  (2014: 5)

The SL/GL distinction is qualitative (neurophysiological) in nature. It is not a matter of converting explicit to implicit knowledge (for Rastelli, implicit learning takes precedence, after all), so it is not a question of the automatization of what started life as declarative knowledge, as in Skill Acquisition Theory, and not amenable, therefore, to the use of such measures as processing speed or reaction times. Discontinuity differs from restructuring in that the qualitative shift is not from non-productive to productive use of chunks via practice, but between two neurophysiologically distinct ways of learning that target two different parts of grammar. It shares ground with Ullman's Declarative/Procedural Model (DPM) but, as Rastelli shows through a detailed comparison, differs in important ways: the discontinuity hypothesis again focuses on two kinds of learning processes, rather than two kinds of learning products, the lexicon and the grammar (differentiating between which is in any case far from straightforward), and holds that some L2 grammatical items are learned statistically before they are learned grammatically. Rastelli also discusses the relevance of work in theoretical linguistics by Berwick, Yang, Roeper, O'Grady, Chomsky, Pinker, Grodzinsky, Hawkins, Tsimpli and others, the discontinuity hypothesis being shown to constitute a 'semi-modular' position in which categorical grammar relies on innate principles, while probabilistic grammars can be learned from positive evidence alone. Work of SLA scholars considered includes that of Bley-Vroman, N. Ellis, Wray, Sorace, Pienemann, Sharwood-Smith, Paradis, Slabakova, White, Ullman, Montrul, Robinson, Newport, and Williams.

Despite its broad scope and the obvious interest in similarities and differences between his own position and that of other theorists, Rastelli denies that Discontinuity offers a new theory of SLA:

Crucially, the word ‘theory’ is avoided purposely in this book . . . Basically, there cannot be a theory of discontinuity yet because the evidence provided so far can be interpreted in different ways . . . An expression such as ‘discontinuity hypothesis’ better conveys the image of the embryonic stage of a prospective theory of discontinuity. (2014: 6)

Nevertheless, the hypothesis he proposes is unquestionably innovative, and likely to motivate several new lines of empirical work. It will probably be regarded as (healthily) controversial in some quarters, but is without doubt an exceptionally interesting and intellectually refreshing contribution to the current SLA literature.



Abrahamsson, N. and K. Hyltenstam. 2009. 'Age of onset and nativelikeness in a second language: Listener perception versus linguistic scrutiny,' Language Learning 59: 249-306.

Aslin, R. N. and E. L. Newport. 2012. 'Statistical learning: From acquiring specific items to forming general rules,' Psychological Science 21/3: 170-76.

Aslin, R. N. and E. L. Newport. 2014. 'Distributional language learning: Mechanisms and models of category formation,' Language Learning 64/1: 86-105.

Berwick, R. C. 1997. 'Syntax facit saltum: Computation and the genotype and phenotype of language,' Journal of Neurolinguistics 10/2-3: 231-49.

DeKeyser, R. M. 2000. 'The robustness of critical period effects in second language acquisition,' Studies in Second Language Acquisition 22/4: 499-533.

Ellis, N. C. 2002. 'Frequency effects in language acquisition: A review with implications for theories of implicit and explicit language acquisition,' Studies in Second Language Acquisition 24/1: 143-88.

Ellis, N. C. 2006. 'Language acquisition as rational contingency learning,' Applied Linguistics 27/1: 1-24.

Ellis, N. C. 2009. 'Optimizing the input: Frequency and sampling in usage-based and form-focused learning' in M. H. Long and C. J. Doughty (eds): The Handbook of Language Teaching. Blackwell, pp. 139-58.

Ellis, N. C. and S. Wulff. 2015. 'Usage-based approaches to SLA' in B. VanPatten and J. Williams (eds): Theories in Second Language Acquisition: An Introduction. 2nd edition. Lawrence Erlbaum, pp. 75-93.

Granena, G. and M. H. Long. 2013. 'Age of onset, length of residence, language aptitude, and ultimate L2 attainment in three linguistic domains,' Second Language Research 29/3: 311-43.

Hamrick, P. 2014. 'A role for chunk formation in statistical learning of second language syntax,' Language Learning 64/2: 247-78.

Hilles, S. 1986. 'Interlanguage and the pro-drop parameter,' Second Language Research 2/1: 33-51.

Janacsek, K., J. Fiser, and D. Nemeth. 2012. 'The best time to acquire new skills: Age-related differences in implicit sequence learning across the human lifespan,' Developmental Science 15/4: 496-505.

Munnich, E. and B. Landau. 2010. 'Developmental decline in the acquisition of spatial language,' Language Learning and Development 6/1: 32-59.

Nemeth, D., K. Janacsek, and J. Fiser. 2013. 'Age-dependent and coordinated shift in performance between implicit and explicit skill learning,' Frontiers in Computational Neuroscience 7/147: 1-13.

Osterhout, L., A. Poliakov, K. Inoue, J. McLaughlin, G. Valentine, L. Pitkanen, C. Frenck-Mestre, and J. Herschensohn. 2008. 'Second-language learning and changes in the brain,' Journal of Neurolinguistics 21: 509-21.

Paradis, M. 2009. Declarative and Procedural Determinants of Second Languages. John Benjamins.

Poldrack, R. A. and M. G. Packard. 2003. 'Competition among multiple memory systems: Converging evidence from animal and human brain studies,' Neuropsychologia 1497: 1-7.

Rebuschat, P. (ed). 2015. Implicit and Explicit Learning of Languages. John Benjamins.

Rebuschat, P. and J. N. Williams (eds). 2012. Statistical Learning and Language Acquisition. Walter de Gruyter.

Robinson, P. and N. C. Ellis (eds). 2008. Handbook of Cognitive Linguistics and Second Language Acquisition. Routledge.

Saffran, J. R. 2003. 'Statistical language learning: Mechanisms and constraints,' Current Directions in Psychological Science 12: 110-14.

Saffran, J. R., E. L. Newport, and R. N. Aslin. 1996. 'Word segmentation: The role of distributional cues,' Journal of Memory and Language 35: 606-21.

Spadaro, K. 2013. 'Maturational constraints on lexical acquisition in a second language' in G. Granena and M. H. Long (eds): Sensitive Periods, Language Aptitudes, and Ultimate L2 Attainment. John Benjamins, pp. 43-68.

Tanner, D., K. Inoue, and L. Osterhout. 2014. 'Brain-based individual differences in on-line L2 grammatical comprehension,' Bilingualism: Language and Cognition 17: 277-93.

Williams, J. N. 2009. 'Implicit learning' in W. C. Ritchie and T. K. Bhatia (eds): The New Handbook of Second Language Acquisition. Emerald Group Publishing, pp. 319-53.


Harmer on Brexit: Version 2


Harmer’s response on Facebook to the UK referendum result blames “the sclerotic elderly” and “an angry working class” for what he considers to be a catastrophic decision. He predicts it will soon result in “prime minister Farage, President Le Pen and a right-wing surge across the continent with a rise in racist violence and the gradual growth of intolerance and misunderstanding.” Harmer expresses “a profound loathing for the people who have led this despicable isolationist and backward-looking movement.” While these loathsome people hail “a new dawn”, Harmer sees “nothing but a great darkness settle over the land.”

Nearly 5,000 people "Liked" Harmer's text, and I find it curious that, while nobody raises any objection to Harmer's declaration of "a profound loathing" for the leaders of the "Leave" campaign, many strongly object to my repeated criticisms of the style and content of Harmer's writing. That aside, I suggest that Harmer's "Apology to Europe" is over-emotional and badly argued. It's highly unlikely that Farage will become UK prime minister, or that Le Pen will become President of France. If there is "a right-wing surge across the continent", it won't be the fault of those in the UK who voted "Leave", but rather of the racists themselves and of the economic conditions which provide fertile ground for the spread of such beliefs. There has been a spike in race-hate complaints since 23rd June, no doubt because racists feel emboldened, so there is some justification for fears of the right-wing surge Harmer predicts. I can understand why so many ordinary people are very upset by the result, and I think there are certainly reasons to be worried about what happens next. But we don't know what will happen, and in my opinion Harmer's reaction is simplistic, unreasonable and unhelpful.

It's also worth pointing out that not everybody who voted "Leave" was racist, or old, or working class – many wanted to leave the EU because there are a great many things wrong with its institutions, and because the policies carried out by the unelected EU Commission and the Council of Ministers, with little control from the European Parliament, have caused a great deal of hardship. The Common Agricultural Policy, at one point responsible for 60% of the total EU budget, was for decades a wasteful disaster which did much to damage good farming practices. The budget deficit limits imposed by the 1992 Maastricht Treaty triggered a wave of unemployment and welfare cuts across the continent. After that, the financial sector was increasingly de-regulated, and, with increasing pressure from Germany and France, the euro was introduced as a common currency, making it impossible for weaker members to use their own currency as a tool to manage their economic affairs. During the first decade of monetary union, weaker European economies were subjected to a wave of cheap credit from the banks of the most powerful states. When the global crisis erupted, banking bailouts, rising social spending and sharp declines in tax revenue sparked a debt crisis in countries such as Greece, Portugal and Spain. The EU Commission responded by imposing severe austerity on Greece and doing everything possible to bring down the Syriza government, a demonstration of the Troika's determination to maintain a system of austerity across the region. The recently passed European Fiscal Compact further limits state spending across the eurozone.

So the EU is a deeply undemocratic organisation that promotes and protects the interests of its members’ ruling classes. On the other hand, there’s no doubt that the UK “Leave” campaign was fuelled by ugly racism and absurd “Little Englander” propaganda, and there’s no obvious reason to think that things will be better in the UK or anywhere else as a result of the decision of the UK to leave.

Rather than react as Harmer has done, we should surely concentrate on promoting grassroots democratic organisations that fight for people's rights wherever they are. The gap between rich and poor is widening, and there's little reason to believe that if the UK had remained in the EU things would have improved for most of its inhabitants. In the UK today, 63 per cent of poor children grow up in families where one member is working. More than 600,000 residents of Manchester are "experiencing the effects of extreme poverty" and 1.6 million are slipping into penury. The situation in other EU countries is even worse, and the inability of EU members to use their own separate currencies as a way of dealing with economic problems, coupled with the policy of austerity imposed by the Troika, makes it likely that things will get worse before they get better. What's done is done; however regrettable you might think it is, I suggest that it can be seen not just as a worrying threat, but also as an opportunity.


I apologise to those who wrote the works below for not citing them properly in my text.

Observations on Brexit  http://www.wsm.ie/c/anarchist-observations-brexit-lexit-uk-eu-referendum

The left wing case for quitting  the EU http://londonprogressivejournal.com/article/view/2300

John Pilger: Why the British said no to Europe  http://johnpilger.com/articles/why-the-british-said-no-to-europe

IATEFL 2016 Plenary. Scott Thornbury: The Entertainer


So, without more ado, ladies and gentlemen, please put your hands most forcefully together and give it up for the one, the only, the inimitable, the ever-so wonderful ……………… Scott Thornbury!!

And on he walks.

He looks good; he looks fit, well turned out, up for it. Rather than hide behind the lectern and read from a script, he roams the whole expanse of the colossal stage with practised ease, expertly addressing different sections of the huge auditorium, bringing everybody into the warm glow. He starts brilliantly. He puts the years of important milestones in his life on the screen:

  • 1950
  • 1975
  • 1997
  • 2004

and asks for suggestions as to what happened to him in those years.

"Uh oh! There's 'an element' in here today," he says in response to a group on the right of the hall that's having fun calling out the wrong answers to his elicitations.

His voice is warm, fruity, well-modulated, and it comes across perfectly, helped by a good PA system and by the fact that the enormous hall is packed with people. Of the IATEFL conference talks I saw online, there was something near gender equality as far as quality of presentation is concerned, but nobody else reached Scott's standard. John Fanselow used to be able to put him in the shade, and Michael Hoey on a good day came close, but these days, Scott's unrivalled: he's The Entertainer.

And it's not just the way he performs, of course – the best stand-up artist depends on his or her material, right? Scott's plenary had some very good material, and, what's more, the content was both coherent and cohesive. Scott led us through 50 years of ELT history, pointing out that really there's nothing new under the sun; that we made lots of mistakes; that some "methods" look really weird today, while others that we think of as new were already there in the 60s; and so on.

Having arrived in his history of ELT at 1975, Scott highlighted the publication of the Strategies series of coursebooks, which he described as "revolutionary", since they were the first pedagogical materials to be based not on grammatical structures but on functions; the first to be based not on what the language is, but on what you do with it. At this point in the history, Scott came to the main part of his argument.

Two Kinds of Discourse

He suggests that two "intertwining but not interconnecting" discourses can be detected. On the one hand, there's the "old view" that informs the various methodologies associated with grammar-based teaching. On the other, there's the "new discourse", which comes from a functional approach to language and a more sociolinguistic view of language learning.

In the figure below, the "old" view is on the left, and the "new" view is on the right. From the top, the categories are:

  • the nature of language
  • units of language acquisition
  • the nature of learning
  • learning path
  • goals.


Scott suggests that the "Strategies" series of coursebooks resolves the argument between these two views in favour of the view on the right. Obviously, Scott likes the "new" view, so he was excited when the Strategies series was published – he felt he was at the dawn of a new age of ELT. But, Scott goes on to say, the matter wasn't in fact resolved: current ELT practice has reverted to reflect the old view. Today, a grammar-based syllabus is used extensively in the global ELT industry.

So, what happened? Why didn’t things change? Why did the old discourse win out? A particularly important question is: Why does the grammar-based syllabus still reign despite clear findings from SLA research? Scott pointed out that SLA research suggests that teachers can’t affect the route of L2 development in any significant way: the inbuilt syllabus triumphs. Grammatical syllabuses fly in the face of the results of SLA research.

Scott showed results from a survey he did of more than 1,000 teachers, which showed that most teachers say they use a grammar-based syllabus because students want it. In a way, they blame the students for an approach they say they're not entirely happy with.

Despairing of finding a solution inside the ELT world, Scott thought maybe he should look at general education. But, when he took a look, he discovered that things in general education are "terrible". Everywhere, knowledge is being broken down into tiny little bits which can then be tested. He comments: "There's something really unhealthy in mainstream education and it's exacerbated by a discourse that's all about McNuggets again."

Scott then quoted Lin (2013):

“Language teaching is increasingly prepackaged and delivered as if it were a standardised, marketable product…”

“This commodifying ideology of language teaching and learning has gradually penetrated into school practices, turning teachers into ‘service providers’.”

So what’s the solution, then? Determined not to end on such a pessimistic note, Scott suggested three endings:

  1. The pragmatic route
  2. The dogmatic route
  3. The dialectic route

The Pragmatic Route says: Accept things the way they are and get on with it.

The Dogmatic (or Dogmetic!) Route says: Get rid of the coursebook, use communicative activities, and shape the language which emerges from genuine attempts at communication. Unfortunately, Scott said, this will never be really popular; at most it will be a footnote in Richards and Rodgers. A more extreme route says get rid of the teacher. This isn't an entirely silly suggestion, but again, it's unlikely to be widely adopted.

The dialectic route tries, as in the Hegelian model, to overcome the limitations of the thesis and its antithesis by meshing the best from both. Here Scott gave two examples:

  • Language in the Wild. Used in Scandinavia. Students attend classes, but they're also sent out into the real world to do things like shopping.
  • The Hands Up Project.  Children who can’t get out of the classroom, such as children trapped in Gaza, are taught English by using technology to drive a communicative language learning approach.

The video of Nick in the UK interacting with some lovely kids in Gaza made a very uplifting ending to the talk.


I have two criticisms of Scott’s argument, one minor, one more important:

  1. The presentation of the two "intertwining but not interconnecting" discourses doesn't do a good job of summarising the differences between grammar-based ELT and a version of communicative language teaching that emphasises interaction, student-centred learning, task-based activities, locally-produced materials, and communication for meaningful purposes.
  2. Scott's framing of, and solution to, the problem of the grammar-based syllabus is a cop-out.

As to the first problem, Scott's summary of the old and new, intertwined but not interconnected, discourses has its limitations. The first three categories are not well labelled, in my opinion. Language is not either cognitive or social: the differences between grammatical and functional descriptions of language, or between cognitive and sociolinguistic approaches to SLA, are hardly well captured in this diagram.

Then, what are "units of acquisition"? How does the contrast between grammar McNuggets and communicative routines explain different conceptualisations of these "units"? What does "the nature of learning" refer to? What do "atomistic" and "holistic" mean here? And while the fourth and fifth labels are clear enough, they're false dichotomies: grammar-based teaching was and is concerned with promoting fluency and communicative competence.

I think it would have been better to have used a framework like Breen's (1984) to compare and contrast the syllabus types under scrutiny, asking of each one:

  1. What knowledge does it focus on and prioritise?
  2. What capabilities does it focus on and prioritise?
  3. On what basis does it divide and sub-divide what is to be learned?
  4. How does it sequence what is to be learned?
  5. What is its rationale?

That way Scott could have looked at a grammar-based, or structural, syllabus; a functional syllabus, like the one implemented in Strategies; and a CLT syllabus as enacted in Dogme. He could then have dealt with the serious limitations of the Strategies approach, and dealt properly with his own approach. Which brings me to the more important criticism.

Face The Problem

The problem ELT faces is not “How do we resolve the tensions between two different discourses?”; rather it’s the problem which Scott clearly stated and then adroitly side-stepped on his way to a typically more anodyne, less controversial, resolution. The real problem is:

How can we combat the commodifying ideology of language teaching and learning which has turned teachers into ‘service providers’ who use coursebooks to deliver language instruction as if it were a standardised, marketable product?  

And the solution, of course, is radical change.

Decentralise. Organise teaching locally. Get rid of the coursebook. Reform the big testing authorities. Reform CELTA. Etc., etc.

Why did Scott side-step all these issues? Why, having clearly endorsed the findings of SLA research which show up the futility of a grammar-based syllabus, and having shown how “really unhealthy” current ELT practice is, did Scott not argue the case for Dogme, or for Long’s version of TBLT, or for a learner-centred approach? Why did he not argue for reform of the current tests that dominate ELT, or of CELTA? Why did Scott dismiss his own approach, Dogme, as deserving no more than a footnote in Richards and Rodgers, instead of promoting it as a viable alternative to the syllabus type that he so roundly, and rightly, criticised?

Maybe, as he said, it was the end of the conference and he didn’t want to be gloomy. Or maybe it’s because he’s The Entertainer and that part of him got the better of the critical thinker and the reformer in him. If so, it’s a darn shame, however much fun it was to watch the performance.


Breen, M. P. (1984) Process syllabuses for the language classroom. In C. J. Brumfit (Ed.), General English Syllabus Design. ELT Documents No. 118. London: Pergamon Press & The British Council. 47-60.

Lin, A. (2013) Toward paradigmatic change in TESOL methodologies: building plurilingual pedagogies from the ground up. TESOL Quarterly, 47/3.

Larsen-Freeman’s IATEFL 2016 Plenary: Shifting metaphors from computer input to ecological affordances


In her plenary talk, Larsen-Freeman argued that it’s time to replace “input-output metaphors” with “affordances”. The metaphors of input and output belong to a positivist, reductionist approach to SLA which needs to be replaced by “a new way of understanding” language learning based on Complexity Theory.

Before we look at Larsen-Freeman’s new way of understanding, let’s take a quick look at what she objects to by reviewing one current approach to understanding the process of SLA.

Interlanguage and related constructs 

There’s no single, complete and generally agreed-upon theory of SLA, but there’s a widespread view that second language learning is a process whereby learners gradually develop their own autonomous grammatical system with its own internal organising principles. This system is referred to as “interlanguage”.  Note that “interlanguage” is a theoretical construct (not a fact and not a metaphor) which has proved useful in developing a theory of some of the phenomena associated with SLA; the construct itself needs further study and the theory which it’s part of  is incomplete, and possibly false.

Support for the hypothesis of interlanguages comes from observations of U-shaped behaviour in SLA, which indicate that learners’ interlanguage development is not linear. A classic example, from a study in the 70s, is the plural feet: learners first produce the correct irregular form, then replace it with the regularised foots, and finally return to feet. Another example comes from morphological development, specifically the development of English irregular past forms, such as came, went, broke, which are supplanted by rule-governed but deviant past forms: comed, goed, breaked. In time, these new forms are themselves replaced by the irregular forms that appeared in the initial stage.
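The characteristic trajectory is easy to picture with a toy sketch (the accuracy figures below are invented, purely for illustration): performance on an irregular form like went dips when the rule-governed goed takes over, then recovers.

```python
# A toy, hypothetical illustration of U-shaped development:
# accuracy on an irregular past form ("went") at three stages.
# Stage 1: rote-learned irregular form used correctly.
# Stage 2: the overregularised form ("goed") displaces it.
# Stage 3: the irregular form is re-acquired, now rule-aware.
stages = ["initial", "overregularisation", "recovery"]
accuracy = [0.90, 0.35, 0.95]

# The hallmark of a U-shape: performance dips in the middle,
# so the minimum is not at either end of the trajectory.
dip = accuracy.index(min(accuracy))
assert 0 < dip < len(accuracy) - 1
print(f"Lowest accuracy at the '{stages[dip]}' stage: {accuracy[dip]:.2f}")
```

The point, of course, is not the numbers but the shape: a snapshot taken at the second stage would make the learner look as if they were regressing, when in fact their interlanguage is restructuring.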

This U-shaped learning curve is observed in learning the lexicon, too, as Long (2011) explains. Learners have to master the idiosyncratic nature of words, not just their canonical meaning. When learners encounter a word in a correct context, the word is not simply added to a static cognitive pile of vocabulary items. Instead, they experiment with the word, sometimes using it incorrectly, thus establishing where it works and where it doesn’t. The suggestion is that only by passing through a period of incorrectness, in which the word is used in a variety of ways, can they climb back up the U-shaped curve. To add to the example of feet above, there’s the example of the noun shop. Learners may first encounter the word in a sentence such as “I bought a pastry at the coffee shop yesterday.” Then they experiment with deviant utterances such as “I am going to the supermarket shop,” correctly associating the word shop with a place where they can purchase goods, but getting it wrong. By making these incorrect utterances, the learner distinguishes between what is appropriate and what is not, because “at each stage of the learning process, the learner outputs a corresponding hypothesis based on the evidence available so far” (Carlucci and Case, 2013).


The re-organisation of new information as learners move along the U-shaped curve is a characteristic of interlanguage development. Associated with this restructuring is the construct of automaticity. Language acquisition can be seen as a complex cognitive skill where, as your skill level in a domain increases, the amount of attention you need to perform generally decreases. The basis of processing approaches to SLA is that we have limited resources when it comes to processing information, and so the more we can make the process automatic, the more processing capacity we free up for other work. Active attention requires more mental work, and thus developing the skill of fluent language use involves making more and more of it automatic, so that no active attention is required. McLaughlin (1987) compares learning a language to learning to drive a car: through practice, language skills go from a ‘controlled process’, in which great attention and conscious effort are needed, to an ‘automatic process’.

Automaticity can be said to occur when an associative connection between a certain kind of input and an output pattern is established. For instance, in this exchange:

  • Speaker 1: Morning.
  • Speaker 2: Morning. How are you?
  • Speaker 1: Fine, and you?
  • Speaker 2: Fine.

the speakers, in most situations, don’t actively think about what they’re saying. In the same way, second language learners learn new language through the use of controlled processes, which become automatic and in turn free up controlled processing capacity, which can then be directed to new forms.


There is a further hypothesis that is generally accepted among those working on processing models of SLA, namely that L2 learners pass through developmental sequences on their way to some degree of communicative competence, exhibiting common patterns and features across differences in learners’ age and L1, acquisition context, and instructional approach. Examples of such sequences are found in the well-known series of morpheme studies; the four-stage sequence for ESL negation; the six-stage sequence for English relative clauses; and the sequence of question formation in German (see Long, 2015 for a full discussion).

Development of the L2 exhibits plateaus, occasional movement away from, not toward, the L2, and U-shaped or zigzag trajectories rather than smooth, linear contours. No matter what the learners’ L1 might be, no matter what the order or manner in which target-language structures are presented to them by teachers, learners analyze the input and come up with their own interim grammars, and they master the structures in roughly the same manner and order whether learning in classrooms, on the street, or both. This led Pienemann to formulate his learnability hypothesis and teachability hypothesis: what is processable by students at any time determines what is learnable, and, thereby, what is teachable (Pienemann, 1984, 1989).

All these bits and pieces of an incomplete theory of L2 learning suggest that learners themselves, not their teachers, have most control over their language development. As Long (2011) says:

Students do not – in fact, cannot – learn (as opposed to learn about) target forms and structures on demand, when and how a teacher or a coursebook decree that they should, but only when they are developmentally ready to do so. Instruction can facilitate development, but needs to be provided with respect for, and in harmony with, the learner’s powerful cognitive contribution to the acquisition process.

Let me emphasise that the aim of this psycholinguistic research is to understand how learners deal psychologically with linguistic data from the environment (input) in order to understand and transform the data into competence of the L2. Constructs such as input, intake, noticing, short and long term memory, implicit and explicit learning, interlanguage, output, and so on are used to facilitate the explanation, which takes the form of a number of hypotheses. No “black box” is used as an ad hoc device to rescue the hypotheses. Those who make use of Chomsky’s theoretical construct of an innate Language Acquisition Device in their theories of SLA do so in such a way that their hypotheses can be tested. In any case, it’s how learners interact psychologically with their linguistic environment that interests those involved in interlanguage studies. Other researchers look at how learners interact socially with their linguistic environment, and many theories contain both sociolinguistic and psycholinguistic components.

So there you are. There’s a quick summary of how some scholars try to explain the process of SLA from a psychological perspective. But before we go on, we have to look at the difference between metaphors and theoretical constructs.

Metaphors and Constructs

A metaphor is a figure of speech in which a word or phrase denoting one kind of object or idea is used in place of another to suggest a likeness or analogy between them: “She’s a tiger”; “He died in a sea of grief”. To say that “input” is a metaphor is to say that it represents something else, and so it does. To say that we should be careful not to mistake “input” for the real thing is well advised. But to say that “input” as used in the way I used it above is a metaphor is quite simply wrong. No scientific theory of anything uses metaphors because, as Gregg (2010) points out:

There is no point in conducting the discussion at the level of metaphor; metaphors simply are not the sort of thing one argues over. Indeed, as Fodor and Pylyshyn (1988: 62, footnote 35) say, ‘metaphors … tend to be a license to take one’s claims as something less than serious hypotheses.’ Larsen-Freeman (2006: 590) reflects the same confusion of metaphor and hypothesis: ‘[M]ost researchers in [SLA] have operated with a “developmental ladder” metaphor (Fischer et al., 2003) and under certain assumptions and postulates that follow from it …’ But of course assumptions and postulates do not follow from metaphors; nothing does.

In contrast, theoretical constructs such as input, intake, noticing, automaticity, and so on, define what they stand for, and each of them is used in the service of exploring a hypothesis or a more general theory. All of the theoretical constructs named above, including “input”, are theory-laden: they’re terms used in a special way in the service of the hypothesis or theory they are part of, and their validity or truth value can be tested by appeals to logic and empirical evidence. Some constructs, for example those used in Krashen’s theory, are found wanting because they’re so poorly defined as to be circular. Other constructs, for example noticing, are the subject of both logical and empirical scrutiny. None of these constructs is correctly described as a metaphor, and Larsen-Freeman’s inability to distinguish between a theoretical construct and a metaphor plagues her incoherent argument. In short: metaphors are no grounds on which to build any theory, and dealing in metaphors ensures that no good theory will result.

Get it? If you do, you’re a step ahead of Larsen-Freeman, who seems to have taken several steps backwards since, in 1991, she co-authored, with Mike Long, the splendid An Introduction to Second Language Acquisition Research.

Let’s now look at what Larsen-Freeman said in her plenary address.

The Plenary

Larsen-Freeman read this out:


Then, with this slide showing:


she said this:

Do we want to see our students as black boxes, as passive recipients of customised input, where they just sit passively and receive? Is that what we want?

Or is it better to see our learners as actively engaged in their own process of learning and discovering the world finding excitement in learning and working in a collaborative fashion with their classmates and teachers?

It’s time to shift metaphors. Let’s sanitise the language. Join with me; make a pledge never to use “input” and “output”.

You’d be hard put to come up with a more absurd straw man argument, or a more trivial treatment of a serious issue. Nevertheless, that’s all Larsen-Freeman had to say about it.


With input and output safely consigned to the dustbin of history, Larsen-Freeman moved on to her own new way of understanding. She has a “theoretical commitment” to complexity theory, but, she said:

If you don’t want to take my word for it that ecology is a metaphor for now, … or complexity theory is a theory in keeping with ecology, I refer you to your own Stephen Hawking, who calls this century “the century of complexity.”

Well, if the great Stephen Hawking calls this century “the century of complexity”, then complexity theory must be right, right?

With Hawking’s impressive endorsement in the bag, and with a video clip of a flock of birds avoiding a predator displayed on her presentation slide, Larsen-Freeman began her account of the theory that she’s now so committed to.


She said:

Instead of thinking about reifying and classifying and reducing, let’s turn to the concept of emergence – a central theme in complexity theory. Emergence is the idea that in a complex system different components interact and give rise to another pattern at another level of complexity.

A flock of birds part when approached by a predator and then they re-group. A new level of complexity arises, emerges, out of the interaction of the parts.

All birds take off and land together. They stay together as a kind of superorganism. They take off, they separate, they land, as if one.

You see how that pattern emerges from the interaction of the parts?

Notice there’s no central authority: no bird says “Follow me I’ll lead you to safety”; they self organise into a new level of complexity.

What are the levels of complexity here? What is the new level of complexity that emerges out of the interaction of the parts? Where does the parting and reformation of the flock fit in to these levels of complexity? How is “all birds take off and land together” evidence of a new level of complexity?

What on earth is she talking about? Larsen-Freeman constantly gives the impression that she thinks what she’s saying is really, really important, but what is she saying? It’s not that it’s too complicated, or too complex; it’s that it just doesn’t make much sense. “Beyond our ken”, perhaps.


The next bit of Larsen-Freeman’s talk to address complexity theory was introduced by her reading aloud this text:


After which she said:

Natural themes help to ground these concepts. …

I invite you to think with me and make some connections. Think about the connection between an open system and language. Language is changing all the time; it’s flowing, but it’s also changing. …

Notice in this eddy, in this stream, that pattern exists in the flux, but all the particles that are passing through it are constantly changing. It’s not the same water, but it’s the same pattern. …

So this world (the stream in the picture) exists because last winter there was snow in the mountains. And the snow pattern accumulated such that now, when the snow melts, the water feeds into many streams, this one being one of them. And unless the stream is dammed, or the water ceases, the source ceases, the snow melts, this world will continue. English goes on, even though it’s not… the English of Shakespeare, and yet it still has the identity we know and call English. So these systems are interconnected both spatially and temporally, in time.

Again, what is she talking about? What systems is she talking about? What does it all mean? The key seems to be “patterns in the flux”, but then, what’s so new about that?

At some point Larsen-Freeman returned to this “patterns in the flux” issue. She showed a graph of the average performance of a group of students which indicated that the group, seen as a whole, had made progress. Then she showed the graphs of the individuals who made up the group, and it became clear that one or two individuals hadn’t made any progress. What do we learn from this? I thought she was going to say something about a reverse level of complexity, or granularity, or patterns disappearing from the flux from a lack of interaction of the parts, or something. But no. The point was:

When you look  at group average and individual performance, they’re different.

Just in case that’s too much for you to take in, Larsen-Freeman explained:

Variability is ignored by statistical averages. You can make generalisations about the group but don’t assume they apply to individuals. Individual variability is the essence of adaptive behaviour. We have to look at patterns in the flux. That’s what we know from a complexity theory ecological perspective.
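The point about averages masking individuals is, of course, elementary statistics, not a revelation of complexity theory. A few lines of Python (with invented scores, purely for illustration) show how a rising group mean can coexist with a learner who makes no progress at all:

```python
# Invented test scores for four learners at two time points.
# The group average rises, but learner D does not improve at all.
scores_t1 = {"A": 50, "B": 55, "C": 60, "D": 45}
scores_t2 = {"A": 70, "B": 75, "C": 80, "D": 45}

mean_t1 = sum(scores_t1.values()) / len(scores_t1)
mean_t2 = sum(scores_t2.values()) / len(scores_t2)
print(f"Group mean: {mean_t1:.1f} -> {mean_t2:.1f}")  # Group mean: 52.5 -> 67.5

# Learners whose score did not improve, hidden by the rising mean.
flat = [k for k in scores_t1 if scores_t2[k] <= scores_t1[k]]
print("No individual progress:", flat)  # No individual progress: ['D']
```

Any teacher who has ever compared class averages with individual marks already knows this; no ecological perspective is required.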


Returning to the exposition of complexity theory, there’s one more bit to add: adaptiveness. Larsen-Freeman read aloud the text from this slide:


The example is the adaptive immune system, not the innate immune system. Larsen-Freeman invited the audience to watch the video and see how the good microbe got the bad one, but I don’t know why. Anyway, the adaptive immune system is an example of a system that is nimble, dynamic, and has no centralised control, which is a key part of complexity theory.

And that’s all folks! That’s all Larsen-Freeman had to say about complexity theory: it’s complex, open and adaptive. I’ve rarely witnessed such a poor attempt to explain anything.


Then Larsen-Freeman talked about affordances. This, just to remind you, is her alternative to input.

There are two types of affordances:

  1. Property affordances. These are in the environment, and you can design them. New affordances for classroom learning include providing opportunities for engagement; instruction and materials that make sure everybody learns; and using technology.
  2. Second-order affordances. These refer to the learner’s perception of, and relation with, property affordances. Students are not passive receivers of input: second-order affordances include the agent, the perceiver, in the system. They are dynamic and adaptive; they emerge when aspects of the environment are in interaction with the agent. The agent’s relational stance to the property affordances is key: a learner’s perception of and interaction with the environment is what creates a second-order affordance.

To help clarify things, Larsen-Freeman read this to the audience:


(Note here that their students “operate between languages”, unlike mine and yours (unless you’ve already taken the pledge and signed up) who learn a second or foreign language. Note also that Thoms calls “affordance” a construct.)

If I’ve got it right, “affordances” refer first to anything in the environment that might help learners learn, and second to the learner’s relational stance towards those things. The important bit is the relational stance: the learner’s perception of, and interaction with, the environment. Crucially, the learner’s perception of the affordance opportunities has to be taken into account. “Really?” you might say. “That’s what we do in the old world of input too – we try to take into account the learner’s perception of the input!”

Implications for teaching

Finally, Larsen-Freeman addresses the implications of her radical new way of understanding for teaching.

Here’s an example. In the old world which Larsen-Freeman is so eager to leave behind, where people still understand SLA in terms of input and output, teachers use recasts. In the shiny new world of complexity theory and emergentism, recasts become access-creating affordances.


Larsen-Freeman explains that rather than just recast, you can “build on the mistake” and thus “manage the affordance created by it.”

And then there’s adaptation.


Larsen-Freeman refers to the “Inert Knowledge Problem”: students can’t use knowledge learned in class when they try to operate in the real world. How, Larsen-Freeman asks, can they adapt their language resources to this new environment? Here’s what she says:

So there’s a sense in which a system like that is not externally controlled through inputs and outputs but creates itself. It holds together in a self-organising manner – like the bird flock – that makes it have its individuality and directiveness in relation to the environment. Learning is not the taking in of existing forms but a continuing dynamic adaptation to context, which is always changing. In order to use language patterns beyond a given occasion, students need experience in adapting to multiple and variable contexts.

“A system like that”?? What system is she talking about? Well, it doesn’t really matter, does it, because the whole thing is, once again, beyond our ken; well beyond mine, anyway.

Larsen-Freeman gives a few practical suggestions to enhance our students’ ability to adapt, “to take their present system and mold (sic) it to a new context for a present purpose.”

You can do the same task in less time.

Don’t just repeat it, change the task a little bit.

Or make it easier.

Or give them a text to read.

Or slow down the recording.

Or use a Think Aloud technique in order to freeze the action, “so that you explain the choices that exist”. For example:

If I say “Can I help you?”, the student says:

“I want a book.”

and that might be an opportunity to pause and say:

“You can say that. That’s OK; I understand your meaning.”

But another way to say it is to say

“I would like a book.”

Right? To give information. Importantly, adaptation does not mean sameness, but we are trying to give information so that students can make informed choices about how they wish to be, um,… seemed.

And that was about it. I don’t think I’ve left any major content out.


This is the brave new world that two of the other plenary speakers – Richardson and Thornbury – want to be part of. Both of them join in Larsen-Freeman’s rejection of the explanation of the process of SLA that I sketched at the start of this post, and both are enthusiastic supporters of Larsen-Freeman’s version of complexity theory and emergentism.

Judge for yourself.      



Carlucci, L. and Case, J. (2013) On the necessity of U-shaped learning. Topics in Cognitive Science, 5(1), 56-88.

Gregg, K. R. (2010) Shallow draughts: Larsen-Freeman and Cameron on complexity. Second Language Research, 26(4), 549-56.

McLaughlin, B. (1987) Theories of Second Language Learning.  London: Edward Arnold.

Pienemann, M. (1987) Determining the influence of instruction on L2 speech processing. Australian Review of Applied Linguistics 10, 83-113.

Pienemann, M. (1989) Is language teachable? Psycholinguistic experiments and hypotheses. Applied Linguistics 10, 52-79.

IATEFL 2016 Plenary by Silvana Richardson: The Case for NNESTs


Richardson’s plenary was a well-prepared, well-delivered, passionate plea for an end to the discrimination against Non-Native English Speaking Teachers (NNESTs). It was a great talk; it’s received lots of praise from others and doesn’t need more from me. So I just want to indicate a few points where I think Richardson over-egged the pudding. Given that there’s so much compelling evidence to support her case, there’s really no need to spoil it by giving a distorted picture of current SLA research.

Here are the slides I object to:


The Monolingual Bias of SLA

Very few SLA researchers today assume that NS is the “best model”; or that NSA is the best route;  or that a NS is the best teacher. It’s simply not true.

Nor is it true that most SLA researchers view the L1 as “an obstacle”.



Generative Grammar

Chomsky’s theory of generative grammar (UG) gives no support whatsoever to any argument in favour of monolingualism or Native Speakerism. Chomsky’s choice of a limited domain was based on considerations of scientific theory construction; to suggest that UG theory is ideologically biased in such a way that it’s somehow contributed to discrimination against NNESTs is plain silly. 



Cognitivist Theories of SLA

A cognitivist approach to SLA research can’t fairly be used as evidence for the “Supremacy of the ‘mono’”, whatever that means. Richardson suggested that the image of the tunnel in the slide indicates how narrow, dark and confining the “cognitivist theoretical space” is. But “cognitivist” approaches take many forms, and indeed one of those forms is emergentism, which Richardson seems happy to endorse. Unless more information is given about what’s being referred to, the sweeping assertion that a cognitivist approach to research leads to “narrow approaches to teaching learning and teacher education” is unwarranted. It’s equally unwarranted to assert that a cognitivist approach to SLA leads necessarily to native speakerism, monolingualism or monoculturalism.



Task-Based LT and the Lexical Approach

Neither task-based language teaching (TBLT) nor the lexical approach tries to “thrust a monolingual approach upon the world”. What unites the very different views of proponents of TBLT and the lexical approach, such as Willis, Long, Nunan and Skehan (TBLT), and Lewis and Dellar (lexical approach), is their commitment to the fight for equal rights for NNESTs.


Paradigm Shifts

Over the course of her talk, Richardson gives recurrent indications that she has a poor view of current SLA research, which might explain why, in her eagerness to promote her cause, she chooses some dubious bedfellows. Richardson suggests that a paradigm shift from “SLA” to “Plurilingual Development” will somehow usher in a new world of ELT practice where NNESTs are no longer discriminated against. This naïve view rests on attributing ideological positions to the two “sides”, such that those involved in current “cognitivist” SLA research are regarded as conservative reactionaries who support the status quo, while those promoting the shift to “Plurilingual Development” are seen as a liberating vanguard. The most cursory examination of the political views on education and social organisation expressed by those in the two camps will quickly show this up for the falsehood that it is.

And then there’s the small matter of academic excellence and the pursuit of knowledge to be considered. I suggest that Richardson watches the video recording of Larsen-Freeman’s IATEFL 2016 plenary and then reads Larsen-Freeman and Cameron (2008) Complex Systems and Applied Linguistics. I think she’ll be struck by the woeful lack of clarity and the poor standards of scholarship and argumentation displayed. She might then like to compare these examples of a “Plurilingual Development” paradigm with the work of those working in the current “cognitivist” SLA paradigm; as an almost random example: Cook and Singleton (2014) Key Topics in Second Language Acquisition.

There are plenty of things to criticise about the state of SLA theory, but they don’t include an insensitivity to the cause of NNESTs, and the wide range of research projects currently being pursued don’t deserve to be lumped together and given the careless treatment they get here.   

I’ll post a full review of Larsen-Freeman’s plenary next week.

A New Term Starts!


Here we go again – a new term is starting at universities offering Masters in TESOL or AL, so once again I’ve moved this post to the front.

Again, let’s run through the biggest problems students face: too much information; choosing appropriate topics; getting the hang of academic writing.

1. Too much Information.

An MA TESOL curriculum looks daunting, the reading lists look daunting, and the books themselves often look daunting. Many students spend far too long reading and taking notes in a non-focused way: they waste time by not thinking right from the start about the topics that they will eventually choose to base their assignments on.  So, here’s the first tip:

The first thing you should do when you start each module is think about what assignments you’ll do.

Having got a quick overview of the content of the module, make a tentative decision about which parts of it to concentrate on and about your assignment topics. This will help you to choose reading material, and will give focus to your studies.

Similarly, you have to learn what to read, and how to read. When you start each module, read the course material and don’t go out and buy a load of books. And here’s the second tip:

Don’t buy any books until you’ve decided on your topic, and don’t read in any depth until then either.

Keep in mind that you can download at least 50% of the material you need from library and other web sites, and that more and more books can now be bought in digital format. To do well in this MA, you have to learn to read selectively. Don’t just read. Read for a purpose: read with a particular topic (better still, with a well-formulated question) in mind. Don’t buy any books before you’re absolutely sure you’ll make good use of them.

2. Choosing an appropriate topic.

The trick here is to narrow down the topic so that it becomes possible to discuss it in detail, while still remaining central to the general area of study. So, for example, if you are asked to do a paper on language learning, “How do people learn a second language?” is not a good topic: it’s far too general. “What role does instrumental motivation play in SLA?” is a much better topic. Which leads me to Tip No. 3:

The best way to find a topic is to frame your topic as a question.

Well-formulated questions are the key to all good research, and they are one of the keys to success in doing an MA. A few examples of well-formulated questions for an MA TESL are these:

• What’s the difference between the present perfect and the simple past tense?

• Why is “stress” so important to English pronunciation?

• How can I motivate my students to do extensive reading?

• When’s the best time to offer correction in class?

• What are the roles of “input” and “output” in SLA?

• How does the feeling of “belonging” influence motivation?

• What are the limitations of a Task-Based Syllabus?

• What is the wash-back effect of the Cambridge FCE exam?

• What is politeness?

• How are blogs being used in EFL teaching?

To sum up: Choose a manageable topic for each written assignment. Narrow down the topic so that it becomes possible to discuss it in detail. Frame your topic as a well-defined question that your paper will address.

3. Academic Writing.

Writing a paper at Masters level demands a good understanding of all the various elements of academic writing. First, there’s the question of genre. In academic writing, you must express yourself as clearly and succinctly as possible, and here comes Tip No. 4:

In academic writing “Less is more”.

Examiners mark down “waffle”, “padding”, and generally loose expression of ideas. I can’t remember who, but somebody famous once said at the end of a letter: “I’m sorry this letter is so long, but I didn’t have time to write a short one”. There is, of course, scope for you to express yourself in your own way (indeed, examiners look for signs of enthusiasm and real engagement with the topic under discussion) and one of the things you have to do, like any writer, is to find your own, distinctive voice. But you have to stay faithful to the academic style.

While the content of your paper is, of course, the most important thing, the way you write, and the way you present, the paper have a big impact on your final grade. Just for example, many examiners, when marking an MA paper, go straight to the Reference section and check that it’s properly formatted and contains all and only the references mentioned in the text. The way you present your paper (double-spaced, proper indentations, and all that stuff); the way you write it (so as to make it cohesive); the way you organise it (so as to make it coherent); the way you give in-text citations; the way you give references; the way you organise appendices: all are crucial.

Making the Course Manageable

1. Essential steps in working through a module.

Focus: that’s the key. Here are the key steps:

Step 1: Ask yourself: What is this module about? Just as important: What is it NOT about? The point is to quickly identify the core content of the module. Read the Course Notes and the Course Handbook, and DON’T READ ANYTHING ELSE, YET.

Step 2: Identify the components of the module. If, for example, the module is concerned with grammar, then clearly identify the various parts that you’re expected to study. Again, don’t get lost in detail: you’re still just trying to get the overall picture. See the chapters on each module below for more help with this.

Step 3: Do the small assignments that are required. Study the requirements of the MA TESL programme closely to identify which of your written assignments count towards your formal assessment and which do not. Some small assignments are compulsory (you MUST submit them) but do not influence your mark or grade: don’t spend too much time on these, except insofar as they help you prepare for the assignments that do count.

Step 4: Identify the topic that you will choose for the written assignment that will determine your grade. THIS IS THE CRUCIAL STEP! Reach this point as fast as you can in each module: the sooner you decide what you’re going to focus on, the better your reading, studying, writing and results will be. Once you have identified your topic, then you can start reading for a purpose, and start marshalling your ideas. Again, we will look at each module below, to help you find good, well-defined, manageable topics for your main written assignments.

Step 5: Write an Outline of your paper. The outline is for your tutor, and should sketch the structure and main argument of your paper. Make sure that your tutor reviews your outline and approves it.

Step 6: Write the First Draft of the paper. Write this draft as if it were the final version: don’t say “I’ll deal with the details (references, appendices, formatting) later”. Make it as good as you can.

Step 7: If you are allowed to do so, submit the first draft to your Tutor. Some universities don’t approve of this, so check with your tutor. If your tutor allows such a step, try to get detailed feedback on it. Don’t be content with any general “Well, that looks OK” stuff. Ask “How can I improve it?” and get the fullest feedback possible. Take note of ALL suggestions, and make sure you incorporate ALL of them in the final version.

Step 8: Write the final version of the paper.

Step 9: Carefully proofread the final version. Use a spell-checker. Check all the details of formatting, citations, the Reference section, and Appendices. Ask a friend or colleague to check it. If allowed, ask your tutor to check it.

Step 10: Submit the paper: you’re done!

2. Using Resources

Your first resource is your tutor. You’ve paid lots of money for this MA, so make sure you get all the support you need from him or her! Most importantly: don’t be afraid to ask for help whenever you need it. Ask any question you like (while it’s obviously not quite true that “There’s no such thing as a stupid question”, don’t feel intimidated or afraid to ask very basic questions), and as many as you like. Ask your tutor for suggestions on reading, on suitable topics for the written assignments, on where to find materials, on anything at all that you have doubts about. Never submit any written work for assessment until your tutor has said it’s the best you can do. If you think your tutor is not doing a good job, say so, and if necessary, ask for a change.

Your second resource is your fellow students. When I did my MA, I learned a lot in the students’ bar! Whatever means you have of talking to your fellow-students, use them to the full. Ask them what they’re reading, what they’re having trouble with, and share not only your thoughts but your feelings about the course with them.

Your third resource is the library. It is ABSOLUTELY ESSENTIAL to teach yourself, if you don’t already know, how to use a university library. Again, don’t be afraid to ask for help: most library staff are wonderful: the unsung heroes of the academic world. At Leicester University, where I work as an associate tutor on the Distance Learning MA in Applied Linguistics and TESOL course, the library staff exemplify good library practice. They can be contacted by phone and by email, and they have always, without fail, solved the problems I’ve asked them for help with. Whatever university you are studying at, the library staff are probably your most important resource, so be nice to them, and use them to the max. If you’re doing a face-to-face course, the most important thing is to learn how the journals and books that the library holds are organised. Since most of you have already studied at university, I suppose you’ve got a good handle on this, but if you haven’t, well, do something! Just as important as the physical library at your university are the internet resources it offers. This is so important that I have dedicated Chapter 10 to it.

Your fourth resource is the internet. Apart from the resources offered by the university library, there is an enormous amount of valuable material available on the internet. See the “Doing an MA” and “Resources” section of this website for more stuff.

I can’t resist mentioning David Crystal’s The Cambridge Encyclopedia of the English Language as a constant resource. A friend of mine claimed that she got through her MA TESL by using this book most of the time, and, while I only bought it recently, I wish I’d had it to refer to when I was doing my MA.

Please use this website to ask questions and to discuss any issues related to your course.

Mike Long on Recasts


To paraphrase Long (2007), some teachers think that almost all overt error correction is beneficial. Some theorists (see, e.g., Carroll, 1997; Truscott, 1996, 1999) claim that negative feedback plays no role at all. The view that a complex array of linguistic and psychological factors affect its utility seems the most reasonable.

In the light of my recent remarks on Conti’s post, and stuff I’ve just seen on Marc’s blog, here’s a quick post based on what Long (2007) says about recasts.

After a review of recent research on L2 recasts, Long concludes that

implicit negative feedback in the form of corrective recasts seems particularly promising.

This contradicts Conti’s claim that

As several studies have clearly shown, recasts do not really ‘work’.

Conti’s argument is based on cherry picking bits of research and on crass claims about how memory works. Long’s argument is based on a rational interrogation of the evidence.


Corrective Recasts

Long (2007) defines a corrective recast as

a reformulation of all or part of a learner’s immediately preceding utterance in which one or more nontarget-like (lexical, grammatical, etc.) items is/are replaced by the corresponding target language form(s), and where, throughout the exchange the focus of the interlocutors is on meaning, not language as object.

The important thing to note is that the “corrections” in recasts are implicit and incidental.

Long says that recasts are useful because

    • They convey needed information about the target language in contexts where interlocutors share a joint attentional focus and when the learner already has prior comprehension of at least part of the message, thereby facilitating form-function mapping.
    • Learners are invested in the message, since it is their own message that is at stake, and so will probably be motivated and attending, conditions likely to facilitate noticing of any new linguistic information in the input.
    • Since they already understand part of the recast, they have additional freed-up attentional resources that can be allocated to the form-function mapping. They also have the chance to compare the incorrect and correct utterances.

Long gives a review of cross-sectional and longitudinal studies of recasts in SLA and shows that there is clear evidence that the linguistic information recasts contain is both useable and used.

In his Summary (half way through the chapter), Long says that all the studies show that recasts occur relatively frequently in both the classroom-based and the noninstructional settings observed. He goes on to say that learners notice the negative feedback that corrective recasts contain; that the feedback is useable and used; and that, while not necessary for acquisition, recasts appear to be facilitative, and to work better than most explicit modelling. He concludes that the jury is still out on recasts but that the results of studies to date are encouraging.


Long then moves to “The Sceptics”. He deals at length with two main objections, made by Lyster, and by Lyster and Ranta (see Long 2007 for the references or email me). They are:

  1. The function of recasts can often be ambiguous.
  2. “Uptake” as a result of recasts is sparse.

As Long says, both ambiguity and uptake are important considerations when evaluating any form of negative feedback and thus worthy of discussion.

Long argues that while the function of some recasts can be ambiguous, that doesn’t negate their usefulness. He notes, interestingly, that the risk of ambiguity seems to be greater in immersion courses and in some task-based and content-based lessons.

Long deals more harshly with the “uptake” scepticism. His arguments are too detailed to be quickly summarised, but they make very interesting reading. Long challenges the way the construct of “uptake” is used by Lyster and others, and highlights weaknesses in both study methods and data interpretation. What’s interesting is how carefully and rationally Long scrutinises the work he discusses, and his discussion highlights just how tricky investigating aspects of SLA is. How can we operationalise the construct “uptake” so that our studies are as rigorous as possible? How can we best articulate the research questions that drive the study? How can we organise a study so that it focuses carefully on its well-articulated research questions? How can we use statistical measures to interpret the data? How else can we interpret the data? And so on. It makes instructive reading for all those doing post graduate work in SLA, and I have often pointed my own tutees to Long’s work as a good example of a critical scholar at work. (I also point out that Long is supremely well-informed, which helps!)

With regard to uptake, one point stands out for me in Long’s reply to the sceptics: no form of feedback will always have immediate corrective effects “least of all as measured by spoken production, which is often one of the very last indications of change in an underlying grammar, whether induced by recasts or otherwise” (p. 99). Given that data on the immediate effects by themselves are unreliable, how much weight do we give to different measures of the effectiveness of different kinds of correction? Long discusses these issues, of course, but they indicate just how difficult it is to study things like recasts. But onward thru the fog we must go, armed with rationality and empirical evidence, because otherwise we’ll be ruled by mere prejudice, bias, anecdotes, folklore, and bullshit, eh what? Just BTW, Long refers to work done by Oliver (1995; 2000) which is well worth reading.

Long’s chapter continues by looking at recasts and perceptual salience. The relationship between the saliency of linguistic targets and the relative utility of models, recasts, and production-only opportunities, as studied in Ono and Witzel (2002), is discussed. If there’s any interest in this among readers, I’ll deal with it in a separate post. Perceptual salience is fascinating, don’t you think? No really, it is. What stands out when we’re learning? What happens to the non-salient bits? Does saliency explain putative fossilisation? Is trying to get advanced learners to memorise thousands of esoteric lexical chunks the answer?

Then Long deals with the “Methodological Issues” of research. Required reading if you’re doing post graduate work.

The final section of Long’s chapter is on Pedagogical Implications. He recommends the use of recasts in such a way that they match a raft of factors, but anyway, he recommends them. Now just imagine you read that in an MA paper! Ughh! My apologies to all; as Scott would say “Come on: it’s only a blog!”



If we want to teach well, we need a good grasp of the most effective way to give feedback to our students when they make mistakes. As always in ELT, there’s no definitive answer to the question “What’s the best way?” It depends, it really does. It depends crucially, as Long is keen to stress, on local factors that only the teacher in that situation can evaluate. Long stresses that precisely how teachers interact with their learners  in their own environment is their decision. He highlights the hopeless inadequacies of current ELT practice, but he never, ever, tells teachers what to do.

Research suggests that coursebook-based teaching is unlikely to be as effective as teaching that pays more attention to learners’ needs. Research suggests that basing teaching on the presentation and practice of pre-selected bits of the language is unlikely to be as effective as basing teaching on real communication. And research suggests (sic) that recasts are an effective way of helping learners notice their mistakes and to make progress.

Against this, we have the unprincipled, over-confident assertions of a motley crew of ELT teacher trainers and gurus, all promoting their own commercial wares, who confidently say, with equal force, that recasts work and that they don’t. Among them there are real charlatans, and a bunch of well-intentioned fools. Too many of them talk ill-informed, populist nonsense on their blogs, publish “How to ..” books by the score, tour the world peddling their snake oil, and prey on teachers who haven’t had the chance to find out for themselves just how bad the advice they’re being sold really is. One way to fight them is through rational criticism.

I hasten to add that this last bit, the inevitable rant, has absolutely no approval from Mike Long, who fights his own battles with far more grace, not to mention knowledge, than I do.

All References can be found in

Long, M. H. (2007) Problems in SLA. Mahwah, NJ: Lawrence Erlbaum Associates.

The Language Gym


Gianfranco Conti’s The Language Gym is a blog, but really it’s a sales pitch for his book The Language Teacher Toolkit. Actually he’s the co-author but it’s hard to find the other bloke’s name.

There’s another web site too, with a front page that reminds me of Orwell’s 1984. Try it. I think it’s horrific, like you’re trapped, like you can’t get out, like you have to do the work out, like come on, sign up, follow.

The book claims that the tools it describes can successfully help teachers to get learners to transfer knowledge held in some version of working memory to some version of long term memory, and then ensure some version of communicative competence. Conti and his co-author use the distinction between working memory and long term memory as if they knew how the two constructs worked in SLA; as if, that is, they were working with some theory of SLA that had some definitive explanation of the putative transition. None exists. What do Conti and his co-author think happens to input?  What do they think the difference between input and intake is? What do they think “noticing” does? Come to that, what do they think “noticing” is?  What theory is it that they think explains how knowledge supposedly held in working memory goes into long term memory? What happens then? The book is stuffed with baloney, but I’ll deal with the book more fully on another day.


Six Useless Things Foreign Teachers Do

It’s the blog that I want to criticise here. Recently, I noticed to my surprise that people whose opinions I respect “liked” the posts on Conti’s blog, and, in particular, they “liked”  the latest post Six Useless Things Foreign Teachers Do. Well I don’t like it. I object to it because

  1. There’s a strident sales pitch.
  2. There’s scant respect for research.

The Sales Pitch

The sales pitch is evident throughout this blog. Each and every post ends with a “Buy This Book” call. As if that weren’t enough, there’s a special page devoted to promoting the book where  “Oxford University ‘legend’, Professor Macaro, reviews The Language Teacher Toolkit”. Macaro is an Oxford University Legend?  Really?

The Respect for Research

As for the scant respect for research, there are claims in Conti’s post which rely on cherry picking academic work and which make little attempt to present the real complexities of the matters discussed. Here are two examples:

1. As several studies have clearly shown, recasts do not really ‘work’.

This is false. See Long (2007) Problems in SLA, Chapter 4, “Recasts in SLA: The Story so Far”. The case for recasts should be properly considered, not argued with disregard for a proper weighing of the evidence.

2.  Direct correction, whereby the teacher corrects an erroneous grammatical form and provides the correct version of that structure with an explanation on margin is pretty much a waste of valuable teacher time.

This is also false. See Bitchener and Ferris (2011) Written Corrective Feedback in Second Language Acquisition and Writing for why it’s false.

When he’s not misrepresenting research findings, Conti is just blowing off. For example, he says

Indirect correction, on the other hand, is not likely to contribute much to acquisition as the learner will not be able to correct what s/he does not know (e.g. I cannot self-correct an omission of the subjunctive if I have not learnt it) and if s/he is indeed able to correct, s/he will not really learn much from it.

Note that this blather is followed by this:

To learn more about my views on this issue read my blog “Why asking students to self-correct their errors is a waste of time”.

Go on, have a look, go and see what he says about why asking students to self-correct their errors is a waste of time, and I hope you’ll note that he’s once again over-stepping the mark.


Theoretical Constructs in SLA


Here again is my short contribution to Robinson, P. (ed.) (2013) The Routledge Encyclopedia of Second Language Acquisition. London: Routledge.

1. Introduction
Theoretical constructs in SLA include such terms as interlanguage, variable competence, motivation, and noticing. These constructs are used in the service of theories which attempt to explain phenomena, and thus, in order to understand how the term “theoretical construct” is used in SLA, we must first understand the terms “theory” and “phenomena”.

A theory is an attempt to answer a question, usually a “Why” or “How” question. The “Critical Period” theory (see Birdsong, 1999) attempts to answer the question “Why do most L2 learners not achieve native-like competence?” Processability Theory (Pienemann, 1998) attempts to answer the question “How do L2 learners pass through stages of development?” In posing the question that a theory seeks to answer, we refer to “phenomena”: the things that we isolate, define, and then attempt to explain in our theory. In the case of theories of SLA, key phenomena are transfer, staged development, systematicity, variability and incompleteness. (See Towell and Hawkins, 1994: 15.)

A clear distinction must be made between phenomena and observational data. Theories attempt to explain phenomena, and observational data are used to support and test those theories. The important difference between data and phenomena is that the phenomena are what we want to explain, and thus, they are seen as the result of the interaction between some manageably small number of causal factors, instances of which can be found in different situations. By contrast, any type of causal factor can play a part in the production of data, and the characteristics of these data depend on the peculiarities of the experimental design, or data-gathering procedures, employed. As Bogen and Woodward put it: “Data are idiosyncratic to particular experimental contexts, and typically cannot occur outside those contexts, whereas phenomena have stable, repeatable characteristics which will be detectable by means of different procedures, which may yield quite different kinds of data” (Bogen and Woodward, 1988: 317). A failure to appreciate this distinction often leads to poorly-defined theoretical constructs, as we shall see below.

While researchers in some fields deal with such observable phenomena as bones, tides, and sun spots, others deal with non-observable phenomena such as love, genes, hallucinations, gravity and language competence. Non-observable phenomena have to be studied indirectly, which is where theoretical constructs come in. First we name the non-observable phenomena, giving them labels, and then we make constructs. With regard to the non-observable phenomena listed above (love, genes, hallucinations, gravity and language competence), examples of constructs are romantic love, hereditary genes, schizophrenia, the bends, and the Language Acquisition Device. Thus, theoretical constructs are one remove from the original labelling, and they are, as their name implies, packed full of theory; they are, that is, proto-typical theories in themselves, a further invention of ours, an invention made in our attempt to pin down the non-observable phenomena that we want to examine so that the theories which they embody can be scrutinised. It should also be noted that there is a certain ambiguity in the terms “theoretical construct” and “phenomenon”. The “two-step” process of naming a phenomenon and then a construct outlined above is not always so clear: while for Chomsky (Chomsky, 1986) “linguistic competence” is the phenomenon he wants to explain, to many it has all the hallmarks of a theoretical construct.

Constructs are not the same as definitions; while a definition attempts to clearly distinguish the thing defined from everything else, a construct attempts to lay the ground for an explanation. Thus, for example, while a dictionary defines motivation in such a way that motivation is distinguishable from desire or compulsion, Gardner (1985) attempts to explain why some learners do better than others, and he uses the construct of motivation to do so, in such a way that his construct takes on its own meaning, and allows others in the field to test the claims he makes. A construct defines something in a special way: it is a term used in an attempt to solve a problem, indeed, it is often a term that in itself suggests the answer to the problem. Constructs can be everyday parlance (like “noticing” and “competence”) and they can also be new words (like “interlanguage”), but, in all cases, constructs are “theory-laden” to the maximum: their job is to support a hypothesis, or, better still, a full-blown theory. In short, then, the job of a construct is to help define and then solve a problem.


2. Criteria for assessing theoretical constructs used in theories of SLA

There is a lively debate among scholars about the best way to study and understand the various phenomena associated with SLA. Those in the rationalist camp insist that an external world exists independently of our perceptions of it, and that it is possible to study different phenomena in this world, to make meaningful statements about them, and to improve our knowledge of them by appeal to logic and empirical observation. Those in the relativist camp claim that there is a multiplicity of realities, all of which are social constructs. Science, for the relativists, is just one type of social construction, a particular kind of language game which has no more claim to objective truth than any other. This article rejects the relativist view and, based largely on Popper’s “Critical Rationalist” approach (Popper, 1972), takes the view that the various current theories of SLA, and the theoretical constructs embedded in them, are not all equally valid, but rather, that they can be critically assessed by using the following criteria (adapted from Jordan, 2004):

1. Theories should be coherent, cohesive, expressed in the clearest possible terms, and consistent. There should be no internal contradictions in theories, and no circularity due to badly-defined terms.
2. Theories should have empirical content. Having empirical content means that the propositions and hypotheses proposed in a theory should be expressed in such a way that they are capable of being subjected to tests, based on evidence observable by the senses, which support or refute them. These tests should be capable of replication, as a way of ensuring the empirical nature of the evidence and the validity of the research methods employed. For example, the claim “Students hate maths because maths is difficult” has empirical content only when the terms “students”, “maths”, “hate” and “difficult” are defined in such a way that the claim can be tested by appeal to observable facts. The operational definition of terms, and crucially, of theoretical constructs, is the best way of ensuring that hypotheses and theories have empirical content.
3. Theories should be fruitful. “Fruitful” in Kuhn’s sense (see Kuhn, 1962:148): they should make daring and surprising predictions, and solve persistent problems in their domain.

Note that the theory-laden nature of constructs is no argument for a relativist approach: we invent constructs, as we invent theories, but we invent them, precisely, in a way that allows them to be subjected to empirical tests. The constructs can be anything we like: in order to explain a given problem, we are free to make any claim we like, in any terms we choose, but the litmus test is the clarity and testability of these claims and the terms we use to make them. Given its pivotal status, a theoretical construct should be stated in such a way that we all know unequivocally what is being talked about, and it should be defined in such a way that it lays itself open to principled investigation, empirical and otherwise. In the rest of this article, a number of theoretical constructs will be examined and evaluated in terms of the criteria outlined above.

3. Krashen’s Monitor Model

The Monitor Model (see Krashen, 1985) is described elsewhere, so let us here concentrate on the deficiencies of the theoretical constructs employed. In brief, Krashen’s constructs fail to meet the requirements of the first two criteria listed above: Krashen’s use of key theoretical constructs such as “acquisition and learning”, and “subconscious and conscious” is vague, confusing, and not always consistent. More fundamentally, we never find out what exactly “comprehensible input”, the key theoretical construct in the model, means. Furthermore, in conflict with the second criterion listed above, there is no way of subjecting the set of hypotheses that Krashen proposes to empirical tests. No evidence is given to support the Acquisition-Learning hypothesis’s claim that two distinct systems exist, nor is there any means of determining whether they are, or are not, separate. Similarly, there is no way of testing the Monitor hypothesis: since the Monitor is nowhere properly defined as an operational construct, there is no way to determine whether the Monitor is in operation or not, and it is thus impossible to determine the validity of the extremely strong claims made for it. The Input Hypothesis is equally mysterious and incapable of being tested: the levels of knowledge are nowhere defined and so it is impossible to know whether i + 1 is present in input, and, if it is, whether or not the learner moves on to the next level as a result. Thus, the first three hypotheses (Acquisition-Learning, the Monitor, and Natural Order) make up a circular and vacuous argument: the Monitor accounts for discrepancies in the natural order, the learning-acquisition distinction justifies the use of the Monitor, and so on.

In summary, Krashen’s key theoretical constructs are ill-defined, and circular, so that the set is incoherent. This incoherence means that Krashen’s theory has such serious faults that it is not really a theory at all. While Krashen’s work may be seen as satisfying the third criterion on our list, and while it is extremely popular among EFL/ESL teachers (even among those who, in their daily practice, ignore Krashen’s clear implication that grammar teaching is largely a waste of time), the fact remains that his series of hypotheses is built on sand. A much better example of a theoretical construct put to good use is Schmidt’s Noticing, which we will now examine.

4. Schmidt’s Noticing Hypothesis

Schmidt’s Noticing hypothesis (see Schmidt, 1990) is described elsewhere. Essentially, Schmidt attempts to do away with the “terminological vagueness” of the term “consciousness” by examining three senses of the term: consciousness as awareness, consciousness as intention, and consciousness as knowledge. Consciousness and awareness are often equated, but Schmidt distinguishes between three levels: Perception, Noticing and Understanding. The second level, Noticing, is the key to Schmidt’s eventual hypothesis. The importance of Schmidt’s work is that it clarifies the confusion surrounding the use of many terms used in psycholinguistics (not least Krashen’s “acquisition/ learning” dichotomy) and, furthermore, it develops one crucial part of a general processing theory of the development of interlanguage grammar.

Our second evaluation criterion requires that theoretical constructs are defined in such a way as to ensure that hypotheses have empirical content, and thus we must ask: what exactly does Schmidt’s concept of noticing refer to, and how can we be sure when it is, and is not, being used by L2 learners? In his 1990 paper, Schmidt claims that noticing can be operationally defined as “the availability for verbal report”, “subject to various conditions”. He adds that these conditions are discussed at length in the verbal report literature, but he does not discuss the issue of operationalisation any further. Schmidt’s 2001 paper gives various sources of evidence of noticing, and points out their limitations. These sources include learner production (but how do we identify what has been noticed?), learner reports in diaries (but diaries span months, while cognitive processing of L2 input takes place in seconds, and keeping a diary requires not just noticing but also reflexive self-awareness), and think-aloud protocols (but we cannot assume that the protocols identify all the examples of target features that were noticed).

Schmidt argues that the best test of noticing is that proposed by Cheesman and Merikle (1986), who distinguish between the objective and subjective thresholds of perception. The clearest evidence that something has exceeded the subjective threshold and been noticed is a concurrent verbal report, since nothing can be verbally reported other than the current contents of awareness. Schmidt adds that “after the fact recall” is also good evidence that something was noticed, providing that prior knowledge and guessing can be controlled. For example, if beginner level students of Spanish are presented with a series of Spanish utterances containing unfamiliar verb forms, and are then asked to recall immediately afterwards the forms that occurred in each utterance, and can do so, that is good evidence that they noticed them. On the other hand, it is not safe to assume that failure to do so means that they did not notice. It seems that it is easier to confirm that a particular form has not been noticed than that it has: failure to achieve above-chance performance in a forced-choice recognition test is a much better indication that the subjective threshold has not been exceeded and that noticing did not take place.

Schmidt goes on to claim that the noticing hypothesis could be falsified by demonstrating the existence of subliminal learning, either by showing positive priming of unattended and unnoticed novel stimuli, or by showing learning in dual task studies in which central processing capacity is exhausted by the primary task. The problem in this case is that, in positive priming studies, one can never really be sure that subjects did not allocate any attention to what they could not later report, and similarly, in dual task experiments, one cannot be sure that no attention is devoted to the secondary task. In conclusion, it seems that Schmidt’s noticing hypothesis rests on a construct that still has difficulty measuring up to the second criterion on our list; it is by no means easy to properly identify when noticing has and has not occurred. Despite this limitation, however, Schmidt’s hypothesis is still a good example of the type of approach recommended by the list. Its strongest virtues are its rigour and its fruitfulness. Schmidt argues that attention as a psychological construct refers to a variety of mechanisms or subsystems (including alertness, orientation, detection within selective attention, facilitation, and inhibition) which control information processing and behaviour when existing skills and routines are inadequate. Hence, learning in the sense of establishing new or modified knowledge, memory, skills and routines is “largely, perhaps exclusively a side effect of attended processing” (Schmidt, 2001: 25). This is a daring and surprising claim with considerable predictive power, and it contradicts Krashen’s claim that conscious learning is of extremely limited use.


5. Variationist approaches

An account of these approaches is given elsewhere. In brief, variable competence, or variationist, approaches use the key theoretical construct of “variable competence”, or, as Tarone calls it, “capability”. Tarone (1988) argues that “capability” underlies performance, and that this capability consists of heterogeneous “knowledge” which varies according to various factors. Thus, there is no homogeneous competence underlying performance but a variable “capability” which underlies specific instances of language performance. Ellis (1987) uses the construct of “variable rules” to explain the observed variability of L2 learners’ performance: learners, by successively noticing forms in the input which conflict with their original representation of a grammatical rule, acquire more and more versions of the original rule. This leads to either “free variation” (where forms alternate in all environments at random) or “systematic variation” (where one variant appears regularly in one linguistic context, and another variant in another context).

The root of the problem of the variable competence model is the weakness of its theoretical constructs. The underlying “variable competence” construct used by Tarone and Ellis is nowhere clearly defined, and is, in fact, simply asserted to “explain” a certain amount of learner behaviour. As Gregg (1990: 368) argues, Tarone and Ellis offer a description of language use and behaviour, which they confuse with an explanation of the acquisition of grammatical knowledge. By abandoning the idea of a homogeneous underlying competence, Gregg says, we are stuck at the surface level of the performance data, and, consequently, any research project can only deal with the data in terms of the particular situation it encounters, describing the conditions under which the experiment took place. The positing of any variable rule at work would need to be followed up by an endless number of further research projects looking at different situations in which the rule is said to operate, each of which is condemned to uniqueness, no generalisation about some underlying cause being possible.

At the centre of the variable competence model are variable rules. Gregg argues cogently that such variability cannot become a theoretical construct used in attempts to explain how people acquire linguistic knowledge. In order to turn the idea of variable rules from an analytical tool into a theoretical construct, Tarone and Ellis would have to grant psychological reality to the variable rules (which in principle they seem to do, although no example of a variable rule is given) and then explain how these rules are internalised, so as to become part of the L2 learner’s grammatical knowledge of the target language (which they fail to do). The variable competence model, according to Gregg, confuses descriptions of the varying use of forms with an explanation of the acquisition of linguistic knowledge. The forms (and their variations) which L2 learners produce are not, indeed cannot be, direct evidence of any underlying competence – or capacity. By erasing the distinction between competence and performance “the variabilist is committed to the unprincipled collection of an uncontrolled mass of data” (Gregg 1990: 378).

As we have seen, a theory must explain phenomena, not describe data. In contradiction to this, and to criteria 1 and 2 in our list, the arguments of Ellis and Tarone are confused and circular; in the end what Ellis and Tarone are actually doing is gathering data without having properly formulated the problem they are trying to solve, i.e. without having defined the phenomenon they wish to explain. Ellis claims that his theory constitutes an “ethnographic, descriptive” approach to SLA theory construction, but he does not answer the question: how does one go from studying the everyday rituals and practices of a particular group of second language learners, through descriptions of their behaviour, to a theory that offers a general explanation for some identified phenomenon concerning the behaviour of L2 learners?

Variable Competence theories exemplify what happens when the distinction between phenomena, data and theoretical constructs is confused. In contrast, Chomsky’s UG theory, despite its shifting ground and its contentious connection to SLA, is probably the best example of a theory where these distinctions are crystal clear. For Chomsky, “competence” refers to underlying linguistic (grammatical) knowledge, and “performance” refers to the actual day-to-day use of language, which is influenced by an enormous variety of factors, including limitations of memory, stress, tiredness, etc. Chomsky argues that while performance data is important, it is not the object of study (it is, precisely, the data): linguistic competence is the phenomenon that he wants to examine. Chomsky’s distinction between performance and competence exactly fits his theory of language and first language acquisition: competence is a well-defined phenomenon which is explained by appeal to the theoretical construct of the Language Acquisition Device. Chomsky describes the rules that make up linguistic competence and then invites other researchers to subject the theory that all languages obey these rules to further empirical tests.


6. Aptitude

Why is anybody good at anything? Well, they have an aptitude for it: they’re “natural” piano players, or carpenters, or whatever. This is obviously no explanation at all, although, of course, it contains a beguiling element of truth. To say that SLA is (partly) explained by an aptitude for learning a second language is to beg the question: what is aptitude for SLA? Attempts to explain the role of aptitude in SLA illustrate the difficulty of “pinning down” the phenomenon that we seek to explain. If aptitude is to be claimed as a causal factor that helps to explain SLA, then aptitude must be defined in such a way that it can be identified in L2 learners and then related to their performance.

Robinson (2007) uses aptitude as a construct that is composed of different cognitive abilities. His “Aptitude Complex Hypothesis” claims that different classroom settings draw on certain combinations of cognitive abilities, and that, depending on the classroom activities, students with certain cognitive abilities will do better than others. Robinson adds the “Ability Differentiation Hypothesis”, which claims that some L2 learners have different abilities than others, and that it is important to match these learners to instructional conditions which favour their strengths in aptitude complexes. In terms of classroom practice, these hypotheses might well be fruitful, but they do not address the question of how aptitude explains SLA.

One example of identifying aptitude in L2 learners is the CANAL-F theory of foreign language aptitude, which grounds aptitude in “the triarchic theory of human intelligence” and argues that “one of the central abilities required in FL acquisition is the ability to cope with novelty and ambiguity” (Grigorenko, Sternberg and Ehrman, 2000: 392). However successfully the test might predict learners’ ability, the theory fails to explain aptitude in any causal way. The theory of human intelligence that the CANAL-F theory is grounded in fails to illuminate the description given of FL ability; we do not get beyond a limiting of the domain in which the general ability to cope with novelty and ambiguity operates. The individual differences between foreign language learners’ abilities are explained by suggesting that some are better at coping with novelty and ambiguity than others. Thus, whatever construct validity might be claimed for CANAL-F, and however well the test might predict ability, it leaves the question of what precisely aptitude at foreign language learning is, and how it contributes to SLA, unanswered.

How, then, can aptitude explain differential success in a causal way? Even if aptitude can be properly defined and measured without falling into the familiar trap of being circular (those who do well at language aptitude tests have an aptitude for language learning), how can we step outside the reference of aptitude and establish more than a simple correlation? What is needed is a theoretical construct.

7. Conclusion

The history of science throws up many examples of theories that began without any adequate description of what was being explained. Darwin’s theory of evolution by natural selection (the young born to any species compete for survival, and those young that survive to reproduce tend to embody favourable natural variations which are passed on by heredity) lacked any formal description of the theoretical construct “variation”, or any explanation of the origin of variations, or how they passed between generations. It was not until Mendel’s theories and the birth of modern genetics in the early 20th century that this deficiency was dealt with. But, and here is the point, dealt with it was: we now have constructs that pin down what “variation” refers to in the Darwinian theory, and the theory is stronger for them (i.e. more testable). Theories progress by defining their terms more clearly and by making their predictions more open to empirical testing.

Theoretical constructs lie at the heart of attempts to explain the phenomena of SLA. Observation must be in the service of theory: we do not start with data, we start with clearly-defined phenomena and theoretical constructs that help us articulate the solution to a problem, and we then use empirical data to test that tentative solution. Those working in the field of psycholinguistics are making progress thanks to their reliance on a rationalist methodology which gives priority to the need for clarity and empirical content. If sociolinguistics is to offer better explanations, the terms used to describe social factors must be defined in such a way that it becomes possible to do empirically-based studies that confirm or challenge those explanations. All those who attempt to explain SLA must make their theoretical constructs clear, and improve their definitions and research methodology in order to better pin down the slippery concepts that they work with.


Birdsong, D. (ed.) (1999) Second Language Acquisition and the Critical Period Hypothesis. Mahwah, NJ: Lawrence Erlbaum Associates.
Bogen, J. and Woodward, J. (1988) “Saving the phenomena.” Philosophical Review 97, 303-52.
Cheesman, J. and Merikle, P. M. (1986) “Distinguishing conscious from unconscious perceptual processes.” Canadian Journal of Psychology 40, 343-67.
Chomsky, N. (1986) Knowledge of Language: Its Nature, Origin and Use. New York: Praeger.
Ellis, R. (1987) “Interlanguage variability in narrative discourse: style-shifting in the use of the past tense.” Studies in Second Language Acquisition 9, 1-20.
Gardner, R. C. (1985) Social Psychology and Second Language Learning: The Role of Attitudes and Motivation. London: Edward Arnold.
Gregg, K. R. (1990) “The Variable Competence Model of second language acquisition, and why it isn’t.” Applied Linguistics 11, 1, 364-83.
Grigorenko, E., Sternberg, R. and Ehrman, M. (2000) “A theory-based approach to the measurement of foreign language learning ability: the CANAL-F theory and test.” The Modern Language Journal 84, iii, 390-405.
Jordan, G. (2004) Theory Construction in SLA. Amsterdam: Benjamins.
Krashen, S. (1985) The Input Hypothesis: Issues and Implications. New York: Longman.
Kuhn, T. (1962) The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
Pienemann, M. (1998) Language Processing and Second Language Development: Processability Theory. Amsterdam: John Benjamins.
Popper, K. R. (1972) Objective Knowledge. Oxford: Oxford University Press.
Schmidt, R. (1990) “The role of consciousness in second language learning.” Applied Linguistics 11, 129-58.
Schmidt, R. (2001) “Attention.” In Robinson, P. (ed.) Cognition and Second Language Instruction. Cambridge: Cambridge University Press, 3-32.
Tarone, E. (1988) Variation in Interlanguage. London: Edward Arnold.
Towell, R. and Hawkins, R. (1994) Approaches to Second Language Acquisition. Clevedon: Multilingual Matters.

Empiricism and Its Exciting Alternatives


There’s a growing opinion among academics and pundits in the ELT industry that exposure to language in the environment is enough to explain how we learn languages. This is a rebuttal of what we might call the cognitive paradigm established in the 60s by Chomsky, according to which our knowledge of language can’t be explained that way. Following Chomsky’s criticisms of Skinner, there’s been a generally accepted view among experts in the field for the last 50 years or so that language learning is not one more example of learning by reinforced behaviour, but rather a special case of learning which draws on a unique property of the mind to interpret linguistic  information.

Serious questions of epistemology are at issue here. They revolve around revisiting questions of whether we can speak with any sense about the mind, or if, as the empiricists insist, we can only talk “sensibly” (geddit) about measurable things presented to our senses. The quest for reliable knowledge led the extreme empiricists, the Logical Positivists, in the 20s and early 30s to insist that all talk of the mind had to be purged, and that only a handful of carefully-vetted sentences could be used if the chaos and confusion of normal discourse were to be overcome. They arrived at an inevitable dead end, climbed up Wittgenstein’s ladder, and fell into well-earned oblivion. The work of people like Tarski allowed the few scientists who might have been disconcerted by the positivists’ doubts to settle down, adopt a sensible, sorry, I mean common sense, “correspondence” view of truth, and continue their work, which rested on the view that there’s an objective world out there which we can dispassionately observe and, basing ourselves on empirical (i.e. factual, non-judgemental) data, theorise about the way it works, using rules of logic to guide us.


But those studying human behaviour have, quite rightly of course, had a hard job gaining admittance to the science club. If they wanted to be scientific they’d have to base themselves on empirical observation, wouldn’t they. Hence, Skinner and behaviourism, who and which confused empiricism as a philosophical movement with empirical research. They decided that human behaviour is best studied by observing what people observably do. How do they learn, then? They learn by doing things, by reacting to their environment. They form habits based on repeating the same behaviour in response to their environment. They have bigger brains than other creatures so their learning is more sophisticated. Reasoning is no more than a sophisticated reaction to a stimulus in the environment.

Chomsky questioned Skinner’s general learning theory and you all know how. But Chomsky’s view makes use of a raft of non-observable theoretical constructs which we allow so that he can develop his theory. Pace Larsen-Freeman, they’re not metaphors, any more than gravity is a metaphor; they’re things we invent in order to explain the phenomena we’re investigating. Post Chomsky, the most widely accepted view of language learning is that it’s a process that goes on in the mind, itself a theoretical construct, and that it involves the processing of data. How we process the data is the stuff of lots of different theories which try to explain different bits of the process; none of the theories is complete (none offers a complete explanation of the phenomena under investigation in SLA) and none is firmly established. But most of the theories rely on Chomsky’s theory that we learn our first language thanks to an innate capacity of the mind to make sense of the linguistic data we get from the environment.


So here comes Emergentism, which returns to empiricism and its epistemological roots. It takes many forms; it’s been proposed by various academics, including O’Grady, MacWhinney and N. Ellis, with varying results. Gregg has done his usual elegant job of pointing out the weaknesses of N. Ellis’ well-considered arguments (see my post on Emergentism), and MacWhinney seems to be making little progress. O’Grady, on the other hand, looks better every time I read his work, which I’ve only recently started to do, having got the tip-off from Kevin Gregg. I urge you, as Kevin urged me, to read O’Grady (2005) How Children Learn Language. It’s like listening to Glenn Gould play Bach: crystal clear, high-definition brilliance, one of the best books I’ve read in years. While it’s not based on an empiricist epistemology (far from it!), it totally rejects Chomsky’s UG and argues that a general learning device explains language learning. Actually, you need to read more of his stuff than just the book, but anyway…


And then there’s Stefano Rastelli’s (2014) Discontinuity in Second Language Acquisition: The Switch between Statistical and Grammatical Learning. Mike Long put me on to this, and it’s superb. Long has written a review of Rastelli’s book which I hope will appear soon. In the review Long notes that “recent years have seen growing research interest in the potential of statistical learning and usage-based accounts of SLA by adults”. What Long finds so interesting is that Rastelli has dedicated a full book to his version of statistical learning, not just an article in a journal or a chapter in an edited collection.

Rastelli’s theory is that statistical learning is the initial way learners handle combinatorial grammar, i.e., regular co-occurrence relationships between forms that are overt in the input (not absent, like pro-drop, for example) and the meanings and functions of those forms. Combinatorial grammar comprises recurrent combinations of adjacent and non-adjacent whole words and morphemes. The form-function pairs can be stored and retrieved first as wholes, and then broken down into their component parts in order to be computed by abstract rules.

And get this: combinatorial grammar is learned twice, first by statistical learning and then by grammatical learning. This is the meaning of ‘discontinuity’ in his hypothesis. Statistical learning prepares the ground for grammar learning:

  Statistics provides the L2 grammar the ‘environment’ to grow and develop (Rastelli, 2014, p. 42).

I hope Thornbury reads this; he might just find in Rastelli some long-sought support for his assertions about learning grammar for free, although that’s not exactly what Rastelli is saying.

So Rastelli rejects the notion that L2 development is continuous, a series of developmental stages as a result of increased exposure to L2 input:

  The core idea of discontinuity is that the process of adult acquisition of L2 grammar is not uniform and incremental but differentiated and redundant. To learn a second language, adults apply two different procedures to the same linguistic materials: redundancy means that the same language items may happen to be learned twice (2014: 5).

I’d like to say more, but I don’t want to steal Mike’s thunder, if I haven’t already done so. I hope that’s enough to whet your appetite.


Arguments about SLA are based partly on epistemological underpinnings that need to be declared and understood. Those like Hoey, who says that we acquire all the knowledge we have about words on the basis of frequency of exposure, and Larsen-Freeman, who says complexity theory explains it all, are, whether they appreciate it or not, adopting an empiricist epistemology. Consequently, their theories are doomed to failure unless they create crafty loopholes. But it is, it seems, possible to argue from a different, but still rational, cognitive perspective that the poverty of the stimulus argument is wrong, if you know what you’re doing. Good scholars like O’Grady and Rastelli do it, and it’s very exciting.

Real Tasks Guide Long’s TBLT


TBLT stands ELT on its head, doesn’t it? What’s a task, anyway? My posts on TBLT have led various people, many of them MA students, to ask me to clear up some misunderstandings. So here I offer bits and pieces from Mike Long’s 2016 article which I hope will clarify his version of TBLT. As usual, I’ve torn his well organised, well written article to shreds and, as usual, count more on his friendship than on his tolerance of academic sloppiness to forgive me for all its faults.

I haven’t put it all between quotation marks, but it’s all his work.

Long opposes coursebook-based product syllabuses. 

Research has shown that L2 grammatical development does not follow an order externally imposed by a teacher or textbook. Instead, learners traverse developmental sequences, often producing utterances reflecting learner-created non-target-like rules in the process. Part of the definition of ‘developmental sequence’ in SLA is that it consists of a fixed series of stages, none of which can be omitted. Evidence for the existence of developmental sequences is long-standing and robust (for review, see Ortega, 2009).

40 years of instructed SLA research and classroom studies have demonstrated that teachers cannot teach whatever they want, whenever they want if language learning is their goal. Stages in sequences cannot be skipped by presenting learners with full target-like grammatical structures and drilling them until they are “automatized.” In Pienemann’s terms, processability determines learnability, and learnability determines teachability (Pienemann, 1984).

Students play a decisive role in the language-learning process; their readiness to learn is not determined by the day of the week or the page in a textbook. As Skehan (2002, p. 294) notes, PPP and the skill-builders’ view is that language acquisition is teacher-driven, whereas, with support from decades of SLA research, TBLT views language acquisition as learner-driven.

Long’s version of TBLT

For Long, tasks are “the real-world communicative uses to which learners will put the L2 beyond the classroom — the things they will do in and through the L2.”

The real-world tasks may be required for academic purposes, e.g., locating a journal in a university library, writing a lab report, or attending a graduate-level economics lecture. They may be for vocational training purposes, e.g., in a noisy restaurant kitchen, preparing kitchen utensils and cooking ingredients at the direction of a master chef, attending a class for trainee computer technicians, or following written directions for the use of specialized automobile repair equipment. They may be for occupational purposes, either in the home country, e.g., while employed in the tourist industry, welcoming and checking in hotel guests, renting surf boards, or leading a guided tour, or overseas, e.g., while stationed at an embassy, interviewing visa applicants, issuing instructions to security personnel, or delivering an after-dinner speech. Whatever their main purpose and whether short- or long-term (including immigration to a new country), overseas stays will usually also involve a variety of “social survival” tasks, such as following street directions, using public transport, opening a bank account, renting an apartment, taking a driver’s test, visiting a doctor, or registering a child for school.

Identified in the first stage of a task-based needs analysis (Long 2005, 2015, pp. 85-168), these real-world communicative activities are target tasks for the learners concerned.

In the second stage of the NA, samples are gathered of spoken or written language use by native speakers engaged in the most critical and/or frequent of the target tasks. Modified, elaborated (not linguistically simplified) versions of the samples subsequently become part of the task-based materials — the pedagogic tasks — produced for a course, and constitute the major source of new language for the learners, new language that is relevant for their target discourse domains.

Instructional materials for TBLT take the form of pedagogic tasks — initially simple, progressively more complex, approximations to the original target tasks. Multiple series of pedagogic tasks are sequenced for classroom use according to their intrinsic complexity — task complexity, not linguistic complexity (Long, 2015, pp. 223-247; Robinson, 2009, 2011). Each series culminates in the full target task or a simulation thereof, which serves as the exit task for a module. Collectively, the series of pedagogic tasks form a task syllabus. There is no linguistic syllabus, overt or covert, other than what Corder (1967) termed the internal “learner syllabus.”

The task syllabus is delivered in conformity with ten (putatively universal) methodological principles (MPs) for LT:

  • MP1: Use task, not text, as the unit of analysis,
  • MP2: Promote learning by doing,
  • MP3: Elaborate input,
  • MP4: Provide rich input,
  • MP5: Encourage inductive “chunk” learning,
  • MP6: Focus on form,
  • MP7: Provide negative feedback,
  • MP8: Respect learner syllabi and developmental processes,
  • MP9: Promote cooperative collaborative learning, and
  • MP10: Individualize instruction.

The MPs are motivated by what SLA research has shown about how children and adults learn L2s successfully (Long, 2009, 2015, pp. 300-328), and independently, by principles from the philosophy of education (Long, 2015, pp. 63-83).

The MPs are realized at the local classroom level by pedagogic procedures (PPs). Selection of appropriate PPs from the many available in each case is best left to the teacher, usually the expert on local circumstances, assuming he or she is well trained and experienced. There are no universal or “best” PPs. Rather, choices should vary systematically to cater to individual learner differences (age, level of L1 or L2 literacy, working memory, aptitudes for implicit or explicit learning, etc.), type of linguistic feature (salient or non-salient, marked or unmarked, fragile or robust, etc.), and so on. To deal with persistent errors with a non-salient target language feature, such as intra-sentential clitics, a teacher of literate, analytically-oriented adults might choose a PP for providing explicit negative feedback (MP7), e.g., a simple pedagogic rule of thumb, but for a salient feature, such as adverb placement, one for providing implicit negative feedback, e.g., recasts, clarification requests, or indirect negative evidence.

The whole approach is task-based throughout, and constitutes genuine task-based LT (TBLT).


Long is against “Hybrids”

Long opposes the view of R. Ellis (1994, 2003, 2009) and others, who see TBLT in terms of what Long calls “a dual structural and task hybrid”. He quotes R. Ellis:

  “ . . . task-based teaching need not be seen as an alternative to more traditional, form-focused approaches but can be used alongside them . . . (Long and Skehan view traditional structural teaching as theoretically indefensible while I see it as complementary to TBLT)” (R. Ellis, 2009, pp. 221 and 225).

and goes on to discuss Klapper (2003), who proposes a ‘hybrid model’ “which accepts the primacy of the communicative focus but reinstates declarative knowledge and practice at the appropriate point in the task cycle” (p. 40). Klapper claims this is “the most effective way to make forms salient to students and thereby to speed up the acquisitional process” (p. 40). He is careful, however, to distinguish his proposal from conventional PPP:

“In such a model, there would be no structural syllabus independent of the task syllabus, rather the forms to be practiced would arise from the task context; but planning would be required to ensure grammatical structures were regularly revisited and recycled, especially those that were poorly represented in classroom input and task instruction.”  (Klapper, 2003, p. 40).

Long rejects this view. His TBLT rejects the assumption that performance of individual grammatical structures can be developed to native-like levels rooted in a separate explicit knowledge system via the massive practice required for automatization. It also rejects the view that the new underlying knowledge will morph into the separate implicit system.

Klapper’s proposal, like that of R. Ellis (1994, 2003, 2009), is for a hybrid model, one which may yet turn out to be correct but which some would see as fatally flawed because it seeks to meld two oppositional psycholinguistic positions. In my view, if there is a place for skill-acquisition theory in TBLT, it is for the use of task repetition to improve task performance, not performance of individual grammatical structures before their time.  


Incidental focus on form works

Swan writes that TBLT is often justified by the claim that “‘linguistic regularities’ are acquired through ‘noticing’, during communicative activity, and should therefore be addressed primarily by incidental ‘focus on form’ during task performance,” adding that “there is no compelling evidence for the validity of the model” (2005, p. 376).

Long points out that a considerable body of research has demonstrated that incidental focus on form works. Norris and Ortega’s (2000) statistical meta-analysis of 49 studies of the relative effectiveness of explicit and implicit instruction showed little advantage for explicit instruction. Eleven of the original studies, plus 34 new ones reported in the following decade, prompted a second statistical meta-analysis (Goo, Granena, Novella, & Yilmaz, 2015), which confirmed the earlier results. In addition, a large body of research — well over 60 studies — on recasts has shown that the implicit negative feedback they provide leads to substantial learning gains in grammar and vocabulary (for an excellent narrative review, see Goo & Mackey, 2013; and for two statistical meta-analyses, Li, 2010, and Mackey & Goo, 2007).

The results from a combined total of well over 140 empirical studies and several statistical meta-analyses of the issue to date in these two areas alone should suffice to meet critics’ demand for ‘compelling evidence.’ Meanwhile, nothing approaching the quality and scope of this body of empirical work exists in support of the favored PPP model.


Knowledge acquired incidentally is durable

Swan claims that few studies have demonstrated lasting retention and availability for spontaneous use of forms acquired incidentally (Swan, 2005, p. 379). The facts are these.

  • Results show that explicit procedures often do as well as, and sometimes slightly better than, incidental, on-line focus on form, but usually only with simple linguistic targets, only on immediate post-tests, and only using discrete-point measures; improvements achieved that way tend to deteriorate over time (Doughty, 2003).
  • In contrast, while the jury is still out, results to date suggest that incidentally and/or implicitly learned L2 knowledge is more durable and tends to increase over time (probably because of the initial greater depth of processing involved). In a statistical meta-analysis of 33 studies, Li (2010) found a medium overall effect for oral corrective feedback, and found that the effect was maintained over time. While the immediate and short-term effect of explicit feedback was greater, the longer-term effect size for recasts (a prime example of ‘incidental, on-line focus on form’) was slightly larger than the short-term effect: recasts were more effective than explicit feedback on delayed post-tests and more enduring, even increasing over time (2010, p. 343).


TBLT doesn’t neglect grammar

A number of critics (e.g., Swan, 2005; Widdowson, 2003) have alleged that TBLT pays insufficient attention to the teaching of grammar. While pedagogic tasks in genuine TBLT are not designed to teach particular grammatical structures, that does not mean that grammar is not taught. The difference is that attention to grammar (or phonology, lexis, collocations, pragmatics, etc.) is not carried out as a separate activity, as an end in itself (focus on forms), but during (and if necessary after, but not before) task work, as part of the methodology of TBLT.

A lot of grammar is learned from positive evidence, i.e., from repeated exposure to instances of target grammatical features and their authentic uses encountered primarily in task-based materials. Most problems are dealt with reactively, usually by the teacher, but also by other students, and if pedagogic tasks are designed cleverly enough, by corrective feedback loops in the tasks themselves. The reactive quality of MP6: Focus on form, and MP7: Provide negative feedback, means that the timing of attention to grammar is more likely to be developmentally appropriate and occur at the most propitious moment for the learner(s) concerned, not arbitrarily, when pre-determined by an unseen textbook writer. In sum, linguistic items are dealt with, and dealt with in a more scientifically defensible manner than by the traditional synthetic syllabus.

Long goes on to deal with other criticisms of his TBLT, and I urge you to read the full article. The last part of his article deals with real issues that confront his proposals, which are:

  • Task complexity criteria
  • Task-based assessment and the transferability of task-based abilities
  • In-service teacher education for TBLT


My guess is that most teachers reading this account will react by saying that it’s just not feasible to do the work needed to implement Long’s recommendations. How can most teachers possibly carry out the kind of needs analysis he recommends, and how can they then do the work required to produce the materials needed to elaborate the pedagogical tasks which flow from that needs analysis?

I sympathise, but on the other hand, how can we ignore the fact that ELT is largely ineffective, thanks to the way it’s run and done? If Long’s closely argued case persuades us, surely we have to do something, not just say that the whole thing is unrealistic. It’s only unrealistic because of the way things are organised: the money spent on ELT is huge; does anybody seriously think that the industry can’t afford to do decent needs analyses, make different kinds of materials, or do different kinds of teacher training?


References cited above can be found in Long’s article:

Long, M. H. (2016). In defense of tasks and TBLT: Non-issues and real issues. Annual Review of Applied Linguistics, 36, 5-33. © Cambridge University Press, 2016.

Let me just recommend:

Goo, J., Granena, G., Novella, M., & Yilmaz, Y. (2015). Implicit and explicit instruction in L2 learning: Norris and Ortega (2000) revisited and updated. In Rebuschat, P. (ed.), Implicit and explicit learning of languages (pp. 443-482). Amsterdam and Philadelphia: John Benjamins.

IATEFL 2016: Birmingham Part 1


The last time I travelled to Birmingham, as we approached the station I heard the train driver say on the intercom

Next stop Birmingham; abandon hope all you who alight here.

The driver sped off before the locals could lynch him, out of the Midlands and on towards the North, where things actually get steadily worse until you reach the civilised haven of Scotland. Mind you, if you go South from Birmingham things get steadily worse till you reach the coast, where, if you take my advice, you’ll get the first ferry out and go to France.

It’s a dreary place, Birmingham. It was bombed very heavily in the second world war, they did a truly dire job of re-building it, and it’s now making desperate attempts to re-invent itself as a “service centre”. Part of the re-invention is the new Bullring (formerly the Bull Ring, a horrendous, massive, ugly shopping mall built in the early 60s). Ill-informed Spanish tourists used to flock there hoping to see some hapless local would-be matador get gored to death in the ring by a brave, Andaluz bull, only to find endless shops selling tatty clothes made in Spain by Zara. One up-beat note is that Burne-Jones was born here and the Birmingham Museum has a really excellent collection of Pre-Raphaelite art. The gallery itself is very well-restored, the optimistic glass roof now supplemented with clever artificial lighting, and certainly worth a visit. Make sure you see Ford Madox Brown’s The Last of England, a splendidly disturbing work which will reinforce your instinct to flee England as soon as you can.

You detect an anti-English sentiment? Well, yes, you’re right: I’m not a big fan of England. In many ways I hate it. Which reminds me of the time I was in a pub in London, standing at the bar, and this big bloke walked in.

Un pint of bitter he said to the landlord with a thick accent, which I took to be German.

Had this happened more recently, no doubt he would have followed Dellar’s example and said Can I get un pint of bitter, but, mercifully, English hadn’t yet sunk so low.

Where are you from? I asked him.

I am from Svizzzitzzzerland he replied.

Ah, yes, Switzerland said I, craftily recasting.

You like Svizzzitzzzerland? he asked.

To tell you the truth, I find it a bit dull – the people are very law-abiding, I explained.

Ya! I hate Svizzzitzzzerland also! he said.


I don’t actually hate England, but outside London it’s a hard place to like, in my opinion. The weather’s the worst of it, but there’s also a deeply engrained, smug, anti-intellectualism which doesn’t suit my sensitive, bookish soul. Not that this anti-intellectualism is confined to the lower orders in England. The ruling classes have always been deeply suspicious of intellectuals, as illustrated by the expression “too clever by half” used in the upper echelons of the civil service, the judiciary, etc. to refer to anyone who reads without moving their lips or does mental arithmetic.

Anyway, Birmingham is the venue of the 2016 IATEFL jamboree, and if you’ve already bought your tickets, well, jolly good luck to you. My advice is: get off the train, find a taxi and go straight to your hotel. I’m not saying it’s dangerous to dawdle, just that there isn’t any point in dawdling, unless you really can’t wait to have your first chilli-flavoured bit of low-grade meat and sawdust sarnie, or you want to do a bit of bowling.

Once in your hotel, unless you’re not paying the bill (that is, unless you’re a star of the event or a commercial rep of a certain standing), you’ll probably notice the smell of damp carpets and overcooked cabbage. The damp carpet smell is a feature of English life; wall-to-wall carpeting used to be a sign that you’d dragged yourself out of poverty, and now it’s a sure sign that you’re falling back down into it, unable to aspire to the stripped wooden floors strewn with kilims that the more affluent homes display. As for the cabbage stink, it’s how the English cook. And if it’s not cabbage, well, it’s probably curry. Either way, your hotel will make matters worse by having air fresheners everywhere, just to make sure you can’t breathe properly.

So now it’s Registration Day. There you are, after a bad night’s sleep, waiting in line, already suffering from the effects of a full English breakfast. Please, listen. Do NOT eat a hotel “Full English Breakfast”, or any part of it. It’s bad food, believe me: seriously, it’s very bad food. Do yourself a favour: don’t eat any of it. Really, eat the tablecloth before you eat the sausage or the bacon or the beans or the tinned tomatoes or any of it. You’ve got enough to cope with without eating that crap.

You get your plastic ID badge and your bag of shoddy conference goodies, and in you go. As the train driver said: Abandon hope!

To be Continued

Larsen-Freeman Lost in Complexity: Bullshit Baffles Brains


I see that Diane Larsen-Freeman is doing a plenary at the 2016 IATEFL conference. She shares the “Invited Plenary Speaker” limelight with her admirer Scott Thornbury, who will wisely side-step theoretical stuff and entertain everybody with a history of ELT. Thornbury often cites Larsen-Freeman’s incoherent ventures into science and postmodernism in his own forlorn attempts to promote emergentism, thus showing that he’s not only a dedicated follower of fashion but also a bad judge of scholarship.

Larsen-Freeman wants to re-direct ELT. In her plenary she will suggest that we ditch our outdated processing model of SLA and replace it with an ecological approach where, among other radical proposals, the construct of affordances takes the place of input.

Actually, I’m not sure that affordances is a construct here; maybe it’s just a metaphor, but anyway Larsen-Freeman is the last person to ask for clarification since she messes around with terms like metaphor, hypothesis and explanation with such gay abandon that it’s anybody’s guess what she means.

In an attempt to prepare everybody in the audience for what they’ll hear, I offer a summary of Gregg’s (2010) review of Larsen-Freeman and Cameron’s truly dire book, on which the talk will be based. I’ve taken terrible liberties with Gregg’s text. You should read his article, and I apologise to him for my clumsy summary. Everything that follows is from the article.


Larsen-Freeman and Cameron (LFC) explain that complexity theory is the theory of complex systems, a complex system being a certain type of system

‘produced by a set of components that interact in particular ways to produce some overall state or form at a particular point in time’ (p. 26).

Complex systems are dynamical, non-linear, sensitive to initial conditions, open to input from outside the system, and adaptive.

Got it? Well, good, because the book gives us nothing that could enable one to analyse SLA data or critique SLA research in terms of complexity theory. It’s hard to find any connection between complexity theory on the one hand, and the conclusions LFC draw from it on the other. As Gregg puts it:

by and large, when they say ‘from a complexity perspective, P’, they could just as well have said more simply, and more accurately, ‘We think that P.’

Misunderstanding Science

LFC start by misunderstanding the nature of standard science and the relation of complexity theory to it. For instance, they say:

Whereas positivist research is based on the assumption that there are universal laws and thus sets predictability as a goal of the research process, from this complexity perspective, no two situations can be similar enough to produce the same behavior; thus predictability becomes impossible. (p. 16)

Some comments:

  1. Positivists are a (defunct) class of empiricists. The nativist linguists and cognitive scientists that LFC oppose are not empiricists, and yet they, too, believe in universal laws and value predictability.
  2. Complexity theory does not oppose mainstream science; as Gribbin (2004: 3) points out, ‘chaos and complexity obey simple laws – essentially, the same simple laws discovered by Isaac Newton.’
  3. Predictability in standard science is not always the goal – consider palaeontology or evolutionary biology, for instance – so non-predictability by itself cannot distinguish complexity theory from other scientific theories.
  4. Not only do LFC incorrectly characterize science as positivistic, they consider it ‘reductionist’, where ‘reductionism’ seems to mean ‘looking for an underlying cause or causes’. LFC reject ‘the common reductionist approach in science, which relies on a central principle that one can best understand an object of inquiry by taking it apart and examining its pieces’ (p. 231). How this differs from identifying the elements of a complex system and studying their interactions is not clear. In any case, on the standard meaning of the term it is LFC who could be considered reductionists, since they reject the position that language cannot be reduced to general cognitive or neurological phenomena.

Misunderstanding language

LFC say

  • Language is no longer perceived as an idealized, objectified, atemporal, mechanistic ‘thing’. (p. 20)
  • There is no need to conceive of language in a decontextualized terminal state of frozen animation. (p. 91)
  • A complex systems view of language rejects a view of language as something that is taken in, a static commodity that one acquires and therefore possesses forever. (pp. 115–16)
  • Language is not a single homogeneous construct to be acquired. (p. 116)
  • Learning is not the taking in of linguistic forms by learners. (p. 135)

LFC offer not one single example of a researcher, nativist or otherwise, who perceives language as an idealized thing, or as something taken in, etc. This sort of preaching to the choir may satisfy the choir, or at least the less intellectually rigorous of the singers, but it is a disservice to the disinterested reader looking for information about complexity in language.

Here is one whole paragraph from the five-paragraph section on ‘First language acquisition from a nativist position’ (p. 117; the numbers are added for reference):

Nativist accounts (1) rest on the assumption that the capacity to learn language is a unique property of the human mind that is represented as (2) a separate module in the brain, conceived of as (3) an organ within the brain that performs specific kinds of computation (4) (Chomsky, 1971). Nativists believe that this modular architecture allows the shape and form of I-language to be largely independent of (5) other aspects of cognitive processing or social functioning. They also believe that the fact that (6) a UG is contained within the module (7) accounts for the evolution [sic; of language?] within the human species and explains how native language acquisition can take place so expediently, given what they feel is (8) a rather degenerative state [sic] of the input, filled with pauses and inchoate utterances and other dysfluencies, (9) referred to collectively as the ‘poverty-of-stimulus’. The impoverished input, combined with what is alleged to be an absence of negative evidence (i.e. evidence of what the system will not permit), leads nativists to argue ‘that the complexity of core language cannot be learned inductively by general cognitive mechanisms and therefore learners must be (10) hard-wired with principles that are specific to language’ (Goldberg, 1995: 119), although quite naturally, the search for what these principles are is an ongoing one, which has gone through several stages so far.

Very briefly:

  1. Nativist accounts rest on assumptions that have nothing to do with the putative uniqueness of language.
  2. The module – again, not a specifically nativist assumption – is a module of the mind not the brain.
  3. Nor is it taken to be an organ of the brain.
  4. Chomsky (1971) says nothing about the brain, or of the language faculty as an organ.
  5. The language faculty or UG is not an aspect of cognitive processing, let alone social functioning.
  6. UG is not contained in the putative module, it is the module.
  7. Nobody, but nobody, claims that UG accounts for the evolution of language, or of anything else, for that matter.
  8. Dysfluencies have never been of any importance in Poverty of the Stimulus arguments. And no nativist has ever said that input is filled with dysfluencies.
  9. And of course nobody refers to the dysfluencies themselves as the poverty of the stimulus.
  10. Nobody talks of ‘hard-wiring’ principles.

Misunderstanding acquisition research

This failure to engage with the facts shows up as well in LFC’s rare accounts of putative acquisition phenomena. They tell us, for instance, that a complexity perspective can explain the so-called ‘vocabulary burst’ in child language acquisition (pp. 129–30). They fail, however, to tell us what sort of burst this is; they do not provide any documentation of the burst; they do not even suggest what about the burst would be accounted for, or how, by a dynamic systems account. And they do not consider the claim by Bloom (2000) – who studied the data in detail – that there is no burst to account for (see also Ganger and Brent, 2004; McMurray, 2007).

Even more striking is LFC’s bizarre explanation for so-called U-shaped behavior in the first language acquisition of the English past tense (p. 129).

Initially, learners fail to mark past tense morphologically (i.e. they fail to use the verb + -ed construction) due to the frequency of irregular verbs in the language addressed to them. Later, the irregulars disappear in their production … As the number of verbs in the competition pool expands across the course of learning … the irregular forms reappear.

The only source LFC give for this idiosyncratic account is Ellis and Larsen-Freeman (2006), which simply says the same thing. LFC, in other words, make no attempt to examine the data, which clearly show that:

  1. Initially learners fail to mark the past on both regulars and irregulars.
  2. Irregular pasts, once they appear in production, do not ever disappear; and
  3. The U-shaped behavior LFC propose to account for does not exist (see, inter alia, Marcus et al., 1992; Maratsos, 2000; Maslen et al., 2004).

Given the lack of a phenomenon to explain by appealing to dynamic systems theory, it is probably superfluous to point out that, in any case, the proffered explanation does not in fact depend in any way on dynamic systems.

This casual disregard for the actual phenomena of language is reflected in LFC’s reference list: something under 500 items, and not one single paper on language acquisition, first or second; not one single paper from the theoretical linguistics literature proposing some account of some specific phenomenon of language. This, of course, is because there is no discussion anywhere in the book itself about specific linguistic phenomena (unless you count the two examples just mentioned), even from their ‘complexity perspective’. Rather, what we get is ex cathedra, take-it-or-leave-it statements of doctrine:

  • Dynamic systems theory does away with the distinction between competence and performance. (p. 17)
  • Complexity theory thus [sic] brings about a separation of explanation and prediction. (p. 72)
  • If language is conceived of as a complex system, then it is entirely possible for novel complexities to emerge. (p. 98)
  • A language-specific innate mental organ is not consistent with seeing language as a complex adaptive system. (p. 156; pace, for example, Plaza-Pust, 2008; Hohenberger and Peltzer-Karpf, 2009: 482)

And so on. Sometimes LFC do not even bother to disguise their opinion as a claim. Here, for instance, in its entirety, is their discussion of the work of William O’Grady, arguably the one emergentist in the field of language acquisition to take seriously both the phenomena and the nativist account of the phenomena, and to offer a detailed, wide-ranging, and at least potentially competitive analysis that rejects the idea of a UG:

While we find O’Grady’s (2005) syntactic carpentry account intriguing, we do not feel that a computational account is necessary or desirable to account for the emergence of order (p. 159, footnote 1).

How, one might ask, do LFC propose to account for ‘the emergence of order’ without computation? Why would this be desirable? I have no idea and, if LFC do, they are keeping it under their hats.

Not only do LFC ignore the work that has been done by ‘positivist’, ‘reductionist’ acquisition researchers (i.e. researchers with a commitment to empirically sound science), they do not present any examples of research from their complexity perspective. As Ionin (2007: 28) says,

It does not seem very fair to criticize existing, developed theories for not incorporating all possible factors, when no alternative theory is presented that does incorporate these factors.

LFC do give an extended account of Larsen-Freeman (2006) – a study of five Chinese speakers writing the same story four times over a six-month period – but they do not show the relevance of dynamic systems theory or complexity theory to that research.

Gregg concludes

Complexity theory and dynamic systems theory are important, well-established, and productive parts of the physical sciences. Although it is still a matter of some controversy whether these theories can be applied to the biological and cognitive sciences, work such as that of Thelen and Smith is certainly promising, to say the least. So it is worth trying to find out to what extent, if any, complexity theory can be applied to the domain of language. LFC have failed utterly to address the question, let alone resolve it. The question is an empirical one, and the answer – if answer there be – will be found not by taking a perspective and expatiating on it, but by doing the science.


We must be critical. We must be on guard against bullshit. So often we’re lulled into acquiescence by big names and by academic credentials. What Gregg makes clear in his article is that the book he reviews is bullshit.

Scott Thornbury swallows this bullshit and attempts his own popular version of it (see https://criticalelt.wordpress.com/sla/emergentism-2/ for example). Like the LFC book, Thornbury’s attempts to argue the case for emergentism are hopeless. Actually, they’re not argued at all: they’re just asserted, with lots of illustrations.

So much of this new wave of relativist bullshit, most of it weirdly inspired by an unconsidered embrace of empiricism (why do so few of them appreciate the consequences of the epistemology they sign up for?), rides on the back of what is now considered to be Larsen-Freeman’s “seminal work” of 1997, an article where she does exactly what Gregg criticises in his review of the 2008 book. In general, Larsen-Freeman

  • misrepresents scientific method; in particular she fails to distinguish between description and explanation and demonstrates an ignorance of the differences between a theory, a hypothesis, a theoretical construct, and a metaphor
  • misrepresents positivism
  • misrepresents empiricism
  • misrepresents reductionism
  • misrepresents nativism
  • misrepresents cognitive SLA research
  • offers no convincing connection between general aspects of complexity theory and her views of second language learning and teaching.

In 1999 I was in a bar in San Cugat with my good friend Connie O’Grady and George Yule. George was on a roll. “Bullshit baffles brains, George”, intervened Connie, “but don’t suppose for a second that we don’t see through it. We don’t like dressed-up crap, and we recognise good stuff on the rare occasions that we get it”.


Gregg, K. R. (2010). Shallow draughts: Larsen-Freeman and Cameron on complexity. Second Language Research, 26(4), 549–556.

Larsen-Freeman, D. (1997). Chaos/complexity science and second language acquisition. Applied Linguistics, 18(2), 141-165.

Larsen-Freeman, D. and Cameron, L. (2008). Complex systems and applied linguistics. Oxford: Oxford University Press.

Can we get a pineapple?


Lost and Unfounded

Leo Selivan’s and Hugh Dellar’s recent contributions to EFL Magazine give further evidence that their strident, confidently expressed ideas lack any proper theoretical foundations.

We can compare the cumulative attempts of Selivan and Dellar to articulate their versions of the lexical approach with the more successful attempts made by Richards and Long to articulate their approaches to ELT. Richards (2006) describes what he calls “the current phase” of communicative language teaching as

a set of principles about the goals of language teaching, how learners learn a language, the kinds of classroom activities that best facilitate learning, and the roles of teachers and learners in the classroom (Richards, 2006: 2)

Note that Richards says this on page 2 of his book: he rightly starts out with the assumption that “a set of principles” is required.

Long (2015) offers his own version of task based language teaching and he goes to great lengths to explain the underpinnings of his approach. His book is, in my opinion, the best example in the literature of a well-founded, well-explained approach to ELT. It’s based on a splendidly lucid account of a cognitive-interactionist theory of instructed SLA, on careful definitions of task and needs analysis, and on 10 crystal clear methodological principles. Long’s book is to be recommended for its scholarship, its thoroughness, and, not least, for its commitment to a progressive approach to ELT.

So what do Selivan and Dellar offer?

In his “Beginners’ Guide To Teaching Lexically”, http://eflmagazine.com/beginners-guide-to-the-lexical-approach/ Selivan makes a number of exaggerated generalisations about English and then outlines “the main principles of the lexical approach”. These turn out to be

  1. Ban Single Words
  2. English word ≠ L1 word
  3. Explain less – explore more
  4. Pay attention to what students (think they) know.

To explain how such “principles” adequately capture the essence of the lexical approach, Selivan offers “A bit of theory” for each one. For example, Selivan says “A new theory of language, known as Lexical Priming, lends further support to the Lexical Approach. … By drawing students’ attention to collocations and common word patterns we can accelerate their priming”. Says he. But what reasons does he have for such confident assertions? Selivan fails to give his reasons, and fails to give any proper rationale for the claims he makes about language and teaching.

In his podcast, http://eflmagazine.com/hugh-dellar-discusses-the-lexical-approach/ Dellar agrees that collocation is the driving force of English. He claims that the best way to conduct ELT is to concentrate on presenting and practising the lexical chunks needed for different communicative events. Teachers should get students to do things with these chunks such as “fill in gaps, discuss them, order them, say them, write them out themselves, etc.” with the goal of getting students to memorise them. Again, Dellar doesn’t explain why we should concentrate on these chunks, or why teachers should get students to memorise them. Maybe he thinks “It stands to reason, yeah?”

At one point in his podcast Dellar says that, while those just starting to learn English will go into a shop and say “I want, um, coffee, um sandwich”,

…. as your language becomes more sophisticated, more developed, you learn to kind of grammar the basic content words that you’re adding there. So you learn “Hi. Can I get a cup of coffee and a sandwich, please.” So you add the grammar to the words that drive the communication, yeah? Or you just learn that as a whole chunk. You just learn “Hi. Can I get a cup of coffee? Can I get a sandwich, please?” Or you learn “Can I get…” and you drop in a variety of different things.

This is classic “Dellarspeak”: a badly-expressed misrepresentation of someone else’s erroneous theory. Dellar doesn’t tell us how we teach learners “to grammar” content words, or when it’s better to teach “the whole chunk” – or what informs his use of nouns as verbs, for that matter. As for the “Can I get…?” example, what’s wrong with just politely naming what we want: “Good morning. A coffee and a sandwich, please.”? What is gained by teaching learners to use the redundant Can I get… phrase?

But enough of Dellar’s hapless attempts to express other people’s ideas, let’s cut to the chase, if you get my drift. The question I want to briefly discuss is this:

Are Selivan’s and Dellar’s claims based on coherent theories of language and language learning, or are they mere opinions?


Models of English 

Crystal (2003) says: “an essential step in the study of a language is to model it”. Here are two models:

  1. A classic grammar model of the English language attempts to capture its structure, described in terms of grammar, the lexicon and phonology (see Quirk et al., 1985, and Swan, 2001, for examples of descriptive and pedagogical grammars). This grammar model, widely used in ELT today, is rejected by Hoey.
  2. Hoey (2005) says that the best model of language structure is the word, along with its collocational and colligational properties. Collocation and “nesting” (words joining with other primed words to form sequences) are linked to contexts and co-texts. So grammar is replaced by a network of chunks of words. There are no rules of grammar; there’s no English outside a description of the patterns we observe among those who use it. There is no right or wrong in language. It makes little sense to talk of something being ungrammatical (Hoey, 2005).

Selivan and Dellar uncritically accept Hoey’s radical new theory of language, but is it really better than the model suggested by grammarians?

Surely we need to describe language not just in terms of the performed but also in terms of the possible. Hoey’s argument that we should look only at attested behaviour and abandon descriptions of syntax strikes most of us as a step too far. And I think Selivan and Dellar agree, since they both routinely refer to the grammatical aspects of language. The problem is that Selivan and Dellar fail to give their own model of language, they fail to clearly indicate the limits of their adherence to Hoey’s model, they fail to say what place syntax has in their view of language. In brief, they have no coherent theory of language.

Hoey’s Lexical Priming Theory

Hoey (2005) claims that we learn languages by subconsciously noticing everything (sic) that we have ever heard or read about words, and storing it all in a massively repetitious way.

The process of subconsciously noticing is referred to as lexical priming. … Without realizing what we are doing, we all reproduce in our own speech and writing the language we have heard or read before. We use the words and phrases in the contexts in which we have heard them used, with the meanings we have subconsciously identified as belonging to them and employing the same grammar. The things we say are subconsciously influenced by what everyone has previously said to us.

This theory hinges on the construct of “subconscious noticing”, but instead of explaining it, Hoey simply asserts that language learning is the result of repeated exposure to patterns of text (the more the repetition, the better the knowledge), thus adopting a crude version of behaviourism. Actually, several on-going quasi-behaviourist theories of SLA try to explain the SLA process (see, for example, MacWhinney, 2002; O’Grady, 2005; Ellis, 2006; Larsen-Freeman and Cameron, 2008), but Hoey pays them little heed, and neither do Selivan and Dellar, who swallow Hoey’s fishy tale hook, line and sinker, take the problematic construct of priming at face value, and happily use “L1 primings” to explain L1 transfer as if L1 primings were as real as the nose on Hoey’s face.

Hoey rejects cognitive theories of SLA which see second language learning as a process of interlanguage development, involving the successive restructuring of learners’ mental representation of the L2, because syntax plays an important role in them. He also rejects them because, contrary to his own theory, they assume that there are limitations in our ability to store and process information. In cognitive theories of SLA, a lot of research is dedicated to understanding how relatively scarce resources are used. Basically, linguistic skills are posited to slowly become automatic through participation in meaningful communication. While initial learning involves controlled processes requiring a lot of attention and time, with practice the linguistic skill requires less attention and less time, thus freeing up the controlled processes for application to new linguistic skills. To explain this process, the theory uses constructs such as comprehensible input, working and long-term memory, implicit and explicit learning, noticing, intake and output.

In contrast, Hoey’s theory concentrates almost exclusively on input, passing quickly over the rest of the issues, and simply asserts that we remember the stuff that we’ve most frequently encountered. So we must ask Selivan and Dellar: What theory of SLA informs your claims? As an example, we may note that Long (2015) explains how his particular task-based approach to ELT is based on a cognitive theory of SLA and on the results of more than 100 studies.

Hoey’s theory doesn’t explain how L2 learners process and retrieve their knowledge of L2 words, or how paying attention to lexical chunks or “L1 primings” affects the SLA process. So what makes Selivan and Dellar think that getting students to consciously notice both lexical chunks and “L1 primings” will speed up primings in the L2? Priming, after all, is a subconscious affair. And what makes Dellar think that memorising lexical chunks is a good way to learn a second language? Common sense? A surface reading of cherry-picked bits of contradictory theories of SLA? Personal experience? Anecdotal evidence? What? There’s no proper theoretical base for any of Dellar’s claims; there’s scarce evidence to support them; and there’s a powerful theory supported by lots of evidence which suggests that they’re mistaken.


All Chunks and no Pineapple

Skehan (1998) says:

Phrasebook-type learning without the acquisition of syntax is ultimately impoverished: all chunks but no pineapple. It makes sense, then, for learners to keep their options open and to move between the two systems and not to develop one at the expense of the other. The need is to create a balance between rule-based performance and memory-based performance, in such a way that the latter does not predominate over the former and cause fossilization.

If Selivan and Dellar agree that there’s a need for a balance between rule-based performance and memory-based performance, then they have to accept that Hoey is wrong, and confront the contradictions that plague their present position on the lexical approach, especially their reliance on Hoey’s description of language and on the construct of priming. Until Selivan and Dellar sort themselves out, and tackle basic questions about a model of English and a theory of second language learning so as to offer some principled foundation for their lexical approach, that approach amounts to little more than an opinion; more precisely, the unappetising opinion that ELT should give priority to helping learners memorise pre-selected lists of lexical chunks.


Crystal, D. (2003) The English Language. Cambridge: Cambridge University Press.

Ellis, N. C. (2006) Language acquisition and rational contingency learning. Applied Linguistics, 27 (1), 1-24.

Hoey, M. (2005) Lexical Priming: A New Theory of Words and Language. Psychology Press.

Krashen, S. (1985) The Input Hypothesis: Issues and Implications. Longman.

Larsen-Freeman, D. and Cameron, L. (2008) Complex Systems and Applied Linguistics. Oxford: Oxford University Press.

Lewis, M. (1993) The Lexical Approach. Language Teaching Publications.

Lewis, M. (1996) Implications of a lexical view of language. In Willis, J. & Willis, D. (eds.) Challenge and Change in Language Teaching, pp. 4-9. Heinemann.

Lewis, M. (1997) Implementing the Lexical Approach. Language Teaching Publications.

Long, M. (2015) Second Language Acquisition and Task-Based Language Teaching. Wiley.

MacWhinney, B. (2002) The Competition Model: the Input, the Context, and the Brain. Carnegie Mellon University.

O’Grady, W. (2005) How Children Learn Language. Cambridge: Cambridge University Press.

Richards, J. (2006) Communicative Language Teaching Today. Cambridge University Press.

Quirk, R., Greenbaum, S., Leech, G. and Svartvik, J. (1985) A Comprehensive Grammar of the English Language, London: Longman.

Skehan, P. (1998) A Cognitive Approach to Language Learning. Oxford: Oxford University Press.

Swan, M. (2001) Practical English Usage. Oxford: Oxford University Press.