The science behind English DNA

OK! You asked for it. This page is very long despite being a "short" explanation of the research English DNA is built on. Use the Quick Links to get around it easily.



When attempting to acquire a second language, one of the most daunting tasks facing students is acquiring a vocabulary large enough to be even minimally competent in the language. Most students quickly realise that the memorization required to learn thousands of new words can be a tedious and boring undertaking.

It is commonly agreed that the sheer size of the task makes acquiring an extensive vocabulary “one of the largest challenges in learning a second language” [1].

Yet acquiring vocabulary in our own language is an almost effortless activity. Why should there be such a dramatic difference in the effort required to learn new words in a foreign language in comparison with learning the vocabulary of one’s own native language?

Nation (2001) provides some clues to this puzzle, making the point that there are “different learning burdens for learners with different language backgrounds” (p. 23): “… the more a word represents patterns and knowledge that learners are already familiar with, the lighter its learning burden… For learners whose first language is closely related to the second language, the learning burden of most words will be light. For learners whose first language is not related to the second language, the learning burden will be heavy” (p. 24).

In other words, when the “patterns” of foreign words are similar to the patterns present in the words of our native language, they are easier to acquire.


Fast mapping


Both adults and children can acquire new words very quickly in their native language. Small children can be observed learning new words on a single exposure to them. This process has come to be known as “fast mapping” (Carey & Bartlett, 1978; Dollaghan, 1985; Dollaghan, 1987).

In a series of experiments, 4- to 5-year-old children were casually exposed to some nonsense words and then shown a novel object that each new “nonword” was supposed to refer to. Later, the children had no difficulty selecting the correct objects when the experimenter asked for them using the new words, even though the children had previously heard each word on only a single occasion.

Now, weren’t these made-up “nonsense words” equivalent to “foreign words”? Could these English-speaking children have learned Hungarian or Mongolian words with equal ease?

Perhaps not…

Even though the experimenters had just invented the experimental words, these made-up words had a special quality: the underlying structure of their sounds conformed to the typical structure of English words (not Hungarian or Mongolian words, which have their own underlying structure). The important point is that the English-speaking children found these nonsense (but English-like) words easy to learn because they already had a deep knowledge of the sound patterns that typically make up the English lexicon. Children acquire this knowledge by about 9 months of age, well before they even begin to speak.


Acquiring the underlying patterns


In a study by Jusczyk, Friederici, Wessels, Svenkerud, & Jusczyk (1993), 9-month-old American and Dutch infants listened to lists of words that were either English or Dutch. Each list contained words from one language that violated the phonotactic constraints of the other language (i.e. they contained sequences of sounds which do not occur in the other language). Both groups of infants listened significantly longer to the lists in their own native language, showing that the babies were already able to distinguish between foreign words and words in their native language on the basis of the underlying structure of the words.

A related study by Friederici & Wessels (1993) provided clear evidence that infants younger than 1 year of age have already acquired information about the typical sequences of sounds (phonotactic patterns) of their native language. The authors demonstrated that 9-month-old Dutch infants listened longer to monosyllables that contained sequences that were phonotactically legal in Dutch (i.e. sequences of sounds that can typically occur in Dutch words) than to monosyllables containing illegal sequences (sequences of sound which would never occur in Dutch words).

When a novel word contains a sequence of sounds that never occurs in English, it is quickly apparent to an English speaker that the word sounds “foreign”. In Russian, many words begin with the sequence “zdr…”, but this sequence never occurs at the beginning of English words and therefore sounds very un-English. On the other hand, a nonsense word starting with “bla…” doesn’t sound at all foreign to an English speaker because there are many English words that begin with this very common combination.
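The “zdr…” versus “bla…” contrast can be sketched in a few lines of Python. This is only a rough illustration and not the method used in any of the studies cited here: it uses spelling as a stand-in for sounds, and the word list and function names (TOY_LEXICON, onset, has_familiar_onset) are invented for the example.

```python
VOWELS = set("aeiou")

# Hypothetical toy lexicon. A real analysis would use a large,
# phonemically transcribed dictionary rather than ten spelled words.
TOY_LEXICON = ["black", "blame", "blanket", "bread", "string",
               "stream", "plant", "trip", "dream", "glass"]


def onset(word):
    """Return the letters before the first vowel letter.

    This is a crude approximation of the word-initial consonant cluster.
    """
    for i, ch in enumerate(word):
        if ch in VOWELS:
            return word[:i]
    return word  # no vowel letter found: treat the whole word as the onset


def attested_onsets(lexicon):
    """Collect every onset that begins at least one word in the lexicon."""
    return {onset(w) for w in lexicon}


def has_familiar_onset(nonword, lexicon=TOY_LEXICON):
    """True if the nonword starts with an onset already seen in the lexicon."""
    return onset(nonword.lower()) in attested_onsets(lexicon)
```

With this toy list, has_familiar_onset("blafe") returns True because “bl” begins several of the listed words, while has_familiar_onset("zdrafe") returns False because no listed word begins with “zdr”.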


The importance of frequency of patterns


In another experiment with 9-month-old infants, Jusczyk, Luce, & Charles-Luce (1994) investigated whether infants are sensitive to the frequency of occurrence of phonotactic patterns in English words. The experimenters played the babies made-up words (“nonwords”) that contained sequences of sounds that occur very frequently in English, or nonwords made up of patterns that do occur in English words, but less frequently.

By observing how the babies turned their heads towards the loudspeakers that were used to broadcast the nonwords into the room, the experimenters were able to note that the infants listened significantly longer to words which contained high-probability patterns in comparison with the words which contained less frequently occurring patterns.

The researchers came to the remarkable conclusion that the babies had already acquired a deep intuitive knowledge of the statistical probability of the occurrences of the many different sound patterns in their native language: "[t]he longer listening times to the high-probability lists may reflect the fact that the infants have registered something about the frequency of certain phonotactic patterns in the input" (Jusczyk et al., 1994, p. 636). 

These sorts of experiments give us compelling evidence that explains why we find it easy to learn and recognize words in our native language but not words in a foreign language. By the age of 9 months, we have acquired an inventory of the basic structural building blocks and the rules used to form them into the words that make up our native lexicon. Learning to recognize the sound of a new word in our native language involves piecing together components that are already highly familiar, using rules that are intuitively obvious. Not only are the individual elements already familiar to young infants, but the statistical probabilities of how these elements combine with each other are also known.

At a deeply intuitive level we know which combinations of sounds can be used to begin a word and which can’t, which sounds typically follow which others (and which never do), which groups of sounds tend to occur in the middle of words and which combinations tend to occur at the end of words.
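As a minimal sketch of what such position-sensitive knowledge might look like, the Python below counts which letter pairs occur at the beginning, in the middle and at the end of the words in a tiny word list. It is an illustration only: letters stand in for sounds, and the word list and function names are assumptions made for the example, not part of any cited study.

```python
from collections import Counter

# Hypothetical toy lexicon chosen to show the behaviour of "ng".
TOY_LEXICON = ["string", "strong", "spring", "sting",
               "song", "long", "singer", "finger"]


def positional_pairs(word):
    """Yield (position, pair) for each adjacent letter pair in the word."""
    word = word.lower()
    last = len(word) - 2  # index of the final pair
    for i in range(len(word) - 1):
        if i == 0:
            position = "initial"
        elif i == last:
            position = "final"
        else:
            position = "medial"
        yield position, word[i:i + 2]


def positional_counts(lexicon):
    """Count how often each letter pair occurs in each word position."""
    counts = {"initial": Counter(), "medial": Counter(), "final": Counter()}
    for word in lexicon:
        for position, pair in positional_pairs(word):
            counts[position][pair] += 1
    return counts
```

For this list, positional_counts(TOY_LEXICON)["final"]["ng"] is 6 and the "medial" count is 2, but the "initial" count is 0: “ng” is common at the end of these words and never begins one, which matches the intuition of an English speaker.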

This intuitive knowledge, built up through early and massive exposure to the spoken language, lies behind “wordlikeness”, a phenomenon that provides a major clue to understanding how we can improve our ability to acquire foreign words more rapidly and easily.


Why is "wordlikeness" important?


In a paper entitled "Is nonword repetition a test of phonological memory or long-term knowledge? It all depends on the nonwords", Gathercole (1995) investigated how well children are able to simply repeat back specially made-up words.

Before the experiment began, a panel of English-speaking adults had listened to all the made-up words and had rated each one as being more or less “wordlike”—their subjective judgement of how much it resembled a typical English word.

During the experiment, the researchers would simply pronounce each “nonword” for the child, and then ask the child to repeat it back. The researchers were interested in the extent to which the children’s performance was influenced by some innate memory ability (their “phonological working memory”) or whether it was influenced by the child’s knowledge of English words.

The experimenters found that the children showed greater accuracy in repeating nonwords that the panel of adults had previously rated as more “wordlike” than those that had been rated as less “wordlike”.

The conclusion we can draw from this experiment is that the children were better able to reproduce (repeat aloud) the more wordlike stimuli because they were more familiar with the underlying patterns in these stimuli. The wordlike stimuli could well have been real English words that the children would learn quickly, while the unwordlike stimuli were more like foreign words—which the children found very difficult to remember and had difficulty repeating back accurately.

It turns out that adults are able to make very fine, accurate judgements about how “wordlike” made-up words are. This gives us an important clue about how the brain retains information about words.

In a series of experiments, Frisch, Large & Pisoni (2000) established that there is a strong relationship between the probability that certain small sequences of sound occur in real English words and how adults will rate made-up words as “wordlike”.

Nonwords containing low-probability sequences are typically rated lower on a “wordlikeness” scale than nonwords that contain combinations of sounds that occur very frequently.
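The link between pattern probability and “wordlikeness” can be sketched as a simple scoring function. The Python below is a hedged illustration, not the procedure used by Frisch, Large & Pisoni (2000): it works on spelling rather than sounds, the toy word list is invented, and the scoring scheme (the average log of a smoothed letter-pair frequency) is an assumption chosen for simplicity.

```python
import math
from collections import Counter

# Hypothetical toy lexicon. Real work would use a large transcribed corpus.
TOY_LEXICON = ["black", "blame", "plane", "lane", "crane",
               "brand", "stand", "bland", "blink", "drink"]


def bigrams(word):
    """Split a word into adjacent letter pairs, with '#' marking the edges."""
    padded = "#" + word.lower() + "#"
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def train(lexicon):
    """Count every letter pair in the lexicon."""
    return Counter(bg for word in lexicon for bg in bigrams(word))


def wordlikeness(nonword, counts, alphabet_size=27):
    """Average log frequency of the nonword's letter pairs.

    Add-one smoothing keeps unseen pairs from producing log(0).
    alphabet_size=27 assumes 26 letters plus the '#' edge marker.
    Scores are negative; a score closer to zero means more wordlike.
    """
    total = sum(counts.values())
    possible_pairs = alphabet_size ** 2
    pairs = bigrams(nonword)
    log_probs = [math.log((counts[bg] + 1) / (total + possible_pairs))
                 for bg in pairs]
    return sum(log_probs) / len(log_probs)
```

After counts = train(TOY_LEXICON), the nonword “blane” scores higher than “zdrok”, because every letter pair in “blane” occurs in the word list while most of the pairs in “zdrok” do not.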

Even more importantly, the experiments by Frisch, Large & Pisoni (2000) showed that nonwords constructed from sounds that occur more frequently in English were recognised more accurately and more quickly after the experimental subjects had heard them only once; i.e. the subjects were able to “fast map” these words effortlessly. This suggested to the researchers that the participants were using an implicit (intuitive) knowledge of the frequency distribution of the patterns of English words to improve recognition. In other words, the experimental nonwords that were more like real English words were learned quickly and efficiently, while the nonwords that were not constructed like real English words (and so resembled foreign words) were difficult to recall.

The implication of these experiments is that the sort of “fast mapping” (or quick learning) of new native words that we observe in children is dependent on how “wordlike” the new items are. How “wordlike” a learner may consider a new word is, in turn, dependent on his or her knowledge of the underlying phonotactic components of the target language and an implicit knowledge of the probability that these components can combine to form words. This “deep intuitive knowledge” about how words are structured is dependent on the person’s previous exposure to the spoken language.

The obvious conclusion for foreign language learners is that the acquisition of foreign language vocabulary becomes easier and more efficient to the extent that the learner is able to acquire this sort of implicit phonotactic knowledge early in the learning process.

We all acquire a detailed knowledge of the phonotactic probabilities of our native language very early in life—we abstract it out from massive exposure to the spoken language we hear around us as babies. There is evidence that this process even begins in the womb from about 20 weeks. 

Foreign language students miss this early formative stage in their attempt to acquire a new language. They begin their study without the sort of extensive passive exposure to the language that they had with their own native language. Without this exposure they lack the sort of neural framework that would make foreign words sound less “foreign” and sound more “wordlike”—and thus easier to acquire.


A framework for learning new words


The experiments cited above all suggest that before the age of 1 year, our brains have built up a detailed (but intuitive) body of knowledge about the phonological and phonotactic structure of words. This detailed internal representation, attuned to the specifics of the native language, provides a sort of “framework” into which new words can quickly be added.

We can observe the role of such detailed “frameworks” in all kinds of other memory tasks. In a recent popular science book about memory, Joshua Foer (2011) writes: “When information goes ‘in one ear and out the other,’ it’s often because it doesn’t have anything to stick to. This is something I was personally confronted with not long ago, when I had the opportunity to visit Shanghai for three days while reporting an article. Somehow I had managed to scoot through two decades of schooling without ever learning the most basic facts about Chinese history. I’d never learned the difference between Ming and Qing, or even that Kublai Khan was actually a real person. I spent my time in Shanghai roving around the city like any good tourist, visiting museums, trying to get a superficial grasp of Chinese history and culture. But my experience of the place was severely impoverished. There was so much I didn’t take in, so much that I was unable to appreciate, because I didn’t have the basic facts to fasten other facts to. It wasn’t that I just didn’t know, it was that I didn’t have the ability to learn.

“The paradox - it takes knowledge to gain knowledge - is captured in a study in which researchers wrote up a detailed description of a half inning of baseball and gave it to a group of baseball fanatics […] and a group of less avid fans to read. Afterwards they tested how well their subjects could recall the half inning. The baseball fanatics structured their recollections around important game-related events, like runners advancing and runs scoring. They were able to reconstruct the half inning in sharp detail. […] The less avid fans remembered fewer important facts about the game and were more likely to recount superficial details like the weather. Because they lacked a detailed internal representation of the game, they couldn’t process the information they were taking in. They didn’t know what was important and what was trivial. They couldn’t remember what mattered. Without a conceptual framework in which to embed what they were learning, they were effectively amnesics.” (pp. 207-8).

Whether it’s learning Chinese history, recalling the details of a baseball game, or learning the vocabulary of one’s native (or a foreign) language, we first need a structured framework to integrate the new information into. Before they are a year old, children growing up in an English-speaking environment have already acquired an internal representation of how the sounds of English words work together, how they combine and interact. This is a completely different framework from the one that Hungarian, Russian or Chinese children build up. New words that fit into this system are remembered quickly and efficiently; words whose structure does not fit into the system are simply difficult to learn.


Learning the phonological form - before the meaning


In a series of experiments conducted in the 1970s, a number of researchers investigated how learners recruit their prior phonological knowledge and recycle this knowledge to remember new words. Pressley & Levin (1981) investigated the popular “keyword” method of learning foreign vocabulary, a system where students attempt to find some element in the foreign word that they can recognise and that sounds something like a known English word. They then try to associate this English word with the meaning of the foreign word.

Imagine that a student is trying to learn the Spanish word “pájaro” (meaning “bird”). As the sound of this word (“paa-ha-roh”) sounds a bit like “parked car, oh!”, the student might imagine a parked car filled with lots of fluttering birds. The idea is that the next time the student hears “pájaro” it is likely to remind him of “parked car” (with lots of birds inside it) which, in turn, will remind him of the meaning—“bird”.

While investigating this mnemonic technique, Pressley & Levin (1981) noted that while the method seemed to have some success in helping students remember the meanings of new words when they heard them (“receptive learning”), it didn’t help them remember how to say the new foreign word when prompted with the meaning (i.e. “productive learning” did not seem to benefit from the method). The researchers reasoned that the difficulty of productive recall of the foreign words might have arisen because the foreign stimuli had “unfamiliar phonological patterns” which the learners had not been able to integrate into their memory.

To test their ideas, the experimenters conducted an experiment where the participants were “prefamiliarised” with the form (or sound) of new words before they were introduced to their definitions. In this experiment the participants were to learn some uncommon definitions of “already familiar (i.e. well-integrated) multiple meaning English words”.

The "prefamiliarisation" of the stimuli consisted of 8 trials where participants attempted to recall (repeat back orally) the experimental words in response to the initial syllable. These preliminary trials were aimed at ensuring that the subjects had the phonological form of the word securely in their memory before they were introduced to the definition of the word.

Using this approach, productive learning was dramatically increased—the subjects had little trouble remembering the words and producing the word in response to the definition.

The Pressley & Levin (1981) study highlights several important clues in solving the puzzle of why foreign words are so intrinsically hard to learn—and how we might make the process simpler and more natural. 

The ability to acquire new vocabulary efficiently (such as in the phenomenon of "fast-mapping") requires either that the phonological representation of the new item is already "well-integrated" in memory, or that the brain can construct it quickly. Only then can it usefully be associated with its meaning.

Associative learning presupposes the existence of at least two neural objects to be associated: the phonological form of the word (what the word sounds like) and its semantic representation or “meaning”. The extent to which a learner can rapidly construct a phonological form and quickly integrate it into long-term memory is dependent on the availability of pre-existing sublexical “chunks” or “building blocks” in memory. These can only be acquired from prior experience and exposure to the language, and students studying a second language do not come to the task “pre-equipped” with this implicit knowledge.

Efficient vocabulary learning in a second language therefore presumes the creation of a network of new language-specific building blocks.


How can we build an internal representation of "foreign" words?


English DNA has been developed to help foreign students of English quickly build up a network of the phonological building blocks and an implicit understanding of how they fit together to form words—what combinations of sounds are typically found at the beginnings of English words, in the middle and at the ends of words, or how English syllables are typically constructed and where the primary and secondary stresses typically fall. This is the sort of deep linguistic knowledge that we have all acquired about our native language before the age of one, and why we can make fine judgements about whether nonsense words are more or less “wordlike”.

But this sort of knowledge is too large and complex to learn from a book. To be useful it must simply be acquired from massive exposure to the spoken language. English DNA is a “brain training” game which provides students with the sort of exposure the brain needs to extract the underlying patterns of English words and to build the specific brain tissue that encodes this sort of intuitive information.

The memorization of the tens of thousands of new words needed to become a competent speaker of a new language need not be a long, tedious or boring process. After all, our brains are built to acquire words easily and naturally, which we all do continuously throughout our lifetime. When foreign students begin to assimilate the necessary knowledge about the underlying structure of English words at an intuitive level, the process of “fast mapping” new English words begins to become as straightforward and natural as it is in their own native language.



[1] Wikipedia article on Vocabulary:


Dollaghan, C. (1985). Child meets word: "Fast mapping" in preschool children. Journal of Speech and Hearing Research, 28(3), 449-454.

Dollaghan, C. (1987). Fast mapping in normal and language-impaired children. Journal of Speech and Hearing Disorders, 52, 218-222.

Dollaghan, C., Biber, M., & Campbell, T. (1993). Constituent syllable effects in a nonsense-word repetition task. Journal of Speech and Hearing Research, 36, 1051-1054.

Dollaghan, C., Biber, M., & Campbell, T. (1995). Lexical influences on non-word repetition. Applied Psycholinguistics, 16, 211-222.

Friederici, A. D., & Wessels, J. M. (1993). Phonotactic knowledge and its use in infant speech perception. Perception & Psychophysics, 54, 287-295.

Frisch, S.A., Large, N.R., & Pisoni, D. B. (2000). Perception of wordlikeness: Effects of segment probability and length on the processing of nonwords. Journal of Memory and Language, 42, 481-496.

Gathercole, S. E. (1995). Is nonword repetition a test of phonological memory or long-term knowledge? It all depends on the nonwords. Memory & Cognition, 23(1), 83-94.

Foer, J. (2011). Moonwalking with Einstein: The art and science of remembering everything. London: Penguin Books.

Jusczyk, P. W., Luce, P. A., & Charles-Luce, J. (1994). Infants' sensitivity to phonotactic patterns in the native language. Journal of Memory and Language, 33, 630-645.

Jusczyk, P. W., Friederici, A. D., Wessels, J., Svenkerud, V. Y., & Jusczyk, A. M. (1993). Infants' sensitivity to the sound pattern of native language words. Journal of Memory and Language, 32, 402-420.

Nation, I. S. P. (2001). Learning vocabulary in another language. Cambridge: Cambridge University Press.

Pressley, M., & Levin, J. R. (1981). The keyword method and recall of vocabulary words from definitions. Journal of Experimental Psychology: Human Learning and Memory, 7 (1), 72-76.

Pressley, M., Levin, J. R., Hall, J. W., Miller, G. E., & Berry, J. K. (1980). The keyword method and foreign word acquisition. Journal of Experimental Psychology: Human Learning and Memory, 6, 163-173.

© Dr Paul Sulzberger, 2014


If you'd like to know more please get in touch.