Using Electroencephalography (EEG) to Investigate Anticipatory Processing in Second Language Speakers
A core question in linguistics research concerns the types of mechanisms that readers/listeners rely on during online language processing. One such mechanism, prediction (the ability to use linguistic cues to anticipate what is likely to come up), plays a central role in many models of language processing. In line with the idea that the human brain is a predictive machine, there is evidence that native speakers actively generate predictions about what is likely to be uttered, which allows language comprehension to be fast and efficient. In contrast, the question of whether second language (L2) speakers can also generate predictions online remains open.
This project uses EEG (a brain-imaging method with high temporal precision) to examine predictive processing in L2 speakers. To date, very few studies have addressed this question. Thus, the project has the potential to further our understanding of the qualitative nature of L2 processing and to identify areas of divergence between L1 and L2 speakers. The project examines prediction across three domains of grammar (semantics, syntax, discourse), two of which remain understudied (syntax, discourse). Moreover, it examines the extent to which L2 predictive processing is impacted by (a) individual differences in cognitive (e.g., working memory) and linguistic skills (e.g., aptitude for L2 learning) and (b) L1-L2 similarity, two factors that have been found to impact prediction but have not been systematically examined.
Final report
PURPOSE AND DEVELOPMENT OF THE PROJECT
The project was informed by previous psycholinguistic research showing that native (L1) speakers of a language generate linguistic predictions at all levels of linguistic representation. For example, upon reading “It’s raining, so it’s best to go out with…” most L1-English speakers expect a continuation such as “an umbrella”. Other continuations, such as “a raincoat”, are possible but unexpected. Crucially, this ability to generate predictions has been argued to make language comprehension fast and efficient in L1 speakers, and to drive language acquisition in children. In contrast, the role of prediction in second language (L2) processing is less clear-cut. While there is agreement that adult L2 learners can generate predictions in the L2, the available evidence suggests that they do so to a lesser extent (and later) than L1 speakers. Thus, the project aimed to investigate some of the factors that have been argued to explain variability in L2 predictive processing (according to psycholinguistic theories and theories of L2 processing). The factors that we explored include the specific domain of grammar where predictions are generated (Lexical Semantics, Syntax, Discourse), cross-linguistic influence, and individual differences (IDs) in L2 proficiency and cognitive factors. More specifically, the project involved five studies using EEG (Electroencephalography) and event-related potentials (ERPs), with English as the target language. ERPs are brain responses time-locked to stimuli of interest, and we focused on ERPs argued to index mechanisms related to linguistic predictions (e.g., the N400, the Late Anterior Positivity).
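As an illustration of how an ERP measure such as the N400 is commonly quantified, the sketch below averages simulated single-trial epochs time-locked to word onset and takes the mean voltage in a 300-500 ms window. It is not the analysis pipeline used in the project: the sampling rate, epoch limits, time window, trial counts, and simulated effect sizes are all assumptions chosen for the example, and the data are synthetic.

```python
import math
import random
from statistics import mean

# All constants below are illustrative assumptions, not the project's settings.
SAMPLING_RATE_HZ = 250
EPOCH_START_MS = -200          # epoch begins 200 ms before word onset
EPOCH_END_MS = 800
N400_WINDOW_MS = (300, 500)    # a commonly used N400 window, assumed here


def ms_to_sample(t_ms):
    """Convert a latency relative to word onset into a sample index."""
    return round((t_ms - EPOCH_START_MS) * SAMPLING_RATE_HZ / 1000)


def baseline_correct(epoch):
    """Subtract the mean of the pre-stimulus interval from every sample."""
    baseline = mean(epoch[:ms_to_sample(0)])
    return [value - baseline for value in epoch]


def average_erp(epochs):
    """Average time-locked epochs sample by sample to obtain the ERP."""
    return [mean(samples) for samples in zip(*epochs)]


def mean_amplitude(erp, window_ms):
    """Mean voltage of the ERP within a latency window (start inclusive)."""
    start, stop = (ms_to_sample(t) for t in window_ms)
    return mean(erp[start:stop])


def simulate_epoch(rng, peak_uv):
    """Toy trial: Gaussian noise plus a deflection centred on 400 ms."""
    epoch = []
    for i in range(ms_to_sample(EPOCH_END_MS)):
        t_ms = EPOCH_START_MS + i * 1000 / SAMPLING_RATE_HZ
        deflection = peak_uv * math.exp(-(((t_ms - 400) / 80) ** 2))
        epoch.append(deflection + rng.gauss(0, 5))
    return epoch


rng = random.Random(0)
# Hypothetical effect sizes (microvolts): larger negativity for unexpected words.
conditions = {"expected": -1.0, "unexpected": -5.0}
n400 = {}
for name, peak_uv in conditions.items():
    epochs = [baseline_correct(simulate_epoch(rng, peak_uv)) for _ in range(40)]
    n400[name] = mean_amplitude(average_erp(epochs), N400_WINDOW_MS)

print(n400)
```

In this toy setup, the mean amplitude for the unexpected condition comes out more negative than for the expected condition, which is the general pattern associated with the N400. Real EEG analyses would additionally involve steps such as filtering, artifact rejection, and statistical modelling across participants and electrodes.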
Studies #1 and #2 examined prediction at the level of the Discourse. Participants read question-answer stories, and their brain activity was recorded while they read the answers to the questions (word by word). The answers manipulated discourse properties, such as information structure (i.e., how information is packaged in a sentence for coherence). Both studies also examined the role of cross-linguistic influence in predictive processing, by comparing learners of L2-English with either Spanish or Swedish as their L1. Cross-linguistic influence was examined by probing linguistic rules that are similar between English and Swedish, but different in Spanish. For example, in Study #2, the stories made a third-person possessive pronoun (his/her) highly expected. In English and Swedish, third-person possessive pronouns mark the possessor’s biological gender (Ann’s dad -> HER dad). In contrast, in Spanish, possessive pronouns mark the grammatical gender of the thing possessed (i.e., the gender of “dad”). Both studies were conducted in Sweden and Spain.
Study #3 examined prediction at the level of Lexical Semantics in L1-English speakers and L1-Swedish L2-English learners. Participants were presented with words that were either related to a prime (e.g., salt-PEPPER) or unrelated (e.g., isolated-PEPPER). They were tested in two contexts: one that encouraged prediction and one that did not. We examined whether participants would use the primes to predict the targets in the predictive context. We also examined how cross-linguistic influence affected these lexicosemantic predictions, by using words that were either English-Swedish cognates (e.g., assist-HELP, assistera-hjälpa) or non cognates (e.g., cinema-MOVIE, bio-film).
Studies #4 and #5 examined prediction at the level of the Syntax in L1-English speakers and L1-Swedish learners of English. Participants read (word by word) sentences such as “Mary bought a hat or a friend lent her one”, which include a coordination structure. Since comprehenders prefer the simplest syntactic analysis possible, they might initially analyze “a friend” as the object of “bought”. This would lead to a reading disruption at the word “friend”, since the string is semantically incongruous (“bought a hat or a friend”). We investigated whether the presence of “either” at the beginning of the sentence would avoid such a disruption, since “either” signals that a whole sentence is being coordinated (i.e., that “a friend” is the subject of a new clause). Study #5 was a follow-up study with L1-English speakers to elucidate the findings from Study #4.
Another goal of the project was to investigate the extent to which IDs in proficiency, working memory, processing speed, and general intelligence accounted for L1 and L2 speakers’ ability to predict. Thus, the same ID tasks were administered to all participants across the five studies. We planned to collect data from 32 L1-English speakers and 32 L2 learners of advanced proficiency per L2 group for every study, which is standard in EEG-based research. However, a reviewer recommended increasing the sample size to capture more variability, and we followed this recommendation.
IMPLEMENTATION
STUDIES #1 and #2. We collected data from 34 L1-English speakers, 47 L1-Spanish L2-English learners, and 33 L1-Swedish L2-English learners. Data collection for both studies was halted by the COVID-19 pandemic, as both of our labs were closed for data collection for months during 2020-21. Thus, we moved forward with the data that we had at the time (which still met the standards of EEG research), and published Study #2 in Neuropsychologia in 2021. After our labs reopened, we collected more data to be able to examine the role of IDs in predictive processing. The L1-English data from Study #1 were published in Neuropsychologia in 2019, and we are currently analyzing the L2 data (data collection for the L1-Spanish group finished in December 2024).
STUDY #3. We collected data from 36 L1-English speakers and 61 L1-Swedish L2-English learners (i.e., substantially more data than originally planned). For Study #3, we conducted two additional studies whose results guided the selection of stimuli for the main EEG study. As recommended by a reviewer, we conducted a word association study where 104 L1-Swedish L2-English learners provided associations for 1244 English words. This made it possible to use word associations in English that were comparable for L1 and L2 speakers. We also conducted a translation study where 20 L1-Swedish L2-English learners translated the target English words into Swedish. This made it possible to determine the cognate status (between English and Swedish) of the target words. Another reviewer recommended not using identical cognates as targets, and we followed this recommendation. Study #3 was published in Journal of Experimental Psychology: Learning, Memory, and Cognition in 2024.
STUDIES #4 and #5. For Study #4, we collected data from 40 L1-English speakers and 60 L1-Swedish L2-English learners (i.e., substantially more data than originally planned). We also collected plausibility judgments for the materials from >100 L1-English speakers. This was facilitated by Prof. Harold Torrence (UCLA, Linguistics Dept.) and it helped us interpret our results. For Study #5, which is a follow-up to Study #4 with L1-English speakers, we collected data from 32 L1-English speakers, which is comparable to the sample size of the L1-English group in Study #4. A manuscript reporting on Studies #4 and #5 is currently in preparation. We plan to submit it in 2025.
MOST SIGNIFICANT RESULTS AND A DISCUSSION OF THE CONCLUSIONS
STUDIES #1 and #2. Our results suggest that both L1 and L2 speakers generate predictions at the level of the Discourse. This is important, since very few studies have examined prediction in this linguistic domain. Moreover, our results show that prediction in the L2 can be qualitatively and quantitatively native-like, and that cross-linguistic differences and global proficiency do impact prediction in the L2.
STUDY #3. Our results show that L2 learners can generate lexicosemantic predictions and track the reliability of predictive cues as efficiently as L1 speakers. To our knowledge, this is the first study showing that L2 learners adjust their predictive behavior based on the reliability of the predictive cues. This study also highlights the importance of using stimuli that are matched on predictive strength for L1 and L2 speakers, especially when probing the possibilities and limitations of L2 predictive processing.
STUDIES #4 and #5. Our results suggest that L1 speakers predict syntactic structure, but L2 learners do not. Since our L2 learners come from the same population as those from Studies #1-#3, this provides indirect evidence that syntax might be a domain of grammar where L2 speakers are less likely to predict to native-like levels.
NEW RESEARCH QUESTIONS
One question that remains open concerns the relationship between variability in the L1 and the L2 with respect to predictive processing. Prediction is probabilistic and depends on linguistic, contextual, and cognitive factors. Thus, both L1 and L2 speakers exhibit substantial variability in predictive processing (our own studies show that). We are interested in whether those L2 learners who rely on prediction in the L2 are also more likely to rely on prediction in their L1. We will apply for a project grant in 2025 to examine this question.
RESEARCH DISSEMINATION AND COLLABORATIONS
So far, the project has led to three peer-reviewed journal article publications (Quartile 1), all open access. We expect to publish three more journal articles in the next two years. The project has also led to four paper presentations and four poster presentations at both national and international conferences, in addition to an invited talk. An additional conference presentation and an additional invited talk are scheduled to take place in 2025. A complete list is provided below.
All of the studies in the project were a collaboration between the Centre for Research on Bilingualism at Stockholm University and the Basque Center on Cognition, Brain and Language (Spain). In addition, we are currently collaborating with Prof. Florian Jaeger (U. of Rochester) for a reanalysis of the data published from Study #3.