Researchers have investigated when children begin to form original sentences, marking the beginning of “linguistic productivity”, the ability to create sentences never heard before. Using data from 64 children learning English and a computer model that simulated their learning, it was possible to identify when they began to combine words creatively, such as in determiner-noun structures (“the dog”).
Language acquisition is one of the most fascinating processes in human development. One of the great challenges in this area is to understand when and how children begin to use language productively, that is, to create new, well-formed combinations of words that they have never heard before.
This ability, called linguistic productivity, is a milestone that differentiates human communication from other communication systems.
The central problem is to know whether children already have abstract categories to organize language from the beginning of learning or whether these categories emerge over time, as they process the linguistic stimuli around them.
One of the main obstacles to studying this phenomenon is the difficulty of mapping all the words and phrases that a child hears in their environment.
To address this challenge, researchers at the University of Amsterdam studied how 64 children were learning English. They used recordings of the children’s interactions with caregivers or in everyday situations.
These data were analyzed to identify when the children began to combine determiners (such as “the” or “a”) with nouns. The idea was to see whether these combinations were novel, that is, not present in the children’s linguistic input (they had not heard these specific combinations from other people), and therefore productive, following grammatical patterns that indicate the application of linguistic rules.
These combinations are ideal for studying productivity because they require the child to understand and apply grammatical rules, such as the use of an article before a noun.
The scientists used a method that combined behavioral observations (what the children said) with computational modeling. This approach allowed them to map the first productive uses of language, identifying when the children began to create determiner-noun combinations that were not present in their linguistic input (i.e., that they had not heard before).
They also modeled linguistic behavior, using a computational model to simulate how the combinations emerge and comparing the simulation with the behavior observed in children.
Using a computational model is essential for this type of study because it allows complete control over the linguistic input.
The researchers knew exactly which words and phrases the model had “learned” and could accurately identify when the model created new combinations that went beyond the training material.
This approach allowed for a parallel with children: if the model, given a limited input, could create new combinations following linguistic rules, then this could reflect a similar process that occurs in children.
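The check at the heart of this method, whether a produced determiner-noun combination is attested in a known input, can be illustrated with a short sketch. This is not the authors' code. The function names, the three-word determiner list, the noun set, and the toy utterances are all hypothetical. It also makes a simplifying assumption: the whole input is treated as one fixed set, rather than only the input heard before each production.

```python
from typing import Iterable, List, Optional, Set, Tuple

# Hypothetical, deliberately small determiner list for illustration only.
DETERMINERS = {"the", "a", "an"}

Pair = Tuple[str, str]


def determiner_noun_pairs(utterance: str, nouns: Set[str]) -> List[Pair]:
    """Return adjacent (determiner, noun) pairs found in one utterance."""
    tokens = utterance.lower().split()
    return [
        (first, second)
        for first, second in zip(tokens, tokens[1:])
        if first in DETERMINERS and second in nouns
    ]


def first_novel_combination(
    input_utterances: Iterable[str],
    produced: Iterable[Tuple[int, str]],
    nouns: Set[str],
) -> Optional[Tuple[int, Pair]]:
    """Find the earliest produced pair that never occurs in the input.

    `produced` holds (age_in_months, utterance) tuples in chronological order.
    Returns (age, pair) for the first unattested pair, or None if every
    produced pair also appears in the input.
    """
    attested: Set[Pair] = set()
    for utterance in input_utterances:
        attested.update(determiner_noun_pairs(utterance, nouns))

    for age, utterance in produced:
        for pair in determiner_noun_pairs(utterance, nouns):
            if pair not in attested:
                return age, pair
    return None


# Toy data, invented for this example.
caregiver_input = ["look at the dog", "a ball", "the ball is red"]
child_speech = [(20, "the dog"), (24, "a ball"), (27, "a dog")]
known_nouns = {"dog", "ball"}

onset = first_novel_combination(caregiver_input, child_speech, known_nouns)
print(onset)
```

The same function applies unchanged to a model's output, which is where the approach is strongest: a model's training material is known in full, whereas a sample of caregiver speech covers only part of what a child has heard.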
The results showed that children began to produce combinations of determiners and nouns that they had never heard before, indicating linguistic productivity. Furthermore, the computational model reproduced similar patterns, showing that it could also extrapolate its input to create new combinations.
Key findings include the timing of productivity: both children and the model began to create new combinations at similar developmental times, around 30 months on average.
Children not only repeated combinations that they had heard, but also applied rules to form new ones, demonstrating linguistic creativity.
In some cases, children omitted the determiner (e.g. saying “dog” instead of “the dog”), and the model replicated this pattern as well, showing that these “mistakes” are part of the learning process.
The parallels found between children’s behavior and the computational model suggest that it is possible to better predict and understand how language emerges. This innovative approach could be used to investigate linguistic productivity in other languages, including sign languages.
Furthermore, the study has important theoretical implications for language acquisition:
Early abstract categories: Children appear to form abstract categories to organize language early in the acquisition process.
Learning beyond input: The ability to extrapolate beyond what has been heard suggests that language is more than simple imitation; it involves the application of underlying principles.
This study represents a significant advance in the understanding of language acquisition. By combining detailed behavioral observations with rigorous computational modeling, the researchers were able to capture the moment when children begin to use language creatively and productively.
This approach not only sheds light on how we learn to speak but also opens the door to new ways of studying language in diverse populations and different linguistic contexts.
READ MORE:
Using computational modeling to validate the onset of productive determiner–noun combinations in English-learning children
Raquel G. Alhama, Ruthe Foushee, Dan Byrne, Allyson Ettinger, Afra Alishahi, and Susan Goldin-Meadow
PNAS. November 21, 2024. 121 (50) e2316527121
Abstract:
Language is a productive system: we routinely produce well-formed utterances that we have never heard before. It is, however, difficult to assess when children first achieve linguistic productivity simply because we rarely know all the utterances a child has experienced. The onset of linguistic productivity has been at the heart of a long-standing theoretical question in language acquisition: do children come to language learning with abstract categories that they deploy from the earliest moments of acquisition? We address the problem of when linguistic productivity begins by marrying longitudinal behavioral observations and computational modeling to capitalize on the strengths of each. We used behavioral data to assess when a sample of 64 English-learning children began to productively combine determiners and nouns, a linguistic construction previously used to address this theoretical question. After the onset of productivity, the children produced determiner–noun combinations that were not attested in our sample of their linguistic input from caregivers. We used computational techniques to model the onsets and trajectories of determiner–noun combinations in these 64 children, as well as characteristics of their utterances in which the determiner was omitted. Because we knew exactly what input the model was trained on, we could, with confidence, know that the model had gone beyond its input. The parallels found between child and model in the timing and number of novel combinations suggest that the children too were creatively going beyond their input.