About the project

How do people acquire and make sense of ‘messy’ linguistic data?

Our international team is examining this question from multiple angles using data from the languages of central and eastern Europe. Funded by the UK’s Arts and Humanities Research Council, this project runs from 1 September 2020 to 31 August 2023. It will investigate two puzzling language phenomena, as reflected in a variety of linguistic data, and how we describe them when writing reference works for public use.

What do we mean by ‘feast and famine’?

Every day, as we use language, we unconsciously select forms of words that feel ‘right’ for the particular ‘slot’ we wish to put them in. When describing someone’s beverage of preference, most English speakers say she drank tea, rather than she drinked tea or some other variant: speakers select a verb form rapidly and will not find alternatives to be acceptable or grammatical. 

Occasionally, however, multiple forms may compete within a slot, such as the past participle of the verb prove (have proved? have proven?): here, speakers tend to find both forms acceptable and grammatical, although each of us might only use one of them.

In other places, we lack a suitable form where one is expected: we may baulk at forming the past tense of the verb troubleshoot, where we have a ‘slot’ (past tense needed) but no form that can adequately fill it (troubleshot? troubleshooted?).

These surprising examples of ‘feast’ (multiple forms) and ‘famine’ (no forms) show that selecting or approving the ‘right’ word form is not a process of mechanically mapping one form to a function or vice versa; instead, speakers may weigh and select forms from a basket of those available to them, sometimes keeping around more forms than necessary, and sometimes failing to find a form that works for them.

How are we approaching this problem?

These ‘mismatches’ between form and function are found in all languages, but the languages of central and eastern Europe, which are rich in grammatical forms, are an especially useful testing ground for learning why similar situations result in opposite outcomes. The Feast and Famine project will be looking at this issue from multiple angles.

  • A team at the University of Sheffield (PI Neil Bermel and Alexandre Nikolaev) will be focusing on adult speakers and how they react when presented with forms and contexts of these two sorts, looking at Czech and Finnish.
  • Researchers in Zagreb (Gordana Hržica and Tomislava Bošnjak-Botica) and Tartu (Virve Vihman) will focus on how Croatian and Estonian children, respectively, learn these difficult sorts of items.
  • At Charles University, Prague, Dominika Kováříková will be looking at real-world data from large text databases (corpora) of Czech to see how the techniques developed at their institute can be applied to these phenomena.
  • A team at York (Dunstan Brown) and Brighton (Roger Evans) will develop computational models to predict the appearance of these ‘mismatches’.
  • Finally, teams at the Czech Language Institute (Kamila Smejkalová and Martin Beneš) and the Institute for Croatian Language and Linguistics (Tomislava Bošnjak-Botica)/University of Zagreb (Gordana Hržica) will look at how current handbooks of those languages describe ‘messy’ phenomena and help translate project findings into concrete recommendations that speakers can make sense of.