Acceptability and forced-choice judgements in the study of linguistic variation

Principal investigator: Neil Bermel

Research associate: Luděk Knittl

This research is funded by a Leverhulme Trust Research Project grant

All languages offer us multiple possibilities for expression in a variety of situations. We can express a thought in the active or the passive voice, select one of several nearly synonymous words or expressions (car/automobile, begin/start), or use one of several permissible forms (I have proved/proven). This overgeneration of possible means of expression can lead to clear semantic or usage-based differentiation, creating new distinctions or reinterpreting old ones. Sometimes, however, it seems to fail to do so, or at least fails to do so in any consistent manner. Instead, we observe a more or less stable state where, for a single function, multiple forms or entries survive and prosper with no apparent regular distinction between them. We hope to contribute towards elucidating why this state of affairs persists.

The material for our study comes from Czech, a morphologically rich language that offers many examples of such overgeneration without clear differentiation of usage. In many situations, Czech speakers have at their disposal two or sometimes three case endings that are theoretically possible, and significant variation is noted in the basic vocabulary. These tightly defined parameters make an ideal test case for studies of morphological variation. In a previous study, we considered how actual production in written language (as evidenced by the material from a large-scale, balanced corpus) maps onto acceptability (as evidenced by questionnaire answers from several hundred users). This yielded important insights about the reliability of corpus data, but also raised the question: why do some forms that are attested rarely in the corpus compared to their competing form still garner high ratings of acceptability? The answer to this question potentially tells us a lot about the way those choices are made and how such minority forms continue to be used at low levels.

This study aims to map in parallel three phenomena: data from a large-scale representative corpus of Czech; users’ acceptability judgements of particular forms in a variety of contexts; and their decisions when faced with a choice between the same forms. As this project runs its course in 2012-15, we will be collecting data through two large-scale surveys under controlled conditions, and comparing the results to see what the relationship is between people’s acceptability judgements and their actual linguistic production.

The results could shed new light, not easily obtainable from English-language data, on how speakers process morphological variation. Our operating hypothesis is that speakers make use of multiple layers of schemas, at varying levels of generality, that guide their selections (schemas being linguistic templates against which we map actual language situations to create our own language output). If our predictions are borne out, this would strongly support the hypothesis that these sorts of stable variation are due to language users accessing different levels of linguistic schemas.