This project will document and explain a grammatical universal: cross-linguistic coding tendencies in verbs of different actional and aspect classes. It does so by demonstrating a link between cross-linguistic patterns of language form and general trends of language use. Essential parts of the analysis will be carried out with a software tool that will be developed during the project.
The claim of the proposal is that frequently expressed meanings tend to be expressed by zero-coded forms (form-frequency correspondences). Frequency and form are the two quantities related by Zipf’s law (1935). The project extends this frequency-form relation to a frequency-form/function relation, because the coding asymmetries express distinctions between actional and aspect classes of verbs and have semantic implications: when zero-coded, telic verbs tend to express perfective aspect and atelic verbs imperfective aspect; when overtly coded, it is the other way round. The form (coding)-frequency correspondence will be examined by inspecting the asymmetry in the coding of atelic and telic verbs in a number of European and non-European languages. In general, combinations that occur more frequently tend to be zero-coded across languages, while combinations that occur more rarely tend to be coded overtly.
The proposed explanation is that higher-frequency, and thus higher-probability, items are more predictable than lower-frequency items, and predictable content need not be expressed overtly or can be expressed by shorter forms. The hypothesis is that frequency is just one of a set of factors that constitute predictability, and the project aims to identify and weight these factors. Form-frequency correspondences make language structure more efficient (Zipf 1949), but it still needs to be shown that there is a mechanism that creates and maintains these efficient structures: recurrent instances of language change driven by speakers’ preference for user-friendly utterances.
The project thus combines cross-linguistic research on grammar with cross-linguistic corpus research. Form-frequency correspondences are still largely overlooked by linguists, so the current project will have a significant impact on our general understanding of human language.
The technological output of this project will be an analysis software tool. To identify the components of “predictability”, the tool will provide a set of entropy-based and probability-based statistical techniques. This part of the project comprises not only programming but also the evaluation and interpretation of the results, and it contributes essentially to theory formation. The results will allow deeper insight into what we mean when we talk about the predictability of coding asymmetries.
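To illustrate what entropy-based and probability-based measures of this kind can look like, the following is a minimal Python sketch. It is not the project's tool: the function names, the toy token list, and the choice of the immediately preceding word as "context" are all assumptions made for this example. It computes the Shannon information content (surprisal) of a relative frequency, the entropy of a frequency distribution, and a word's average information content over the contexts in which it occurs.

```python
import math
from collections import Counter, defaultdict


def surprisal(count, total):
    """Shannon information content in bits: -log2(count / total)."""
    return -math.log2(count / total)


def entropy(counts):
    """Shannon entropy in bits of the distribution given by a Counter."""
    total = sum(counts.values())
    return sum((c / total) * surprisal(c, total) for c in counts.values())


def average_information_content(tokens):
    """Mean of -log2 P(word | previous word) over each word's occurrences.

    Assumption for this sketch: the context is just the preceding token,
    and probabilities are plain relative frequencies without smoothing.
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    context_totals = Counter(tokens[:-1])  # how often each token serves as context
    sums = defaultdict(float)
    occurrences = Counter()
    for (prev, word), n in bigrams.items():
        sums[word] += n * surprisal(n, context_totals[prev])
        occurrences[word] += n
    return {word: sums[word] / occurrences[word] for word in sums}


# Hypothetical toy corpus, for illustration only.
tokens = "the dog runs the dog sleeps the cat runs".split()
unigrams = Counter(tokens)

print("entropy of unigram distribution:", entropy(unigrams))
for word, aic in sorted(average_information_content(tokens).items()):
    print(word, "frequency:", unigrams[word], "avg. information content:", aic)
```

In this toy setting, a word that is fully determined by its preceding word receives an average information content of zero, while a word that is rare in its context receives a high value. A real analysis would need a large corpus, smoothing, and a motivated choice of context.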
Future research will deal with the project’s topics by means of formalisms such as stochastic optimality theory (Bresnan et al. 2001) or evolutionary game theory (Jäger 2007).
At IntelliSys 2019 in London (5-6 September), our paper "Estimation of Average Information Content: Comparison of impact of contexts" was presented (in: Intelligent Systems and Applications).
At BIS 2019, our paper "Interaction of Information Content and Frequency as Predictors of Verbs' Lengths" will be presented.
Talk "Surprisal in Texten: Der Shannonsche Informationsgehalt als Merkmal für Text-Zusammenfassungen" (Surprisal in texts: Shannon information content as a feature for text summaries) on 22 May 2019 at Hochschule Anhalt.
25 January 2018 (Thursday), 9:00-12:30: Informal mini-workshop on tense-aspect typology and coding asymmetries, with presentations by Martin Haspelmath, Olav Mueller-Reichau, Natalia Levshina, and Michael Richter.
Leipzig workshop on "Language Universals, typology and corpus-based Research", date: 14 November; time: 10 am - 6 pm; venue: Mediencampus in Gohlis.
At KONVENS, the paper "Aspect coding asymmetries of verbs: The case of Russian" will be presented.