Program

The group meets on Mondays at 12:30. If you want to receive updates, please contact Radek Šimík or Olga Nádvorníková.

In the academic year 2025/26 we give the floor primarily to members of the GREG group. Unless another affiliation is stated, presenters are affiliated with the Faculty of Arts, Charles University.

Winter semester 2025/26

Abstract

What is the nature of my data? What are the consequences for the kind of grammatical analyses I carry out? Which grammatical phenomena are of interest to me? How do I conceive of a grammar of a language / languages?

Abstract

The nominal category of countability has been studied in English linguistics ever since the early days of the discipline, beginning with, for instance, Jespersen (1924) or Bloomfield (1933). The reasons for this are both practical and strictly intellectual. On the practical side, countability has been called a “locus of difficulty for English language learners” whose first language does not overtly express the countability distinction (Antes, 2019: 1). On the intellectual side, countability, because of its flexibility and its liminal position between syntax, morphology and semantics, provides a fascinating window into the interaction between language, cognition and physical reality. It is therefore quite surprising that relatively little attention has been given to the diachronic development of the category in English and its functioning in Old English. Furthermore, the results of the research that has dealt with the topic seem to be rather contradictory. Our study was therefore conceived as a baseline probe with the goal of establishing whether countability was a category distinguished in the Old English nominal system and, if so, how it was formally expressed. The analysis confirmed that Old English did, in fact, indicate countability status in several ways, for example through pluralization patterns, quantifier and numeral collocability, or the partitive phrase. The results, however, also suggest that countability was grammaticalised to a much lesser degree than it is in Present-Day English and that the formal realisations were much more semantically determined.

Abstract

In the theory of information structure, a contrastive topic (CT) denotes what an utterance is about, contrasted with other alternatives (Büring 2003). CTs in English and German are associated with a rising prosody and tend to appear at the beginning of an utterance (Jackendoff 1972, Féry 2007). In Czech, marking CTs with rising prosody is less obligatory (Veselá 2007). However, constructions with the topic particle to have not yet been investigated, and the typical CT prosody seems to be more obligatory in such constructions. The presentation will report on an experimental study conducted to investigate the potential interaction of CT type (plain vs. to + CT), prosody (rising vs. declining), and the explicitness of the CT alternatives.

Abstract

Romani is a language whose many varieties have not enjoyed much research interest. The aim of this study is to describe the functions of the definite article in the North Central variety of Romani documented by the Polish physician and anthropologist Izydor Kopernicki. The data comprise a collection of short stories and songs. The analysis demonstrates the following functions: deictic, anaphoric, recognitional, establishing, bridging, situationally unique, contextually unique, and absolutely unique. The distribution of the article in the prepositional phrase reveals some peculiarities: the article tends to occur in nominative prepositional phrases and never occurs in oblique prepositional phrases. Its distribution thus seems to be conditioned not only by semantic and pragmatic factors, but also by syntactic ones. However, this issue requires further research because of exceptions and other idiosyncrasies that appear in the data.

Abstract

Based on the article by Petkevič & Jelínek (2025), the author will present the tagging of the SYN series corpora of contemporary Czech. He will focus mainly on automatic rule-based morphological disambiguation as one of the main phases of the complex process of grammatical tagging. He will thus present the syntactic regularities of Czech from an unusual perspective and offer an unusual way of thinking about language, namely in terms of so-called proper and improper morphological ambiguity (including part-of-speech ambiguity).

Petkevič, V., Jelínek, T. (2025): Automatická morfologická disambiguace korpusů řady SYN: spolupráce lingvistické introspekce a strojového učení [Automatic morphological disambiguation of the SYN series corpora: cooperation of linguistic introspection and machine learning]. Naše řeč 108(1), 3–40.

Abstract

This presentation introduces syntactic complexity metrics (SCMs) newly available in InterCorp, a large multilingual corpus annotated with Universal Dependencies, and demonstrates their application in research on language and genre variation. The SCMs are computed for individual sentences and texts, offering researchers a way to quantify structural properties of language use across a wide variety of languages and registers. Beyond simple frequency counts, SCMs capture dimensions such as clausal embedding, phrasal expansion, and dependency distance, thereby providing a richer picture of syntactic organization. To illustrate their application, we report on a contrastive study involving 17 languages and four textual genres. The analysis of six SCMs reveals systematic correlations that cluster into clausal and phrasal measures, while mean dependency distance emerges as particularly sensitive to cross-linguistic variation. Using random forest classification, we show that SCMs reliably predict genre, with NP-related measures ranking highest, whereas mean dependency distance and its standard deviation provide the best discrimination among languages. Patterns of misclassification further point to affinities between languages, such as the proximity of English to Romance, previously observed in lexical studies. By linking corpus annotation to empirical findings, the presentation demonstrates how SCMs can inform contrastive linguistics, translation studies, register analysis, and L1/L2 research.

Abstract

Wh-words and wh-constructions come in a variety of flavors: their default use is interrogative (What did she cook? I wonder what she cooked.), but they are also well attested in relative (the meal which she cooked) and correlative uses (Whatever she cooked, I couldn’t resist it). In my talk I will show that wh-functions are organized in what I call a wh-hierarchy: from the semantically and syntactically simplest interrogatives, via unconditionals and correlatives, to the most complex free/headless and headed relatives (Šimík 2025). The hierarchy is empirically reflected in a number of implicational patterns, e.g.: relative pronouns can be morphologically derived from interrogative ones, but not vice versa; interrogative and correlative wh-in-situ is cross-linguistically common, but relative wh-in-situ is basically unattested (with some exceptions, possibly only apparent, from Tsez, Hittite, or Mandarin Chinese); the use of wh-words in one construction type in a language implies their use in a simpler construction type in that language, but not vice versa. I put forth a model in which the hierarchy results from the growing syntactic, semantic, and arguably cognitive complexity of the individual constructions. I also report on the results of two L1-acquisition studies on Czech that provide empirical support for a proper part of the wh-hierarchy: a study of preschoolers’ production (sentence repetition) of correlatives and light-headed relatives (Šimík et al. 2023) and a corpus study of the use of wh-words by preschoolers and their parents (Šimík et al. 2025), based on the Czech CHILDES corpus (Chromá et al. 2025).

Abstract

TBA

Abstract

The two branches of Afroasiatic share a verbal form whose structure corresponds to a nominal base with pronominal enclitics. However, the other forms of the verbal system differ between the two branches. The structural difference becomes visible in Egyptian starting with Late Egyptian and consolidates in Coptic. This development goes in a rather different direction than that of the forms in Semitic. The process is visible when the forms are divided morphotactically and the functions assigned to particular positions are compared. The structural development in the two branches shows differing strategies.

Abstract

Latin is an example of a language that is attested in a large number of documents whose linguistic exploitability is nevertheless very low, for various reasons. The first is the way in which ancient literary texts are preserved: almost without exception in copies made by medieval monastic scribes, in which morphological and orthographic differences from the classical norm were extensively adjusted. This makes it very difficult to trace the development of Latin at the phonological and morphological levels. The second reason is the formulaic nature of the vast majority of inscriptional production. The third is the rapid and uneven spread of Roman influence, and with it of Latin, which served various linguistic functions at different times and in different areas, both at the level of society as a whole and at the level of individual families and their members. We usually have very little or no information about the linguistic identity of the writers of the preserved texts. From a sociolinguistic point of view, all this represents an extremely complex situation, which has only been properly taken into account in Latin linguistics in recent decades; as a result, much of what was previously established on large corpora of texts, but treated in an undifferentiated manner, is now being questioned.

Summer semester 2025/26

TBA