
Pontic Greek is a dialect of Greek, originally spoken in various regions of northeastern Anatolia. It is often regarded as a conservative variety of Greek, preserving many features of Middle Greek. From the 18th to the 20th century, speakers of Pontic Greek migrated from Anatolia to the Caucasus in several waves.


Contributors

Svetlana Berikashvili

recordings, transcriptions, translations, glossing, analyses

Ilia State University

Stefanie Böhm

instructions, revisions

Bielefeld University

Evgenia Kotanidis

recordings, transcriptions, translations

Ivane Javakishvili Tbilisi State University

Johanna Lorenz

instructions, revisions

Bielefeld University

Stavros Skopeteas

conception, supervision, (few) recordings

University of Göttingen

Language

Genetic affiliation: Indo-European > Greco-Phrygian > Greek > East Greek > Koineic Greek > Pontic-Cappadocian Greek > Pontic (Glottolog 4.0)
Place: Georgia
Language Code: pnt (ISO 639-3)
Population: below 500
Documented variety: Pontic Greek of Georgia, as spoken by Pontic Greeks in Georgia and by speakers who migrated from there to Greece.
Endangerment: threatened (Endangered Languages Project)

Glottolog 4.8 edited by Hammarström, Harald & Forkel, Robert & Haspelmath, Martin & Bank, Sebastian, licensed under a Creative Commons Attribution 4.0 International License.

data

Time and Place

Most recordings were made between 2014 and 2016, with a small number dating back to 2005.

Data were collected in Georgia (Tsikhisjvari, Manglisi, Santa, Shua Kharaba, Tetritskaro, Tbilisi) and in Greece (Athens and Thessaloniki, from Pontic speakers who had emigrated from Georgia). The map shows the recording locations in Georgia as well as the places of birth of the speakers in our sample.

PNT-MAP

Speakers

  • subcollection TXT, women = 13, men = 11, birth year range = 1925-1994, average = 1966.5
  • subcollection VA1, women = 4, birth year range = 1933-1981, average = 1955.5
  • subcollection VA2, women = 1, men = 1, birth year range = 1930-1940, average = 1935

Instructions

The speakers were instructed in Pontic Greek by a native instructor. The instructions (TXT, VA1 subcollections) are listed below in English translation.

General instruction: Please answer the following questions spontaneously. Just speak normally, as if you were speaking to a friend. It does not matter if you are not sure about details; just give a natural answer.

Abbreviation | Text | Instruction
AN | Ancestor Story | How did your ancestors come to Georgia?
FM | Family | Tell us the history of your family (for speakers who do not live in the original settlements: how did your family come from the villages to Tbilisi and from Tbilisi to further destinations?).
VL | Village | Please describe the village where your family comes from.
CD | Comparative Description | Please tell us how your people are different from the other people in the village/city (Russian, Greek).
CL | Culture | Please tell me a fairy tale or a poem in your native language. (If you do not know any fairy tale/poem, please tell me what you find most important in the culture of your people.)
MR | Marriage | Please tell us how your people celebrate an engagement/marriage and how this differs from the way other people in this village/city celebrate a marriage.
FE | Feast | Tell us a difference between the way your group celebrates a particular feast and the way the other groups around you celebrate it (Christmas, Easter, Panajia).
LG | Language | Please tell me how you perceive the major differences between your language and Russian.

Recordings

File names

Labelling scheme: language-collection-instruction-00000-speaker
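
As a minimal sketch, a file name that follows this scheme can be split into its five fields. The file name used below is hypothetical, and the sketch assumes that no field itself contains a hyphen; neither point is confirmed by the corpus documentation.

```python
from typing import NamedTuple


class RecordingName(NamedTuple):
    """Fields of the scheme language-collection-instruction-00000-speaker."""
    language: str
    collection: str
    instruction: str
    number: str
    speaker: str


def parse_recording_name(name: str) -> RecordingName:
    # Assumption: fields are separated by single hyphens and contain none.
    parts = name.split("-")
    if len(parts) != 5:
        raise ValueError(f"expected 5 hyphen-separated fields, got {len(parts)}: {name!r}")
    return RecordingName(*parts)


# Hypothetical file name, for illustration only.
example = parse_recording_name("PNT-TXT-AN-00001-S01")
print(example.collection, example.instruction)
```

Keeping the number as a string preserves its leading zeros, which matters if the parts are later joined back into a file name.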

Annotations

Layers

The corpus files (.eaf) contain the following annotation layers:

Tier name | Content
ref | identifier of the sentence: contains the name of the text and, in the last two digits, the number of the sentence; it provides the reference for citing examples and for storing references in search results
tx (PARENT=ref) | corrected transcription on word level: makes it possible to search for single words containing particular glosses, which would not be possible if only the tx_a tier existed; in addition, the word boundaries can later be used to align words with the sound
mb (PARENT=tx) | corrected transcription on morpheme level
ge (PARENT=mb) | morpheme-aligned glosses in English
ps (PARENT=mb) | parts of speech
ft (PARENT=ref) | free translation
nt (PARENT=ref) | comments / notes about the sentence
tx_a (PARENT=ref) | transcription on sentence level: this tier makes it possible to search for discontinuous strings of words hosted by a single ref-entry and with an undefined sequence of words in between; it also facilitates reading the whole sentence in the object language
ge_a (PARENT=ref) | (a = associated) sentence-aligned glosses in English: this tier makes it possible to search for discontinuous strings hosted by a single ref-entry and with an undefined sequence of words in between at the functional level (not at the level of forms, unlike tx_a)
id | no content; needed for back-conversion to Toolbox
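
The parent relations listed above can be read directly from an .eaf file, since ELAN stores each tier as a TIER element with TIER_ID and PARENT_REF attributes. The following is a minimal sketch using Python's standard library; the embedded XML is a hypothetical, heavily simplified fragment, not an excerpt from the corpus.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified .eaf fragment: real files also carry time slots,
# linguistic types, and the annotations themselves.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIER TIER_ID="ref"/>
  <TIER TIER_ID="tx" PARENT_REF="ref"/>
  <TIER TIER_ID="mb" PARENT_REF="tx"/>
  <TIER TIER_ID="ge" PARENT_REF="mb"/>
  <TIER TIER_ID="ps" PARENT_REF="mb"/>
  <TIER TIER_ID="ft" PARENT_REF="ref"/>
</ANNOTATION_DOCUMENT>"""


def tier_parents(eaf_xml: str) -> dict:
    """Map each tier name to its parent tier name (None for top-level tiers)."""
    root = ET.fromstring(eaf_xml)
    return {tier.get("TIER_ID"): tier.get("PARENT_REF") for tier in root.iter("TIER")}


def path_to_root(tier: str, parents: dict) -> list:
    """Follow PARENT_REF links from a tier up to its top-level ancestor."""
    path = [tier]
    while parents.get(path[-1]) is not None:
        path.append(parents[path[-1]])
    return path


parents = tier_parents(SAMPLE_EAF)
print(path_to_root("ge", parents))  # ['ge', 'mb', 'tx', 'ref']
```

For a real file, `ET.parse(path).getroot()` would replace `ET.fromstring`; the rest of the sketch stays the same.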

In files with more than one speaker, the speaker label is appended to the layer label (this applies to subcollections VA1 and VA2):

...@speaker1
...@speaker2
...@speaker3
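
When processing such files, the layer and the speaker can be recovered by splitting the label at the "@" sign. This is a small sketch; the concrete label "ge@speaker1" is an assumed instance of the "...@speaker1" pattern shown above.

```python
from typing import Optional, Tuple


def split_tier_label(label: str) -> Tuple[str, Optional[str]]:
    """Split 'layer@speaker' into (layer, speaker); speaker is None if absent."""
    layer, separator, speaker = label.partition("@")
    return layer, (speaker if separator else None)


# "ge@speaker1" is an assumed example label, not taken from the corpus.
print(split_tier_label("ge@speaker1"))  # ('ge', 'speaker1')
print(split_tier_label("ge"))           # ('ge', None)
```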


Transcriptions

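class | orthography | IPA
vowels | a | a
vowels | ä | æ
vowels | e | ɛ
vowels | i | i
vowels | o | ɔ
vowels | u | u
plosives | p | p
plosives | t | t
plosives | k | k/c
plosives | b | b
plosives | d | d
plosives | g | g/ɟ
fricatives | f | f
fricatives | θ | θ
fricatives | s | s
fricatives | sh | ʃ
fricatives | x | χ
fricatives | v | v
fricatives | ð | ð
fricatives | z | z
fricatives | zh | ʒ
fricatives | j | j
fricatives | ɣ | ɣ
affricates | ts | ʧ
affricates | ch | ʧ
affricates | dz | dz
affricates | dzh | ʤ
nasals | m | m
nasals | n | n/ŋ
liquids | r | ɾ
liquids | l | l/ł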
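
The correspondences above can be applied mechanically to obtain a rough IPA rendering of a transcribed form. The sketch below rests on several assumptions that are not part of the corpus documentation: multi-letter graphemes (sh, zh, ch, ts, dz, dzh) are matched greedily, only the first IPA variant is used where two are listed (e.g. k for k/c), and characters missing from the table, such as stressed vowels, are passed through unchanged.

```python
# Orthography-to-IPA mapping copied from the table; for entries with two
# variants (k/c, g/ɟ, n/ŋ, l/ł) only the first one is kept (assumption).
ORTHO_TO_IPA = {
    "a": "a", "ä": "æ", "e": "ɛ", "i": "i", "o": "ɔ", "u": "u",
    "p": "p", "t": "t", "k": "k", "b": "b", "d": "d", "g": "g",
    "f": "f", "θ": "θ", "s": "s", "sh": "ʃ", "x": "χ", "v": "v",
    "ð": "ð", "z": "z", "zh": "ʒ", "j": "j", "ɣ": "ɣ",
    "ts": "ʧ", "ch": "ʧ", "dz": "dz", "dzh": "ʤ",
    "m": "m", "n": "n", "r": "ɾ", "l": "l",
}

# Longest graphemes first, so that "dzh" wins over "dz" and "d".
_GRAPHEMES = sorted(ORTHO_TO_IPA, key=len, reverse=True)


def to_ipa(word: str) -> str:
    """Greedy longest-match conversion; unknown characters are kept as-is."""
    out = []
    i = 0
    while i < len(word):
        for grapheme in _GRAPHEMES:
            if word.startswith(grapheme, i):
                out.append(ORTHO_TO_IPA[grapheme])
                i += len(grapheme)
                break
        else:
            out.append(word[i])  # not in the table, e.g. a stressed vowel
            i += 1
    return "".join(out)


print(to_ipa("ke"))   # kɛ
print(to_ipa("pos"))  # pɔs
```

Greedy matching cannot distinguish a digraph from a sequence of two separate sounds (for example "ts" as an affricate versus "t" followed by "s"), so the output is only an approximation.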

Notes:

Categories

The abbreviations for glosses follow the Leipzig Glossing Rules.

Nominal template (Adjectives, Substantives, Pronouns, adjectival Participles): [Gender, Number, Case], e.g.:

Verbal template: [Voice, Mood, Aspect, Tense, Finiteness, Person/Number], e.g.:

Abbreviations:

Category | Abbreviation | Meaning
Gloss | 0 | epenthesis
Gloss | 1 | first person
Gloss | 2 | second person
Gloss | 3 | third person
Gloss | ABIL | ability
Gloss | ABL | ablative
Gloss | ACC | accusative
Gloss | ADJR | adjectivalizer
Gloss | AOR | aorist
Gloss | COND | conditional
Gloss | COND.COP | conditional copula
Gloss | CONV | converb
Gloss | DAT | dative, genitive
Gloss | EPST.COP | epistemic copula
Gloss | EV.PST | evidential past
Gloss | FUT | future
Gloss | GEN | genitive
Gloss | GER | gerund
Gloss | INF | infinitive
Gloss | INSTR | instrumental
Gloss | IPFV | imperfective
Gloss | NEG | negative
Gloss | NEG.COP | negative copula
Gloss | NEG.EXIST | negative existential
Gloss | NR | nominalizer
Gloss | OPT | optative
Gloss | PASS | passive
Gloss | PL | plural
Gloss | POSS | possessive
Gloss | POT | potential
Gloss | PROC | procedural
Gloss | PST | past
Gloss | PTCP | participle
Gloss | SG | singular
Gloss | VOC | vocative
Part of speech | N | noun
Part of speech | V | verb
Part of speech | A | adjective
Part of speech | Adv | adverb
Part of speech | P | adposition
Part of speech | Q | quantifier
Part of speech | AQ | ordinal nominal
Part of speech | C | conjunction
Part of speech | PN | pronoun
Part of speech | PRT | particle
Part of speech | X | unclear
Miscellaneous | xxx | unidentified words (mb layer)
Miscellaneous | xxx | unknown meaning (ge layer)
Miscellaneous | HESIT | hesitation
Miscellaneous | ((coughs)) | coughing
Miscellaneous | ((laughs)) | laughing
Miscellaneous | ((smiles)) | smiling

queries

The Pontic Greek corpus is available online in ANNIS, which supports visualizations and queries over multimodal annotations. ANNIS comes with a powerful query language (AQL = ANNIS Query Language) for retrieving complex data patterns in multilayered annotations (Krause, Thomas & Zeldes, Amir 2016: ANNIS3: A new architecture for generic corpus query and visualization. In: Digital Scholarship in the Humanities 2016 (31). http://dsh.oxfordjournals.org/content/31/1/118).

You can access the corpus in the SPW installation at: https://spw.uni-goettingen.de/annis/.

Before running a query, you need to select a corpus, e.g., "PNT-TXT-0.1-mp3". You then write your query in the query window.


Plain-text queries in AQL

In your queries, you need to specify the annotation layer and define the expression that you are looking for. Note that a plain-text query only retrieves items that are equal to the queried expression (not items that merely contain the queried expression).

Query | Explanation
tok | all tokens of the corpus (not very useful, but illustrative)
mb="ke" | all tokens in the morpheme layer (mb: words with morphemic boundaries) that contain exactly the form ke 'and'.
mb="aðaká" | all tokens in the morpheme layer (mb) that contain exactly the form "aðaká".
ge="be:PST:3.PL" | all tokens in the gloss layer (ge) that contain exactly the string "be:PST:3.PL".
ge="LOC" | all tokens in the gloss layer (ge) that contain exactly the string "LOC" (= locative).
ps="V" | all tokens in the POS layer (ps) that contain exactly "V" (= verb).
mb="s" _=_ ge="LOC" | all tokens that contain exactly "s" in the morpheme layer and exactly "LOC" in the gloss layer, in the same slot (_=_).
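
When many similar queries are needed, the query strings can be generated rather than typed by hand. The helpers below are a hypothetical convenience layer, not part of ANNIS: they only assemble text in the form shown in the table, which is then pasted into the query window.

```python
def exact(layer: str, value: str) -> str:
    """Build an exact-match condition such as mb="ke"."""
    if '"' in value:
        # Quoting rules for embedded quotes are not covered by this sketch.
        raise ValueError("value must not contain a double quote")
    return f'{layer}="{value}"'


def same_slot(first: str, second: str) -> str:
    """Require two conditions to hold in the same slot (the _=_ operator)."""
    return f"{first} _=_ {second}"


print(exact("ge", "be:PST:3.PL"))                       # ge="be:PST:3.PL"
print(same_slot(exact("mb", "s"), exact("ge", "LOC")))  # mb="s" _=_ ge="LOC"
```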


Regular expressions in AQL

Regular expressions are enclosed in slashes. You will find some illustrative examples below. More details about regular expressions in AQL can be found in the AQL documentation.

Query | Regular expression | Explanation
mb=/eksér[oi]/ | "[...]" contains alternative characters | all tokens in the morpheme layer (mb) that contain "ekséro" or "ekséri".
mb=/(pos|pu)/ | "(...|...)" contains alternative strings | all tokens in the morpheme layer (mb) that contain "pos" or "pu".
mb=/ekséro?/ | "?" stands for "the last character is optional" | all tokens in the morpheme layer (mb) that contain the string "ekséro" or "eksér".
mb=/a+/ | "+" stands for "at least one occurrence" | all tokens in the morpheme layer (mb) that contain at least one occurrence of the character "a", which includes "a" and "aa".
mb=/a*/ | "*" stands for "zero or more occurrences" | all tokens in the morpheme layer (mb) that contain zero or more occurrences of "a": "a", "aa", "aaa", "aaaa", etc.
mb=/ti./ | "." stands for "whatever character" | all tokens in the morpheme layer (mb) that contain the string "ti" and a character (.), e.g., "tin", "tis", etc.
ge=/.*3.PL/ | ".*" stands for "zero or more occurrences of whatever character" | all tokens in the gloss layer (ge) that contain "...3.PL"
ge=/.*PFV.*/ | ".*" stands for "zero or more occurrences of whatever character" | all tokens in the gloss layer (ge) that contain "...PFV..."
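
The behaviour of these patterns can be tried out offline. The sketch below uses Python's `re` module as a stand-in for the ANNIS regex engine and assumes that a pattern must match the whole annotation value, which is why ".*" is needed to find "3.PL" inside a longer gloss. Details of the actual ANNIS engine may differ, and the gloss "go:PFV:3.SG" is invented for illustration.

```python
import re
from typing import Iterable, List


def matches(pattern: str, values: Iterable[str]) -> List[str]:
    """Return the values that the pattern matches in full (anchored match)."""
    compiled = re.compile(pattern)
    return [value for value in values if compiled.fullmatch(value)]


morphemes = ["ekséro", "ekséri", "eksér", "pos", "pu", "tin", "tis", "ti"]
print(matches(r"eksér[oi]", morphemes))  # ['ekséro', 'ekséri']
print(matches(r"(pos|pu)", morphemes))   # ['pos', 'pu']
print(matches(r"ekséro?", morphemes))    # ['ekséro', 'eksér']
print(matches(r"ti.", morphemes))        # ['tin', 'tis']

# "go:PFV:3.SG" is an invented gloss, used only to show a non-match.
glosses = ["be:PST:3.PL", "go:PFV:3.SG", "3.PL"]
print(matches(r".*3.PL", glosses))       # ['be:PST:3.PL', '3.PL']
print(matches(r".*PFV.*", glosses))      # ['go:PFV:3.SG']
```

Note that the unescaped "." in ".*3.PL" matches any character, not only a literal dot; writing "3\.PL" would restrict it to the dot.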

More about AQL: AQL documentation site.

publications using this corpus


Berikashvili, Svetlana. 2022. Contact-Induced Change in the Domain of Grammatical Gender in Pontic Greek spoken in Georgia. Languages 7(2): 79. https://doi.org/10.3390/languages7020079.

Berikashvili, Svetlana. 2019. Verb Adaptation in Pontic Greek spoken in Georgia. In Tzitzilis, Ch. & G. Papanastassiou (eds). Language Contact in the Balkans and Asia Minor, Series: Greek Language: Synchrony and Diachrony 2. Thessaloniki: Institute of Modern Greek Studies (M. Triandaphyllidis Foundation), 262-279.

Berikashvili, Svetlana. 2018. Several Features of Aorist and Verbal System in Pontic Greek spoken in Georgia. Arxeion Pontou (Pontic Archive), Vol. 58. Athens: Committee for Pontic Studies, 195-229.

Berikashvili, Svetlana. 2017. Morphological Aspects of Pontic Greek spoken in Georgia. Series: Languages of the World 54, Munich: LINCOM 2017, 168pp.

Berikashvili, Svetlana. 2016. Morphological Integration of Russian and Turkish Nouns in Pontic Greek. Language Typology and Universals, 69.2, 255–276, DOI: 10.1515/stuf-2016-0012.

Berikashvili, Svetlana & Lobzhanidze, Irina. 2017. Number in Pontic Greek spoken in Georgia. In M. Chondrogianni, S. Courtenage, G. Horrocks, A. Arvaniti, I. Tsimpli (Eds.), Proceedings of the 13th International Conference on Greek Linguistics. London: University of Westminster, 51-61.

citations

TXT subcollection

Kotanidi, Evgenia (collection/transcription/translation), Svetlana Berikashvili (revision/glossing), Stefanie Böhm, Johanna Lorenz (supervision), Stavros Skopeteas (design/supervision). 2019. Pontic data collection. The Language Archive, Corpus resource; persistent identifier: https://hdl.handle.net/1839/00-0000-0000-0021-4DA4-3.

VA1 subcollection

Berikashvili, Svetlana. 2019. Pontic data collection 2. The Language Archive, Corpus resource; persistent identifier: https://hdl.handle.net/1839/00-0000-0000-0021-4DA4-3.

VA2 subcollection

Berikashvili, Svetlana (transcription) and Stavros Skopeteas (recordings). 2019. Pontic data collection 3. The Language Archive, Corpus resource; persistent identifier: https://hdl.handle.net/1839/00-0000-0000-0021-4DA4-3.