Pedagogic Corpus of Lithuanian

 
CLARIN-LT
  Authors
Rimkutė, Erika ; Kamandulytė-Merfeldienė, Laura ; Aleksandravičiūtė, Gabrielė ; Anglickienė, Laimutė ; Barkauskaitė, Giedrė ; Bielinskienė, Agnė ; Boizou, Loïc ; Grigonytė, Gintarė ; Kovalevskaitė, Jolanta ; Virbickienė, Gabrielė
 Project URL
https://kalbu.vdu.lt/mokymosi-priemones/mokomasis-tekstynas/
 Date issued
2022-08-29
 Type
toolService
 Language(s)
Lithuanian
 Description
The Pedagogic Corpus of Lithuanian is a monolingual specialized corpus prepared for learning and teaching Lithuanian in a foreign language classroom. It includes authentic Lithuanian texts, selected by criteria such as learner-relevant communicative function and genre. Both spoken and written language are represented.

The corpus contains 669,000 tokens: 111,000 tokens from texts and spoken language for levels A1-A2 and 558,000 tokens from texts and spoken language for levels B1-B2 (according to the Common European Framework of Reference for Languages). The spoken component constitutes approximately 7.5% of the corpus. The written component (620,000 tokens) includes levelled texts from coursebooks and unlevelled texts from other sources. These texts can be classified into 29 text types (dialogues, narratives, information, etc.) and into 4 groups according to communicative aim: informational texts, educational texts, advertising, and fiction.

The corpus offers two types of search: simple and advanced (see "Search Tips"). Simple Search finds instances of a search item (word form, lemma, or two words) in the whole corpus or in a particular part of it (spoken or written texts). After selecting the written subcorpus, you can further select the text type (coursebook or non-coursebook texts) and/or the genre of the written texts. Advanced Search offers all the features of Simple Search plus additional options. Because the Pedagogic Corpus is morphologically annotated, Advanced Search also allows searching by grammatical features (e.g. part of speech, case, number, verb form).
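The token counts quoted in the description can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated in this record; the constant and function names are illustrative and are not part of the corpus or its search interface. Because the published counts are rounded to the nearest thousand, the derived spoken share is only approximate.

```python
# Token counts as stated in the corpus description (rounded figures).
TOTAL_TOKENS = 669_000
LEVEL_TOKENS = {"A1-A2": 111_000, "B1-B2": 558_000}
WRITTEN_TOKENS = 620_000


def corpus_breakdown():
    """Derive the spoken size and percentage shares from the stated counts."""
    spoken = TOTAL_TOKENS - WRITTEN_TOKENS
    return {
        # The two level groups should add up to the whole corpus.
        "levels_sum_to_total": sum(LEVEL_TOKENS.values()) == TOTAL_TOKENS,
        "spoken_tokens": spoken,
        "spoken_share_pct": round(100 * spoken / TOTAL_TOKENS, 1),
        "level_share_pct": {
            level: round(100 * count / TOTAL_TOKENS, 1)
            for level, count in LEVEL_TOKENS.items()
        },
    }


if __name__ == "__main__":
    for key, value in corpus_breakdown().items():
        print(f"{key}: {value}")
```

With these rounded inputs the spoken component comes out at 49,000 tokens, or about 7.3% of the corpus, which is in line with the stated figure of approximately 7.5%.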
At https://kalbu.vdu.lt/mokymosi-priemones/mokomasis-tekstynas/ you can find truncated wordlists: lists of lemmas and word forms (for the whole corpus, for the spoken and written components, and for each level) and lists for particular parts of speech in the whole corpus. The lists can be downloaded as .xlsx files.

REFERENCE
Kovalevskaitė, Jolanta and Rimkutė, Erika. "Pedagogic Corpus of Lithuanian: A New Resource for Learning and Teaching Lithuanian as a Foreign Language." Sustainable Multilingualism, vol. 17, no. 1, 2020, pp. 197-230. https://doi.org/10.2478/sm-2020-0019
 Publisher
Vytautas Magnus University
 Acknowledgement

European Social Fund

Project code: 09.3.1-ESFA-V-709-01-0002

Project name: Lithuanian Academic Scheme for International Cooperation in Baltic Studies

 Subject(s)
Pedagogic corpus Lithuanian
 Collection(s)
CLARIN-LT