DELFI.lt corpus

 
CLARIN-LT
  Authors
Bielinskienė, Agnė ; Boizou, Loïc ; Bumbulienė, Ieva ; Kovalevskaitė, Jolanta ; Krilavičius, Tomas ; Mandravickaitė, Justina ; Rimkutė, Erika ; Vilkaitė-Lozdienė, Laura
 Project URL
http://mwe.lt/
 Date issued
2019
 Type
corpus
 Size
70000000 tokens
 Language(s)
Lithuanian
 Description
The DELFI.lt corpus consists of articles published by the news portal DELFI.lt between March 2014 and November 2016. Metadata was collected along with the articles: author, title, date, source, link, category, and number of words. The corpus comprises 190,000 news articles from 12 thematic categories: DELFI Faces (DELFI Veidai), Projects (Projektai), DELFI Science (DELFI Mokslas), DELFI Auto, Unidentified category, Sport, DELFI Life (DELFI Gyvenimas), DELFI People (DELFI Žmonės), DELFI Citizen (DELFI Pilietis), Business (Verslas), DELFI FIT, and DELFI News (DELFI Žinios). In total, the DELFI.lt corpus contains 70 million words. The corpus is morphologically annotated with Universal Dependencies tags and is freely accessible for online search at http://tekstynas.mwe.lt/.
 Publisher
Baltic Institute of Advanced Technology
Vytautas Magnus University
 Acknowledgement

Research Council of Lithuania

Project code: LIP-027/2016

Project name: Automatic Identification of Lithuanian Multi-word Expressions (PASTOVU)

 Subject(s)
Lithuanian news articles media corpus POS tagged DELFI corpus corpus
 Collection(s)
CLARIN-LT
©2023 CLARIN-LT. All rights reserved.