What's New

corpus
Description:
The English-Lithuanian parallel corpus DVITAS v2 includes original English texts on cybersecurity and their Lithuanian translations, aligned at the sentence level. Version 1 of the corpus was compiled for the bilingual terminology ...
 This item contains 3 files (9.1 MB).
 
Publicly Available
corpus
Description:
Two Lithuanian-language children’s corpora, collected during the EMVAKA project, consist of Lithuanian language production by children aged 7–13: (1) spoken (73 files, c. 31,000 tokens) and written (77 files, c. 7,600 ...
 This item contains 4 files (245.91 KB).
 
Academic Use, Attribution Required, Noncommercial
corpus
Description:
MATAS corpus (version 3.0). Description: updated, manually checked, morphologically annotated corpus MATAS. Language: Lithuanian. Previous versions: 1. MATAS v0.2 (http://hdl.handle.net/20.500.11821/9) 2. MATAS v1.0 ...
 This item contains 3 files (23.1 MB).
 
Publicly Available

Most Viewed Items

Top Last Week
toolService
Description:
A trilingual BERT-like (Bidirectional Encoder Representations from Transformers) model, trained on Lithuanian, Latvian, and English data. A state-of-the-art tool that represents words/tokens as contextually dependent word embeddings, ...
 This item contains 3 files (1.83 GB).
 
Publicly Available
corpus
Description:
The specialised "Corpus of Discourse on Crime" is synchronic, monolingual, and unannotated, and consists of two subcorpora. Subcorpus 1: all texts on crime published in the crime columns of the most popular Lithuanian web portals ...
 This item contains 1 file (1.11 MB).
 
Publicly Available
corpus
Description:
The Corpus of the Contemporary Lithuanian Language, which comprises 208 million words, is a collection of texts designed to represent current Lithuanian. The corpus has been under compilation since 1990. The corpus is designed to ...
 This item contains no files.