What's New

corpus
Description:
The corpus contains aligned parallel transcripts of TED Talks in English, Lithuanian, and Hebrew. It consists of spoken-language data.
 This item contains 1 file (4.3 MB).
 
Publicly Available
corpus
Description:
MATAS corpus (version 1.0)
DESCRIPTION: Manually checked, morphologically annotated corpus MATAS
FORMATS: 1. CoNLL-U (CONLLU, conllu); 2. SketchEngine, tab-delimited word per line (TAB-WPL, txt)
SIZE: Wordform ...
 This item contains 3 files (32.95 MB).
 
Publicly Available
toolService
Description:
Colloc, a tool for automatic identification of multiword expressions (MWE), is freely available for online use at http://resursai.mwe.lt/atpazintuvas. As material for training DELFI.lt corpus (http://tekstynas.mwe.lt/) ...
 This item contains no files.

Most Viewed Items

Top Last Week
corpus
Description:
MATAS corpus (version 1.0)
DESCRIPTION: Manually checked, morphologically annotated corpus MATAS
FORMATS: 1. CoNLL-U (CONLLU, conllu); 2. SketchEngine, tab-delimited word per line (TAB-WPL, txt)
SIZE: Wordform ...
 This item contains 3 files (32.95 MB).
 
Publicly Available
corpus
Description:
The corpus contains aligned parallel transcripts of TED Talks in English, Lithuanian, and Hebrew. It consists of spoken-language data.
 This item contains 1 file (4.3 MB).
 
Publicly Available
lexicalConceptualResource
Description:
GloVe-type word vectors (embeddings) for Lithuanian. The Delfi.lt corpus (~70 million words) and StanfordNLP were used for training. The training consisted of several stages: 1) the vocabulary was compiled, eliminating words ...
 This item contains 1 file (228.05 MB).
 
Publicly Available