
 
dc.contributor.author Bielinskienė, Agnė
dc.contributor.author Boizou, Loïc
dc.contributor.author Bumbulienė, Ieva
dc.contributor.author Kovalevskaitė, Jolanta
dc.contributor.author Krilavičius, Tomas
dc.contributor.author Mandravickaitė, Justina
dc.contributor.author Rimkutė, Erika
dc.contributor.author Vilkaitė-Lozdienė, Laura
dc.date.accessioned 2019-11-27T13:57:19Z
dc.date.available 2019-11-27T13:57:19Z
dc.date.issued 2019
dc.identifier.uri http://hdl.handle.net/20.500.11821/26
dc.description GloVe-type word vectors (embeddings) for Lithuanian. The Delfi.lt corpus (~70 million words) and StanfordNLP were used for training. Training consisted of several stages: 1) the vocabulary was compiled, eliminating words with a frequency of less than 5; 2) a word co-occurrence matrix was generated with a window size of 5; 3) this matrix was randomly shuffled; 4) word vectors were generated (100 iterations, 200 dimensions). The final result consists of 331 203 unique word vectors.
dc.language.iso lit
dc.publisher Baltic Institute of Advanced Technology
dc.publisher Vytautas Magnus University
dc.rights PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT
dc.rights.uri https://clarin.vdu.lt/licenses/eula/PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT.htm
dc.rights.label PUB
dc.source.uri http://mwe.lt/
dc.subject word embeddings
dc.subject Lithuanian
dc.subject embeddings
dc.title Lithuanian Word embeddings
dc.type lexicalConceptualResource
metashare.ResourceInfo#ContentInfo.detailedType other
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN-LT
contact.person Tomas Krilavičius tomas.krilavicius@vdu.lt Baltic Institute of Advanced Technology; Vytautas Magnus University
sponsor The Research Council of Lithuania LIP-027/2016 Automatic Identification of Lithuanian Multi-word Expressions (PASTOVU) nationalFunds
size.info 331 203 entries
files.size 239127890
files.count 1
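
The first three training stages listed in dc.description (vocabulary filtering, co-occurrence counting, shuffling) can be illustrated with a toy sketch. This is not the tooling used to build the resource. The function names are hypothetical, and the 1/distance weighting and the dropping of out-of-vocabulary tokens before windowing are assumptions the record does not state. Stage 4 (fitting the vectors) is omitted.

```python
import random
from collections import Counter, defaultdict

MIN_COUNT = 5    # stage 1: keep words seen at least 5 times
WINDOW_SIZE = 5  # stage 2: context window of 5 tokens


def build_vocab(tokens, min_count=MIN_COUNT):
    """Stage 1: return the set of words occurring at least min_count times."""
    counts = Counter(tokens)
    return {word for word, n in counts.items() if n >= min_count}


def cooccurrence(tokens, vocab, window=WINDOW_SIZE):
    """Stage 2: accumulate symmetric co-occurrence weights.

    Assumptions of this sketch: tokens outside the vocabulary are dropped
    before windowing, and each pair is weighted by 1/distance.
    """
    kept = [t for t in tokens if t in vocab]
    matrix = defaultdict(float)
    for i, word in enumerate(kept):
        for d in range(1, window + 1):
            if i - d < 0:
                break
            context = kept[i - d]
            matrix[(word, context)] += 1.0 / d
            matrix[(context, word)] += 1.0 / d
    return dict(matrix)


def shuffled_entries(matrix, seed=0):
    """Stage 3: return the matrix entries in random order."""
    entries = list(matrix.items())
    random.Random(seed).shuffle(entries)
    return entries
```

Calling `build_vocab` on a token list, then `cooccurrence` and `shuffled_entries` on the result, yields the kind of shuffled (word pair, weight) stream that a GloVe-style trainer would consume.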


 Files in this item

This item is Publicly Available and licensed under: PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT
Name vectors.zip
Size 228.05 MB
Format application/zip
Description Lithuanian word vectors
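
A minimal sketch of reading the vectors, assuming the archive contains the usual GloVe plain-text layout (one word per line followed by its space-separated values). The record does not state the file name or format inside vectors.zip, so the helper names and the two-dimensional sample below are made up for illustration; the real vectors have 200 dimensions.

```python
import io
import math


def load_vectors(lines):
    """Parse 'word v1 v2 ... vN' lines into a dict of word -> list of floats."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Made-up sample standing in for the real file (assumed format, not real data).
sample = io.StringIO("namas 1.0 0.0\nbutas 0.8 0.6\n")
vectors = load_vectors(sample)
similarity = cosine(vectors["namas"], vectors["butas"])
```

With the downloaded archive, the same `load_vectors` function could be fed a text-mode file handle opened on the extracted vectors file.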
