
 
dc.contributor.author Bielinskienė, Agnė
dc.contributor.author Boizou, Loïc
dc.contributor.author Bumbulienė, Ieva
dc.contributor.author Kovalevskaitė, Jolanta
dc.contributor.author Krilavičius, Tomas
dc.contributor.author Mandravickaitė, Justina
dc.contributor.author Rimkutė, Erika
dc.contributor.author Vilkaitė-Lozdienė, Laura
dc.date.accessioned 2019-11-27T13:57:44Z
dc.date.available 2019-11-27T13:57:44Z
dc.date.issued 2019
dc.identifier.uri http://hdl.handle.net/20.500.11821/27
dc.description Dataset of 1-grams with frequencies, extracted from the Delfi.lt corpus (~70 million words, period: March 2014 to November 2016). First, the corpus was split into sentences; then symbol analysis was performed, together with an analysis of intended structures made of symbols. A dictionary of abbreviations was also used in order to preserve various abbreviations. Finally, the 1-grams were generated, making 72 million entries in all. The frequencies of all entries were added to the dataset as well.
dc.language.iso lit
dc.publisher Baltic Institute of Advanced Technology
dc.publisher Vytautas Magnus University
dc.rights PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT
dc.rights.uri https://clarin.vdu.lt/licenses/eula/PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT.htm
dc.rights.label PUB
dc.source.uri http://mwe.lt/
dc.subject n-grams
dc.subject Lithuanian
dc.title Lithuanian 1-gram dataset
dc.type lexicalConceptualResource
metashare.ResourceInfo#ContentInfo.detailedType other
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN-LT
contact.person Tomas Krilavičius tomas.krilavicius@vdu.lt Baltic Institute of Advanced Technology; Vytautas Magnus University
sponsor Research Council of Lithuania LIP-027/2016 Automatic Identification of Lithuanian Multi-word Expressions (PASTOVU) nationalFunds
size.info 72000000 entries
files.size 5623336
files.count 1
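
The processing steps named in dc.description (sentence splitting, abbreviation preservation, 1-gram counting) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the abbreviation set, the tokenization regex, the decision to keep punctuation marks as 1-grams, and the decision to leave case unchanged are all hypothetical, and the function names are invented for this example.

```python
import re
from collections import Counter

# Hypothetical abbreviation list; the dictionary actually used for the
# dataset is not part of this record.
ABBREVIATIONS = {"pvz.", "t.y.", "d."}

# Assumed token pattern: words (optionally hyphenated) or single symbols.
TOKEN_RE = re.compile(r"\w+(?:-\w+)*|[^\w\s]")


def split_sentences(text, abbreviations=ABBREVIATIONS):
    """Close a sentence at ., ! or ? unless the token is a known abbreviation."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        if token[-1] in ".!?" and token.lower() not in abbreviations:
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing text without end punctuation
        sentences.append(" ".join(current))
    return sentences


def tokenize(sentence, abbreviations=ABBREVIATIONS):
    """Split a sentence into 1-grams, keeping abbreviations in one piece."""
    tokens = []
    for chunk in sentence.split():
        if chunk.lower() in abbreviations:
            tokens.append(chunk)  # keep the abbreviation with its period
        else:
            tokens.extend(TOKEN_RE.findall(chunk))
    return tokens


def count_unigrams(text):
    """Return a Counter mapping each 1-gram to its frequency."""
    counts = Counter()
    for sentence in split_sentences(text):
        counts.update(tokenize(sentence))
    return counts


if __name__ == "__main__":
    sample = "Jis gyvena Kaune, pvz. centre. Jis dirba."
    for unigram, freq in count_unigrams(sample).most_common():
        print(unigram, freq)
```

In this sketch the abbreviation check happens twice on purpose: once so that a period inside an abbreviation does not end a sentence, and once so that the abbreviation is counted as a single 1-gram rather than as a word plus a period.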


 Files in this item

This item is Publicly Available and licensed under: PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT
Name 1gram.zip
Size 5.36 MB
Format application/zip
Description Lithuanian 1gram dataset
