Dadurkevičius, Virginijus 2020-10-27T06:33:59Z 2020-10-27T06:33:59Z 2020-10-27
dc.description The resource is a wordlist of lemmas from the Joint Corpus of Lithuanian (JCL). The JCL merges three corpora: 1) the Vilnius University corpus, compiled from Lithuanian internet content from 2014 and used primarily for machine translation (779.2 million tokens); 2) a legal document corpus in the form of a wordlist (courtesy of the Office of the Seimas of the Republic of Lithuania, 2011) (443.1 million tokens); and 3) the Corpus of the Contemporary Lithuanian Language (CCLL) of Vytautas Magnus University (112.6 million tokens). The total size of the JCL is more than 1.3 billion tokens. The frequency list contains 169,787 lemmas.
dc.language.iso lit
dc.publisher Vilnius university
dc.rights PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT
dc.rights.label PUB
dc.subject frequency list
dc.subject wordlist
dc.subject Lithuanian
dc.subject lemmatized frequency list
dc.subject lemmatized wordlist
dc.title Wordlist of Lemmas from the Joint Corpus of Lithuanian
dc.type lexicalConceptualResource
metashare.ResourceInfo#ContentInfo.detailedType wordList
metashare.ResourceInfo#ContentInfo.mediaType text
hidden false
hasMetadata false
has.files yes
branding CLARIN-LT
contact.person Virginijus Dadurkevičius Vilnius university
169787 entries
files.size 953273
files.count 1

 Files in this item

This item is publicly available and licensed under PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT.
930.93 KB