Show simple item record

 
dc.contributor.author Dadurkevičius, Virginijus
dc.date.accessioned 2020-10-27T06:33:59Z
dc.date.available 2020-10-27T06:33:59Z
dc.date.issued 2020-10-27
dc.identifier.uri http://hdl.handle.net/20.500.11821/41
dc.description The resource is a wordlist of lemmas from the Joint Corpus of Lithuanian (JCL). The JCL merges three corpora: 1) the Vilnius University corpus, compiled from Lithuanian internet content from 2014 and used primarily for machine translation (779.2 million tokens); 2) a legal document corpus in the form of a wordlist (courtesy of the Office of the Seimas of the Republic of Lithuania, 2011) (443.1 million tokens); and 3) the Corpus of the Contemporary Lithuanian Language (CCLL) of Vytautas Magnus University (112.6 million tokens). The total size of the JCL is more than 1.3 billion tokens. The frequency list contains 169,787 lemmas.
dc.language.iso lit
dc.publisher Vilnius university
dc.rights PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT
dc.rights.uri https://clarin.vdu.lt/licenses/eula/PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT.htm
dc.rights.label PUB
dc.subject frequency list
dc.subject wordlist
dc.subject Lithuanian
dc.subject lemmatized frequency list
dc.subject lemmatized wordlist
dc.title Wordlist of Lemmas from the Joint Corpus of Lithuanian
dc.type lexicalConceptualResource
metashare.ResourceInfo#ContentInfo.detailedType wordList
metashare.ResourceInfo#ContentInfo.mediaType text
hidden false
hasMetadata false
has.files yes
branding CLARIN-LT
contact.person Virginijus Dadurkevičius dadurka@gmail.com Vilnius university
size.info 169787 entries
files.size 953273
files.count 1


 Files in this item

This item is Publicly Available and licensed under: PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT
Name: JCL_Wordlist.zip
Size: 930.93 KB
Format: application/zip
