dc.contributor.author |
Dadurkevičius, Virginijus |
dc.date.accessioned |
2020-10-27T06:33:59Z |
dc.date.available |
2020-10-27T06:33:59Z |
dc.date.issued |
2020-10-27 |
dc.identifier.uri |
http://hdl.handle.net/20.500.11821/41 |
dc.description |
The resource is a wordlist of lemmas from the Joint Corpus of Lithuanian (JCL). The JCL merges three corpora: 1) the Vilnius University corpus, compiled from Lithuanian internet content from 2014 and used primarily for machine translation (779.2 million tokens); 2) a legal document corpus in the form of a wordlist (courtesy of the Office of the Seimas of the Republic of Lithuania, 2011) (443.1 million tokens); and 3) the Corpus of the Contemporary Lithuanian Language (CCLL) of Vytautas Magnus University (112.6 million tokens). The JCL totals more than 1.3 billion tokens. The frequency list contains 169,787 lemmas. |
dc.language.iso |
lit |
dc.publisher |
Vilnius University |
dc.rights |
PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT |
dc.rights.uri |
https://clarin.vdu.lt/licenses/eula/PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT.htm |
dc.rights.label |
PUB |
dc.subject |
frequency list |
dc.subject |
wordlist |
dc.subject |
Lithuanian |
dc.subject |
lemmatized frequency list |
dc.subject |
lemmatized wordlist |
dc.title |
Wordlist of Lemmas from the Joint Corpus of Lithuanian |
dc.type |
lexicalConceptualResource |
metashare.ResourceInfo#ContentInfo.detailedType |
wordList |
metashare.ResourceInfo#ContentInfo.mediaType |
text |
hidden |
false |
hasMetadata |
false |
has.files |
yes |
branding |
CLARIN-LT |
contact.person |
Virginijus Dadurkevičius dadurka@gmail.com Vilnius University |
size.info |
169787 entries |
files.size |
953273 |
files.count |
1 |