
 
dc.contributor.author Utka, Andrius
dc.date.accessioned 2016-11-17T12:12:13Z
dc.date.available 2016-11-17T12:12:13Z
dc.date.issued 2016-11-17
dc.identifier.uri http://hdl.handle.net/20.500.11821/8
dc.description Dabartinės lietuvių kalbos tekstyno žodžių formų dažniniai sąrašai / Wordlists of Wordforms of the Contemporary Corpus of Lithuanian language
    Corpus structure:
        Subcorpus        Words (millions)   Proportion
        Fiction                     15.54        12.6%
        Non-fiction                 19.99        16.2%
        Documents                   11.19         9.1%
        Periodicals                 76.24        61.8%
        Speech Corpus                0.49         0.4%
        Total                      123.45         100%
    Websites: tekstynas.vdu.lt, corpus.vdu.lt
    Date: 2016.10.17; 2022.11.15 (upgraded method of handling punctuation and format)
    Method:
        sed -e 's/<[^>]*>//g' *.txt | tr q'[:punct:]' ' ' | tr -s ' ' | tr ' ' '\n' | tr '[:upper:]' '[:lower:]' | grep -v '[^a-z]' | grep -v "^\s*$" | sort | uniq -c | sort -rn > freq-visas.txt
    Reference: Rimkutė E., Kovalevskaitė J., Melninkaitė V., Utka A., Vitkutė-Adžgauskienė D. 2010: Corpus of Contemporary Lithuanian Language – the Standardised Way. Proceedings of the Fourth International Conference Human Language Technologies – The Baltic Perspective, 154–160.
    Licence: CLARIN-LT PUB
dc.language.iso lit
dc.publisher Vytautas Magnus University
dc.rights PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT
dc.rights.uri https://clarin.vdu.lt/licenses/eula/PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT.htm
dc.rights.label PUB
dc.subject wordlist
dc.subject Lithuanian
dc.title Wordlist of the Contemporary Corpus of Lithuanian language
dc.type lexicalConceptualResource
metashare.ResourceInfo#ContentInfo.detailedType wordList
metashare.ResourceInfo#ContentInfo.mediaType text
hidden false
hasMetadata false
has.files yes
branding CLARIN-LT
contact.person Andrius Utka andrius.utka@vdu.lt Vytautas Magnus University
size.info 1850477 entries
files.size 34766572
files.count 2
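
The Method pipeline recorded in dc.description (strip markup, replace punctuation with spaces, lowercase, keep letter-only tokens, count, sort by frequency) can be approximated in Python. This is a minimal sketch, not the procedure used to build the published lists: the function name `word_form_frequencies` and the sample sentence are hypothetical, and it assumes that tokens containing Lithuanian letters such as ą or ū should be kept, which in the shell version depends on the locale in which `grep -v '[^a-z]'` runs.

```python
import re
import string
from collections import Counter

# Same tag pattern as the sed step: anything between "<" and ">".
TAG_RE = re.compile(r"<[^>]*>")

# Map every ASCII punctuation character to a space, like tr '[:punct:]' ' '.
PUNCT_TABLE = str.maketrans({ch: " " for ch in string.punctuation})


def word_form_frequencies(text: str) -> Counter:
    """Count lowercase word forms in a text (hypothetical helper)."""
    text = TAG_RE.sub("", text)          # remove markup tags
    text = text.translate(PUNCT_TABLE)   # punctuation becomes whitespace
    tokens = text.lower().split()        # lowercase, split on whitespace
    # Assumption: keep tokens made only of letters (Unicode-aware),
    # dropping anything that contains digits or other symbols.
    return Counter(tok for tok in tokens if tok.isalpha())


if __name__ == "__main__":
    sample = "<p>Labas rytas, Lietuva! Labas vakaras 2016.</p>"
    freq = word_form_frequencies(sample)
    # Most frequent first, count before word, similar to `uniq -c | sort -rn`.
    for word, count in freq.most_common():
        print(f"{count:7d} {word}")
```

Unlike the shell pipeline, this sketch processes one string at a time, so a real run would need to read and concatenate the corpus files first.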


Files in this item (33.16 MB in total)

This item is Publicly Available and licensed under: PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT
Name                      Size       Format
CCLL-Wordlists-2022.zip   33.15 MB   application/zip
0readme.txt               1.21 KB    Text file
