
 
dc.contributor.author Kapočiūtė-Dzikienė, Jurgita
dc.contributor.author Šarkutė, Ligita
dc.contributor.author Utka, Andrius
dc.date.accessioned 2017-10-11T06:10:25Z
dc.date.available 2017-10-11T06:10:25Z
dc.date.issued 2017-10-05
dc.identifier.uri http://hdl.handle.net/20.500.11821/17
dc.description The 23.9-million-word Lithuanian Parliament corpus is designed specifically for the authorship attribution task. It consists of about 111 thousand speech transcript samples from 147 members of the Lithuanian Seimas and covers the period from March 1990 to December 2013. Each line in a corpus file contains a different text feature that can be used in the authorship attribution task (Kapočiūtė-Dzikienė et al. 2014). References: Kapočiūtė-Dzikienė, Jurgita; Utka, Andrius; Šarkutė, Ligita. 2014. Feature exploration for authorship attribution of Lithuanian parliamentary speeches. Text, Speech and Dialogue: 17th International Conference, TSD 2014, Brno, Czech Republic, September 8-12, 2014: proceedings, pp. 93-100. Kapočiūtė-Dzikienė, Jurgita; Nivre, Joakim; Krupavičius, Algis. 2013. Lithuanian Dependency Parsing with Rich Morphological Features. Empirical Methods in Natural Language Processing - 4th Workshop on Statistical Parsing of Morphologically Rich Languages (SPMRL'2013), pp. 12-21. Zinkevičius, Vytautas. 2000. Lemuoklis - morfologinei analizei. Gudaitis, L. (ed.) Darbai ir Dienos, 24: 246-273.
dc.language.iso lit
dc.publisher Vytautas Magnus University
dc.rights PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT
dc.rights.uri https://clarin.vdu.lt/licenses/eula/PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT.htm
dc.rights.label PUB
dc.source.uri http://dangus.vdu.lt/~jkd/eng/
dc.subject corpus
dc.subject authorship attribution
dc.subject Lithuanian
dc.subject supervised machine learning
dc.title Lithuanian Parliament Corpus for Authorship Attribution
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
hidden false
hasMetadata false
has.files yes
branding CLARIN-LT
contact.person Andrius Utka andrius.utka@vdu.lt Vytautas Magnus University
contact.person Jurgita Kapočiūtė-Dzikienė jurgita.kapociute-dzikiene@vdu.lt Vytautas Magnus University
sponsor Research Council of Lithuania LIT-8-69 Automatic Authorship Attribution and Author Profiling for the Lithuanian Language (ASTRA) nationalFunds
size.info 23908302 words
size.info 147 classes
size.info 110908 texts
files.size 1844091961
files.count 4


 Files in this item

This item is publicly available and licensed under: PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT
Name: EN-about.pdf
Size: 349.51 KB
Format: PDF
Description: Description

Name: LT-apie.pdf
Size: 432.79 KB
Format: PDF
Description: description (Lithuanian version)

Name: A-M_Lithuanian_Parliament_Corpus.zip
Size: 1018.65 MB
Format: application/zip
Description: Unknown

Name: N-V_Lithuanian_Parliament_Corpus.zip
Size: 739.24 MB
Format: application/zip
Description: zip archive
