| dc.contributor.author | Dadurkevičius, Virginijus |
| dc.date.accessioned | 2024-03-21T17:53:16Z |
| dc.date.available | 2024-03-21T17:53:16Z |
| dc.date.issued | 2024-03-13 |
| dc.identifier.uri | http://hdl.handle.net/20.500.11821/57 |
| dc.description | We present a comparative wordlist based on the Corpus of the Contemporary Lithuanian Language (CCLL2 version 2, pre-2020), supplemented by media (courtesy of the news media company 15min, www.15min.lt) and social network lexicons from the period of the war in Ukraine (Feb 2022 to Feb 2024). For a fair comparison, all word counts have been normalized to a common size of 100m words per source. CCLL2 has 162m words, the wartime media 36m words, and the wartime social networks 2m words. The term "word" here excludes punctuation, numbers, dates, URLs, hashtags, popular English words, etc. The data is a tab-separated values (TSV) text file with the following columns: word (token), CCLL2 count, CCLL2 docs, media count, media docs, social networks count, social networks docs. Here "docs" means the normalized number of documents containing a particular word. All words are written in capital letters, so the list is case-insensitive. |
| dc.language.iso | lit |
| dc.publisher | SITTI |
| dc.rights | PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT |
| dc.rights.uri | https://clarin.vdu.lt/licenses/eula/PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT.htm |
| dc.rights.label | PUB |
| dc.subject | wordlist |
| dc.subject | Lithuanian |
| dc.subject | Ukraine |
| dc.subject | war |
| dc.subject | wartime |
| dc.subject | frequency counts |
| dc.title | Wordlist of the Contemporary Corpus of Lithuanian Language in the Face of War in Ukraine |
| dc.type | lexicalConceptualResource |
| metashare.ResourceInfo#ContentInfo.detailedType | wordList |
| metashare.ResourceInfo#ContentInfo.mediaType | text |
| hidden | false |
| hasMetadata | false |
| has.files | yes |
| branding | CLARIN-LT |
| contact.person | Virginijus Dadurkevičius virginijus.dadurkevicius@vdu.lt SITTI |
| size.info | 2264779 entries |
| size.info | 2264780 entries |
| files.size | 11111146 |
| files.count | 1 |
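
The description above specifies a seven-column TSV layout and a normalization to 100m words per source. A minimal Python sketch of how such a file might be read and compared follows. It is an illustration under stated assumptions, not the author's tooling: the column keys, the function names, the proportional scaling formula, the smoothing constant, and the sample row are all hypothetical, and the record does not say whether the file has a header row or whether counts are stored as integers or decimals.

```python
import csv
import io

# Hypothetical column keys, in the order given in the description.
COLUMNS = [
    "word", "ccll2_count", "ccll2_docs",
    "media_count", "media_docs",
    "social_count", "social_docs",
]

# Corpus sizes stated in the description (in words).
CORPUS_SIZES = {"ccll2": 162_000_000, "media": 36_000_000, "social": 2_000_000}
NORMALIZED_SIZE = 100_000_000


def normalize(raw_count, corpus_size, target=NORMALIZED_SIZE):
    """Scale a raw count to a corpus of `target` words.

    Assumption: the published counts were scaled proportionally like this.
    """
    return raw_count * target / corpus_size


def read_wordlist(handle):
    """Yield one dict per data row of a tab-separated wordlist."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != len(COLUMNS):
            continue  # skip blank or malformed lines
        try:
            values = [float(v) for v in row[1:]]
        except ValueError:
            continue  # skip a header row, if the file has one
        yield dict(zip(COLUMNS, [row[0], *values]))


def wartime_ratio(entry, smoothing=1.0):
    """Ratio of normalized media count to normalized CCLL2 count.

    `smoothing` avoids division by zero for words absent from CCLL2.
    """
    return (entry["media_count"] + smoothing) / (entry["ccll2_count"] + smoothing)


# Made-up sample row for illustration only; not taken from the dataset.
sample = "KARAS\t1000\t500\t4000\t2000\t6000\t3000\n"
entries = list(read_wordlist(io.StringIO(sample)))
for entry in entries:
    print(entry["word"], round(wartime_ratio(entry), 2))
```

Because the counts are already normalized to the same corpus size, they can be compared directly across the three sources. To process the real file, open it with `encoding="utf-8"` and pass the handle to `read_wordlist`.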