DIGIRES COVID-19 ML dataset v.1 is a tab-separated (.tsv) file prepared for training machine learning algorithms. The training dataset was compiled from various publicly available Lithuanian online media sources. It contains 351 records with the following attributes:
"Title": the title of a news article
"Text": the text of the article
"Label": a label that marks the article as 1: unreliable; 0: reliable
1) "unreliable" marks articles that professional fact checkers identified as fake news; 2) "reliable" marks trustworthy articles.
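
The attribute layout above can be read with Python's standard csv module. The following is a minimal sketch, assuming the file starts with a header row naming the three columns exactly as listed (Title, Text, Label) and is UTF-8 encoded; neither is stated in this description. The function name read_records and the two sample rows are made up for illustration.

```python
import csv
import io

# Label meanings as given in the attribute list above.
LABEL_NAMES = {0: "reliable", 1: "unreliable"}


def read_records(handle):
    """Parse tab-separated rows into dicts with an integer Label.

    Assumes a header row with the columns Title, Text and Label.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    records = []
    for row in reader:
        label = int(row["Label"])
        if label not in LABEL_NAMES:
            raise ValueError(f"unexpected label: {label}")
        records.append({"Title": row["Title"], "Text": row["Text"], "Label": label})
    return records


# Two made-up rows standing in for the real file.
sample = io.StringIO("Title\tText\tLabel\nA\tfirst text\t0\nB\tsecond text\t1\n")
records = read_records(sample)
print(len(records), LABEL_NAMES[records[1]["Label"]])
```

For the real file, the same function could be called on a handle opened with open(path, encoding="utf-8", newline=""), where path is wherever the .tsv was saved. Very long article texts may require raising the parser limit with csv.field_size_limit.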
Class        Records   Word tokens
Reliable     175       67902
Unreliable   176       118747
Total        351       186649
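
A per-class summary like the one in this table could be recomputed with a short script. This is a sketch only: it counts tokens by splitting on whitespace, which is an assumption, since the tokenizer used for the published word-token figures is not described here and the numbers may not match exactly. The function name summarize and the sample records are hypothetical.

```python
from collections import Counter


def summarize(records):
    """Return (records per label, whitespace tokens per label)."""
    record_counts = Counter()
    token_counts = Counter()
    for record in records:
        label = record["Label"]
        record_counts[label] += 1
        # Whitespace splitting is an assumed tokenization, not the dataset's own.
        token_counts[label] += len(record["Text"].split())
    return record_counts, token_counts


# Made-up records; real ones would come from the .tsv file.
records = [
    {"Title": "A", "Text": "first text", "Label": 0},
    {"Title": "B", "Text": "a second longer text", "Label": 1},
    {"Title": "C", "Text": "third", "Label": 1},
]
record_counts, token_counts = summarize(records)
for label in sorted(record_counts):
    print(label, record_counts[label], token_counts[label])
```

On the full dataset, the record counts should come out as 175 for label 0 and 176 for label 1, matching the table.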