dc.contributor.author |
Lamb, William |
dc.contributor.author |
Boizou, Loïc |
dc.date.accessioned |
2021-06-02T14:32:43Z |
dc.date.available |
2021-06-02T14:32:43Z |
dc.date.issued |
2020 |
dc.identifier.uri |
http://hdl.handle.net/20.500.11821/44 |
dc.description |
A linguistic analyser for tagging, lemmatisation and parsing of Scottish Gaelic texts. Morphological and syntactic analyses are available directly from the webpage (through the text area window) or as a web service. A simple tagger option using a restricted tagset is also provided.
LANGUAGE DATA
The tagger was trained on the ARCOSG corpus (https://github.com/Gaelic-Algorithmic-Research-Group/ARCOSG) using Conditional Random Fields with scikit-learn (https://scikit-learn.org). The lemmatiser was built on top of a lexicon provided by Michael Bauer and Will Robertson (www.faclair.com). The integrated UDPipe parser (http://ufal.mff.cuni.cz/udpipe) was trained with the link2 option on Colin Batchelor's UD Gaelic Treebank (https://universaldependencies.org/).
OUTPUT FORMAT
Vertical tabular:
- simple tabbed text for direct HTML page results,
- simple tabbed text file or CoNLL-U file for web service results.
Grammatical information is encoded using the ARCOSG and UD tagsets.
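As an illustration of consuming the CoNLL-U output mentioned above, here is a minimal Python sketch that reads CoNLL-U text into one list of token dictionaries per sentence. It assumes the standard ten-column CoNLL-U layout defined by Universal Dependencies. The function name read_conllu and the sample sentence are hypothetical and hand-written for this example; they are not output of the toolkit, and the toolkit's actual web service interface is not shown here.

```python
from typing import Dict, Iterator, List

# The ten standard CoNLL-U columns, in order (Universal Dependencies format).
CONLLU_FIELDS = [
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
]


def read_conllu(text: str) -> Iterator[List[Dict[str, str]]]:
    """Yield one list of token dicts per sentence (hypothetical helper)."""
    sentence: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Comment lines (e.g. "# text = ...") carry sentence metadata; skipped here.
            continue
        cols = line.split("\t")
        if len(cols) != len(CONLLU_FIELDS):
            raise ValueError(f"expected 10 tab-separated columns, got {len(cols)}")
        sentence.append(dict(zip(CONLLU_FIELDS, cols)))
    if sentence:
        # Handle input that does not end with a blank line.
        yield sentence


# Hand-written sample for illustration only; NOT actual toolkit output.
SAMPLE = (
    "# text = Tha mi sgìth\n"
    "1\tTha\tbi\tVERB\t_\t_\t0\troot\t_\t_\n"
    "2\tmi\tmi\tPRON\t_\t_\t1\tnsubj\t_\t_\n"
    "3\tsgìth\tsgìth\tADJ\t_\t_\t1\txcomp:pred\t_\t_\n"
    "\n"
)

sentences = list(read_conllu(SAMPLE))
for token in sentences[0]:
    print(token["form"], token["lemma"], token["upos"], token["head"], token["deprel"])
```

All values are kept as strings, so fields such as head need an explicit int() conversion before building a dependency tree.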
EVALUATION
Full tagger accuracy of 90.7% (measured on about 4.6% of the ARCOSG corpus)
Simple tagger accuracy of 94.7% (measured on about 4.6% of the ARCOSG corpus)
Lemmatisation and parsing have not been evaluated yet. |
dc.language.iso |
gla |
dc.publisher |
University of Edinburgh |
dc.publisher |
Vytautas Magnus University, Centre of Computational Linguistics |
dc.source.uri |
https://klc.vdu.lt/sgtoolkit/en |
dc.subject |
tagger |
dc.subject |
parser |
dc.subject |
Scottish Gaelic |
dc.title |
The Scottish Gaelic Linguistic Toolkit |
dc.type |
toolService |
metashare.ResourceInfo#ContentInfo.detailedType |
service |
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent |
true |
has.files |
no |
branding |
CLARIN-LT |
contact.person |
William Lamb w.lamb@ed.ac.uk University of Edinburgh |
contact.person |
Loïc Boizou lboizou@gmail.com Vytautas Magnus University, Centre of Computational Linguistics |
files.size |
0 |
files.count |
0 |