Tagger and lemmatizer HOWTO
Installation
> git clone https://github.com/ufal/morphodita
> cd morphodita/src/
> vim Makefile.builtem
- C_FLAGS += -std=c++11 -W -Wall -mtune=generic -msse -msse2 -mfpmath=sse -fvisibility=hidden -U_FORTIFY_SOURCE
+ C_FLAGS += -std=c++11 -W -Wall -march=native -fvisibility=hidden -U_FORTIFY_SOURCE
> make
Models
Download, unzip:
Czech: https://lindat.mff.cuni.cz/repository/xmlui/handle/11858/00-097C-0000-0023-68D8-1
English: https://lindat.mff.cuni.cz/repository/xmlui/handle/11858/00-097C-0000-0023-68D9-0
(download link is at the bottom of the page)
(beware, the models may have a non-free license)
Run tagger
echo "Červený střízlíček a střapatá žluva ďobali šťavnaté ocúny" \
  | ./run_tagger czech-morfflex-pdt-131112-raw_lemmas.tagger-best_accuracy
Run lemmatizer
echo "Červený střízlíček a střapatá žluva ďobali šťavnaté ocúny." \
  | ./run_tagger --input=untokenized --output=vertical \
      czech-morfflex-pdt-131112-pos_only-raw_lemmas.tagger 2>/dev/null \
  | cut -f 2 | tr "\n" " "
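The pipeline above joins every lemma onto a single line. If one line of lemmas per sentence is preferred, a small filter along the following lines might work. This is only a sketch: `lemmas_per_sentence` is a hypothetical helper name, the sample input is made up (its third column is a placeholder, not real tagger output), and it assumes that sentences in the vertical output are separated by empty lines.

```shell
# Join the second tab-separated column (the lemma in the pipeline above)
# into one space-separated line per sentence.
# Assumption: sentences in the vertical output are separated by empty lines.
lemmas_per_sentence() {
  awk -F '\t' '
    NF == 0 { if (line != "") print line; line = ""; next }
    { line = (line == "" ? $2 : line " " $2) }
    END { if (line != "") print line }
  '
}

# Hypothetical sample input standing in for the tagger's vertical output.
sample_output=$(printf 'Cats\tcat\tTAG\nsleep\tsleep\tTAG\n\nDogs\tdog\tTAG\nbark\tbark\tTAG\n' \
  | lemmas_per_sentence)
printf '%s\n' "$sample_output"
```

With the real tagger, the filter would replace the `cut -f 2 | tr "\n" " "` part of the pipeline.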
Problems
Loading big models takes several seconds, but the tagging itself is very fast. The new version contains a REST server, so it can be started once and handle multiple requests.
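Without the REST server, the loading cost can still be paid only once per batch by sending all input through a single tagger process instead of starting one process per sentence. The sketch below illustrates this; `TAGGER_CMD` and `tag_file` are hypothetical names introduced here, and the default command simply reuses the invocation from the "Run tagger" section.

```shell
# Tag a whole file with one tagger process, so the model is loaded once
# rather than once per sentence.
# TAGGER_CMD is a hypothetical variable; its default reuses the run_tagger
# invocation shown earlier on this page.
TAGGER_CMD=${TAGGER_CMD:-./run_tagger czech-morfflex-pdt-131112-raw_lemmas.tagger-best_accuracy}

tag_file() {
  # $1: text file to tag. $TAGGER_CMD is left unquoted on purpose so that
  # the shell splits it into the command and its model argument.
  $TAGGER_CMD < "$1"
}
```

Usage would then be `tag_file sentences.txt > sentences.tagged`, where both file names are placeholders.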
tagger.txt · Last modified: 2019-06-21 13:32:06 by 127.0.0.1