GitHub - alvations/sacremoses: Python port of Moses ...
https://github.com/alvations/sacremoses
$ sacremoses tokenize --help
Usage: sacremoses tokenize [OPTIONS]
Options:
  -a, --aggressive-dash-splits   Triggers dash split rules.
  -x, --xml-escape               Escape special characters for XML.
  -p, --protected-patterns TEXT  Specify file with patters to be protected in tokenisation.
  -c, --custom-nb-prefixes TEXT  Specify a custom non-breaking prefixes file, add prefixes to the default ones …
fast-mosestokenizer · PyPI
https://pypi.org/project/fast-mosestokenizer · Oct 29, 2021
fast-mosestokenizer is a C++ implementation of the Moses tokenizer, a favourite in NLP research. The reason to use this package over the original Perl implementation is portability: with the C++ source code, the library can be used from basically every language.
mosestokenizer · PyPI
https://pypi.org/project/mosestokenizer · Oct 22, 2021
This package provides wrappers for some pre-processing Perl scripts from the Moses toolkit, namely, normalize-punctuation.perl, tokenizer.perl, detokenizer.perl and split-sentences.perl.
Sample Usage. All provided classes are importable from the package mosestokenizer.
>>> from mosestokenizer import *
All classes have a constructor that takes a two-letter language code as argument ('en', 'fr', 'de', etc.) and the resulting objects are callable. When created, these wrapper objects launch the corresponding Perl script as a background process.
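The constructor-plus-callable pattern described in this snippet can be sketched roughly as follows. This is a minimal illustration, not the package's own sample: `tokenize_lines` and its `tokenizer_factory` parameter are hypothetical names introduced here, and the sketch assumes the wrapper object exposes a `close()` method for stopping the background Perl process.

```python
from typing import Any, Callable, Iterable, List, Optional


def tokenize_lines(
    lines: Iterable[str],
    lang: str = "en",
    tokenizer_factory: Optional[Callable[[str], Any]] = None,
) -> List[List[str]]:
    """Tokenize each line with a callable wrapper, then shut the wrapper down.

    `tokenizer_factory` is a hypothetical hook: any callable that takes a
    two-letter language code and returns a callable object with `close()`.
    """
    if tokenizer_factory is None:
        # Imported lazily so the helper can be exercised without the
        # package (and Perl) being installed.
        from mosestokenizer import MosesTokenizer

        tokenizer_factory = MosesTokenizer

    # The constructor takes a two-letter language code ('en', 'fr', 'de', ...).
    tokenize = tokenizer_factory(lang)
    try:
        # The resulting object is callable: one input line in, tokens out.
        return [tokenize(line) for line in lines]
    finally:
        # Assumption: close() ends the background Perl process.
        tokenize.close()
```

Wrapping the calls in `try`/`finally` is a design choice for this sketch: because each wrapper starts a background process on creation, releasing it even when a line fails avoids leaving stray processes behind.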
GitHub - luismsgomes/mosestokenizer
https://github.com/luismsgomes/mosestokenizer · Oct 22, 2021
This package provides wrappers for some pre-processing Perl scripts from the Moses toolkit, namely, normalize-punctuation.perl, tokenizer.perl, detokenizer.perl and split-sentences.perl.
Sample Usage. All provided classes are importable from the package mosestokenizer.
>>> from mosestokenizer import *