Run ./tokenize.sh to tokenize text.
To keep particular strings from being segmented, add rules for them to eng_token_patterns and eng_token_list.
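
The formats of eng_token_patterns and eng_token_list are not described here, so the following is only a rough sketch of the general idea and not the logic of tokenize.sh. It assumes, hypothetically, that the list holds literal strings and the patterns file holds regular expressions, and it shows how such rules can stop a simple tokenizer from splitting them. The function name and the default splitting rules are made up for illustration.

```python
import re


def tokenize(text, protected_literals, protected_patterns):
    """Split text into tokens, keeping protected strings in one piece.

    protected_literals: exact strings that must not be segmented
        (hypothetical stand-in for the contents of eng_token_list).
    protected_patterns: regular expressions whose matches must not be
        segmented (hypothetical stand-in for eng_token_patterns).
    """
    # Protected rules are tried before the default rules, so they win
    # whenever both could match at the same position.
    parts = list(protected_patterns)
    # Longer literals first, so "U.S.A." would be preferred over "U.S.".
    parts += [re.escape(lit)
              for lit in sorted(protected_literals, key=len, reverse=True)]
    # Default rules: a run of word characters, or one punctuation symbol.
    parts += [r"\w+", r"[^\w\s]"]
    splitter = re.compile("|".join(f"(?:{p})" for p in parts))
    return [m.group(0) for m in splitter.finditer(text)]


if __name__ == "__main__":
    sentence = "Dr. Smith paid $3.50 in the U.S. today."
    print(tokenize(sentence, [], []))
    print(tokenize(sentence, ["Dr.", "U.S."], [r"\$\d+\.\d+"]))
```

With no rules, "Dr." and "U.S." are broken at every period and "$3.50" at every symbol; with the rules supplied, each comes out as a single token.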