Age  Commit message  Author
2021-05-21  train-test-split: remove unused paramter  Patrick Simianer
2021-05-21  de-bpe: update  Patrick Simianer
2020-06-30  tmx-extract.py: replace newlines  Patrick Simianer
2020-05-11  Merge pull request #2 from lilt/feature/tmxPython3  pks
            Update tmx-extract.py to use python3
2020-05-11  Update tmx-extract.py to use python3  thomasZen
2020-03-09  de-sgm: match more stuff  Patrick Simianer
2020-02-19  TSV utils  Patrick Simianer
2020-02-19  misc. scripts  pks
2020-02-19  sentencepiece-decode  Patrick Simianer
2020-02-03  langid-polyglot  Patrick Simianer
2020-02-03  zh-ko-or-ja  Patrick Simianer
2020-02-03  NFC normalization in python  Patrick Simianer
2020-02-03  print out all chars  Patrick Simianer
2020-02-03  de-sgm  Patrick Simianer
2020-01-14  Merge pull request #1 from lilt/master  pks
            Remove paragraph opening and closing tag (#1)
2020-01-13  Remove paragraph opening and closing tag (#1)  thomasZen
2019-12-24  Merge branch 'master' of ssh://github.com/pks/nlp_scripts  Patrick Simianer
2019-12-24  de-sgm: tags w/ properties  Patrick Simianer
2019-12-24  mkidx  Patrick Simianer
2019-12-24  biuniq: uniquify a parallel corpus with a dictionary  Patrick Simianer
2019-12-24  formatting  Patrick Simianer
2019-09-03  cma: flush  Patrick Simianer
2019-08-09  Merge branch 'master' of ssh://github.com/pks/nlp_scripts  Patrick Simianer
2019-08-09  percentile  Patrick Simianer
2019-08-05  trollop -> optimist  Patrick Simianer
2019-07-28  infix for repeated  Patrick Simianer
2019-07-28  trollop -> optimist  Patrick Simianer
2019-03-17  toks-per-line: count  Patrick Simianer
2019-03-17  per-sentence-ter: fix  Patrick Simianer
2019-03-16  vocab  Patrick Simianer
2018-10-19  mv  Patrick Simianer
2018-08-14  moving-sum  Patrick Simianer
2018-06-21  mv  Patrick Simianer
2018-04-27  Merge branch 'master' of https://github.com/pks/nlp_scripts  Patrick Simianer
2018-04-27  filter-len: more params  Patrick Simianer
2018-04-17  inv  Patrick Simianer
2018-04-17  Merge branch 'master' of https://github.com/pks/nlp_scripts  Patrick Simianer
2018-04-17  bitext-filter-length improved  Patrick Simianer
2018-04-11  README  Patrick Simianer
2018-04-11  rm  Patrick Simianer
2018-04-11  substract  Patrick Simianer
2018-04-11  Merge branch 'master' of https://github.com/pks/nlp_scripts  Patrick Simianer
2018-04-11  first  Patrick Simianer
2018-04-03  fix  Patrick Simianer
2018-03-29  cma  Patrick Simianer
2018-03-29  bitext-filter-length  Patrick Simianer
2018-01-30  Merge branch 'master' of github.com:pks/nlp_scripts  Patrick Simianer
2018-01-30  tmx-extract.py  Patrick Simianer
2017-12-14  length-ratio  Patrick Simianer
2017-12-14  bitext-filter-length  Patrick Simianer