Age  Commit message  Author
2024-04-08  select-from: large files  pks
2024-01-11  de-sgm: fix  Patrick Simianer
2024-01-11  avg: running average  Patrick Simianer
2022-12-22  Merge branch 'master' of ssh://github.com/pks/nlp_scripts  pks
2022-12-22  tmx-to-plain.py  pks
2021-11-11  de-sgm: use egrep  Patrick Simianer
2021-05-21  biuniq: fix  Patrick Simianer
2021-05-21  remove-test-from-bitext  Patrick Simianer
2021-05-21  tmx-extract-original-py2.py  Patrick Simianer
2021-05-21  tsv-exclude  Patrick Simianer
2021-05-21  remove-devtest  Patrick Simianer
2021-05-21  merge  Patrick Simianer
2021-05-21  train-test-split: remove unused paramter  Patrick Simianer
2021-05-21  de-bpe: update  Patrick Simianer
2020-09-27  train-test-split: proper implementation  Patrick Simianer
2020-08-12  Merge branch 'master' of ssh://github.com/pks/nlp_scripts  Patrick Simianer
2020-08-02  moving-average  Patrick Simianer
2020-06-30  tmx-extract.py: replace newlines  Patrick Simianer
2020-05-11  Merge pull request #2 from lilt/feature/tmxPython3  pks
2020-05-11  Update tmx-extract.py to use python3  thomasZen
2020-03-09  de-sgm: match more stuff  Patrick Simianer
2020-02-19  TSV utils  Patrick Simianer
2020-02-19  misc. scripts  pks
2020-02-19  sentencepiece-decode  Patrick Simianer
2020-02-03  langid-polyglot  Patrick Simianer
2020-02-03  zh-ko-or-ja  Patrick Simianer
2020-02-03  NFC normalization in python  Patrick Simianer
2020-02-03  print out all chars  Patrick Simianer
2020-02-03  de-sgm  Patrick Simianer
2020-01-14  Merge pull request #1 from lilt/master  pks
2020-01-13  Remove paragraph opening and closing tag (#1)  thomasZen
2019-12-24  Merge branch 'master' of ssh://github.com/pks/nlp_scripts  Patrick Simianer
2019-12-24  de-sgm: tags w/ properties  Patrick Simianer
2019-12-24  mkidx  Patrick Simianer
2019-12-24  biuniq: uniquify a parallel corpus with a dictionary  Patrick Simianer
2019-12-24  formatting  Patrick Simianer
2019-09-03  cma: flush  Patrick Simianer
2019-08-09  Merge branch 'master' of ssh://github.com/pks/nlp_scripts  Patrick Simianer
2019-08-09  percentile  Patrick Simianer
2019-08-05  trollop -> optimist  Patrick Simianer
2019-07-28  infix for repeated  Patrick Simianer
2019-07-28  trollop -> optimist  Patrick Simianer
2019-03-17  toks-per-line: count  Patrick Simianer
2019-03-17  per-sentence-ter: fix  Patrick Simianer
2019-03-16  vocab  Patrick Simianer
2018-10-19  mv  Patrick Simianer
2018-08-14  moving-sum  Patrick Simianer
2018-06-21  mv  Patrick Simianer
2018-04-27  Merge branch 'master' of https://github.com/pks/nlp_scripts  Patrick Simianer
2018-04-27  filter-len: more params  Patrick Simianer