Age | Commit message | Author | |
---|---|---|---|
2024-04-08 | select-from: large files | pks | |
2024-01-11 | de-sgm: fix | Patrick Simianer | |
2024-01-11 | avg: running average | Patrick Simianer | |
2022-12-22 | Merge branch 'master' of ssh://github.com/pks/nlp_scripts | pks | |
2022-12-22 | tmx-to-plain.py | pks | |
2021-11-11 | de-sgm: use egrep | Patrick Simianer | |
2021-05-21 | biuniq: fix | Patrick Simianer | |
2021-05-21 | remove-test-from-bitext | Patrick Simianer | |
2021-05-21 | tmx-extract-original-py2.py | Patrick Simianer | |
2021-05-21 | tsv-exclude | Patrick Simianer | |
2021-05-21 | remove-devtest | Patrick Simianer | |
2021-05-21 | merge | Patrick Simianer | |
2021-05-21 | train-test-split: remove unused parameter | Patrick Simianer | |
2021-05-21 | de-bpe: update | Patrick Simianer | |
2020-09-27 | train-test-split: proper implementation | Patrick Simianer | |
2020-08-12 | Merge branch 'master' of ssh://github.com/pks/nlp_scripts | Patrick Simianer | |
2020-08-02 | moving-average | Patrick Simianer | |
2020-06-30 | tmx-extract.py: replace newlines | Patrick Simianer | |
2020-05-11 | Merge pull request #2 from lilt/feature/tmxPython3: Update tmx-extract.py to use python3 | pks | |
2020-05-11 | Update tmx-extract.py to use python3 | thomasZen | |
2020-03-09 | de-sgm: match more stuff | Patrick Simianer | |
2020-02-19 | TSV utils | Patrick Simianer | |
2020-02-19 | misc. scripts | pks | |
2020-02-19 | sentencepiece-decode | Patrick Simianer | |
2020-02-03 | langid-polyglot | Patrick Simianer | |
2020-02-03 | zh-ko-or-ja | Patrick Simianer | |
2020-02-03 | NFC normalization in python | Patrick Simianer | |
2020-02-03 | print out all chars | Patrick Simianer | |
2020-02-03 | de-sgm | Patrick Simianer | |
2020-01-14 | Merge pull request #1 from lilt/master: Remove paragraph opening and closing tag (#1) | pks | |
2020-01-13 | Remove paragraph opening and closing tag (#1) | thomasZen | |
2019-12-24 | Merge branch 'master' of ssh://github.com/pks/nlp_scripts | Patrick Simianer | |
2019-12-24 | de-sgm: tags w/ properties | Patrick Simianer | |
2019-12-24 | mkidx | Patrick Simianer | |
2019-12-24 | biuniq: uniquify a parallel corpus with a dictionary | Patrick Simianer | |
2019-12-24 | formatting | Patrick Simianer | |
2019-09-03 | cma: flush | Patrick Simianer | |
2019-08-09 | Merge branch 'master' of ssh://github.com/pks/nlp_scripts | Patrick Simianer | |
2019-08-09 | percentile | Patrick Simianer | |
2019-08-05 | trollop -> optimist | Patrick Simianer | |
2019-07-28 | infix for repeated | Patrick Simianer | |
2019-07-28 | trollop -> optimist | Patrick Simianer | |
2019-03-17 | toks-per-line: count | Patrick Simianer | |
2019-03-17 | per-sentence-ter: fix | Patrick Simianer | |
2019-03-16 | vocab | Patrick Simianer | |
2018-10-19 | mv | Patrick Simianer | |
2018-08-14 | moving-sum | Patrick Simianer | |
2018-06-21 | mv | Patrick Simianer | |
2018-04-27 | Merge branch 'master' of https://github.com/pks/nlp_scripts | Patrick Simianer | |
2018-04-27 | filter-len: more params | Patrick Simianer | |