Age  Commit message  Author
2020-02-19  sentencepiece-decode  Patrick Simianer
2020-02-03  langid-polyglot  Patrick Simianer
2020-02-03  zh-ko-or-ja  Patrick Simianer
2020-02-03  NFC normalization in python  Patrick Simianer
2020-02-03  print out all chars  Patrick Simianer
2020-02-03  de-sgm  Patrick Simianer
2020-01-14  Merge pull request #1 from lilt/master  pks
            Remove paragraph opening and closing tag (#1)
2020-01-13  Remove paragraph opening and closing tag (#1)  thomasZen
2019-12-24  Merge branch 'master' of ssh://github.com/pks/nlp_scripts  Patrick Simianer
2019-12-24  de-sgm: tags w/ properties  Patrick Simianer
2019-12-24  mkidx  Patrick Simianer
2019-12-24  biuniq: uniquify a parallel corpus with a dictionary  Patrick Simianer
2019-12-24  formatting  Patrick Simianer
2019-09-03  cma: flush  Patrick Simianer
2019-08-09  Merge branch 'master' of ssh://github.com/pks/nlp_scripts  Patrick Simianer
2019-08-09  percentile  Patrick Simianer
2019-08-05  trollop -> optimist  Patrick Simianer
2019-07-28  infix for repeated  Patrick Simianer
2019-07-28  trollop -> optimist  Patrick Simianer
2019-03-17  toks-per-line: count  Patrick Simianer
2019-03-17  per-sentence-ter: fix  Patrick Simianer
2019-03-16  vocab  Patrick Simianer
2018-10-19  mv  Patrick Simianer
2018-08-14  moving-sum  Patrick Simianer
2018-06-21  mv  Patrick Simianer
2018-04-27  Merge branch 'master' of https://github.com/pks/nlp_scripts  Patrick Simianer
2018-04-27  filter-len: more params  Patrick Simianer
2018-04-17  inv  Patrick Simianer
2018-04-17  Merge branch 'master' of https://github.com/pks/nlp_scripts  Patrick Simianer
2018-04-17  bitext-filter-length improved  Patrick Simianer
2018-04-11  README  Patrick Simianer
2018-04-11  rm  Patrick Simianer
2018-04-11  substract  Patrick Simianer
2018-04-11  Merge branch 'master' of https://github.com/pks/nlp_scripts  Patrick Simianer
2018-04-11  first  Patrick Simianer
2018-04-03  fix  Patrick Simianer
2018-03-29  cma  Patrick Simianer
2018-03-29  bitext-filter-length  Patrick Simianer
2018-01-30  Merge branch 'master' of github.com:pks/nlp_scripts  Patrick Simianer
2018-01-30  tmx-extract.py  Patrick Simianer
2017-12-14  length-ratio  Patrick Simianer
2017-12-14  bitext-filter-length  Patrick Simianer
2017-12-13  filter-tokens  Patrick Simianer
2017-12-05  bishuf: proper fixed source of randomness  Patrick Simianer
2017-12-05  bishuf: simplistic synchronized shuffing of two files  Patrick Simianer
2017-12-05  lang  Patrick Simianer
2017-12-03  select-from: fix  Patrick Simianer
2017-12-03  lang  Patrick Simianer
2017-12-03  filter-len  Patrick Simianer
2017-12-03  hist-tok: +x  Patrick Simianer