cdec-dtrain
Mirror of https://github.com/pks/cdec-dtrain.git
path: root/corpus
Age         Commit message                                                            Author
2013-09-05  Unbuffered mode, flush after each line where possible, skip otherwise     Michael Denkowski
2013-09-04  Detokenizer                                                               Michael Denkowski
2013-04-19  Merge branch 'master' of https://github.com/redpony/cdec                  Chris Dyer
2013-04-19  hindi                                                                     Chris Dyer
2013-03-26  swahili abbreviations                                                     Chris Dyer
2013-03-17  fix possible utf8 bug                                                     Chris Dyer
2013-03-08  Merge branch 'master' of https://github.com/redpony/cdec                  Chris Dyer
2013-03-08  few preproc fixes                                                         Chris Dyer
2013-02-27  quick fix                                                                 Chris Dyer
2013-02-23  one missing quote type                                                    Chris Dyer
2013-01-22  russian abbrevs                                                           Chris Dyer
2013-01-21  tokenizer support for utf8 patterns                                       Chris Dyer
2013-01-21  a little bit of cleanup                                                   Chris Dyer
2013-01-20  control max len                                                           Chris Dyer
2013-01-19  updated version of boost.m4 and automatically build kenneth's LM builder  Chris Dyer
2013-01-15  corpus files                                                              Chris Dyer
2012-12-05  slight tokenization bug fix                                               Chris Dyer
2012-12-05  remove logging, you should be using pv                                    Chris Dyer
2012-12-04  more flexible corpus cutting                                              Chris Dyer
2012-11-16  fix                                                                       Chris Dyer
2012-11-16  readme                                                                    Chris Dyer
2012-11-14  major mert clean up, stuff for simple system demo                         Chris Dyer
2012-11-06  Merge branch 'master' of github.com:redpony/cdec                          Chris Dyer
2012-11-06  add lowercase script                                                      Chris Dyer
2012-11-05  script to add sos/eos                                                     Chris Dyer
2012-10-25  add self translation                                                      Chris Dyer
2012-07-28  script to paste files together with the triple pipe separator             Chris Dyer
2012-07-28  a couple of tools for cleaning corpora                                    Chris Dyer