author | Kenneth Heafield <github@kheafield.com> | 2012-10-22 12:07:20 +0100 |
---|---|---|
committer | Kenneth Heafield <github@kheafield.com> | 2012-10-22 12:07:20 +0100 |
commit | 5f98fe5c4f2a2090eeb9d30c030305a70a8347d1 (patch) | |
tree | 9b6002f850e6dea1e3400c6b19bb31a9cdf3067f /python/README.md | |
parent | cf9994131993b40be62e90e213b1e11e6b550143 (diff) | |
parent | 21825a09d97c2e0afd20512f306fb25fed55e529 (diff) |
Merge remote branch 'upstream/master'
Conflicts:
Jamroot
bjam
decoder/Jamfile
decoder/cdec.cc
dpmert/Jamfile
jam-files/sanity.jam
klm/lm/Jamfile
klm/util/Jamfile
mira/Jamfile
Diffstat (limited to 'python/README.md')
-rw-r--r-- | python/README.md | 4 |
1 files changed, 4 insertions, 0 deletions
diff --git a/python/README.md b/python/README.md
index da9f9387..bea6190a 100644
--- a/python/README.md
+++ b/python/README.md
@@ -12,6 +12,10 @@ Compile a parallel corpus and a word alignment into a suffix array representatio
 
     python -m cdec.sa.compile -f f.txt -e e.txt -a a.txt -o output/ -c extract.ini
 
+Or, if your parallel corpus is in a single-file format (with source and target sentences on a single line, separated by a triple pipe `|||`), use:
+
+    python -m cdec.sa.compile -b f-e.txt -a a.txt -o output/ -c extract.ini
+
 Extract grammar rules from the compiled corpus:
 
     cat input.txt | python -m cdec.sa.extract -c extract.ini -g grammars/
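The single-file format that this change documents (source and target sentence on one line, separated by `|||`) can be illustrated with a short sketch. The helper names `split_bitext_line` and `join_bitext_line` are hypothetical and not part of cdec. The code assumes that whitespace around the separator is insignificant and that the first `|||` on a line divides source from target.

```python
def split_bitext_line(line):
    """Split one 'source ||| target' line into a (source, target) pair.

    Illustrative helper, not a cdec API. Splits at the first '|||' only,
    so any later '|||' stays inside the target string.
    """
    source, sep, target = line.rstrip("\n").partition("|||")
    if not sep:
        raise ValueError("missing '|||' separator: %r" % line)
    return source.strip(), target.strip()


def join_bitext_line(source, target):
    """Build one single-file-format line from a source/target pair."""
    return "%s ||| %s" % (source.strip(), target.strip())
```

Applying `join_bitext_line` to each pair of corresponding lines in `f.txt` and `e.txt` would give a file in the shape the `-b` option is described as accepting. Applying `split_bitext_line` recovers the two sides.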