From 4236729897ab454f6b28613364b06e94ebbb080e Mon Sep 17 00:00:00 2001
From: mjdenkowski
Date: Fri, 18 Apr 2014 15:15:40 -0400
Subject: Stream mode for grammar extractor

---
 python/README.md | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

(limited to 'python/README.md')

diff --git a/python/README.md b/python/README.md
index 953971d3..37c7b78e 100644
--- a/python/README.md
+++ b/python/README.md
@@ -23,7 +23,25 @@ Extract grammar rules from the compiled corpus:
     cat input.txt | python -m cdec.sa.extract -c extract.ini -g grammars/ -z
 
 This will create per-sentence grammar files in the `grammars` directory and output annotated input suitable for translation with cdec.
-
+
+Extract rules in stream mode:
+
+    python -m cdec.sa.extract -c extract.ini -t -z
+
+This will enable stdio interaction with the following types of lines:
+
+Extract grammar:
+
+    context ||| sentence ||| grammar_file
+
+Learn (online mode, specify context name):
+
+    context ||| sentence ||| reference ||| alignment
+
+Drop (online mode, specify context name):
+
+    context ||| drop
+
 ## Library usage
 
 A basic demo of pycdec's features is available in `examples/test.py`.
-- 
cgit v1.2.3

From 72e46b00edf847b9bd4b6299788586ea57da037c Mon Sep 17 00:00:00 2001
From: armatthews
Date: Sun, 18 May 2014 17:22:56 -0400
Subject: Added information on how to recompile pycdec from the pyx files

---
 python/README.md | 7 +++++++
 1 file changed, 7 insertions(+)

(limited to 'python/README.md')

diff --git a/python/README.md b/python/README.md
index 37c7b78e..2cc77037 100644
--- a/python/README.md
+++ b/python/README.md
@@ -8,6 +8,13 @@ Build and install pycdec:
 
 Alternatively, run `python setup.py build_ext --inplace` and add the `python/` directory to your `PYTHONPATH`.
 
+To re-build pycdec from the cython source, modify setup.py in the following ways:
+ * Add this import statement: from Cython.Build import cythonize
+ * Change the source file from cdec/\_cdec.cpp to cdec/\_cdec.pyx
+ * Add language='c++' as a property to ext\_modules (e.g. right after extra\_link\_args)
+ * In the final setup block, change ext\_modules=ext\_modules to ext\_modules=cythonize(ext\_modules)
+Then just build and install normally, as described above.
+
 ## Grammar extractor
 
 Compile a parallel corpus and a word alignment into a suffix array representation:
-- 
cgit v1.2.3

From f520628b468b57a3642ee63a72690b120f4e94e7 Mon Sep 17 00:00:00 2001
From: armatthews
Date: Sun, 18 May 2014 17:30:29 -0400
Subject: Added a newline

---
 python/README.md | 1 +
 1 file changed, 1 insertion(+)

(limited to 'python/README.md')

diff --git a/python/README.md b/python/README.md
index 2cc77037..03d9f31d 100644
--- a/python/README.md
+++ b/python/README.md
@@ -13,6 +13,7 @@ To re-build pycdec from the cython source, modify setup.py in the following ways
  * Change the source file from cdec/\_cdec.cpp to cdec/\_cdec.pyx
  * Add language='c++' as a property to ext\_modules (e.g. right after extra\_link\_args)
  * In the final setup block, change ext\_modules=ext\_modules to ext\_modules=cythonize(ext\_modules)
+
 Then just build and install normally, as described above.
 
 ## Grammar extractor
-- 
cgit v1.2.3
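
Note on the stream-mode patch at the top of this series: the README text it adds lists three `|||`-separated line types (extract, learn, drop). The sketch below only shows how such lines might be assembled before being written to the extractor's stdin. The helper names, the sample sentences, the grammar file path, and the `0-0 1-1` alignment notation are illustrative assumptions, not part of pycdec, and the sketch says nothing about what the extractor writes back.

```python
# Hypothetical helpers (not part of pycdec) that build the three
# stream-mode input lines described in the README patch.

SEP = " ||| "  # field separator used in the README's line formats


def extract_line(context, sentence, grammar_file):
    """Extract grammar: context ||| sentence ||| grammar_file"""
    return SEP.join([context, sentence, grammar_file])


def learn_line(context, sentence, reference, alignment):
    """Learn (online mode): context ||| sentence ||| reference ||| alignment"""
    return SEP.join([context, sentence, reference, alignment])


def drop_line(context):
    """Drop (online mode): context ||| drop"""
    return SEP.join([context, "drop"])


if __name__ == "__main__":
    # All values below are made-up sample data.
    print(extract_line("ctx1", "das ist ein test", "grammars/example.gz"))
    print(learn_line("ctx1", "das ist ein test", "this is a test", "0-0 1-1 2-2 3-3"))
    print(drop_line("ctx1"))
```

Lines built this way could be piped to `python -m cdec.sa.extract -c extract.ini -t -z`, the stream-mode invocation given in the README.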