author Patrick Simianer <p@simianer.de> 2013-12-04 20:13:07 +0100
committer Patrick Simianer <p@simianer.de> 2013-12-04 20:13:07 +0100
commit 9ff43d7c8e076aaa8790bacbd4b2cfe636a55a97
tree e1e0265b18ffc854f24209cb36b2c836100f099b /extractor/README.md
parent e59cdac5253df7ab57296d347245d1a8f4d8b287
parent 407b100cd3e4ae987504b53101151fba287ad999
fix merge conflict
Diffstat (limited to 'extractor/README.md')
-rw-r--r-- extractor/README.md 10
1 file changed, 8 insertions, 2 deletions
diff --git a/extractor/README.md b/extractor/README.md
index c9db8de8..642fbd1d 100644
--- a/extractor/README.md
+++ b/extractor/README.md
@@ -1,8 +1,14 @@
C++ implementation of the online grammar extractor originally developed by [Adam Lopez](http://www.cs.jhu.edu/~alopez/).
-To run the extractor you need to:
+The grammar extraction takes place in two steps: (a) precomputing a number of data structures and (b) actually extracting the grammars. All the flags below have the same meaning as in the cython implementation.
- cdec/extractor/run_extractor -t <num_threads> -a <alignment> -b <parallel_corpus> -g <grammar_output_path> < <input_sentences> > <sgm_file>
+To compile the data structures you need to run:
+
+ cdec/extractor/compile -a <alignment> -b <parallel_corpus> -c <compile_config_file> -o <compile_directory>
+
+To extract the grammars you need to run:
+
+ cdec/extract/extract -t <num_threads> -c <compile_config_file> -g <grammar_output_path> < <input_sentences> > <sgm_file>
To run unit tests you need first to configure `cdec` with the [Google Test](https://code.google.com/p/googletest/) and [Google Mock](https://code.google.com/p/googlemock/) libraries: