author | Paul Baltescu <pauldb89@gmail.com> | 2013-11-30 00:49:20 +0000 |
---|---|---|
committer | Paul Baltescu <pauldb89@gmail.com> | 2013-11-30 00:49:20 +0000 |
commit | cff2ccabf1328108aeead1fef67131377dd02194 (patch) | |
tree | 2d8ebd034f5ded979a7211464335dd9736c7bb36 | |
parent | f8837a7c29c3049420d9713c628e11a01a649bbe (diff) |
Update extractor README.
-rw-r--r-- | extractor/README.md | 10 |
1 files changed, 8 insertions, 2 deletions
diff --git a/extractor/README.md b/extractor/README.md
index c9db8de8..642fbd1d 100644
--- a/extractor/README.md
+++ b/extractor/README.md
@@ -1,8 +1,14 @@
 C++ implementation of the online grammar extractor originally developed by [Adam Lopez](http://www.cs.jhu.edu/~alopez/).
 
-To run the extractor you need to:
+The grammar extraction takes place in two steps: (a) precomputing a number of data structures and (b) actually extracting the grammars. All the flags below have the same meaning as in the cython implementation.
 
-    cdec/extractor/run_extractor -t <num_threads> -a <alignment> -b <parallel_corpus> -g <grammar_output_path> < <input_sentences> > <sgm_file>
+To compile the data structures you need to run:
+
+    cdec/extractor/compile -a <alignment> -b <parallel_corpus> -c <compile_config_file> -o <compile_directory>
+
+To extract the grammars you need to run:
+
+    cdec/extract/extract -t <num_threads> -c <compile_config_file> -g <grammar_output_path> < <input_sentencs> > <sgm_file>
 
 To run unit tests you need first to configure `cdec` with the [Google Test](https://code.google.com/p/googletest/) and [Google Mock](https://code.google.com/p/googlemock/) libraries:
 
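The two-step workflow introduced by this change (compile the data structures, then extract the grammars) can be sketched as a dry run that only assembles the two command lines and never executes them. This is a minimal sketch, not part of the commit: the flags are the ones shown in the diff, while the function name `build_commands` and every file name in the usage note are hypothetical placeholders. The binary paths are passed in by the caller because the diff spells them as `cdec/extractor/compile` and `cdec/extract/extract`, and this sketch does not assume which layout a given checkout uses.

```python
import shlex


def build_commands(compile_bin, extract_bin, alignment, parallel_corpus,
                   compile_config, compile_dir, grammar_dir, num_threads,
                   input_sentences, sgm_file):
    """Return the (compile, extract) shell command lines as strings.

    Nothing is executed; the caller decides whether and how to run them.
    All arguments are hypothetical placeholders supplied by the caller.
    """
    # Step (a): precompute the data structures.
    compile_args = [
        compile_bin,
        "-a", alignment,
        "-b", parallel_corpus,
        "-c", compile_config,
        "-o", compile_dir,
    ]
    # Step (b): extract the grammars, reusing the same config file.
    extract_args = [
        extract_bin,
        "-t", str(num_threads),
        "-c", compile_config,
        "-g", grammar_dir,
    ]
    compile_cmd = shlex.join(compile_args)
    # The extractor reads sentences on stdin and writes the SGM file on stdout.
    extract_cmd = "{} < {} > {}".format(
        shlex.join(extract_args),
        shlex.quote(input_sentences),
        shlex.quote(sgm_file),
    )
    return compile_cmd, extract_cmd
```

For example, `build_commands("compile", "extract", "align.txt", "corpus.txt", "compile.ini", "compiled", "grammars", 4, "input.txt", "output.sgm")` returns two strings that can be inspected or pasted into a shell. Both steps receive the same `-c` config file, which is what the diff shows linking the compile step to the extract step.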