summaryrefslogtreecommitdiff
path: root/extractor
diff options
context:
space:
mode:
authorWu, Ke <wuke@cs.umd.edu>2014-12-17 16:15:13 -0500
committerWu, Ke <wuke@cs.umd.edu>2014-12-17 16:15:13 -0500
commit17dbb7d5ab1544899b1b9e867d2246a0a93e3aa8 (patch)
tree7fa2a51763a1b67fb325e86b0e3f764dd119cd70 /extractor
parent1983c75c35b7f5dc3f356a2f9a9345d632b87650 (diff)
parent1613f1fc44ca67820afd7e7b21eb54b316c8ce55 (diff)
Merge branch 'const_reorder_2' into softsyn_2
Diffstat (limited to 'extractor')
-rw-r--r--extractor/README.md4
-rw-r--r--extractor/sacompile.cc1
2 files changed, 3 insertions, 2 deletions
diff --git a/extractor/README.md b/extractor/README.md
index 642fbd1d..b83ff900 100644
--- a/extractor/README.md
+++ b/extractor/README.md
@@ -1,10 +1,10 @@
-C++ implementation of the online grammar extractor originally developed by [Adam Lopez](http://www.cs.jhu.edu/~alopez/).
+A simple and fast C++ implementation of a SCFG grammar extractor using suffix arrays. The implementation is described in this [paper](https://ufal.mff.cuni.cz/pbml/102/art-baltescu-blunsom.pdf). The original cython extractor is described in [Adam Lopez](http://www.cs.jhu.edu/~alopez/)'s PhD [thesis](http://www.cs.jhu.edu/~alopez/papers/adam.lopez.dissertation.pdf).
The grammar extraction takes place in two steps: (a) precomputing a number of data structures and (b) actually extracting the grammars. All the flags below have the same meaning as in the cython implementation.
To compile the data structures you need to run:
- cdec/extractor/compile -a <alignment> -b <parallel_corpus> -c <compile_config_file> -o <compile_directory>
+ cdec/extractor/sacompile -a <alignment> -b <parallel_corpus> -c <compile_config_file> -o <compile_directory>
To extract the grammars you need to run:
diff --git a/extractor/sacompile.cc b/extractor/sacompile.cc
index 3ee668ce..d80ab64d 100644
--- a/extractor/sacompile.cc
+++ b/extractor/sacompile.cc
@@ -114,6 +114,7 @@ int main(int argc, char** argv) {
stop_write = Clock::now();
write_duration += GetDuration(start_write, stop_write);
+ stop_time = Clock::now();
cerr << "Constructing suffix array took "
<< GetDuration(start_time, stop_time) << " seconds" << endl;