summaryrefslogtreecommitdiff
path: root/extractor
diff options
context:
space:
mode:
authorWu, Ke <wuke@cs.umd.edu>2014-12-17 16:15:13 -0500
committerWu, Ke <wuke@cs.umd.edu>2014-12-17 16:15:13 -0500
commit6829a0bc624b02ebefc79f8cf9ec89d7d64a7c30 (patch)
tree125dfb20f73342873476c793995397b26fd202dd /extractor
parentb455a108a21f4ba5a58ab1bc53a8d2bf4d829067 (diff)
parent7468e8d85e99b4619442c7afaf4a0d92870111bb (diff)
Merge branch 'const_reorder_2' into softsyn_2
Diffstat (limited to 'extractor')
-rw-r--r--extractor/README.md4
-rw-r--r--extractor/sacompile.cc1
2 files changed, 3 insertions, 2 deletions
diff --git a/extractor/README.md b/extractor/README.md
index 642fbd1d..b83ff900 100644
--- a/extractor/README.md
+++ b/extractor/README.md
@@ -1,10 +1,10 @@
-C++ implementation of the online grammar extractor originally developed by [Adam Lopez](http://www.cs.jhu.edu/~alopez/).
+A simple and fast C++ implementation of a SCFG grammar extractor using suffix arrays. The implementation is described in this [paper](https://ufal.mff.cuni.cz/pbml/102/art-baltescu-blunsom.pdf). The original cython extractor is described in [Adam Lopez](http://www.cs.jhu.edu/~alopez/)'s PhD [thesis](http://www.cs.jhu.edu/~alopez/papers/adam.lopez.dissertation.pdf).
The grammar extraction takes place in two steps: (a) precomputing a number of data structures and (b) actually extracting the grammars. All the flags below have the same meaning as in the cython implementation.
To compile the data structures you need to run:
- cdec/extractor/compile -a <alignment> -b <parallel_corpus> -c <compile_config_file> -o <compile_directory>
+ cdec/extractor/sacompile -a <alignment> -b <parallel_corpus> -c <compile_config_file> -o <compile_directory>
To extract the grammars you need to run:
diff --git a/extractor/sacompile.cc b/extractor/sacompile.cc
index 3ee668ce..d80ab64d 100644
--- a/extractor/sacompile.cc
+++ b/extractor/sacompile.cc
@@ -114,6 +114,7 @@ int main(int argc, char** argv) {
stop_write = Clock::now();
write_duration += GetDuration(start_write, stop_write);
+ stop_time = Clock::now();
cerr << "Constructing suffix array took "
<< GetDuration(start_time, stop_time) << " seconds" << endl;