author Patrick Simianer <p@simianer.de> 2015-03-23 11:03:30 +0100
committer Patrick Simianer <p@simianer.de> 2015-03-23 11:03:30 +0100
commit d9aeddeaa9bc99c6e3f0cb25ebfd47a2f6cadaa6
tree 07c322934efbb10dfba803f12194e7adb6253f9d
parent 775a1055d2ad4e05c007440c467a3f5345bfb0e7
README
-rw-r--r-- README.md 83
1 files changed, 32 insertions, 51 deletions
diff --git a/README.md b/README.md
index e462d41..5a83f0e 100644
--- a/README.md
+++ b/README.md
@@ -1,55 +1,36 @@
+Not quite finished machine translation decoder.
+(For Linux only)
+
TODO
- * sparse vector (unordered_map) -> where to store?
- * parser
- * Rule -> ChartItem -> Node ?
- * k-best
- * other semirings
- * include language model
- * compress/hash words/feature strings?
- * cast? Rule -> Edge, ChartItem -> Node
- * feature factory, observer
+====
+ * proper parsing (Rico Sennrich's [1][2]?)
+ * k-best derivations [3]
+ * serialization for sparse vectors
+ * Rule-ChartItem-Node transition?
+ * cube pruning [4] and integrate kenlm [5]
+ * feature factory and observer patterns
+ * map all strings to ints?
+ * glue grammar [6] alright?
+ * read/write gzipped files [11]
+ * integrate some BLAS lib for vector ops [12][13]
Dependencies:
- * MessagePack for object serialization [1]
- * kenlm language model [2]
-
-This is Linux only.
-
-
-[1] http://msgpack.org
-[2] http://kheafield.com/code/kenlm/
-
-
-stuff to have a look at:
-http://math.nist.gov/spblas/
-http://lapackpp.sourceforge.net/
-http://www.cvmlib.com/
-http://sourceforge.net/projects/lpp/
-http://math-atlas.sourceforge.net/
-http://www.netlib.org/lapack/
-http://bytes.com/topic/c/answers/702569-blas-vs-cblas-c
-http://www.netlib.org/lapack/#_standard_c_language_apis_for_lapack
-http://www.osl.iu.edu/research/mtl/download.php3
-http://scicomp.stackexchange.com/questions/351/recommendations-for-a-usable-fast-c-matrix-library
-https://software.intel.com/en-us/tbb_4.2_doc
-http://goog-perftools.sourceforge.net/doc/tcmalloc.html
-http://www.sgi.com/tech/stl/Rope.html
-http://www.cs.unc.edu/Research/compgeom/gzstream/
-https://github.com/facebook/folly/blob/6e46d468cf2876dd59c7a4dddcb4e37abf070b7a/folly/docs/Overview.md
----
-not much to see here, yet
-(SCFG machine translation decoder in ruby, currently implements CKY+ parsing and hypergraph viterbi)
-
-helpful stuff
- * https://github.com/jweese/thrax/wiki/Glue-grammar
- * http://aclweb.org/aclwiki/index.php?title=Hypergraph_Format
- * http://kheafield.com/code/kenlm/developers/
-
-todo
-====
- * integrate with HG (chart to json)
- * kbest
- * feature interface
- * (global) word ids instead of strings
- * animate parsing
+ * MessagePack for object serialization [8]
+ * Google's gperftools [9]
+ * json-cpp [10]
+
+
+[1] http://aclweb.org/anthology/W/W14/W14-4011.pdf
+[2] https://github.com/redpony/cdec/commit/448b451aa481b1509566ddb11abc3476466def6a
+[3] http://www.cis.upenn.edu/~lhuang3/huang-iwpt-correct.pdf
+[4] http://cui.unige.ch/~gesmundo/papers/gesmundo-iwslt10-fcp.pdf
+[5] http://kheafield.com/code/kenlm/developers/2
+[6] https://github.com/jweese/thrax/wiki/Glue-grammar
+[7] http://aclweb.org/aclwiki/index.php?title=Hypergraph_Format
+[8] http://msgpack.org
+[9] https://code.google.com/p/gperftools/
+[10] https://github.com/ascheglov/json-cpp
+[11] http://www.cs.unc.edu/Research/compgeom/gzstream/
+[12] http://scicomp.stackexchange.com/questions/351/recommendations-for-a-usable-fast-c-matrix-library
+[13] http://www.cvmlib.com/