author: Chris Dyer <cdyer@allegro.clab.cs.cmu.edu> 2012-11-18 13:35:42 -0500
committer: Chris Dyer <cdyer@allegro.clab.cs.cmu.edu> 2012-11-18 13:35:42 -0500
commit: 1b8181bf0d6e9137e6b9ccdbe414aec37377a1a9
tree: 33e5f3aa5abff1f41314cf8f6afbd2c2c40e4bfd /dtrain/README.md
parent: 7c4665949fb93fb3de402e4ce1d19bef67850d05
major restructure of the training code
Diffstat (limited to 'dtrain/README.md')
-rw-r--r-- dtrain/README.md 48
1 files changed, 0 insertions, 48 deletions
diff --git a/dtrain/README.md b/dtrain/README.md
deleted file mode 100644
index 7edabbf1..00000000
--- a/dtrain/README.md
+++ /dev/null
@@ -1,48 +0,0 @@
-This is a simple (and parallelizable) tuning method for cdec
-which is able to train the weights of very many (sparse) features.
-It was used here:
- "Joint Feature Selection in Distributed Stochastic
- Learning for Large-Scale Discriminative Training in
- SMT"
-(Simianer, Riezler, Dyer; ACL 2012)
-
-
-Building
---------
-dtrain is built as part of cdec; see ../BUILDING.
-To build only the parts needed for dtrain, run:
-```
- autoreconf -ifv
- ./configure [--disable-gtest]
- cd dtrain/; make
-```
-
-Running
--------
-To run this on a dev set locally, build with this define:
-```
- #define DTRAIN_LOCAL
-```
-Otherwise, remove that line (or #undef it) and recompile. You need a single
-grammar file, or input annotated with per-sentence grammars (psg), as you
-would use with cdec. When running locally, you also need to give dtrain a
-file with references (--refs).
-
-The input for use with hadoop streaming looks like this:
-```
- <sid>\t<source>\t<ref>\t<grammar rules separated by \t>
-```
-To convert a psg to this format, replace every "\n" with "\t".
-Make sure there are no tabs in your data.
-
-For an example of local usage (with the 'distributed' format),
-see test/example/. This expects dtrain to be built without
-DTRAIN_LOCAL.
-
-Legal
------
-Copyright (c) 2012 by Patrick Simianer <p@simianer.de>
-
-See the file ../LICENSE.txt for the licensing terms that this software is
-released under.
-
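
The conversion rule the removed README gives for hadoop streaming input (replace every "\n" in a psg with "\t", with no tabs in the data) can be sketched as below. This is a minimal illustration, not part of dtrain: the function name `to_streaming_line`, its parameters, and the sample grammar rules are hypothetical, and dropping empty lines from the grammar is an added assumption so that a trailing newline does not produce a trailing tab.

```python
def to_streaming_line(sid, source, ref, psg_text):
    # Hypothetical helper: builds one line of the form
    #   <sid> TAB <source> TAB <ref> TAB <grammar rules separated by TAB>
    # Grammar rules arrive one per line; empty lines are dropped (assumption).
    rules = [rule for rule in psg_text.split("\n") if rule]
    fields = [str(sid), source, ref] + rules
    # The README requires that the data itself contains no tabs.
    for field in fields:
        if "\t" in field:
            raise ValueError("tab found in data: %r" % field)
    return "\t".join(fields)


if __name__ == "__main__":
    # Illustrative grammar text only; real rules come from your own psg.
    grammar = "[X] ||| das ||| the\n[X] ||| haus ||| house\n"
    print(to_streaming_line(0, "das haus", "the house", grammar))
```

Rejecting tabs up front, rather than silently replacing them, keeps the field boundaries unambiguous for whatever consumes the line.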