author Patrick Simianer <p@simianer.de> 2012-03-13 09:37:10 +0100
committer Patrick Simianer <p@simianer.de> 2012-03-13 09:37:10 +0100
commit 1cc08348b38c44f0d2251e6edc44de5af61a3755
tree ce4293fc09ff9c8c95311b4fceb83401b26e4e54 /dtrain/README.md
parent c3a9ea64251605532c7954959662643a6a927bb7
readme
Diffstat (limited to 'dtrain/README.md')
-rw-r--r-- dtrain/README.md 10
1 files changed, 3 insertions, 7 deletions
diff --git a/dtrain/README.md b/dtrain/README.md
index c39d94d2..0240a694 100644
--- a/dtrain/README.md
+++ b/dtrain/README.md
@@ -12,24 +12,20 @@ builds when building cdec, see ../BUILDING
Running
-------
To run this on a dev set locally (default):
-<code>
-#define DTRAIN_LOCAL
-</code>
+ #define DTRAIN_LOCAL
otherwise remove that line or undef. You need a single grammar file
or per-sentence-grammars (psg) as you would use with cdec.
Additionally you need to give dtrain a file with
references (--refs).
The input for use with hadoop streaming looks like this:
-<code>
-<id>\t<source>\t<ref>\t<grammar rules separated by tab>
-</code>
+ <sid>\t<source>\t<ref>\t<grammar rules separated by \t>
To convert a psg to this format you need to replace all "\n"
by "\t". Make sure there are no tabs in your data.
For an example of local usage (with 'distributed' format)
the see test/example/ . This expects dtrain to be built without
-DTRAIN_LOCAL param.
+DTRAIN_LOCAL.
Legal stuff
-----------
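
Note on the streaming input format described in the hunk above: the README says a per-sentence grammar is converted by replacing every "\n" with "\t", and that the data must not contain tabs. The following is a minimal sketch of that conversion in Python, not part of dtrain itself. The function name `to_streaming_line` and its arguments are hypothetical, and dropping blank lines (including the one produced by a trailing newline) is an assumption made here so the record does not end in an empty field.

```python
def to_streaming_line(sid, source, ref, grammar_text):
    """Build one '<sid>\\t<source>\\t<ref>\\t<rules...>' record.

    grammar_text is the content of a per-sentence grammar file,
    one rule per line. Hypothetical helper, not part of dtrain.
    """
    # Assumption: blank lines (e.g. from a trailing newline) are dropped,
    # so the record does not contain empty fields.
    rules = [line for line in grammar_text.splitlines() if line.strip()]
    fields = [str(sid), source, ref] + rules

    # The README warns that the data must not contain tabs, because
    # the tab is the field separator of the streaming format.
    for field in fields:
        if "\t" in field:
            raise ValueError("tab character found in field: %r" % field)

    return "\t".join(fields)


if __name__ == "__main__":
    # Toy grammar with made-up rules, for illustration only.
    grammar = "[X] ||| ein ||| a ||| F=1\n[X] ||| haus ||| house ||| F=1\n"
    print(to_streaming_line(0, "ein haus", "a house", grammar))
```

Rejecting tabs up front, instead of silently replacing them, keeps a malformed input from shifting fields in the resulting record.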