Diffstat (limited to 'dtrain')
-rw-r--r-- dtrain/README.md (renamed from dtrain/README) 40
1 files changed, 27 insertions, 13 deletions
diff --git a/dtrain/README b/dtrain/README.md
index 997c5ff3..dc980faf 100644
--- a/dtrain/README
+++ b/dtrain/README.md
@@ -1,4 +1,5 @@
-TODO
+IDEAS
+=====
MULTIPARTITE ranking (108010, 1 vs all, cluster modelscore;score)
what about RESCORING?
REMEMBER kbest (merge) weights?
@@ -21,16 +22,29 @@ TODO
non deterministic, high variance, RANDOM RESTARTS
use separate TEST SET
-KNOWN BUGS, PROBLEMS
- doesn't select best iteration for weigts
- if size of candidate < N => 0 score
- cdec kbest vs 1best (no -k param), rescoring? => ok(?)
- no sparse vector in decoder => ok
- ? ok
- sh: error while loading shared libraries: libreadline.so.6: cannot open shared object file: Error 24
- PhraseModel_* features (0..99 seem to be generated, why 99?)
- flex scanner jams on malicious input, we could skip that
+Uncertain, known bugs, problems
+===============================
+* cdec kbest vs 1best (no -k param), rescoring? => ok(?)
+* no sparse vector in decoder => ok/fixed
+* PhraseModel_* features (0..99 seem to be generated, why 99?)
+* flex scanner jams on malicious input, we could skip that
+
+FIXME
+=====
+* merge
+* ep data
+
+Data
+====
+<pre>
+nc-v6.de-en peg
+nc-v6.de-en.loo peg
+nc-v6.de-en.giza.loo peg
+nc-v6.de-en.symgiza.loo pe
+nv-v6.de-en.cs pe
+nc-v6.de-en.cs.loo pe
+--
+ep-v6.de-en.cs p
+ep-v6.de-en.cs.loo p
+</pre>
-FIX
- merge
- ep data