Diffstat (limited to 'dtrain/README')
-rw-r--r-- dtrain/README | 11
1 files changed, 1 insertions, 10 deletions
diff --git a/dtrain/README b/dtrain/README
index 137c1b48..42b91b9b 100644
--- a/dtrain/README
+++ b/dtrain/README
@@ -1,13 +1,4 @@
-NOTES
- learner gets all used features (binary! and dense (logprob is sum of logprobs!))
- weights: see decoder/decoder.cc line 548
- (40k sents, k=100 = ~400M mem, 1 iteration 45min)?
- utils/weights.cc: why wv_?
- FD, Weights::wv_ grow too large, see utils/weights.cc;
- decoder/hg.h; decoder/scfg_translator.cc; utils/fdict.cc
-
TODO
- enable kbest FILTERING (nofiler vs unique)
MULTIPARTITE ranking (108010, 1 vs all, cluster modelscore;score)
what about RESCORING?
REMEMBER kbest (merge) weights?
@@ -30,7 +21,7 @@ TODO
non deterministic, high variance, RANDOM RESTARTS
use separate TEST SET
-KNOWN BUGS PROBLEMS
+KNOWN BUGS, PROBLEMS
doesn't select best iteration for weigts
if size of candidate < N => 0 score
cdec kbest vs 1best (no -k param), rescoring? => ok(?)