author | Patrick Simianer <p@simianer.de> | 2011-09-25 20:23:09 +0200 |
---|---|---|
committer | Patrick Simianer <p@simianer.de> | 2011-09-25 20:23:09 +0200 |
commit | fe471bb707226052551d75b043295ca5f57261c0 (patch) | |
tree | 73ba37bf8d5c1de6de50f63888a49e918e4a8cd4 /dtrain/README | |
parent | 5e1ab3481551607f1c2a10027049044cd41f78ab (diff) |
removed some quirks, less boost, prettier code, score_t
Diffstat (limited to 'dtrain/README')
-rw-r--r-- | dtrain/README | 11 |
1 files changed, 1 insertions, 10 deletions
diff --git a/dtrain/README b/dtrain/README
index 137c1b48..42b91b9b 100644
--- a/dtrain/README
+++ b/dtrain/README
@@ -1,13 +1,4 @@
-NOTES
- learner gets all used features (binary! and dense (logprob is sum of logprobs!))
- weights: see decoder/decoder.cc line 548
- (40k sents, k=100 = ~400M mem, 1 iteration 45min)?
- utils/weights.cc: why wv_?
- FD, Weights::wv_ grow too large, see utils/weights.cc;
- decoder/hg.h; decoder/scfg_translator.cc; utils/fdict.cc
-
 TODO
- enable kbest FILTERING (nofiler vs unique)
  MULTIPARTITE ranking (108010, 1 vs all, cluster modelscore;score)
  what about RESCORING?
  REMEMBER kbest (merge) weights?
@@ -30,7 +21,7 @@ TODO
  non deterministic, high variance, RANDOM RESTARTS
  use separate TEST SET
 
-KNOWN BUGS PROBLEMS
+KNOWN BUGS, PROBLEMS
  doesn't select best iteration for weigts
  if size of candidate < N => 0 score
  cdec kbest vs 1best (no -k param), rescoring? => ok(?)