From fe471bb707226052551d75b043295ca5f57261c0 Mon Sep 17 00:00:00 2001
From: Patrick Simianer
Date: Sun, 25 Sep 2011 20:23:09 +0200
Subject: removed some quirks, less boost, prettier code, score_t

---
 dtrain/README | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

(limited to 'dtrain/README')

diff --git a/dtrain/README b/dtrain/README
index 137c1b48..42b91b9b 100644
--- a/dtrain/README
+++ b/dtrain/README
@@ -1,13 +1,4 @@
-NOTES
- learner gets all used features (binary! and dense (logprob is sum of logprobs!))
- weights: see decoder/decoder.cc line 548
- (40k sents, k=100 = ~400M mem, 1 iteration 45min)?
- utils/weights.cc: why wv_?
- FD, Weights::wv_ grow too large, see utils/weights.cc;
- decoder/hg.h; decoder/scfg_translator.cc; utils/fdict.cc
-
 TODO
- enable kbest FILTERING (nofiler vs unique)
  MULTIPARTITE ranking (108010, 1 vs all, cluster modelscore;score)
  what about RESCORING?
  REMEMBER kbest (merge) weights?
@@ -30,7 +21,7 @@ TODO
  non deterministic, high variance, RANDOM RESTARTS
  use separate TEST SET
 
-KNOWN BUGS PROBLEMS
+KNOWN BUGS, PROBLEMS
  doesn't select best iteration for weigts
  if size of candidate < N => 0 score
  cdec kbest vs 1best (no -k param), rescoring? => ok(?)
-- 
cgit v1.2.3