diff options
Diffstat (limited to 'graehl/NOTES.beam')
-rwxr-xr-x | graehl/NOTES.beam | 29 |
1 files changed, 0 insertions, 29 deletions
diff --git a/graehl/NOTES.beam b/graehl/NOTES.beam deleted file mode 100755 index a48d1ab7..00000000 --- a/graehl/NOTES.beam +++ /dev/null @@ -1,29 +0,0 @@ -(graehl, comments on code) - -passive chart: completion of actual translation rules (X or S NT in Hiero), have -rule features. Hyperedge inserted with copy of rule feature vector -(non-sparse). Inefficient; should be postponed on intermediate parses with -global pruning; just keep pointer to rules and models must provide an interface -to build a (sparse) feat. vector on demand later for the stuff we keep. - -multithreading: none. list of hyperarcs for refinement would need to be -segregated into subforest blocks and have own output lists for later merging. -e.g. bottom up count number of tail-reachable nodes under each hypernode, then -assign to workers. - -ngram caching: trie, no locks, for example. for threading, LRU hashing w/ locks per bucket is probably better, or per-thread caches. probably cache is reset per sentence? - -randlm worth using? guess not. - -actually get all 0-state models in 1st pass parse and prune passive edges per span. - -allocate cube pruning budget per prev pass - -(has been tried in ISI decoder) models with nonstandard state comparison, -typically (partially) greedy forest scoring, some part of the state is excluded -from equality/hashing. Within virtual ff interface, would just add equals, hash -to vtable (rather than the faster raw state compare). If this is done often, -then add a nonvirtual flag to interface instead, saying whether to use the -virt. methods or not. or: simple flag by user of ApplyModels (per feature?) -saying whether to be 100% greedy or 0% - no halfway e.g. item name uses bigram -context, but score using 5gram state. |