(graehl, comments on code)

passive chart: completion of actual translation rules (X or S NT in Hiero); these have
rule features.  A hyperedge is inserted with a copy of the rule's feature vector
(non-sparse).  This is inefficient; on intermediate parses with global pruning it
should be postponed: just keep a pointer to the rule, and models must provide an
interface to build a (sparse) feat. vector on demand later, for only the stuff we keep.
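
A minimal sketch of the deferred-features idea (names here are assumptions, not
the decoder's actual API): the hyperedge stores only a rule pointer, and a model
interface builds the sparse vector lazily for surviving edges.

    // Hypothetical sketch: defer feature-vector construction until after pruning.
    #include <utility>
    #include <vector>

    struct Rule;  // opaque grammar rule; owned by the grammar, not the hyperedge

    using SparseVector = std::vector<std::pair<int, double>>;  // (feat id, value)

    struct FeatureSource {
      virtual ~FeatureSource() = default;
      // Called lazily, only for the hyperedges we keep after global pruning.
      virtual void BuildRuleFeatures(const Rule& r, SparseVector* out) const = 0;
    };

    struct Hyperedge {
      const Rule* rule = nullptr;  // pointer only; no per-edge dense feature copy
    };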

multithreading: none.  The list of hyperarcs for refinement would need to be
segregated into subforest blocks, each with its own output list for later merging.
E.g., bottom up, count the number of tail-reachable nodes under each hypernode,
then assign blocks to workers.
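
A sketch of the bottom-up counting step, under the assumption that hypernodes
are stored in topological (bottom-up) order and edge tails index lower-numbered
nodes; all names are hypothetical.

    #include <cstddef>
    #include <vector>

    struct Edge { std::vector<int> tails; };
    struct Node { std::vector<Edge> in_edges; };

    // Per-node work estimate for assigning subforest blocks to workers.
    std::vector<size_t> TailReachableCounts(const std::vector<Node>& nodes) {
      std::vector<size_t> count(nodes.size(), 1);  // count the node itself
      for (size_t i = 0; i < nodes.size(); ++i)
        for (const Edge& e : nodes[i].in_edges)
          for (int t : e.tails)
            count[i] += count[t];  // overcounts shared subforests (upper bound)
      return count;
    }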

ngram caching: a trie with no locks, for example.  For threading, LRU hashing with a lock per bucket is probably better, or per-thread caches.  Presumably the cache is reset per sentence?
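
A sketch of the lock-per-bucket option: a fixed array of buckets, each guarded
by its own mutex, so threads querying different n-grams rarely contend.  LRU
eviction bookkeeping is omitted and all names are assumptions.

    #include <array>
    #include <cstddef>
    #include <cstdint>
    #include <mutex>
    #include <unordered_map>

    class NgramCache {
     public:
      bool Lookup(uint64_t ngram_hash, float* score) {
        Bucket& b = bucket(ngram_hash);
        std::lock_guard<std::mutex> lock(b.mu);
        auto it = b.map.find(ngram_hash);
        if (it == b.map.end()) return false;
        *score = it->second;
        return true;
      }
      void Insert(uint64_t ngram_hash, float score) {
        Bucket& b = bucket(ngram_hash);
        std::lock_guard<std::mutex> lock(b.mu);
        b.map[ngram_hash] = score;
      }
      void Clear() {  // e.g. called once per sentence
        for (Bucket& b : buckets_) {
          std::lock_guard<std::mutex> lock(b.mu);
          b.map.clear();
        }
      }
     private:
      static constexpr size_t kBuckets = 256;
      struct Bucket {
        std::mutex mu;
        std::unordered_map<uint64_t, float> map;
      };
      Bucket& bucket(uint64_t h) { return buckets_[h % kBuckets]; }
      std::array<Bucket, kBuckets> buckets_;
    };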

randlm worth using?  guess not.

actually apply all 0-state models in the 1st-pass parse and prune passive edges per span.
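
A sketch of the per-span pruning step, assuming each passive edge already
carries a score summed over the 0-state models; names are hypothetical.

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    struct PassiveEdge { double score; /* rule, span, ... */ };

    // Keep only the top `beam` passive edges for one span.
    void PrunePerSpan(std::vector<PassiveEdge>* span_edges, size_t beam) {
      if (span_edges->size() <= beam) return;
      std::partial_sort(span_edges->begin(), span_edges->begin() + beam,
                        span_edges->end(),
                        [](const PassiveEdge& a, const PassiveEdge& b) {
                          return a.score > b.score;  // higher is better
                        });
      span_edges->resize(beam);
    }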

allocate the cube pruning budget according to the previous pass (e.g. more pops where the previous pass scored well).
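
One way this could look (the proportional-to-softmax scheme and all names are
assumptions, not something the note specifies): split a global pop budget
across nodes in proportion to their previous-pass log scores.

    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <vector>

    std::vector<int> AllocatePopLimits(const std::vector<double>& prev_log_scores,
                                       int total_budget, int min_pops) {
      std::vector<int> limits(prev_log_scores.size(), min_pops);
      if (prev_log_scores.empty()) return limits;
      const double m =
          *std::max_element(prev_log_scores.begin(), prev_log_scores.end());
      double z = 0;
      for (double s : prev_log_scores) z += std::exp(s - m);  // stable softmax
      for (size_t i = 0; i < limits.size(); ++i) {
        const double share = std::exp(prev_log_scores[i] - m) / z;
        limits[i] = std::max(min_pops, static_cast<int>(share * total_budget));
      }
      return limits;
    }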

(has been tried in the ISI decoder) models with nonstandard state comparison,
typically (partially) greedy forest scoring: some part of the state is excluded
from equality/hashing.  Within the virtual ff interface, this would just add equals
and hash to the vtable (rather than the faster raw state compare).  If this is done
often, then add a nonvirtual flag to the interface instead, saying whether to use
the virt. methods or not.  Or: a simple flag supplied by the user of ApplyModels
(per feature?) saying whether to be 100% greedy or 0%, with no halfway option,
e.g. item name uses bigram context but scoring uses the full 5-gram state.
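
A sketch of the vtable option described above (class and method names are
assumptions): the base feature function defaults to raw full-state compare, a
nonvirtual flag tells ApplyModels when to call the virtuals instead, and the
derived class shows the bigram-name / 5-gram-score case.

    #include <cstring>
    #include <functional>
    #include <string_view>

    class FF {
     public:
      explicit FF(int state_bytes, bool custom_state_cmp = false)
          : state_bytes_(state_bytes), custom_state_cmp_(custom_state_cmp) {}
      virtual ~FF() = default;

      int StateBytes() const { return state_bytes_; }
      // Nonvirtual flag: ApplyModels uses the fast raw memcmp path when false.
      bool HasCustomStateCmp() const { return custom_state_cmp_; }

      // Defaults: compare/hash the full raw state.
      virtual bool StateEquals(const void* a, const void* b) const {
        return std::memcmp(a, b, state_bytes_) == 0;
      }
      virtual size_t StateHash(const void* s) const {
        return std::hash<std::string_view>()(
            std::string_view(static_cast<const char*>(s), state_bytes_));
      }

     private:
      int state_bytes_;
      bool custom_state_cmp_;
    };

    // Partially greedy: 5-gram state is kept for scoring, but only the bigram
    // prefix participates in equality/hashing (recombination).
    class PartiallyGreedyLM : public FF {
     public:
      PartiallyGreedyLM() : FF(5 * sizeof(int), /*custom_state_cmp=*/true) {}
      bool StateEquals(const void* a, const void* b) const override {
        return std::memcmp(a, b, 2 * sizeof(int)) == 0;  // bigram prefix only
      }
      size_t StateHash(const void* s) const override {
        return std::hash<std::string_view>()(
            std::string_view(static_cast<const char*>(s), 2 * sizeof(int)));
      }
    };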