| Age | Commit message | Author |
|
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
- Node::mark and Node::score uninitialized, causing segfaults in
topological_sort — add default initializers (0, 0.0)
- odenwald.cc called incomplete sv_path() + exit(1) instead of
viterbi_path()
- viterbi_path: add reset() before topological_sort, initialize
best_edge to nullptr
- derive: off-by-one in NT order indexing — start j at 1 and
use order[j]-1 (1-indexed alignment map)
- read: ifs.readsome() returns 0 on macOS — use ifs.read() +
ifs.gcount()
- manual() signature: add missing Vocabulary parameter
- Remove gperftools/tcmalloc dependency from Makefile
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
1. Inner visit at span (0,1) yielded no sub-spans because
visit(1, 0, 1, 1) iterates span from 1 to r-x=0, which is
empty. This prevented unary rules like [S] ||| [X,1] from
completing at the leftmost span, so S(0,1) was never created.
Drop the x=1 parameter (default x=0); scan already handles
bounds checking.
2. Self-filling step searched remaining_items for unary NT rules,
but those rules could be absent if consumed (advanced) during
the parse loop. Look up grammar.start_nt directly instead,
which covers all cases.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
Rust port of the Ruby prototype decoder with performance
optimizations for real Hiero-style grammars:
- Rule indexing by first terminal/NT symbol for fast lookup
- Chart symbol interning (u16 IDs) instead of string hashing
- Passive chart index by (symbol, left) for direct right-endpoint lookup
- Items store rule index instead of cloned rule data
Includes CKY+ parser, chart-to-hypergraph conversion, Viterbi
decoding, derivation extraction, and JSON hypergraph I/O.
Self-filling step in parse uses grammar lookup (not just
remaining active items) to handle rules that were consumed
during the parse loop or skipped by the has_any_at optimization.
Produces identical output to the Ruby prototype on all test examples.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
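
The symbol-interning idea in the message above (u16 IDs instead of string hashing) can be sketched as follows. This is written in C++ to match the other examples in this log, and it is only an illustration of the technique: the class name SymbolTable and its methods are assumptions, and the actual Rust port may be organized differently.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical interning table: each distinct symbol string is hashed
// once, when first seen; afterwards it is referred to by a 16-bit ID.
class SymbolTable {
public:
    std::uint16_t intern(const std::string& s) {
        auto it = ids_.find(s);
        if (it != ids_.end()) return it->second;  // already interned
        if (names_.size() > std::numeric_limits<std::uint16_t>::max()) {
            throw std::length_error("symbol table full");
        }
        auto id = static_cast<std::uint16_t>(names_.size());
        names_.push_back(s);
        ids_.emplace(s, id);
        return id;
    }

    // Reverse lookup, e.g. for printing derivations.
    const std::string& name(std::uint16_t id) const { return names_.at(id); }

private:
    std::unordered_map<std::string, std::uint16_t> ids_;
    std::vector<std::string> names_;
};
```

With IDs in hand, a chart index keyed by (symbol, left) can use a pair of small integers as its key rather than a string.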
|
|
When creating an Item from a Rule (not an Item), tail_spans doesn't exist.
Check with is_a?(Item) instead of catching the exception silently.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
The [S] -> [S] [X] concatenation rule was duplicated for every non-S LHS
symbol. Move it out of the loop so it's added once.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
The Cartesian product over all nodes produces duplicate derivations when
edges differ only in nodes unreachable from the top. Walk reachable edges
from the top edge of each path and drop paths with identical reachable sets.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
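
A minimal sketch of the deduplication described above, under stated assumptions: a path is represented as one chosen edge ID per node, and two paths count as the same derivation when the set of edges reachable from the top node is identical. The Edge and Path types and both function names are hypothetical, not taken from the repository.

```cpp
#include <set>
#include <vector>

// Hypothetical hyperedge: one head node and zero or more tail nodes.
struct Edge {
    int head;
    std::vector<int> tails;
};

// path[n] is the ID of the edge chosen for node n (-1 if none).
using Path = std::vector<int>;

// Collect the IDs of all edges reachable from `top` under this path.
std::set<int> reachable_edges(const std::vector<Edge>& edges,
                              const Path& path, int top) {
    std::set<int> seen;
    std::vector<int> stack{top};
    while (!stack.empty()) {
        int node = stack.back();
        stack.pop_back();
        int e = path[node];
        if (e < 0) continue;                    // no edge chosen here
        if (!seen.insert(e).second) continue;   // edge already visited
        for (int t : edges[e].tails) stack.push_back(t);
    }
    return seen;
}

// Keep the first path for each distinct reachable edge set.
std::vector<Path> unique_derivations(const std::vector<Edge>& edges,
                                     const std::vector<Path>& paths,
                                     int top) {
    std::set<std::set<int>> seen;
    std::vector<Path> out;
    for (const auto& p : paths) {
        if (seen.insert(reachable_edges(edges, p, top)).second) {
            out.push_back(p);
        }
    }
    return out;
}
```

Paths that differ only in the edge picked for a node the top edge never reaches produce the same reachable set, so only the first of them survives.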
|
|
derive used a sequential counter to index into the source-side NT map,
which only worked for monotone rules. Now looks up tails by the target
NT's own index via map.index(i.index).
Adds toy-reorder example (German verb-final -> English SVO) to exercise
the fix. Also updates trollop -> optimist and guards xmlsimple require.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
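
The reordering fix above can be sketched as follows, translated from the Ruby idiom map.index(i.index) into C++ for illustration. The assumption is that tails are stored in source-side order and that source_map lists the NT indices in that same order; the function name reorder_tails and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// For each target-side NT index, find its position in the source-side
// map and pick the tail stored at that position. A sequential counter
// would instead pick tails[0], tails[1], ..., which is only correct
// when the target side keeps the source order (monotone rules).
std::vector<std::string> reorder_tails(
        const std::vector<int>& source_map,       // NT indices, source order
        const std::vector<int>& target_indices,   // NT indices, target order
        const std::vector<std::string>& tails) {  // tails, source order
    std::vector<std::string> out;
    for (int idx : target_indices) {
        auto it = std::find(source_map.begin(), source_map.end(), idx);
        if (it == source_map.end()) {
            throw std::out_of_range("target NT index not in source map");
        }
        out.push_back(tails.at(static_cast<std::size_t>(it - source_map.begin())));
    }
    return out;
}
```

For a rule whose source side is [X,1] [X,2] and whose target side is [X,2] [X,1], this returns the tails in swapped order, which is the case the toy-reorder example is meant to exercise.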