From 0abcdd7e4358cb902c320b008d3c04bde07b749e Mon Sep 17 00:00:00 2001
From: Patrick Simianer
Date: Thu, 26 Feb 2026 19:28:22 +0100
Subject: Add Rust implementation of SCFG decoder

Rust port of the Ruby prototype decoder with performance optimizations
for real Hiero-style grammars:
- Rule indexing by first terminal/NT symbol for fast lookup
- Chart symbol interning (u16 IDs) instead of string hashing
- Passive chart index by (symbol, left) for direct right-endpoint lookup
- Items store rule index instead of cloned rule data

Includes CKY+ parser, chart-to-hypergraph conversion, Viterbi decoding,
derivation extraction, and JSON hypergraph I/O.

Self-filling step in parse uses grammar lookup (not just remaining active
items) to handle rules that were consumed during the parse loop or
skipped by the has_any_at optimization.

Produces identical output to the Ruby prototype on all test examples.

Co-Authored-By: Claude Opus 4.6
---
 rs/README.md | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)
 create mode 100644 rs/README.md

(limited to 'rs/README.md')

diff --git a/rs/README.md b/rs/README.md
new file mode 100644
index 0000000..5daa458
--- /dev/null
+++ b/rs/README.md
@@ -0,0 +1,39 @@
+# odenwald
+
+Rust implementation of the Odenwald SCFG (synchronous context-free grammar) machine translation decoder.
+
+## Build
+
+```
+cargo build --release
+```
+
+## Usage
+
+```
+odenwald -g -w [-i ] [-l] [-p]
+```
+
+- `-g, --grammar` — grammar file (required)
+- `-w, --weights` — weights file (required)
+- `-i, --input` — input file (default: stdin)
+- `-l, --add-glue` — add glue rules
+- `-p, --add-pass-through` — add pass-through rules
+
+Output: `translation ||| log_score` per input line.
+
+## Examples
+
+```
+cargo run -- -g ../example/toy/grammar -w ../example/toy/weights.toy -i ../example/toy/in -l
+# → i saw a small shell ||| -0.5
+
+cargo run -- -g ../example/toy-reorder/grammar -w ../example/toy-reorder/weights.toy -i ../example/toy-reorder/in -l
+# → he reads the book ||| -1.5
+```
+
+## Tests
+
+```
+cargo test
+```
-- 
cgit v1.2.3
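
Two of the optimizations listed in the commit message, symbol interning to u16 IDs and a passive chart index keyed by (symbol, left), can be illustrated with a small sketch. This is not the code from this commit: the type names `Interner` and `PassiveIndex` and all of their methods are hypothetical, chosen only to show the general idea under the assumption that the chart needs "which right endpoints exist for this symbol starting at this position" as its hot lookup.

```rust
use std::collections::HashMap;

/// Hypothetical interner: maps symbol strings to compact u16 IDs and back,
/// so chart lookups compare small integers instead of hashing strings.
#[derive(Default)]
pub struct Interner {
    ids: HashMap<String, u16>,
    names: Vec<String>,
}

impl Interner {
    /// Returns the existing ID for `name`, or assigns the next free one.
    /// Panics once the u16 ID space is exhausted.
    pub fn intern(&mut self, name: &str) -> u16 {
        if let Some(&id) = self.ids.get(name) {
            return id;
        }
        assert!(self.names.len() <= u16::MAX as usize, "u16 symbol space exhausted");
        let id = self.names.len() as u16;
        self.names.push(name.to_owned());
        self.ids.insert(name.to_owned(), id);
        id
    }

    /// Looks up the string for an ID previously returned by `intern`.
    pub fn name(&self, id: u16) -> &str {
        &self.names[id as usize]
    }
}

/// Hypothetical passive index: (symbol ID, left endpoint) -> right endpoints.
#[derive(Default)]
pub struct PassiveIndex {
    spans: HashMap<(u16, usize), Vec<usize>>,
}

impl PassiveIndex {
    /// Records that `symbol` covers the span [left, right), ignoring duplicates.
    pub fn add(&mut self, symbol: u16, left: usize, right: usize) {
        let rights = self.spans.entry((symbol, left)).or_default();
        if !rights.contains(&right) {
            rights.push(right);
        }
    }

    /// Returns every right endpoint recorded for `symbol` starting at `left`.
    pub fn rights(&self, symbol: u16, left: usize) -> &[usize] {
        self.spans
            .get(&(symbol, left))
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }
}
```

With this layout, extending an active item that expects symbol `X` at position `i` is a single map lookup, `index.rights(x_id, i)`, rather than a scan over all passive items. Whether the actual decoder uses a hash map or a dense table for this index is not stated in the commit message.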