author Patrick Simianer <patrick@lilt.com> 2026-02-26 19:28:22 +0100
committer Patrick Simianer <patrick@lilt.com> 2026-02-26 19:28:22 +0100
commit 0abcdd7e4358cb902c320b008d3c04bde07b749e
tree f26bd36cc16b792ef4acf5450ef9293b55179167 /rs/README.md
parent 4e62908a1757f83ff703399252ad50758c4eb237
Add Rust implementation of SCFG decoder
Rust port of the Ruby prototype decoder with performance optimizations for real Hiero-style grammars:

- Rule indexing by first terminal/NT symbol for fast lookup
- Chart symbol interning (u16 IDs) instead of string hashing
- Passive chart index by (symbol, left) for direct right-endpoint lookup
- Items store rule index instead of cloned rule data

Includes CKY+ parser, chart-to-hypergraph conversion, Viterbi decoding, derivation extraction, and JSON hypergraph I/O.

Self-filling step in parse uses grammar lookup (not just remaining active items) to handle rules that were consumed during the parse loop or skipped by the has_any_at optimization.

Produces identical output to the Ruby prototype on all test examples.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
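
For readers unfamiliar with the "chart symbol interning (u16 IDs)" optimization named in the commit message, the general idea can be sketched as follows. This is a minimal illustration, not the code from this commit: `SymbolTable`, `intern`, and `resolve` are hypothetical names, and the actual implementation may be organized differently.

```rust
use std::collections::HashMap;

/// Hypothetical interner (not taken from the commit): maps symbol
/// strings such as "S" or "X" to compact u16 IDs so that chart lookups
/// compare small integers instead of hashing strings.
#[derive(Default)]
pub struct SymbolTable {
    ids: HashMap<String, u16>,
    names: Vec<String>,
}

impl SymbolTable {
    pub fn new() -> Self {
        Self::default()
    }

    /// Returns the existing ID for `name`, or assigns the next free one.
    /// Returns `None` once all 65,536 possible u16 IDs are in use.
    pub fn intern(&mut self, name: &str) -> Option<u16> {
        if let Some(&id) = self.ids.get(name) {
            return Some(id);
        }
        if self.names.len() > u16::MAX as usize {
            return None;
        }
        // Safe cast: the check above guarantees the length fits in u16.
        let id = self.names.len() as u16;
        self.ids.insert(name.to_owned(), id);
        self.names.push(name.to_owned());
        Some(id)
    }

    /// Maps an ID back to its symbol string, if the ID was ever assigned.
    pub fn resolve(&self, id: u16) -> Option<&str> {
        self.names.get(usize::from(id)).map(String::as_str)
    }
}
```

In a sketch like this, each symbol string is hashed once when it is interned; after that, chart cells can be keyed by the `u16` alone.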
Diffstat (limited to 'rs/README.md')
-rw-r--r-- rs/README.md 39
1 files changed, 39 insertions, 0 deletions
diff --git a/rs/README.md b/rs/README.md
new file mode 100644
index 0000000..5daa458
--- /dev/null
+++ b/rs/README.md
@@ -0,0 +1,39 @@
+# odenwald
+
+Rust implementation of the Odenwald SCFG (synchronous context-free grammar) machine translation decoder.
+
+## Build
+
+```
+cargo build --release
+```
+
+## Usage
+
+```
+odenwald -g <grammar> -w <weights> [-i <input>] [-l] [-p]
+```
+
+- `-g, --grammar` — grammar file (required)
+- `-w, --weights` — weights file (required)
+- `-i, --input` — input file (default: stdin)
+- `-l, --add-glue` — add glue rules
+- `-p, --add-pass-through` — add pass-through rules
+
+Output: `translation ||| log_score` per input line.
+
+## Examples
+
+```
+cargo run -- -g ../example/toy/grammar -w ../example/toy/weights.toy -i ../example/toy/in -l
+# → i saw a small shell ||| -0.5
+
+cargo run -- -g ../example/toy-reorder/grammar -w ../example/toy-reorder/weights.toy -i ../example/toy-reorder/in -l
+# → he reads the book ||| -1.5
+```
+
+## Tests
+
+```
+cargo test
+```
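
The README above documents the output as `translation ||| log_score` per input line. A downstream tool that consumes this output could split each line roughly as in the sketch below. `parse_output_line` is a hypothetical helper, not part of odenwald, and it assumes the score is printed as a plain decimal number that Rust's `f64` parser accepts.

```rust
/// Hypothetical helper (not part of odenwald): splits one decoder output
/// line of the form `translation ||| log_score` into its two fields.
/// Returns `None` if the delimiter is missing or the score is not a number.
pub fn parse_output_line(line: &str) -> Option<(&str, f64)> {
    // Split at the last "|||" so the translation text is kept whole.
    let (translation, score) = line.rsplit_once("|||")?;
    // Assumption: the score is a plain decimal such as -0.5.
    let score: f64 = score.trim().parse().ok()?;
    Some((translation.trim(), score))
}
```

Applied to the first example line from the README, `i saw a small shell ||| -0.5`, this yields the pair `("i saw a small shell", -0.5)`.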