This (horrid) hack reads the output of cdec's "--show_derivations" and
"--extract_rules" options into data structures and tries to align "groups"
on the source and target sides of rules in a smart, presentable way. The
result resembles the output of a phrase-based system, provided the word
alignment gives enough hints.
To run:
./derivation_to_json.rb < <one of the .raw files>
(the first line of stdout is JSON data; the source and target strings follow after that)
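
To consume that output from another script, something like the following may help. It is only a sketch based on the output layout described above (JSON on the first line, plain strings after it). The helper name split_output is made up for illustration and is not part of derivation_to_json.rb, and nothing is assumed about the structure of the JSON itself.

```ruby
require 'json'

# Hypothetical helper, not part of derivation_to_json.rb.
# Takes the captured stdout of the script as one string and returns
# [parsed JSON from the first line, remaining non-empty lines].
def split_output(text)
  lines = text.lines.map(&:chomp)
  data = JSON.parse(lines.first)             # first line: JSON data
  strings = lines.drop(1).reject(&:empty?)   # rest: source and target strings
  [data, strings]
end
```

For example, after capturing the script's stdout into a string, "data, strings = split_output(captured)" gives the parsed JSON and the lines that followed it.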