This (horrid) hack reads the output of cdec's "--show_derivations" and
"--extract_rules" options into data structures and tries to align "groups"
in the source and target sides of rules in a smart, presentable way. The
result resembles the output of a phrase-based system, provided that the word
alignment gives enough hints.

To run:
  ./derivation_to_json.rb < <one of the .raw files>
(The first line of stdout is JSON data; the source and target strings follow after that.)
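
To consume that output from another script, something like the following may be enough. This is only a minimal sketch: `split_output` is a hypothetical helper, not part of this repository, and the sample input is made up, since the actual JSON layout is whatever derivation_to_json.rb emits. It assumes only what is stated above: the first line is JSON and the remaining lines are plain strings.

```ruby
require 'json'

# Hypothetical helper (not part of this repository): splits the script's
# stdout into the parsed JSON from the first line and the remaining
# non-empty lines (the source and target strings).
def split_output(text)
  lines = text.lines.map(&:chomp)
  raise ArgumentError, 'empty output' if lines.empty?

  data = JSON.parse(lines.first)            # first line: JSON data
  strings = lines.drop(1).reject(&:empty?)  # rest: source/target strings
  [data, strings]
end

# Made-up sample input, for illustration only; the real JSON structure
# is defined by derivation_to_json.rb.
sample = "{\"example\": [1, 2]}\nsource string\ntarget string\n"
data, strings = split_output(sample)
puts data.inspect
puts strings.inspect
```

In practice the text would come from the script itself, e.g. by piping its stdout into a file and passing `File.read(path)` to the helper.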