1. word_pair_keys
   Groups rules by their source/target word pairs.
   Input is a cdec grammar (with an integer index), one rule per line.
2. rules_cross_product
   Builds the cross product of all rules that share the same key.
   Input is the output of step 1.
3. merge_rules
   MapReduce version of merge_rules.rb.
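
A minimal sketch of what steps 1 and 2 amount to, written in Python for illustration only. The rule tuples, the `key_fn` key, and the choice of "all ordered pairs" for the cross product are assumptions made for this example; the actual scripts may define the key and the pairing differently.

```python
from collections import defaultdict
from itertools import product


def group_by_key(rules, key_fn):
    """Step 1 (sketch): bucket rules that share the same key."""
    groups = defaultdict(list)
    for rule in rules:
        groups[key_fn(rule)].append(rule)
    return groups


def cross_product(groups):
    """Step 2 (sketch): yield every ordered pair of rules within a bucket."""
    for key, bucket in groups.items():
        for a, b in product(bucket, repeat=2):
            yield key, a, b


# Hypothetical toy rules in the form (index, source, target).
rules = [
    (0, "das haus", "the house"),
    (1, "das haus", "the house"),
    (2, "ein haus", "a house"),
]

# Hypothetical key: the (source, target) strings themselves.
key_fn = lambda rule: (rule[1], rule[2])

groups = group_by_key(rules, key_fn)
pairs = list(cross_product(groups))
```

With this toy input, one bucket holds two rules and the other holds one, so the cross product yields 2*2 + 1*1 pairs.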
NOTE
The cross product is not feasible even for g120:
it would take 319078851 megabytes ~= 300 terabytes.
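
The arithmetic behind the note can be checked with a short Python sketch. It assumes the cross product pairs every rule in a bucket with every rule in the same bucket, so a bucket of n rules produces n*n pairs; `cross_product_pairs` is a hypothetical helper, not part of the scripts.

```python
def cross_product_pairs(group_sizes):
    """Number of ordered pairs produced when each bucket is crossed with itself."""
    return sum(n * n for n in group_sizes)


# A single bucket of 1000 rules already yields a million pairs.
pairs_for_one_big_bucket = cross_product_pairs([1000])

# Convert the size quoted in the note from megabytes to terabytes.
size_mb = 319078851
size_tb_binary = size_mb / 1024**2   # 1 TB taken as 1024 * 1024 MB
size_tb_decimal = size_mb / 1000**2  # 1 TB taken as 1000 * 1000 MB
```

Either convention puts the figure in the 300 to 320 terabyte range, which matches the estimate in the note.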