1. word_pair_keys
   group rules by source/target word pairs
   input is a cdec grammar (with int index), one rule per line

2. rules_cross_product
   build cross product of rules with the same key
   input is the output of 1

3. merge_rules
   mapred version of merge_rules.rb

NOTE: cross product doesn't even work with g120:
      319078851 megabytes ~= 300 terabytes
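
Steps 1 and 2, and the size problem in the NOTE, can be sketched on toy data. This is an illustration only, not the actual scripts. The key function (first source word plus first target word), the bare "[X] ||| source ||| target" rule layout without the int index, and all helper names are assumptions made for this example. The sketch also pairs every rule with every rule under the same key, including itself, which may differ from what rules_cross_product really emits.

```ruby
# Toy sketch of steps 1 and 2. All names here are hypothetical.

# Assumed key: first source word and first target word of a rule.
# The real word_pair_keys script may build its keys differently.
def word_pair_key(rule)
  _lhs, source, target = rule.split(' ||| ')
  "#{source.split.first}|#{target.split.first}"
end

# Step 1 (sketch): group rules that share a key.
def group_by_key(rules)
  rules.group_by { |rule| word_pair_key(rule) }
end

# Step 2 (sketch): full cross product of the rules inside each group.
def cross_product(groups)
  groups.flat_map { |_key, rules| rules.product(rules) }
end

# Pairs emitted by the cross product: sum of n^2 over all groups.
def cross_product_size(groups)
  groups.values.sum { |rules| rules.size**2 }
end

RULES = [
  '[X] ||| das haus ||| the house',
  '[X] ||| das haus ||| the home',
  '[X] ||| das auto ||| the car',
  '[X] ||| ein haus ||| a house'
].freeze

GROUPS = group_by_key(RULES)
puts cross_product_size(GROUPS) # prints 10 (3*3 + 1*1)
```

The output grows with the square of each group's size, so a few very large groups are enough to dominate the total. That is presumably what pushes the g120 cross product into the hundreds of terabytes.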