Age | Commit message | Author | |
---|---|---|---|
2015-12-23 | make_rule_features: produce cdec's rule features (ids and bigrams) from a grammar | Patrick Simianer | |
2015-12-23 | hadoop_uniq: uniq with hadoop-streaming | Patrick Simianer | |
2015-12-23 | toks_per_line: # tokens per line | Patrick Simianer | |
2015-12-19 | corrected stddev | Patrick Simianer | |
2015-11-12 | Merge branch 'master' of github.com:pks/scripts | Patrick Simianer | |
2015-11-12 | README | Patrick Simianer | |
2015-11-12 | preprocessing without lowercasing | Patrick Simianer | |
2015-11-12 | normalize on char level | Patrick Simianer | |
2015-11-12 | map lines to number of token they contain | Patrick Simianer | |
2015-11-12 | script to normalize hyphens | Patrick Simianer | |
2015-11-12 | script to remove private use area chars | Patrick Simianer | |
2015-11-12 | add moses' truecaser | Patrick Simianer | |
2015-11-12 | sample: tab as separator | Patrick Simianer | |
2015-06-10 | undo unfortunate variable naming: cfg -> conf! | Patrick Simianer | |
2015-05-30 | fake_svm_light: invert data in svm light format | Patrick Simianer | |
2015-05-29 | feature_dict, convert_to_svmlight_format: stderr output | Patrick Simianer | |
2015-05-29 | tf-idf: glob handling | Patrick Simianer | |
2015-05-29 | add_ln: add line numbers, filter_features: filter text reps of sparse vectors, split_*: split kbest lists and by line | Patrick Simianer | |
2015-05-13 | norm | Patrick Simianer | |
2015-01-31 | tools | Patrick Simianer | |
2015-01-31 | kendalls_tau | Patrick Simianer | |
2015-01-31 | add_seg: fix | Patrick Simianer | |
2015-01-25 | zipf v1.2.2 compat | Patrick Simianer | |
2015-01-25 | div | Patrick Simianer | |
2015-01-15 | fix | Patrick Simianer | |
2015-01-15 | split_pipes: to param | Patrick Simianer | |
2015-01-14 | select_from: invert | Patrick Simianer | |
2015-01-07 | fix | Patrick Simianer | |
2015-01-07 | select_from, max_len | Patrick Simianer | |
2014-10-09 | alles neu macht der mai ("May makes everything new") | Patrick Simianer | |
2014-10-03 | pot sqrt | Patrick Simianer | |
2014-09-21 | add_seg: fix | Patrick Simianer | |
2014-09-21 | add_seg: option to use pre-defined index | Patrick Simianer | |
2014-09-21 | add select | Patrick Simianer | |
2014-09-21 | rm sample_n | Patrick Simianer | |
2014-09-21 | sample | Patrick Simianer | |
2014-08-16 | memusg, to_ascii | Patrick Simianer | |
2014-07-22 | compound-splitter.perl (taken from moses v2.1.1) | Patrick Simianer | |
2014-07-22 | collapse_tags.rb | Patrick Simianer | |
2014-06-18 | fix | Patrick Simianer | |
2014-06-16 | nlp_ruby -> zipf | Patrick Simianer | |
2014-06-14 | steal tokenizer from moses' scripts | Patrick Simianer | |
2014-06-03 | withdraw previous change | Patrick Simianer | |
2014-06-01 | hg2json.py: add rule and span to json output | Patrick Simianer | |
2014-04-24 | parse-stanford | Patrick Simianer | |
2014-03-17 | fix | Patrick Simianer | |
2014-03-17 | a lot of ... and --- cause moses' compound splitter to hang | Patrick Simianer | |
2014-03-16 | better no_non_printables | Patrick Simianer | |
2014-03-16 | no non printables in preproc | Patrick Simianer | |
2014-03-16 | filter by rule shape | Patrick Simianer | |