author | Chris Dyer <cdyer@cs.cmu.edu> | 2011-02-13 22:47:57 -0500 |
---|---|---|
committer | Chris Dyer <cdyer@cs.cmu.edu> | 2011-02-13 22:47:57 -0500 |
commit | 6a8ba7a4c00ed011798a7b597c1e65cb1d9053ca (patch) | |
tree | 1befec44c0a83d7b3e32a039d81d45bcf01f359e /phrasinator | |
parent | 78f1644ae064dffabc59b1bf7bf3ded3dc3171db (diff) |
add readme
Diffstat (limited to 'phrasinator')
-rw-r--r-- | phrasinator/README | 16 |
1 files changed, 16 insertions, 0 deletions
diff --git a/phrasinator/README b/phrasinator/README
new file mode 100644
index 00000000..fb5b93ef
--- /dev/null
+++ b/phrasinator/README
@@ -0,0 +1,16 @@
+The "phrasinator" uses a simple Bayesian nonparametric model to segment
+text into chunks. The inferred model is then saved so that it can rapidly
+predict segments in new (but related) texts.
+
+ Input will be a corpus of sentences, e.g.:
+
+ economists have argued that real interest rates have fallen .
+
+ The output will be a model that, when run with cdec, will produce
+ a segmentation into phrasal units, e.g.:
+
+ economists have argued that real_interest_rates have fallen .
+
+
+To train a model, run ./train-phrasinator.pl and follow instructions.
+
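
The README's example suggests that a segmented sentence is ordinary whitespace-tokenized text in which the words of a multi-word phrasal unit are joined with underscores. A minimal sketch of reading that format back into phrases follows. It is not part of the phrasinator; the helper names `split_phrases` and `join_phrases` are hypothetical, and it assumes the underscore is the only phrase marker and that no source token contains an underscore itself.

```python
# Sketch only: helper names are hypothetical, not part of phrasinator or cdec.
# Assumption: "_" joins the words of one phrasal unit and never occurs
# inside an original token.

def split_phrases(segmented_line):
    """Return the phrasal units of a segmented sentence as lists of words."""
    return [unit.split("_") for unit in segmented_line.split()]


def join_phrases(phrases):
    """Inverse of split_phrases: rebuild the underscore-joined sentence."""
    return " ".join("_".join(words) for words in phrases)


line = "economists have argued that real_interest_rates have fallen ."
phrases = split_phrases(line)

# Keep only the units that span more than one word.
multiword = [p for p in phrases if len(p) > 1]
print(multiword)  # [['real', 'interest', 'rates']]
```

If the real output can contain tokens with literal underscores, this round trip would merge or split them incorrectly, so a different delimiter check would be needed.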