diff options
author | Chris Dyer <redpony@gmail.com> | 2009-12-17 13:57:54 -0500 |
---|---|---|
committer | Chris Dyer <redpony@gmail.com> | 2009-12-17 13:57:54 -0500 |
commit | bba4ff830c8722cdcaf29e36c1ff5821a912ae5d (patch) | |
tree | 268f2f8118aca09b3cc40dca8b2be7de8295acd5 /decoder/trule.h | |
parent | 04ae1beeaeceb0161a64d33112f21956f9741bde (diff) |
added non-pruning intersection and a CRF tagger
- the linear-chain tagger is more of a proof of concept than a real tagger-- the context-free assumptions made in a number of places mean that the algorithms used may not be as efficient as they could be, but the model is as powerful as any CRF
- it would be easy to add latent variables or semi-CRF support (or both!)
- i've added a couple basic features that are often used for POS tagging
- non-pruning intersection is useful for lexical word alignment models and the tagger
- a sample POS tagger model will be committed later
Diffstat (limited to 'decoder/trule.h')
-rw-r--r-- | decoder/trule.h | 5 |
1 files changed, 5 insertions, 0 deletions
diff --git a/decoder/trule.h b/decoder/trule.h index d2b1babe..42edfa2c 100644 --- a/decoder/trule.h +++ b/decoder/trule.h @@ -39,6 +39,10 @@ class TRule { // [LHS] ||| term1 [NT] term2 [OTHER_NT] [YET_ANOTHER_NT] static TRule* CreateRuleMonolingual(const std::string& rule); + static TRule* CreateLexicalRule(const WordID& src, const WordID& trg) { + return new TRule(src, trg); + } + void ESubstitute(const std::vector<const std::vector<WordID>* >& var_values, std::vector<WordID>* result) const { int vc = 0; @@ -116,6 +120,7 @@ class TRule { short int prev_j; private: + TRule(const WordID& src, const WordID& trg) : e_(1, trg), f_(1, src), lhs_(), arity_(), prev_i(), prev_j() {} bool SanityCheck() const; }; |