Age | Commit message | Author | |
---|---|---|---|
2013-02-21 | Merge branch 'master' of https://github.com/pauldb89/cdec | Paul Baltescu | |
2013-02-19 | Timing every part of the extractor. | Paul Baltescu | |
2013-01-28 | For now, don't use online bilex counts | Michael Denkowski | |
2013-01-28 | Bilexical scores for online rules | Michael Denkowski | |
2013-01-26 | Online grammars now diff with incremental suffix array (except lex, TODO) | Michael Denkowski | |
2013-01-24 | Scored grammars from online extraction. Don't trust them yet. | Michael Denkowski | |
2013-01-24 | Clean up dead code | Michael Denkowski | |
2013-01-07 | Online rule extractor output diffs w/ sa extractor | Michael Denkowski | |
2013-01-07 | code cleanup | Michael Denkowski | |
2013-01-04 | Track source span to keep accurate phrase counts | Michael Denkowski | |
2013-01-04 | Fixed issue with overlapping alignment links | Michael Denkowski | |
2013-01-03 | Michael remembers how hiero phrase extraction works. Not totally debugged, use -o with caution, etc. | Michael Denkowski | |
2012-12-28 | Collect/store stats from new training instances | Michael Denkowski | |
2012-12-28 | Fix TypeError on __str__ | Michael Denkowski | |
2012-12-27 | Online phrase extraction speaks rulefactory's language. | Michael Denkowski | |
2012-12-27 | Hooks for online grammar extraction | Michael Denkowski | |
2012-12-23 | Working test | Michael Denkowski | |
2012-12-23 | debugging | Michael Denkowski | |
2012-12-23 | Boundary non-terminals | Michael Denkowski | |
2012-12-23 | NonTerminal class | Michael Denkowski | |
2012-12-22 | Merge branch 'master' of git://github.com/redpony/cdec | Michael Denkowski | |
2012-12-22 | New feature: correct word alignments. I am on an airplane. | Michael Denkowski | |
2012-12-21 | New classy version | Michael Denkowski | |
2012-12-20 | hiero phrase extraction. Don't trust the word alignments yet. | Michael Denkowski | |
2012-12-13 | Enable loose phrase extraction parameter (default is still tight). Use --loose when compiling corpus or tight_phrases = False in config | Victor Chahuneau | |
2012-09-06 | add FeatureContext.input_span | Adam Lopez | |
2012-09-06 | [cdec.sa] Allow sentence annotation and initial configuration | Victor Chahuneau | |
2012-09-06 | [cdec.sa] Fix API to make everyone happy | Victor Chahuneau | |
2012-09-06 | [cdec.sa] Fix the list of matching training source sentence | Victor Chahuneau | |
2012-09-06 | [cdec.sa] Make list of word ids <-> sentence string mapping easy | Victor Chahuneau | |
2012-09-06 | Make Data_Array.data accessible via getter | Adam Lopez | |
2012-09-06 | Merge | Adam Lopez | |
2012-09-05 | Revert to the "old style" pair count... + API naming fixes + Multiple feature definition files can be passed to the extractor | Victor Chahuneau | |
2012-09-05 | Pass F, E texts to features | Adam Lopez | |
2012-09-05 | Change FeatureContext.input_span to return slice indices | Adam Lopez | |
2012-09-05 | Fix bug in initialization of FeatureContext.input_span | Adam Lopez | |
2012-09-05 | Expose new feature extraction API | Victor Chahuneau | |
2012-09-05 | Merge alopez/context-features | Victor Chahuneau | |
2012-09-03 | Support Python 2.6 | Victor Chahuneau | |
2012-08-14 | [cdec.sa] Explicit feature names in grammar extractor output + sparse features in extractor + hg.intersect(string) + basestring = str\|unicode | Victor Chahuneau | |
2012-07-28 | [python] Suffix array compiler can read bitext (-b) | Victor Chahuneau | |
2012-07-27 | [python] Move python files to avoid pythonpath conflicts | Victor Chahuneau | |
2012-07-27 | [python] conversion from cdec.sa.Rule to cdec.TRule + remove configobj dependency + re-structure packages (no more top-level library) + "const" stuff + use __new__ instead of constructor for some objects | Victor Chahuneau | |
2012-07-27 | [python] Fork of the suffix-array extractor with surface improvements. Available as the cdec.sa module, with command-line helpers: python -m cdec.sa.compile -f ... -e ... -a ... -o sa-out/ -c extract.ini and python -m cdec.sa.extract -c extract.ini -g grammars-out/ < input.txt > input.sgml + renamed cdec.scfg -> cdec.sa + Python README | Victor Chahuneau | |