path: root/python/src/sa
Age | Commit message | Author
2013-02-22 | Merge branch 'master' into experiment | Paul Baltescu
2013-02-22 | Memory analysis pointless code. | Paul Baltescu
2013-02-22 | Updated unit tests for data array. | Paul Baltescu
2013-02-21 | Merge branch 'master' of https://github.com/pauldb89/cdec | Paul Baltescu
2013-02-19 | Timing every part of the extractor. | Paul Baltescu
2013-01-28 | For now, don't use online bilex counts | Michael Denkowski
2013-01-28 | Bilexical scores for online rules | Michael Denkowski
2013-01-26 | Online grammars now diff with incremental suffix array (except lex, TODO) | Michael Denkowski
2013-01-24 | Scored grammars from online extraction. Don't trust them yet. | Michael Denkowski
2013-01-24 | Clean up dead code | Michael Denkowski
2013-01-07 | Online rule extractor output diffs w/ sa extractor | Michael Denkowski
2013-01-07 | code cleanup | Michael Denkowski
2013-01-04 | Track source span to keep accurate phrase counts | Michael Denkowski
2013-01-04 | Fixed issue with overlapping alignment links | Michael Denkowski
2013-01-03 | Michael remembers how hiero phrase extraction works. Not totally | Michael Denkowski
    debugged, use -o with caution, etc.
2012-12-28 | Collect/store stats from new training instances | Michael Denkowski
2012-12-28 | Fix TypeError on __str__ | Michael Denkowski
2012-12-27 | Online phrase extraction speaks rulefactory's language. | Michael Denkowski
2012-12-27 | Hooks for online grammar extraction | Michael Denkowski
2012-12-23 | Working test | Michael Denkowski
2012-12-23 | debugging | Michael Denkowski
2012-12-23 | Boundary non-terminals | Michael Denkowski
2012-12-23 | NonTerminal class | Michael Denkowski
2012-12-22 | Merge branch 'master' of git://github.com/redpony/cdec | Michael Denkowski
2012-12-22 | New feature: correct word alignments. I am on an airplane. | Michael Denkowski
2012-12-21 | New classy version | Michael Denkowski
2012-12-20 | hiero phrase extraction. Don't trust the word alignments yet. | Michael Denkowski
2012-12-13 | Enable loose phrase extraction parameter | Victor Chahuneau
    (default is still tight) use --loose when compiling corpus or tight_phrases = False in config
2012-09-06 | add FeatureContext.input_span | Adam Lopez
2012-09-06 | [cdec.sa] Allow sentence annotation and initial configuration | Victor Chahuneau
2012-09-06 | [cdec.sa] Fix API to make everyone happy | Victor Chahuneau
2012-09-06 | [cdec.sa] Fix the list of matching training source sentence | Victor Chahuneau
2012-09-06 | [cdec.sa] Make list of word ids <-> sentence string mapping easy | Victor Chahuneau
2012-09-06 | Make Data_Array.data accessible via getter | Adam Lopez
2012-09-06 | Merge | Adam Lopez
2012-09-05 | Revert to the "old style" pair count... | Victor Chahuneau
    + API naming fixes
    + Multiple feature definition files can be passed to the extractor
2012-09-05 | Pass F, E texts to features | Adam Lopez
2012-09-05 | Change FeatureContext.input_span to return slice indices | Adam Lopez
2012-09-05 | Fix bug in initialization of FeatureContext.input_span | Adam Lopez
2012-09-05 | Expose new feature extraction API | Victor Chahuneau
2012-09-05 | Merge alopez/context-features | Victor Chahuneau
2012-09-03 | Support Python 2.6 | Victor Chahuneau
2012-08-14 | [cdec.sa] Explicit feature names in grammar extractor output | Victor Chahuneau
    + sparse features in extractor
    + hg.intersect(string)
    + basestring = str|unicode
2012-07-28 | [python] Suffix array compiler can read bitext (-b) | Victor Chahuneau
2012-07-27 | [python] Move python files to avoid pythonpath conflicts | Victor Chahuneau
2012-07-27 | [python] conversion from cdec.sa.Rule to cdec.TRule | Victor Chahuneau
    + remove configobj dependency
    + re-structure packages (no more top-level library)
    + "const" stuff
    + use __new__ instead of constructor for some objects
2012-07-27 | [python] Fork of the suffix-array extractor with surface improvements | Victor Chahuneau
    Available as the cdec.sa module, with commande-line helpers:
      python -m cdec.sa.compile -f ... -e ... -a ... -o sa-out/ -c extract.ini
      python -m cdec.sa.extract -c extract.ini -g grammars-out/ < input.txt > input.sgml
    + renamed cdec.scfg -> cdec.sa
    + Python README
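
The two helper invocations quoted in the oldest commit (2012-07-27), together with the --loose switch mentioned in the 2012-12-13 commit, can be sketched as a small script that only assembles the argument lists and runs nothing. This is an illustration under assumptions, not part of cdec: the function names, parameter names, and the corpus file names in the usage section are hypothetical, and the log does not say what -f, -e and -a stand for (the names below are guesses based on the "F, E texts" wording in a later commit). Only the flags themselves are taken from the commit messages.

```python
import shlex


def compile_command(f_text, e_text, alignment, out_dir, config, loose=False):
    """Assemble the argv list for `python -m cdec.sa.compile`.

    The flags (-f, -e, -a, -o, -c, --loose) come from the commit messages;
    the parameter names are guesses at their meaning, not documented API.
    """
    argv = [
        "python", "-m", "cdec.sa.compile",
        "-f", f_text,
        "-e", e_text,
        "-a", alignment,
        "-o", out_dir,
        "-c", config,
    ]
    if loose:
        # Per the 2012-12-13 commit, extraction is tight unless --loose is given.
        argv.append("--loose")
    return argv


def extract_command(config, grammar_dir):
    """Assemble the argv list for `python -m cdec.sa.extract`.

    In the commit message the command reads sentences from stdin and writes
    to stdout; redirection is left to the caller.
    """
    return ["python", "-m", "cdec.sa.extract", "-c", config, "-g", grammar_dir]


def render(argv):
    """Quote an argv list so it can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Hypothetical file names, used only to show the shape of the commands.
    print(render(compile_command("corpus.f", "corpus.e", "corpus.align",
                                 "sa-out/", "extract.ini", loose=True)))
    print(render(extract_command("extract.ini", "grammars-out/")),
          "< input.txt > input.sgml")
```

Building the commands as lists keeps each flag and its value as separate arguments, so the same lists could be passed to subprocess.run without shell quoting concerns if the commands were actually executed.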