author | redpony <redpony@ec762483-ff6d-05da-a07a-a48fb63a330f> | 2010-06-22 05:12:27 +0000 |
---|---|---|
committer | redpony <redpony@ec762483-ff6d-05da-a07a-a48fb63a330f> | 2010-06-22 05:12:27 +0000 |
commit | 0172721855098ca02b207231a654dffa5e4eb1c9 | |
tree | 8069c3a62e2d72bd64a2cdeee9724b2679c8a56b /README | |
parent | 37728b8be4d0b3df9da81fdda2198ff55b4b2d91 | |
initial checkin
git-svn-id: https://ws10smt.googlecode.com/svn/trunk@2 ec762483-ff6d-05da-a07a-a48fb63a330f
Diffstat (limited to 'README')
-rw-r--r-- | README | 57 |
1 files changed, 57 insertions, 0 deletions
@@ -0,0 +1,57 @@
+cdec is a fast decoder.
+
+SPEED COMPARISON
+------------------------------------------------------------------------------
+
+Here is a comparison with a couple of other decoders doing SCFG decoding:
+
+ Decoder   Lang.   BLEU    Run-Time        Memory
+ cdec      c++     31.47   0.37 sec/sent   1.0-1.1GB
+ Joshua    Java    31.55   2.34 sec/sent   4.0-4.8GB
+ Hiero     Python  31.22   27.2 sec/sent   1.7-1.9GB
+
+The maximum number of pops from candidate heap at each node is k=30, no other
+pruning, 3gm LM, Chinese-English translation task.
+
+
+GETTING STARTED
+------------------------------------------------------------------------------
+
+See the BUILDING file for instructions on how to build the software. To
+explore the decoder's features, the best way to get started is to look
+at cdec's command line options or to have a look at the test cases in
+the tests/system_tests/ directory. Each of these can be run with a command
+like ./cdec -c cdec.ini -i input.txt -w weights . The files should be
+self-explanatory.
+
+
+EXTRACTING A SYNCHRONOUS GRAMMAR / PHRASE TABLE
+------------------------------------------------------------------------------
+cdec does not include code for generating grammars. To build these, you will
+need to write your own software or use an existing package like Joshua, Hiero,
+or Moses.
+
+
+OPTIMIZING / TRAINING MODELS
+------------------------------------------------------------------------------
+cdec does include code for optimizing models, according to a number of
+training criteria, including training models as CRFs (with latent derivation
+variables), MERT (over hypergraphs) to optimize BLEU, TER, etc.
+
+Eventually, I will provide documentation for this.
+
+
+ALIGNMENT / SYNCHRONOUS PARSING / CONSTRAINED DECODING
+------------------------------------------------------------------------------
+cdec can be used as an aligner. For examples, see the test cases.
+
+
+COPYRIGHT AND LICENSE
+------------------------------------------------------------------------------
+Copyright (c) 2009 by Chris Dyer <redpony@gmail.com>
+
+See the file LICENSE.txt for the licensing terms that this software is
+released under. This software also includes the file m4/boost.m4 which is
+licensed under the LGPL v3, for more information refer to the comments
+in that file.
+
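
The SPEED COMPARISON table in the README reports absolute figures only. As a rough reading aid, the sketch below restates them relative to cdec. The numbers are copied from the table. Using the upper end of each memory range is an assumption made here for simplicity, and the names `RESULTS` and `relative_to` are illustrative, not part of cdec.

```python
# Figures copied from the SPEED COMPARISON table in the README.
# Each entry: (decoder, seconds per sentence, memory in GB).
# Assumption: the upper end of each reported memory range is used.
RESULTS = [
    ("cdec", 0.37, 1.1),
    ("Joshua", 2.34, 4.8),
    ("Hiero", 27.2, 1.9),
]


def relative_to(baseline, results):
    """Return {decoder: (time ratio, memory ratio)} relative to `baseline`."""
    base_time, base_mem = next(
        (t, m) for name, t, m in results if name == baseline
    )
    return {name: (t / base_time, m / base_mem) for name, t, m in results}


ratios = relative_to("cdec", RESULTS)
for name, (time_ratio, mem_ratio) in ratios.items():
    print(f"{name:8s} {time_ratio:6.1f}x time  {mem_ratio:4.1f}x memory")
```

On these figures, Joshua takes roughly 6 times and Hiero roughly 70 times as long per sentence as cdec, while the three BLEU scores lie within 0.33 points of each other.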