From 7cc92b65a3185aa242088d830e166e495674efc9 Mon Sep 17 00:00:00 2001 From: redpony Date: Tue, 22 Jun 2010 05:12:27 +0000 Subject: initial checkin git-svn-id: https://ws10smt.googlecode.com/svn/trunk@2 ec762483-ff6d-05da-a07a-a48fb63a330f --- README | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 57 insertions(+) create mode 100644 README (limited to 'README') diff --git a/README b/README new file mode 100644 index 00000000..47b52355 --- /dev/null +++ b/README @@ -0,0 +1,57 @@ +cdec is a fast decoder. + +SPEED COMPARISON +------------------------------------------------------------------------------ + +Here is a comparison with a couple of other decoders doing SCFG decoding: + + Decoder Lang. BLEU Run-Time Memory + cdec c++ 31.47 0.37 sec/sent 1.0-1.1GB + Joshua Java 31.55 2.34 sec/sent 4.0-4.8GB + Hiero Python 31.22 27.2 sec/sent 1.7-1.9GB + +The maximum number of pops from candidate heap at each node is k=30, no other +pruning, 3gm LM, Chinese-English translation task. + + +GETTING STARTED +------------------------------------------------------------------------------ + +See the BUILDING file for instructions on how to build the software. To +explore the decoder's features, the best way to get started is to look +at cdec's command line options or to have a look at the test cases in +the tests/system_tests/ directory. Each of these can be run with a command +like ./cdec -c cdec.ini -i input.txt -w weights . The files should be +self explanatory. + + +EXTRACTING A SYNCHRONOUS GRAMMAR / PHRASE TABLE +------------------------------------------------------------------------------ +cdec does not include code for generating grammars. To build these, you will +need to write your own software or use an existing package like Joshua, Hiero, +or Moses. + + +OPTIMIZING / TRAINING MODELS +------------------------------------------------------------------------------ +cdec does include code for optimizing models, according to a number of +training criteria, including training models as CRFs (with latent derivation +variables), MERT (over hypergraphs) to opimize BLEU, TER, etc. + +Eventually, I will provide documentation for this. + + +ALIGNMENT / SYNCHRONOUS PARSING / CONSTRAINED DECODING +------------------------------------------------------------------------------ +cdec can be used as an aligner. For examples, see the test cases. + + +COPYRIGHT AND LICENSE +------------------------------------------------------------------------------ +Copyright (c) 2009 by Chris Dyer + +See the file LICENSE.txt for the licensing terms that this software is +released under. This software also includes the file m4/boost.m4 which is +licensed under the LGPL v3, for more information refer to the comments +in that file. + -- cgit v1.2.3