pycdec is a Python interface to cdec.
## Installation
Build and install pycdec:

    python setup.py install

## Grammar extractor
Compile a parallel corpus and a word alignment into a suffix array representation:

    python -m cdec.sa.compile -f f.txt -e e.txt -a a.txt -o output/ -c extract.ini

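Before compiling, it can help to confirm that the three input files line up. The sketch below assumes that `f.txt`, `e.txt`, and `a.txt` each hold one sentence (or one alignment) per line, which is the usual layout for a parallel corpus but is not stated in this README. `count_lines` and `check_line_counts` are hypothetical helpers, not part of pycdec.

```python
def count_lines(path):
    """Return the number of lines in a text file."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def check_line_counts(paths):
    """Raise ValueError unless every file in `paths` has the same line count.

    Hypothetical helper (not part of pycdec); assumes one sentence or
    one alignment per line.
    """
    counts = {path: count_lines(path) for path in paths}
    if len(set(counts.values())) > 1:
        raise ValueError("line counts differ: %r" % counts)
    return counts
```

Calling `check_line_counts(["f.txt", "e.txt", "a.txt"])` returns the per-file counts when they agree and raises `ValueError` otherwise.
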
Extract grammar rules from the compiled corpus:

    cat input.txt | python -m cdec.sa.extract -c extract.ini -g grammars/

This will create per-sentence grammar files in the `grammars` directory and output annotated input suitable for translation with cdec.
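
The two commands above can also be driven from a script. The following is a minimal sketch using only the standard library `subprocess` module. The module names and flags are copied from the commands above; the function names, parameter names, and the default `extract.ini` path are illustrative and not part of pycdec. It uses `sys.executable` so that the same interpreter runs the modules.

```python
import subprocess
import sys


def compile_command(f_path, e_path, a_path, output_dir, config="extract.ini"):
    """Build the argv list for the suffix array compilation step."""
    return [sys.executable, "-m", "cdec.sa.compile",
            "-f", f_path, "-e", e_path, "-a", a_path,
            "-o", output_dir, "-c", config]


def extract_command(grammar_dir, config="extract.ini"):
    """Build the argv list for the grammar extraction step."""
    return [sys.executable, "-m", "cdec.sa.extract",
            "-c", config, "-g", grammar_dir]


def extract_grammars(input_path, grammar_dir, config="extract.ini"):
    """Feed `input_path` to the extractor on stdin and return its stdout.

    Requires pycdec to be installed; raises CalledProcessError if the
    extractor exits with a non-zero status.
    """
    with open(input_path, "rb") as handle:
        return subprocess.check_output(extract_command(grammar_dir, config),
                                       stdin=handle)
```

For example, `subprocess.check_call(compile_command("f.txt", "e.txt", "a.txt", "output/"))` would run the compilation step, and `extract_grammars("input.txt", "grammars/")` would return the annotated input as bytes.
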
## Library usage
A basic demo of pycdec's features is available in `test.py`.

More documentation will come as the API becomes stable.

---
pycdec was contributed by [Victor Chahuneau](http://victor.chahuneau.fr)