From 4f27006ca3699d28cbdec6e3e8dd44d73afbc3ba Mon Sep 17 00:00:00 2001
From: "ccb@cs.jhu.edu"
Date: Tue, 17 Aug 2010 21:57:17 +0000
Subject: Added more images to the SCFG section

git-svn-id: https://ws10smt.googlecode.com/svn/trunk@583 ec762483-ff6d-05da-a07a-a48fb63a330f
---
 report/SCFGs.tex                            |   4 ++--
 report/SCFGs/hiero-phrase-extraction.pdf    | Bin 0 -> 80699 bytes
 report/SCFGs/hiero-tree.pdf                 | Bin 0 -> 32664 bytes
 report/SCFGs/samt-tree.pdf                  | Bin 0 -> 34497 bytes
 report/SCFGs/scfg-ccg-phrase-extraction.pdf | Bin 0 -> 130817 bytes
 report/SCFGs/scfg-phrase-extraction.pdf     | Bin 0 -> 136133 bytes
 6 files changed, 2 insertions(+), 2 deletions(-)
 create mode 100644 report/SCFGs/hiero-phrase-extraction.pdf
 create mode 100644 report/SCFGs/hiero-tree.pdf
 create mode 100644 report/SCFGs/samt-tree.pdf
 create mode 100644 report/SCFGs/scfg-ccg-phrase-extraction.pdf
 create mode 100644 report/SCFGs/scfg-phrase-extraction.pdf

diff --git a/report/SCFGs.tex b/report/SCFGs.tex
index 0810c95e..a0eb2752 100644
--- a/report/SCFGs.tex
+++ b/report/SCFGs.tex
@@ -71,7 +71,7 @@ The use of SCFGs for statistical machine translation was popularized by \citet{C
 Rather than using the full power of the SCFG formalism, the Hiero system instead uses a simple grammar with one non-terminal symbol, X, to extend conventional phrase-based models to allow phrases with gaps in them. The Hiero system is technically a grammar-based approach to translation, but does not incorporate any linguistic information in its grammars. Its process of decoding is also one of parsing, and it employs the Cocke-Kasami-Younger (CKY) dynamic programming algorithm to find the best derivation using its probabilistic grammar rules. However, because Hiero-style parses are devoid of linguistic information, they fail to capture facts about Urdu like that it is post-positional or verb final.
-\subsection{Syntax-augmented SCFGs extracted from supervised parses}\label{samt}
+\subsection{Enriching SCFGs with syntactic labels extracted from supervised parses}\label{samt}
 \begin{figure}
@@ -111,7 +111,7 @@ In the standard phrase-based and hierarchical phrase-based approaches to machine
 \end{figure}
-\subsection{SCFGs with syntactic labels extracted from supervised parses}\label{samt}
+\subsection{Our approach: enriching SCFGs labels in an unsupervised fashion}\label{samt}
 Note that one of the major advantages of extracting the linguistic SCFG for an automatically parsed parallel corpus is that only one side of the parallel corpus needs to be parsed. To extract an Urdu-English SCFG we therefore could use an English parser without the need for an Urdu parser. During translation the Urdu input text gets parsed with the projected rules, but a stand-alone Urdu parser is never required. However, all of the current approaches require that a parser, trained on supervised data, exist for at least one of the languages.
diff --git a/report/SCFGs/hiero-phrase-extraction.pdf b/report/SCFGs/hiero-phrase-extraction.pdf
new file mode 100644
index 00000000..6e17218d
Binary files /dev/null and b/report/SCFGs/hiero-phrase-extraction.pdf differ
diff --git a/report/SCFGs/hiero-tree.pdf b/report/SCFGs/hiero-tree.pdf
new file mode 100644
index 00000000..dc53e837
Binary files /dev/null and b/report/SCFGs/hiero-tree.pdf differ
diff --git a/report/SCFGs/samt-tree.pdf b/report/SCFGs/samt-tree.pdf
new file mode 100644
index 00000000..57cdec06
Binary files /dev/null and b/report/SCFGs/samt-tree.pdf differ
diff --git a/report/SCFGs/scfg-ccg-phrase-extraction.pdf b/report/SCFGs/scfg-ccg-phrase-extraction.pdf
new file mode 100644
index 00000000..cc2cc5f0
Binary files /dev/null and b/report/SCFGs/scfg-ccg-phrase-extraction.pdf differ
diff --git a/report/SCFGs/scfg-phrase-extraction.pdf b/report/SCFGs/scfg-phrase-extraction.pdf
new file mode 100644
index 00000000..29263b9c
Binary files /dev/null and b/report/SCFGs/scfg-phrase-extraction.pdf differ