author ccb@cs.jhu.edu <ccb@cs.jhu.edu@ec762483-ff6d-05da-a07a-a48fb63a330f> 2010-08-13 20:42:41 +0000
committer ccb@cs.jhu.edu <ccb@cs.jhu.edu@ec762483-ff6d-05da-a07a-a48fb63a330f> 2010-08-13 20:42:41 +0000
commit 9f35941087e1f4d54a9275bf943c98fe6f444b22
tree 0e7c1bc2e2787684dff59f7b18fbe65b7c757c0d /report/introduction.tex
parent 19b0982a82473fe17b27c080b19a4f3addad9d5e
Editing intro
git-svn-id: https://ws10smt.googlecode.com/svn/trunk@545 ec762483-ff6d-05da-a07a-a48fb63a330f
Diffstat (limited to 'report/introduction.tex')
-rw-r--r-- report/introduction.tex 8
1 files changed, 4 insertions, 4 deletions
diff --git a/report/introduction.tex b/report/introduction.tex
index 21e0e907..3b673c8e 100644
--- a/report/introduction.tex
+++ b/report/introduction.tex
@@ -1,12 +1,12 @@
\chapter{Introduction}
-Automatically generating high quality translations for foreign texts remains a central challenge for Natural Language Processing research.
-Recent advances in Statistical Machine Translation (SMT) has enabled these technologies to move out of research labs an become viable commercial products and ubiquitous online tools. \footnote{e.g., translate.google.com, www.systran.co.uk, www.languageweaver.com}
+Automatically generating high quality translations for foreign texts remains a central challenge for natural language processing research.
+Recent advances in statistical machine translation (SMT) has enabled these technologies to move out of research labs an become viable commercial products and useful online tools. \footnote{e.g., translate.google.com, www.systran.co.uk, www.languageweaver.com}
However these successes have not been uniform;
current state-of-the-art translation output varies markedly in quality depending on the languages being translated.
-Those language pairs that are closely related language pairs (e.g., English and French) can be translated with a high degree of precision, while for distant pairs (e.g., English and Chinese) the result is far from acceptable.
+Those language pairs that are closely related language pairs (e.g., English and French) can be translated with high quality, while for distant pairs (e.g., English and Chinese) the result tends to be much lower quality.
It is tempting to argue that SMT's current limitations can be overcome simply by increasing the amount of data on which the systems are trained.
-However, large scale evaluation campaigns for Chinese~$\rightarrow$~English translation have not yielded the expected gains despite the increasing size of the models.
+However, large scale evaluation campaigns for Chinese~$\rightarrow$~English translation, such as the DARPA GALE program, have not yielded high quality translation despite providing hundreds of millions of words worth of training data.
\begin{figure}[t]
\centering \includegraphics[scale=0.55]{urdu_example_translation.pdf}