git-svn-id: https://ws10smt.googlecode.com/svn/trunk@559 ec762483-ff6d-05da-a07a-a48fb63a330f

author: desaicwtf <desaicwtf@ec762483-ff6d-05da-a07a-a48fb63a330f> 2010-08-16 02:28:29 +0000
committer: desaicwtf <desaicwtf@ec762483-ff6d-05da-a07a-a48fb63a330f> 2010-08-16 02:28:29 +0000
commit: d523a48ff2a7097ec5c33054af82f9395774d2d2 (patch)
tree: e1153defc7300283455f56e7367a24c4b785557f
parent: 60b865d0ec060f2eda4803f168a02d6b26945869 (diff)
2 files changed, 31 insertions, 0 deletions
diff --git a/report/pr-clustering/EMVSPR.pdf b/report/pr-clustering/EMVSPR.pdf
new file mode 100644
index 00000000..c03b41f2
--- /dev/null
+++ b/report/pr-clustering/EMVSPR.pdf
diff --git a/report/pr-clustering/posterior.tex b/report/pr-clustering/posterior.tex
index 73c15dba..7597c8e1 100644
--- a/report/pr-clustering/posterior.tex
+++ b/report/pr-clustering/posterior.tex
@@ -191,3 +191,34 @@ where $\mathcal{L}_1$ and $\mathcal{L}_2$
 are log-likelihood of
 two models.
 \section{Experiments}
+As a sanity check, we inspected the output of
+the basic model (EM)
+and the posterior regularization (PR) model
+with sparsity constraints. Table~\ref{tab:EMVSPR}
+compares the two models on a few example phrases.
+
+\begin{table}[h]
+  \centering
+  \includegraphics[width=3.5in]{pr-clustering/EMVSPR}
+  \caption[A few examples comparing EM and PR]
+  {A few examples comparing EM and PR. 
+    ``Count of most frequent category'' is the number of
+    instances of a phrase that are concentrated on
+    its single most frequent category.
+    ``Number of categories'' is the number of distinct categories
+    a phrase is labelled with. As mentioned before, experience suggests
+    that a phrase should use few categories,
+    so both numbers are fair indicators of sparsity.
+    }
+  \label{tab:EMVSPR}
+\end{table}
+
+The models are formally evaluated with two kinds
+of metrics. As an extrinsic measure, we feed the clustering output
+through the whole translation pipeline
+to obtain a BLEU score. We also devised
+an intrinsic evaluation of clustering quality
+that compares the clustering against a supervised CFG parser trained on the
+treebank.
+
+