-rw-r--r-- README.md 15
-rwxr-xr-x scripts/geoquery/cv.sh 50
2 files changed, 58 insertions, 7 deletions
diff --git a/README.md b/README.md
index 182195e..3e509f4 100644
--- a/README.md
+++ b/README.md
@@ -2,30 +2,30 @@ rebol
=====
Code for grounded SMT on geoquery data.
-(This has nothing to do with the programming language REBOL! http://www.rebol.com/ )
+(This has nothing to do with the programming language REBOL! [0])
Dependencies
------------
-*WASP*-1.0 includes the geoquery knowledge base and scripts for querying it.
+_WASP_-1.0 includes the geoquery knowledge base and scripts for querying it.
The evaluation scripts were slightly modified to include the full output.
These scripts are in data/geoquery/wasp/, they go into wasp-1.0/data/geo-funql/eval/.
WASP-1.0 can be downloaded from here [1].
-You'll also need some *Prolog* environment, e.g. SWI-Prolog [2].
+You'll also need some _Prolog_ environment, e.g. SWI-Prolog [2].
-We use the a slightly modified implementation of *smt-semparse*,
+We use a slightly modified implementation of _smt-semparse_,
as described in 'Semantic parsing as machine translation' (Andreas et al, ACL 2013).
Our fork can be found here [3]. This depends on more stuff, e.g. the Moses decoder
and SRILM.
-For translation we use the *cdec* toolkit, [4].
+For translation we use the _cdec_ toolkit, [4].
As semantic parsing is quite slow and rebol does it quite often,
-results are cached with *memcached* [5].
+results are cached with _memcached_ [5].
-You'll also need the following *ruby gems*:
+You'll also need the following _ruby gems_:
* https://rubygems.org/gems/memcached
* http://rubygems.org/gems/nlp_ruby
* http://trollop.rubyforge.org/
@@ -33,6 +33,7 @@ You'll also need the following *ruby gems*:
---
+[0] http://www.rebol.com/
[1] http://www.cs.utexas.edu/~ml/wasp/wasp-1.0.tar.bz2
[2] http://www.swi-prolog.org/
[3] https://github.com/pks/smt-semparse
diff --git a/scripts/geoquery/cv.sh b/scripts/geoquery/cv.sh
new file mode 100755
index 0000000..bd41474
--- /dev/null
+++ b/scripts/geoquery/cv.sh
@@ -0,0 +1,50 @@
+#!/bin/bash
+
+function wait_for()
+{
+ echo "Waiting for ${#WAITFOR[@]} procs..."
+ echo ${WAITFOR[*]}
+ for pid in ${WAITFOR[@]}; do
+ wait $pid;
+ done
+}
+
+killall memcached
+memcached &
+
+K=100
+J=10
+STOPWORDS=/path/to/stopwords.en
+
+for VARIANT in rebol rampion exec; do
+for E in 0.3 0.1 0.01 0.03 0.003 0.001 0.0003 0.0001; do
+for INI in /paths/to/cdec/inis; do
+for INIT_WEIGHTS in /paths/to/weight/files; do
+WAITFOR=()
+for FOLD in {0..9}; do
+
+NAME="v=$VARIANT.fold=$FOLD.e=$E.c=$(basename $INI).w=$(basename $INIT_WEIGHTS)"
+
+../rampfion.rb \
+ -k $K \
+ -i /path/to/folds600/$FOLD/train.in \
+ -r /path/to/folds600/$FOLD/train.en \
+ -g /path/to/folds600/$FOLD/train.gold \
+ -h /path/to/folds600/$FOLD/train.funql \
+ -w $INIT_WEIGHTS \
+ -t $STOPWORDS \
+ -c $INI \
+ -b $(pwd)/cfg.rb \
+ -e $E \
+ -j $J \
+ -v $VARIANT \
+ -o $NAME.weights &> $NAME.output &
+WAITFOR+=( $! )
+
+done
+wait_for $WAITFOR
+done
+done
+done
+done
+
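Note on cv.sh: the script starts one background job per fold, records each PID in an array, and waits for all of them before moving to the next configuration. A minimal standalone sketch of that pattern follows. The `run_fold` function, the `OUT_DIR` temporary directory, and the `FAILED` counter are hypothetical stand-ins for illustration and are not part of the commit; the real script launches the training command in place of `run_fold`.

```shell
#!/bin/bash

# Hypothetical stand-in for one cross-validation fold (assumption, not from
# the commit); the real script would run the training command here.
run_fold() {
  local fold=$1
  echo "fold $fold done"
}

# Temporary directory for per-fold output files (assumption for this sketch).
OUT_DIR=$(mktemp -d)

# Launch all folds in the background and remember their PIDs.
PIDS=()
for FOLD in {0..9}; do
  run_fold "$FOLD" > "$OUT_DIR/fold.$FOLD.output" 2>&1 &
  PIDS+=( "$!" )
done

# Wait for every job; `wait PID` returns that job's exit status,
# so failed folds can be counted instead of silently ignored.
FAILED=0
for pid in "${PIDS[@]}"; do
  wait "$pid" || FAILED=$((FAILED + 1))
done

echo "launched ${#PIDS[@]} folds, $FAILED failed"
```

Quoting `"${PIDS[@]}"` and checking the status of each `wait` are additions of this sketch; the committed `wait_for` function only blocks until the jobs finish.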