From 3e62bba15db6dbbc83514bfcbf6cb49f0b4fe2ec Mon Sep 17 00:00:00 2001
From: karpathy
Date: Thu, 18 Dec 2014 12:29:45 -0800
Subject: first commit

---
 character_demo.html | 1991 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 1991 insertions(+)
 create mode 100644 character_demo.html

diff --git a/character_demo.html b/character_demo.html
new file mode 100644
index 0000000..dc84bc4
--- /dev/null
+++ b/character_demo.html
@@ -0,0 +1,1991 @@
+
+RecurrentJS Sentence Memorization Demo
+
+Deep Recurrent Nets character generation demo
+ This demo shows usage of the recurrentjs library, which lets you train deep Recurrent Neural Networks (RNN) and Long Short-Term Memory networks (LSTM) in JavaScript. The library is actually more general: it allows you to set up arbitrary expression graphs and perform automatic backpropagation through symbolic differentiation.
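+ As a rough sketch of what that looks like in code (assuming the R.Graph / R.Solver API described in the recurrentjs README; the names and signatures here are illustrative, check the library for the exact ones):
+
+   // parameters and input, as recurrentjs matrices
+   var W = new R.RandMat(10, 4, 0, 0.08); // 10x4 weights, small random init
+   var b = new R.RandMat(10, 1, 0, 0.08); // bias
+   var x = new R.RandMat(4, 1, 0, 1.0);   // input column vector
+
+   // the Graph records every operation so it can backprop through it later
+   var G = new R.Graph();
+   var out = G.tanh(G.add(G.mul(W, x), b)); // out = tanh(W*x + b)
+
+   // set a gradient on the output and run automatic backpropagation
+   out.dw[0] = 1.0;
+   G.backward();
+
+   // take an optimization step on the parameters
+   var solver = new R.Solver();
+   solver.step({ W: W, b: b }, 0.01, 1e-5, 5.0); // step size, L2 reg, gradient clip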
+ In this demo we take a dataset of sentences as input and learn to memorize the sentences character by character. That is, the RNN/LSTM takes a character and its context from previous time steps (as mediated by the hidden layers), and predicts the next character in the sequence. Here is an example:
+ [figure: example of character-level prediction, with letter vectors feeding two hidden layers and an output distribution over the next character]
+ In the image above, every character has an associated "letter vector" that we will train with backpropagation. These letter vectors are combined through a matrix-vector multiply transformation into the first hidden layer representation (yellow), then into the second hidden layer representation (purple), and finally into the output space (blue). The output space has dimensionality equal to the number of characters in the dataset, and every dimension provides the probability of the next character in the sequence. The network is therefore trained to always predict the next character. The quantity we track during training is called the perplexity, which measures how surprised the network is to see the next character in a sequence. For example, if the perplexity is 4.0 then it's as if the network was guessing uniformly at random from 4 possible characters for the next letter (i.e. the lowest it can be is 1). At test time, the prediction is done iteratively, character by character.
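+ Concretely, training steps through each sentence one character at a time and, at every step, records the probability the network assigned to the character that actually came next; the perplexity reported below is the exponentiated average surprise over those recorded probabilities. A short sketch of that calculation (the standard definition, not necessarily the demo's exact code):
+
+   // probs[i] = probability the model gave to the true next character at step i
+   function perplexity(probs) {
+     var sumNegLog = 0.0;
+     for (var i = 0; i < probs.length; i++) {
+       sumNegLog += -Math.log(probs[i]); // surprise at step i, in nats
+     }
+     return Math.exp(sumNegLog / probs.length); // 1.0 means perfect prediction
+   }
+
+   // guessing uniformly among 4 characters at every step gives perplexity 4:
+   perplexity([0.25, 0.25, 0.25]); // -> 4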
+ The demo is pre-filled with sentences from Paul Graham's essays, in an attempt to encode Paul Graham's knowledge into the weights of the Recurrent Networks. The long-term goal of the project, then, is to generate startup wisdom at will. Feel free to train on whatever data you wish, and to experiment with the parameters. If you want more impressive models you have to increase the sizes of the hidden layers, and perhaps slightly the sizes of the letter vectors. However, this will take longer to train.
+ For suggestions/bugs ping me at @karpathy.
Input sentences:
Controls/Options:
Training stats:
Learning rate:
Model samples:
Softmax sample temperature (low = more peaky predictions)
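+ The temperature rescales the raw output scores before the softmax: a low temperature exaggerates the differences between scores (peaky, conservative samples), a high one flattens them (more diverse output, more mistakes). A hedged sketch of temperature sampling (illustrative, not the demo's exact code):
+
+   // logits: one raw score per character in the vocabulary
+   function sampleWithTemperature(logits, temperature) {
+     var maxLogit = Math.max.apply(null, logits); // subtract max for numerical stability
+     var exps = logits.map(function (z) { return Math.exp((z - maxLogit) / temperature); });
+     var sum = exps.reduce(function (a, b) { return a + b; }, 0);
+     // draw one index in proportion to its (unnormalized) softmax weight
+     var r = Math.random() * sum, acc = 0;
+     for (var i = 0; i < exps.length; i++) {
+       acc += exps[i];
+       if (r <= acc) return i; // index of the sampled character
+     }
+     return exps.length - 1;
+   }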
Greedy argmax prediction:
I/O save/load model JSON
+ The textarea below is pre-filled with an example pre-trained model. You can choose to load it to see what predictions later on in training look like.
+ You can also load an example pretrained model:
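+ Under the hood, saving a model just means dumping every weight matrix to JSON (its dimensions plus a flat array of numbers) and reading them back the same way. A generic sketch of the idea, with assumed field names n/d/w for rows, columns and weights (not necessarily the demo's exact format):
+
+   function saveModel(model) {
+     var out = {};
+     for (var name in model) {
+       var m = model[name];
+       // n/d/w are assumed matrix fields: rows, columns, flat weight array
+       out[name] = { n: m.n, d: m.d, w: Array.prototype.slice.call(m.w) };
+     }
+     return JSON.stringify(out);
+   }
+   // loading reverses this: JSON.parse the string and copy each w array back
+   // into a matrix of the stored dimensions.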
-- cgit v1.2.3