NotMNIST classification
Check out the Udacity course on deep learning, made by Google. They use this dataset extensively and show some really powerful techniques. The goal of the last assignment was to experiment with these techniques to find the best accuracy using a regular multi-layer perceptron. I have a pretty beefy machine: 6600K OC, 2x GTX 970 OC, 16 GB DDR4, Samsung 950 Pro; so I set up a decent-sized network and let it train for a while.
My best network gets:
Test accuracy: 97.4%
Validation accuracy: 91.9%
Mini-batch accuracy: 97.9%
First I applied a pHash (perceptual hash) to every image and removed any with direct collisions. Then I split the large folder into ~320k training and ~80k validation images, and used ~17k images from the small folder for testing. I trained on mini-batches with SGD on the cross-entropy loss, with dropout between each layer and an exponentially decaying learning rate. The network has three hidden layers of ReLU units plus a standard softmax output layer.
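For illustration, here's a minimal sketch of that pHash dedup step. This is not my original code: it assumes the third-party imagehash and Pillow libraries, and the folder layout is hypothetical.

```python
import os

import imagehash  # third-party perceptual hashing library (assumption)
from PIL import Image

def dedupe(folder):
    """Keep one image per pHash; drop direct collisions."""
    seen = set()
    keep = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        try:
            h = imagehash.phash(Image.open(path))
        except OSError:
            continue  # notMNIST contains a few unreadable files
        if h not in seen:  # a direct collision means a duplicate, so skip it
            seen.add(h)
            keep.append(path)
    return keep
```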
Here are the parameters:
Mini-batch size: 1024
Hidden layer 1 size: 4096
Hidden layer 2 size: 2048
Hidden layer 3 size: 1024
Initial learning rate: 0.1
Dropout probability: 0.5
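For reference, here's a rough TensorFlow 1.x-style sketch of a graph with this shape. The layer sizes, dropout probability, and initial learning rate come from the list above; the decay schedule constants and the use of tf.layers are illustrative assumptions, not my exact setup.

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 28 * 28])  # flattened notMNIST images
y = tf.placeholder(tf.int64, [None])             # letter labels, 10 classes
keep_prob = tf.placeholder(tf.float32)           # 0.5 for training, 1.0 for eval

h = x
for units in (4096, 2048, 1024):                 # the three hidden layers
    h = tf.layers.dense(h, units, activation=tf.nn.relu)
    h = tf.nn.dropout(h, keep_prob)              # dropout between each layer
logits = tf.layers.dense(h, 10)                  # standard softmax output layer

loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))

# Exponentially decaying learning rate starting at 0.1; the decay_steps and
# decay_rate values here are placeholders, not the ones I used.
step = tf.Variable(0, trainable=False)
lr = tf.train.exponential_decay(0.1, step, decay_steps=10000, decay_rate=0.96)
train_op = tf.train.GradientDescentOptimizer(lr).minimize(loss, global_step=step)
```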
I ran this for 150k iterations, which took an hour and a half using one GPU. Learning pretty much stopped at 60k iterations, but the model never began to overfit. I believe that is because the dataset is so large and because of the dropout. Even at the end of all that training with a good-sized network, the mini-batch accuracy still did not reach 100%, so learning could have continued, albeit slowly.
The next assignment is to use a convolutional network, which looks promising. I'll try to post those results too.
RNN
The dynamic_rnn() function uses a while_loop() operation to run over the cell the appropriate number of times, and you can set swap_memory=True if you want it to swap the GPU’s memory to the CPU’s memory during backpropagation to avoid OOM errors.
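In code that looks something like this; the cell type and shapes are placeholders, swap_memory is the relevant flag:

```python
import tensorflow as tf

inputs = tf.placeholder(tf.float32, [None, 100, 28])  # [batch, time, features]
cell = tf.nn.rnn_cell.BasicLSTMCell(128)              # placeholder cell

outputs, state = tf.nn.dynamic_rnn(
    cell, inputs, dtype=tf.float32,
    swap_memory=True)  # swap GPU memory to CPU during backprop to avoid OOM
```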
Save Model
You can easily save Scikit-Learn models by using Python's pickle module, or using sklearn.externals.joblib, which is more efficient at serializing large NumPy arrays:
```python
from sklearn.externals import joblib

joblib.dump(my_model, "my_model.pkl")
# and later...
my_model_loaded = joblib.load("my_model.pkl")
```
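The plain pickle version is just as short:

```python
import pickle

with open("my_model.pkl", "wb") as f:
    pickle.dump(my_model, f)
# and later...
with open("my_model.pkl", "rb") as f:
    my_model_loaded = pickle.load(f)
```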