Gautam Singaraju

Tuesday, August 2, 2016

Caffe Deep Learning iPython Notebook

I added an IPython notebook on GitHub: https://github.com/singarajus/caffepythonnotebook

It is based on classify.py from Caffe's Python bindings.

In the notebook, set a URL for an image and hit the "Submit for run:" button. All the following cells will then be computed. The notebook identifies the "Convolution" and "InnerProduct" layers and creates drop-down boxes for them. Selecting any convolution layer lets you visualize that layer.
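
The layer discovery step can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: `layers_for_dropdowns` is a hypothetical helper, and the example layer list is made up to resemble an AlexNet-style network. With pycaffe, the (name, type) pairs could presumably be read from a loaded network (for example by pairing `net._layer_names` with the `type` of each entry in `net.layers`), but that is an assumption and is left out so the snippet runs without Caffe.

```python
from collections import OrderedDict

# Layer types the notebook turns into drop-down boxes.
DROPDOWN_TYPES = ("Convolution", "InnerProduct")


def layers_for_dropdowns(layers, wanted=DROPDOWN_TYPES):
    """Group layer names by type, keeping only the wanted types.

    `layers` is an iterable of (name, type) pairs in network order.
    """
    grouped = OrderedDict((layer_type, []) for layer_type in wanted)
    for name, layer_type in layers:
        if layer_type in grouped:
            grouped[layer_type].append(name)
    return grouped


# Hypothetical layer list resembling an AlexNet-style model.
example_layers = [
    ("data", "Input"),
    ("conv1", "Convolution"),
    ("relu1", "ReLU"),
    ("pool1", "Pooling"),
    ("conv5", "Convolution"),
    ("fc6", "InnerProduct"),
    ("fc8", "InnerProduct"),
    ("prob", "Softmax"),
]

options = layers_for_dropdowns(example_layers)
for layer_type, names in options.items():
    print(layer_type, names)
```

Each list in `options` would then populate one drop-down box.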

The model used in the notebook was trained on ImageNet.

A sample input file:

Filters at convolution layer 5

Layer activation at Convolution Layer 5:

Output Probability:

Friday, July 8, 2016

Mahabharata Using Recurrent Neural Networks and Long Short Term Memory

I wanted to test an RNN [1] on the Mahabharata [3]. I used jcjohnson's torch-rnn [2], a fast implementation recommended by Karpathy.

The Mahabharata is available from Project Gutenberg [3]. After cleaning up the data, I ran the experiment to generate some text.
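
The cleanup itself is not described in detail, so the following is only a sketch of one plausible approach. It assumes the usual Project Gutenberg layout, where the book text sits between marker lines starting with "*** START OF" and "*** END OF". The function name `strip_gutenberg_boilerplate` and the sample text are hypothetical.

```python
import re


def strip_gutenberg_boilerplate(text):
    """Keep only the text between the Gutenberg START and END marker lines.

    Assumes marker lines begin with '*** START OF' and '*** END OF';
    if a marker is missing, that side of the text is left untouched.
    """
    lines = text.splitlines()
    start, end = 0, len(lines)
    for i, line in enumerate(lines):
        if line.startswith("*** START OF"):
            start = i + 1
        elif line.startswith("*** END OF"):
            end = i
            break
    body = "\n".join(lines[start:end])
    # Collapse runs of three or more newlines into a single blank line.
    body = re.sub(r"\n{3,}", "\n\n", body)
    return body.strip() + "\n"


# Made-up miniature file with the assumed header and footer markers.
sample = (
    "Project Gutenberg license header\n"
    "*** START OF THIS PROJECT GUTENBERG EBOOK ***\n"
    "\n\n\n"
    "SECTION I\n"
    "\n\n\n\n"
    "First paragraph of the text.\n"
    "*** END OF THIS PROJECT GUTENBERG EBOOK ***\n"
    "License footer\n"
)

cleaned = strip_gutenberg_boilerplate(sample)
print(cleaned)
```

The cleaned text would then be the input to whatever preprocessing the RNN implementation expects.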

Sampling 2000 characters from the trained model generates:

A: Drona Parva: Section CCLV

this protect and approached thy breast. Beholding a friend of Santanu, O king's lronding to that effulgen ances died n it. If the common of this estraining every approving of those persons and

heart, on his sons; have with husbanded Burnificate of Madhu is palace of the ruler of that daughter. Sharactents, and discoursed to the very sage

and the carded. Tood (at the basze when they are iking that of Bharata to be milk'Thie kinds of end dragged of Brahmana rids and observarowhere that I regards us

like that of a

lore, who consume having who is one's soul, Krishna himself. The hundred intradavas entered thy tiger and religion of Sahadeva for the lord of the broking in the prevent, settled himself, O thou of arms, or Shikhandi, where) resting from pa fierce known into grief is the ye is the mind. Persons, the clouds. A

auspicious asked his hungly cleansed nook nobodimal our child should as bestood then took along by conduct of gold, with me, K.P been warrior Bhishma most extermination. Thinal. They are bent of arms, weeping, and one with four duties of the great hommaded as only to the act, thy carried that business of great son of Viratishta, I will the quantition that consequently, five the breath (the ocean saying, whose angry

luled (by all. Salutatir, that man, decority as if from which never hermitad of moon. Arrived ing Brahmanas at deep, O gover by the latter. Thy sires, Ketumishma, entrance. Peneemongst that subthe villages.

All of wicked roaring in her as a well-knowled at weapons l Pandu's race.

283:1 There, O thou wilt first

hurl to this is the bare with great on his sinful achieve the Pitris, the prince, live his minds is matter of are enjoymentible, some immediately

no dic Siva and Yay the loud behaviour and the driveranged for the inexhaustible. Of the minded them

which a long of humble slain by mair from their brave vanquished for him all one for his shafts. Aven desirous of fruits. Many fear, that one who used t

The RNN generates a reasonable sample. However, more careful cleaning and formatting of the training data should produce a much better one.

References
[1] http://karpathy.github.io/2015/05/21/rnn-effectiveness/
[2] https://github.com/jcjohnson/torch-rnn
[3] http://www.gutenberg.org/cache/epub/7864/pg7864.txt

Friday, May 27, 2016

Increasing performance of Octave with OpenBLAS and Multiple Threads

Octave [1] is a great platform for Machine Learning tasks. Octave's run-time performance can be improved by adding OpenBLAS [2] and by increasing the number of threads Octave can use. It can be improved further using NVIDIA's cuBLAS [3]. The following steps work on Ubuntu.

For this test, the matrix multiplication Octave code from [3] was used. To use OpenBLAS with a given number of threads, run the Octave code from a terminal as:

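# Make the OpenBLAS shared libraries visible to the dynamic loader.
export LD_LIBRARY_PATH=/opt/OpenBLAS/lib
# Replace <NumThreads> with the thread count; LD_PRELOAD makes Octave load OpenBLAS.
OMP_NUM_THREADS=<NumThreads> LD_PRELOAD=/opt/OpenBLAS/lib/libopenblas.so octave code.m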
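
To repeat the run for several thread counts, the command above can be wrapped in a small driver. The following is a minimal sketch, assuming `octave` is on the PATH and OpenBLAS is installed under /opt/OpenBLAS as in the command above. The helper names `build_env` and `run_sweep` are hypothetical.

```python
import os
import subprocess

# OpenBLAS library path taken from the command above.
OPENBLAS_LIB = "/opt/OpenBLAS/lib/libopenblas.so"


def build_env(num_threads, openblas_lib=OPENBLAS_LIB, base_env=None):
    """Return a copy of the environment with OpenBLAS preloaded."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(num_threads)
    env["LD_PRELOAD"] = openblas_lib
    return env


def run_sweep(script, thread_counts):
    """Run the Octave script once per thread count (requires octave)."""
    for n in thread_counts:
        subprocess.run(["octave", script], env=build_env(n), check=True)


# Build the environment for a 4-thread run without launching Octave.
env = build_env(4, base_env={"PATH": "/usr/bin"})
print(env["OMP_NUM_THREADS"], env["LD_PRELOAD"])
```

Calling `run_sweep("code.m", [1, 2, 4, 8])` would then run the benchmark once for each thread count.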

Results

Matrix Multiplication without OpenBLAS. Note: Y-axis is in log scale.

Improved Performance Using OpenBLAS. Note: Y-axis is in log scale.

Performance is best when OMP_NUM_THREADS equals the number of cores on the machine, as expected.

References

[1] GNU Octave: https://www.gnu.org/software/octave/
[2] OpenBLAS: http://www.openblas.net/
[3] Drop-in Acceleration of GNU Octave. https://devblogs.nvidia.com/parallelforall/drop-in-acceleration-gnu-octave/

Friday, February 5, 2016

Object Classification using Deep Learning and Raspberry Pi

Of late, I have been playing with multimodal interaction [1] on a Raspberry Pi (referred to here as an IoT edge device) driven by Deep Learning predictions. In this post, I discuss how I built a pipeline that uses voice interaction (speech decoding and speaking) and the Raspberry Pi camera to identify fruits.

Image Dataset

The FID30 [5] image dataset contains images in 30 fruit categories. These images were augmented with transformed copies, such as a 90-degree rotation and a negative (inverted) version of each image. Whenever possible, additional images were added to the dataset, together with their transformed versions.
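
The two transformations can be sketched in a few lines. This is a minimal illustration on a tiny grayscale image stored as nested lists, assuming 8-bit pixel values. The function names are hypothetical, and the real pipeline presumably used an image library on full-colour images.

```python
def rotate_90(image):
    """Rotate a 2-D grayscale image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def negative(image, max_value=255):
    """Invert pixel intensities, assuming values in [0, max_value]."""
    return [[max_value - pixel for pixel in row] for row in image]


def augment(image):
    """Return the original image plus its rotated and negative variants."""
    return [image, rotate_90(image), negative(image)]


# Tiny 2x3 example image with 8-bit grayscale values.
image = [
    [0, 50, 100],
    [150, 200, 255],
]

original, rotated, inverted = augment(image)
print(rotated)
print(inverted)
```

Each source image yields three training images this way, all with the same label.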

Deep Learning

Deep Learning [3] can be used for object recognition in images and for decoding voice. In this demo, I used Convolutional Neural Networks (CNN) [2] for image analysis. I built the CNN model myself and used CMU Sphinx together with Linux’s espeak for voice interaction.

Specifically, a CNN is a type of feed-forward artificial neural network that is widely applicable to image recognition. The models were built with the NVIDIA DIGITS framework [6], which uses the Caffe framework [7], running on an NVIDIA GPU [8]. For the network, I chose AlexNet [4] to generate the model; other networks such as GoogLeNet were evaluated too. CNNs typically require a large dataset of labelled images, and the generated models were overfitting. Better accuracy and less overfitting would need a much larger repository of images. Creating such a repository with hundreds of thousands of images is a challenge that is beyond the scope of this post.

IoT Edge Device built using the Raspberry Pi module

The edge node was built from a Raspberry Pi with a $2 microphone, a $34 camera, a $10 USB Wi-Fi adapter and a $12 speaker system. The whole system cost about $100 to build. The software stack is Python with PyAudio, and voice interaction uses models based on CMU PocketSphinx.

Cloud Tech Stack

Model creation and prediction were done in the cloud. The tech stack included Linux as the OS, Python, Java, MongoDB as the backend storage, an NVIDIA GPU for model generation, the Caffe framework for model generation and object recognition, and NVIDIA’s DIGITS for model generation.

What you will see in the Demo video

When the IoT device detects the word “this”, it captures an image and uploads it to the cloud (my laptop) via a REST interface. On the server, the object recognition module pulls the image, identifies the object and updates the entry via the REST interface. The Raspberry Pi polls the REST interface for the prediction result for the image it just uploaded and announces the object on the speakers.

All human interaction is with the IoT device, through speaking and listening, and all image recognition is performed on the server. The setup could be altered so that an initial prediction is performed on the edge device, making it a ‘true’ IoT device. This demo shows an end-to-end pipeline with a CNN.
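
The upload, predict and poll flow can be sketched as follows. This is a simplified illustration, not the demo's code: an in-memory `PredictionStore` class stands in for the REST interface and its backing storage, `classify` is a placeholder for the CNN, and all names are hypothetical.

```python
import itertools


class PredictionStore:
    """In-memory stand-in for the REST interface and its backing storage."""

    def __init__(self):
        self._entries = {}
        self._ids = itertools.count(1)

    def upload(self, image_bytes):
        """Edge device: submit a captured image, get an entry id back."""
        entry_id = next(self._ids)
        self._entries[entry_id] = {"image": image_bytes, "label": None}
        return entry_id

    def pending(self):
        """Server: list ids of entries that still need a prediction."""
        return [i for i, e in self._entries.items() if e["label"] is None]

    def image(self, entry_id):
        """Server: fetch the uploaded image for an entry."""
        return self._entries[entry_id]["image"]

    def update(self, entry_id, label):
        """Server: store the prediction for an entry."""
        self._entries[entry_id]["label"] = label

    def result(self, entry_id):
        """Edge device: poll for the prediction (None until ready)."""
        return self._entries[entry_id]["label"]


def classify(image_bytes):
    """Placeholder for the CNN prediction; always answers 'apple'."""
    return "apple"


def server_step(store):
    """Classify every pending image and write the label back."""
    for entry_id in store.pending():
        store.update(entry_id, classify(store.image(entry_id)))


def on_word_detected(word, store, capture):
    """Edge device: upload a captured image when the trigger word is heard."""
    if word == "this":
        return store.upload(capture())
    return None


store = PredictionStore()
entry_id = on_word_detected("this", store, capture=lambda: b"fake-image")
before = store.result(entry_id)  # polled before the server has run
server_step(store)
after = store.result(entry_id)   # polled after the prediction is stored
print(before, after)
```

On the real device, the polling loop would keep calling the equivalent of `result` until it returns a label, and then speak it.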

Apple being identified as Apples:

Lemons being identified as Lemons:

Orange being identified as Apricots:

In some cases, the same fruit with a different background was identified incorrectly.

References

[1] Multimodal Interaction. https://en.wikipedia.org/wiki/Multimodal_interaction

[2] Convolutional Neural Networks. https://en.wikipedia.org/wiki/Convolutional_neural_network

[3] Deep Learning. https://en.wikipedia.org/wiki/Deep_learning

[4] Krizhevsky, A., Sutskever, I. and Hinton, G. E. ImageNet Classification with Deep Convolutional Neural Networks. NIPS 2012: Neural Information Processing Systems, Lake Tahoe, Nevada.

[5] Škrjanec, M. Automatic fruit recognition using computer vision. BSc thesis (mentor: Matej Kristan), Fakulteta za računalništvo in informatiko, Univerza v Ljubljani, 2013.

[6] Digits Framework. https://developer.nvidia.com/digits

[7] Caffe Framework. http://caffe.berkeleyvision.org/

[8] NVIDIA GeForce GTX. http://www.geforce.com/hardware/notebook-gpus/geforce-gtx-760m

Thursday, October 29, 2015

Using Apache Zeppelin for Spark, SparkML and SparkSQL

Apache Zeppelin [1] is a handy tool for exploratory data analysis with Spark, SparkML and SparkSQL.

Sample Visualization in Apache Zeppelin generated using SparkSQL:

Create an RDD and a DataFrame, and register it as a temp table:
Query your data using SparkSQL:
Use SparkML for analytics:

Once the Spark SQLContext has the data, Zeppelin can be used to visualize it.

References

[1] Apache Zeppelin: https://zeppelin.incubator.apache.org/