Monday, October 26, 2015

Deep Learning: Identification of Fruits Using Digits, Caffe Framework

Image classification using neural networks is a very interesting topic. During my bachelor's degree, I had a glimpse of neural networks, where I specifically worked on C code to convert handwritten signatures into features that could be fed into a neural network.

The following is my work on classifying fruits using open datasets.

Classification of an orange using the model developed with Digits 2 and the Caffe framework.


Software and hardware:

  1. CUDA: 7.0
  2. Caffe framework [1]
  3. NVidia GeForce GTX 860M with 640 cores
  4. NVidia Digits 2

DataSet:

  1. Fruit DataSet [2]  (30 categories)

Digits performs the image classification, with 25% of the images set aside for validation.
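
To make the 25% hold-out concrete, here is a minimal sketch of a random train/validation split. It is not how Digits implements its split; the function name, the seed, and the file names are assumptions made up for illustration.

```python
import random


def split_train_val(paths, val_fraction=0.25, seed=0):
    """Shuffle the paths and hold out val_fraction of them for validation."""
    shuffled = list(paths)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical file names, only for illustration.
images = ["oranges/img_%03d.jpg" % i for i in range(40)]
train, val = split_train_val(images)
print(len(train), len(val))
```

With so few images per category, a per-category split would keep every class represented in both sets; the sketch above ignores that for brevity.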

After many modifications and using AlexNet, I reached an accuracy of 60%. Anything beyond this has led to overfitting. Overfitting is a common problem with Convolutional Neural Networks [3]. To avoid overfitting, a large number of images needs to be available.

I posted on multiple forums to ask whether practitioners or researchers could point me to a larger image dataset. The following are the next steps that I can envision:

  1. Looking at other datasets to extract the same features from them and add them to this dataset.
  2. Downloading non-copyrighted images from search engines. Google's API allows only 64 images to be downloaded. I asked on their forum how to filter out copyrighted images.

I would like to create a framework that automatically builds image datasets (from non-copyrighted images), including translations and transformations.
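
As a rough idea of what the "translations and transformations" step could look like, here is a minimal sketch on a tiny grayscale image stored as nested lists. The function names are my own, and a real framework would work on actual image files with an imaging library; this only shows the idea of producing several variants from one source image.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def translate(image, dx, dy, fill=0):
    """Shift the image by (dx, dy) pixels, filling uncovered pixels with `fill`."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            new_y, new_x = y + dy, x + dx
            # Pixels shifted outside the frame are dropped.
            if 0 <= new_y < height and 0 <= new_x < width:
                shifted[new_y][new_x] = image[y][x]
    return shifted


def augment(image):
    """Return the original image plus a few simple variants of it."""
    return [
        image,
        flip_horizontal(image),
        translate(image, 1, 0),
        translate(image, 0, 1),
    ]


tiny = [[1, 2], [3, 4]]
variants = augment(tiny)
```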

References:


[1] Caffe Deep Learning Framework: http://caffe.berkeleyvision.org/
[2] Marko Škrjanec, Fruit Image Data set: http://www.vicos.si/Downloads/FIDS30
[3] Convolutional Neural Networks: https://en.wikipedia.org/wiki/Convolutional_neural_network

Tuesday, October 6, 2015

Running CUDA 7 on Ubuntu 14.04: blank screen after CUDA install: Solved

Update: CUDA 8 is awesome and seems more stable.
--

Installing CUDA on Ubuntu 14.04 resulted in a blank screen. After fixing those issues, the GUI login screen showed up, but login would not work. After multiple attempts at a fix, either CUDA worked or X worked, but not both.

Trying to use the Nouveau driver for the GUI and the Nvidia driver for CUDA led me to a blog post [1]. It worked for me; thanks, Chen.

Chen suggests blacklisting Nouveau (which the CUDA installer does as well), removing and purging the Nvidia drivers, and then installing CUDA. During the CUDA installation, install the Nvidia driver, but NOT the OpenGL driver.

And everything works. Thanks, Chen.



To fix Ubuntu's blank screen

If Ubuntu boots to a blank screen [2]:
  1. Go to the command line: Ctrl + Alt + F1.
  2. Stop lightdm: sudo service lightdm stop
  3. Disable gpu-manager by backing up /etc/init/gpu-manager.conf and commenting out its contents.
  4. Switch to prime mode by installing nvidia-prime and running sudo prime-select nvidia
  5. Get the correct bus ID for the Nvidia card: lspci | grep NVIDIA
  6. Update /etc/X11/xorg.conf with the correct BusID, for example "PCI:1:0:0" for a card that lspci lists at 01:00.0.
  7. Make xorg.conf immutable so that it is not overwritten: sudo chattr +i /etc/X11/xorg.conf
  8. Start lightdm: sudo service lightdm start
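
One detail that is easy to get wrong: lspci prints the address as hexadecimal bus:device.function, while xorg.conf expects the BusID as decimal PCI:bus:device:function. Here is a minimal sketch of that conversion; the function name is my own, and it only handles the usual lspci address forms (with or without a leading domain).

```python
import re

# Matches "01:00.0" as well as "0000:01:00.0" (leading PCI domain).
_LSPCI_ADDRESS = re.compile(
    r"(?:[0-9a-fA-F]{4}:)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])"
)


def lspci_to_xorg_busid(address):
    """Convert an lspci address such as '01:00.0' into an Xorg BusID string."""
    match = _LSPCI_ADDRESS.fullmatch(address.strip())
    if match is None:
        raise ValueError("not an lspci-style address: %r" % address)
    # lspci prints hexadecimal; xorg.conf expects decimal numbers.
    bus, device, function = (int(part, 16) for part in match.groups())
    return "PCI:%d:%d:%d" % (bus, device, function)


print(lspci_to_xorg_busid("01:00.0"))
```

The difference only matters for bus numbers of 0a and above, where hexadecimal and decimal no longer look the same.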

Reference:
[1] Chen, https://chuanwen.wordpress.com/2015/07/19/run-cuda-on-ubuntu-14-04-2/
[2] http://vxlabs.com/2015/02/05/solving-the-ubuntu-14-04-nvidia-346-nvidia-prime-black-screen-issue/

Sunday, August 30, 2015

Caffe Install: ATLAS installation Ubuntu CPU throttling apparently enabled

Installing Caffe requires a math (BLAS) library. Of the 3 options, ATLAS is open source and, hence, the default. Installing ATLAS was failing with the "CPU throttling apparently enabled" error because of intel_pstate. intel_pstate is a CPU frequency scaling driver; together with its governors, it does power management for the kernel [1].

To check whether intel_pstate is the issue, find out whether it is the active scaling driver.

  1. cpufreq-info will show the driver and the governor.
  2. If intel_pstate is shown as the driver, disable intel_pstate.
  3. Edit the file /etc/default/grub and add intel_pstate=disable to the kernel command line: GRUB_CMDLINE_LINUX_DEFAULT="intel_pstate=disable"
  4. Update grub.cfg with sudo grub-mkconfig -o /boot/grub/grub.cfg and reboot.
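
If cpufreq-info is not installed, the same information can be read from sysfs. This is a minimal sketch, assuming the standard cpufreq files for cpu0 are present; the function names are my own.

```python
from pathlib import Path

# Standard sysfs location of the cpufreq settings for the first CPU.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def read_scaling_info(cpufreq_dir=CPUFREQ_DIR):
    """Return the active scaling driver and governor, or None where a file is missing."""
    info = {}
    for name in ("scaling_driver", "scaling_governor"):
        path = Path(cpufreq_dir) / name
        info[name] = path.read_text().strip() if path.exists() else None
    return info


def uses_intel_pstate(cpufreq_dir=CPUFREQ_DIR):
    """True when intel_pstate is the active scaling driver."""
    return read_scaling_info(cpufreq_dir)["scaling_driver"] == "intel_pstate"
```

Calling uses_intel_pstate() before and after the reboot is a quick way to confirm that the kernel parameter took effect.
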

Reference:
[1] https://www.kernel.org/doc/Documentation/cpu-freq/governors.txt
[2] http://unix.stackexchange.com/questions/121410/setting-cpu-governor-to-on-demand-or-conservative
[3] https://wiki.archlinux.org/index.php/CPU_frequency_scaling#Scaling_governors

Friday, February 26, 2010

Calculating distance between two IPs using Maxmind database

This is a faster mechanism for finding the distance between two IPs than using the Haversine distance, which calculates the distance between two points (each defined by a (lat, long) tuple) on the globe.
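
For reference, the Haversine baseline mentioned above can be sketched as follows. The coordinates would come from a GeoIP lookup, which is not shown here, and the mean Earth radius of 6371 km is an assumption of this sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2, radius_km=EARTH_RADIUS_KM):
    """Great-circle distance in km between two (lat, long) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


# A quarter of the equator: from (0, 0) to (0, 90).
quarter_equator = haversine_km(0.0, 0.0, 0.0, 90.0)
```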

Thursday, February 25, 2010

Hadoop: Effectiveness of a Combiner Class

The difference in execution speed for an MR job with and without a combiner class is huge. The security log analytics job without a combiner class did not complete in 1.5 days. With the addition of a combiner class, the code finished in 15-20 minutes! The reasons for this performance improvement are straightforward:
  • <K, V> pairs are combined in memory on the map side, so network latency and traffic to the reducers are decreased.
  • Disk operations at the reducers are minimal as a result of the combine operations.
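
The effect can be illustrated with a small plain-Python simulation (not Hadoop code). It counts how many <K, V> pairs would be sent to the reducers with and without a map-side combine step; the sample log lines and function names are made up for illustration.

```python
from collections import Counter


def map_phase(lines):
    """Emit one (key, 1) pair per token, like a counting mapper."""
    return [(token, 1) for line in lines for token in line.split()]


def combine(pairs):
    """Sum the values per key locally, before anything crosses the network."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())


def reduce_phase(pairs):
    """Sum the values per key over everything received from all mappers."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Two input splits, each handled by its own mapper (made-up sample data).
splits = [["10.0.0.1 10.0.0.1 10.0.0.2"], ["10.0.0.1 10.0.0.2 10.0.0.2"]]

without_combiner = [pair for split in splits for pair in map_phase(split)]
with_combiner = [pair for split in splits for pair in combine(map_phase(split))]

print(len(without_combiner), "pairs shuffled without a combiner")
print(len(with_combiner), "pairs shuffled with a combiner")
```

Both variants reduce to the same totals; only the number of pairs shuffled differs. The gap grows with the number of repeated keys per split, which is typically large for log data.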

Saturday, February 20, 2010

Hadoop: using ChainMapper

ChainMappers are a way to perform [MAP+ / REDUCE MAP*] operations.

  • Find below an example main function written to handle a chainmapper.

Notes:


  • While a ChainMapper can be used to simplify processing, by deftly handling the data most [MAP+ / REDUCE MAP*] jobs can usually be reduced to [MAP / REDUCE MAP].
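
The [MAP+ / REDUCE MAP*] pattern itself can be sketched in plain Python. This is not the Hadoop ChainMapper API, only the data flow: one or more mappers, then a reducer, then zero or more mappers. All function names and the sample input are made up for illustration.

```python
from collections import defaultdict
from itertools import chain


def apply_mappers(pairs, mappers):
    """Pipe (key, value) pairs through each mapper in order; a mapper may emit 0..n pairs."""
    for mapper in mappers:
        pairs = list(chain.from_iterable(mapper(k, v) for k, v in pairs))
    return pairs


def run_chain(pairs, pre_mappers, reducer, post_mappers=()):
    """Run MAP+ (pre_mappers), then REDUCE, then MAP* (post_mappers)."""
    mapped = apply_mappers(pairs, pre_mappers)
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    reduced = [reducer(key, values) for key, values in groups.items()]
    return apply_mappers(reduced, post_mappers)


def tokenize(_, line):
    return [(word, 1) for word in line.split()]


def lowercase(word, count):
    return [(word.lower(), count)]


def sum_counts(word, counts):
    return (word, sum(counts))


def keep_frequent(word, total):
    # Post-reduce mapper: drop keys seen fewer than two times.
    return [(word, total)] if total >= 2 else []


result = run_chain(
    [(0, "GET get POST"), (1, "get")],
    pre_mappers=[tokenize, lowercase],
    reducer=sum_counts,
    post_mappers=[keep_frequent],
)
```

Here tokenize and lowercase could easily be merged into a single mapper, which is the kind of simplification the note above describes.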

Hadoop: MaxMind GeoIP lookup from Distributed Cache

Place GeoLiteCity.dat in HDFS:

  • Create a JobClient object and add the file's URI to the DistributedCache.



  • Create a configure method that overrides the one in org.apache.hadoop.mapred.MapReduceBase (declared by org.apache.hadoop.mapred.JobConfigurable).


  • Create a Location object by passing the IP to the LookupService.
  • A significant number of IPs might come back with a null city/country. Use try and catch blocks to catch the resulting exceptions and process them accordingly.

Notes:
  • Reduce the creation of the LookupService objects. These are resource intensive.
  • Similarly, reduce the creation of Location Objects.
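
The two notes above, plus the null handling, can be sketched in plain Python. StubLookupService is a hypothetical stand-in for a GeoIP database reader and is not the MaxMind API; the point is only that the lookup object is created once per task in configure, not once per record, and that missing cities are handled explicitly.

```python
class StubLookupService:
    """Hypothetical stand-in for a GeoIP database reader (NOT the MaxMind API)."""

    instances_created = 0

    def __init__(self, table):
        StubLookupService.instances_created += 1
        self._table = table

    def get_location(self, ip):
        # Returns None for unknown IPs, mimicking a failed lookup.
        return self._table.get(ip)


class GeoMapper:
    def configure(self, table):
        # Create the expensive lookup object once per task.
        self.lookup = StubLookupService(table)

    def map(self, ip):
        location = self.lookup.get_location(ip)
        # Many IPs come back without a location or without a city.
        if location is None or location.get("city") is None:
            return (ip, "UNKNOWN")
        return (ip, location["city"])


# Made-up sample data using documentation IP ranges.
table = {
    "192.0.2.1": {"city": "Ljubljana", "country": "SI"},
    "192.0.2.2": {"city": None, "country": "SI"},
}
mapper = GeoMapper()
mapper.configure(table)
results = [mapper.map(ip) for ip in ("192.0.2.1", "192.0.2.2", "198.51.100.7")]
```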