Deep Learning Image Classification

In this article, we focus on using deep learning to create non-linear features that improve the performance of a machine learning model. We will also see how transfer learning can be applied: deep features learned on one dataset are reused to get good performance on a different dataset. In this IPython notebook, we build image classification models and explore their results on different parts of our image dataset.

import graphlab

Load a common image analysis dataset

# CSV format datasets https://d396qusza40orc.cloudfront.net/phoenixassets/image_train_data.csv
# https://d396qusza40orc.cloudfront.net/phoenixassets/image_test_data.csv
image_train = graphlab.SFrame('coursera-notebooks/course-1/image_train_data')
image_test = graphlab.SFrame('coursera-notebooks/course-1/image_test_data')

Exploring the image data

graphlab.canvas.set_target('ipynb')

image_train['image'].show()

image_train.head()

id	image	label	deep_features	image_array
24	Height: 32 Width: 32	bird	[0.242871761322, 1.09545373917, 0.0, ...	[73.0, 77.0, 58.0, 71.0, 68.0, 50.0, 77.0, 69.0, ...
33	Height: 32 Width: 32	cat	[0.525087952614, 0.0, 0.0, 0.0, 0.0, 0.0, ...	[7.0, 5.0, 8.0, 7.0, 5.0, 8.0, 5.0, 4.0, 6.0, 7.0, ...
36	Height: 32 Width: 32	cat	[0.566015958786, 0.0, 0.0, 0.0, 0.0, 0.0, ...	[169.0, 122.0, 65.0, 131.0, 108.0, 75.0, ...
70	Height: 32 Width: 32	dog	[1.12979578972, 0.0, 0.0, 0.778194487095, 0.0, ...	[154.0, 179.0, 152.0, 159.0, 183.0, 157.0, ...
90	Height: 32 Width: 32	bird	[1.71786928177, 0.0, 0.0, 0.0, 0.0, 0.0, ...	[216.0, 195.0, 180.0, 201.0, 178.0, 160.0, ...
97	Height: 32 Width: 32	automobile	[1.57818555832, 0.0, 0.0, 0.0, 0.0, 0.0, ...	[33.0, 44.0, 27.0, 29.0, 44.0, 31.0, 32.0, 45.0, ...
107	Height: 32 Width: 32	dog	[0.0, 0.0, 0.220677852631, 0.0, ...	[97.0, 51.0, 31.0, 104.0, 58.0, 38.0, 107.0, 61.0, ...
121	Height: 32 Width: 32	bird	[0.0, 0.23753464222, 0.0, 0.0, 0.0, 0.0, ...	[93.0, 96.0, 88.0, 102.0, 106.0, 97.0, 117.0, ...
136	Height: 32 Width: 32	automobile	[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 7.5737862587, 0.0, ...	[35.0, 59.0, 53.0, 36.0, 56.0, 56.0, 42.0, 62.0, ...
138	Height: 32 Width: 32	bird	[0.658935725689, 0.0, 0.0, 0.0, 0.0, 0.0, ...	[205.0, 193.0, 195.0, 200.0, 187.0, 193.0, ...

[10 rows x 5 columns]

Train a classifier on the raw image pixels

raw_pixel_model = graphlab.logistic_classifier.create(image_train,target='label',features=['image_array'])

PROGRESS: Creating a validation set from 5 percent of training data. This may take a while.
          You can set ``validation_set=None`` to disable validation tracking.

PROGRESS: Logistic regression:
PROGRESS: --------------------------------------------------------
PROGRESS: Number of examples          : 1903
PROGRESS: Number of classes           : 4
PROGRESS: Number of feature columns   : 1
PROGRESS: Number of unpacked features : 3072
PROGRESS: Number of coefficients    : 9219
PROGRESS: Starting L-BFGS
PROGRESS: --------------------------------------------------------
PROGRESS: +-----------+----------+-----------+--------------+-------------------+---------------------+
PROGRESS: | Iteration | Passes   | Step size | Elapsed Time | Training-accuracy | Validation-accuracy |
PROGRESS: +-----------+----------+-----------+--------------+-------------------+---------------------+
PROGRESS: | 1         | 6        | 0.000021  | 4.622984     | 0.379401          | 0.382353            |
PROGRESS: | 2         | 8        | 1.000000  | 6.046222     | 0.388334          | 0.411765            |
PROGRESS: | 3         | 9        | 1.000000  | 6.869242     | 0.437730          | 0.450980            |
PROGRESS: | 4         | 10       | 1.000000  | 7.699640     | 0.439832          | 0.480392            |
PROGRESS: | 5         | 11       | 1.000000  | 8.534293     | 0.452444          | 0.480392            |
PROGRESS: | 6         | 12       | 1.000000  | 9.366371     | 0.478192          | 0.500000            |
PROGRESS: | 10        | 16       | 1.000000  | 12.661399    | 0.517078          | 0.490196            |
PROGRESS: +-----------+----------+-----------+--------------+-------------------+---------------------+
PROGRESS: TERMINATED: Iteration limit reached.
PROGRESS: This model may not be optimal. To improve it, consider increasing `max_iterations`.
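
The feature and coefficient counts in the log above can be reproduced by hand. The sketch below assumes that each 32x32 image has three color channels (which is what 3072 unpacked features implies) and that the classifier fits one weight per feature plus one intercept for every class except a reference class. The helper name `count_coefficients` is made up for this illustration and is not part of GraphLab Create.

```python
# Reproduce the feature and coefficient counts reported in the training log.
# Assumption: one weight per feature plus one intercept, for every class
# except a single reference class.

def count_coefficients(num_features, num_classes):
    """Number of fitted parameters under the assumed parameterization."""
    return (num_features + 1) * (num_classes - 1)

# A 32x32 image with 3 color channels unpacks into this many pixel values.
raw_pixel_features = 32 * 32 * 3

raw_pixel_coefficients = count_coefficients(raw_pixel_features, num_classes=4)

print(raw_pixel_features)      # 3072, matches "Number of unpacked features"
print(raw_pixel_coefficients)  # 9219, matches "Number of coefficients"
```

Under the same assumption, 4096 features and 4 classes give 12291 coefficients, which is the count reported for the deep features model later in this notebook.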

Making predictions with a simple model based on raw pixels

image_test[0:3]['image'].show()

image_test[0:3]['label']

dtype: str
Rows: 3
['cat', 'automobile', 'cat']

raw_pixel_model.predict(image_test[0:3])

dtype: str
Rows: 3
['bird', 'cat', 'bird']
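
To make the mismatch explicit, the two outputs above can be compared element by element. This is a minimal sketch in plain Python with the label lists copied from the outputs above; it does not call GraphLab Create.

```python
# True labels and raw-pixel predictions for the first three test images,
# copied from the notebook outputs above.
true_labels = ['cat', 'automobile', 'cat']
predicted_labels = ['bird', 'cat', 'bird']

# Compare the two lists position by position.
matches = [t == p for t, p in zip(true_labels, predicted_labels)]
sample_accuracy = sum(matches) / float(len(matches))

print(matches)          # [False, False, False]
print(sample_accuracy)  # 0.0
```

Three images are far too few to judge a model, which is why the next step evaluates on the whole test set.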

Evaluate the raw pixel model on the test data

raw_pixel_model.evaluate(image_test)

{'accuracy': 0.4775, 'auc': 0.7196601666666673, 'confusion_matrix': Columns:
    target_label    str
    predicted_label str
    count   int

 Rows: 16

 Data:
 +--------------+-----------------+-------+
 | target_label | predicted_label | count |
 +--------------+-----------------+-------+
 |     bird     |       dog       |  179  |
 |     dog      |       cat       |  236  |
 |     cat      |       cat       |  338  |
 |     bird     |    automobile   |  132  |
 |  automobile  |    automobile   |  625  |
 |     dog      |    automobile   |  102  |
 |     dog      |       dog       |  417  |
 |     cat      |       dog       |  295  |
 |     bird     |       cat       |  159  |
 |  automobile  |       bird      |  110  |
 +--------------+-----------------+-------+
 [16 rows x 3 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns., 'f1_score': 0.4750893898174211, 'log_loss': 1.20979288252905, 'precision': 0.47388806650241955, 'recall': 0.47750000000000004, 'roc_curve': Columns:
    threshold   float
    fpr float
    tpr float
    p   int
    n   int
    class   int

 Rows: 400004

 Data:
 +-----------+-----+-----+------+------+-------+
 | threshold | fpr | tpr |  p   |  n   | class |
 +-----------+-----+-----+------+------+-------+
 |    0.0    | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   1e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   2e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   3e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   4e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   5e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   6e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   7e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   8e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   9e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 +-----------+-----+-----+------+------+-------+
 [400004 rows x 6 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.}
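
Only 10 of the 16 confusion matrix rows are printed, and the bird/bird row is not among them. The sketch below shows how the missing count and the per-class recall could be derived from the numbers that are visible. It assumes the test set holds 1000 images per class, which is suggested by `p = 1000` and `n = 3000` in the ROC table but is not stated explicitly.

```python
# Correct (diagonal) counts visible in the printed head of the confusion matrix.
visible_correct = {'cat': 338, 'automobile': 625, 'dog': 417}
# Misclassification counts printed for the 'bird' target label.
bird_errors = {'dog': 179, 'automobile': 132, 'cat': 159}

images_per_class = 1000  # assumption, based on p = 1000 in the ROC table
num_classes = 4
accuracy = 0.4775        # reported by evaluate()

# Total number of correct predictions implied by the reported accuracy.
total_correct = int(round(accuracy * images_per_class * num_classes))

# Whatever is not accounted for by the visible diagonal must be bird/bird.
implied_bird_correct = total_correct - sum(visible_correct.values())

# Consistency check: the bird row should add up to the assumed class size.
bird_row_total = implied_bird_correct + sum(bird_errors.values())

# Per-class recall = correct predictions / images of that class.
recall = dict((label, count / float(images_per_class))
              for label, count in visible_correct.items())
recall['bird'] = implied_bird_correct / float(images_per_class)

print(implied_bird_correct)  # 530
print(bird_row_total)        # 1000
print(sorted(recall.items()))
```

The bird row adding up to exactly 1000 supports the balanced test set assumption. On these numbers, cat is the class the raw pixel model recognizes least reliably.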

Can we improve the model using the deep features?

len(image_train)

len(image_test)

# Extracting deep features takes time; they are already computed and stored in the SFrame's 'deep_features' column
# deep_learning_model = graphlab.load_model('imagenet_model')
# image_train['deep_features'] = deep_learning_model.extract_features(image_train)
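
Conceptually, feature extraction passes each image through the layers of an already trained network and keeps the activations of a late hidden layer instead of the final class scores. The sketch below shows that idea for a single tiny layer. The weights, biases, and the helper name `extract_features_sketch` are made up for illustration and have nothing to do with the actual `imagenet_model`. It assumes ReLU activations, which would be consistent with the many exact zeros in the `deep_features` column shown earlier.

```python
def relu(x):
    """Rectified linear unit: negative activations are clipped to zero."""
    return x if x > 0.0 else 0.0

def extract_features_sketch(pixels, weights, biases):
    """Apply one frozen (already trained) layer to a flat list of pixels."""
    features = []
    for unit_weights, bias in zip(weights, biases):
        activation = sum(w * p for w, p in zip(unit_weights, pixels)) + bias
        features.append(relu(activation))
    return features

# Hypothetical frozen parameters for a layer with 3 inputs and 3 units.
frozen_weights = [
    [0.5, -1.0, 0.25],
    [-0.5, 0.5, -0.25],
    [1.0, 1.0, 1.0],
]
frozen_biases = [0.0, 0.1, -0.5]

# A hypothetical "image" with three pixel values.
pixels = [0.2, 0.4, 0.6]

features = extract_features_sketch(pixels, frozen_weights, frozen_biases)
print(features)  # the first unit has a negative activation, so its feature is 0.0
```

The important point for transfer learning is that the layer parameters stay fixed. Only the classifier trained on top of the extracted features is fitted to the new dataset.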

Given the deep features, let's train a classifier

deep_features_model = graphlab.logistic_classifier.create(image_train,
                                                         features=['deep_features'],
                                                         target='label')

PROGRESS: Creating a validation set from 5 percent of training data. This may take a while.
          You can set ``validation_set=None`` to disable validation tracking.

PROGRESS: WARNING: Detected extremely low variance for feature(s) 'deep_features' because all entries are nearly the same.
Proceeding with model training using all features. If the model does not provide results of adequate quality, exclude the above mentioned feature(s) from the input dataset.
PROGRESS: Logistic regression:
PROGRESS: --------------------------------------------------------
PROGRESS: Number of examples          : 1907
PROGRESS: Number of classes           : 4
PROGRESS: Number of feature columns   : 1
PROGRESS: Number of unpacked features : 4096
PROGRESS: Number of coefficients    : 12291
PROGRESS: Starting L-BFGS
PROGRESS: --------------------------------------------------------
PROGRESS: +-----------+----------+-----------+--------------+-------------------+---------------------+
PROGRESS: | Iteration | Passes   | Step size | Elapsed Time | Training-accuracy | Validation-accuracy |
PROGRESS: +-----------+----------+-----------+--------------+-------------------+---------------------+
PROGRESS: | 1         | 5        | 0.000131  | 4.232286     | 0.758783          | 0.816327            |
PROGRESS: | 2         | 10       | 0.138641  | 8.771076     | 0.767174          | 0.816327            |
PROGRESS: | 3         | 11       | 0.138641  | 9.943608     | 0.768222          | 0.816327            |
PROGRESS: | 4         | 12       | 0.138641  | 11.128205    | 0.772942          | 0.826531            |
PROGRESS: | 5         | 13       | 0.138641  | 12.322463    | 0.785527          | 0.836735            |
PROGRESS: | 6         | 14       | 0.138641  | 13.507242    | 0.803356          | 0.836735            |
PROGRESS: | 7         | 15       | 0.138641  | 14.696790    | 0.829051          | 0.887755            |
PROGRESS: | 8         | 16       | 0.138641  | 15.881160    | 0.835343          | 0.897959            |
PROGRESS: | 9         | 17       | 0.138641  | 17.062399    | 0.848453          | 0.897959            |
PROGRESS: | 10        | 18       | 0.138641  | 18.250449    | 0.865758          | 0.908163            |
PROGRESS: +-----------+----------+-----------+--------------+-------------------+---------------------+
PROGRESS: TERMINATED: Iteration limit reached.
PROGRESS: This model may not be optimal. To improve it, consider increasing `max_iterations`.
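
The validation accuracies in both training logs are based on very few images, so they move in coarse steps. The sketch below estimates the validation set sizes, assuming the validation rows are simply the part of the 2005 training rows (the length reported by `sketch_summary()` at the end of this notebook) that was not used for fitting.

```python
total_rows = 2005  # 'Length' reported by sketch_summary() for image_train

# "Number of examples" from the two training logs.
training_examples = {'raw_pixel_model': 1903, 'deep_features_model': 1907}

# Assumption: everything not used for training was held out for validation.
validation_sizes = dict((name, total_rows - n_train)
                        for name, n_train in training_examples.items())

# One validation image changes the accuracy by this much.
accuracy_step = dict((name, 1.0 / n) for name, n in validation_sizes.items())

print(validation_sizes['raw_pixel_model'])      # 102
print(validation_sizes['deep_features_model'])  # 98

# The final validation accuracies in the logs correspond to whole image counts.
print(round(50 / 102.0, 6))  # 0.490196 (raw pixel model, iteration 10)
print(round(89 / 98.0, 6))   # 0.908163 (deep features model, iteration 10)
```

With roughly 100 validation images, a single image shifts the accuracy by about one percentage point, so small differences between iterations should not be over-interpreted.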

Apply the deep features model to the first few images of the test set

image_test[0:3]['image'].show()

deep_features_model.predict(image_test[0:3])

dtype: str
Rows: 3
['cat', 'automobile', 'cat']

Compute the test data accuracy of the deep features model

deep_features_model.evaluate(image_test)

{'accuracy': 0.783, 'auc': 0.9415876666666686, 'confusion_matrix': Columns:
    target_label    str
    predicted_label str
    count   int

 Rows: 16

 Data:
 +--------------+-----------------+-------+
 | target_label | predicted_label | count |
 +--------------+-----------------+-------+
 |     bird     |       cat       |  152  |
 |     dog      |       dog       |  743  |
 |     cat      |       dog       |  229  |
 |     cat      |       bird      |   58  |
 |     bird     |       dog       |   64  |
 |  automobile  |       dog       |   5   |
 |     cat      |    automobile   |   33  |
 |     dog      |       bird      |   35  |
 |  automobile  |    automobile   |  953  |
 |     bird     |    automobile   |   28  |
 +--------------+-----------------+-------+
 [16 rows x 3 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns., 'f1_score': 0.783820449077231, 'log_loss': 0.5771744600567088, 'precision': 0.7870366726645404, 'recall': 0.7829999999999999, 'roc_curve': Columns:
    threshold   float
    fpr float
    tpr float
    p   int
    n   int
    class   int

 Rows: 400004

 Data:
 +-----------+----------------+-----+------+------+-------+
 | threshold |      fpr       | tpr |  p   |  n   | class |
 +-----------+----------------+-----+------+------+-------+
 |    0.0    |      1.0       | 1.0 | 1000 | 3000 |   0   |
 |   1e-05   | 0.947666666667 | 1.0 | 1000 | 3000 |   0   |
 |   2e-05   |     0.931      | 1.0 | 1000 | 3000 |   0   |
 |   3e-05   |     0.918      | 1.0 | 1000 | 3000 |   0   |
 |   4e-05   |      0.91      | 1.0 | 1000 | 3000 |   0   |
 |   5e-05   | 0.901333333333 | 1.0 | 1000 | 3000 |   0   |
 |   6e-05   | 0.892333333333 | 1.0 | 1000 | 3000 |   0   |
 |   7e-05   | 0.885333333333 | 1.0 | 1000 | 3000 |   0   |
 |   8e-05   |     0.878      | 1.0 | 1000 | 3000 |   0   |
 |   9e-05   | 0.872333333333 | 1.0 | 1000 | 3000 |   0   |
 +-----------+----------------+-----+------+------+-------+
 [400004 rows x 6 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.}
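
To put the two test accuracies side by side, the sketch below computes the absolute gain and the share of errors removed by switching from raw pixels to deep features. The accuracy values are copied from the two `evaluate()` outputs in this notebook.

```python
# Test accuracies reported by evaluate() for the two models.
raw_pixel_accuracy = 0.4775
deep_features_accuracy = 0.783

# Error rate = share of test images that are misclassified.
raw_pixel_error = 1.0 - raw_pixel_accuracy
deep_features_error = 1.0 - deep_features_accuracy

# Gain in accuracy, in absolute terms.
absolute_gain = deep_features_accuracy - raw_pixel_accuracy

# Share of the raw pixel model's errors that the deep features model avoids.
relative_error_reduction = (raw_pixel_error - deep_features_error) / raw_pixel_error

print(round(absolute_gain, 4))             # 0.3055
print(round(relative_error_reduction, 3))  # 0.585
```

In other words, the deep features model makes a bit less than half as many mistakes on the test set as the raw pixel model.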

image_train['label'].sketch_summary()

+------------------+-------+----------+
|       item       | value | is exact |
+------------------+-------+----------+
|      Length      |  2005 |   Yes    |
| # Missing Values |   0   |   Yes    |
| # unique values  |   4   |    No    |
+------------------+-------+----------+

Most frequent items:
+-------+------------+-----+-----+------+
| value | automobile | cat | dog | bird |
+-------+------------+-----+-----+------+
| count |    509     | 509 | 509 | 478  |
+-------+------------+-----+-----+------+
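
The counts above show that the training set is close to balanced. As a minimal sketch, the class shares and the accuracy of a trivial baseline that always predicts the most frequent class can be computed from these counts directly.

```python
# Label counts copied from the sketch_summary() output above.
label_counts = {'automobile': 509, 'cat': 509, 'dog': 509, 'bird': 478}

total = sum(label_counts.values())

# Share of the training set that belongs to each class.
class_shares = dict((label, count / float(total))
                    for label, count in label_counts.items())

# Accuracy of always predicting the most frequent class on this data.
majority_baseline = max(label_counts.values()) / float(total)

print(total)                        # 2005
print(round(majority_baseline, 4))  # 0.2539
```

Any useful classifier on these four classes should clearly beat this baseline of roughly 25 percent.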

Credits: Machine Learning Foundations: A Case Study Approach