Is that a warbler? Bird classification with Keras CNN in Python

Ever wondered ‘What is that bird?’

Dec 11 · 8 min read


I constantly wondered ‘What is that bird?’ when I walked my dog along a park in Boston that was filled with birds at all times of the year: baby ducks during the summer, migratory songbirds in the fall/spring, and waterfowl in the winter. My grandpa (a long-time bird watcher) sent me The Sibley Field Guide to Birds and that sparked a hobby for me. Before you discount this as an old person hobby, I highly recommend going bird watching, especially if you have a camera.

Since most small birds do not sit still long enough for you to flip through 400 pages of the field guide and compare 20+ markings, I started taking pictures of birds in the hope that I would have a clear enough picture for identification later. I then discovered a website called eBird that lets you keep track of which bird species you have seen and where. You can even upload a photo as proof. For those nerds out there who love Pokemon, it is just like that but with real live birds!

[Image: Pokemon Pokedex folder. Source: https://www.trollandtoad.com/pokemon/pokemon-miscellaneous-supplies/pokemon-1999-pikachu-meowth-pokedex-2-pocket-folder/1167040]

Occasionally, I upload a photo of the wrong bird, but luckily there are eBird volunteers who monitor the bird photos and email you (kindly) saying you flagged the wrong species. Don’t do this too often though because then they will lock your account (oops!). Usually, these volunteers will also tell you the correct species. This is a lot of work for those volunteers!

[Image: Not a Savannah Sparrow]

As a data scientist, I was thinking: what if we could automatically check each bird photo that is uploaded with deep learning? As a proof of principle for a weekend project, I created this predictive model to detect if the bird image is a warbler (my grandpa’s favorite category of birds).

Project Definition

Given an image of a bird, predict if it is a warbler (see below for warbler species tags)

Data Set

The data set in this project comes from Caltech-UCSD Birds-200-2011 ( http://www.vision.caltech.edu/visipedia/CUB-200-2011.html ). This data set has 200 bird species across 11,788 images. Since the number of images for any one species is quite small, I decided as a proof of principle to group all the warbler images.

Set up: load metadata

Let’s start by aggregating all the metadata provided by Caltech-UCSD:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# path to dataset
data_path = '../CUB_200_2011/CUB_200_2011/'

# aggregate datasets
df_images = pd.read_csv(data_path+'images.txt', 
                        sep = ' ',header = None, 
                        names = ['img_num','img'])
df_labels = pd.read_csv(data_path+'image_class_labels.txt', 
                        sep = ' ',header = None, 
                        names = ['img_num','class_id'])
df_classes = pd.read_csv(data_path+'classes.txt', 
                         sep = ' ', header = None, 
                         names = ['class_id','bird_class'])
df_split = pd.read_csv(data_path +'train_test_split.txt', 
                       sep = ' ', header = None, 
                       names = ['img_num','dataset'])

df = pd.merge(df_images, df_labels, on = 'img_num', how = 'inner')
df = pd.merge(df, df_classes, on = 'class_id',how = 'inner')
df = pd.merge(df, df_split, on = 'img_num',how = 'inner')
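Because every later step depends on these joins, it can be worth confirming that they really produce one row per image. The sketch below uses tiny made-up stand-ins for the metadata tables (the rows are illustrative, not taken from the data set) together with the `validate` argument of `pd.merge`, which raises an error if the key relationship is not the one you expect.

```python
import pandas as pd

# Tiny made-up stand-ins for images.txt, image_class_labels.txt and classes.txt
toy_images = pd.DataFrame({'img_num': [1, 2, 3],
                           'img': ['a/1.jpg', 'a/2.jpg', 'b/3.jpg']})
toy_labels = pd.DataFrame({'img_num': [1, 2, 3], 'class_id': [1, 1, 2]})
toy_classes = pd.DataFrame({'class_id': [1, 2],
                            'bird_class': ['001.Toy_Albatross', '002.Toy_Warbler']})

# Each image number should match exactly one label row
toy = pd.merge(toy_images, toy_labels, on='img_num', how='inner',
               validate='one_to_one')
# Many images can share one class row
toy = pd.merge(toy, toy_classes, on='class_id', how='inner',
               validate='many_to_one')

# An inner join silently drops unmatched rows, so compare the row counts
n_rows_kept = len(toy)
assert n_rows_kept == len(toy_images)
print(toy)
```

If a key were duplicated or missing, the merge or the final assertion would fail instead of quietly changing the number of images.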

The merged dataframe has the image number, the image name (with path), the species id, the species name, and the train/test flag supplied with the data set. Since we will do our own train/validation split, we will ignore that last column.

Make Warbler Output Label

I went through the list of species and extracted all the warblers in a list:

warbler_class = ['020.Yellow_breasted_Chat','158.Bay_breasted_Warbler',
       '159.Black_and_white_Warbler', '160.Black_throated_Blue_Warbler',
       '161.Blue_winged_Warbler', '162.Canada_Warbler',
       '163.Cape_May_Warbler', '164.Cerulean_Warbler',
       '165.Chestnut_sided_Warbler', '166.Golden_winged_Warbler',
       '167.Hooded_Warbler', '168.Kentucky_Warbler',
       '169.Magnolia_Warbler', '170.Mourning_Warbler',
       '171.Myrtle_Warbler', '172.Nashville_Warbler',
       '173.Orange_crowned_Warbler', '174.Palm_Warbler',
       '175.Pine_Warbler', '176.Prairie_Warbler',
       '177.Prothonotary_Warbler', '178.Swainson_Warbler',
       '179.Tennessee_Warbler', '180.Wilson_Warbler',
       '181.Worm_eating_Warbler', '182.Yellow_Warbler',
       '183.Northern_Waterthrush', '184.Louisiana_Waterthrush', '200.Common_Yellowthroat']

This allows us to make a binary output label:

df['OUTPUT_LABEL'] = (df.bird_class.isin(warbler_class)).astype('int')
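The list above was assembled by hand. One rough way to cross-check a list like this, sketched here on a few rows in the same `NNN.Name` format, is to match on the word `Warbler` and then add the species whose names do not contain it. The `extra_warblers` list and the `999.Toy_Sparrow` row are assumptions made for illustration.

```python
import pandas as pd

toy_classes = pd.DataFrame({'bird_class': [
    '020.Yellow_breasted_Chat',   # from the list above; no "Warbler" in the name
    '158.Bay_breasted_Warbler',   # from the list above
    '183.Northern_Waterthrush',   # from the list above; no "Warbler" in the name
    '999.Toy_Sparrow',            # hypothetical non-warbler, for illustration
]})

# Species counted as warblers here whose names lack the word "Warbler"
extra_warblers = ['020.Yellow_breasted_Chat', '183.Northern_Waterthrush']

by_name = toy_classes.bird_class.str.contains('Warbler', regex=False)
by_list = toy_classes.bird_class.isin(extra_warblers)
toy_classes['OUTPUT_LABEL'] = (by_name | by_list).astype('int')

labels = toy_classes.OUTPUT_LABEL.tolist()
print(labels)
```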

Split data into train and validation

We can split our data into 70% train and 30% validation.

df = df.sample(n = len(df), random_state = 42)
df_train_all = df.sample(frac = 0.7, random_state = 42)
df_valid = df.drop(df_train_all.index)

And check the prevalence is about the same in both groups:

def calc_prevalence(y):
    return sum(y)/ len(y)

print('train all %.3f'%calc_prevalence(df_train_all.OUTPUT_LABEL))
print('valid %.3f'%calc_prevalence(df_valid.OUTPUT_LABEL))

Both come out at approximately 15%.
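A purely random 70/30 split only matches the prevalence approximately. If you want the two groups to match as closely as the counts allow, one option is to sample 70% within each label value. This is a sketch on synthetic labels, not the split used in this post, and `GroupBy.sample` needs pandas 1.1 or newer.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in: 1000 rows with 15% positives
toy = pd.DataFrame({'OUTPUT_LABEL': np.r_[np.ones(150, dtype=int),
                                          np.zeros(850, dtype=int)]})

# Draw 70% of the rows separately within each label value
toy_train = toy.groupby('OUTPUT_LABEL').sample(frac=0.7, random_state=42)
toy_valid = toy.drop(toy_train.index)

prev_train = toy_train.OUTPUT_LABEL.mean()
prev_valid = toy_valid.OUTPUT_LABEL.mean()
print('train %.3f, valid %.3f' % (prev_train, prev_valid))
```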

Image Augmentation

At this point, we could just train a deep learning model, but the model may end up just dumbly always predicting NOT A WARBLER due to the imbalance. I tried it and it happened to me.

To counter this imbalance, we need to either get or make more warbler images or sub-sample the not-warbler images. For this project, I’m going to use data augmentation (rotate, zoom, crop, flip, etc.) to increase the number of warbler images. For a great review of data augmentation, see this butterfly detector project.

Let’s grab all the warbler images from our dataframe:

warbler_imgs = df_train_all.loc[df_train_all.OUTPUT_LABEL == 1,'img'].values

We can then use Keras’ ImageDataGenerator to make new augmented images. To keep things simple, I’m just going to save these new images in an augmented warblers folder. In addition, it probably would be a good idea to also add augmentation to the non-warbler images so that the DL model doesn’t learn that ‘augmentation’ is warbler, but I’ll skip this for now. I have also seen other articles that do this augmentation on the fly during training, but I’ll skip this for now too.
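To make the idea concrete, here is a minimal NumPy-only sketch of two of these transformations, a random horizontal flip and a random crop. It is a stand-in that shows what an augmented copy is, not the Keras `ImageDataGenerator` pipeline used for this post; the function name `augment_once` and the crop fraction are assumptions.

```python
import numpy as np

def augment_once(img, rng, crop_frac=0.9):
    """Return a randomly flipped and cropped copy of an (H, W, C) image array."""
    h, w = img.shape[:2]
    out = img
    # Flip left-right about half of the time
    if rng.random() < 0.5:
        out = out[:, ::-1, :]
    # Pick a random window covering crop_frac of each side
    ch, cw = int(h * crop_frac), int(w * crop_frac)
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    return out[top:top + ch, left:left + cw, :].copy()

rng = np.random.default_rng(42)
# Random pixels standing in for a real photo
fake_img = rng.integers(0, 256, size=(100, 120, 3), dtype=np.uint8)
augmented = [augment_once(fake_img, rng) for _ in range(5)]
print([a.shape for a in augmented])
```

Each cropped copy would then be resized back to the network's input size before training.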


We can then aggregate the augmented images:

from os import listdir

warbler_aug_files = ['aug_warblers/'+ a for a in listdir(data_path+'images/aug_warblers/') if a.endswith('.jpg')]
df_aug = pd.DataFrame({'img':warbler_aug_files, 'OUTPUT_LABEL': [1]*len(warbler_aug_files) })

And concatenate them with our existing training data:

df_c = pd.concat([df_train_all[['img','OUTPUT_LABEL']],df_aug],
                 axis = 0, ignore_index = True, sort = False)

Just to be safe, let’s balance the data with a 1:1 ratio between warbler and non-warbler:

rows_pos = df_c.OUTPUT_LABEL == 1
df_pos = df_c.loc[rows_pos]
df_neg = df_c.loc[~rows_pos]
n= min([len(df_pos), len(df_neg)])
df_train = pd.concat([df_pos.sample(n = n,random_state = 42), 
                      df_neg.sample(n = n, random_state = 42)], 
                     axis = 0)
df_train = df_train.sample(frac = 1, random_state = 42)

Build X and Y

We can now build our X and Y for machine learning. In order to do this, let’s make a function for loading all the images listed in a dataframe. The function:

resizes each image to 224x224
converts to RGB (3 channels)
normalizes from 0 to 1 (i.e. divide by 255)

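# load_img and img_to_array are Keras image helpers
from keras.preprocessing.image import load_img, img_to_array

IMG_SIZE = 224
def load_imgs(df):
    # preallocate one float32 array for all images
    imgs = np.ndarray(shape = (len(df), IMG_SIZE, IMG_SIZE,3), dtype = np.float32)
    for ii in range(len(df)):
        file = df.img.values[ii]
        # load as 3-channel RGB, resized to IMG_SIZE x IMG_SIZE
        img = load_img(data_path+'images/'+file, target_size=(IMG_SIZE, IMG_SIZE),color_mode='rgb')
        # scale pixel values from 0-255 to 0-1
        img = img_to_array(img)/255
        imgs[ii] = img
    return imgs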

We can make our X and Y with

X_train = load_imgs(df_train)
X_valid = load_imgs(df_valid)

y_train = df_train.OUTPUT_LABEL.values
y_valid = df_valid.OUTPUT_LABEL.values

load_imgs already returns arrays shaped (n, 224, 224, 3), which is the layout Keras expects, so the reshape below changes nothing; it just makes the expected shape explicit:

# reshape
X_train = X_train.reshape(X_train.shape[0], IMG_SIZE,IMG_SIZE, 3)
X_valid = X_valid.reshape(X_valid.shape[0], IMG_SIZE,IMG_SIZE, 3)

My final X_train has shape (14104, 224, 224, 3), which means we have 14,104 images of 224 x 224 pixels with 3 color channels.
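An array of that shape is large. A quick back-of-the-envelope check of its size in memory, using the shape quoted above and 4 bytes per `float32` value:

```python
import numpy as np

shape = (14104, 224, 224, 3)                  # shape reported above
n_values = int(np.prod(shape, dtype=np.int64))
n_bytes = n_values * np.dtype(np.float32).itemsize

print('%d values, %.2f GiB' % (n_values, n_bytes / 2**30))
```

That works out to roughly 7.9 GiB for the training array alone, which is worth knowing before loading everything into memory at once.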

We can plot one of the images with:

ii = 3
plt.imshow(X_train[ii])
plt.title(df_train.img.iloc[ii])
plt.show()

[Image: augmented yellow-rumped warbler]

CNN Machine Learning Model

For simplicity, let’s create an architecture that has two CNN layers with dropout, a dense layer, and a final sigmoid for this binary classifier. Other more complicated architectures could be tried later.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Dropout

model = Sequential()
model.add(Conv2D(filters = 64, kernel_size = (5,5), 
                 activation = 'relu', 
                 input_shape = X_train.shape[1:]))
model.add(MaxPool2D(pool_size = (3,3)))
model.add(Dropout(rate = 0.25))
model.add(Conv2D(filters = 64, kernel_size = (3,3), 
                 activation = 'relu'))
model.add(MaxPool2D(pool_size = (3,3)))
model.add(Dropout(rate = 0.25))
model.add(Flatten())
model.add(Dense(64, activation = 'relu'))
model.add(Dropout(rate = 0.25))
model.add(Dense(1, activation = 'sigmoid'))
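It helps to trace how the image shrinks as it passes through these layers. The sketch below reproduces the arithmetic by hand, assuming the Keras defaults that the code above leaves implicit (no padding, convolution stride of 1, pooling stride equal to the pool size). The helper names are made up for this example.

```python
def conv_out(size, kernel):
    # 'valid' convolution with stride 1
    return size - kernel + 1

def pool_out(size, pool):
    # 'valid' max pooling with stride equal to the pool size
    return (size - pool) // pool + 1

def conv_params(kernel, in_ch, filters):
    # one kernel x kernel x in_ch weight block plus one bias per filter
    return kernel * kernel * in_ch * filters + filters

side = 224
side = conv_out(side, 5)     # Conv2D 5x5
side = pool_out(side, 3)     # MaxPool2D 3x3
side = conv_out(side, 3)     # Conv2D 3x3
side = pool_out(side, 3)     # MaxPool2D 3x3
flat = side * side * 64      # Flatten over 64 feature maps

total_params = (conv_params(5, 3, 64)      # first Conv2D
                + conv_params(3, 64, 64)   # second Conv2D
                + flat * 64 + 64           # Dense(64)
                + 64 * 1 + 1)              # Dense(1)
print(side, flat, total_params)
```

Almost all of the weights sit in the first dense layer. Under the same assumptions, `model.summary()` should report matching numbers.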

We will compile our model with Adam and a binary cross-entropy loss (i.e. log-loss for two classes).

model.compile(
                loss = 'binary_crossentropy',
                optimizer = 'adam',
                metrics = ['accuracy'])
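For reference, the binary cross-entropy for a label y and a predicted probability p is -(y*log(p) + (1-y)*log(1-p)), averaged over the samples. Here is a small NumPy sketch of that formula; the clipping constant `eps` is an assumption added to avoid taking log(0).

```python
import numpy as np

def binary_crossentropy(y_true, p_pred, eps=1e-7):
    """Mean log-loss for binary labels and predicted probabilities."""
    y = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1 - eps)
    return float(np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p))))

# Same two labels, scored by a confident-and-right and a confident-and-wrong model
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)
```

The loss stays small when the predicted probability agrees with the label and grows quickly when the model is confidently wrong.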

You can train your classifier with (for speed I just did 2 epochs at this time with a batch size of 64):

model.fit(X_train, y_train, batch_size = 64, epochs= 2, verbose = 1)

Predictions and Model Performance

We can calculate predictions for both training and validation as:

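# predict returns the sigmoid output, i.e. the predicted probability of warbler
# (Sequential.predict_proba is not available in newer Keras releases)
y_train_preds = model.predict(X_train,verbose = 1)
y_valid_preds = model.predict(X_valid,verbose = 1)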

I’m going to save the validation predictions in the df_valid for further analysis

df_valid['pred'] = y_valid_preds[:,0]

We can look at the warbler species that we did the best on (highest average score) with

df_valid.loc[(df_valid.OUTPUT_LABEL == 1) ].groupby('bird_class').pred.mean().sort_values(ascending = False)


From looking at a few pictures, it seems the model does better on the warblers with yellow than the warblers without yellow in their colors.

We can also look at the species that the model tends to think are warblers but are not. Goldfinches show up in that list, which makes sense since they are very yellow!
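The ranking of non-warbler species by average score mirrors the warbler query above with the label flipped. A sketch on made-up scores (the `toy_valid` rows and values are invented for illustration):

```python
import pandas as pd

toy_valid = pd.DataFrame({
    'bird_class': ['Toy_Goldfinch', 'Toy_Goldfinch', 'Toy_Gull', 'Toy_Warbler'],
    'OUTPUT_LABEL': [0, 0, 0, 1],
    'pred': [0.8, 0.6, 0.1, 0.9],
})

# Mean warbler score per non-warbler species, highest first
false_alarm_rank = (toy_valid.loc[toy_valid.OUTPUT_LABEL == 0]
                    .groupby('bird_class').pred.mean()
                    .sort_values(ascending=False))
print(false_alarm_rank)
```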


We can calculate the performance across a range of metrics (for a tutorial on classification metrics, see my post here):

from sklearn.metrics import roc_auc_score, accuracy_score, \
                            precision_score, recall_score
def calc_specificity(y_actual, y_pred, thresh):
    # calculates specificity
    return sum((y_pred < thresh) & (y_actual == 0)) /sum(y_actual ==0)

def print_report(y_actual, y_pred, thresh):
    
    auc = roc_auc_score(y_actual, y_pred)
    accuracy = accuracy_score(y_actual, (y_pred > thresh))
    recall = recall_score(y_actual, (y_pred > thresh))
    precision = precision_score(y_actual, (y_pred > thresh))
    specificity = calc_specificity(y_actual, y_pred, thresh)
    print('AUC:%.3f'%auc)
    print('accuracy:%.3f'%accuracy)
    print('recall:%.3f'%recall)
    print('precision:%.3f'%precision)
    print('specificity:%.3f'%specificity)
    print('prevalence:%.3f'%calc_prevalence(y_actual))
    print('pred pos:%.3f'%(sum(y_pred > thresh)/len(y_actual)))
    print(' ')
    return auc, accuracy, recall, precision, specificity

Since we balanced the data, let’s set a threshold of 0.50 to label as predicted Warbler:

thresh = 0.5
print('train')
print_report(y_train, y_train_preds[:,0], thresh);
print('valid')
print_report(y_valid, y_valid_preds[:,0], thresh);
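One thing to keep in mind when reading the two reports: the training set was balanced to 50% warblers, while the validation set keeps the natural prevalence of about 15%. Recall and specificity do not depend on prevalence, but precision does. The sketch below works through that relationship with made-up recall and specificity values (0.8 each), which are not results from this model.

```python
def precision_from_rates(recall, specificity, prevalence):
    """Precision implied by recall, specificity and class prevalence."""
    true_pos = recall * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical classifier: recall 0.8, specificity 0.8
prec_balanced = precision_from_rates(0.8, 0.8, 0.50)
prec_natural = precision_from_rates(0.8, 0.8, 0.15)
print('%.3f %.3f' % (prec_balanced, prec_natural))
```

With identical recall and specificity, precision is noticeably lower at 15% prevalence than at 50%, so a drop in precision from train to validation is not by itself a sign of overfitting.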

We can plot the ROC curve with:

from sklearn.metrics import roc_curve, roc_auc_score

fpr_train, tpr_train, t_train = roc_curve(y_train, y_train_preds[:,0])
auc_train = roc_auc_score(y_train, y_train_preds[:,0])
fpr_valid, tpr_valid, t_valid = roc_curve(y_valid, y_valid_preds[:,0])
auc_valid = roc_auc_score(y_valid, y_valid_preds[:,0])

plt.plot(fpr_train, tpr_train, 'r-', label = 'Train AUC:%.3f'%auc_train)
plt.plot(fpr_valid, tpr_valid, 'b-', label = 'Valid AUC:%.3f'%auc_valid)
plt.plot([0,1],[0,1], 'k--')
plt.xlabel('FPR')
plt.ylabel('TPR')
plt.legend()
plt.show()
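The AUC has a handy interpretation: it is the probability that a randomly chosen warbler gets a higher score than a randomly chosen non-warbler, with ties counting as half. A brute-force NumPy sketch of that definition on four made-up scores:

```python
import numpy as np

def auc_by_pairs(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    y = np.asarray(y_true)
    s = np.asarray(scores, dtype=float)
    pos, neg = s[y == 1], s[y == 0]
    diff = pos[:, None] - neg[None, :]        # every positive vs every negative
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))

toy_auc = auc_by_pairs([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(toy_auc)
```

This pairwise comparison grows with the product of the two class sizes, so `roc_auc_score` remains the practical choice on real data.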

[Figure: ROC curves for the training and validation sets]

As we can see, the AUC of this simple model is quite high on the validation set. That gives me great hope of building a classifier to help me label bird images.

Let’s test it on a few of my own images:

file = 'magnolia2.png'
print(file)
x = load_img(file, target_size=(IMG_SIZE, IMG_SIZE),color_mode='rgb')
x= img_to_array(x)/255
x=x.reshape(1,IMG_SIZE,IMG_SIZE, 3)
print('prob it is warbler:%.3f'%model.predict(x,verbose = 1)[0][0])
plt.imshow(load_img(file))
plt.show()

I can correctly classify the magnolia warbler as a warbler and the Surf Scoter as not a warbler.

Conclusion

In this post, we built a simple CNN model to predict if a bird picture is a warbler!

