
How PyTorch lets you build and experiment with a neural net


We show, step-by-step, a simple example of building a classifier neural network in PyTorch and highlight how easy it is to experiment with advanced concepts such as custom layers and activation functions.

Introduction

Deep learning (DL) is hot. It is en vogue. And it has cool tools to play with.


Although scores of DL practitioners started their journey with TensorFlow, PyTorch has become an equally popular deep learning framework since it was introduced by the Facebook AI Research (FAIR) team in early 2017. It has since caught the attention of AI researchers and practitioners around the world and has matured significantly.

In essence, PyTorch gives the programmer tremendous flexibility in how tensors are created, combined, and processed as they flow through a network (called a computational graph), paired with a relatively high-level, object-oriented API.

Raw TensorFlow, of course, provides a similar level of low-level flexibility, but it is often difficult to master and troubleshoot.

And to boot, PyTorch provides robust yet simple API methods for automatic differentiation, which make running the essential backpropagation flow a breeze.

See its core creator, Soumith Chintala, talk about its origin and evolution.

The following article actually does a great job of distilling the essentials of PyTorch and its key differences from the other highly popular framework, Keras/TensorFlow.

In the present article, we show a simple step-by-step process for building a 2-layer, densely connected neural network classifier in PyTorch, thereby elucidating some of its key features and styles.

PyTorch provides tremendous flexibility to a programmer about how to create, combine, and process tensors as they flow through a network…

Core components

The core components of PyTorch that will be used for building the neural classifier are,

  • The Tensor (the central data structure in PyTorch)
  • The Autograd feature of the Tensor (automatic differentiation baked into the class)
  • The nn.Module class, which is used to build any custom neural classifier class
  • The Optimizer (of course, there are many of them to choose from)
  • The Loss function (a wide selection is available to choose from)


Using these components, we will build the classifier in five simple steps,

  • Construct our neural network as our custom class (inherited from the nn.Module class), complete with hidden layer tensors and a forward method for propagating the input tensor through various layers and activation functions
  • Propagate the feature tensor (from a dataset) through the network using this forward method — say, we get an output tensor as a result
  • Calculate the loss by comparing the output to the ground truth and using built-in loss functions
  • Propagate the gradient of the loss back through the network using the automatic differentiation ability (Autograd) with the backward method
  • Update the weights of the network using the gradient of the loss — this is accomplished by executing one step of the so-called optimizer, i.e. optimizer.step().

And that’s it. This five-step process constitutes one complete epoch of training. We just repeat it a bunch of times to drive down the loss and obtain high classification accuracy.
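
To make the five steps concrete, here is a minimal sketch of what such a training loop could look like. The tiny random dataset and the placeholder nn.Linear model below are purely illustrative stand-ins (not the article's code); the actual classifier class and data are built later in the article.

```python
import torch
import torch.nn as nn

# Illustrative setup: a tiny random dataset and a placeholder linear model.
X = torch.randn(100, 2)                  # 100 samples, 2 features
y = torch.randint(0, 2, (100,))          # binary class labels (0 or 1)
model = nn.Linear(2, 2)                  # placeholder model
criterion = nn.CrossEntropyLoss()        # built-in loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(100):                 # repeat the five-step process many times
    optimizer.zero_grad()                # clear gradients from the previous iteration
    output = model(X)                    # step 2: forward pass through the network
    loss = criterion(output, y)          # step 3: compute the loss vs. the ground truth
    loss.backward()                      # step 4: backpropagate with Autograd
    optimizer.step()                     # step 5: update the weights
```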


The five-step process with the five core components.

In PyTorch, we define a neural network as a custom class and can thereby reap the full benefits of the Object-Oriented Programming (OOP) paradigm.

The Tensor

A torch.Tensor is a multi-dimensional matrix containing elements of a single data type. It is the central data structure of the framework. We can create a Tensor from NumPy arrays or Python lists and perform various operations on it, such as indexing, mathematics, and linear algebra.

Tensors support some additional enhancements that make them unique. Apart from the CPU, they can be loaded onto a GPU (with an extremely simple code change) for faster computation. And they support forming a backward graph that tracks every operation applied to them, so that gradients can be calculated using a dynamic computation graph (DCG).
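
As a small, hedged illustration (the values are arbitrary), here is how tensor creation, basic operations, and the one-line GPU move might look:

```python
import numpy as np
import torch

# Create tensors from a Python list and from a NumPy array
a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.from_numpy(np.array([[5.0, 6.0], [7.0, 8.0]], dtype=np.float32))

# Indexing, math, and linear algebra work much like NumPy
print(a[0, 1])     # tensor(2.)
print(a + b)       # element-wise addition
print(a @ b)       # matrix multiplication

# Move a tensor to the GPU with a one-line change (if CUDA is available)
if torch.cuda.is_available():
    a = a.to("cuda")
```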

Read the official documentation on Tensors here. Or, watch this excellent video introduction.

The Autograd

We are all bad at calculus when it comes to a complex neural network. The high-dimensional space messes with our minds. Fortunately, there is Autograd to save us.

To deal with hyper-planes in a 14-dimensional space, visualize a 3-D space and say ‘fourteen’ to yourself very loudly. Everyone does it — Geoffrey Hinton

The Tensor object supports the magical Autograd feature, i.e. automatic differentiation, which is achieved by tracking and storing all the operations performed on the Tensor while it flows through a network. You can watch this wonderful tutorial video for a visual explanation.
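
A toy example (a simple scalar function rather than a full network) gives a sense of how this works:

```python
import torch

# Ask PyTorch to track operations on x so gradients can be computed automatically
x = torch.tensor([2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()   # a toy scalar function: y = x1^2 + x2^2

y.backward()         # run automatic differentiation through the recorded graph
print(x.grad)        # dy/dx = 2*x  ->  tensor([4., 6.])
```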

The nn.Module class

In PyTorch, we construct a neural network by defining it as a custom class. However, instead of deriving from the native Python object, this class inherits from the nn.Module class. This imbues the neural net class with useful properties and powerful methods. We will see a full example of such a class definition later in the article.

The Loss Function

Loss functions define how far the prediction of the neural net is from the ground truth, and this quantitative measure of loss helps drive the network closer to the configuration that classifies the given dataset best.

PyTorch offers all the usual loss functions for classification and regression tasks —

  • binary and multi-class cross-entropy,
  • mean squared and mean absolute errors,
  • smooth L1 loss,
  • negative log-likelihood loss, and even
  • Kullback-Leibler divergence.

A detailed discussion of these can be found in this article.
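
To make the idea concrete, here is a small hedged example using the built-in cross-entropy loss on made-up scores and labels:

```python
import torch
import torch.nn as nn

# Raw (unnormalized) scores for 3 samples over 2 classes, plus the true labels
logits = torch.tensor([[2.0, 0.5], [0.1, 1.5], [1.2, 1.1]])
targets = torch.tensor([0, 1, 1])

criterion = nn.CrossEntropyLoss()   # multi-class cross-entropy, applied to raw logits
loss = criterion(logits, targets)
print(loss.item())                  # a single scalar measuring how wrong the predictions are
```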

The Optimizer

Optimization of the weights to achieve the lowest loss is at the heart of the backpropagation algorithm for training a neural network. PyTorch offers a plethora of optimizers to do the job, exposed through the torch.optim module —

  • Stochastic gradient descent (SGD),
  • Adam, Adadelta, Adagrad, SparseAdam,
  • L-BFGS,
  • RMSprop, etc.
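
Constructing an optimizer is a one-liner; the sketch below uses a placeholder model and an arbitrary learning rate purely for illustration:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)   # placeholder model, just to have some parameters

# Any optimizer in torch.optim is constructed from the model's parameters
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

optimizer.zero_grad()      # clear old gradients before each iteration
# ... the forward pass and loss.backward() would go here ...
optimizer.step()           # apply one update step to the weights
```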

“The five-step process constitutes one complete epoch of training. We just repeat it a bunch of times.”

The neural net class and training

The data

For this example task, we first create some synthetic data with binary classes using a Scikit-learn function. The data classes are distinguished by colors in the following plot. It is clear that the dataset is not separable by a simple linear classifier, and a neural net is therefore an appropriate machine learning tool for this problem.


The synthetic dataset used for the classification example
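
The exact Scikit-learn generator used in the original article is not shown here; make_moons, for instance, produces a similarly non-linearly-separable binary dataset and serves as a reasonable stand-in:

```python
import torch
from sklearn.datasets import make_moons

# make_moons yields two interleaving, non-linearly-separable classes;
# the article's exact generator and settings may differ
X, y = make_moons(n_samples=500, noise=0.2, random_state=42)

# Convert the NumPy arrays to PyTorch tensors for training
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.long)
print(X.shape, y.shape)   # torch.Size([500, 2]) torch.Size([500])
```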

The architecture

We chose a simple fully-connected, 2-hidden-layer architecture for this demo. It is shown below,

The fully-connected, 2-hidden-layer network architecture

The class definition

We define variables corresponding to this architecture and then the main class. The neural net class definition is shown below. As stated earlier, it inherits from the nn.Module base class.
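
The original code listing is not reproduced here; the following is a hedged sketch of what such a class definition could look like for a 2-hidden-layer, fully-connected architecture, with illustrative layer sizes and ReLU activations (the article's actual choices may differ):

```python
import torch.nn as nn
import torch.nn.functional as F

# Architecture variables (illustrative sizes; the original article's values may differ)
n_input, n_hidden1, n_hidden2, n_output = 2, 8, 4, 2

class Network(nn.Module):
    def __init__(self):
        super().__init__()
        # Two fully-connected hidden layers plus an output layer
        self.hidden1 = nn.Linear(n_input, n_hidden1)
        self.hidden2 = nn.Linear(n_hidden1, n_hidden2)
        self.output = nn.Linear(n_hidden2, n_output)

    def forward(self, x):
        # Propagate the input tensor through the layers and activation functions
        x = F.relu(self.hidden1(x))
        x = F.relu(self.hidden2(x))
        return self.output(x)

model = Network()
print(model)   # prints the layer structure of the network
```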

