
How PyTorch lets you build and experiment with a neural net


We show, step-by-step, a simple example of building a classifier neural network in PyTorch and highlight how easy it is to experiment with advanced concepts such as custom layers and activation functions.

Introduction

Deep learning (DL) is hot. It is en vogue. And it has cool tools to play with.


Although scores of DL practitioners started their journey with TensorFlow, PyTorch has become an equally popular deep learning framework since it was introduced by the Facebook AI Research (FAIR) team in early 2017. It has since caught the attention of AI researchers and practitioners around the world and has matured significantly.

In essence, PyTorch gives the programmer tremendous flexibility in how tensors are created, combined, and processed as they flow through a network (called a computational graph), paired with a relatively high-level, object-oriented API.

Raw TensorFlow, of course, provides a similar level of low-level flexibility, but it is often difficult to master and troubleshoot.

And to boot, PyTorch provides robust yet simple API methods for automatic differentiation, which make running the essential backpropagation flow a breeze.

See its core creator, Soumith Chintala, talk about its origin and evolution.

The following article actually does a great job of distilling the essentials of PyTorch and its key differences from the other highly popular framework, Keras/TensorFlow.

In the present article, we show a simple step-by-step process for building a 2-layer, densely connected neural network classifier in PyTorch, thereby elucidating some of its key features and styles.

PyTorch provides tremendous flexibility to a programmer about how to create, combine, and process tensors as they flow through a network…

Core components

The core components of PyTorch that will be used for building the neural classifier are,

  • The Tensor (the central data structure in PyTorch)
  • The Autograd feature of the Tensor (automatic differentiation baked into the class)
  • The nn.Module class, which is used to build any custom neural classifier class
  • The Optimizer (of course, there are many of them to choose from)
  • The Loss function (a wide selection is available to choose from)


Using these components, we will build the classifier in five simple steps,

  • Construct our neural network as our custom class (inherited from the nn.Module class), complete with hidden layer tensors and a forward method for propagating the input tensor through various layers and activation functions
  • Propagate the feature tensor (from a dataset) through the network using this forward method — say, we get an output tensor as a result
  • Calculate the loss by comparing the output to the ground truth and using built-in loss functions
  • Propagate the gradient of the loss back through the network using the automatic differentiation ability (Autograd) with the backward method
  • Update the weights of the network using the gradient of the loss — this is accomplished by executing one step of the so-called optimizer, i.e. optimizer.step().

And that’s it. This five-step process constitutes one complete epoch of training. We just repeat it a bunch of times to drive down the loss and obtain high classification accuracy.
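
To make the five steps concrete, here is a minimal sketch of what such a training loop could look like. The tiny random dataset and the placeholder nn.Linear model below are purely illustrative stand-ins (not the article's code); the actual classifier class and data are built later in the article.

```python
import torch
import torch.nn as nn

# Illustrative setup: a tiny random dataset and a placeholder linear model.
X = torch.randn(100, 2)                  # 100 samples, 2 features
y = torch.randint(0, 2, (100,))          # binary class labels (0 or 1)
model = nn.Linear(2, 2)                  # placeholder model
criterion = nn.CrossEntropyLoss()        # built-in loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(100):                 # repeat the five-step process many times
    optimizer.zero_grad()                # clear gradients from the previous iteration
    output = model(X)                    # step 2: forward pass through the network
    loss = criterion(output, y)          # step 3: compute the loss vs. the ground truth
    loss.backward()                      # step 4: backpropagate with Autograd
    optimizer.step()                     # step 5: update the weights
```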


The five-step process with the five core components.

In PyTorch, we define a neural network as a custom class and can thereby reap the full benefits of the Object-Oriented Programming (OOP) paradigm.

The Tensor

A torch.Tensor is a multi-dimensional matrix containing elements of a single data type. It is the central data structure of the framework. We can create a Tensor from NumPy arrays or Python lists and perform various operations on it, such as indexing, mathematics, and linear algebra.

Tensors support some additional enhancements that make them unique. Apart from the CPU, they can be loaded onto a GPU (with an extremely simple code change) for faster computation. And they support forming a backward graph that tracks every operation applied to them, so that gradients can be calculated using a dynamic computation graph (DCG).
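
As a small, hedged illustration (the values are arbitrary), here is how tensor creation, basic operations, and the one-line GPU move might look:

```python
import numpy as np
import torch

# Create tensors from a Python list and from a NumPy array
a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.from_numpy(np.array([[5.0, 6.0], [7.0, 8.0]], dtype=np.float32))

# Indexing, math, and linear algebra work much like NumPy
print(a[0, 1])     # tensor(2.)
print(a + b)       # element-wise addition
print(a @ b)       # matrix multiplication

# Move a tensor to the GPU with a one-line change (if CUDA is available)
if torch.cuda.is_available():
    a = a.to("cuda")
```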

Read the official documentation on Tensors here. Or, watch this excellent video introduction.

The Autograd

We are all bad at calculus when it comes to a complex neural network. The high-dimensional space messes with our minds. Fortunately, there is Autograd to save us.

To deal with hyper-planes in a 14-dimensional space, visualize a 3-D space and say ‘fourteen’ to yourself very loudly. Everyone does it — Geoffrey Hinton

The Tensor object supports the magical Autograd feature, i.e. automatic differentiation, which is achieved by tracking and storing all the operations performed on the Tensor while it flows through a network. You can watch this wonderful tutorial video for a visual explanation.
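
A toy example (a simple scalar function rather than a full network) gives a sense of how this works:

```python
import torch

# Ask PyTorch to track operations on x so gradients can be computed automatically
x = torch.tensor([2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()   # a toy scalar function: y = x1^2 + x2^2

y.backward()         # run automatic differentiation through the recorded graph
print(x.grad)        # dy/dx = 2*x  ->  tensor([4., 6.])
```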

The nn.Module class

In PyTorch, we construct a neural network by defining it as a custom class. However, instead of deriving from the native Python object, this class inherits from the nn.Module class. This imbues the neural net class with useful properties and powerful methods. We will see a full example of such a class definition later in the article.

The Loss Function

Loss functions define how far the prediction of the neural net is from the ground truth, and this quantitative measure of loss helps drive the network closer to the configuration that classifies the given dataset best.

PyTorch offers all the usual loss functions for classification and regression tasks —

  • binary and multi-class cross-entropy,
  • mean squared and mean absolute errors,
  • smooth L1 loss,
  • negative log-likelihood loss, and even
  • Kullback-Leibler divergence.

A detailed discussion of these can be found in this article.
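
To make the idea concrete, here is a small hedged example using the built-in cross-entropy loss on made-up scores and labels:

```python
import torch
import torch.nn as nn

# Raw (unnormalized) scores for 3 samples over 2 classes, plus the true labels
logits = torch.tensor([[2.0, 0.5], [0.1, 1.5], [1.2, 1.1]])
targets = torch.tensor([0, 1, 1])

criterion = nn.CrossEntropyLoss()   # multi-class cross-entropy, applied to raw logits
loss = criterion(logits, targets)
print(loss.item())                  # a single scalar measuring how wrong the predictions are
```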

The Optimizer

Optimization of the weights to achieve the lowest loss is at the heart of the backpropagation algorithm for training a neural network. PyTorch offers a plethora of optimizers to do the job, exposed through the torch.optim module —

  • Stochastic gradient descent (SGD),
  • Adam, Adadelta, Adagrad, SparseAdam,
  • L-BFGS,
  • RMSprop, etc.
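
Constructing an optimizer is a one-liner; the sketch below uses a placeholder model and an arbitrary learning rate purely for illustration:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)   # placeholder model, just to have some parameters

# Any optimizer in torch.optim is constructed from the model's parameters
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

optimizer.zero_grad()      # clear old gradients before each iteration
# ... the forward pass and loss.backward() would go here ...
optimizer.step()           # apply one update step to the weights
```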

“The five-step process constitutes one complete epoch of training. We just repeat it a bunch of times.”

The neural net class and training

The data

For this example task, we first create some synthetic data with binary classes using a Scikit-learn function. The data classes are distinguished by colors in the following plot. It is clear that the dataset is not separable by a simple linear classifier, and a neural net is therefore an appropriate machine learning tool for this problem.


The synthetic dataset used for the classification example
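
The exact Scikit-learn generator used in the original article is not shown here; make_moons, for instance, produces a similarly non-linearly-separable binary dataset and serves as a reasonable stand-in:

```python
import torch
from sklearn.datasets import make_moons

# make_moons yields two interleaving, non-linearly-separable classes;
# the article's exact generator and settings may differ
X, y = make_moons(n_samples=500, noise=0.2, random_state=42)

# Convert the NumPy arrays to PyTorch tensors for training
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.long)
print(X.shape, y.shape)   # torch.Size([500, 2]) torch.Size([500])
```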

The architecture

We chose a simple fully-connected, 2-hidden-layer architecture for this demo. It is shown below,

The fully-connected, 2-hidden-layer network architecture

The class definition

We define variables corresponding to this architecture and then the main class. The neural net class definition is shown below. As stated earlier, it inherits from the nn.Module base class.
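
The original code listing is not reproduced here; the following is a hedged sketch of what such a class definition could look like for a 2-hidden-layer, fully-connected architecture, with illustrative layer sizes and ReLU activations (the article's actual choices may differ):

```python
import torch.nn as nn
import torch.nn.functional as F

# Architecture variables (illustrative sizes; the original article's values may differ)
n_input, n_hidden1, n_hidden2, n_output = 2, 8, 4, 2

class Network(nn.Module):
    def __init__(self):
        super().__init__()
        # Two fully-connected hidden layers plus an output layer
        self.hidden1 = nn.Linear(n_input, n_hidden1)
        self.hidden2 = nn.Linear(n_hidden1, n_hidden2)
        self.output = nn.Linear(n_hidden2, n_output)

    def forward(self, x):
        # Propagate the input tensor through the layers and activation functions
        x = F.relu(self.hidden1(x))
        x = F.relu(self.hidden2(x))
        return self.output(x)

model = Network()
print(model)   # prints the layer structure of the network
```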

