
Plan for Deep Learning in .NET · Issue #5918 · dotnet/machinelearning · GitHub

 3 years ago
source link: https://github.com/dotnet/machinelearning/issues/5918

Plan for Deep Learning in .NET

This past year we've been working on our plan for deep learning (DL) in .NET, which is outlined in this issue.

We came up with this plan by looking at recent deep learning trends, staying in line with current Microsoft strategy around deep learning, and gathering a variety of customer evidence (which you can read about in the Customer evidence section below).

Our goal is to execute this plan by the end of 2022.

Feedback

We would love to hear your feedback about our plan and whether this will fulfill your requirements for deep learning in .NET!

You can leave your feedback directly on this issue.

The plan

The main stages of the plan are:

  1. Make it easier to consume ONNX models in ML.NET using the ONNX Runtime (RT)
  2. Fully support and productionize TorchSharp for building neural networks in .NET
  3. Build a bridge between TorchSharp and ML.NET

As part of this plan, ML.NET will continue to become more high-level and scenario-focused, while TorchSharp will power the deep learning training and features in .NET.

As we iterate and work on the plan, we will make updates to this issue, including linking to relevant issues and PRs.

Stage 1: Make it easier to consume ONNX models in ML.NET with the ONNX RT

In ML.NET, you can already consume pre-trained ONNX models via the ONNX RT.

However, we have seen many issues from customers around the complexity of ONNX model consumption in ML.NET, particularly when trying to figure out the inputs and outputs of the ONNX model (which are required for consumption in ML.NET).

So, we plan to:

  • Expose internal functionality to get the input and output schema from ONNX models (#5917).
  • Add step-by-step documentation on how to get the inputs and outputs of any ONNX model and then how to consume that ONNX model in ML.NET. We will try to cover a variety of scenarios (#25863).
  • Partner with the ONNX RT team to bring relevant ONNX RT advancements to ML.NET.
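
As a rough illustration of the input/output problem described above, here is a minimal sketch that reads a model's declared inputs and outputs with the ONNX Runtime C# API and passes those names to ML.NET's `ApplyOnnxModel`. It is not the functionality planned in #5917. It assumes the `Microsoft.ML`, `Microsoft.ML.OnnxTransformer`, and `Microsoft.ML.OnnxRuntime` NuGet packages are installed, and `model.onnx` is a hypothetical path standing in for your own model.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.OnnxRuntime;

// Hypothetical path (assumption): replace with your own ONNX model file.
const string modelPath = "model.onnx";

// Open the model with ONNX Runtime to read its declared inputs and outputs.
using var session = new InferenceSession(modelPath);

foreach (var input in session.InputMetadata)
{
    // Dimensions can be null for non-tensor inputs, so fall back to an empty array.
    var dims = string.Join(",", input.Value.Dimensions ?? Array.Empty<int>());
    Console.WriteLine($"input  {input.Key}: {input.Value.ElementType} [{dims}]");
}

foreach (var output in session.OutputMetadata)
{
    var dims = string.Join(",", output.Value.Dimensions ?? Array.Empty<int>());
    Console.WriteLine($"output {output.Key}: {output.Value.ElementType} [{dims}]");
}

// The names discovered above are the column names ML.NET expects.
var mlContext = new MLContext();
var estimator = mlContext.Transforms.ApplyOnnxModel(
    outputColumnNames: session.OutputMetadata.Keys.ToArray(),
    inputColumnNames: session.InputMetadata.Keys.ToArray(),
    modelFile: modelPath);
```

To fit and use the resulting estimator, you would still need an input data class whose column names, element types, and vector sizes match what the model reports.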

Stage 2: Fully support and productionize TorchSharp

TorchSharp is an API that provides .NET bindings for the PyTorch engine.

TorchSharp will be the API for neural network training and inference in .NET; the custom deep learning training that you can do in PyTorch, you will be able to do in .NET with TorchSharp.

As mentioned in the TorchSharp repo, TorchSharp will retain Python naming conventions. This makes it very easy to port PyTorch code to TorchSharp as well as to bring Python examples to .NET.
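
To give a sense of what retaining Python naming conventions looks like, here is a small sketch of a single training step in TorchSharp, with the roughly equivalent PyTorch line shown in a comment above each call. It assumes a recent TorchSharp release plus a native backend package such as `TorchSharp-cpu`. Exact names and overloads have changed between TorchSharp versions, so treat the calls as illustrative rather than guaranteed for every release. The tensor sizes and learning rate are arbitrary.

```csharp
using System;
using TorchSharp;
using static TorchSharp.torch;

// PyTorch: model = torch.nn.Linear(10, 1)
var model = nn.Linear(10, 1);

// PyTorch: optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
var optimizer = optim.SGD(model.parameters(), 0.01);

// PyTorch: x = torch.randn(32, 10); y = torch.randn(32, 1)
// Random data, used only to make the sketch self-contained.
var x = randn(32, 10);
var y = randn(32, 1);

// PyTorch: loss = torch.nn.functional.mse_loss(model(x), y)
// TorchSharp calls forward explicitly, keeping the lowercase Python name.
var prediction = model.forward(x);
var loss = nn.functional.mse_loss(prediction, y);

// PyTorch: optimizer.zero_grad(); loss.backward(); optimizer.step()
optimizer.zero_grad();
loss.backward();
optimizer.step();

Console.WriteLine($"loss after one step: {loss.item<float>()}");
```

The lowercase, snake_case members (`forward`, `zero_grad`, `mse_loss`) are what distinguish this layer from the higher-level ML.NET APIs in Stage 3, which are meant to follow .NET naming conventions instead.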

In order to bring TorchSharp to production, we will:

  • Move TorchSharp from the Xamarin repo to the .NET Foundation repo
  • Add missing features and optimizers from PyTorch 1.6-1.9
  • Update infrastructure for signed builds, tests, and SDLs

Ongoing support work will include:

  • Issue and PR management of the repo
  • Reacting to new versions of PyTorch (~2x a year)

Stage 3: Build a bridge between TorchSharp and ML.NET

Once TorchSharp is fully productionized, it will power higher-level ML.NET deep learning APIs which will follow .NET naming conventions and "feel" like .NET.

We will surface the following through ML.NET (powered by TorchSharp):

  • Scenario-focused APIs for transfer learning, like the Image Classification API; there will be an API for each scenario, like Object Detection and NLP
  • Generic transfer learning API for custom scenarios that don't fit the scenario-focused APIs
  • Simplified ML.NET APIs and tooling for building neural networks from scratch (this will take a bit more investigation and planning)

Here is a diagram which visually outlines the transfer learning part of the plan:

Out of scope

For this plan, porting the PyTorch domain libraries (TorchVision, TorchAudio, and TorchText) is currently out of scope.

Customer evidence

After talking to a variety of .NET customers, we discovered the following:

  • Lack of deep learning support is a top pain point/blocker for using ML.NET
  • Customers are forced to turn to Python-based frameworks (mainly TensorFlow, Keras, and PyTorch) for training DL models when their scenario is not covered by ML.NET
  • Customers have a variety of DL scenarios, the most popular of which include Image Classification, Named Entity Recognition (NER), NLP-based scenarios, and Object Detection
  • Almost all customers said they prefer their E2E machine learning workflow, from training to consumption, to be in .NET, and they don’t care about the underlying framework ML.NET uses for training deep learning models
  • A majority of customers we talked to want the ability to build neural networks from scratch using .NET
  • Several customers said adding transfer learning for their scenario would unblock them
  • One customer we talked to needed better support for consuming pre-trained models, a need that is also evident in the number of issues filed around consuming ONNX models
  • Customers were split in terms of tooling and automation: some prefer a code-first / API approach when building neural networks from scratch, while some prefer tooling.
  • Customers were also split in terms of automation when training deep learning models: some prefer no automation and full control when training, some want full automation, and some want a combination of automation with a lot of control

Additionally, the following issues filed in the repo add further evidence for backing up our deep learning plan:

Image segmentation / regression

Local object detection

Torch support

ONNX consumption

Other

