What is Optical character recognition?

06/21/2021

Optical character recognition (OCR) allows you to extract printed or handwritten text from images, such as photos of street signs and products, as well as from documents such as invoices, bills, financial reports, and articles. Microsoft's OCR technologies support extracting printed text in several languages. Follow a quickstart to get started.

This documentation contains the following types of articles:

The quickstarts are step-by-step instructions that let you make calls to the service and get results in a short period of time.
The how-to guides contain instructions for using the service in more specific or customized ways.

Read API

The Computer Vision Read API is Azure's latest OCR technology. It extracts printed text (in several languages), handwritten text (English only), digits, and currency symbols from images and multi-page PDF documents. It's optimized for text-heavy images and multi-page PDF documents with mixed languages, and it can detect both printed and handwritten text in the same image or document.
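As a rough sketch of what a call to the Read API can look like, the Python below builds the request that submits an image URL and then polls for the result. It assumes the v3.2 REST route of the Read 3.x family, the Ocp-Apim-Subscription-Key header, and an Operation-Location response header that points at the result. The function names, endpoint, and key are placeholders chosen for illustration, so check the quickstart for the version you actually use.

```python
import json
import time
import urllib.request

# Assumption: v3.2 of the Read 3.x REST API; adjust for your API version.
API_PATH = "/vision/v3.2/read/analyze"


def build_read_request(endpoint, key, image_url):
    """Build (but do not send) the POST request that submits an image URL."""
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint.rstrip("/") + API_PATH,
        data=body,
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def read_text(endpoint, key, image_url, poll_seconds=1.0, max_polls=30):
    """Submit the image, then poll the operation URL until it finishes."""
    request = build_read_request(endpoint, key, image_url)
    with urllib.request.urlopen(request) as resp:
        # The service accepts the job and returns where to fetch the result.
        operation_url = resp.headers["Operation-Location"]

    poll = urllib.request.Request(
        operation_url, headers={"Ocp-Apim-Subscription-Key": key}
    )
    for _ in range(max_polls):
        with urllib.request.urlopen(poll) as resp:
            result = json.load(resp)
        if result.get("status") in ("succeeded", "failed"):
            return result
        time.sleep(poll_seconds)
    raise TimeoutError("Read operation did not finish in time")
```

Splitting request construction from sending keeps the network-free part easy to inspect. Calling read_text requires a real Computer Vision resource endpoint and key.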

Figure: How OCR converts images and documents into structured output with extracted text

Input requirements

The Read call takes images and documents as its input. They have the following requirements:

Supported file formats: JPEG, PNG, BMP, PDF, and TIFF
For PDF and TIFF files, up to 2000 pages are processed (only the first two pages on the free tier).
The file size must be less than 50 MB (6 MB on the free tier), and the dimensions must be at least 50 x 50 pixels and at most 10000 x 10000 pixels.
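The requirements above can be checked locally before a file is sent. The sketch below is one way to do that in Python; the function name, the extension-to-format mapping, and the reading of 1 MB as 1024 x 1024 bytes are assumptions, not part of the service.

```python
import os

MB = 1024 * 1024  # assumption: the limits are measured in binary megabytes

# Assumed mapping from file extensions to the supported formats listed above.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".pdf", ".tif", ".tiff"}


def check_read_input(filename, size_bytes, width=None, height=None, free_tier=False):
    """Return a list of problems; an empty list means the input looks acceptable."""
    problems = []

    ext = os.path.splitext(filename)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file format: {ext or '(none)'}")

    limit_mb = 6 if free_tier else 50
    if size_bytes >= limit_mb * MB:
        problems.append(f"file must be smaller than {limit_mb} MB")

    # Pixel dimensions are only checked when the caller supplies them.
    if width is not None and height is not None:
        if width < 50 or height < 50:
            problems.append("dimensions must be at least 50 x 50 pixels")
        if width > 10000 or height > 10000:
            problems.append("dimensions must be at most 10000 x 10000 pixels")

    return problems
```

For example, check_read_input("scan.png", 7 * MB, 800, 600, free_tier=True) reports only the size problem, because a 7 MB file exceeds the 6 MB free-tier limit but not the 50 MB standard limit.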

Supported languages

The Read API supports a total of 73 languages for print-style text. Refer to the full list of OCR-supported languages. Handwritten-style OCR is supported only for English.

Key features

The Read API includes the following features:

Print text extraction in 73 languages
Handwritten text extraction in English
Text lines and words with location and confidence scores
No language identification required
Support for mixed languages and mixed mode (print and handwritten)
Select pages and page ranges from large, multi-page documents
Natural reading order for text lines
Handwriting classification for text lines
Available as a Distroless Docker container for on-premise deployment
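To make the "text lines and words with location and confidence scores" feature concrete, here is a small Python sketch that walks a result and pulls out words, their confidence, and their bounding boxes. The field names (analyzeResult, readResults, lines, words, boundingBox, confidence) are assumed to follow the Read 3.x JSON result shape, the sample payload is made up for illustration, and the helper names are hypothetical.

```python
# Illustrative payload only: the values are made up, and the field names
# are assumed to follow the Read 3.x JSON result shape.
SAMPLE_RESULT = {
    "status": "succeeded",
    "analyzeResult": {
        "readResults": [
            {
                "page": 1,
                "lines": [
                    {
                        "text": "Total 42",
                        "boundingBox": [10, 10, 120, 10, 120, 30, 10, 30],
                        "words": [
                            {"text": "Total", "confidence": 0.98,
                             "boundingBox": [10, 10, 70, 10, 70, 30, 10, 30]},
                            {"text": "42", "confidence": 0.61,
                             "boundingBox": [80, 10, 120, 10, 120, 30, 80, 30]},
                        ],
                    }
                ],
            }
        ]
    },
}


def iter_words(result):
    """Yield (page_number, word_text, confidence, bounding_box) for every word."""
    pages = result.get("analyzeResult", {}).get("readResults", [])
    for page in pages:
        for line in page.get("lines", []):
            for word in line.get("words", []):
                yield (page.get("page"), word["text"],
                       word.get("confidence"), word.get("boundingBox"))


def page_text(result):
    """Join each page's line texts in the order the service returned them."""
    pages = result.get("analyzeResult", {}).get("readResults", [])
    return {page.get("page"): "\n".join(line["text"] for line in page.get("lines", []))
            for page in pages}


def low_confidence_words(result, threshold=0.8):
    """Return the text of words whose confidence falls below the threshold."""
    return [text for _, text, conf, _ in iter_words(result)
            if conf is not None and conf < threshold]
```

A threshold such as 0.8 is an arbitrary choice here; flagging low-confidence words for human review is one common use of the scores.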

Learn how to use the OCR features.

Use the cloud API or deploy on-premise

The Read 3.x cloud APIs are the preferred option for most customers because they are easy to integrate and productive out of the box. Azure and the Computer Vision service handle scale, performance, data security, and compliance needs while you focus on meeting your customers' needs.

For on-premise deployment, the Read Docker container (preview) enables you to deploy the new OCR capabilities in your own local environment. Containers are well suited to deployments with specific security and data governance requirements.

Warning

The Computer Vision 2.0 RecognizeText operations are being deprecated in favor of the new Read API covered in this article. Existing customers should transition to using Read operations.

Data privacy and security

As with all of the Cognitive Services, developers using the Computer Vision service should be aware of Microsoft's policies on customer data. See the Cognitive Services page on the Microsoft Trust Center to learn more.
