Azure AI Document Intelligence: Paper to Digital

Modern businesses increasingly rely on artificial intelligence to build more capable, innovative and efficient software. Microsoft Azure AI services make it easier to build intelligent applications.

Azure AI services

Microsoft Azure offers a range of easy-to-use AI services. What’s fascinating is that no AI expertise is required to use them.

  • Document Intelligence: extracts information from documents. It has prebuilt models for national ID cards, driving licenses, invoices, contracts and more.
  • Content Safety: processes images and text using advanced algorithms to detect content that is potentially offensive, risky or undesirable.
  • Bot Service: a set of developer tools for building chatbots using C# and JavaScript. It comes with an easy-to-use graphical interface.
  • Speech: great for text to speech and speech to text. It supports a wide variety of languages and can also identify the language being spoken.
  • Vision: a unified service that can analyze images, read text, detect faces and much more.

The following programming languages are supported by most of the Azure AI services: C#, C++, Java, JavaScript, Objective-C, Swift and Python.

How the Document Intelligence service works

This service extracts information from documents such as images or PDF files. Beyond printed text, it can also detect handwritten text and read tables.
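To make the table support more concrete, here is a minimal sketch that turns one recognized table into a plain two-dimensional array. It assumes the table shape exposed by the @azure/ai-form-recognizer SDK (rowCount, columnCount, and a cells array with rowIndex, columnIndex and content), typically found in result.tables when using a layout-oriented model such as prebuilt-layout. The tableToRows helper and the sample data are hypothetical and only illustrate the idea.

```javascript
// Convert one recognized table into a 2D array of strings.
// Assumed input shape: { rowCount, columnCount, cells: [{ rowIndex, columnIndex, content }] }
function tableToRows(table) {
  // Start with an empty grid so cells the service did not return stay as "".
  const rows = Array.from({ length: table.rowCount }, () =>
    new Array(table.columnCount).fill("")
  );
  for (const cell of table.cells) {
    rows[cell.rowIndex][cell.columnIndex] = cell.content;
  }
  return rows;
}

// Hypothetical sample data standing in for result.tables[0].
const sampleTable = {
  rowCount: 2,
  columnCount: 2,
  cells: [
    { rowIndex: 0, columnIndex: 0, content: "Item" },
    { rowIndex: 0, columnIndex: 1, content: "Price" },
    { rowIndex: 1, columnIndex: 0, content: "Coffee" },
    { rowIndex: 1, columnIndex: 1, content: "2.50" },
  ],
};

console.log(tableToRows(sampleTable));
```

With a real analysis result, the same helper could be applied to each entry of result.tables.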

Azure provides prebuilt models for different document types, such as:

  • Identity card (national ID, driving license, or passport)
  • Health insurance card
  • Business card
  • Tax forms
  • Contracts
  • Invoices

You can also create your own custom models using Azure AI Studio.

Microsoft offers SDKs for its AI services. For example, you can use the @azure/ai-form-recognizer module in your Node.js projects.

AI services take some time to process the data they receive, so you should write asynchronous code.
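The reason asynchronous code matters is that an analysis is a long-running operation: the client starts it, then checks repeatedly until the result is ready. The SDK's poller takes care of this for you. The sketch below is not the SDK's implementation, only an illustration of the general pattern. The pollUntilDone and createFakeOperation functions are hypothetical, and the fake operation simply reports completion on its third check.

```javascript
// Generic polling loop: call checkStatus() until it reports completion.
async function pollUntilDone(checkStatus, intervalMs = 1000, maxAttempts = 30) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await checkStatus();
    if (status.done) {
      return status.result;
    }
    // Wait before asking again so the service is not flooded with requests.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Operation did not finish after ${maxAttempts} attempts`);
}

// Hypothetical stand-in for a remote analysis that completes on the third check.
function createFakeOperation() {
  let calls = 0;
  return async () => {
    calls += 1;
    return calls < 3
      ? { done: false }
      : { done: true, result: "analysis finished" };
  };
}

pollUntilDone(createFakeOperation(), 10).then((result) => console.log(result));
```

Because the function returns a promise, the rest of the application stays responsive while the document is being processed.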

How Azure Document Intelligence AI service works.

It’s worth noting that this service was formerly known as Azure Form Recognizer.

Azure AI Studio

Suppose you want to use the Document Intelligence service to extract text, but none of the prebuilt models fits your documents.

For example, you may want to extract and process information from payslips. There’s no prebuilt model for that, and even if there were, payslips differ from one country to another.

Azure AI Studio lets you create and train custom models through an easy-to-use interface. It works with several AI services, not only Document Intelligence. You can even use it to create your own ChatGPT-like solution.

Example of labeling data for a custom Document Intelligence model using Azure AI Studio.

Document Intelligence Studio can be used to create two types of custom models:

  • Extraction model, useful for extracting specific data from documents and forms.
  • Classification model, useful for splitting and classifying different documents.
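Once a custom extraction model is trained, it is called from code in the same way as a prebuilt one, by passing its model ID instead of a prebuilt name. The sketch below assumes a client that exposes beginAnalyzeDocumentFromUrl, as DocumentAnalysisClient does. The analyzeWithModel helper, the "my-payslip-model" ID, the field names and the fake client are all hypothetical and exist only so the example runs without an Azure resource.

```javascript
// Analyze a document with any model ID (prebuilt or custom) and summarize the result.
async function analyzeWithModel(client, modelId, documentUrl) {
  const poller = await client.beginAnalyzeDocumentFromUrl(modelId, documentUrl);
  const result = await poller.pollUntilDone();
  return (result.documents ?? []).map((doc) => ({
    docType: doc.docType,
    fieldNames: Object.keys(doc.fields ?? {}),
  }));
}

// Hypothetical stand-in for DocumentAnalysisClient so the sketch runs offline.
const fakeClient = {
  async beginAnalyzeDocumentFromUrl(modelId, documentUrl) {
    return {
      async pollUntilDone() {
        return {
          documents: [{ docType: modelId, fields: { EmployeeName: {}, NetPay: {} } }],
        };
      },
    };
  },
};

analyzeWithModel(fakeClient, "my-payslip-model", "https://example.com/payslip.pdf")
  .then((summary) => console.log(summary));
```

In a real project, fakeClient would be replaced by a DocumentAnalysisClient instance and the model ID by the one shown in the studio after training.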

Technical Demonstration

We will write code that extracts information from driving licenses and test it with the following image. Note that this is not a real driving license; it’s a French sample.

Example of a French driving license.

Deploy needed resources

This demonstration requires an Azure AI service resource inside a resource group. Use the following commands to deploy both.

az group create --name AzureAiDemo --location northeurope
az cognitiveservices account create --kind FormRecognizer --location northeurope --name driving-license-parser --resource-group AzureAiDemo --sku F0 

Writing the code

Using Node.js, create a project folder with an index.js file and install the @azure/ai-form-recognizer module:

mkdir azure-ai-example
cd azure-ai-example
touch index.js
npm install @azure/ai-form-recognizer

Here’s the code:

const { AzureKeyCredential, DocumentAnalysisClient } = require("@azure/ai-form-recognizer");

(async () => {
  // Replace the placeholders with the key and endpoint of your Azure AI resource.
  const key = "<ACCESS_KEY>";
  const endpoint = "<COGNITIVE_SERVICE_ENDPOINT>";
  const documentUrl = "https://azurehacks.io/wp-content/uploads/2024/02/permis-conduire.jpg";

  const client = new DocumentAnalysisClient(endpoint, new AzureKeyCredential(key));

  // Analysis is a long-running operation: start it, then wait for the result.
  const poller = await client.beginAnalyzeDocumentFromUrl("prebuilt-idDocument", documentUrl);
  const result = await poller.pollUntilDone();

  const document = result.documents?.[0];
  if (!document) {
    throw new Error("No identity document was recognized in the image.");
  }

  // Optional chaining prints undefined instead of crashing when a field is missing.
  const { fields } = document;
  console.log("First name", fields["FirstName"]?.value);
  console.log("Last name", fields["LastName"]?.value);
  console.log("Date of birth", fields["DateOfBirth"]?.value);
  console.log("Place of birth", fields["PlaceOfBirth"]?.value);
  console.log("Document number", fields["DocumentNumber"]?.value);
  console.log("Date of issue", fields["DateOfIssue"]?.value);
  console.log("Date of expiration", fields["DateOfExpiration"]?.value);
  console.log("Vehicle classifications", fields["VehicleClassifications"]?.value);
})().catch((error) => {
  // Report network, authentication or recognition errors and exit with a failure code.
  console.error("Analysis failed:", error);
  process.exitCode = 1;
});
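Each recognized field also carries a confidence score, so a real application may want to ignore values the service is unsure about. The following sketch assumes fields shaped like those returned by the SDK (value, content and confidence properties). The extractFields helper, the 0.5 threshold and the sample values are hypothetical and not taken from the driving license above.

```javascript
// Pick the requested fields, keeping only values at or above a confidence threshold.
// Missing fields, and fields without a confidence score, are returned as null.
function extractFields(fields, names, minConfidence = 0.5) {
  const extracted = {};
  for (const name of names) {
    const field = fields[name];
    if (!field || (field.confidence ?? 0) < minConfidence) {
      extracted[name] = null;
      continue;
    }
    // Prefer the parsed value and fall back to the raw text content.
    extracted[name] = field.value ?? field.content ?? null;
  }
  return extracted;
}

// Hypothetical sample data standing in for result.documents[0].fields.
const sampleFields = {
  FirstName: { value: "Jane", confidence: 0.98 },
  LastName: { value: "Doe", confidence: 0.2 },
  DocumentNumber: { content: "X000000", confidence: 0.9 },
};

console.log(
  extractFields(sampleFields, ["FirstName", "LastName", "DocumentNumber", "DateOfBirth"])
);
```

The right threshold depends on the use case: a value that will be reviewed by a person can tolerate a lower score than one that feeds an automated decision.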

Testing the code

Use the following command to run the script:

node index.js

When the script runs, it prints the data recognized from the driving license image.

Conclusion

To-do
