Modern businesses increasingly rely on artificial intelligence to build software that is more capable, more innovative, and more efficient. Microsoft Azure AI services make it easier to build intelligent applications.
Azure AI services
Microsoft Azure offers a range of easy-to-use AI services. What’s fascinating is that no AI expertise is required to use them.
- Document Intelligence: extracts information from documents. It includes prebuilt models for national ID cards, driving licenses, invoices, contracts, and more.
- Content safety: processes images and text using advanced algorithms to detect content that is potentially offensive, risky, or undesirable.
- Bot service: a set of developer tools for building chatbots using C# and JavaScript. It comes with an easy-to-use graphical interface.
- Speech: converts text to speech and speech to text, and supports a wide variety of languages. It can also identify the language spoken in an audio input.
- Vision: a unified service that can analyze images, read text, detect faces, and much more.
The following programming languages are supported by most of the Azure AI services: C#, C++, Java, JavaScript, Objective-C, Swift and Python.
How Document Intelligence service works
This service extracts information from documents such as images or PDF files. It can detect handwritten text, and it can read tables as well.
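To make the table support more concrete, here is a minimal sketch that turns one extracted table into a plain two-dimensional array. It assumes the result shape exposed by the @azure/ai-form-recognizer SDK, where each table carries rowCount, columnCount, and a list of cells with rowIndex, columnIndex, and content. The tableToRows helper and the sample data are hypothetical and only serve as an illustration.

```javascript
// Hypothetical helper: converts one table from an analysis result into rows.
// Assumes each cell exposes rowIndex, columnIndex and content.
function tableToRows(table) {
  // Start with an empty grid so positions without a cell stay as "".
  const rows = Array.from({ length: table.rowCount }, () =>
    Array(table.columnCount).fill("")
  );
  for (const cell of table.cells) {
    rows[cell.rowIndex][cell.columnIndex] = cell.content;
  }
  return rows;
}

// Made-up sample that mimics the shape of one entry in result.tables.
const sampleTable = {
  rowCount: 2,
  columnCount: 2,
  cells: [
    { rowIndex: 0, columnIndex: 0, content: "Item" },
    { rowIndex: 0, columnIndex: 1, content: "Price" },
    { rowIndex: 1, columnIndex: 0, content: "Coffee" },
    { rowIndex: 1, columnIndex: 1, content: "3.50" },
  ],
};

console.log(tableToRows(sampleTable));
```

With a real analysis result, the same helper could be applied to each entry of result.tables.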
Azure provides prebuilt models that you can use with different document types, such as:
- Identity card (national ID, driving license, or passport)
- Health insurance card
- Business card
- Tax forms
- Contracts
- Invoices
You can also create your own custom models using Azure AI Studio.
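In code, each prebuilt model is selected by a model ID string. The sketch below maps the document types listed above to the IDs as I understand them from the service documentation. Only prebuilt-idDocument is used later in this article, so treat the other IDs as assumptions to verify against the API version you target. The modelIdFor helper and the short type names are hypothetical.

```javascript
// Model IDs to the best of my knowledge; availability can vary by API version.
const PREBUILT_MODELS = new Map([
  ["identity", "prebuilt-idDocument"],
  ["healthInsurance", "prebuilt-healthInsuranceCard.us"],
  ["businessCard", "prebuilt-businessCard"],
  ["taxW2", "prebuilt-tax.us.w2"],
  ["contract", "prebuilt-contract"],
  ["invoice", "prebuilt-invoice"],
]);

// Hypothetical helper: returns the model ID for a document type,
// or throws so that an unsupported type is noticed early.
function modelIdFor(documentType) {
  const modelId = PREBUILT_MODELS.get(documentType);
  if (modelId === undefined) {
    throw new Error(`No prebuilt model registered for "${documentType}"`);
  }
  return modelId;
}

console.log(modelIdFor("identity"));
```

A document type without an entry, such as a payslip, is the case where a custom model comes in.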
Microsoft offers SDKs for its AI services. For example, you can use the @azure/ai-form-recognizer module in your Node.js projects.
AI services take some time to process the data they receive, so you should write asynchronous code.
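The sketch below illustrates that asynchronous flow without calling Azure. It uses a hypothetical stand-in, createFakePoller, that only becomes "done" after a few checks, which roughly mirrors how the SDK hands back a poller whose pollUntilDone() method is awaited. The names, delays, and result shape here are assumptions made for the example.

```javascript
// Hypothetical stand-in for an SDK poller: it finishes after a few checks.
function createFakePoller(result, checksNeeded = 3, delayMs = 10) {
  let checks = 0;
  return {
    async pollUntilDone() {
      while (checks < checksNeeded) {
        checks += 1;
        // Wait between checks instead of blocking the event loop.
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
      return result;
    },
  };
}

async function main() {
  const poller = createFakePoller({ status: "succeeded" });
  // Execution of main() pauses here, but the rest of the process keeps running.
  const result = await poller.pollUntilDone();
  console.log("Analysis status:", result.status);
  return result;
}

main().catch((error) => {
  console.error(error);
  process.exitCode = 1;
});
```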
It’s worth noting that this service was formerly known as Azure Form Recognizer.
Azure AI Studio
Suppose you are using the Document Intelligence service and want to extract text, but none of the prebuilt models fits your documents.
For example, you want to extract and process information from payslips. There’s no prebuilt model for that, and even if there were, payslips differ from one country to another.
Azure AI Studio allows you to create and train custom models through an easy-to-use interface. It works with several AI services, not only Document Intelligence. You can even use it to create your own ChatGPT-like solution.
The Document Intelligence Studio can be used to create two types of custom models:
- Extraction model, useful for extracting specific data from documents and forms.
- Classification model, useful for splitting and classifying different documents.
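As a rough sketch of how a custom extraction model might be called from code, the example below assumes that a trained model is invoked by passing its ID where a prebuilt model ID would otherwise go. The model ID "payslip-extractor", the NetPay field, the document URL, and the extractFields helper are all hypothetical. A stub client stands in for DocumentAnalysisClient so the example runs without an Azure resource.

```javascript
// Hypothetical wrapper: the client is passed in, so the same function works
// with a real DocumentAnalysisClient or with a stub.
async function extractFields(client, modelId, documentUrl) {
  const poller = await client.beginAnalyzeDocumentFromUrl(modelId, documentUrl);
  const result = await poller.pollUntilDone();
  const [document] = result.documents ?? [];
  // Return an empty object when the model found no document.
  return document ? document.fields : {};
}

// Stub that mimics the two SDK calls used above; it returns made-up data.
const stubClient = {
  async beginAnalyzeDocumentFromUrl(modelId, documentUrl) {
    return {
      async pollUntilDone() {
        return { modelId, documents: [{ fields: { NetPay: { value: 2100 } } }] };
      },
    };
  },
};

extractFields(stubClient, "payslip-extractor", "https://example.com/payslip.pdf")
  .then((fields) => console.log("Net pay:", fields.NetPay.value))
  .catch((error) => console.error(error));
```

Passing the client as a parameter is a design choice made here to keep the sketch testable; it is not something the SDK requires.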
Technical Demonstration
We will write a script that extracts information from driving licenses and test it with a sample image, whose URL appears in the code. Note that this is not a real driving license; it’s a French sample.
Deploy needed resources
This demonstration requires an Azure AI service resource inside a resource group. Use the following commands to deploy both:
az group create --name AzureAiDemo --location northeurope
az cognitiveservices account create --kind FormRecognizer --location northeurope --name driving-license-parser --resource-group AzureAiDemo --sku F0
Writing the code
Using Node.js, create a file for the code and install the @azure/ai-form-recognizer module:
mkdir azure-ai-example
cd azure-ai-example
touch index.js
npm install @azure/ai-form-recognizer
Here’s the code:
// Import the credential and client classes from the SDK.
const { AzureKeyCredential, DocumentAnalysisClient } = require("@azure/ai-form-recognizer");
(async () => {
  // Replace the placeholders with the key and endpoint of your Azure AI resource.
  const key = "<ACCESS_KEY>";
  const endpoint = "<COGNITIVE_SERVICE_ENDPOINT>";
  const documentUrl = "https://azurehacks.io/wp-content/uploads/2024/02/permis-conduire.jpg";
  const client = new DocumentAnalysisClient(endpoint, new AzureKeyCredential(key));
  // Start the analysis with the prebuilt identity document model, then wait for it to finish.
  const poller = await client.beginAnalyzeDocumentFromUrl("prebuilt-idDocument", documentUrl);
  const result = await poller.pollUntilDone();
  // Optional chaining avoids a crash when no document or field is detected.
  const fields = result.documents?.[0]?.fields ?? {};
  console.log("First name", fields["FirstName"]?.value);
  console.log("Last name", fields["LastName"]?.value);
  console.log("Date of birth", fields["DateOfBirth"]?.value);
  console.log("Place of birth", fields["PlaceOfBirth"]?.value);
  console.log("Document number", fields["DocumentNumber"]?.value);
  console.log("Date of issue", fields["DateOfIssue"]?.value);
  console.log("Date of expiration", fields["DateOfExpiration"]?.value);
  console.log("Vehicle classifications", fields["VehicleClassifications"]?.value);
})().catch((error) => {
  console.error("Document analysis failed:", error);
  process.exitCode = 1;
});
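The key and endpoint are written directly into the script above for simplicity. One common alternative is to read them from environment variables so that secrets stay out of the source file. The sketch below shows one way this might look. The variable names DOCUMENT_INTELLIGENCE_KEY and DOCUMENT_INTELLIGENCE_ENDPOINT and the loadConfig helper are hypothetical, as are the sample values.

```javascript
// Hypothetical variable names; pick whatever fits your deployment.
const REQUIRED_VARIABLES = ["DOCUMENT_INTELLIGENCE_KEY", "DOCUMENT_INTELLIGENCE_ENDPOINT"];

// Reads the key and endpoint from an environment-like object and
// fails early with a clear message when one of them is missing.
function loadConfig(env = process.env) {
  const missing = REQUIRED_VARIABLES.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    key: env.DOCUMENT_INTELLIGENCE_KEY,
    endpoint: env.DOCUMENT_INTELLIGENCE_ENDPOINT,
  };
}

// Demonstration with made-up values instead of the real process.env.
const config = loadConfig({
  DOCUMENT_INTELLIGENCE_KEY: "example-key",
  DOCUMENT_INTELLIGENCE_ENDPOINT: "https://example.cognitiveservices.azure.com/",
});
console.log(config.endpoint);
```

In the script, the returned key and endpoint would then replace the two placeholder constants.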
Testing the code
Use the following command to run the script:
node index.js
The script prints the data recognized from the driving license image.
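If you would rather collect the recognized values in a single object than print them one by one, a small helper could do it. The sketch below assumes each field exposes a value property, as in the script, and that a field the model could not find is simply absent. The summarizeFields helper and the sample data are hypothetical and do not reflect the actual output for the test image.

```javascript
// Hypothetical helper: picks the value of each requested field,
// using null for fields that are absent from the result.
function summarizeFields(fields, names) {
  const summary = {};
  for (const name of names) {
    summary[name] = fields[name]?.value ?? null;
  }
  return summary;
}

// Made-up data that mimics the shape of result.documents[0].fields.
const sampleFields = {
  FirstName: { value: "Jean" },
  LastName: { value: "Dupont" },
};

console.log(summarizeFields(sampleFields, ["FirstName", "LastName", "DocumentNumber"]));
```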
Conclusion
To-do