Implementing Google’s Firebase Machine Learning with Flutter

May 24, 2021

Objective:

To implement the Cloud Vision text recognition API, offered through Google’s Firebase Machine Learning, with Flutter in order to develop a simple Android Optical Character Recognition (OCR) application.

What is Google’s Firebase Machine Learning?

Firebase Machine Learning is a mobile SDK that brings Google’s machine learning expertise to Android and iOS apps in a powerful yet easy-to-use package. Firebase ML provides convenient APIs that help you use your custom TensorFlow Lite models in your mobile apps.

Firebase ML provides two key capabilities around on-device custom models:

Custom model deployment: Deploy custom models to your users’ devices by uploading them to Firebase’s servers. Your Firebase-enabled app will download the model to the device on demand. This allows you to keep your app’s initial install size small, and you can swap the ML model without having to republish your app.
AutoML Vision Edge: This service helps you create your own on-device custom image classification models with an easy-to-use web interface. Then, you can seamlessly host the models you create with the service mentioned above.

Read more about Firebase Machine Learning.

What is OCR?

OCR stands for Optical Character Recognition. It is used to recognize the characters in pictures, and the recognized text can then be used to create digital copies of images that are easy for a computer to read.

It is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo or from subtitle text superimposed on an image.

Applications of OCR:

As more paperwork moves to digital form, OCR engines have been developed into many kinds of domain-specific applications, such as receipt OCR, invoice OCR, check OCR, and legal billing document OCR.

Some applications of OCR include:

In airports, for passport recognition and information extraction
Automatic extraction of key information from insurance documents
Traffic sign recognition
Extracting business card information into a contact list
Making textual versions of printed documents more quickly
Making electronic images of printed documents searchable, e.g. Google Books
Converting handwriting in real-time to control a computer (pen computing)
Assistive technology for blind and visually impaired users
Automatic number plate recognition

Tools / Requirements:

IDE: Android Studio / VS Code with the Flutter extension

Language: Dart, Kotlin

Plug-ins / API: Firebase ML’s Cloud Vision text recognition API

Approach:

Make sure you have downloaded one of the required IDEs and integrated it with Flutter.

We will use the default application that Flutter generates and tailor it to fit our objective, which is to build a simple Optical Character Recognition application. After making the necessary changes to the basic structure of the Android application, the UI/UX of your application should closely resemble this:

Next comes the Firebase integration. Log in to the Firebase console with a Google account, create and configure a Firebase project, and add Firebase to the Android project.

Next, deploy the Cloud Function that bridges your app and the Cloud Vision API. Add Firebase Auth to the app, and finally add the necessary dependencies to your application.

The Vision API can detect and extract text from images. There are two annotation features that support optical character recognition (OCR):

TEXT_DETECTION detects and extracts text from any image. For example, a photograph might contain a street sign or traffic sign. The JSON includes the entire extracted string, as well as individual words, and their bounding boxes.
DOCUMENT_TEXT_DETECTION also extracts text from an image, but the response is optimized for dense text and documents. The JSON includes page, block, paragraph, word, and break information.
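
To make the page, block, paragraph, word, and break hierarchy more concrete, here is a minimal Kotlin sketch of how text could be rebuilt from such a structure. The classes `Page`, `Block`, `Paragraph`, `Word`, and `BreakType` are hypothetical, simplified stand-ins written for illustration; they are not the Vision API's own types. In the real response, words are further split into symbols, and the break information has more variants than the three shown here.

```kotlin
// Simplified, hypothetical mirror of the fullTextAnnotation hierarchy.
// These types are assumptions for illustration, not Vision API classes.
enum class BreakType { NONE, SPACE, LINE_BREAK }

data class Word(val text: String, val breakAfter: BreakType = BreakType.NONE)
data class Paragraph(val words: List<Word>)
data class Block(val paragraphs: List<Paragraph>)
data class Page(val blocks: List<Block>)

// Joins the words of one paragraph, honoring the break that follows each word.
fun paragraphText(paragraph: Paragraph): String = buildString {
    for (word in paragraph.words) {
        append(word.text)
        when (word.breakAfter) {
            BreakType.SPACE -> append(' ')
            BreakType.LINE_BREAK -> append('\n')
            BreakType.NONE -> {}
        }
    }
}.trimEnd()

// Walks page -> block -> paragraph and separates paragraphs with a blank line.
fun pageText(page: Page): String =
    page.blocks
        .flatMap { it.paragraphs }
        .joinToString("\n\n") { paragraphText(it) }

fun main() {
    val page = Page(
        listOf(
            Block(
                listOf(
                    Paragraph(
                        listOf(
                            Word("Main", BreakType.SPACE),
                            Word("Street", BreakType.LINE_BREAK),
                            Word("Exit", BreakType.SPACE),
                            Word("12")
                        )
                    )
                )
            )
        )
    )
    println(pageText(page))
}
```

If you only need the full string, this traversal is unnecessary, because the response already carries the complete text. Walking the hierarchy becomes useful when the layout matters, for example to keep paragraphs apart.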

Load the image as a bitmap so it can be processed, then create the JSON request that invokes the Cloud Function with one of the two annotation features above. The recognized text is returned in the fullTextAnnotation object; store it in the format you need, preferably as a string.
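
The request described above can be sketched as follows. This is a minimal example under stated assumptions, not the exact code from the repository: `buildVisionRequest` is a hypothetical helper name, the image is assumed to be available as compressed bytes (for example a JPEG produced from the bitmap), and the Cloud Function is assumed to accept a single Vision annotate request with a base64 `image.content` field and a `features` array. It uses `java.util.Base64`, which on Android requires API level 26 or higher.

```kotlin
import java.util.Base64

// Hypothetical helper: builds the JSON body for one Cloud Vision annotate request.
// featureType must be "TEXT_DETECTION" or "DOCUMENT_TEXT_DETECTION".
fun buildVisionRequest(imageBytes: ByteArray, featureType: String = "TEXT_DETECTION"): String {
    require(featureType == "TEXT_DETECTION" || featureType == "DOCUMENT_TEXT_DETECTION") {
        "Unsupported feature type: $featureType"
    }
    // The image bytes travel inside the JSON as a base64 string.
    val base64Image = Base64.getEncoder().encodeToString(imageBytes)
    // Base64 output needs no JSON escaping, so plain string templating is safe here.
    return """{"image":{"content":"$base64Image"},"features":[{"type":"$featureType"}]}"""
}

fun main() {
    // Placeholder bytes stand in for the compressed image of the scanned photo.
    val placeholderImage = byteArrayOf(1, 2, 3)
    println(buildVisionRequest(placeholderImage, "DOCUMENT_TEXT_DETECTION"))
}
```

In a real app, a JSON library is a safer choice than string templates once the request gains more fields, such as language hints.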

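// The function's result is a JSON array; the first element holds the annotation for our image
val annotation = task.result!!.asJsonArray[0].asJsonObject["fullTextAnnotation"].asJsonObject
// Print the complete recognized text
System.out.format("%nComplete annotation:")
System.out.format("%n%s", annotation["text"].asString)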

The recognized text can then be copied and pasted into other applications. To support this, add another function to the Android application.
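
One possible shape for such a copy function on the Android side is sketched below. The name `copyRecognizedText` and the clip label are assumptions made for illustration, and the snippet needs the Android SDK, so it only runs inside an Android project. It relies on the standard `ClipboardManager` and `ClipData` classes.

```kotlin
import android.content.ClipData
import android.content.ClipboardManager
import android.content.Context

// Hypothetical helper: places the recognized text on the system clipboard
// so the user can paste it into another application.
fun copyRecognizedText(context: Context, recognizedText: String) {
    val clipboard = context.getSystemService(Context.CLIPBOARD_SERVICE) as ClipboardManager
    // The first argument is a user-visible label describing the clip.
    val clip = ClipData.newPlainText("Recognized text", recognizedText)
    clipboard.setPrimaryClip(clip)
}
```

This could be called from a button's click handler, passing in the string read from the annotation.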

In this way, you can build your own OCR scanner by implementing the Cloud Vision text recognition API from Google’s Firebase Machine Learning with Flutter.

References:

Recognize Text in Images Securely with Cloud Vision using Firebase Auth and Functions on Android

GitHub repository for the application: https://github.com/AmimaShifa/scan_text_from_image

Note: The Firebase ML toolkit used to build the application in the GitHub repository above was recently deprecated, but its functions can still be used for reference.