OCR Images and PDFs using Google Docs

INTRO

This guide has been revived and updated after a website reshuffle in which the original guide was completely lost! The basics have not changed since the original was written in February 2019, but the methods have been improved. Google Docs does a very good job of OCRing images that contain text, even those taken with a phone or tablet camera, and it can also extract the text from PDF files. This method does rely on an internet connection.

Example image of text to be OCR'd (this is the lorem.png image in the app)

Example of OCR'd text

The demo app shows how this is done, returning the text to a label and allowing the user to save this text to a TinyDB. A Google Apps Script web app provides the magic: it handles the uploaded file (in base64), opens it as a Google Doc for OCRing, then returns the text. Both the image/PDF and the created Google Doc are deleted from Google Drive on success.

The demo app tests two different types of image and a sample PDF containing text, and allows the user to take photos of written text with the device camera for OCR. A standalone Google Apps Script project is used for this demo. You can use a project bound to a spreadsheet if you find this easier, but the spreadsheet is not needed for this.

The demo app was developed and tested in companion and compiled versions on Android 10 and 11, and the file paths used reflect this. There are no blocks to handle earlier Android versions.

SETUP

What do we need?

Google Apps Script Web App
1. Set up a project in the usual way
2. Copy the script code below into the project
3. Enter your own folder ID in the correct place
4. Publish as a web app, executing as "you" and accessible by "anyone, even anonymous"
5. Get the script URL for use in the App Inventor app
6. Make a note also of the filename you give to your project (you can't find it again using just the script url!). I have included mine in the demo app for safekeeping....
7. When creating the script, you will need to add the Advanced Drive Service. With the legacy editor you do this by going to Resources > Advanced Google Services, then setting "Drive API" to "On". With the new script editor: Services > Drive API. Use Version 2 of the Drive API, because the script calls Drive.Files.insert, which is a Version 2 method.
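
For step 3, the folder ID is the long string that follows /folders/ in the address bar when the folder is open in Google Drive. As a minimal sketch, the helper below pulls that ID out of a folder URL. The function name extractFolderId is hypothetical and is not part of the demo script; it assumes the usual drive.google.com/drive/folders/<ID> address form.

```javascript
// Hypothetical helper (not part of the demo script): extract the folder ID
// from a Google Drive folder URL of the form .../folders/<ID>.
function extractFolderId(url) {
  // Folder IDs are made of letters, digits, underscores and hyphens.
  var match = /\/folders\/([A-Za-z0-9_-]+)/.exec(url);
  return match ? match[1] : null; // null when the URL has no /folders/ part
}
```

The returned string is what goes in place of 'your folder ID here' in the script.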
App Inventor App
1. The blocks and the demo app aia are provided below
2. The very basics required are as follows:
  1. Camera component
  2. Web component
    1. In the PostText you must provide
      1. the base64 encoded string of the file
      2. the mimetype for the file (e.g. image/png)
      3. a filename (e.g. image1.png)
  3. Script URL
  4. Sunny Gupta's Filey extension for base64 encoding
3. I have added more elements for usability:
  1. A few procedures to extract the filename and the file extension
  2. A hard coded procedure to create the mimetype required (there are only a few filetypes that can be used)
  3. Use of Taifun's File extension to provide access to files in assets
  4. Note the need to modify the Camera image path for use in Filey
  5. A TinyDB and ListView to store and display the OCR texts
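
The filename, extension and mimetype procedures listed above are built with blocks in the demo app. Purely as an illustration of the same logic, here is a sketch in JavaScript. All function names are hypothetical, the mimetype table covers only the file types named in the script comment (.jpg, .png, .gif, .pdf), and the URL-encoded form layout of the PostText is an assumption rather than something taken from the demo blocks. The parameter names data, mimetype and filename are the ones the script reads.

```javascript
// Hypothetical helpers mirroring the demo app's procedures (assumed logic).
var MIME_TYPES = {
  jpg: 'image/jpeg',
  jpeg: 'image/jpeg',
  png: 'image/png',
  gif: 'image/gif',
  pdf: 'application/pdf'
};

// Last path segment, e.g. ".../Pictures/image1.png" gives "image1.png".
function getFilename(path) {
  return path.split('/').pop();
}

// Lower-cased text after the final dot, or '' when there is no dot.
function getExtension(filename) {
  var dot = filename.lastIndexOf('.');
  return dot === -1 ? '' : filename.slice(dot + 1).toLowerCase();
}

// Hard-coded lookup; returns null for file types the table does not list.
function getMimeType(filename) {
  var ext = getExtension(filename);
  return Object.prototype.hasOwnProperty.call(MIME_TYPES, ext) ? MIME_TYPES[ext] : null;
}

// Build the text to post: base64 data, mimetype and filename, URL-encoded
// (assumption: the web app receives them as form parameters).
function buildPostText(base64Data, path) {
  var filename = getFilename(path);
  var mimetype = getMimeType(filename);
  if (mimetype === null) {
    throw new Error('Unsupported file type: ' + filename);
  }
  return 'data=' + encodeURIComponent(base64Data) +
    '&mimetype=' + encodeURIComponent(mimetype) +
    '&filename=' + encodeURIComponent(filename);
}
```

URL-encoding the base64 string matters in this sketch because base64 can contain +, / and = characters, which have special meaning in form data.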

SCRIPT

// Requires a base64 encoded file, the file's mimetype, and a filename
function doPost(e) {
  var status;

  // e.parameter holds the first value of each posted parameter as a string
  var data = Utilities.base64Decode(e.parameter.data);
  var blob = Utilities.newBlob(data, e.parameter.mimetype, e.parameter.filename);

  // Provide here a folder ID for the creation of the image file
  var fileID = DriveApp.getFolderById('your folder ID here').createFile(blob).getId();

  if (fileID !== "") {
    try {
      // Fetch the image from Drive
      var imageBlob = DriveApp.getFileById(fileID).getBlob();
      var resource = { title: e.parameter.filename, mimeType: imageBlob.getContentType() };

      // OCR on .jpg, .png, .gif (or .pdf) uploads
      var options = { ocr: true };
      var docFile = Drive.Files.insert(resource, imageBlob, options);
      var doc = DocumentApp.openById(docFile.id);

      // Extract the text body of the Google Document
      // (replace() with a string pattern removes only the first newline)
      var text = doc.getBody().getText().replace("\n", "");

      // Delete both files (Drive.Files.remove deletes permanently, skipping the trash)
      Drive.Files.remove(docFile.id);
      Drive.Files.remove(fileID);

      status = text;
    } catch (error) {
      status = "ERROR with OCR: " + error.toString();
    }
  } else {
    status = "ERROR with OCR: No image specified";
  }

  return ContentService.createTextOutput(status);
}
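
The script returns either the OCR text or a message beginning "ERROR with OCR: ", so whatever receives the response has to tell the two apart. The sketch below shows one way to do that, plus an optional step that flattens the text onto a single line, since the script's replace("\n", "") call removes only the first newline. The names parseOcrResponse and flattenNewlines are hypothetical, and the demo app may handle the response differently.

```javascript
// Prefix used by the script for both of its error messages.
var ERROR_PREFIX = 'ERROR with OCR: ';

// Hypothetical helper: classify the web app's response body.
function parseOcrResponse(body) {
  if (body.indexOf(ERROR_PREFIX) === 0) {
    return { ok: false, message: body.slice(ERROR_PREFIX.length) };
  }
  return { ok: true, text: body };
}

// Hypothetical helper: collapse every line break (and the whitespace
// around it) into a single space, then trim the ends.
function flattenNewlines(text) {
  return text.replace(/\s*\n\s*/g, ' ').trim();
}
```

One limitation of this approach: an image whose own text begins with "ERROR with OCR: " would be misread as a failure. Returning a separate status field from the script would avoid that, at the cost of changing the response format.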

BLOCKS

VIDEO

AIA and FILES

GoogleOCRBlank.aia

RESOURCES

Credits @Taifun and @Sunny for their extensions
