Monday, April 4, 2016

Real-time auto capture - a user-friendly approach to mobile OCR

To enable the fastest and most reliable ID document reading, BlinkID uses some interesting image processing tricks. One of them is real-time auto capture, which detects documents in the live video stream even when they are scanned in imperfect conditions.

Real-life users are imperfect

Reading data from IDs can be challenging. If you have a high-quality picture taken at the right angle, with good lighting and a clear, plain background, that's great. You can run the OCR step easily on the captured image. But in reality this is often not the case, and you can't really expect it to be.
Image that shows the perfect scanning conditions
This is where the real challenge for OCR begins, since there are so many different kinds of imperfection to consider. What if the document is at an angle, or the user is not holding the phone parallel to the document (scanning in perspective), so you don't get a rectangular image? What if a user has shaky hands or is scanning a document on a bus ride to work?
Image that shows imperfect conditions, phone shaking, scanning at an angle
To provide the best possible experience for end users, it is imperative to make the technology work with the imperfections of reality. It is unrealistic to expect people to adapt their behavior to technology.
Users want technology to work for them and not the other way around. Therefore, we have developed our own, proprietary Real-time auto capture technology.

A natural way

From the technical standpoint, the Real-time auto capture technology consists of several steps:
Auto capture process visualization
This is a very high-level, simplified overview. The important point is that the whole process runs locally on the device.
There is no need for server-side processing at all! But, more importantly, from the user's standpoint, the process comes down to pointing the phone at a document for about half a second. Users don't even have to take a photo. They just point the camera at a document.
In fact, we don't want them to take photos, because our technology processes multiple frames and picks the best one in just a fraction of a second.
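
To make the idea of picking the best of several frames more concrete, here is a minimal Python sketch. It is an illustration only, not the actual BlinkID implementation: it assumes that "best" simply means "sharpest" and scores each frame with the variance of a Laplacian filter, a common focus measure. The names `laplacian_variance` and `pick_sharpest` are hypothetical, and frames are plain lists of grayscale pixel rows to keep the example dependency-free.

```python
from statistics import pvariance
from typing import List, Sequence

# Assumption: a frame is a grayscale image given as rows of pixel intensities.
Frame = Sequence[Sequence[float]]


def laplacian_variance(frame: Frame) -> float:
    """Focus measure: variance of a 4-neighbour Laplacian over interior pixels.

    Sharp frames have strong edges, so the Laplacian response varies a lot;
    blurred or flat frames give a response close to constant.
    """
    height, width = len(frame), len(frame[0])
    if height < 3 or width < 3:
        raise ValueError("frame must be at least 3x3 pixels")
    responses: List[float] = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                frame[y - 1][x] + frame[y + 1][x]
                + frame[y][x - 1] + frame[y][x + 1]
                - 4 * frame[y][x]
            )
    return pvariance(responses)


def pick_sharpest(frames: Sequence[Frame]) -> int:
    """Return the index of the frame with the highest focus measure."""
    if not frames:
        raise ValueError("no frames to choose from")
    return max(range(len(frames)), key=lambda i: laplacian_variance(frames[i]))


if __name__ == "__main__":
    # Synthetic data: a high-contrast checkerboard versus a uniform gray frame.
    sharp = [[255 if (x + y) % 2 == 0 else 0 for x in range(6)] for y in range(6)]
    flat = [[128] * 6 for _ in range(6)]
    print(pick_sharpest([flat, sharp, flat]))
```

A real pipeline would likely combine sharpness with other signals, such as whether the whole document is visible in the frame, but the principle is the same: score every frame from the video stream and keep the winner, so the user never has to press a shutter button.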

Check out our demo app in action:
The demo video isn't a staged best-case showcase. It really works that fast!

Small differences that matter

Compared with our approach, a solution in which users take a photo, send it to a server and wait for the results would be about 2-3 seconds slower (and that is if the photo is good enough).
You might think that this is not a big difference, but in terms of user experience it's huge. Technology that works smoothly and adapts to users is the ultimate goal, and we think that's the main reason why our SDKs are better than most other solutions available on the market.

Other use cases

Besides reading personal ID documents, our core technology can be used to read payment slips, barcodes, receipts, payment data such as IBANs and, in some cases, even free-form text.
Wherever you need to eliminate manual data entry and provide the best possible experience to your users, be confident that Microblink solutions will do the trick for you.
