Tuesday, February 6, 2018

[ENGINEERING BLOG] Major changes to the API and license key formats

Besides working on new features and making plenty of SDK releases in the last year, as described in this post, we invested a lot of effort into research and development in order to upgrade our proprietary mobile OCR technology. In parallel, we were busy preparing a new API for our mobile SDKs, one that will be able to support all the new features we have planned. We are also changing our licensing subsystem, with a new license key format, in order to increase extensibility and flexibility.

This is the biggest change we have made in the past 5 years, so we invite developers to read this blog post in full.

Reading time: 25-35 min

Why the change?

In the last two years, we have shifted from traditional text recognition to a deep-learning approach. Our research team designed a custom-made machine learning system for OCR and is continuously working on new models of state-of-the-art neural networks, while our development team makes sure that DeepOCR runs fast on mobile devices while at the same time requiring only minimal memory. This enables high accuracy and speed for even the most complex use cases. DeepOCR technology is already implemented in our award-winning product BlinkReceipt and it also powers the recognition of handwritten problems in the Photomath app.

Microblink SDKs are used in a wide variety of use cases, from scanning of identity documents using BlinkID, payslips or invoices using PhotoPay, various predefined data using BlinkInput, to barcode- and QR code-scanning with PDF417. Now it's time to prepare all SDKs for the implementation of DeepOCR, but it's not as straightforward as one might think. Such a variety of use cases cannot be solved with a single DeepOCR model and so support for using multiple models within an SDK is needed.

This is why we decided to change the licensing subsystem, make a backward-incompatible change to the API, and introduce a new license key format. The new API and the new license keys are necessary to support all the information required to run DeepOCR, as well as the other new features we plan to add in 2018.

The release of the new API provides some additional key benefits for developers:

  • the integration of SDKs is easier and more flexible;
  • the SDKs are now faster and smaller; 
  • the interaction of objects within the API involves much less overhead when communicating with the native library that does all the processing.

We understand that this type of major change requires additional development effort on integration so we will be available to help you at every stage of the development. Please don't hesitate to contact us for support.

Since the license format change is not backward compatible and we use semantic versioning for our SDKs, we need to raise the major version number of all our SDKs.

The new versions for the Microblink SDKs will be:
  • PDF417.mobi SDK 
    • for Android: version 7.0.0
    • for iOS: version 7.0.0
  • BlinkInput SDK
    • for Android: version 4.0.0
    • for iOS: version 4.0.0
  • BlinkID SDK
    • for Android: version 4.0.0
    • for iOS: version 4.0.0
  • PhotoPay SDK
    • for Android: version 7.0.0
    • for iOS: version 7.0.0
As you may have noticed, we decided to increase the iOS version numbers by more than one. This reduces any risk of confusion and ensures that the same version number is used for both the Android and iOS SDKs, as well as for the wrappers (PhoneGap, Xamarin, React Native).

Please note that existing license keys will not work with the new SDK versions, although they will continue to work with the existing SDK versions. Likewise, a new license key cannot be used with the old SDK versions.

What has changed?

In this section, we will describe all the changes in our SDKs, namely:
  • the change in the license key formats, specifically the licensing subsystem and licensing API;
  • the change in handling the recognizers and parsers;
  • implementation of a new concept: the processor.

License key formats

Because of the ever-increasing number of features that clients require from us, we decided that we needed a new license key format that would support these present and future demands. Technically, adding support for that required increasing the size of the binary layout of the license buffer, which meant that our license keys could no longer be formed from 8 groups of 8 alphanumeric characters.
Therefore, the new license keys are distributed in three different formats, and the client can decide which one to use:
  • as a file;
  • as a base64-encoded string;
  • as a raw buffer.

We recommend using the license key as a file, as it's the easiest way to manage multiple license keys (demo and production). Instead of having different license setup codes for your test and production app, you can now have the same code while using different license files within assets of your app.

We also simplified the API for setting up the license key. Instead of several different ways of setting up the same license key, especially on Android, there is now a unified way to set up license keys in both the Android and iOS SDKs.

For example, in Android, a class called MicroblinkSDK allows you to set the license key in three ways:
  • as a path to the file within your assets folder;
  • as a base64 string;
  • as a raw buffer. 
The choice is yours. A similar class exists in iOS and can be used in a similar manner.
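
To make the three options concrete, here is a minimal, self-contained Java sketch of the same idea. The method names below are illustrative stand-ins, not the actual MicroblinkSDK API; the point is that a file path, a base64 string, and a raw buffer are just three ways of handing over the same key:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical stand-in for the SDK's license setup API (names are made up).
public class LicenseSetupSketch {

    // Option 1: a path to a license file within the app's assets.
    static String setFromAssetFile(String assetPath) {
        return "file:" + assetPath;
    }

    // Option 2: a base64-encoded string, which is just an encoding of option 3.
    static String setFromBase64(String base64Key) {
        byte[] raw = Base64.getDecoder().decode(base64Key);
        return setFromRawBuffer(raw);
    }

    // Option 3: the raw license buffer itself.
    static String setFromRawBuffer(byte[] buffer) {
        return "raw:" + buffer.length + " bytes";
    }

    public static void main(String[] args) {
        byte[] raw = "demo-license-bytes".getBytes(StandardCharsets.UTF_8);
        String b64 = Base64.getEncoder().encodeToString(raw);
        System.out.println(setFromAssetFile("license/demo.key"));
        System.out.println(setFromBase64(b64)); // same key, different encoding
    }
}
```

Keeping the key in an asset file means your demo and production builds can share the same setup code and differ only in the bundled file.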

One of the greatest problems with the licensing system in the old API arose when a developer set a license key that didn't allow usage of a specific recognizer and then activated that recognizer. The developer was informed about the licensing error at the point when the native library was starting up and after the camera had already been initialized. This information was delivered via asynchronous callback, which was difficult to handle and confusing for most developers. Sometimes developers would simply ignore the callback and then wonder why the scanning wasn't working.

With the new API, this is no longer possible. We expect developers to set the license key as early as possible during app startup. Whenever a recognizer, detector, processor, or parser that is not allowed by the license key is instantiated, an exception is thrown on Android and an NSError is returned on iOS. This makes it much more difficult for a developer to go into production with an invalid license key.

In addition, now we ensure that the demo license keys always inform a user when a demo version is being used so that the app testers can easily notice if the production version is using a demo license key.

Recognizers, Parsers, Detectors, Processors, Templating API

All our existing clients are already familiar with the concept of a Recognizer. Some of them are also already familiar with the concepts of a Parser and Detector, which are available only within BlinkInput, BlinkID, and PhotoPay SDKs. However, in the new API, we're introducing a new concept: the Processor. In order to explain the processor concept, let's first review the concepts behind the recognizer, parser, and detector.

Recognizer

The recognizer has always been the main unit of recognition within Microblink SDKs. Basically, a recognizer is the most abstract object that serves a specific use case. For example, BarcodeRecognizer is an object that knows how to scan barcodes on images received from a camera, while MRTDRecognizer is an object that knows how to find a machine readable zone of a travel document on a camera frame, performs OCR on that zone, and extracts relevant document information from it.

As you can see, a recognizer is quite a complex object with many responsibilities:

  1. It manages the detection of objects like barcodes, ID cards, payslips, and machine readable zones.
  2. It performs image correction and the dewarping of detected objects.
  3. It performs optical character or barcode recognition.
  4. It intelligently validates the recognized data in order to produce the final result.

Recognizers are not new: they have existed in all Microblink SDKs from the very first version, but initially they were internal objects. Developers could only interact with them by creating RecognizerSettings objects that configured the expected behavior of a specific recognizer. When recognition was finished, developers then needed to typecast the given BaseRecognitionResult to the RecognitionResult specific to their recognizer. This proved rather confusing, as it was not always clear that a specific RecognitionResult could only be produced by the specific recognizer configured with the specific RecognizerSettings.

Now, this process has been simplified. A developer now simply needs to instantiate a specific recognizer object, configure it, and give it to the RecognizerRunner object, which will use it to perform the desired recognition. After doing the recognition, that same specific recognizer will internally contain its recognition result, which a developer can then obtain by calling on an appropriate getter method.

This makes recognizers long-lived stateful objects that live within the app and change their internal state while performing recognition. This is probably the biggest change developers will face when integrating the new version of a Microblink SDK, but once they get used to it, it will be obvious that recognition is much simpler to handle than before.
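
The new flow can be illustrated with a small self-contained Java sketch (the types below are simplified stand-ins for the real SDK classes): the recognizer is created once, handed to a runner, mutated during recognition, and then queried for its result:

```java
// Simplified stand-ins for the real SDK types, for illustration only.
class BarcodeResultSketch {
    private String data;
    String getData() { return data; }
    void set(String d) { data = d; }
}

// A stateful recognizer: it holds and mutates its own result.
class BarcodeRecognizerSketch {
    private final BarcodeResultSketch result = new BarcodeResultSketch();

    void recognize(String cameraFrame) {
        if (cameraFrame.startsWith("BARCODE:")) {
            result.set(cameraFrame.substring("BARCODE:".length()));
        }
    }

    // The result is obtained from the recognizer itself via a getter.
    BarcodeResultSketch getResult() { return result; }
}

// The runner only drives recognition; it does not own any results.
class RecognizerRunnerSketch {
    static void run(BarcodeRecognizerSketch recognizer, String frame) {
        recognizer.recognize(frame);
    }
}

public class StatefulRecognizerDemo {
    public static void main(String[] args) {
        BarcodeRecognizerSketch recognizer = new BarcodeRecognizerSketch();
        RecognizerRunnerSketch.run(recognizer, "BARCODE:0123456789");
        System.out.println(recognizer.getResult().getData()); // prints 0123456789
    }
}
```

Note how there is no typecasting anywhere: the object you configure is the object you read the result from.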

There is one special type of recognizer that is very flexible and configurable - the Templating Recognizer. It is used as part of the Templating API, which lets you define its behavior manually: first, a detector performs the detection of the object; then, locations within that detection identify the parts of the detected object that need perspective correction; next, settings for performing OCR on the corrected images are defined; finally, parsers are defined that extract structured information from the OCR result.

With the new API, we upgraded the flexibility of the Templating Recognizer and added a new processor concept that can be used within the Templating API. This is explained in more detail below.

Detector

The detector is an object that knows how to find a certain object in a camera image. BlinkID developers are likely familiar with DocumentDetector, which can find cards and checks in images, and MRTDDetector, which can find documents containing a machine readable zone in images. Those two detectors will remain in the BlinkID and PhotoPay SDKs, while other detectors will be removed from the SDKs.

Previously, developers interacted with detectors in a similar manner as with recognizers: they created a specific DetectorSettings object and associated it with a special recognizer called DetectorRecognizer by using the DetectorRecognizerSettings object. Then, during the operation of DetectorRecognizer, after it had internally performed the detection and before continuing to the next step, it returned the concrete DetectionResult via MetadataListener (or the didOutputMetadata callback in iOS). This asymmetry was even more confusing than in the case of recognizers, especially because the same callback could receive detection results from internal detectors within recognizer objects, and no one actually knew where these results were coming from.

In the new API, a developer will simply create a specific detector and associate it with DetectorRecognizer directly. After DetectorRecognizer internally performs detection using the specified detector, its detection results will remain saved within the specific detector and will be available to the developer via the provided getter method - in the same way as the recognizer's result is available via the specific recognizer's getter method.

Using detectors will now be the same as using recognizers, which we believe will make things a lot easier for developers.

Parser

Parsers are objects that can extract structured data from the raw OCR result. BlinkInput, BlinkID, and PhotoPay developers will already be familiar with the concept, especially when using the field-by-field scanning feature. With the field-by-field scanning feature, each parser tries to extract specific information from the OCR result obtained by performing OCR over a small area of a camera frame in the user interface.

In previous versions of the SDKs, parsers always produced their results as strings, which proved confusing for some use cases. In date parsing, for example, the date parser would return the string exactly as produced by the OCR engine: although it internally knew which part of the date was the day, which was the month, and which was the year, it had no way to communicate that back to the developer.

Moreover, in order to obtain the specific parser result, the developer had to know the exact name of the parser and the exact name of the parser group where a parser was placed. To make things even more confusing, when using BlinkInputRecognizer for performing field-by-field scanning, it was possible to use multiple parser groups over a single image, while when using DetectorRecognizer or MRTDRecognizer (i.e., Templating Recognizers), the name of the parser group was actually the name of the location within the detected location of the document and there was always a single parser group for each decoding location.

Has that confused you? I bet it has! To address this issue, we really thought hard and long about how to make this concept easier to use, but without losing all the flexibility it provided. We love symmetry, so we thought that it would be a good idea to organize parsers in the same way as recognizers and detectors are organized. So, we did it.

The parser is now a stateful object, just like the recognizer or detector. Developers will create a specific parser and then associate it with ParserGroupProcessor (more on that later), which will be associated either with BlinkInputRecognizer (for the field-by-field scan) or with the Templating Recognizer. After the parser performs extraction over the OCR result, it saves the extraction result internally, and the result is available to the developer via the provided getter method, just like the recognizer provides its result via its own getter method.

This means that developers will no longer need to worry about assigning arbitrary strings as parser names and then using those strings to obtain parsed results from some obscure BlinkInputRecognitionResult; the parser's result will now be available within the parser object itself.


Some might ask: "What about parser groups? Where did they disappear to?"
In the above story about parsers, you probably noticed that in the old API, parsers were grouped into parser groups, where every parser within the same group performed extraction on the same OCR result calculated for the entire parser group. You probably also noticed the discrepancy between the field-by-field scan and the Templating API: you could use multiple parser groups on the same image in the field-by-field scan, but only a single parser group on the dewarped image within the Templating Recognizer.

We were thinking: "How to avoid that discrepancy and also provide more flexibility within Templating API?" or for example, "How to ensure that recognition performed with Templating API is not fully complete if the image that should contain a person's face in the document does not contain it?" We knew we needed something like a parser, but not working with the OCR result. Instead, it should work with the image just like a recognizer but should be possible to use within Templating Recognizer. Well, that led us to the Processor.

Processor

The processor is an object that can perform recognition of the image. Unlike the recognizer, the processor cannot be used alone - it must be used within the Templating API. The above-mentioned ParserGroupProcessor is a special processor (it acts as a parser group in the old API) that performs OCR on a given image using the same rules as the parser group used in the old API, and then runs every parser bundled within it to extract the OCR result. If a developer needs a dewarped image, ImageReturnProcessor can be used to simply save the image that was provided to it. In future releases, we plan to add lots of new processors for various use cases.
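
As a rough, self-contained Java illustration of the ParserGroupProcessor idea (all types are made up, and a hard-coded string stands in for the real OCR engine): OCR runs once for the whole group, and every bundled parser extracts its own piece from that shared output:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class ParserGroupSketch {

    // A parser turns a raw OCR string into one piece of structured data.
    interface Parser extends Function<String, String> {}

    static Map<String, String> process(String image, List<String> names, List<Parser> parsers) {
        // Stand-in for a single shared OCR pass over `image`.
        String ocrResult = "AMOUNT 42.00 DATE 2018-02-06";
        Map<String, String> results = new LinkedHashMap<>();
        for (int i = 0; i < parsers.size(); i++) {
            // Every parser in the group works on the same OCR result.
            results.put(names.get(i), parsers.get(i).apply(ocrResult));
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, String> r = process("frame.png",
                List.of("amount", "date"),
                List.of(s -> s.replaceAll(".*AMOUNT (\\S+).*", "$1"),
                        s -> s.replaceAll(".*DATE (\\S+).*", "$1")));
        System.out.println(r); // {amount=42.00, date=2018-02-06}
    }
}
```

In the real API the parsers keep their results internally and expose them via getters; collecting them into a map here is just for display.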

The architecture of the processor object is the same as that of the recognizer, parser, and detector. A developer creates the processor and associates it with the Templating Recognizer; after the recognition is finished, the developer obtains the result from the processor.

Templating API

If you were familiar with our Templating API you might now ask: "Where are the classifiers? How do we define decoding locations?"
Well, decoding locations are now defined within ProcessorGroup, which contains:
  • one or more processors;
  • a location of interest within a document;
  • instructions on how to perform image correction and dewarping.
Templating Recognizer uses the chosen instruction to perform image correction and dewarping of the desired location and then runs processors within the given processor group on the corrected image.

What about classifiers?

We changed those too. In the old API, a developer had to define a single document classifier that needed to provide a classification of the document based on the parser results obtained in the pre-classification stage of Templating Recognizer's processing in order to continue processing with the document-specific parsers. Yes, we know that was a complex sentence, but it describes the very complex process that developers had to follow in order to use Templating API to correctly recognize the custom document.

Now, in order to provide a better abstraction, we created Class, which is an object containing two collections of processor groups and a classifier. The two collections of processor groups within Class are:
  • the classification processor group collection;
  • the non-classification processor group collection.
This process goes as follows:
  1. All processor groups within the classification collection perform processing.
  2. The classifier decides whether the object being recognized belongs to the current class; if it does, the processor groups within the non-classification collection perform processing.
The Templating Recognizer then simply contains one or more such Class objects.
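
The two-stage flow can be sketched in self-contained Java (with minimal made-up types standing in for the real processor groups and classifier):

```java
import java.util.List;
import java.util.function.Predicate;

public class TemplatingClassSketch {

    // Stand-in for a processor group; it appends to a log so we can observe order.
    interface ProcessorGroup { void process(StringBuilder log); }

    static boolean recognizeWithClass(List<ProcessorGroup> classificationGroups,
                                      Predicate<StringBuilder> classifier,
                                      List<ProcessorGroup> nonClassificationGroups,
                                      StringBuilder log) {
        // 1. All processor groups within the classification collection run first.
        for (ProcessorGroup g : classificationGroups) g.process(log);
        // 2. The classifier decides whether the object belongs to this class...
        if (!classifier.test(log)) return false;
        // ...and only then do the non-classification groups run.
        for (ProcessorGroup g : nonClassificationGroups) g.process(log);
        return true;
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        boolean matched = recognizeWithClass(
                List.of(l -> l.append("read-header;")),
                l -> l.toString().contains("read-header"), // pretend the header identifies the document
                List.of(l -> l.append("read-fields;")),
                log);
        System.out.println(matched + " " + log); // true read-header;read-fields;
    }
}
```

If the classifier rejects the object, the expensive document-specific processor groups are never run, which is the whole point of the split into two collections.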

OK, you have lost me back at the recognizer. Do I need to use this Templating API?

In the most common cases, the Templating API is not used. The Templating API is a very flexible API that can be used to perform the recognition of almost any document, and with the new release, it has become even more flexible than in the old API. However, flexibility comes with increased complexity, and we are aware of that.

If we simplify it too much, then developers will not be able to add support for scanning custom documents, such as loyalty cards, or will be very constrained about what they can do. The Templating API would then not be flexible enough for many practical use cases and that would make our SDKs useless for those who want to add support for documents by themselves. Adding lots of flexibility makes Templating API very complex, but also very powerful.

Hence, we decided to make the Templating API flexible and powerful, at the cost of it being more complex. The Templating API has always been and will always be a tool for more advanced developers - typically those specializing in Microblink technology.

Platform-specific changes: iOS, Android, cross-platform

The changes described above apply to all platforms. However, there are some additional changes to mention that are specific to Android and iOS SDKs.

Name unification

A big problem in the old API was that the same concepts had different names in the Android and iOS SDKs. This was a problem in cases when a developer became familiar with the Android documentation but then needed to port their code to iOS. Code porting was not straightforward, as some recognizers and UI elements had different names, and even some basic API objects were named completely differently (for example, PPCameraCoordinator in iOS was basically the same as RecognizerView in Android - but who knew that without asking our support engineers?).

The new API, however, has unified naming across platforms. The only differences in names now are those due to a specific platform's naming conventions; for instance, the DirectAPI singleton will now be called RecognizerRunner in Android and MBRecognizerRunner in iOS. Similarly, in iOS, there is now MBRecognizerRunnerViewController, and in Android, there are RecognizerRunnerView and RecognizerRunnerFragment. Other components will likewise have similar, if not the same, names, as you will see from the new and updated documentation accompanying each SDK release.

Images are now part of the results

In the new API, besides the scanned text, the results (in the recognizer and processor) can also contain images. This is especially important for BlinkID SDK. Now, it will be much easier to obtain images of the documents as well as faces and images of signatures from the documents. Those images will no longer be sent to an image callback. Instead, images will now be part of the specific recognizer's result, just like the extracted OCR data is.

Note for Android

In order to support this, we needed to change the way recognizer objects are passed between activities via Intent. The problem is that Android has very strict limits on the size of data transferred via Intent, so it is not possible to transfer images that way. You can find details in the documentation and the troubleshooting part of the new README. Also, make sure to check the updated sample integration apps to see the changes.

iOS specific changes

Specifically for iOS, there are several notable changes to mention.
  1. Since the recognizer object is now a stateful object that gets mutated while it performs recognition, we needed to change the way results are delivered via a delegate. Previously, that happened in the didOutputResults: method, which was always called on the main thread. Now, it happens in the didFinishScanning: method, which will always be called on the background processing thread. The reason is that while this method is running, recognition cannot continue, since the same thread is busy with the callback. This gives you the opportunity to pause scanning while still on the processing thread, preventing changes to the recognizer's result that could otherwise occur as new camera frames arrive while the block is being dispatched from the processing thread to the main thread.
  2. There is no longer a didOutputMetadata delegate method. Instead, there is a separate delegate for each metadata item that can be obtained during processing. This makes it much clearer which methods need to be implemented to obtain a specific piece of metadata.
  3. The segment scan overlay has been renamed to MBFieldByFieldOverlayViewController and will now be part of the SDK. This means that integrating the field-by-field scanning feature into your app will be much easier, as you will no longer be required to copy lots of code from our sample app to get that behavior. Using the field-by-field overlay will now be as simple as using any other scanning overlay.

For more details about the iOS changes, you should always check the updated sample integration apps and documentation.

Android specific changes

Specifically for Android, there are two major changes. First, just like in iOS, we removed MetadataListener and introduced separate callbacks for each metadata item that can be obtained during processing. This makes it much easier to manage events reported by the recognition process.
Second, we introduced the RecognizerRunnerFragment for a more flexible integration of the built-in UI.


One of the questions developers often asked us was how they could embed our built-in UI into their application's UI. Unfortunately, with the old API that was not possible. Developers could either use our built-in activities or use RecognizerView to create their own scanning UI from scratch. Creating a custom UI from scratch was too much effort for some, yet they needed our scanning UI within their layout. This usually resulted in developers using our built-in activities instead of presenting the scanning interface as they originally intended, or in a poorly integrated RecognizerView that caused weird bugs and crashes in the final app.

Therefore, the new API introduces the RecognizerRunnerFragment: a fragment that controls the RecognizerRunnerView and can be skinned with different built-in overlays. Furthermore, every built-in activity is now actually implemented in such a way that it presents the RecognizerRunnerFragment in full screen and adds a specific overlay to it. This is very similar to how the iOS integration works. Developers are now given a way to simply present our built-in scanning UI somewhere within their application layout, without being forced to navigate away to a new activity.

When using RecognizerRunnerFragment or RecognizerRunnerView, notification that scanning has completed is delivered via ScanResultListener, just like with RecognizerView in the old API. However, there are some differences in behavior. Most notably, just like in iOS, ScanResultListener's onScanningDone method is no longer invoked on the UI thread. Instead, it is invoked on the background processing thread, giving you the opportunity to pause scanning and prevent the recognizer's result object from changing while the runnable block is being dispatched from the processing thread to the UI thread.
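
Why pausing on the processing thread matters can be shown with a small self-contained Java sketch (single-threaded and heavily simplified; the names are illustrative, not the SDK's actual API): pausing before handing the result over guarantees that a late frame can no longer mutate it:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class PauseBeforeDispatchSketch {
    private final AtomicBoolean paused = new AtomicBoolean(false);
    private String result = "";

    void processFrame(String frame) {
        if (paused.get()) return;  // paused: the result can no longer change
        result = frame;
    }

    // Called on the processing thread when scanning is done.
    String onScanningDone() {
        paused.set(true);          // pause FIRST, while still on this thread
        return result;             // now safe to hand over to the UI thread
    }

    public static void main(String[] args) {
        PauseBeforeDispatchSketch scanner = new PauseBeforeDispatchSketch();
        scanner.processFrame("frame-1");
        String snapshot = scanner.onScanningDone();
        scanner.processFrame("frame-2"); // arrives late; ignored because we paused
        System.out.println(snapshot + " / " + scanner.result); // frame-1 / frame-1
    }
}
```

If the callback ran on the UI thread instead, the processing thread could overwrite the result between the end of scanning and the moment the UI reads it.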

For more details about the changes in Android, you should check the updated sample integration apps and documentation.

PhoneGap, Xamarin, React Native

All the above-described changes affect only the native Android and iOS SDKs. Existing APIs used within the Cordova/PhoneGap, Xamarin, and React Native wrappers will remain the same. However, the bridging code for the native SDK will need to be updated. We will do that for our official plugins; however, if you created your own wrapper around our Android and iOS SDKs, then you will need to update it according to the new API changes.

How does all of that affect me?

To be clear, updating to the new SDK will not work straight out of the box. You will need to adapt your application to the new API. This means that you will need to get new license keys for all your applications and change the integration code. Depending on the complexity of your app, this may take from a couple of minutes to a couple of weeks, so make sure you are prepared to do the work.

But fear not! If you used the most basic level of integration, there will be only a small set of changes that you will need to apply to your app. However, if you created a custom scanning UI using Microblink SDK or if you used the Templating API to add support for scanning some custom document types, then you will need to be prepared to make some larger changes to your codebase, which may take up to a couple of weeks. But rest assured, we want the upgrade process to be as easy as possible for you, so don't hesitate to ask our engineering teams if you need help.

To help you plan the changes in your applications ahead of time, we are announcing the SDK release schedule below.

New SDK version release schedule

PDF417 SDK

PDF417 Android SDK was recently released on 22nd January with detailed documentation. PDF417 iOS SDK is scheduled for release in the first week of February. These SDKs will give you a glimpse of the new Recognizer architecture and you will have the chance to test the new license key formats.

BlinkInput SDK

In February, we also plan to release the BlinkInput SDK for both Android and iOS with the new API. This release will also contain a preview version of our next-generation DeepOCR engine. DeepOCR will be optional, so you can try it out in your own experiments, and we would welcome your feedback on how we can improve it. This release will give you the opportunity to play with the new Templating API, the new field-by-field scan, and the new Parser and Processor architectures.

BlinkID SDK and PhotoPay SDK

After we release BlinkInput SDK, the plan is to release the new BlinkID SDK and PhotoPay SDK during March and April. However, these releases will depend on the feedback and the number of issues we receive from developers who have tried the new API in BlinkInput or PDF417. 

After we make sure the new API works flawlessly, we will continue porting the BlinkID and PhotoPay SDKs to the new API.

We encourage you to try the new API with PDF417 as soon as possible and to please give us your feedback. We are still actively working on the new API so your feedback will be very valuable to us.

Ultimately, we truly hope that you will enjoy using our products with the new API at least as much as we enjoyed creating it for you.

For feedback and help with integration, please contact us on help.microblink.com.

Monday, November 20, 2017

Insurtech: optimizing processes with mobile data capture

Being one of the most consumer-centric industries, insurance is overflowing with the challenges of the digital age. While closely related industries, such as banking, are taking brave steps to keep up with the demanding digital and mobile market, insurance seems to be struggling to keep up with trends in customer experience and to cut down costs and fraud losses. Efficient management of these issues is certainly challenging, but let us tackle some of them from the fast-moving mobile environment perspective.

The challenges of digital transformation

One of the major reasons why consumers today are more keen to make important financial purchases online is the empowerment they get from having fast and easy access to all information they need to make the final decision. Empowered consumers like to do research on their own and they may contact an agent or go visit a branch office only when they have concrete questions and concerns. This is not just the case with online retail; even more sensitive and complex services like opening a bank account or getting a loan are now done within minutes via a mobile app - so why shouldn’t insurance services be done with the same ease?
Apart from having information at hand, another important factor valued by consumers is the simple functionality of the research and purchase process. Choices in complex decisions such as insurance options have to be made simple and intuitive in order to get the consumer to proceed to checkout as soon as possible. Certain parts of the insurance purchase process are dull and time-consuming, such as filling in personal information manually or providing physical copies of identity documents. But there is a way to replace the most time-consuming parts of the process with smart tech solutions and enable consumers to complete them in just a couple of seconds. By scanning identity documents with the help of OCR (optical character recognition), personal data are instantly extracted and can be automatically entered into any data management system. Besides saving a great amount of time, this neat feature doesn’t require heavy investments in new office equipment - it can be done on any mobile device or, if you still prefer desktop solutions, as part of an existing web application. Talk about killing two birds…

Developing a mobile-centric approach

Such solutions are certainly neat, but only if used as part of a well-defined and efficient digital channel strategy. A common mistake of the majority of insurance providers who are stepping into digital is that they are still trying to mimic the branch office experience and processes. Only a few of them, such as Slice or Inshur, have developed a mobile-centric approach, offering innovative and fast insurance services through mobile apps. Not only did they create more streamlined processes for consumers, but they also shortened the overall processing time. What once was a problem now becomes a new customer acquisition channel.

Fraud prevention

Introducing new tech into traditional insurance business processes sounds exciting. However, beyond using it in the right way, there is another challenge: susceptibility to fraud, which still seems harder to manage remotely than in person.
Insurance scams are fertile ground for information manipulation, mostly due to the after-the-fact nature of reporting unfortunate events. The lag in reporting details in the claims process is useful both to customers and to third parties looking to exploit the ability to bend the circumstances of such events. Besides eating up the insurer’s premiums, insurance scams carried out by third parties (medical or car service providers) can have devastating consequences for the customer’s financial or even physical health.
Don’t get discouraged too early, though. New disruptive tech offers advanced solutions, e.g. in verification and biometrics, which can help prevent fraud losses while also improving the overall customer experience, and it is becoming easier (and cheaper) to integrate into existing business processes. Even the text recognition tools mentioned earlier offer an additional security feature: they work locally, on-device, to ensure data privacy.
By combining multiple tools together, you can cover a wide range of insurance services and processes through a mobile app or online.

Easy data input & management

So far we have talked about the consumer experience and the protection of consumer data. But let’s look at the other side of the coin. The solutions mentioned above are not only useful for the consumer; they also enable more productive and efficient internal processes. Moving multiple steps of insurance services to mobile, with consumers completing those steps themselves on their own devices, allows for a faster and more effective information flow. Instead of typing, filling in a claim form through a mobile app can be incredibly easy and fast: the required information, like a VIN, driver’s license, or health care card, is simply scanned. Combined with the corresponding data management systems, this feature can generate efficiency at scale, both in day-to-day operations and across complete business processes.
We hope this helps you realize the extent to which digital transformation has become an integral part of every industry - so now it’s time for insurance to speed up and move from the branch offices and web to mobile-first services.

Related information

BlinkID - SDK for scanning identity documents as the first step in remote user acquisition
BlinkInput - SDK for scanning various predefined numbers or text, e.g. VIN, health care card, insurance number...

If you’d like to know more about OCR solutions we offer, get more info on how they can help optimize insurance processes, or explore various other use-cases, feel free to get in touch with us here. We’d be happy to have a chat with you! 

Monday, July 31, 2017

Mobile ID verification made easy

Many simple services today, such as transferring money or checking an account balance, can be done within seconds with the help of mobile banking apps. However, as a rising number of customers lean towards mobile, their expectations of quick and practical service also rise and expand in scope. For businesses looking to enter or grow in the digital realm, this is both an opportunity and a challenge. Great opportunity lies in creating more streamlined customer service processes, but more complex procedures usually come with security concerns. Here is our take on one such issue - remote identity verification.

Remember Alex, the teen who last year accidentally spent over $700 at Sephora on her mom’s credit card? Unfortunately, that anecdote is far from the worst that can happen in handling sensitive virtual transactions. Fraud and identity theft are serious issues for an increasing number of businesses moving their operations into virtual space, and matching a person’s digital identity with their real-life identity is not an easy problem to tackle.

Overcoming trust, security and user experience challenges 

What kind of solutions are out there to curtail security risks? Recent trends show a move towards a more holistic approach to verification, where multiple factors work together as parts of an integrated verification process. This is largely driven by increasingly strict regulations for businesses, such as KYC (Know Your Customer) for financial institutions. One such approach, present for a while now, is multi-factor authentication (MFA): a method in which multiple pieces of evidence of identity are presented to an authentication mechanism. A common example of MFA is the additional security question sometimes required alongside the username and password. Today it offers even greater protection, because it can combine several data sources to verify one’s identity. Aside from passwords, security questions, and SMS authentication, biometrics (in the form of fingerprint, iris, face, or voice recognition) is increasingly used as an important factor in matching identity data with a live person. This makes remote identity verification just as secure as if it were done live, in a branch office. The most advanced solutions ensure an accurate and safe verification process, but are also easy to integrate, fast, and user-friendly.

The possibilities are numerous, but here’s our quick walkthrough of what a remote identity verification process could look like, in a couple of steps:

First, instead of entering ID data manually, document data can be extracted with BlinkID, an easy-to-integrate SDK that enables real-time data extraction with high precision. Considering the sensitivity of such data, scanning and extraction are done offline, locally on the device. The client decides how to handle the information from a security point of view.
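For context on this extraction step: the machine-readable zone (MRZ) on passports and many ID cards carries built-in check digits, defined in ICAO Doc 9303, which a scanner can use to validate its OCR output. Below is a minimal, illustrative Python sketch of that standard algorithm - not Microblink’s actual API:

```python
def mrz_check_digit(field: str) -> int:
    """ICAO 9303 check digit: weighted sum of characters, modulo 10."""
    weights = (7, 3, 1)  # repeating weight pattern
    total = 0
    for i, ch in enumerate(field):
        if ch.isdigit():
            value = int(ch)
        elif ch.isalpha():
            value = ord(ch.upper()) - ord("A") + 10  # A=10 ... Z=35
        else:  # '<' filler counts as 0
            value = 0
        total += value * weights[i % 3]
    return total % 10

# Document number from the ICAO 9303 specimen passport:
assert mrz_check_digit("L898902C3") == 6
```

A scanner that re-computes these digits after OCR can reject misreads immediately, without any server round-trip.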

Second, a liveness check ensures that the user is an actual person. A number of software solutions offer variations of face movement recognition (Visage Technologies, BioID, Applied Recognition, and many more). With such software integrated into the verification process, all the user has to do is point the phone camera at their face and perform a few movements, such as blinking or smiling.

Finally, the third step brings the first two together: it matches the data extracted from the user’s identity document with the user’s real-life identity. Intelligent facial ID recognition solutions being developed around the world combine biometrics with artificial intelligence and machine learning to offer precise and accurate user identification (look up Innov8tif, Microsoft, Megvii or NEC).
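To illustrate how this matching step typically works: modern systems compute numeric "embeddings" of the ID photo and the selfie with a deep neural network, then compare them. The sketch below is a hypothetical stand-in - the embeddings would come from a face recognition model, and the 0.6 threshold is an illustrative assumption, not a value from any of the vendors named above:

```python
import math

def cosine_similarity(a, b):
    """Similarity of two embedding vectors, in the range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def faces_match(id_embedding, selfie_embedding, threshold=0.6):
    """Accept if the ID photo and the selfie are close enough in embedding space."""
    return cosine_similarity(id_embedding, selfie_embedding) >= threshold
```

In practice the threshold is tuned per model to balance false accepts against false rejects.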

Biometrics integration

Biometric solutions can be added to any of these steps to strengthen the security of the identity verification process (eye-scanning technology, voice recognition, finger- or palm prints, and the like).

What about twins?

Skeptics may argue that remote identity verification cannot tackle some challenges that live identity verification can. For example, no matter how advanced biometrics and liveness detection software may be, could they tell the difference between identical twins? To this we say: could a person behind the counter in a branch office tell them apart? Live identity verification, in this case, is equally secure as remote verification, if not less.

The process we described takes seconds to complete because it eliminates the need to manually enter the required data. However, it’s clear that a complete KYC process cannot be finished with only these three or four steps, and that additional checks and regulatory compliance are required. That is why these solutions usually come in the form of an SDK or API that is easily integrated into any app. From the user’s perspective, these are simple actions that can be done anywhere without being time-consuming. After all, isn’t that the main purpose of all new services in a mobile-centric world - to offer seamless engagement without compromising UX?

At the beginning of July, Zagrebačka banka, part of the UniCredit Group, launched a remote account opening service within its mobile banking app. It features a very simple, three-step identity verification process. The additional security and background checks behind the process are extensive, but they don’t affect the user experience, and a bank account can be opened in less than five minutes.

Besides remote account opening, there are many use-cases where identity verification is necessary: from hotel and airport check-ins, security checks, citizenship and immigration assessments to voter registration, amongst others.

Any questions? 

We hope you found this insight into ID verification useful. If you have any questions, feel free to get in touch with us via social media or e-mail here. We’d be happy to have a chat with you!

Monday, February 6, 2017

Seamless prepaid user registration

In a number of countries, regulators require telecom companies to register their new prepaid users. Before these regulations, purchasing a prepaid SIM was effortless: people would buy one just about anywhere and use it within minutes. Today’s demand is to make buying a SIM card safe while removing as much friction as possible to achieve a great user experience.

Security vs. UX… and cost

The story is as old as the technology itself. All onboarding processes are essentially a trade-off between a smooth user journey and a secure process, while keeping it all cost-effective.
Manual input of personal data into a mobile app form just doesn’t cut it for either side. For users, it means a lot of typing. For telecoms, it means unreliable data that could lead to noncompliance with regulations, or an expensive backend checking process to ensure security.

Machine-based solutions can be effective when it comes to UX and time, but they are often expensive. However, there is a solution that performs well on all three aspects (time, cost and UX) that usually drive the decision.

Simple registration via mobile app

Instead of manually entering data into the app, the mobile camera can capture all required information in real time.
Microblink offers a software development kit that uses the camera to accurately scan both the SIM card serial number and the identity document.
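A useful property of the SIM card serial number (the ICCID, specified in ITU-T E.118): its last digit is a Luhn check digit, so an app can sanity-check a scanned serial before submitting it. A minimal Python sketch of the standard Luhn algorithm - illustrative only, not part of the Microblink SDK:

```python
def luhn_check_digit(payload: str) -> int:
    """Luhn check digit: double every second digit from the right, sum, mod 10."""
    total = 0
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:  # double every second digit, starting from the right
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return (10 - total % 10) % 10

def is_valid_iccid(iccid: str) -> bool:
    """True if the last digit matches the Luhn check digit of the rest."""
    return iccid.isdigit() and luhn_check_digit(iccid[:-1]) == int(iccid[-1])

# Classic Luhn example: the payload 7992739871 takes check digit 3.
assert luhn_check_digit("7992739871") == 3
```

Rejecting misreads on-device like this keeps bad serial numbers out of the registration backend entirely.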

Therefore, for telco salespeople or their third-party resellers, the registration process can be completed anywhere, simply through a mobile app. The SDK can be integrated into any mobile app and operates 100% offline. With no server-side components, personal and other data remain on the device only.

For countries where data privacy regulations don’t allow taking images of IDs, the SDK can be used to extract data only, without ever saving the image in the phone’s memory or anywhere else.

Challenge us @ #MWC17 Barcelona!

Our technology will be presented at the Mobile World Congress in Barcelona from February 27th to March 2nd. We’ll be happy to demo our solutions and discuss your use-case.
Meet us at Hall 8.0 Stand F21.

Thursday, January 12, 2017

Where were we and where are we heading?

Looking back at 2016, we can say it was, without a doubt, a very successful year for us. So let's recap the most interesting events of 2016 and tell you what's in store for 2017.

2016 in a nutshell

The year kicked off with a spin-off of our Photomath app into an independent company. Something that started two years ago as a showcase of our state-of-the-art OCR technology grew into one of the most popular apps worldwide in the educational category!

In September last year, Photomath became the world’s first app to solve handwritten math problems in real-time using the mobile camera. Over the last two months, Photomath spent 29 days as #1 in the educational category of the US App Store, with over 40M downloads and more than 600K daily active users. At the end of the year, it was featured among the best apps of the year by the German magazine Der Spiegel, the Spanish El País, the Austrian Kronen Zeitung, and other international media.

As the year went by, PhotoPay was implemented in several new mobile banking apps in the CEE region. At the same time, we added support for documents from more than 100 countries to BlinkID. The expansion of our customer network required new talent, so our team grew to 50 people.

International Exposure

We gained the trust of over 100 new companies from all over the world, from large enterprises to startups, and increased our revenue by 100%! By the end of the year, our technology was being used by millions of end users in over 60 countries.

We went to the epicenters of innovation by participating in some of the most vibrant fintech events in Europe, the Middle East, and the USA. It was a great opportunity to catch up with partners and clients, which resulted in promising plans for expanding our existing cooperation.

The conferences and meetings with industry leaders helped us better understand the financial industry’s challenges in 2017, especially when it comes to PSD2 in Europe. This had a big effect on our product development, putting clients’ needs at the core of our innovation.

In Q3 and Q4 we proudly introduced two new products in our OCR family and once again proved that machine vision removes the obstacle of manual data entry and enables the best user experience in mobile apps.

The first is an enhancement of the well-known BlinkID product. To our simple and fast ID scanner we added new authentication functionality. An identity check is done in 3 simple steps to easily onboard new customers - scanning and extracting data from the ID, a liveness test, and face matching.

The other is BlinkReceipt, a real-time receipt scanner that captures a complete retail receipt and extracts data at the item level. It is very useful for personal finance management, marketing analysis, or loyalty programs.

So, what’s cooking in 2017?

We will continue to bring new, cutting-edge technology products to market. In 2017 we’re investing a great deal of time and resources in developing learning systems in order to upgrade our OCR to new machine learning technology. Machine learning represents a turning point in the way problems are solved in computer science, and it has attracted a lot of attention in the academic community over the last couple of years.

With the new approach, our existing real-time text recognition will be even more accurate but, more importantly, we’ll be able to add new features and cover new use-cases, languages, and various documents more easily.

Stay tuned!

Wednesday, November 2, 2016


The US presidential elections have never been more interesting, and the tight race to determine who will replace Barack Obama in the White House has reached its last week. The 2016 presidential elections are being called the most important since 1932 because of the major differences between the candidates and their parties. Crucially, many voters have still not registered: a 2012 study estimated that 24% of the voting-eligible population was not registered to vote, meaning at least 51 million US citizens.

But how do the presidential elections in the US even work? 

Almost all states require citizens who wish to vote to be registered 2-4 weeks before election day.
Since eligibility is not automatically presumed, how does a voter register? A citizen is usually required to register in the jurisdiction of their residence, while only some states accept registration at the county level. Until the mid-’90s, registration was only possible at a government office. Later, in order to increase voter turnout, the process was simplified and registration became possible at driver’s license registration centers, schools, libraries, disability centers, and by mail. This year, online registration was approved as well.

Online voter registration

Online registration means avoiding paperwork and filling out the forms on a website instead. Validation is done by comparing the submitted data with the identification document. The signature previously recorded by the state also becomes the voting signature. If the information doesn’t match, the application is sent for further review. Online registration has meant significant savings of taxpayers’ money: from 83 cents per paper registration to only 3 cents per online registration.
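The validation described above can be pictured as a simple field-by-field comparison between what the citizen submits and the record the state already holds. In this hedged sketch, the field names and the normalization rules are illustrative assumptions, not any state’s actual schema:

```python
def normalize(value: str) -> str:
    """Case- and whitespace-insensitive comparison form of a field."""
    return " ".join(value.strip().upper().split())

def matches_state_record(submitted: dict, state_record: dict) -> bool:
    """True if every field on the state's record agrees with the submission.
    A mismatch would route the application to manual review instead."""
    return all(
        normalize(submitted.get(field, "")) == normalize(value)
        for field, value in state_record.items()
    )
```

For example, a submission of `{" jane ": ...}` would still match a record storing `"JANE"`, while a differing license number would trigger the review path.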

Mobile App registration

In today’s world, smartphones have become the main means of interaction and source of information. A mobile-app voter registration solution therefore makes the voting process accessible to young people in a format they are already familiar with. Most apps of this kind provide not only registration but also general information on the candidates, ballot issues, and polling locations.

Advanced technologies enable the extraction of personal data using a mobile device camera, so there is no longer a need to enter data manually. Within an app, it’s enough to scan a driver’s license and the information is automatically extracted. MicroBlink offers this functionality and improves the user experience of several voting mobile apps. Scanning and data extraction take less than a second, which makes the whole process really attractive to the millennial voter population.

The future of online voting

There are now attempts to move the voting system online so that even more people can vote. In that case, smartphone apps would resolve many of today’s voting problems. The process would be very simple for voters: registered voters could cast their vote in real-time using their mobile device, right up until the polls close. There are opposing opinions on this system. Many argue that smartphones would be too complex for the elderly. On the other hand, there are many positive aspects, like decreasing the likelihood of invalid ballots (e.g. marking 2 candidates instead of 1) and informing voters better about the candidates. Either way, for these changes we’ll have to wait for the next elections.

There are only a few days left, and it’s predicted to be a very tight race in which every vote counts. It will be interesting to see whether these simplifications and the introduction of mobile app registration have an impact on the number of voters.

Thursday, October 20, 2016

Improving Mobile User Experience with the Best OCR

After a successful participation at Money20/20 Europe, MicroBlink has chosen Las Vegas to showcase its newest product for the financial industry: a fast and easy way to acquire new customers remotely.

Consumers want access to banking and financial services anytime, anywhere, through their mobile devices. With its machine vision software, MicroBlink removes the obstacle of manual data entry and enables the best user experience in mobile apps.

MicroBlink has developed new functionality for mobile verification. The SDK includes complete ID scanning, liveness detection, and face matching between the photo on the ID and a selfie. Thanks to advanced face matching methods, the verify feature of BlinkID provides a reliable way to confirm a user’s identity for remote access to financial services while minimising fraud risk. The main advantage of BlinkID is that it operates offline, so sensitive customer data remains on the device only.

Technology for enabling a true digital experience should make bank transactions easy, onboarding fast, and the interface seamless and beautiful, with a user-centric approach.
Today, mobile vision technology is ready to meet such challenges. Why not use the smartphone camera to read data from checks, invoices and other payment slips for quick data input, or to scan IDs for the KYC process?

MicroBlink’s advanced technology required years of AI-based research and development, and it is now available as a component that is easily integrated into all sorts of digital services, such as mobile or web apps, to eliminate manual data input.

Interesting Fact

Photomath - the world’s first app that solves handwritten math problems in real-time using the smartphone camera - is built on MicroBlink’s technology. It’s among the top 5 educational apps and has more than 36 million downloads.

MicroBlink is exhibiting at booth 939. Contact us to schedule a meeting and a demo.