Our minds are so complex that they constantly capture and use context you have most likely never even considered. Many illusions, including the recent blue/black vs. gold/white dress controversy, arise because your mind is working without you really being aware of what it is doing.
Our minds scour our memories and surroundings to piece together an understanding of what is put before us each and every moment of our lives.
The challenges our minds face in gathering context for understanding are the same challenges Intelligent Character Recognition (ICR) systems face when capturing content. Computers, however, have to work with much less context than many users realize, and for now ICR engines cannot draw conclusions from past experience the way our minds do. Even so, ICR engines and content capture solutions continue to make strides toward solving these challenges.
In the meantime, it is important to give such systems as much context as possible so they can accurately identify a document and its content. How do we currently do this?
Below is a simple ICR problem we recently encountered with a customer. The requirement was to read a set of number pairs written freehand at the top or bottom of a page. This turns out to be a very common requirement and a very difficult problem to solve.
Take a look at the characters that were given to the ICR engine to be interpreted as a number. Can you tell which number the images below are supposed to represent?
How difficult was it to recognize the numbers? With numbers, you have very little context to determine what a character can be, except that you know it must be 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9. What happened when you saw the image that looked like the letter ‘A’? Did you consider it a letter, or did you stick to the instruction that it was supposed to be a number and pick a similar-looking digit? If you configure an ICR engine to interpret only numbers, the image that looks like the letter ‘A’ becomes a ‘4’ or even a ‘9’, depending on the curvature of the character.
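To make the digits-only behavior concrete, here is a minimal sketch of what restricting the allowed character set does. The confidence scores and the `best_label` helper are hypothetical, invented for illustration; they do not come from any particular ICR engine.

```python
DIGITS = "0123456789"

def best_label(scores, allowed=None):
    """Return the highest-scoring label, optionally limited to an allowed set."""
    candidates = {
        label: score
        for label, score in scores.items()
        if allowed is None or label in allowed
    }
    return max(candidates, key=candidates.get)

# Hypothetical confidence scores for one handwritten shape that resembles an 'A'.
scores = {"A": 0.62, "4": 0.21, "9": 0.12, "H": 0.05}

unrestricted = best_label(scores)                # the engine is free to answer 'A'
digits_only = best_label(scores, allowed=DIGITS) # forced to pick the closest digit
print(unrestricted, digits_only)
```

With these made-up scores, the unrestricted answer is the letter, while the digits-only configuration falls back to the best-scoring digit even though its confidence is low.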
To provide more context to the ICR engine, fields should be added to the document being captured. These are usually boxed character fields that define the height, width, and spacing of each character. You most likely recall such boxes from federal forms like the one below:
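One way to see why boxed fields help: once the height, width, and spacing of the boxes are known, the position of every character can be computed in advance instead of guessed. The sketch below assumes a single row of equally sized boxes measured in pixels; the `Box` type, the `character_cells` function, and the sample dimensions are illustrative assumptions, not part of any specific capture product.

```python
from typing import NamedTuple

class Box(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int

def character_cells(left, top, box_width, box_height, spacing, count):
    """Compute one bounding box per character for a row of equal boxes."""
    cells = []
    for i in range(count):
        x = left + i * (box_width + spacing)  # each box starts one pitch further right
        cells.append(Box(x, top, x + box_width, top + box_height))
    return cells

# Hypothetical field: three boxes, 40 px wide, 60 px high, 8 px apart.
cells = character_cells(left=100, top=50, box_width=40, box_height=60,
                        spacing=8, count=3)
for cell in cells:
    print(cell)
```

Each cell can then be cropped and handed to the engine as exactly one character, which removes the segmentation guesswork that freehand writing forces on it.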
Simply providing these three extra pieces of context improves the accuracy of an ICR engine dramatically. Where customers get at best 80 to 85 percent accuracy without such fields, well-formed fields can improve accuracy to 95 to 98 percent.
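To put those percentages in terms of manual correction work, the short calculation below converts an accuracy rate into the number of characters an operator would still have to fix. It assumes the figures are per-character rates, which the numbers above do not specify, and the batch size of 1,000 characters is an arbitrary example.

```python
def expected_errors(total_characters, accuracy_percent):
    """Characters expected to be misread at a given accuracy percentage."""
    return total_characters * (100 - accuracy_percent) / 100

batch = 1000  # hypothetical number of handwritten characters in a batch

without_fields = expected_errors(batch, 85)  # best case without boxed fields
with_fields = expected_errors(batch, 98)     # best case with well-formed fields
print(without_fields, with_fields)
```

Under that assumption, the best case drops from 150 corrections per 1,000 characters to 20.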
At this time, ICR helps a lot with normal printed letters and numbers. For letters, ICR can even determine a character based on dictionary terms. It would be interesting to see whether ICR engines one day begin collecting data on the handwriting styles they commonly see, so that the next time an engine encounters a particular shape it can better determine the number or letter based on prior handwriting patterns.
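The dictionary idea mentioned above can be sketched with Python's standard `difflib` module: the raw ICR output is compared against a list of known terms, and the closest term wins. The vocabulary, the misread word, and the `correct_with_dictionary` helper are hypothetical, and real engines may apply dictionaries quite differently.

```python
import difflib

def correct_with_dictionary(raw_text, vocabulary, cutoff=0.6):
    """Return the closest dictionary term, or the raw text if nothing is close."""
    matches = difflib.get_close_matches(raw_text, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else raw_text

# Hypothetical dictionary of terms expected in this field.
vocabulary = ["COMPUTER", "COMMUTER", "CAPTURE"]

# Hypothetical raw output where the letter 'O' was misread as the digit '0'.
corrected = correct_with_dictionary("C0MPUTER", vocabulary)
print(corrected)
```

This is the kind of context a numbers-only field lacks: there is no dictionary of valid words to fall back on when a single digit is ambiguous.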