Highly Optimized Image Processing
StreamShot is based on finely tuned image-processing algorithms that deliver maximum throughput and minimum latency. A powerful dynamic-window binarization generates high-quality bitonal images and extracts stable text segments from difficult image backgrounds. Application-specific features can be configured manually to make the input data more significant for generic classifiers.
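The idea behind window-based binarization can be illustrated with a minimal sketch. It is not StreamShot's implementation: it assumes a plain local-mean threshold over a fixed window, computed with an integral image, and the names `binarize_adaptive`, `window`, and `offset` are hypothetical.

```python
def binarize_adaptive(gray, window=15, offset=10):
    """Local-mean thresholding (illustrative assumption, not the product's method).

    gray is a list of rows of integer gray values. A pixel becomes
    foreground (1) when it is darker than the mean of its surrounding
    window minus `offset`; all other pixels become background (0).
    """
    h, w = len(gray), len(gray[0])
    # Integral image with a zero border: integ[y][x] = sum of gray[0:y][0:x].
    integ = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            integ[y + 1][x + 1] = integ[y][x + 1] + row_sum

    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        for x in range(w):
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            area = (y1 - y0) * (x1 - x0)
            total = integ[y1][x1] - integ[y0][x1] - integ[y1][x0] + integ[y0][x0]
            # Compare pixel * area with the window sum to avoid a division.
            if gray[y][x] * area < total - offset * area:
                out[y][x] = 1
    return out


if __name__ == "__main__":
    image = [[200] * 5 for _ in range(5)]
    image[2][2] = 50  # one dark pixel on a bright background
    for row in binarize_adaptive(image, window=3, offset=10):
        print(row)
```

Because the threshold follows the local mean, a slowly varying background shifts the threshold with it, which is what makes window-based methods suitable for uneven backgrounds.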
Object Matching Based on High-Precision Contours
StreamShot achieves excellent recognition rates on small structures by extracting super-resolution contours directly from gray and color images. In contrast to pixel-based image data, quality and processing speed are independent of orientation, which results in highly flexible and robust detection. High-resolution contours are an ideal input for shape-based classifiers and for locating objects at any rotation.
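One common way to obtain contour points with finer-than-pixel resolution from gray values is to interpolate where the intensity crosses a chosen level. The sketch below shows this for a single scan line. It is an illustrative assumption about the general technique, not a description of StreamShot's contour extraction, and `subpixel_crossings` is a hypothetical name.

```python
def subpixel_crossings(profile, level):
    """Return sub-pixel positions where a 1-D gray profile crosses `level`.

    Between two neighboring samples the intensity is assumed to vary
    linearly, so the crossing position follows from linear interpolation.
    """
    crossings = []
    for i in range(len(profile) - 1):
        a, b = profile[i], profile[i + 1]
        # A crossing lies between samples i and i+1 when exactly one of
        # them is below the level (this also guarantees a != b).
        if (a < level) != (b < level):
            crossings.append(i + (level - a) / (b - a))
    return crossings


if __name__ == "__main__":
    # A blurred step edge: the contour lies between pixel 1 and pixel 2.
    print(subpixel_crossings([10, 10, 90, 90], level=50))
```

Applying the same interpolation along rows and columns of a 2-D image yields contour points whose accuracy does not depend on the pixel grid orientation.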
In-memory Fault-Tolerant Database Matching
Recognition errors of OCR engines should be kept to a minimum but cannot be avoided completely at the character level. It is therefore essential to use redundancy at the word or document level. StreamShot employs hyper-fast in-memory database matching that corrects errors at two levels. It recognizes single words or phrases that contain errors by finding the best matches in directories. Based on many words, it finds the best matches among record-like data structures and resolves complex identities such as addresses and customer IDs. Its compact memory footprint allows this technique to be used even on mobile devices.
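The two matching levels can be sketched as follows. This is a minimal illustration that assumes Levenshtein edit distance as the error measure and a brute-force scan of the directory; the actual matching algorithm and index structure are not described here, and `edit_distance`, `best_match`, `best_record`, and `max_errors` are hypothetical names.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def best_match(word, directory, max_errors=2):
    """Word level: closest directory entry, or None if it is too far away."""
    best = min(directory, key=lambda entry: edit_distance(word, entry), default=None)
    if best is None or edit_distance(word, best) > max_errors:
        return None
    return best


def best_record(ocr_words, records):
    """Record level: the record whose fields are best explained by the OCR words.

    Assumes `ocr_words` is non-empty and each record is a tuple of strings.
    """
    def cost(record):
        return sum(min(edit_distance(w, field) for w in ocr_words)
                   for field in record)
    return min(records, key=cost)


if __name__ == "__main__":
    print(best_match("Hamburq", ["Hamburg", "Homburg", "Marburg"]))
    records = [("Miller", "Hamburg"), ("Muller", "Homburg")]
    print(best_record(["Mi1ler", "Hamburq"], records))
```

The record-level score adds up the evidence of several imperfect words, so a record can be identified reliably even when every individual word contains an error.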
Configurable Decision Trees
When solving complex recognition tasks, it is difficult to achieve the highest recognition rates and the lowest error rates with a single generic approach. Instead, best-in-class solutions combine different, complementary algorithms to tackle a wide variety of challenges. recosys has more than two decades of experience in tuning recognition engines on large data sets. It has built a highly efficient and flexible development environment that includes visual inspection and systematic logging.
Decision trees allow incremental improvements to be deployed with maximum stability of the system’s behavior: instead of delivering a newly trained generic classifier (with potentially unknown behavior), decision trees can be used for finely controlled, specific error reduction.
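A small sketch may help to show how a configured tree can correct one specific error while leaving all other results of a generic classifier untouched. The tree format, the feature names `confidence` and `digit_neighbors`, and the O/0 confusion are hypothetical examples chosen for illustration, not StreamShot's configuration format.

```python
def evaluate(node, features):
    """Walk a tree of nested dicts until a leaf (a label or None) is reached."""
    while isinstance(node, dict):
        branch = "yes" if features[node["feature"]] >= node["threshold"] else "no"
        node = node[branch]
    return node


# Hypothetical rule set that only addresses the known O/0 confusion.
O_ZERO_TREE = {
    "feature": "confidence", "threshold": 0.9,
    "yes": None,  # generic classifier is confident: keep its result
    "no": {
        "feature": "digit_neighbors", "threshold": 1,
        "yes": "0",  # surrounded by digits: read as zero
        "no": "O",   # otherwise: read as the letter O
    },
}
TREES = {"O": O_ZERO_TREE, "0": O_ZERO_TREE}


def classify(generic_label, features, trees=TREES):
    """Apply a tree only to labels it was configured for."""
    tree = trees.get(generic_label)
    if tree is None:
        return generic_label  # behavior for all other labels is unchanged
    override = evaluate(tree, features)
    return generic_label if override is None else override


if __name__ == "__main__":
    print(classify("O", {"confidence": 0.6, "digit_neighbors": 2}))
    print(classify("A", {"confidence": 0.4, "digit_neighbors": 2}))
```

Because a tree is consulted only for the labels it is registered for, adding a new rule cannot change results outside that narrow case, which is the stability property described above.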
In addition to specific feature detection, StreamShot makes use of a variety of classifiers, including template matching (nearest neighbors), Bayes classifiers, neural networks, and decision trees. When large applications are deployed, the input data may be harvested and analyzed automatically. Because the different approaches behave differently, automatic training can be implemented. Ideally, training data is derived from human postprocessing, resulting in a powerful “Robotic Process Automation (RPA)” approach.