Intelligent Document Analysis for large Digital Mailroom Solutions (IDA-4-DMS)

Large companies and institutions, especially in industries such as healthcare, insurance, finance and government, have to process hundreds of thousands of incoming documents daily. Some of these documents are already electronic (e.g. email), but most are printed or even handwritten.
To handle such large volumes of incoming mail, all of these documents need to be quickly:

  • converted into a standardized electronic format (e.g. PDF)
  • sorted by specific destinations
  • pre-classified by categories and urgency
  • pre-analyzed regarding specific content.

IDA-4-DMS converts heterogeneous input data (e.g. images, PDFs, handwritten and machine-printed documents) into a standard electronic format (e.g. PDF, JSON) in a fully automatic process. The extensive use of state-of-the-art, multiple-award-winning artificial intelligence for text recognition and document understanding, combined with GPU-based supercomputing, enables unrivaled accuracy and speed. IDA-4-DMS processes more than 50,000 pages per hour on a single 4xGPU server (e.g. IBM’s Power AI) and scales fully to even larger solutions. It supports local, server-based and cloud-based deployments on standard hardware, and can also reduce hardware and maintenance costs by using optimized, dedicated high-performance computing hardware in large-scale applications. Our easy-to-use API allows IDA-4-DMS to be integrated flexibly into any custom workflow.
For more information, please visit our website or ask us for details or reference installations.

Image 1: processing scheme of IDA Core

Use-Case: Health Insurance

Since April 2017, IDA-4-DMS has been processing more than 300,000 incoming documents per day within a time window of 6 hours for a German health insurance company. IDA-4-DMS was integrated into the existing workflow by our partner IBM and runs on a high-performance computing server in a local computing center (private cloud server). At the customer’s request, IBM compared its reading quality with that of a previously leading OCR system on a collection of 10 different document categories. All of these categories are known to be challenging for text recognition, and some include very difficult script. The results can be found in the table below.

Image 2: performance comparison with a leading OCR engine on 10 different tough document types. IDA Core shows a significant improvement of more than 20% in average character-based accuracy.

Recent tests on a new high-performance GPU server from IBM, Power AI (4xP100 GPU), have shown a throughput of more than 50,000 pages per hour. IBM has announced that it will offer a complete package as a compact solution that significantly reduces hardware and service costs.
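
For rough capacity planning, the figures above can be combined into a simple estimate. The following is only a sketch: the function name and the average number of pages per document are assumptions for illustration (the page count per document is not stated here), and 50,000 pages per hour is used as a conservative lower bound for one 4xGPU server.

```python
import math

def servers_needed(documents_per_day, pages_per_document, window_hours,
                   pages_per_hour_per_server=50_000):
    """Estimate how many 4xGPU servers are needed to process the daily
    volume within the given time window.

    pages_per_document is a hypothetical average, not a published figure.
    pages_per_hour_per_server defaults to the stated lower bound.
    """
    total_pages = documents_per_day * pages_per_document
    # Pages one server can process within the whole time window
    capacity_per_server = window_hours * pages_per_hour_per_server
    return math.ceil(total_pages / capacity_per_server)

# 300,000 documents in a 6-hour window, assuming 1 or 3 pages per document
print(servers_needed(300_000, 1, 6))
print(servers_needed(300_000, 3, 6))
```

With one page per document, the health insurance workload corresponds to 50,000 pages per hour, which matches the throughput measured on a single server. Longer documents or a shorter time window increase the estimate accordingly.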
