Mohomine, Inc.
Software development
Info
Mohomine develops technology that automatically classifies and extracts text from unstructured documents, meaning documents in which it cannot be predicted where the salient content will appear. According to Gartner Group, 80 percent of business documents are unstructured, including e-mail, Web pages, PDF files and paper contracts. The rest are structured, such as forms, or semi-structured, such as invoices.

Existing information capture technology addresses an extremely small percentage of this unstructured data and a somewhat larger proportion of semi-structured and structured documents. Until now, capture systems could not classify or index unstructured documents without substantial manual intervention, such as document preparation and manual indexing. In many cases the cost of this approach has outweighed the benefits of capturing the information electronically.

Mohomine technology helps to solve those problems. It reduces manual labor costs and improves the speed and quality of information access by automating data entry and document classification. Mohomine's two primary technologies are the mohoClassifier and the mohoExtractor. Both technologies are:

Highly scalable: The pattern-recognition techniques used by Mohomine can process huge volumes of text data on low-end, inexpensive hardware. Humans can classify 20 to 100 documents per hour; Mohomine software can classify 20 to 100 documents per second. Another way to look at it: the mohoClassifier can categorize 50 to 100 megabytes of text a minute on desktop hardware, which approximates a 300-page novel per second.

Language independent: Unlike many natural language processing approaches, Mohomine's pattern-recognition software doesn't rely on understanding each language. It has been used successfully with many European languages, Arabic and Chinese.

Highly accurate: Accuracy ranges from the mid-60s to the high-90s in percent, depending on the type of document. According to Gartner, a classifier that achieves 60-percent accuracy justifies the cost of installing and maintaining the system.

Easy to deploy and integrate: The learn-by-example architecture, combined with APIs that are easy to understand and use, enables the Mohomine software to be packaged within existing information capture products and deployed quickly and at low cost by people who are not classification or extraction experts. Competing rules-based architectures require the labor-intensive development and maintenance of ontologies for each unique application.

Mohomine was acquired in April 2003 by Kofax, the product development center of DICOM Group. Mohomine technologies dovetail with Kofax information capture technologies for document scanning, XML capture and distributed capture. The result is an end-to-end solution for automatically capturing extremely large volumes of unstructured data and delivering it into any of 100 content and document management applications. For more detailed information on Mohomine products and technology, visit our Products section or contact us.
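The throughput figures quoted in the scalability claim above can be sanity-checked with some rough arithmetic. The sketch below is only an illustration: the page size (about 2,000 characters per page, one byte per character) is an assumption not found in the source, and the function names are hypothetical.

```python
# Back-of-the-envelope check of the quoted throughput figures.
# ASSUMPTION (not from the source): a novel page holds about 2,000
# characters, stored at one byte per character.
CHARS_PER_PAGE = 2_000
PAGES_PER_NOVEL = 300
BYTES_PER_MB = 1_000_000


def novels_per_second(mb_per_minute: float) -> float:
    """Convert a text throughput in MB/minute into 300-page novels per second."""
    bytes_per_second = mb_per_minute * BYTES_PER_MB / 60
    novel_bytes = CHARS_PER_PAGE * PAGES_PER_NOVEL
    return bytes_per_second / novel_bytes


def speedup(docs_per_second: float, docs_per_hour: float) -> float:
    """Ratio of a per-second software rate to a per-hour human rate."""
    return docs_per_second * 3600 / docs_per_hour


if __name__ == "__main__":
    for rate in (50, 100):
        print(f"{rate} MB/min is about {novels_per_second(rate):.1f} novels per second")
    print(f"Software vs. human at the same point in each range: {speedup(20, 20):.0f}x")
```

Under that page-size assumption, 50 to 100 megabytes a minute works out to roughly 1.4 to 2.8 novels per second, which is in line with the "300-page novel per second" approximation. Because the human and software ranges use the same numbers (20 to 100) and differ only in the time unit, the implied speedup is the number of seconds in an hour, 3,600.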
Industries / Specializations
Software development
5120 Shoreham Place Suite 200, 92122 San Diego