What is Document Capture?

8 Document Capture Must Haves

Document capture software provides the ability to “capture” your paper documents. Simply put, it’s the software that works with your scanner to take the scanned file, associate or tag it with meaningful terms (indexing), process it to increase its usability, and often integrate it with a document management system.

Most scanners ship with software that handles basic processing of scans. Intelligent data capture software adds functionality aimed at increasing efficiency and lowering costs. This article introduces the functions and features of intelligent data capture so that you can better evaluate your needs.

Whether you want to capture patient records, invoices, freight tickets, student records, waybills, accounting records, or any other type of document (even existing scanned files), consider these eight “Must Haves” for your capture solution.

1. Support for Multiple Input Devices and Formats

Does the capture system work with standard input devices and file formats?

  • MFP Devices
  • Standard TWAIN Scanners
  • Embedded Touchscreen Scanners
  • Existing PDF or TIFF Files

2. Flexible Indexing

Indexing is the process of tagging or associating information with a file so it can be used for search and processing purposes later. Index information is also referred to as “metadata.” Great care should be taken to design an efficient indexing scheme. If the scheme is not designed correctly at the outset, rectifying it later can be both difficult and costly. A variety of methods and technologies can be used to extract this data. Read more on Indexing.

Can I extract metadata or index information using:

  • Barcodes
  • Zonal Optical Character Recognition (OCR)
  • Drag and Drop OCR
  • Manual Entry
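
To make zonal OCR concrete, here is a minimal sketch. It assumes an OCR engine has already returned each word with a pixel bounding box. The tuple layout and the function name text_in_zone are hypothetical and do not describe any particular product's API.

```python
def text_in_zone(words, zone):
    """Join the OCR words whose boxes lie entirely inside `zone`.

    `words` is a list of (text, left, top, right, bottom) tuples and
    `zone` is (left, top, right, bottom), all in pixels (assumed layout).
    """
    z_left, z_top, z_right, z_bottom = zone
    inside = [
        (top, left, text)
        for text, left, top, right, bottom in words
        if left >= z_left and top >= z_top and right <= z_right and bottom <= z_bottom
    ]
    # Sort by vertical, then horizontal position to approximate reading order.
    return " ".join(text for _, _, text in sorted(inside))
```

Calling text_in_zone with a zone drawn around the invoice number area of a page returns only the words found there, which can then be stored as an index field. Real OCR output often has slightly uneven baselines, so a production version would likely group words into lines before sorting.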

3. Automated File Naming, Splitting and Routing

Can I automatically name, split, and route or distribute files based on:

  • Barcodes
  • Image OCR’d Text
  • System Data
  • Extracted Data

Many documents contain vital information in the form of 1-D (Linear) or 2-D barcodes. With an intelligent data capture solution, these markers can be used to identify new breaks in documents as well as name the file or destination folder. Read more about barcodes and what they can do here.

Text strings captured from within the document by OCR technologies can also be used to split, name, and route files with an intelligent data capture solution. System data such as date, time, and page number can likewise be used to automate the splitting, naming, and routing of files.
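
The splitting and naming rules described above can be sketched in a few lines. This is an illustration only: it assumes barcodes have already been decoded per page, and the Page class, the "UNSORTED" fallback name, and the year/month folder layout are all hypothetical choices.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Page:
    number: int                    # position in the scanned stack
    barcode: Optional[str] = None  # decoded barcode value, if any (assumed field)

def split_on_barcodes(pages):
    """Start a new document at every page that carries a barcode.

    Returns a list of (name, page_numbers) pairs; the barcode value
    becomes the document name.
    """
    documents = []
    for page in pages:
        if page.barcode is not None:
            documents.append((page.barcode, [page.number]))
        elif documents:
            documents[-1][1].append(page.number)
        else:
            # Pages before the first barcode go into a fallback document.
            documents.append(("UNSORTED", [page.number]))
    return documents

def output_path(name, scan_date):
    """Build a destination from the barcode value and system data (the scan date)."""
    return f"{scan_date:%Y}/{scan_date:%m}/{name}.pdf"
```

A five-page stack with barcodes on pages 1 and 3 would yield two documents, one holding pages 1 and 2 and the other holding pages 3 through 5, each routed to a folder derived from the scan date.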

4. Intuitive Interface

Is the system user friendly?

  • Easy Page Navigation
  • Touchscreen Enabled
  • Onscreen Keyboard
  • Preview Extraction and Splitting


5. Standard Output

What file formats and output can you create?

  • PDF
  • Searchable PDF
  • TIFF
  • CSV or XML

Well-established, standard formats such as PDF and TIFF dominate the document output choices. Both formats support black and white, grayscale, and color, as well as multipage output. PDF is the preferred choice when more advanced features are needed within the file, such as embedded OCR text, hyperlinks, digital signatures, and other security features.

Intelligent data capture solutions should support the creation of CSV or XML formats. These standard file formats are used to export indexing information to a document management or search and retrieval system.
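
As a rough illustration of what such an export might contain, the sketch below writes the same index records as CSV and as XML using Python's standard library. The field names (file, invoice_number, vendor) and the XML element names are assumptions; a real document management system will define its own import layout.

```python
import csv
import io
import xml.etree.ElementTree as ET

FIELDS = ["file", "invoice_number", "vendor"]  # hypothetical index fields

def to_csv(records):
    """Serialize index records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

def to_xml(records):
    """Serialize index records as one <document> element per file."""
    root = ET.Element("documents")
    for record in records:
        doc = ET.SubElement(root, "document", file=record["file"])
        for field in FIELDS[1:]:
            ET.SubElement(doc, field).text = record[field]
    return ET.tostring(root, encoding="unicode")
```

Note that the csv module quotes values containing commas (for example a vendor name like "Acme, Inc."), which is one reason to use a library rather than joining strings by hand.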

6. Integrates with Existing Document Management or EMR

Can I easily share index data with my Document Management or EMR System?

When designing your workflow and data capture solution, a key consideration is whether you will implement a document management or EMR/EDR/EHR solution, or a simple search and retrieval system. If your capture solution can export CSV or XML format files, you most likely will be able to share index information with a management solution. Some businesses begin a capture implementation and capture their documents into a file folder hierarchy that is a bridge to a more sophisticated document management system implemented at a later date.

7. Batch Processing

Can I process a whole folder of documents, “watch folders,” or process a stack of documents all at once?

Batch processing allows for unattended document processing: a scan operator can scan a stack of documents and let the capture software process it based on pre-established rules or profiles. Some advanced document capture solutions can watch or “poll” Windows folders and process new files as they are put in the folder. The processing can include technologies and methods mentioned previously, such as barcode recognition and OCR, to automatically split, name, and route documents. With batch processing, user intervention is limited and processing time and costs are reduced.
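
Folder polling can be sketched as follows. This is a simplified illustration, not a description of how any specific capture product works; the function name poll_once and the list of file extensions are assumptions.

```python
import os

def poll_once(folder, seen, extensions=(".pdf", ".tif", ".tiff")):
    """Return scan files that appeared in `folder` since the last poll.

    `seen` is a set of file names already handled; it is updated in place.
    """
    new_files = []
    for name in sorted(os.listdir(folder)):
        if name in seen or not name.lower().endswith(extensions):
            continue
        seen.add(name)
        new_files.append(os.path.join(folder, name))
    return new_files
```

A watcher would call poll_once in a loop with a short pause between calls and hand each returned path to the processing rules. A production watcher would also need to make sure a file has finished being written before processing it, which this sketch does not attempt.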

8. Image Enhancement

Can scans be cleaned or enhanced?

Image enhancement improves the usability of scans and increases the accuracy of OCR, so it is required in a document capture solution. Typical image enhancement might include deskew, despeckle, and rotate functions. Truly intelligent capture should also include options to remove blank pages, remove separator sheets, autorotate, remove lines, and apply adaptive thresholding. Adaptive thresholding technology assists in cleaning “dirty” documents or documents that have a colored background which interferes with the foreground data.
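
To show the idea behind adaptive thresholding, here is a minimal pure-Python sketch of one common variant, local mean thresholding. It is an illustration under simple assumptions (a small grayscale image stored as a list of rows, with hypothetical radius and offset defaults) and is not the algorithm used by any particular product.

```python
def adaptive_threshold(gray, radius=1, offset=10):
    """Binarize a grayscale image (list of rows, values 0-255).

    A pixel becomes black (0) when it is darker than the mean of its
    neighborhood minus `offset`; otherwise it becomes white (255).
    """
    height, width = len(gray), len(gray[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the pixels in a square window, clipped at the image edges.
            window = [
                gray[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            local_mean = sum(window) / len(window)
            row.append(0 if gray[y][x] < local_mean - offset else 255)
        result.append(row)
    return result
```

Because each pixel is compared with its own surroundings rather than with one global cutoff, dark text on a dark colored background is still separated from that background, where a single fixed threshold would turn the whole area black.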


Document Capture Nice to Haves

Now that we’ve discussed the key elements of intelligent document capture that should be considered in the design of every document capture workflow, let’s discuss options or features that would be “nice to have” in your workflow.

1. Search and Retrieve

Does the capture system double as a search and retrieval system? As discussed earlier, document capture solutions should allow you to integrate your scans and indexing information into a document management solution. Some document capture systems include a simple search and retrieval database that can be used as a stepping stone to document management at a later date. Or, if you just need a simple solution, the capture software database might fit your needs completely.

2. Easy Administration

Consider these questions, which can make life easier for your administrators.

  • Can I create reusable templates or workflow “sets”?
  • Can I pretest my workflow instructions?

3. Ability to Use Regular Expressions

Regular expressions (regex) provide a fast and powerful method to search, extract, and replace specific data found within scanned documents. A regular expression is essentially a special text string that describes a search pattern. You could think of regular expressions as extremely powerful wildcards that the intelligent data capture software can employ to find your key information. Regex is extremely flexible, and patterns can be constructed to match almost anything. Learn more at Using Regular Expressions for Automated Data Capture and Extraction.
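
As a small illustration, the sketch below uses Python's re module to pull an invoice number and a date out of OCR'd text. The patterns and field names are hypothetical and assume US-style month/day/year dates; real documents will need patterns tuned to their own layout.

```python
import re

# Hypothetical patterns for illustration only.
INVOICE_RE = re.compile(r"Invoice\s*(?:No\.?|#)\s*:?\s*(\d{4,})", re.IGNORECASE)
DATE_RE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")

def extract_index_fields(ocr_text):
    """Return the index fields found in the OCR text as a dictionary."""
    fields = {}
    invoice_match = INVOICE_RE.search(ocr_text)
    if invoice_match:
        fields["invoice_number"] = invoice_match.group(1)
    date_match = DATE_RE.search(ocr_text)
    if date_match:
        month, day, year = date_match.groups()
        # Normalize to YYYY-MM-DD so the value sorts and searches consistently.
        fields["date"] = f"{year}-{int(month):02d}-{int(day):02d}"
    return fields
```

The same pattern matches "Invoice No: 100482", "Invoice #100482", and "INVOICE NO. 100482", which is the kind of flexibility a plain wildcard search cannot offer.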

4. Digital Rights and Security

Can I restrict viewing, printing, cut/paste and apply passwords to the scanned documents and files? Are there auditing functions?

With today’s privacy concerns, especially surrounding HIPAA compliance, security of your documents and information is key.

5. Rollback Features

Can I return to the original scanned image if needed? What kind of backup features are included?

6. Branding Opportunities

Can I brand the solution for multiple departments or customers?


So where do I go from here?

Learn more about DocuFi’s intelligent data capture solution,  ImageRamp.