Armedia has a customer using Captiva 7 to automatically capture tabular information from scanned documents. They wanted to export the tabular data to a CSV file to be analyzed in Excel.
Capturing the tabular data in Captiva Desktop proved to be simple enough; the challenge was exporting it in the desired format. Our customer wanted each batch to create its own CSV file, and that file needed to contain a combination of fielded and tabular data expressed as comma-delimited rows.
Here is an example of one of the scanned documents with the desired data elements highlighted.
Here is an example of the desired output.
EMPLOYEE,EID,DATE,REG HRS,OT HRS,TOT HRS
As you can see, the single fields of Employee Name and Employee Number are repeated on each row of the output. However, because Employee Name and Employee Number were not captured as part of the tabular data on the document, this export format proved to be a challenge.
Here’s what I did:
- In the Document Type definition, I created fields for the values I wanted to capture and export (Name, EmployeeNbr, Date, RegHrs, OTHrs, TotHrs). Here’s how it looks in the Document Type editor:
- In the Desktop configuration, I configured:
  - Output IA Values Destination: Desktop
  - Output dynamic Values: checked
  - Output Array Fields: Value Per Array Field
- Finally, I created a Standard Export profile that output the captured fields as a text file, not a CSV file. I named the file with a “CSV” extension so Excel could easily open it, but to create the required output format, the file had to be written as a text file. Here is what the Text File export profile looks like:
The content of the Text file export profile is:
EMPLOYEE,EID,DATE,REG HRS,OT HRS,TOT HRS
---- Start repeat for each level 1 node ----
---- Start repeat for each row of table: Desktop:1.UimData.Hours ----
---- End repeat ----
---- End repeat ----
By using two nested loops, I was able to access the non-tabular fields, Name and EmployeeNbr, as well as the tabular fields in the same output statement. This looping feature of the Text File export profile saved me from having to write a CaptureFlow script to iterate through all the table variables and concatenate strings for export. A nice feature, but not well documented.
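For readers who think in code, here is a rough Python sketch of what the two nested loops accomplish. It is not Captiva code and does not use any Captiva API. The dictionary layout, the function names (flatten_batch, batch_to_csv), and the sample values are all made up for illustration; only the field names and the column header come from the setup described above.

```python
import csv
import io


def flatten_batch(documents):
    """Yield one output row per table row, repeating the document-level fields."""
    for doc in documents:            # outer loop: one pass per document (level 1 node)
        for row in doc["Hours"]:     # inner loop: one pass per row of the Hours table
            yield [
                doc["Name"],         # non-tabular field, repeated on every row
                doc["EmployeeNbr"],  # non-tabular field, repeated on every row
                row["Date"],
                row["RegHrs"],
                row["OTHrs"],
                row["TotHrs"],
            ]


def batch_to_csv(documents):
    """Return the CSV text for one batch: a header line plus the flattened rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["EMPLOYEE", "EID", "DATE", "REG HRS", "OT HRS", "TOT HRS"])
    writer.writerows(flatten_batch(documents))
    return buffer.getvalue()


# Hypothetical sample data standing in for the values captured from a batch.
sample_batch = [
    {
        "Name": "Jane Doe",
        "EmployeeNbr": "1001",
        "Hours": [
            {"Date": "01/06/2014", "RegHrs": "8.0", "OTHrs": "1.5", "TotHrs": "9.5"},
            {"Date": "01/07/2014", "RegHrs": "8.0", "OTHrs": "0.0", "TotHrs": "8.0"},
        ],
    },
    {
        "Name": "John Roe",
        "EmployeeNbr": "1002",
        "Hours": [
            {"Date": "01/06/2014", "RegHrs": "7.5", "OTHrs": "0.0", "TotHrs": "7.5"},
        ],
    },
]

if __name__ == "__main__":
    print(batch_to_csv(sample_batch), end="")
```

The point of the sketch is the shape of the loops: the outer loop supplies the single-valued fields, the inner loop supplies the table columns, and both are available when each row is written. That is the work the export profile did for me without any scripting.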