Over the past 12 months (or maybe longer) I have been involved in two separate projects integrating Alfresco with Daeja ViewONE Pro.
Daeja also offers modules that extend the functionality of its offering. Let’s take a look at the two pieces of functionality that I needed and how Daeja delivered on these needs:
For this I used the PDF Module, which adds PDF support to Daeja ViewONE Pro. This allows users to annotate PDF files in ways other than those Adobe itself offers.
The Permanent Redaction module came about because clients required annotations that, for security purposes, needed to be burned into the document itself. A redaction is a special kind of annotation. When a redaction is placed onto a document, its intended purpose is to obscure the information below it. In other words, to make it unreadable and unrecognizable to anyone or anything. This reminds me of the film Good Morning, Vietnam, when Adrian Cronauer is given news to read out and most of the pages contain black bars. This is redaction, and in this case permanent redaction. See the example image below.
This is where Daeja’s Permanent Redaction module comes in. It can take the annotations** on the document, pass them through the Permanent Redaction module and burn them into the document, producing either a TIFF or a PDF as output. Please note that it is up to the client and implementer to decide how and when the burning takes place, and who has access to the original content and who has access to the burned content.
OK, enough about Daeja. Back to the projects.
Annotation Example: Project 1
The client was not concerned about maintaining an annotations file. They simply wanted all redactions to be burned into the document and the document to be versioned (so as to maintain a history of redactions). This was to support redacting PII (Personally Identifiable Information) in response to FOIA (Freedom of Information Act) requests. In this case all documents being handled were PDFs. If you remember, one of the output options of Permanent Redaction is PDF. This made things simple, in that each new version was the result of a burn, and it greatly simplified the security model.
The burning process can be implemented in two ways, on demand or as background processing, and the client wanted it to be on demand.
When implementing the burning process, it was decided to run it in a separate Tomcat instance from Alfresco. Not knowing how frequently documents would be worked on, and knowing that some of these PDFs could in fact be over 200 MB in size, offloading this to another instance seemed to be the correct way to go. Daeja provides example code showing how to implement the burning process, but it is up to the implementer to customize this for the particular environment it is intended for.
Now, you may ask: if Web Scripts are so awesome, why use Web Services in combination with them? The issue for this client was the size of the PDF files, potentially 200 MB or more. This is a limitation of Web Scripts, in that uploading large files is a memory and performance hit. This is the main reason Web Services were included in the combo. Web Services include a highly optimized set of functions for streaming content into the repository. Thankfully Alfresco also included the ability to share the authentication token between the different APIs!
To summarize for this client: Daeja ViewONE Pro is integrated into Alfresco. When a document is opened for redaction, the applet is rendered to the browser through a custom webscript. When the user clicks on the ‘burn’ button, the noderef of the document is sent, along with the annotation data, to the Permanent Redaction code running in the separate Tomcat instance. That custom code fetches the document from the repository via webscripts, applies the annotations and burns them into a new PDF. This PDF is then uploaded to the repository as a new version of the starting document.
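To illustrate why streaming matters for files of this size, here is a minimal sketch in Python of copying content in fixed-size chunks instead of reading it all into memory at once. It is not the Alfresco Web Services API and not the code used on the project. The function name `stream_copy`, the chunk size and the in-memory buffers standing in for the burned PDF and the upload target are all assumptions made for the example.

```python
import io

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice for this sketch


def stream_copy(source, destination, chunk_size=CHUNK_SIZE):
    """Copy source to destination chunk by chunk and return the bytes copied.

    Only one chunk is held in memory at a time, so memory use stays flat
    no matter how large the file is.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        destination.write(chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    # Hypothetical stand-ins: a fake "burned PDF" and an upload target.
    burned_pdf = io.BytesIO(b"%PDF-1.4 " + b"x" * 5_000_000)
    upload_target = io.BytesIO()
    print(stream_copy(burned_pdf, upload_target), "bytes copied")
```

In a real integration the source would be the burned PDF on disk and the destination would be whatever streaming upload the repository API exposes. The point is only that the whole file never has to sit in memory.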
Annotation Example: Project 2
The second project followed more of the traditional annotation model. The client wanted the ability to annotate documents based upon content type, content format, and user role and permissions. One of the requirements was to NOT version the document. To be honest, this is a requirement I like. The document itself is not changing, so why constantly version it? The way Daeja works is that it maintains a separate content file with the annotations. The initial integration of Daeja and Alfresco was done by Dr. Qu (Alfresco) and continued by Jared Ottley (Alfresco). This initial integration includes a basic annotation content model which associates the annotation file with the document as a child association. Simple and effective. The one thing that kept nagging at me for this client was that it would be nice to version the annotations applied to the document, so that there was a visual history. Using the power of Web Scripts and FreeMarker, a template was created to list the versions of the annotation file associated with a document, if it has one.
Another useful Daeja feature is the ability to specify a server-side script, CGI, etc. to call when the annotation save button is pressed. Web Scripts are invoked via HTTP. This means the Daeja save annotation button can invoke a custom webscript. This webscript adds the versionable aspect to the annotations file (if it is not already attached) and then versions the annotation file. The webscript also attaches the FreeMarker template to the document to provide access to the versioned annotation files. Clicking on a previous annotation version opens the document with the specified annotations and displays it in Daeja ViewONE Pro. Oh, and this is in read-only mode so that previous versions cannot be rewritten!
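The save flow described above can be sketched with a toy in-memory model. This is not Alfresco code and not the webscript from the project: `AnnotationNode` and `save_annotations` are hypothetical names, and the only thing borrowed from Alfresco is the name of the standard `cm:versionable` aspect. The sketch just shows the "add the aspect only if it is missing, then record a version" logic.

```python
from dataclasses import dataclass, field

VERSIONABLE = "cm:versionable"  # name of Alfresco's standard versionable aspect


@dataclass
class AnnotationNode:
    """Hypothetical stand-in for the annotation file stored in the repository."""
    content: str = ""
    aspects: set = field(default_factory=set)
    versions: list = field(default_factory=list)


def save_annotations(node, new_content):
    """Save new annotation content and return the resulting version count."""
    # Attach the versionable aspect only on the first save.
    if VERSIONABLE not in node.aspects:
        node.aspects.add(VERSIONABLE)
    node.content = new_content
    # Keep every saved state so earlier annotations can be viewed later.
    node.versions.append(new_content)
    return len(node.versions)


if __name__ == "__main__":
    node = AnnotationNode()
    save_annotations(node, "first set of annotations")
    save_annotations(node, "second set of annotations")
    print(len(node.versions), sorted(node.aspects))
```

Calling the function repeatedly is safe: the aspect is added once and each save adds exactly one entry to the history.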
It should be noted that the user/date label being applied to the document was actually an annotation itself. Annotations must be on the document and must not fall outside the page boundaries. This meant some calculations were needed to determine the location of the annotation and to decide whether the label would be applied below or above it. The next consideration was how close the annotation was to the right-hand margin. I would like to say some complex mathematical formula was implemented, but it was some simple maths that came to the rescue. Actually, this functionality is most likely cause for another blog post, as there are other things to consider.
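The kind of simple maths involved can be sketched as follows. This is an illustration under stated assumptions, not the code from the project: coordinates are assumed to start at the top-left corner of the page with y increasing downwards, the label is assumed to go below the annotation when it fits and above it otherwise, and the names `Rect`, `place_label` and `gap` are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    x: float
    y: float
    width: float
    height: float


def place_label(annotation, label_width, label_height,
                page_width, page_height, gap=2.0):
    """Return a Rect for the label that stays inside the page boundaries."""
    # Prefer placing the label just below the annotation (an assumption).
    below_y = annotation.y + annotation.height + gap
    if below_y + label_height <= page_height:
        y = below_y
    else:
        # Not enough room below, so put the label above the annotation.
        y = annotation.y - gap - label_height
    y = max(0.0, y)  # never go past the top edge

    # Start at the annotation's left edge, then pull the label back
    # if it would cross the right-hand margin.
    x = annotation.x
    if x + label_width > page_width:
        x = page_width - label_width
    x = max(0.0, x)  # never go past the left edge

    return Rect(x, y, label_width, label_height)


if __name__ == "__main__":
    redaction = Rect(x=560, y=780, width=30, height=15)
    print(place_label(redaction, 80, 10, page_width=600, page_height=800))
```

In the usage example the redaction sits in the bottom-right corner of a 600 by 800 page, so the label is moved above it and shifted left to stay on the page.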
This project was different to the first, but both were great learning experiences in the areas of annotation and redaction, and in what clients are looking for. They have provided great information and ideas on what else could be done in this area to provide even more cool integration features.
*currently Daeja supports over 300 document types!
**TIFF and PDF support a subset of annotations which can be burned into the document.