Did you know that the industry standard for manually transcribing 1 hour of clear audio is 4 hours? That is a 4 to 1 ratio. Poor audio can take as much as 9 hours per hour of recording (a 9 to 1 ratio), so transcribing audio and video files by hand consumes a lot of staff time. Think about how much society relies on audio and video files these days. Recently, a law enforcement agency asked us to help solve this challenge. The agency was spending about 8 hours on every 1 hour (8 to 1) of body camera or interview audio or video. Our solution (ArkCase, Alfresco, and AWS), originally proposed for their IT modernization project, addressed this challenge. Intrigued, they asked for a demo. Seeing a clear opportunity to demonstrate further value, we asked for a few weeks to prepare it.
Before We Get to the How, Let’s Describe the What
What are the platforms being used in this solution?
- For starters, ArkCase is a low-code IT modernization or case management platform. ArkCase aims to be the premier open source low-code case management platform.
- Alfresco is the leading open source enterprise content management platform providing core document management, business process management and records management services.
- AWS is the leading cloud-based platform as a service (PaaS) supporting our ability to comply with FedRAMP, CJIS, HIPAA, HITECH, SOC 2 and other security controls.
ArkCase provides an intuitive, accessible and responsive user interface to view, stream, and transcribe rich media files. ArkCase integrates with the Amazon Transcribe service to provide high-volume, high-quality and cost-effective transcription of audio and video files. Users can upload audio and video files within ArkCase, and administrators can configure whether files are transcribed automatically or sent for transcription manually, which lets organizations control their costs. ArkCase then uses an Alfresco Activiti workflow to process the files with the AWS automatic speech recognition (ASR) service, which produces the transcription files. The outcome is a highly visual transcription with closed captions that can be searched and edited for enrichment. In addition, the transcription can be compiled into a Microsoft Word document for sharing. The user interface allows streaming the rich media file while viewing the transcription text, which is extremely helpful when manually QAing the transcription text.
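The hand-off to Amazon Transcribe described above can be sketched as a small request builder. This is an illustration, not ArkCase's actual workflow code: the parameter names follow boto3's `start_transcription_job` call, while the helper function, the job name, the bucket, and the file key are hypothetical.

```python
from pathlib import PurePosixPath

# Formats accepted by this sketch (an assumption made for illustration).
SUPPORTED_FORMATS = {"wav", "mp3", "mp4", "flac"}


def build_transcription_request(job_name, bucket, key, language_code="en-US"):
    """Build keyword arguments for boto3's start_transcription_job.

    `bucket` and `key` are hypothetical names for a file already stored in S3.
    """
    media_format = PurePosixPath(key).suffix.lstrip(".").lower()
    if media_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported media format: {media_format!r}")
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "MediaFormat": media_format,
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
    }


request = build_transcription_request(
    "case-1234-interview", "example-evidence-bucket", "interviews/interview-01.mp4"
)
# With boto3 installed and AWS credentials configured, the job could be started with:
#   boto3.client("transcribe").start_transcription_job(**request)
```

Keeping the request builder separate from the network call makes it easy to check what would be sent before any audio leaves the case management system.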
What Does ArkCase Transcription Functionality Provide?
The ArkCase user can perform many different actions on the transcribed file:
- Viewing of the File Details
  - Total Word Count
  - Confidence Rating of the Transcribed File
  - Transcription Status (In Process, Complete)
- Listening to or Viewing the File in a Streaming Viewer with Closed Captions
- Searching for text within audio or video files
- Jumping to that section of the audio or video
- Tagging the file based on the transcription
- Viewing Individual Sections of the Transcription Text
- Each Transcription Text Section Shows:
  - Start Time of that Section of Text
  - Confidence Rating of that Section of Text
- Editing Individual Sections of Transcription Text during QA
- Automatically Compile the Transcription Text into a Single Document File
Can ArkCase Transcription Functionality be Configured?
ArkCase transcription functionality also provides administrative settings for certain transcription options within the ArkCase application:
- Enable Transcription to turn on transcription
- Automatic or Manual Transcription to decide if you want all rich media files processed or to manually select the files you want to be processed
- Word Count per transcribed section for chunking and readability
- Confidence Threshold for highlighting sections that may need human review
An administrator can control the transcription functionality for all users of the ArkCase application by enabling or disabling it. If the functionality is enabled, rich media files can be sent to AWS for transcription manually or automatically. If the administrator enables automatic transcription, each rich media file uploaded into ArkCase is automatically sent to the Amazon Transcribe service. The administrator can also control the word count and confidence threshold for each section of transcription text. ArkCase flags any section of transcription text that does not meet the configured confidence threshold.
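To make the word count and confidence threshold settings concrete, here is a minimal sketch of how transcribed words could be chunked into sections and flagged for review. The class, the field names, and the choice to average word confidences per section are assumptions made for illustration; they are not ArkCase's actual implementation.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TranscribeConfig:
    # Hypothetical field names mirroring the admin options described above.
    enabled: bool = True
    automatic: bool = False
    words_per_section: int = 5
    confidence_threshold: float = 0.80


def build_sections(words, config):
    """Group (word, confidence) pairs into sections and flag low-confidence ones."""
    sections = []
    for start in range(0, len(words), config.words_per_section):
        chunk = words[start:start + config.words_per_section]
        # Assumption: a section's confidence is the mean of its word confidences.
        confidence = mean(c for _, c in chunk)
        sections.append({
            "text": " ".join(w for w, _ in chunk),
            "confidence": round(confidence, 2),
            "needs_review": confidence < config.confidence_threshold,
        })
    return sections


words = [("the", 0.99), ("suspect", 0.95), ("left", 0.97), ("the", 0.99),
         ("building", 0.90), ("around", 0.60), ("nine", 0.55), ("pm", 0.70)]
sections = build_sections(words, TranscribeConfig(words_per_section=5))
for section in sections:
    print(section)
```

In this sample, the second section falls below the 0.80 threshold, so a reviewer would be pointed to it first.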
ArkCase transcription can support many different use cases, no matter your specific business. If you want to gain efficiencies and get more value from your audio and video content, let’s talk.
In recent years, multimedia content has been used more than ever. This means that the amount of the world’s data trapped in multimedia formats keeps growing and becomes harder to use. To help you solve this problem and use multimedia content to your advantage, we built this integration of ArkCase, Alfresco, and AWS.
Together, ArkCase, Alfresco, and AWS form a platform that helps enterprises derive value from ever-increasing multimedia content. Thanks to this integration, enterprises of all sizes can now use the Armedia Legal Module with Amazon Transcribe as part of it.
The Armedia Legal Module turns any audio or video file into text that can then be used like any other textual document. No more external transcription services. No more waiting. No more time wasted.
Want to view a recorded demo of ArkCase Transcription?
Are you interested in a live demo of ArkCase Transcription?
In one of my previous blog posts, I touched on the topic of AI-powered transcription services on the market. There, I introduced the idea that, at the current pace of multimedia production, traditional, human-powered transcription services are not the solution.
In the past 2 years, we’ve produced 90% of all the data our civilization has. At this pace, and with a transcription ratio of up to 9:1 for multimedia files, human-powered transcription simply cannot keep up. It’s too slow, too expensive, too prone to error, and too vulnerable to data leaks.
Just as hiring an army of workers is not the best way to dig a perfectly straight 1,000-mile ditch, we need to start thinking about how machines can help.
In this blog post, I’d like to dig a bit deeper and cover the 4 major transcription services in more detail: Amazon, Google, IBM, and Nuance. They are all good players; however, only one can fully respond to all of your specific needs.
To help you choose the best transcription service provider, let’s make a little comparison between the four.
My Comparison Methodology
I’ll be covering the four providers from several different angles, so you can get a more comprehensive understanding of their value proposition for your specific needs. Here are the different angles I’ll be covering:
- Speed. The speed of a transcription platform is a crucial factor. Given enough time, anyone could transcribe multimedia content, but the point of platforms like these is to make that time as short as possible. In some cases, though, speed may not be the deciding factor. Some companies will be better off with a slower but more accurate solution.
- Accuracy. Accuracy is paramount to a transcription platform, and very often the worth of a platform is measured by it. If the platform gives you a transcription that needs additional edits to punctuation and speakers, then that platform, my friend, hasn’t done much of the job for you. But again, companies with large volumes of transcripts may be better off with a slightly less accurate but much cheaper solution.
- Price. Whether you are a small company or a well-established vendor moving the market, everyone cares about costs. How much of a deciding factor this will be depends on how large your budget is and how important the other two metrics are.
Now that I’ve introduced the software packs and the methodology of comparing the 4 transcription services, let’s get started.
Amazon Transcribe Service
To keep pace with the evolution of language, the Amazon Transcribe platform is continually learning and improving. It is designed to provide fast and accurate automated transcripts for multimedia files of varying quality.
Currently, Amazon’s transcription service can process multimedia content within these limits:
- Duration: maximum 2 hours
- Custom Vocabulary: maximum 50 KB file size
- Sampling rate: from 8 kHz (telephony audio) to 48 kHz
- Languages: English and Spanish
- Formats: WAV, MP3, MP4, FLAC
Thanks to the processing power of AWS, Amazon Transcribe produces transcriptions at an astonishing speed.
The best thing about Amazon Transcribe is the accuracy of its transcriptions. AWS has been the world’s most comprehensive and broadly adopted cloud platform for the last 12 years, and this experience shows in the accuracy of Amazon Transcribe’s results.
Unlike other transcription services, the Amazon Transcribe platform produces text that is ready to use, without the need for further editing. To achieve this, AWS Transcribe pays special attention to:
- Punctuation. The Amazon Transcribe platform adds appropriate punctuation as it goes and formats the text automatically, producing intelligible output that can be used without further editing.
- Confidence score. AWS Transcribe makes sure to provide a confidence score which shows how confident the platform is with the transcription.
This means you can always check the confidence score to see whether a particular line of the transcript needs alterations.
- Possible alternatives. The platform also gives you an opportunity to make some alterations in cases where you are not completely satisfied with the results.
- Timestamp Generation. Powered by deep learning technologies, AWS Transcribe automatically generates time-stamped text transcripts.
This feature provides timestamps for every word which makes locating the audio in the original recording very easy by searching for the text.
- Custom Vocabulary. AWS Transcribe allows you to create your own custom vocabulary. By creating and managing a custom vocabulary you expand and customize the speech recognition of AWS Transcribe.
Basically, custom vocabulary gives AWS Transcribe more information about how to process speech in the multimedia file.
This feature is very important in achieving high accuracy in transcriptions of specific use such as Engineering, Medical, Law Enforcement, Legal, etc.
- Multiple Speakers. AWS Transcribe platform can identify different speakers in a multimedia file. The platform can recognize when the speaker changes and attribute the transcribed text accordingly. Recognition of multiple speakers is handy when transcribing multimedia content that involves multiple speakers (such as telephone calls, meetings, etc.).
AWS Transcribe platform also allows you to specify the number of speakers you want to be identified in the multimedia file. The platform allows identification of up to 10 speakers.
The best performance is achieved when the number of speakers you ask the platform to identify matches the number of speakers in the multimedia content.
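The word-level timestamps and confidence scores described above are what make a transcript searchable. The sketch below looks up a spoken word and returns the moments where it occurs. The sample mirrors a trimmed Amazon Transcribe result (an `items` list whose entries carry `start_time`, `type`, and `alternatives`); the sample data and the helper function are hypothetical.

```python
def find_word(transcript, term, min_confidence=0.0):
    """Return the start times (in seconds) of every spoken word matching `term`."""
    hits = []
    for item in transcript["results"]["items"]:
        if item["type"] != "pronunciation":
            continue  # punctuation items carry no timestamps
        best = item["alternatives"][0]
        if (best["content"].lower() == term.lower()
                and float(best["confidence"]) >= min_confidence):
            hits.append(float(item["start_time"]))
    return hits


# Hypothetical, trimmed sample in the shape of a transcription result.
sample = {"results": {"items": [
    {"type": "pronunciation", "start_time": "1.20", "end_time": "1.65",
     "alternatives": [{"content": "Warrant", "confidence": "0.98"}]},
    {"type": "punctuation",
     "alternatives": [{"content": ",", "confidence": "0.0"}]},
    {"type": "pronunciation", "start_time": "2.10", "end_time": "2.40",
     "alternatives": [{"content": "signed", "confidence": "0.91"}]},
    {"type": "pronunciation", "start_time": "4.85", "end_time": "5.30",
     "alternatives": [{"content": "warrant", "confidence": "0.87"}]},
]}}

positions = find_word(sample, "warrant")
print(positions)
```

A player can then seek to any of the returned offsets to replay that part of the recording.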
The best part of Amazon Transcribe, unlike the other transcription services we discuss, is that you pay as you go, based on the seconds of audio transcribed per month.
The Amazon Transcribe API is billed monthly at a rate of $0.00056 per second. Usage is billed in one-second increments, with a minimum per-request charge of 15 seconds.
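As a quick worked example of the pricing above, the sketch below computes the charge for a single request. The function name is hypothetical, and rounding partial seconds up is an assumption; the rate and the 15-second minimum are the figures quoted in this post.

```python
import math
from decimal import Decimal

RATE_PER_SECOND = Decimal("0.00056")  # USD per second, as quoted above
MINIMUM_BILLED_SECONDS = 15           # minimum charge per request


def amazon_transcribe_cost(audio_seconds):
    """Cost in USD of one request: per-second billing with a 15-second minimum."""
    # Assumption: a partial second is rounded up to the next whole second.
    billed_seconds = max(math.ceil(audio_seconds), MINIMUM_BILLED_SECONDS)
    return billed_seconds * RATE_PER_SECOND


one_hour = amazon_transcribe_cost(3600)   # 3600 s x $0.00056 = $2.016
short_clip = amazon_transcribe_cost(5)    # billed as 15 s = $0.0084
print(one_hour, short_clip)
```

At this rate, one hour of audio costs about $2.02, compared with the 4 to 9 staff hours that manual transcription would take.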
Thanks to all of these features, Amazon Transcribe can be considered a highly accurate transcription service. With its speed, accuracy, and price, it is one of the best players in the game, if not the best.
Google Speech-to-Text
Google Speech-to-Text handles multimedia content of varying length and returns results immediately. Thanks to Google’s machine learning technology, the platform can also process real-time streaming or prerecorded audio content in formats including FLAC, AMR, PCMU, and Linear-16.
The platform recognizes 120 languages, which puts it well ahead of the Amazon Transcribe platform in language coverage.
Despite this, Google still falls short of Amazon Transcribe on accuracy and price.
Google Speech-to-Text accuracy improves over time as Google improves the internal speech recognition technology used by Google products. It includes:
- Automatic identification of the spoken language. Google employs this feature to automatically identify the language spoken in the multimedia content (out of 4 selected languages) without any additional alterations.
- Automatic recognition of proper nouns and context-specific formatting. Google Speech-to-Text works well with real-life speech. It can accurately transcribe proper nouns and appropriately format language (such as dates and phone numbers).
- Phrase hints. Almost identical to Amazon’s Custom Vocabulary, this feature lets you customize the context by providing a set of words and phrases that are likely to occur in the transcription.
- Noise robustness. This feature of Google Speech-to-Text allows for noisy multimedia to be handled without additional noise cancellation.
- Inappropriate content filtering. Google Speech-to-Text is capable of filtering inappropriate content in text results for some
- Automatic punctuation. Like Amazon Transcribe, this platform also uses punctuation in transcriptions.
- Speaker recognition. This feature is similar to Amazon’s recognition of multiple speakers. It makes automatic predictions about which of the speakers in a conversation spoke which part of the text.
Google Speech-to-Text costs $0.006 per 15 seconds, while the video model costs twice as much, at $0.012 per 15 seconds.
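Because this pricing is quoted per 15 seconds, the cost of a file depends on how many 15-second blocks it spans. The sketch below works that out for both models. It assumes a partial block is rounded up to a full one; the function name is hypothetical, and the prices are the ones quoted above.

```python
import math
from decimal import Decimal

BLOCK_SECONDS = 15
PRICE_PER_BLOCK = {            # USD per 15 seconds, as quoted above
    "standard": Decimal("0.006"),
    "video": Decimal("0.012"),
}


def google_stt_cost(audio_seconds, model="standard"):
    """Cost in USD, assuming partial 15-second blocks are rounded up."""
    blocks = math.ceil(audio_seconds / BLOCK_SECONDS)
    return blocks * PRICE_PER_BLOCK[model]


standard_hour = google_stt_cost(3600)          # 240 blocks x $0.006 = $1.44
video_hour = google_stt_cost(3600, "video")    # 240 blocks x $0.012 = $2.88
print(standard_hour, video_hour)
```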
Considering its speed, price, and accuracy, Google Speech-to-Text is definitely among the best in the industry. However, its features are mostly based on language rather than meaning and inference, which, for now, gives Amazon Transcribe the advantage.
But, let’s move on and take a look at the other two transcription services.
IBM Watson Speech-To-Text
IBM Watson Speech-to-Text can transcribe speech from 7 different languages. However, the service does not support all features for all 7 languages. For most languages, it offers two models based on sampling rate: broadband and narrowband. It uses broadband for audio that is sampled at a minimum rate of 16 kHz and narrowband for audio that is sampled at a minimum rate of 8 kHz.
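The sampling-rate rule above amounts to a simple selection step. Here is a minimal sketch of it; the function name is hypothetical and the return values are just labels, not actual model identifiers.

```python
def pick_watson_model(sample_rate_hz):
    """Choose a model family from the audio sampling rate, per the rule above."""
    if sample_rate_hz >= 16_000:
        return "broadband"
    if sample_rate_hz >= 8_000:
        return "narrowband"
    raise ValueError(f"sampling rate too low for either model: {sample_rate_hz} Hz")


telephony = pick_watson_model(8_000)      # typical telephone audio
studio = pick_watson_model(44_100)        # typical recorded media
print(telephony, studio)
```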
In addition to basic transcription, IBM Watson Speech-to-Text supports voice control of embedded systems, transcription of meetings and conference calls, and dictation of email and notes in real time.
When it comes to accuracy, IBM Watson speech-to-text pays special attention to:
- Keyword spotting. This feature enables search by a specific keyword. It basically identifies spoken phrases that match specific keyword strings.
- Speaker recognition. This feature is available for audio content in US English, Spanish or Japanese.
- Word alternatives. This feature enables requests of alternative words that are similar to the words in transcripts by acoustics.
- Word confidence. IBM Watson speech-to-text provides confidence levels for each word of a transcript.
- Word timestamps. The service also provides timestamps for the start and end of each word of a transcript.
- Profanity filtering. This feature censors profanity from US English transcripts.
The IBM Watson Speech-to-Text is priced at $0.02 per minute. This price applies to the use of both broadband and narrowband models.
IBM Watson Speech-to-Text has a wide range of possibilities. When it comes to accuracy, the features above say it all. IBM Watson Speech-to-Text is one of the most accurate transcription services.
However, not all of these features apply to all languages and, even more importantly, some of them are only available in beta. Taken together, this can make IBM Watson Speech-to-Text, as described, more expensive than the previous two transcription services.
Nuance Dragon Transcription
Nuance Transcription Engine can easily transcribe messages and conference calls in 43 different languages. The process takes up time according to the length and duration of the message and the traffic on the server.
The service pays special attention to accuracy and for that matter includes the following features:
- Multi-speaker identification. Nuance Transcription Engine can recognize and transcribe up to six individual speakers.
- Customizable language models. This feature is actually very similar to Amazon Transcribe custom vocabulary. It can identify various names using specialized vocabulary tools.
- Intelligent error correction. This transcribe service makes probability‑based suggestions for alternative words when the speech is too unclear to transcribe. This feature is very useful and significantly increases accuracy.
- Timestamps. Nuance Transcription Engine provides fully time‑coded and stamped lines, which make the transcription clearer and make it possible to know who said what and when.
Nuance Transcription Engine pricing starts at $150, and it’s a lifetime deal.
Although this transcription service is one of the best on the market when it comes to accuracy, it differs considerably from the other transcription services included in this comparison.
The major difference is that Nuance Transcription Engine focuses on transcribing voice messages and industry-specific transcriptions.
To be more specific, the Nuance Transcription Engine is one of the best, if not the best, medical transcription software in the world. Unfortunately, this means that if you are not part of that industry, the accuracy of your transcriptions will not be as good as that of medical transcriptions.
Let’s Wrap Up
Research shows that the human brain can remember only 10% of what we read and 20% of what we hear. This is nothing less than an emphasis on the need to derive value from multimedia content. And AI has proven to be the real deal when it comes to transcribing multimedia content.
Capturing and retrieving information from multimedia content using NLP and Speech Recognition has been the goal of Artificial Intelligence giants for the last decade. And they become more sophisticated every year.
In this comparison, I’ve decided to include only four transcription services which, by my research, are the best ones. I based the comparison on three factors: speed, accuracy, and price. Based on these factors, I found that:
- All four transcription services included in the comparison have some distinctive qualities that give them an advantage over the other solutions on the market,
- They are all fast in processing and delivering results,
- They all show high accuracy of transcriptions,
- They all offer acceptable prices.
However, not all of them can equally respond to everyone’s needs. Take a good look at the comparison made above and decide which one will meet your needs best.
We at Armedia decided to rely on AWS and integrate Amazon Transcribe as part of our Armedia Legal Module for ArkCase.
The choice you make depends on your organization’s requirements.
If you have any questions, do not hesitate to get in touch with us. Our team at Armedia is always at your service.
Legal professionals use court case videos to verify the authenticity of information and to increase the transparency of the judicial process. Lawyers, judges, and other legal personnel use these case videos as any other case-related documents. This means that these videos need to be as searchable and as available to legal professionals as any other regular Word document. Easier said than done.
Because courts deal with sensitive situations involving real people (very often in a somewhat vulnerable state), the security of these videos must be at the highest possible level. This means court case videos must be stored safely, transcribed quickly and accurately by reliable staff, and available for use on time.
The Challenge of Using Traditional Transcription Services
Because of the sensitive information covered in these videos, courts and court workers should move away from relying on traditional transcription service providers. The more people involved in handling sensitive data, the higher the risk of data leakage.
There are tons of web-based transcription service providers where you upload a file, unknown staff process it, and you get a downloadable transcript. Sounds good and easy, but this black-box approach to transcribing sensitive video leaves a lot of unanswered security questions:
- Who did the transcription?
- How good are they at keeping secrets?
- Did the file upload pass through a secured connection?
- Is the company using a secure server to store the videos?
- Did the transcript come through an encrypted connection?
- Is the video safely deleted from the vendor’s servers?
- Is there a way to ensure that nobody else will get their hands on the transcript?
This is why Courts need a closed system of handling sensitive video materials, where the recordings and transcripts aren’t directly accessed by non-court personnel.
Here’s a list of desired deliverables of such a closed, fully automated transcription system:
- This closed system would be built from components that comply with data-handling security standards.
- The system would be almost infinitely scalable, so there is no limit to how many videos can be safely stored.
- The system would need to be extremely accurate in transcribing the videos, and have a way to add timestamps to the transcribed document.
The good news is that video storage solutions and multimedia transcription software are commercially available. Multimedia handling has been a growing industry, and the legal sector is in a position to choose.
How Amazon Is Becoming The Go-To Vendor For Video Transcription and Storage
While Amazon.com is a household brand and we buy anything from books to groceries on Amazon, the company has been on the leading edge in some very interesting technologies.
Amazon Web Services, or AWS, is a Cloud platform for hosting web applications that need vast processing power and even greater storage capacities. AWS helps companies focus on their software, while Amazon handles the server infrastructure. In a sense, software companies use Amazon’s Platform as a Service approach for building scalable solutions.
Amazon Transcribe is another service that is available on the market. Transcribe can take a multimedia file stored on an Amazon S3 server and convert a video conversation into a properly annotated and timestamped text.
Amazon AI is a result of Amazon’s huge investments in Artificial Intelligence too. From the humble product recommendation feature on Amazon.com, they’ve come a long way. Combining AI and their Transcribe service, users can even use subject matter related dictionaries, so that Amazon can have a sample of words probably used in the video. This increases the transcription accuracy and returns a close to perfect transcription.
This integration will help you derive value from your Court case videos, and make them more accessible and searchable, filed properly with names, addresses, people involved, etc.
The AWS Transcribe platform can help you with all of this and much more:
- The platform allows you to create your own custom vocabulary list. As a legal professional, you are aware that your vocabulary differs from everyday language. With this feature of AWS Transcribe integrated into the Armedia Legal Case Management solution, you increase the accuracy of your court video transcriptions.
- The platform can identify multiple speakers in video content.
It can recognize up to 5 different people in the same video and offer a divided-by-speaker text, or an XML file with timestamps and labels distinguishing between speakers.
This is crucial when transcribing court case videos since attributing the transcribed text to the wrong speaker could change the course of the entire case.
- With AWS Transcribe, the transcripts of your court case videos will be almost 100% accurate and ready to use. That’s because AWS Transcribe adds the needed punctuation to the text and automatically generates timestamps for each word in the transcript.
At the same time, this platform provides a confidence score which shows how confident the platform is with the transcription provided. As a user, this feature enables you to always know whether a particular line of the transcript needs alterations, and whenever necessary, you can make alterations in the text.
All these features allow AWS Transcribe to produce texts that are ready to use, without additional edits. Court case videos transcribed with AWS Transcribe are indexed properly, searchable, and very easy to work with.
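The divided-by-speaker text mentioned above can be illustrated with a short sketch that merges consecutive words from the same speaker into timestamped turns. The input shape, the speaker labels, and both helper functions are hypothetical; a real integration would first map the transcription service’s output into this form.

```python
def group_by_speaker(words):
    """Merge consecutive words from the same speaker into timestamped turns."""
    turns = []
    for word in words:
        if turns and turns[-1]["speaker"] == word["speaker"]:
            turns[-1]["text"] += " " + word["text"]
        else:
            # A speaker change starts a new turn stamped with its first word.
            turns.append({"speaker": word["speaker"],
                          "start": word["start"],
                          "text": word["text"]})
    return turns


def format_turns(turns):
    """Render turns as one '[time] speaker: text' line each."""
    return "\n".join(f"[{t['start']:.2f}s] {t['speaker']}: {t['text']}" for t in turns)


words = [
    {"speaker": "spk_0", "start": 0.0, "text": "Please"},
    {"speaker": "spk_0", "start": 0.5, "text": "state"},
    {"speaker": "spk_0", "start": 0.9, "text": "your"},
    {"speaker": "spk_0", "start": 1.1, "text": "name."},
    {"speaker": "spk_1", "start": 2.3, "text": "Jane"},
    {"speaker": "spk_1", "start": 2.7, "text": "Doe."},
    {"speaker": "spk_0", "start": 3.5, "text": "Thank"},
    {"speaker": "spk_0", "start": 3.8, "text": "you."},
]
turns = group_by_speaker(words)
transcript_text = format_turns(turns)
print(transcript_text)
```

Because each turn keeps the timestamp of its first word, a reviewer can jump straight to the moment a given speaker started talking and confirm the attribution.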
But all this will be of limited help to legal workers if they can’t somehow easily integrate Amazon Transcribe into their core legal case management system. These transcripts will need to be assigned and limited to specific cases. That way, only the legal workers assigned to that case can easily access the transcriptions whenever they need them.
And this brings me to the next point.
The Armedia Legal Module For ArkCase
If you’ve followed my blogs for just a little bit, you’ll know how big we are on ArkCase. So far, we’ve built several specific solutions using ArkCase as our platform for case management.
As you may have read in other blog posts, we use ArkCase because it’s an open source case management system that integrates with all the key players: Alfresco, AWS, Ephesoft, Snowbound, MS Exchange etc.
We’ve taken this open platform and built a Legal module, turning the generic case management platform into a legal case management solution that’s very easy to use, and has almost limitless potential to accommodate your data storage needs.
The way we built the Armedia Legal Case Management solution is that we’ve added configurable workflows, forms, rules and access-level management features needed to run a Legal case. Plus, we’ve added a seamless integration with Amazon Transcribe.
Let me explain how a video transcription would work.
When you create a legal case, all the forms and workflows are set in place. Then, a legal worker with the needed access level rights can upload a video or audio recording. This multimedia material gets uploaded onto a secured S3 server. Then, once uploaded, AWS kicks in and provides the needed processing power to quickly and accurately transcribe the videos.
The end result is a neatly timestamped transcription file that you can search through just as easily as if it were a regular Word or PDF file.
In simpler words, Armedia Legal is a Legal Case Management Software solution with built-in transcription and video management. This makes our Legal Module perfect for managing and using all court case videos in a safe and secure way, without any worries about data leakage. The best thing about it is that we’re relying on Amazon’s PaaS, and you don’t have to worry about the underlying infrastructure.
Let’s Sum Up
As a legal professional, you must be concerned about the security of the data you deal with in every case. Especially court case videos.
Data leakage is something no legal body can afford. Even if the risk is tiny, as with traditional transcription service providers, you’ll want to avoid it.
Using traditional transcription service providers means allowing an unknown third party to gain access to sensitive materials that can harm you or your clients.
This is why you’re better off relying on a closed system like Amazon Web Services to store, process and transcribe your legal video materials.
We’ve built the Armedia Legal Case Management solution so you can avoid the risks of data breaches, by integrating with Amazon Transcribe. It rests on ArkCase, an open source case management system that is a great platform for building cost-effective case management solutions.
If you have any questions or you want to find out more about our Legal Module, feel free to contact us or check out our Webinar.
Having been in the legal world for so many years, you have probably witnessed your share of buzz.
“Something new, innovative and life-changing has occurred.”
“A new technology, an innovation that will change the industry forever.”
You’ve probably been hearing the buzz around Artificial Intelligence (AI) for years now. And as time passes, you may have felt that, besides the big claims, nothing big has come out of it yet.
But AI is here, making enormous changes in the Legal sector, and most importantly, AI is here to stay.
Maybe it’s time to accept the fact that AI is not just a buzzword, but rather a real disruptor that has already made revolutionary changes in the Legal Sector and will continue to do so.
Let’s see why.
AI in the Legal Sector: Industry Disruptor or Buzzword?
Last year, the newspaper ‘Lawyers Weekly’ wrote about an Australian tax lawyer that opened a ‘Law Firm Without Lawyers’.
As Lawyers Weekly reported, the law firm offers Last Will and Testament services, assistance with business structuring, and asset protection.
This law firm is expected to include a suite of options to support victims of domestic violence in the future.
Adrian Cartland, the owner of the law firm, explains:
“The difference is that where one would expect to find a lawyer sitting behind a large mahogany desk, there is a computer that clients can use to consult.”
He explains that AI is able to answer questions related to Wills and help the clients generate a perfectly legal Last Will and Testament. And for customers who require support, a member of their team is available at the click of a button…
“This is true Artificial Intelligence that is being applied to make everyday life easier,” says Adrian Cartland.
He adds: “The current process to write up a Will can be very time consuming and expensive…Making a Will should be as easy as popping to the shops for a carton of milk. Artificial Intelligence makes the process cheap, fast and legally binding.”
As we can see, AI is definitely much more than just a buzzword in the Legal Sector. But it is also important to point out that, although the power of AI in the Legal Sector is not to be underestimated, there is no need to worry that AI will replace legal professionals.
As the behavioral scientist and senior research fellow for Harvard Law School’s Center on the Legal Profession and the Harvard Kennedy School, Dr. Paola, says:
“Indeed, I think this is where we often hear that misconception that AI, machine learning and deep learning will replace human beings — especially lawyers. Instead, the way to look at it, I think, is that AI, machine learning and deep learning will be a way of complementing people’s decision-making, allowing us to make the most interesting work our priority.
We will be able to answer very complicated questions — key strategic questions — and having more fun doing it. AI can free up lawyers to focus on the more analytical and strategic thinking aspect of tasks − because, despite AI’s ability to locate, distill and organize any quantity of data for the needed information, the interpretation of the information still needs to be done by a human being.”
Dr. Paola makes some great points. Let’s see why.
Why Established Vendors Should Embrace AI-Powered Solutions
“With AI, lawyers will shift their focus from routine activities to much more high-value work involved in shaping strategies and navigating complex legal problems.” – Gillian K. Hadfield, a law professor at the University of Southern California.
Every day, the helping hand of AI reaches deeper into the Legal Sector and helps solve difficult tasks, at a much faster pace and with better accuracy than any legal worker can, and in more than just one legal area.
Let’s take a look:
1. Image Analysis
A. Optical Character Recognition (OCR)
One of the oldest uses of computer software with images is recognizing the text in them. Basically, OCR is the process of recognizing text on scanned documents.
Using AI in OCR means taking a scanned set of documents and applying training and dictionaries in order to recognize as much text as possible, with as high a reliability as possible.
The Legal Sector benefits greatly from AI in OCR. Forms, documents, archive materials, etc. get digitized by scanning and text extraction. With legal forms, things get easier, as you can train the system to understand what sort of documents are being processed and what vocabulary is used throughout them. That way, even if the scanned image is of lower quality, the computer can recognize a certain word based on a dictionary of terms used in the legal sector.
This kind of image processing helps legal organizations digitize large archives of documents and extract text from them with a high reliability score and minimal human intervention.
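To make the dictionary idea a bit more concrete, here is a minimal Python sketch. It is not how any particular OCR engine works internally. It simply assumes the OCR step has already produced noisy tokens, and uses the standard library's difflib to snap each token to the closest word in a small legal vocabulary. The word list, the function names and the 0.75 similarity cutoff are all illustrative assumptions.

```python
import difflib

# Hypothetical domain vocabulary; a real system would load a much larger legal dictionary.
LEGAL_TERMS = ["plaintiff", "defendant", "affidavit", "subpoena", "testimony", "jurisdiction"]


def correct_token(token, vocabulary=LEGAL_TERMS, cutoff=0.75):
    """Return the closest vocabulary word for a noisy OCR token, or the token unchanged."""
    matches = difflib.get_close_matches(token.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def correct_line(line):
    """Apply dictionary correction to every whitespace-separated token in a line."""
    return " ".join(correct_token(token) for token in line.split())


# Two tokens contain typical OCR slips (a dropped letter, "l" read instead of "i").
print(correct_line("the defendnt signed the affidavlt"))
```

Words that are not close to anything in the vocabulary, such as "the" or "signed", pass through untouched, so the dictionary only steps in where the scan is likely to have gone wrong.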
B. Object Recognition (Image Taxonomy)
Unlike text recognition, object recognition is a much more challenging technical problem.
One of the best illustrations of this problem is the famous “dog or a muffin” problem, and there are plenty of texts dealing with the technicalities of coding software that is able to tell the difference.
Image Source: Freecodecamp.org
This type of problem also arises when recognizing actual objects in an image among patterns that appear to have the same geometrical structure. Is it a person, or graffiti? Is it a car, or a box and two trash cans?
Using AI and machine learning to properly identify objects is very important for legal case files. Imagine a use case where a legal worker needs to find a photo of a crime scene that shows a gun and a wallet. Instead of sifting through photos manually, the legal worker can run a search for these two objects and get a list of images where both are found.
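Once an object recognition model has labeled each photo, the search itself is simple. The sketch below assumes the labels already exist (the file names and labels are made up for illustration) and returns only the images that contain every object the legal worker asked for.

```python
def find_images(index, required_labels):
    """Return the sorted image names whose label set contains every required label."""
    wanted = {label.lower() for label in required_labels}
    return sorted(
        name
        for name, labels in index.items()
        if wanted <= {label.lower() for label in labels}
    )


# Hypothetical labels, as an object recognition model might emit them per image.
IMAGE_LABELS = {
    "scene_001.jpg": {"gun", "wallet", "table"},
    "scene_002.jpg": {"wallet", "car"},
    "scene_003.jpg": {"gun", "wallet"},
}

print(find_images(IMAGE_LABELS, ["gun", "wallet"]))
```

The hard part is producing reliable labels in the first place. The lookup only shows why those labels are so valuable once they are attached to case files.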
C. Computer Vision (Scene Description)
Scene description is, at least for now, the holy grail of image recognition. And AI is bringing it within reach.
For people, it’s very easy to see a scene, quickly identify the objects and their relationships, and deduce/explain what the picture represents.
Take a look at this example:
Image Source: PyImageSearch.com
The software here is accurately recognizing the objects in the image. One step further from this is to deduce the relationships between these objects. We can see that in the image, the software has recognized that there is a fence, a few pots of flowers, a horse, and a person.
But to fully explain the image, we’d want the software to be able to recognize that the horse has jumped, that the person is riding it, and that the fence is below the horse.
The full description of the image would then (eventually) be as succinct as: “A jockey successfully jumping over an obstacle.”
Such scene description capability can be very helpful in documenting and identifying car accidents, people interacting at a crime scene, or even recognizing acts of violence in real time.
2. Textual Analysis
A. Contract Analysis
Contract analysis is a Legal area which can easily be covered by AI. The process requires reviewing, analyzing, deriving value, and checking legal agreements against current laws and rules.
Although the workflow of this process varies in each case, the basis is always the same. And thanks to Natural Language Processing (NLP), all of these functions can be easily done by AI technologies.
For this kind of review, NLP can be much faster and more accurate than humans, and that’s why AI technologies supported by NLP can easily and quickly free up lawyers from time-consuming tasks such as:
- Financing/OTC derivative agreement review,
- Sales/procurement contract review,
- Employment contract review,
- Compliance and risk review,
- Some types of eDiscovery,
- Due diligence,
- Lease review.
B. Legal Data Research
Legal Data Research is another Legal area where AI-powered solutions can help your employees speed up the process.
Compared to ordinary enterprise search or a database trawl, AI technologies are much faster and more effective. These technologies not only search for the best answer, but also learn from the questions asked. They isolate the information needed from a mass of data instead of returning hundreds of documents that contain the same keyword.
In addition to this basic legal research, AI can do predictive legal research based on a comparison between previous case records relevant to the current case. For instance, comparisons between previous actions of lawyers, available court documents, rulings made by judges on similar matters, etc.
Based on these comparisons, AI provides information on the likelihood of success and the possible damages that could arise in the case.
C. Intelligent Interfaces
The third main branch of Legal AI is the development of intelligent interfaces.
Intelligent interfaces are interactive, web-based, Q&A systems. They either guide users in completing basic legal documents and forms or enable them to gain legal information via text input.
These systems use NLP to understand the input from the dialogue box and Machine Learning (ML) to help the system provide the right answer.
Intelligent interfaces are usually based on drop-down menus and checkboxes to quickly move the user through a series of steps.
The presence of AI in all of these areas helps lawyers to finally get rid of the time-consuming and repetitive tasks and focus on solving more important and complex problems.
3. Voice Systems (AWS Transcribe)
Recognizing characters on a piece of paper is relatively straightforward. Recognizing objects, not so easy, but still doable.
Recognizing speech from a recorded video is a lot more challenging. The ability of systems to clear out ambient noise, recognize two people in dialogue, and then extract that dialogue into a text file is quite the achievement.
AI and Machine Learning make this possible. And not only possible, but doable, and useful.
Systems that can read an audio or video file, and return a timestamped transcript of the dialogue are commercially available.
We’re actually using the Amazon Transcribe service in our Armedia Legal Case Management module for ArkCase.
I’ve covered this subject in a recent blog post on how AI helps the legal sector with multimedia transcription capabilities, but just to quickly summarize it here: AWS is allowing us to process legal hearing recordings, recognize different people talking in the video, and then return a neatly indexed, timestamped transcript of the video conversation.
If Bill and Harry are talking, Amazon Transcribe will be able to know who said what, when, and offer a neat transcription file of that conversation.
This technology is making the life of legal workers much easier, as they can search in a video recording just as easily as if it was a Word document.
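For readers who want to see what “who said what, when” looks like in practice, here is a small Python sketch that turns a transcription result into speaker turns. The field names follow the JSON layout Amazon Transcribe documents for jobs run with speaker labels, but treat the exact shape as an assumption and check it against a real job result. The sample data is invented. Note that the service labels speakers generically (spk_0, spk_1), so mapping those labels to names like Bill and Harry is a separate step.

```python
def speaker_turns(transcribe_result):
    """Group transcribed words into consecutive turns, one per speaker change."""
    results = transcribe_result["results"]

    # Map each word's start time to the speaker label assigned to it.
    speaker_at = {}
    for segment in results["speaker_labels"]["segments"]:
        for item in segment["items"]:
            speaker_at[item["start_time"]] = item["speaker_label"]

    turns = []
    for item in results["items"]:
        text = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            # Punctuation items carry no timestamps; attach them to the previous word.
            if turns:
                turns[-1]["text"] += text
            continue
        speaker = speaker_at.get(item["start_time"], "unknown")
        if turns and turns[-1]["speaker"] == speaker:
            turns[-1]["text"] += " " + text
        else:
            turns.append({"speaker": speaker, "start": float(item["start_time"]), "text": text})
    return turns


# Invented sample in the assumed output shape (times and confidences are strings).
SAMPLE = {
    "results": {
        "speaker_labels": {
            "segments": [
                {"speaker_label": "spk_0", "items": [
                    {"start_time": "0.00", "speaker_label": "spk_0"},
                    {"start_time": "0.40", "speaker_label": "spk_0"},
                ]},
                {"speaker_label": "spk_1", "items": [
                    {"start_time": "1.20", "speaker_label": "spk_1"},
                ]},
            ]
        },
        "items": [
            {"type": "pronunciation", "start_time": "0.00", "end_time": "0.35",
             "alternatives": [{"confidence": "0.99", "content": "Good"}]},
            {"type": "pronunciation", "start_time": "0.40", "end_time": "0.90",
             "alternatives": [{"confidence": "0.97", "content": "morning"}]},
            {"type": "punctuation",
             "alternatives": [{"confidence": "0.0", "content": "."}]},
            {"type": "pronunciation", "start_time": "1.20", "end_time": "1.60",
             "alternatives": [{"confidence": "0.95", "content": "Hello"}]},
        ],
    }
}

for turn in speaker_turns(SAMPLE):
    print(f"[{turn['start']:.2f}s] {turn['speaker']}: {turn['text']}")
```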
Over the last decade, the impressive benefits of AI have ensured its place in multiple industries, improving their operational performance in many small but significant ways.
One of these industries is the Legal Sector.
And yes, industry disruptors can be risky, but only for those who decide to ignore them.
So far, AI has made significant changes in the Legal Sector, especially in the 3 areas we discussed earlier:
- Image Analysis,
- Textual Analysis, and
- Voice Systems.
The added benefit for most of us is to understand that AI will NOT replace people. It will rather help us focus on helping more people with their cases.
As Artificial Intelligence (AI) in the legal sector evolves, so does the fear among legal workers. Many legal professionals fear that AI will make such a change in the legal sector that human effort will no longer be required.
But the truth is that many of the misunderstandings regarding AI come from people’s limited experience with AI technologies. Instead of fearing AI technologies, we should make use of them, since they can be beneficial in many fields, especially in the legal sector.
What if I told you that you can use AI to focus on analyzing and interpreting legal content instead of organizing it?
Yes, it’s true. You could finally leave the administration work behind and spend your days as you imagined you would when you signed up for law school: helping people and doing justice.
Let’s see what Dr. Paola Cecchi-Dimeglio, a behavioral scientist and senior research fellow for Harvard Law School’s Center on the Legal Profession thinks about AI in the legal sector:
“The way to look at it, I think, is that AI, machine learning and deep learning will be a way of complementing people’s decision-making, allowing us to make the most interesting work our priority. We will be able to answer very complicated questions, key strategic questions, and having more fun doing it. Because the tedious work, which until very recently was still done by having 100 lawyers gathered in a back room to do document review, is now (at least, in much of the legal industry) being done by programs in order to provide lawyers with the best outcome or decision. Therefore, AI can free up lawyers to focus on the more analytical and strategic thinking aspect of tasks, because, despite AI’s ability to locate, distill and organize any quantity of data for the needed information, the interpretation of the information still needs to be done by a human being.”
Dr. Paola makes some great points, especially when it comes to using AI-powered transcription services for securely transcribing case-related multimedia content. Let’s see why.
Why Do You Need AI-Based Transcription Services?
Much of the data in the world is not in text form but in the form of spoken words in video and audio recordings or even live events. This data is as relevant as any other form of case data.
As a legal professional, you are probably aware that the content captured in such formats is very difficult and sometimes even impossible to use. The access, the searchability, the organization… they all require effort and time that no legal worker has the privilege to waste. This makes voice transcription services an important part of Legal Case Management.
For years, legal professionals have been relying on transcription services that convert multimedia content into text. But involving a third party in a process that is delicate by itself may not be the best idea, especially today, when technology evolves every day and provides new solutions to the problems we face.
For that reason, our team at Armedia decided to rely on AWS transcription services and help you, as a legal professional, to derive value from all your multimedia case records.
Let’s look at why AWS Transcribe can be the best choice when it comes to transcription services.
Why Is AWS Transcribe The Best Transcription Service Choice?
Building a service that converts human speech to text with the same accuracy as human transcribers is no walk in the park.
Recognizing natural speech is still a challenge for machines. But despite these challenges, AWS developed its Transcribe service, which is capable of producing transcripts that are almost 100% accurate and ready to use without many additional edits.
Several very good reasons come to mind for why AWS Transcribe is such a game-changer:
- With AWS Transcribe you won’t have to worry about the security of case records in the way you would with human-powered transcription, because only you and the machine have insight into the multimedia content and its transcribed format.
- AWS Transcribe is fast and very simple to use. You just upload the multimedia file and within a few seconds, you get a ready-to-use transcription.
- AWS Transcribe can recognize multiple speakers. This feature is very important since it attributes each piece of text to the appropriate speaker in the hearing.
- AWS Transcribe provides timestamps for each transcribed word. This significantly increases the searchability of a case record.
- AWS Transcribe can add appropriate punctuation to the text as it goes. This contributes to producing intelligible output that can be used without a lot of extra edits.
- This platform will provide a Confidence Score so you can quickly find the places where AWS Transcribe isn’t exactly sure what the people on the recording are talking about.
- AWS Transcribe allows you to add a custom vocabulary list with words you think the platform wouldn’t recognize. These are usually legal terms or non-English names that are not in common use.
I’m pretty sure you can find yourself benefiting from at least a handful of these features that AWS Transcribe has.
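As an illustration of how the confidence score can be put to work, the Python sketch below lists the words a reviewer should double-check. It assumes the word-level items have the shape Amazon Transcribe documents for its JSON output, where confidence and start time arrive as strings. The 0.80 threshold and the sample words are arbitrary choices for the example.

```python
def low_confidence_words(items, threshold=0.80):
    """Return (start_seconds, word, confidence) for spoken words below the threshold."""
    flagged = []
    for item in items:
        # Punctuation items have no timestamps and a placeholder confidence; skip them.
        if item.get("type") != "pronunciation":
            continue
        best = item["alternatives"][0]
        confidence = float(best["confidence"])
        if confidence < threshold:
            flagged.append((float(item["start_time"]), best["content"], confidence))
    return flagged


# Invented sample items in the assumed output shape.
SAMPLE_ITEMS = [
    {"type": "pronunciation", "start_time": "12.10", "end_time": "12.25",
     "alternatives": [{"confidence": "0.98", "content": "the"}]},
    {"type": "pronunciation", "start_time": "12.30", "end_time": "12.95",
     "alternatives": [{"confidence": "0.52", "content": "estoppel"}]},
    {"type": "punctuation",
     "alternatives": [{"confidence": "0.0", "content": "."}]},
]

for start, word, confidence in low_confidence_words(SAMPLE_ITEMS):
    print(f"{start:.2f}s  {word}  (confidence {confidence:.2f})")
```

Legal terms are exactly the kind of words that tend to show up in such a list, which is where the custom vocabulary feature comes in.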
Armedia Legal Module
For legal professionals that want to leave administrative work behind and focus on interpretation, our Legal Module for ArkCase is the right choice. Armedia Legal Module allows you to finally use multimedia content as you would use any other form of case records without spending hours manually sifting through recorded materials, and without worrying about data breach issues. All you have to do is just upload the audio and video files related to the specific case, and you’ll get a properly formatted, searchable transcription.
With the integration of AWS Transcribe, the Legal Module allows you to:
- Transcribe multimedia content,
- Add timestamps,
- Easily search for specific words and phrases in the video file,
- With one simple click, jump to the exact moment where that keyword is mentioned in the video.
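The search-and-jump behavior in the last two points can be sketched in a few lines of Python. This is not the Legal Module's actual code, only an illustration of the idea: because every transcribed word carries a start time, finding a keyword also tells you where to seek in the video. The item shape is assumed to match Amazon Transcribe's word-level output, and the sample data is invented.

```python
def find_keyword(items, keyword):
    """Return the start time in seconds of every spoken occurrence of the keyword."""
    target = keyword.lower()
    return [
        float(item["start_time"])
        for item in items
        if item.get("type") == "pronunciation"
        and item["alternatives"][0]["content"].lower() == target
    ]


def format_timestamp(seconds):
    """Format a number of seconds as HH:MM:SS for display next to a video player."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


# Invented sample items in the assumed output shape.
SAMPLE_ITEMS = [
    {"type": "pronunciation", "start_time": "65.20",
     "alternatives": [{"confidence": "0.97", "content": "contract"}]},
    {"type": "pronunciation", "start_time": "66.00",
     "alternatives": [{"confidence": "0.99", "content": "signed"}]},
    {"type": "pronunciation", "start_time": "3725.40",
     "alternatives": [{"confidence": "0.94", "content": "Contract"}]},
]

for start in find_keyword(SAMPLE_ITEMS, "contract"):
    print(format_timestamp(start))
```

A player would then seek to the returned number of seconds, which is all a “jump to this moment” click needs.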
The Armedia Legal Case Management Module enables legal professionals like you to easily organize, access, search through, and manage any case related record regardless of its form.
This Module is built to respond to all of your legal case management needs and to save time, effort, and resources while increasing the security, quality, and value of case videos.
Managing case videos and audio recordings is not as easy as managing a Word document. It requires much more time and effort to get the maximum value out of any multimedia file in a legal case.
Transcribing court videos and audio through traditional manual transcription services is a time-consuming and labor-intensive task. This leads to a delay in receiving the text output and excessive labor cost.
For that reason, we relied on Amazon’s AI-powered Transcribe service and integrated AWS Transcribe as the best possible solution for quickly, reliably and securely converting multimedia into text, as well as drastically reducing the cost.
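To give a rough idea of what a request to the service can look like, here is a minimal Python sketch using boto3, the AWS SDK for Python. It is not the ArkCase integration itself. The job name, S3 location and vocabulary name are hypothetical placeholders, and the media file is assumed to already sit in an S3 bucket. The request is built separately from the call, so it can be inspected without AWS credentials.

```python
def build_transcription_request(job_name, media_uri, media_format="mp4",
                                language_code="en-US", max_speakers=2,
                                vocabulary_name=None):
    """Build keyword arguments for the Transcribe start_transcription_job call."""
    settings = {
        "ShowSpeakerLabels": True,         # ask the service to tell speakers apart
        "MaxSpeakerLabels": max_speakers,  # expected number of speakers in the recording
    }
    if vocabulary_name:
        # Name of a custom vocabulary that must already exist in the AWS account.
        settings["VocabularyName"] = vocabulary_name
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "MediaFormat": media_format,
        "LanguageCode": language_code,
        "Settings": settings,
    }


def start_job(request):
    """Submit the request; needs boto3 installed and AWS credentials configured."""
    import boto3  # imported lazily so the request can be built without AWS access
    client = boto3.client("transcribe")
    return client.start_transcription_job(**request)


# Hypothetical values for illustration only.
request = build_transcription_request(
    job_name="hearing-001",
    media_uri="s3://example-case-media/hearing-001.mp4",
    vocabulary_name="legal-terms",
)
print(request)
# start_job(request)  # uncomment once credentials and the S3 object are in place
```

Transcription runs as an asynchronous job, so a real integration would also poll for completion (or react to a notification) before fetching the transcript.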
As a platform, we opted for ArkCase. It’s perhaps the only flexible and well-supported open source case management system that companies can use to build specialized solutions. With ArkCase, we’re bringing in serious software technologies provided by Alfresco, Ephesoft, Snowbound, and of course AWS.
With this technology stack, legal workers can let AI do the heavy lifting of transcribing videos.
If you want to find out more about our Legal Module and integration with AWS Transcribe, feel free to contact us.