Document Management System: Choose the right solution for your business

Regardless of one’s occupation, document management is a crucial aspect of work. Architects, doctors, human resources specialists, or construction workers – we all need to work with documents to accomplish our tasks effectively. For knowledge workers, whose expertise lies in their knowledge, documents contain all their intellectual capital. Take lawyers, for instance. The end product of their work is often a contract. This document encapsulates their knowledge, experience, and research and serves as a foundation for executing a particular transaction or business.

Effective document management is essential for businesses of all sizes. It enables employees to retrieve a document and work with content seamlessly across devices and organizational boundaries.

In this blog, we will explore the purpose of document management and outline some of the best solutions.

What is Document Management

Document management refers to the systematic and efficient control of documents, both electronic and physical, throughout their entire lifecycle. It involves creating new records, editing existing ones, collaborating with colleagues, sharing documents, and subjecting them to reviews and approvals. Documents may need to be updated or revised over time and may require ongoing management and storage, even after they have been “retired” or replaced.

The document management lifecycle often involves collaboration and reviews by multiple stakeholders. Effective document management therefore requires clear communication and coordination between different groups and individuals.

The Need for Document Management System

The COVID-19 pandemic highlighted the inability of many companies to adapt to new and unpredictable circumstances. We have encountered numerous situations where companies struggled to adjust because of inadequate document management strategies, which affected even essential components such as email access. Some companies found themselves unable to access the computers or documents in their offices, which brought their business operations to a complete halt.

While some recognized the issue and sought a complete solution, others managed to adapt and overcome the crisis with a “good enough” solution. Unfortunately, some companies still did not understand the full range of issues that arise from poor document storage.

Which Document Management System to choose

Any Document Management System is better than none. Even software primarily used for managing personal documents can be a good solution for handling company documents, ensuring secure access, daily work, collaboration, and sharing.

Some of the most popular solutions for personal use are these:

  • Google Drive
  • Microsoft OneDrive
  • Dropbox

All of them can be used in a business context as well, depending on the complexity of the organization, company size, and industry. However, these solutions still have many limitations, which makes it wiser to seek specialized ones.

The leading research and consulting firm, Gartner, classifies solutions in this domain as Content Services Platforms (CSPs).

According to Gartner, CSP technologies enable employees to seamlessly retrieve and work with content across devices and organizational boundaries. Core CSP functionalities include content capture, creation, consolidation, processing, and retention to support personal, team, departmental, and enterprise business operations.

Gartner identifies and categorizes the most popular CSP solutions as Niche Players, Challengers, Visionaries, or Leaders.

Visionaries and Leaders include these solutions:

  • Microsoft Office 365
  • Hyland
  • Box
  • M-Files

Each document management system (DMS) has its strengths and limitations. When evaluating options, it is crucial to consider various factors, including budget, scalability, ease of use, and integration capabilities.

It is difficult to pick a one-size-fits-all solution based solely on a company’s size. Finding the right fit for an organization requires careful consideration of the advantages and limitations of each DMS.

Comparing Document Management Systems: Pros and Cons of Popular Solutions

In the upcoming sections of this blog, we will provide a list of pros and cons for the DMS options mentioned above:

Microsoft Office 365

Microsoft Office 365 is a popular DMS that offers a familiar suite of productivity tools and cloud-based storage for easy access to documents from anywhere, on any device. Real-time co-authoring and editing features also enable seamless collaboration among team members. However, some advanced document management features require additional licenses or add-ons, and the system’s capabilities vary depending on the subscription plan. Users may also need to be familiar with Microsoft Office applications to utilize the DMS fully.

Hyland

Hyland is a robust DMS solution suitable for large enterprises with complex document management needs. It offers comprehensive capabilities such as document capture, indexing, workflow automation, and records management. The system also integrates with various business systems, such as ERP, CRM, and HR, for streamlined processes. However, customization or configuration may be required for specific use cases, and the implementation and setup may require professional assistance, which could impact the overall cost and timeline.

Box

Box is a cloud-based DMS that provides easy access to documents from anywhere, on any device, with real-time collaboration features such as comments, annotations, and version control. The system offers a user-friendly interface with intuitive navigation and search capabilities. However, it lacks some of the advanced document management features found in other DMS solutions, and customization options are limited. It may be better suited to smaller organizations or businesses with simpler document management needs.

M-Files

M-Files is a document management system with intelligent metadata that allows easy organization and retrieval of documents based on content and context. It also offers robust version control, document history tracking, and integration with various third-party applications and systems. However, the system may have a steeper learning curve for users who are not familiar with metadata-driven organization.

M-Files and Metadata-Driven Approach

With the growth of digital information and data, businesses need a comprehensive and streamlined solution to manage their documents efficiently. M-Files is designed to meet this need, offering a flexible and adaptable system that can be customized to the unique requirements of any business.

One of the main advantages of implementing the M-Files solution is its ability to be easily adapted to changing market and company conditions.

We have been working with M-Files for almost ten years, and its metadata-driven approach has proven to be ideal for enterprise companies with complex business operations.

Focused on WHAT a document is, rather than WHERE it is stored

Traditionally, document management has involved putting documents in specific folders, which has not changed much since the introduction of graphical interfaces. While some competitors have built solutions around folders, such as Connected Folder, Smart Folder, and Folder Collections, these solutions don’t significantly improve daily work.

Conversely, M-Files focuses on WHAT a document or information is, rather than WHERE it is stored. This means the document is stored within the M-Files platform based on metadata, not folders. Each document or piece of information is marked with essential tags, such as user, project, project leader, region, industry, etc., which enable easy searching and working with documents. M-Files also allows the creation of “virtual folders” (called Views), which can be arranged dynamically and personalized, allowing users to find information in a way that makes the most sense to them.

In addition, searching is possible by document content and metadata, making it possible to find information quickly using advanced filters. Full-text search enables the user to find a document even when unsure whether they have set the correct metadata on the document.
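To make the idea more concrete, here is a minimal sketch of metadata-driven organization in plain Python. It is not the M-Files API: the `Document` class, the `view` and `full_text_search` functions, and the sample metadata keys are all hypothetical names chosen for illustration. It only shows how one document can appear in several “virtual folders” without being moved, and how full-text search still finds a document regardless of its tags.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A document described by metadata instead of a folder path."""
    title: str
    content: str
    metadata: dict = field(default_factory=dict)


def view(documents, **criteria):
    """Return documents whose metadata matches every given key/value pair.

    This plays the role of a "virtual folder": nothing is moved or copied,
    so the same document can show up in many views.
    """
    return [
        doc for doc in documents
        if all(doc.metadata.get(key) == value for key, value in criteria.items())
    ]


def full_text_search(documents, term):
    """Return documents whose title, content, or metadata values mention the term."""
    needle = term.lower()
    results = []
    for doc in documents:
        haystack = [doc.title, doc.content, *map(str, doc.metadata.values())]
        if any(needle in text.lower() for text in haystack):
            results.append(doc)
    return results


# Hypothetical sample data.
documents = [
    Document("Service contract", "Maintenance terms for the Zagreb office.",
             {"customer": "Acme", "project": "Rollout", "region": "EU"}),
    Document("Kickoff notes", "Agenda and action items.",
             {"customer": "Acme", "project": "Rollout", "region": "EU"}),
    Document("Invoice 17", "Payment due in 30 days.",
             {"customer": "Globex", "region": "US"}),
]

by_customer = view(documents, customer="Acme")
by_region_and_project = view(documents, region="EU", project="Rollout")
found_by_content = full_text_search(documents, "payment")
```

Here the two Acme documents appear both in the customer view and in the region/project view, while the invoice is found by its content alone.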

Through business process automation, a document’s metadata guides it through the process from draft to approved. M-Files also uses electronic signature capabilities to streamline these processes and to ensure greater transparency and security of documents in the workflow.
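A workflow of this kind can be pictured as a small state machine whose current state is just another piece of metadata. The sketch below is an assumption made for illustration, not how M-Files implements workflows: the state names, the `TRANSITIONS` table, and the `advance` function are hypothetical.

```python
# Hypothetical workflow: which states a document may move to from each state.
TRANSITIONS = {
    "draft": {"in_review"},
    "in_review": {"draft", "approved"},
    "approved": set(),
}


def advance(metadata, new_state, signed_by=None):
    """Return a copy of the metadata moved to new_state, or raise ValueError.

    Approval is only allowed with a signer, which stands in for an
    electronic signature step.
    """
    current = metadata.get("state", "draft")
    if new_state not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current!r} to {new_state!r}")
    if new_state == "approved" and signed_by is None:
        raise ValueError("approval requires an electronic signature")
    updated = {**metadata, "state": new_state}
    if signed_by is not None:
        updated["signed_by"] = signed_by
    return updated


contract = {"customer": "Acme", "state": "draft"}
in_review = advance(contract, "in_review")
approved = advance(in_review, "approved", signed_by="project.leader@example.com")
```

Because every step returns new metadata rather than changing the old one, each earlier state remains available as a simple history of the document.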

Finally, M-Files offers a high level of security, with access rights defined based on metadata. For example, if a particular document is classified as belonging to a specific customer, only those users who have access to that customer can see the document. Access rights are fully customizable and support anything from simple to complex access rules.
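The customer example can be sketched in a few lines. Again, this is only an illustration under stated assumptions and not the M-Files permission model: the `USER_CUSTOMERS` mapping, the `can_view` function, and the rule that unclassified documents are visible to everyone are all hypothetical.

```python
# Hypothetical mapping of users to the customers they are allowed to work with.
USER_CUSTOMERS = {
    "ana": {"Acme"},
    "marko": {"Acme", "Globex"},
}


def can_view(user, metadata, user_customers=USER_CUSTOMERS):
    """Decide access from the document's metadata instead of its location."""
    customer = metadata.get("customer")
    if customer is None:
        # Assumption for this sketch: documents without a customer are public.
        return True
    return customer in user_customers.get(user, set())


acme_contract = {"type": "contract", "customer": "Acme"}
globex_invoice = {"type": "invoice", "customer": "Globex"}
handbook = {"type": "policy"}

ana_sees_invoice = can_view("ana", globex_invoice)
marko_sees_invoice = can_view("marko", globex_invoice)
```

Changing the `customer` tag on a document changes who can see it, without moving the document or editing permissions by hand.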

Conclusion

With the rise of digital information and data, businesses need comprehensive and streamlined solutions to manage their documents efficiently.

While basic solutions such as personal cloud storage may provide a temporary solution, specialized Document Management Systems like M-Files, Microsoft Office 365, Hyland, and Box offer advanced functionalities. Factors like budget, scalability, ease of use, and integration capabilities must be considered when selecting a DMS, as there is no one-size-fits-all solution.

M-Files stands out as a leader in the field with its unique metadata-driven approach, enabling intelligent organization and retrieval of documents based on content and context.

Investing in a robust document management solution is crucial for businesses to enhance productivity, collaboration, and data security, regardless of the DMS chosen.

The challenges in building OCR models

Challenges and innovations in the IT world at Advanced Technology Days

Advanced Technology Days was held in Zagreb for the 17th time! The conference has become a traditional gathering of IT enthusiasts in the SEE region with an emphasis on new technologies and innovations in the field.

This year Unitfly had two presenters. Our COO Alan Debijađi talked about Azure Synapse Analytics, an Azure platform that combines enterprise data warehousing and big data analytics to ensure centralized management of data lakes and warehouses. On the other side of the conference room, our Software Engineer Dino Grgić presented the challenges of optical character recognition (OCR) models, the topic we will cover today.

Software Engineer Dino Grgić/ Unitfly

Introduction

The process of converting an image of printed or handwritten text into machine-readable text, also known as “optical character recognition”, became publicly widespread in the early 1990s.

Since then, the technology has improved a great deal. Nowadays, digitizing handwritten documents is one of the many benefits OCR offers.

Are today’s OCR solutions accurate enough and no longer challenging? Do they still require deep learning?

These are some of the questions our colleague Dino set out to answer in his presentation at this year’s Advanced Technology Days conference.

What is OCR?

Before we get to the bottom of the issue regarding OCR, let’s get to know a term called Computer Vision: a field of artificial intelligence (AI) that enables computers and systems to identify and understand objects in digital images, videos, and other visual inputs, and to take actions based on that information.

Picture 1. Computer Vision system identifying objects on the street
source: https://appen.com

OCR is a subfield of Computer Vision. It recognizes text in an image and converts it into machine-readable text data. Some of the fields where OCR is used and useful are:

  • License plate recognition
  • Traffic sign recognition
  • Helping blind and visually impaired people read text
  • Converting handwritten notes to machine-readable text
  • Translation from one language to another

Picture 2. OCR – Converting notes to machine-readable text
source: https://research.aimultiple.com/handwriting-recognition/

Picture 3. OCR – Translation (Google translation using OCR)

OCR yields very good results for general use cases. However, there are many specific cases where deep learning is still required.

For example, consider detecting data in unstructured incoming invoices written in Croatian when the current OCR model was trained on English and does not recognize some letters specific to Croatian (Č, Ć, Š, Ž, …). This is a perfect example of a case where OCR needs improvement before the model can reliably meet the requirement.
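A quick way to see the problem is to compare the characters in a sample invoice line with the alphabet a model can output. The sketch below is a simplification: it assumes the English model’s alphabet is plain ASCII letters, digits, punctuation, and the space character, and the invoice line is made up for illustration.

```python
import string

# Assumption: an English-only model can output just these characters.
ENGLISH_ALPHABET = set(
    string.ascii_letters + string.digits + string.punctuation + " "
)


def unsupported_characters(text, alphabet=ENGLISH_ALPHABET):
    """Return the sorted characters in text that the alphabet does not contain."""
    return sorted(set(text) - alphabet)


# Made-up invoice line containing Croatian letters.
invoice_line = "Račun br. 2023-15, Đakovo, iznos: 1.250,00 kn"
missing = unsupported_characters(invoice_line)
```

Any character reported in `missing` is one the model can never produce, no matter how clean the scan is, so it would have to be added to the alphabet and covered by training data.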

How to create an OCR model

If the use case is too specific, we might need to develop and train our own model. To do that, we need a data set that covers different fonts and formats, so that the model learns to recognize any given incoming invoice and the data in it. More data leads to a better model.

It is important to note that such a model can only be used in the field it was trained for (Croatian incoming invoices). It cannot be used, for example, on Arabic incoming invoices, because the data set it was trained on is different.

3-step process of creating an OCR model

  1. Pre-processing
    Inserting an image into the model, from which we want the computer to learn. Every image goes through a lot of filters before any text is detected.
  2. Text detection + text recognition
    Using bounding boxes, we detect the location of the text, and with text recognition, we train the computer to read it.
  3. Post-processing
    Converting the data that we processed in the previous step and generating the output in the form we want: a document file, an Excel sheet, etc.
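The three steps above can be sketched as a toy pipeline in plain Python. This is only an illustration under heavy assumptions, not a real OCR model: a simple threshold stands in for the many pre-processing filters, empty columns stand in for a text detector, and exact comparison with three hand-made 3x3 glyph templates stands in for a trained recognizer. All names (`TEMPLATES`, `preprocess`, `detect`, `recognize`, `postprocess`, `run_ocr`) are hypothetical.

```python
import csv
import io

# Assumption: tiny 3x3 glyph templates stand in for a trained recognizer.
TEMPLATES = {
    "L": ("#..", "#..", "###"),
    "T": ("###", ".#.", ".#."),
    "O": ("###", "#.#", "###"),
}


def preprocess(gray, threshold=128):
    """Step 1: binarize a grayscale image (dark pixels become True = ink)."""
    return [[pixel < threshold for pixel in row] for row in gray]


def detect(binary):
    """Step 2a: find bounding boxes (left, top, right, bottom) of ink blobs
    separated by empty columns; right and bottom are exclusive."""
    width = len(binary[0])
    ink_columns = [any(row[x] for row in binary) for x in range(width)]
    boxes, start = [], None
    for x, has_ink in enumerate(ink_columns + [False]):
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            rows = [y for y, row in enumerate(binary) if any(row[start:x])]
            boxes.append((start, rows[0], x, rows[-1] + 1))
            start = None
    return boxes


def recognize(binary, box):
    """Step 2b: read one box by exact comparison with the templates."""
    left, top, right, bottom = box
    crop = tuple(
        "".join("#" if cell else "." for cell in row[left:right])
        for row in binary[top:bottom]
    )
    for char, template in TEMPLATES.items():
        if crop == template:
            return char
    return "?"  # unknown glyph


def postprocess(chars):
    """Step 3: emit the recognized text as one CSV row."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(["text", "".join(chars)])
    return buffer.getvalue().strip()


def run_ocr(gray):
    binary = preprocess(gray)
    return postprocess(recognize(binary, box) for box in detect(binary))


# A small grayscale "scan" of the word LOT, drawn with '#' for ink.
picture = [
    ".............",
    ".#...###.###.",
    ".#...#.#..#..",
    ".###.###..#..",
    ".............",
]
scan = [[40 if ch == "#" else 230 for ch in row] for row in picture]
result = run_ocr(scan)
```

Exact template matching breaks as soon as a glyph is blurred, tilted, or drawn in another font, which is why real systems replace step 2 with models trained on large data sets.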

It is easy to recognize regular text, but ongoing research focuses on recognizing irregular text: blurred text, text partly hidden by an object, text on a transparent background with noise behind it, italic text, text in bad lighting, etc.

Picture 5. Technical challenges of OCR algorithms
Source: Chenxia Li et al., “Dive into OCR”

Conclusion

There is no such thing as a 100% effective and accurate OCR model. Each OCR model is built with a specific task in mind, so it is not possible to generalize the solution easily. Systems that depend on OCR depend on its quality, so the OCR field will always seek improvements in model accuracy.

You can find the presentation, demo, and useful links regarding OCR in Dino’s GitHub repository.
