Our Machine Learning Assets team develops and delivers machine learning-based solutions for scanning and extracting information from invoices and receipts (Smartscan), and a prediction engine that learns from user behaviour (AutoSuggest). Last week, the team launched a new version of Smartscan.
For the past 6 months, the team has researched and developed a completely new version of Smartscan using a new technology called ‘transformers’. This new technology is – thanks to the extremely large machine learning models it enables – rewriting the rules of natural language processing (NLP).
The new version was trained on 40 times more data than the previous version. Also, the machine learning model itself is 100 times larger (with close to half a billion weights). But what really matters is the impact for users:
- 50% less work left for the users: the new Smartscan returns more information, with fewer errors.
- 100% increase in invoice automation rate: the number of invoices where Smartscan provides complete and correct information has more than doubled.
The platform underneath this new Smartscan release is extremely flexible and highly scalable and will carry the ML Assets team well into the future.
Closing the gaps
What’s new in this generation of Smartscan is a sophisticated new model of the document structure – a layout model – trained on millions of documents.
The layout model allows Smartscan to close a few functionality gaps. While a lot of invoices look very similar, some tasks are harder for a machine learning system, in particular finding information that isn’t standardised.
While most of us know exactly where we would look to find the total amount on an invoice, that’s not always the case for other kinds of information. One such example is the customer ID number (KID) used in Norway.
There are no general rules, not even a general convention, about exactly where to put this ID on an invoice. However, the new Smartscan model manages to find the KID almost every time (more than 95% of all KID numbers found).
The advantage of scale
Building this system would not even be possible without Visma’s advantage of scale. Our customers receive tens of millions of invoices every month, and Smartscan processes a big chunk of all these invoices.
In machine learning, size matters. If you want outstanding performance, you need really big machine learning models. But there’s a hitch: the size of your machine learning model is limited by the amount of data available to train it on.
Training a machine learning model well is very much like digesting and compressing a big dataset so that only its general features remain. You can try to train on insufficient data, but you will most likely run into a problem known as overfitting.
Instead of learning from the dataset, the model ends up simply memorising the dataset. The end result is an inflexible model that is able to repeat the information it was fed but not very good at analysing data it has not seen before.
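To make the difference between learning and memorising concrete, here is a minimal sketch on toy data. It has nothing to do with how Smartscan itself is built: the data, the function names (`fit_line`, `fit_memoriser`, `run_demo`) and the numbers are all assumptions chosen for illustration. A two-parameter straight line stands in for a model that compresses the data, and a lookup of the closest stored example stands in for a model that memorises it.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b: a model with only two parameters."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def fit_memoriser(xs, ys):
    """Stores every training pair and answers with the label of the closest stored x."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a model on a dataset."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def sample(rng, n):
    """Toy data (an assumption for this sketch): y = 2x plus Gaussian noise."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def run_demo(seed=0):
    rng = random.Random(seed)
    train_x, train_y = sample(rng, 50)    # small training set
    test_x, test_y = sample(rng, 1000)    # data the models have never seen
    line = fit_line(train_x, train_y)
    memo = fit_memoriser(train_x, train_y)
    return {
        "line_train": mse(line, train_x, train_y),
        "line_test": mse(line, test_x, test_y),
        "memo_train": mse(memo, train_x, train_y),
        "memo_test": mse(memo, test_x, test_y),
    }


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value:.3f}")
```

The memoriser repeats its training data perfectly, so its training error is zero, yet it also repeats the noise in that data and does worse than the simple line on the unseen samples. Real overfitting in large neural networks is subtler, but the pattern of very low training error paired with higher error on new data is the same warning sign.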
Underneath the new Smartscan is a completely new machine learning platform with room to grow even bigger. We expect to grow the system an additional 3 to 10 times in size based on the data available.
In addition to that, a future release of Smartscan promises to improve the quality and flexibility of document scanning in a number of important ways:
- Closer and closer to human performance. In the future, Smartscan will even be able to correct errors in the inputs we get, for example by fixing text recognition errors in the documents we process.
- Adaptable scanning. The bigger the model, the quicker it learns new tasks. Smartscan will be able to adapt to new tasks given only small sets of examples.
- Creative abilities. As of today, Smartscan offers pure extraction: the system simply identifies text with a particular meaning in documents. In a future release, Smartscan will be able to infer information from documents. This includes providing answers that can’t be read directly off the documents, which opens up new possibilities for using Smartscan in document evaluation and assessment tasks.
Smartscan by the numbers
Smartscan processes more than 4 million documents per month, saving our customers hours of tedious work. Replacing Smartscan with human labour would take several hundred full-time employees. Smartscan powers document recognition in systems such as Tripletex, eAccounting, Visma Expense, e-conomic, Proactive, and Dinero, and in apps such as Visma Attach and Visma Mobile Employee.
More Visma systems are slated to take advantage of Visma Smartscan in the future. With this new release, we expect this trend to accelerate.