From paper to digital invoices using machine learning

Sam Verhaegen - 2018-05-24


After the first successful steps in the world of digital pathology and Artificial Intelligence (AI), IxorThink has been focussing on an automated invoice analysis system for IxorDocs.

What is IxorDocs
IxorDocs is the digital link between your company and its clients, employees and the government. Your invoices are sent securely to the correct party, including the international PEPPOL-platform. Apart from Business to Government (B2G) and Business to Business (B2B) e-invoicing, IxorDocs can also be used to optimise the HR document flow.

IxorDocs and AI
As part of the document flow, we need to transform an invoice in PDF format into UBL (Universal Business Language), an XML-based standard format for digital invoices.
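To give an idea of what the target format looks like, here is a minimal sketch in Python that writes two extracted fields into a partial UBL invoice. The namespaces and the `cbc:ID` and `cbc:IssueDate` elements come from the UBL 2.x Invoice schema. The function name `fields_to_ubl`, the dictionary keys and the sample values are assumptions made for this illustration and do not describe how IxorDocs works internally. The output is also far from a complete, valid invoice, which requires many more elements.

```python
import xml.etree.ElementTree as ET

# Namespaces from the OASIS UBL 2.x Invoice schema.
NS_INVOICE = "urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
NS_CBC = "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"

ET.register_namespace("", NS_INVOICE)
ET.register_namespace("cbc", NS_CBC)


def fields_to_ubl(fields: dict) -> str:
    """Serialise a few extracted fields as a partial UBL Invoice.

    Hypothetical helper for illustration only; the keys of `fields`
    are assumed names, not part of any real IxorDocs interface.
    """
    invoice = ET.Element(f"{{{NS_INVOICE}}}Invoice")
    # cbc:ID carries the invoice number, cbc:IssueDate the date (YYYY-MM-DD).
    ET.SubElement(invoice, f"{{{NS_CBC}}}ID").text = fields["invoice_number"]
    ET.SubElement(invoice, f"{{{NS_CBC}}}IssueDate").text = fields["issue_date"]
    return ET.tostring(invoice, encoding="unicode")


# Made-up sample values.
example = {"invoice_number": "2018-0042", "issue_date": "2018-05-24"}
ubl_xml = fields_to_ubl(example)
print(ubl_xml)
```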
For humans it is not a complex task to recognise a name or VAT number in an invoice. We rely on both the layout and the structure of the text. For a computer, however, this is not a trivial task. To avoid manual intervention during document processing, AI is the way to go. Machine Learning (ML) gives a computer the ability to "learn" from a set of data, without being explicitly programmed.
Using machine learning, it is possible to analyse a PDF invoice automatically. This means detecting all fields of interest, such as the invoice number, order number, date and VAT number, so that the document flow can be handled fully automatically.
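As an illustration of what "detecting fields" can mean in practice, the sketch below assumes that a model labels every token of the invoice text with a BIO tag, a common convention in entity recognition where `B-` starts a field, `I-` continues it and `O` means "no field". The tag names, the function `collect_fields` and the sample tokens are hypothetical. The article does not state which tagging scheme the IxorThink model uses.

```python
from typing import Dict, List, Tuple


def collect_fields(tagged_tokens: List[Tuple[str, str]]) -> Dict[str, str]:
    """Group BIO-tagged tokens into a {field_name: text} dictionary.

    Only the first occurrence of each field is kept; I- tags that do
    not continue an open field of the same name are ignored.
    """
    fields: Dict[str, str] = {}
    label, words = None, []
    # A trailing sentinel "O" tag flushes the last open field.
    for token, tag in list(tagged_tokens) + [("", "O")]:
        if tag.startswith("I-") and tag[2:] == label:
            words.append(token)
            continue
        if label is not None:
            fields.setdefault(label, " ".join(words))
        if tag.startswith("B-"):
            label, words = tag[2:], [token]
        else:
            label, words = None, []
    return fields


# Made-up model output for a few tokens of an invoice.
tagged = [
    ("Invoice", "O"), ("number", "O"), ("2018-0042", "B-INVOICE_NUMBER"),
    ("Date", "O"), ("24/05/2018", "B-DATE"),
    ("VAT", "O"), ("BE", "B-VAT_NUMBER"), ("0123.456.789", "I-VAT_NUMBER"),
]
extracted = collect_fields(tagged)
print(extracted)
```

The resulting dictionary is the kind of structured data that can then be mapped onto the corresponding UBL elements.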

Results & future steps
Although only a small dataset is available at the moment, our IxorThink team was able to create and train a named entity recognition (NER) model that correctly analyses new invoices. These can be invoices that follow a known template, or unseen invoices from new customers.
At this moment we are able to correctly detect the most important fields, so an important next step is to roll out this proof of concept further. The model itself can also be expanded to detect, for example, all invoice lines.
The ultimate goal is to extract useful data from other types of documents too.

At Ixor we are constantly in search of exciting ways to implement new technologies in our products. If you'd like to stay updated on our progress in AI, Document Management and Internet Of Things (among other things), please register for our newsletter here.
