“This contribution plays a part in democratising AI. We have made Document AI more freely and widely available to other developers in organisations all around the world.”
Hi Christoffer! You work as a data scientist in machine learning (ML) at Visma – how is that and what do you do?
First of all, it is a lot of fun. We get to come up with creative solutions using deep learning to solve challenging problems in Document AI. We read research papers and turn the knowledge from them into our two machine learning products: SmartScan and AutoSuggest.
In a few words, what is Hugging Face exactly?
Hugging Face is a company known for their enormous efforts to democratise AI through their transformer implementations and the accompanying model hub, which is like an app store for machine learning models. The entire thing is open-source, so everyone is able to contribute – it really is a community effort. Researchers and practitioners use it in their everyday work to create (more or less) all of the amazing AI applications that you see everywhere.
“Hugging Face is like an app store for machine learning models.”
I’m curious to know what you use the community for?
The Visma Machine Learning Assets team works in the field of Document AI, specifically with SmartScan. The team started using Hugging Face in 2020, when we wanted to explore the Transformer model, which is a natural language processing model that Hugging Face specialises in. They have a repertoire of almost 70,000 machine learning models for different use cases and languages.
So, we’re using their image models and natural language models as a starting point and fine-tuning them on our data. This means that we can quickly iterate and experiment with what works and what doesn’t work for us. Needless to say, Hugging Face Transformers is a huge asset to our team.
What’s more, we’ve also found that the community around Hugging Face is great for discovering relevant research papers and new algorithms.
So, you recently contributed a TensorFlow model at Hugging Face Transformers. Can you tell us about that?
The specific thing that the team and I contributed is an AI model, called LayoutLMv3. It’s a huge state-of-the-art model from Microsoft in the field of Document AI. It combines some of the best from computer vision and natural language processing in one model that is able to understand documents beyond just text. On top of that, the LayoutLMv3 model is super accurate, amazingly fast, and also relatively simple at the same time.
We’ve been using TensorFlow (Google’s AI framework) in our team for a long time, and we saw that a TensorFlow version of LayoutLMv3 didn’t exist, so we decided it was time to give back to the community that has given us so much.
“The LayoutLMv3 model is super accurate, amazingly fast, and also relatively simple at the same time.”
This is the first TensorFlow model at Hugging Face that was added entirely by a contributor (including model architecture and weights). What’s in it for the community?
Besides the benefits we get from the model, this implementation we shared with the Hugging Face community plays a part in democratising AI. The model becomes much more freely and widely available to other developers in organisations around the world, which I think is pretty cool.
What does this mean to you? (And congrats!)
Of course it’s a privilege to give back, as it’s nice to be not only a consumer but also a contributor in the community, so I think it’s really great that we have the talent and support to do this. I also want to give a big shout-out to the rest of the Visma Machine Learning Assets team and my co-contributors Esben Toke Christensen and Lasse Reedtz for pushing this over the finish line. Also a big thank you to the Hugging Face team for their incredible work and support.
“Hugging Face Transformers is a huge asset to our team, so it is a privilege for me to be able to give back to the community by contributing this implementation.”
Are you interested in how we can help you with machine learning?
For more information about the Machine Learning Assets team and our products SmartScan and AutoSuggest:
If you want to learn more about LayoutLMv3, read this article: LayoutLMv3: Pre-training for Document AI