
Document Topic Modeling

Company

BoxOfDocs is a government technology company headquartered in Calgary, Canada. The first of its kind, the company provides a knowledge-sharing platform designed specifically for the use of local governments.

It aims to be the world’s largest secure library of public and member-contributed municipal documents – policies, practices, support documents from local governments – and hopes to unite communities around the world to solve common problems together. Its revolutionary platform gives governments access to documents to help them make more informed decisions and improve policies.

Requirement

Local government meetings may comprise individuals from various regions, offices, and municipalities. Discussions include a wide range of well-known and unfamiliar topics.

The BoxOfDocs platform stores the meeting minutes and uses artificial intelligence and machine learning techniques to identify topics that could help municipalities develop policies and make decisions. The platform currently holds over 140,000 sets of meeting minutes covering a broad mix of topics, and that number increases by a thousand every day.

BoxOfDocs wanted to market its platform to local governments around the world. Their platform uses ML to:

  • Extract information
  • Identify the topics discussed
  • Group topics
  • Give users access to the most discussed topics
  • Provide users with the top 10-15 topics discussed
  • Provide users with unique topics that they might not be aware of but might find relevant
  • Ensure users have access to topics they are unfamiliar with
  • Help governments and municipalities share documents to make better decisions regarding new policies
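As a rough illustration of the "most discussed" and "unique" topic ideas in the list above, here is a minimal Python sketch. It is not how BoxOfDocs or mlOS works internally; it swaps in plain keyword counting for real topic modeling, and the names STOP_WORDS, extract_terms, and rank_topics, along with the sample minutes, are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "and", "was", "were", "for", "with", "that", "this"}


def extract_terms(text):
    """Return the set of lowercase terms in one document, minus stop words and short words."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if len(w) > 2 and w not in STOP_WORDS}


def rank_topics(documents, top_n=10):
    """Rank terms by document frequency and list terms seen in only one document.

    Assumption: a single keyword stands in for a "topic" here.
    """
    doc_freq = Counter()
    for text in documents:
        # Each document counts a term once, however often it repeats.
        doc_freq.update(extract_terms(text))

    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    most_discussed = [term for term, _ in ranked[:top_n]]
    unique = sorted(term for term, count in doc_freq.items() if count == 1)
    return most_discussed, unique


minutes = [
    "Council discussed zoning and transit funding.",
    "Transit funding was approved; zoning deferred.",
    "Beekeeping bylaw introduced alongside transit funding.",
]
most_discussed, unique = rank_topics(minutes, top_n=3)
print(most_discussed)
print(unique)
```

In this toy example the widely shared terms rise to the top, while a term such as "beekeeping" lands in the unique list as something other municipalities may not have considered.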

The prototype needed to be ready in two months so it could be shown to investors.

Problem

BoxOfDocs had massive amounts of data, but users could not access it easily. The team sifted through and coded the data manually, a laborious and time-consuming process. What it needed was an automated pipeline that would read documents as soon as they arrived and group them by common topic. The platform also had to retrieve the most relevant information and return instant results when users searched for a particular topic.

The company’s data scientists were working toward a solution, but it involved reading through each document and then writing code for each one. Because the process was manual, the project consumed a lot of time.

Another problem was that new meeting minutes were added to the platform every day. These documents had to be processed quickly, with predictions made available to users seamlessly.

They also needed a solution that could be reused on new sets of documents as they were added to the platform. This would avoid the time wasted creating a new solution for each new set of documents. This capability was critical to BoxOfDocs staying competitive and quickly providing its users with updated information.

BoxOfDocs tried a variety of services that did not work out. Some solutions worked to a certain degree but were too expensive to implement on a large scale. Additional computational and technological costs also had to be considered.

Time was of the essence, and BoxOfDocs wanted a quick solution that would keep its platform current. Piecing together a solution in-house was out of the question. The entire process would need…

  • senior data scientists to implement algorithms to read the data
  • a team of software engineers or AI developers to automate the process
  • developers who could develop and integrate APIs into the application

The project would likely require a team of 10 highly skilled experts and a manager. BoxOfDocs did not have the expertise in-house, and hiring such a team was out of reach.

Solution

BoxOfDocs deployed mlOS, the Machine Learning Operating System. This feature-rich solution allows the team to build, deploy, and manage models at scale. It has so far proven to be a fast and cost-effective way for BoxOfDocs to produce the ML that powers its Document Topic Modeling platform.

mlOS

mlOS is an automated machine learning platform that enables uniform, fast, and repeatable ways to build, deploy, and manage ML models. It has everything a team needs to create, deploy, and control models from one place. It is easy to use and easily accessible. The generated code is easily customizable, and the fully automated pipeline makes the entire process repeatable.

mlOS can be used seamlessly by developers of all skill levels. The entire team can collaborate so several members can work on a project simultaneously.

Data scientists from Braintoy, the maker of mlOS, offered private consultations to ensure BoxOfDocs’ needs were understood and the platform was used to its full potential.

One person could now do the job of 10!

Results

BoxOfDocs needed a solution that was simple, fast, cost-effective, repeatable, reliable, and robust. After exploring many options, they decided to implement Braintoy’s mlOS to see what could be achieved.

BoxOfDocs is able to accomplish a lot with its small dedicated team. Braintoy has a data scientist working with BoxOfDocs on a regular basis to ensure they get full value. It’s a partnership.

With mlOS, algorithms can be created and updated in real time, so BoxOfDocs users always have access to the latest information. One distinctive feature is that it can work with any type of data and process it repeatedly within the same machine learning pipeline. Algorithms run on new data as it is added, and the same algorithm can be applied to all data of a similar type. This feature has been exceptionally beneficial to BoxOfDocs, as data is constantly added to the platform.
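To make the idea of processing new data as it arrives more concrete, here is a small Python sketch of an index that is updated one document at a time and can be searched immediately. It is only an illustration under stated assumptions: it uses a simple inverted index rather than a machine learning model, it says nothing about how mlOS is built, and the MinutesIndex class and its methods are hypothetical names.

```python
import re
from collections import defaultdict


class MinutesIndex:
    """Toy inverted index: each term maps to the ids of documents that mention it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add_document(self, doc_id, text):
        """Index one new document without reprocessing earlier ones."""
        for term in set(re.findall(r"[a-z]+", text.lower())):
            self.postings[term].add(doc_id)

    def search(self, term):
        """Return the sorted ids of documents that mention the term."""
        return sorted(self.postings.get(term.lower(), set()))

    def top_terms(self, n=10, min_length=4):
        """Rank terms by the number of documents that mention them.

        Very short words are skipped as a crude stand-in for stop-word removal.
        """
        ranked = sorted(
            ((t, len(ids)) for t, ids in self.postings.items() if len(t) >= min_length),
            key=lambda pair: (-pair[1], pair[0]),  # count descending, then alphabetical
        )
        return [term for term, _ in ranked[:n]]


index = MinutesIndex()
index.add_document(1, "Zoning and transit funding")
index.add_document(2, "Transit funding approved")
before = index.search("zoning")

# A new set of minutes arrives later; only this document is processed.
index.add_document(3, "Zoning appeal heard")
after = index.search("zoning")
print(before, after)
print(index.top_terms(2))
```

Because each new document only touches the entries for its own terms, earlier documents never need to be re-read, which is the property the paragraph above describes.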

Plug & Play

Another advantage is that BoxOfDocs has not had to modify its existing processes to accommodate Braintoy’s platform. mlOS works seamlessly with all data inputs. Using mlOS does not require any specialized skills; anyone with a basic understanding of data science can use it.

New data comes in daily and models are quickly updated and deployed so users have instant access to the most current information. With just a click, models can be updated or replaced without interrupting the other applications’ normal functions.

Highlights

  • A fully automated pipeline – all that is required is the input of data; within a few clicks, the team can create a model that provides the desired output (for example, the top 10 topics discussed).
  • BoxOfDocs got intelligent machine learning designed around a customized application instantly.
  • mlOS is integrated into the BoxOfDocs software application “out-of-the-box” so that users get instant access to the most current and precise documents they require.
