
Truckers Win by Predicting Empty Miles

Trucking operators struggle to maintain healthy operating ratios. Competition has intensified, and costs have risen while prices have not. Operators need to cut costs and improve operational efficiency by eliminating waste: unnecessary movement of goods, empty miles, poorly timed maintenance, and unplanned equipment breakdowns.

A major culprit that worsens the operating ratio is empty miles. Empty miles, also known as non-revenue miles or deadhead miles, are the miles a truck drives without earning any revenue. This happens when there is a load from origin to destination but no load for the return trip, so the truck drives back empty. Recent research (Latheef, 2020) indicates that empty miles account for 33% to 40% of total miles driven!
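
To make the percentage concrete, here is a minimal sketch of how the empty-mile share could be computed. The function name and the mileage figures are illustrative assumptions, not values from the research cited above.

```python
def empty_mile_share(loaded_miles: float, empty_miles: float) -> float:
    """Return empty miles as a fraction of total miles driven."""
    total = loaded_miles + empty_miles
    if total <= 0:
        raise ValueError("total miles must be positive")
    return empty_miles / total


# Hypothetical week of driving: 6,000 loaded miles and 3,000 empty miles.
share = empty_mile_share(loaded_miles=6000, empty_miles=3000)
print(f"Empty-mile share: {share:.1%}")  # prints "Empty-mile share: 33.3%"
```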

Empty miles are wasteful for trucking operators, and everyone pays for them: the shipper, the end user, and the environment. Empty trucks on the road contribute to 76 million metric tons of carbon dioxide emissions each year (Peterson, 2019).

The Challenge

The problem of empty miles faced by a trucker
Source: John Woods/Globe and Mail/2020: David Phillips, as seen in Winnipeg on April 3, 2020.

“We are hauling a lot more empties than we used to because of the virus,” said David Phillips, who’s been driving long haul for 33 years. “There are days where I’ve hauled two empties all the way back.”

Can Artificial Intelligence help Mr. David Phillips and the trucking company reduce empty miles?

Business Understanding

Empty miles result from inefficient scheduling combinations. This work used real-world data from a Western Canadian trucking company. While scoping the problem and the available data, it became clear that trucking companies typically keep records of all their shipments over the years. These records include information about the shipper, the consignee, the driver, and the dispatcher, as well as monetary information such as outsourcing and insourcing costs, pickup and drop-off costs, fuel cost, driver pay, and total charges. With this data, machine learning can be used to predict cost, revenue, operating ratio, delays, deadhead miles, the type of trailer requested, and more.

Industry-specific knowledge was important for selecting input variables that would produce a good model. In the simplest case, predicting whether a particular trip will have empty miles, a label was created from the recorded empty miles: 1 if the trip has no empty miles, 0 otherwise. Shipper City, Consignee City, Client Type, Trailer Type, and Month of the Year were the input variables used to predict the target variable, empty miles.
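
The labelling rule can be sketched in a few lines of Python. This is only an illustration of the rule described above: the `Trip` record, its field names, and the sample values are hypothetical, and the company's actual schema is not shown in this article.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    # Field names are hypothetical; the article only names the variables.
    shipper_city: str
    consignee_city: str
    client_type: str
    trailer_type: str
    month: int          # month of the year, 1-12
    empty_miles: float  # recorded deadhead miles for the trip


def make_label(trip: Trip) -> int:
    """Return 1 if the trip has no empty miles, otherwise 0."""
    return 1 if trip.empty_miles == 0 else 0


def make_features(trip: Trip) -> dict:
    """Collect the five input variables; the target is deliberately excluded."""
    return {
        "shipper_city": trip.shipper_city,
        "consignee_city": trip.consignee_city,
        "client_type": trip.client_type,
        "trailer_type": trip.trailer_type,
        "month": trip.month,
    }


# Made-up example records.
trips = [
    Trip("Calgary", "Winnipeg", "retail", "dry van", 7, 0.0),
    Trip("Edmonton", "Regina", "retail", "reefer", 12, 412.5),
]
labels = [make_label(t) for t in trips]
print(labels)  # prints [1, 0]
```

Keeping the recorded empty miles out of the feature dictionary matters: the label is derived from that column, so including it as an input would leak the answer to the model.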

Machine Learning

An end-to-end AI app was created to predict the likelihood of a loaded trip back. This was done using supervised machine learning in the Braintoy machine learning Operating System (mlOS).

Figure 1: Raw Data Analyses and Data Visualization

The raw data was transformed for detailed analysis. The analysis shows that empty miles are lowest in the winter months and highest in July.
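
A seasonal comparison like this can be produced by aggregating trips by month. The sketch below assumes each trip is reduced to a `(month, loaded_miles, empty_miles)` tuple; that layout and the sample numbers are made up for illustration and are not the company's data.

```python
from collections import defaultdict


def empty_share_by_month(trips):
    """Map each month to empty miles divided by total miles for that month.

    trips: iterable of (month, loaded_miles, empty_miles) tuples.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # month -> [empty, total]
    for month, loaded, empty in trips:
        totals[month][0] += empty
        totals[month][1] += loaded + empty
    return {m: e / t for m, (e, t) in sorted(totals.items()) if t > 0}


# Made-up sample: two January trips and two July trips.
sample = [(1, 800, 200), (1, 700, 300), (7, 500, 500), (7, 600, 400)]
shares = empty_share_by_month(sample)
print(shares)  # prints {1: 0.25, 7: 0.45}
```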

There could be many reasons for these cyclic changes in empty miles. For example, if the clients of this trucking company are in the consumer goods industry, then the winter months that cover Black Friday, Christmas, Boxing Day, and New Year could be busier than July and August, when consumers may go on vacation. Therefore, much like “trailer type” and the “shipper” and “consignee” locations, the “client type” could also have an impact on empty miles. Then there are the recent changes caused by COVID-19. All these factors were considered in developing a model to predict whether a load will result in empty miles or not.

Supervised machine learning was used to learn the association between these input features and the target variable: will the trip back be empty or not? The features with higher importance were selected to improve model performance. The model predicted 0 or 1 for the target variable, meaning empty miles or no empty miles, respectively, on the trip back.

Accuracy and the ROC (receiver operating characteristic) curve were the two criteria used to evaluate the models. Of all the algorithms tried, Random Forest performed best, with 90.59% accuracy and a decent ROC curve. Given that this is real-world data, this performance is good.
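
For readers unfamiliar with these two metrics, the sketch below computes accuracy and the area under the ROC curve (AUC) from scratch. It is not the evaluation code used in mlOS; the labels, scores, and the 0.5 decision threshold are made-up assumptions for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive example scores higher than a random negative one (ties count half).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = no empty miles) and model scores for six trips.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(round(accuracy(y_true, y_pred), 4))  # prints 0.6667
print(round(roc_auc(y_true, scores), 4))   # prints 0.8889
```

Accuracy depends on the chosen threshold, while the AUC summarizes how well the scores rank loaded trips above empty ones across all thresholds, which is why the two are usually reported together.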

Figure 2: Empty Miles Classification Models created using Auto Pilot in Braintoy mlOS


Preventing empty miles lowers cost, increases profitability, and reduces the carbon footprint. If 33% to 40% of the total miles driven are empty miles, imagine what a knowledgeable dispatcher in a trucking company can do by knowing in advance, with up to 90.59% accuracy, whether a trip is going to come back empty or not!

But this is just the start. The models should improve as more data comes in. Many more models can be built to help the dispatcher plan trips so that empty miles are reduced and total revenue increases at the same time.


Jaspreet Gill is a Data Scientist and Software Engineer with a Master’s in System Software and a Postgraduate Diploma in Machine Learning and Artificial Intelligence. She is skilled in ML/AI, anomaly detection and preventative maintenance, modelling, and software engineering.