To predict food delivery time in real time, we first need the distance between the point where the food is prepared and the point where it is consumed. Once we have the distance between the restaurant and the delivery location, we also need to analyze how long delivery partners have taken in the past to deliver food over similar distances.
For this task, we require a dataset containing information about the time taken by delivery partners to deliver food from the restaurant to the delivery location. We found an ideal dataset with all the necessary features for this task, which you can download from here.
In the section below, we will take you through the task of Food Delivery Time Prediction with Machine Learning using Python.
Food Delivery Time Prediction using Python
We will begin the task of predicting food delivery time by importing the necessary Python libraries and loading the dataset:
import pandas as pd
import numpy as np
import plotly.express as px
data=pd.read_csv("/Users/rahul_anand/Downloads/deliverytime.txt")
print(data.head())
Now, let’s take a quick look at the columns and their summary statistics before continuing:
data.info()
data.describe().T
Next, we will check whether the dataset contains any null values:
data.isnull().sum()
Calculating Distance Between Two Latitudes and Longitudes
The dataset does not include a direct feature showing the distance between the restaurant and the delivery location. Instead, it provides the latitude and longitude points for both. To calculate the distance between these two points, we can use the Haversine formula, which computes the great-circle distance based on latitude and longitude values.
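For reference, the Haversine formula can be written as follows. Here φ1 and φ2 are the two latitudes, Δφ and Δλ are the latitude and longitude differences (all in radians), R is the earth's radius, and d is the resulting distance. The symbol names are our own notation, chosen for illustration:

```latex
\begin{aligned}
a &= \sin^2\!\left(\frac{\Delta\varphi}{2}\right)
     + \cos\varphi_1 \,\cos\varphi_2 \,\sin^2\!\left(\frac{\Delta\lambda}{2}\right) \\
c &= 2\,\operatorname{atan2}\!\left(\sqrt{a},\ \sqrt{1-a}\right) \\
d &= R\,c
\end{aligned}
```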
Here’s how we can apply the Haversine formula to calculate the distance between the restaurant and the delivery location:
# Set the earth's radius in km
R = 6371

# Convert degrees to radians
def deg_to_rad(degrees):
    return degrees * (np.pi / 180)

# Calculate the distance between two points using the haversine formula
def distcalculate(lat1, lon1, lat2, lon2):
    d_lat = deg_to_rad(lat2 - lat1)
    d_lon = deg_to_rad(lon2 - lon1)
    a = np.sin(d_lat/2)**2 + np.cos(deg_to_rad(lat1)) * np.cos(deg_to_rad(lat2)) * np.sin(d_lon/2)**2
    c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))
    return R * c

# Calculate the distance between each pair of points
data['Distance'] = np.nan
for i in range(len(data)):
    data.loc[i, 'Distance'] = distcalculate(data.loc[i, 'Restaurant_latitude'],
                                            data.loc[i, 'Restaurant_longitude'],
                                            data.loc[i, 'Delivery_location_latitude'],
                                            data.loc[i, 'Delivery_location_longitude'])
We’ve calculated the distance from the restaurant to the delivery location and added it as a new column called Distance. Now, let’s check the dataset again:
print(data.head())
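The row-by-row loop above can be slow on a large dataset. As a minimal sketch, the same Haversine calculation can be applied to whole columns at once with NumPy. The names `haversine_km` and `EARTH_RADIUS_KM` are our own, introduced for illustration, and the clipping step is an added safeguard against floating-point rounding:

```python
import numpy as np

EARTH_RADIUS_KM = 6371  # same mean earth radius as R in the code above


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km for scalars or equal-length arrays given in degrees."""
    # Convert every input to a float array in radians
    lat1, lon1, lat2, lon2 = (
        np.radians(np.asarray(v, dtype=float)) for v in (lat1, lon1, lat2, lon2)
    )
    d_lat = lat2 - lat1
    d_lon = lon2 - lon1
    a = np.sin(d_lat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(d_lon / 2) ** 2
    # Guard against tiny rounding errors pushing a outside [0, 1]
    a = np.clip(a, 0.0, 1.0)
    c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))
    return EARTH_RADIUS_KM * c
```

With the dataset loaded, this could be used as data['Distance'] = haversine_km(data['Restaurant_latitude'], data['Restaurant_longitude'], data['Delivery_location_latitude'], data['Delivery_location_longitude']), which should give the same values as the loop.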
Data Exploration
Now, let’s explore the dataset to understand the relationships between different features. We’ll begin by examining how the distance impacts the time taken to deliver the food:
fig = px.scatter(data_frame=data,
                 x="Distance",
                 y="Time_taken(min)",
                 size="Time_taken(min)",
                 trendline="ols",
                 title="Relationship between Distance and Time Taken")
fig.show()
We can see a consistent relationship between the time taken and the distance traveled for food delivery: most delivery partners complete their deliveries within 25–30 minutes, regardless of the distance.
Now, let’s explore how the delivery time varies with the age of the delivery partner:
fig = px.scatter(data_frame=data,
                 x="Delivery_person_Age",
                 y="Time_taken(min)",
                 size="Time_taken(min)",
                 color="Distance",
                 trendline="ols",
                 title="Relationship Between Time Taken and Age")
fig.show()
We can observe a linear relationship between the delivery time and the age of the delivery partner: younger delivery partners tend to deliver food faster than older partners.
Now, let’s explore how the delivery time is related to the ratings of the delivery partner:
fig = px.scatter(data_frame=data,
                 x="Delivery_person_Ratings",
                 y="Time_taken(min)",
                 size="Time_taken(min)",
                 color="Distance",
                 trendline="ols",
                 title="Relationship Between Time Taken and Ratings")
fig.show()
We can see an inverse linear relationship between the delivery time and the ratings of the delivery partner. In other words, delivery partners with higher ratings tend to deliver food faster than those with lower ratings.
Next, let’s check whether the type of food ordered by the customer and the type of vehicle used by the delivery partner have any impact on the delivery time:
fig = px.box(data,
             x="Type_of_vehicle",
             y="Time_taken(min)",
             color="Type_of_order")
fig.show()
It looks like the type of vehicle used or the kind of food being delivered doesn’t make a big difference in the delivery time. From our analysis, the features that have the biggest impact on food delivery time are:
- Age of the delivery partner
- Ratings of the delivery partner
- Distance between the restaurant and the delivery location
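One way to sanity-check these visual impressions is to compute the Pearson correlation between each numeric feature and the delivery time. The sketch below assumes the column names used in this article. The helper name `correlation_with_target` is our own, and the small DataFrame contains made-up values for illustration only, not rows from the real dataset:

```python
import pandas as pd


def correlation_with_target(df, features, target="Time_taken(min)"):
    """Pearson correlation of each feature with the target, strongest first."""
    corr = df[features + [target]].corr()[target].drop(target)
    # Order by absolute strength so the most related features come first
    return corr.reindex(corr.abs().sort_values(ascending=False).index)


# Toy values, made up for illustration only (NOT taken from the real dataset)
toy = pd.DataFrame({
    "Delivery_person_Age": [22, 25, 30, 35, 38],
    "Delivery_person_Ratings": [4.9, 4.8, 4.6, 4.5, 4.2],
    "Distance": [3.0, 12.0, 5.0, 20.0, 8.0],
    "Time_taken(min)": [18, 22, 25, 30, 33],
})

feature_cols = ["Delivery_person_Age", "Delivery_person_Ratings", "Distance"]
print(correlation_with_target(toy, feature_cols))
```

On the real data, passing data instead of toy would give one number per feature; values near zero would suggest a weak linear relationship with delivery time.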
In the next section, we’ll walk through how to train a Machine Learning model to predict food delivery time.
Food Delivery Time Prediction Model
Now, let’s build and train a Machine Learning model using an LSTM neural network to predict food delivery time. We start by splitting the data, then define and train the model:
# Splitting data
from sklearn.model_selection import train_test_split
x = np.array(data[["Delivery_person_Age",
                   "Delivery_person_Ratings",
                   "Distance"]])
y = np.array(data[["Time_taken(min)"]])
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.10,
                                                random_state=42)
# The LSTM layer expects 3-D input (samples, timesteps, features),
# so each of the three features is passed as one timestep
xtrain = xtrain.reshape(-1, xtrain.shape[1], 1)
xtest = xtest.reshape(-1, xtest.shape[1], 1)

# Creating the LSTM neural network model
from keras.models import Sequential
from keras.layers import Dense, LSTM
model = Sequential()
model.add(LSTM(128, return_sequences=True, input_shape=(xtrain.shape[1], 1)))
model.add(LSTM(64, return_sequences=False))
model.add(Dense(25))
model.add(Dense(1))
model.summary()

# Training the model
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(xtrain, ytrain, batch_size=1, epochs=9)
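Before trying individual inputs, it can help to measure the error on the held-out test split. The sketch below is one possible way to do that with plain NumPy. The helper name `regression_report` is our own and is not part of Keras or scikit-learn:

```python
import numpy as np


def regression_report(y_true, y_pred):
    """Return mean absolute error and root mean squared error as a dict."""
    # Flatten so that (n, 1) targets and predictions line up element by element
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    errors = y_pred - y_true
    return {
        "mae": float(np.mean(np.abs(errors))),
        "rmse": float(np.sqrt(np.mean(errors ** 2))),
    }
```

Assuming the model and split from the code above, a call such as regression_report(ytest, model.predict(xtest.reshape(-1, 3, 1))) would report both errors in minutes.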
Now, let’s test how well our model performs by providing some inputs to predict the food delivery time.
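print("Food Delivery Time Prediction")
a = int(input("Age of Delivery Partner: "))
b = float(input("Ratings of Previous Deliveries: "))
# Distance is in km and is usually fractional, so read it as a float
c = float(input("Total Distance: "))

# Shape the input as (samples, timesteps, features) to match the LSTM input
features = np.array([[a, b, c]], dtype=float).reshape(1, 3, 1)
print("Predicted Delivery Time in Minutes = ", model.predict(features))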
Conclusion
This project highlights how machine learning can be applied to predict food delivery times, helping both customers and businesses plan better. The current model uses three features: the distance between the restaurant and the delivery location, and the age and ratings of the delivery partner. Future improvements could include integrating time series forecasting methods. By analyzing daily, weekly, or even seasonal patterns in delivery data, these techniques can uncover hidden trends and further improve prediction reliability. Ultimately, combining machine learning with time-aware models offers a more robust solution for real-world delivery challenges.