We’ve adopted maps and navigation into our lives so effortlessly that, more often than not, we don’t have to think about the work that goes into this behind the scenes.
When you’re on the road, you’re more concerned about getting to your destination in one piece (assuming most of our readers are stuck in traffic in tier-1 cities, just like us). A well-designed navigation system ensures you don’t have to think about anything except the destination. A key contributing factor is turn identification. Let's find out what goes on behind the scenes in turn identification.
Our work in the Indian mobility space has given us a vast databank of anonymized telemetry data from our 2/3/4 wheelers. We receive this data as location pings at regular intervals, each containing geographical coordinates, and it helps us enhance our routing system. The raw data is then processed by our fantastic data science team.
This blog post shares a workflow for identifying turns from this telemetry data. We will cover the following:
- Why is turn identification required?
- Some terminologies
- Approach for identifying turns
- Demonstration
- Conclusion
Why is turn identification required?
With rapid urbanization and the facelift our cities are getting, we must keep updating our routing system with new roads, turn restrictions and other required details.
Identifying these turns helps us keep our algorithm in step with the current, continually improving road architecture and furniture. To understand the workflow, we must be familiar with some basic terminology.
Terminologies
Here are some terms you should be familiar with before getting into the workflow:
- Way ID - Each road or route has a unique way ID; we use OpenStreetMap (OSM) data.
- Device ID - Each device (cab) sending pings has a unique ID in the open-source data we use for the demonstration.
- Bearing angle - The angle of a way, measured clockwise from north.
- Turn angle - Change in bearing angle while traversing from one way ID to another.
- Turn direction - Direction of the turn (right/left), dependent on the change in bearing angle.
- Turn class - Straight/right/left/U-turn, depending on the magnitude of the turn angle.
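To make the bearing and turn angle definitions concrete, here is a minimal sketch. It assumes the bearing is derived from two latitude/longitude points with the standard initial-bearing formula; the function names bearing_deg and turn_angle_deg are ours for illustration and are not part of the workflow code later in this post, which reads the bearing directly from the pings.

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    # atan2 returns a value in (-180, 180]; the modulo maps it onto [0, 360)
    return math.degrees(math.atan2(x, y)) % 360

def turn_angle_deg(bearing_from, bearing_to):
    """Smallest change in bearing between two ways, in degrees (0 to 180)."""
    diff = abs(bearing_to - bearing_from) % 360
    return min(diff, 360 - diff)

# Moving from a way heading 350 degrees onto a way heading 20 degrees
print(turn_angle_deg(350, 20))  # 30
```

Note that the turn angle has to wrap around north: going from 350 to 20 degrees is a 30-degree change, not a 330-degree one.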
Let us go ahead with the approach!
The Mathematical Approach
At a high level, the approach uses the ride count between a pair of way IDs as a signal for understanding restrictions. Assessing the angle between the two way IDs helps us understand which movement was a turn and which was merely a deviation.
For any turn, we have two angles as described below:
We are following a general mathematical approach to detect the turn based on the angle of deviation from the true north.
Expressed mathematically, with angle 1 as the bearing before the turn and angle 2 as the bearing after it:
- If |angle 2 - angle 1| < 180:
  - If angle 2 > angle 1, it is a right turn
  - Otherwise, it is a left turn
- If |angle 2 - angle 1| > 180:
  - If angle 2 > angle 1, it is a left turn
  - Otherwise, it is a right turn
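The same rule can be written more compactly as a signed difference wrapped into the range [-180, 180). This is a minimal sketch of an equivalent formulation, not the function used later in the post; the names signed_turn and turn_direction are ours. A positive value means the bearing moved clockwise (right turn), and anything else is treated as a left turn. An exact 180-degree reversal has no meaningful direction, so the two formulations may label that edge case differently.

```python
def signed_turn(angle1, angle2):
    """Signed change in bearing from angle1 to angle2, in degrees, in [-180, 180)."""
    return (angle2 - angle1 + 180) % 360 - 180

def turn_direction(angle1, angle2):
    """'right' for a clockwise change in bearing, 'left' otherwise."""
    return 'right' if signed_turn(angle1, angle2) > 0 else 'left'

# |20 - 350| > 180 and angle 2 < angle 1, so the rule above gives a right turn
print(signed_turn(350, 20), turn_direction(350, 20))  # 30 right
```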
A detailed illustration has been given in the flowchart below.
The mathematical approach seems very simple, but there might be cases where something else is needed. A lot of filtering and managing of the data is required, and here is where the data science team takes over. Here is the workflow for identifying the turn.
Let us understand how each step works, with a code snippet included for each. For demonstration purposes, we will use an open-source Chicago taxi mobility dataset.
Imports
These are the necessary imports required for processing the data.
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
import numpy as np
from shapely.geometry import box
import contextily as ctx
Input data (Cab ping and OSM Shp for Chicago)
df_gps = pd.read_csv('./data_uic.csv',header=None)
osm_chicago = gpd.read_file('./chicago_osm.shp')
Filtering Data
We only require the following fields:
- Device_id
- Timestamp
- Latitude
- Longitude
- Speed
- Bearing
To be more specific, we will extract these fields for one particular device ID and treat its pings as a single journey.
df_gps.columns = ['device_id', 'timestamp','latitude','longitude','speed','bearing']
df_ride = df_gps[df_gps.device_id == 478]
gdf_ride = gpd.GeoDataFrame(df_ride, geometry=gpd.points_from_xy(df_ride.longitude,df_ride.latitude),crs=4326)
gdf_ride.timestamp = pd.to_datetime(gdf_ride.timestamp)
gdf_ride
Input data visualization
Input data can also be visualized.
plt.figure(figsize=(10,5))
ax = plt.axes()
gdf_ride.plot(ax=ax, color='b')
ctx.add_basemap(ax,crs=4326, source=ctx.providers.OpenStreetMap.Mapnik)
c = 0
for x, y, label in zip(gdf_ride.geometry.x, gdf_ride.geometry.y, gdf_ride.timestamp):
    if c % 10 == 0:
        ax.annotate(label, xy=(x, y), xytext=(3, 3), textcoords="offset points",rotation=45, size=8)
    c+=1
Similarly, we can also visualize the difference/change in the bearing angle.
plt.figure(figsize=(20,2))
plt.plot(gdf_ride.timestamp[1:],np.diff(gdf_ride.bearing))
plt.show()
Now the mathematical logic has to be applied in the form of functions:
def angledist(a1, a2):
    # Smallest absolute difference between two bearings, in degrees (0 to 180)
    return min((a1 - a2) % 360, (a2 - a1) % 360)

def turn_dir(a1, a2):
    # Direction of the turn when the bearing changes from a1 to a2
    if abs(a2 - a1) < 180:
        if a2 > a1:
            turn_side = 'right'
        else:
            turn_side = 'left'
    else:
        if a2 > a1:
            turn_side = 'left'
        else:
            turn_side = 'right'
    return turn_side

def turn_type(point_gdf, points_final):
    turn_magnitude = []
    turn_class = []
    k = point_gdf.bearing.values
    index_list = point_gdf.index.values.tolist()
    for f in points_final['index'].values:
        i = index_list.index(f)
        try:
            # Turn direction towards each of the next three pings
            c = [turn_dir(k[i], k[i+1]), turn_dir(k[i], k[i+2]), turn_dir(k[i], k[i+3])]
            if c[0] != c[1]:
                turn = abs(np.diff([angledist(k[i], k[i+1]), angledist(k[i], k[i+2])])[0] / 2)
            else:
                turn = np.mean([angledist(k[i], k[i+1]), angledist(k[i], k[i+2])])
            turn_series = np.unique(np.array(c), return_counts=True)
            if turn <= 15:
                tr_class = 'straight'
            elif turn < 165:
                # Majority direction among the three look-ahead pings
                tr_class = turn_series[0][list(turn_series[1]).index(turn_series[1].max())]
            else:
                tr_class = 'U-turn'
        except IndexError:
            # Fewer than three pings remain after this point; keep a placeholder
            # so the outputs stay aligned with points_final
            turn, tr_class = np.nan, None
        turn_magnitude.append(turn)
        turn_class.append(tr_class)
    return turn_magnitude, turn_class
Here is the workflow of the turn classification:
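As a compact illustration of the thresholds, the sketch below isolates the classification step. It assumes the cut-offs used in turn_type (15 degrees and 165 degrees) and a majority vote over the directions seen on the next few pings; classify_turn and the two constants are hypothetical names introduced here, not part of the original code.

```python
from collections import Counter

STRAIGHT_MAX = 15   # degrees; at or below this the movement counts as straight
U_TURN_MIN = 165    # degrees; at or above this the movement counts as a U-turn

def classify_turn(turn_angle, directions):
    """Classify a turn from its magnitude and the directions seen on the next pings."""
    if turn_angle <= STRAIGHT_MAX:
        return 'straight'
    if turn_angle >= U_TURN_MIN:
        return 'U-turn'
    # Majority vote over the look-ahead directions, e.g. ['right', 'right', 'left']
    return Counter(directions).most_common(1)[0][0]

print(classify_turn(90, ['right', 'right', 'left']))  # right
```

Voting over several pings rather than trusting a single bearing change makes the label less sensitive to one noisy ping near the junction.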
Now that the functions are ready, we have to merge the different datasets by making a spatial join between the OSM data and the ping data. The join_osm_ping and join_ping_osm frames used below hold the output of that join.
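The join itself is not shown in this post. As a rough illustration of what matching a ping to a way involves, here is a plain-Python sketch that assigns each ping to the nearest way in planar (projected) coordinates. Everything in it, including the ways dictionary, the sample pings and the function names, is a made-up example; in practice a library spatial join, such as the nearest-neighbour join in GeoPandas, would do this work.

```python
import math

def point_segment_distance(px, py, ax, ay, bx, by):
    """Distance from point P to segment AB in planar (projected) coordinates."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: A and B are the same point
        return math.hypot(px - ax, py - ay)
    # Position of the projection of P along AB, clamped to the segment
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def nearest_way(ping, ways):
    """Return the ID of the way whose polyline lies closest to the ping."""
    def distance_to_way(coords):
        return min(point_segment_distance(*ping, *a, *b)
                   for a, b in zip(coords, coords[1:]))
    return min(ways, key=lambda way_id: distance_to_way(ways[way_id]))

# Hypothetical ways (polylines in metres) and pings
ways = {
    'way_a': [(0, 0), (100, 0)],
    'way_b': [(100, 0), (100, 100)],
}
pings = [(20, 3), (80, -2), (98, 40)]
matched = [nearest_way(p, ways) for p in pings]
print(matched)  # ['way_a', 'way_a', 'way_b']
```

Once every ping carries a way ID, counting pings per way and taking the first and last timestamp on each way gives the ping count, entry time and leaving time.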
osm_ping_count = osm_chicago.merge(join_osm_ping.groupby('osm_id').count().geometry.rename('ping_count').reset_index())
osm_ping_count1 = osm_ping_count.merge(join_osm_ping.groupby('osm_id').max().timestamp.rename('leaving_time').reset_index())
osm_ping_count2 = osm_ping_count1.merge(join_osm_ping.groupby('osm_id').min().timestamp.rename('entry_time').reset_index())
final_way_ids = osm_ping_count2[osm_ping_count2.ping_count>1]
The result may contain duplicate rows, so let’s remove the unnecessary duplicates:
final_way_ids = final_way_ids.sort_values('ping_count').drop_duplicates(subset=['leaving_time'],keep='last')
## Remove duplicates
final_way_ids = final_way_ids.sort_values('ping_count').drop_duplicates(subset=['entry_time'],keep='last')
The next step is to get the last ping for each way ID and obtain the turn angle and classification.
points_final = join_ping_osm.reset_index().merge(final_way_ids.osm_id).sort_values('timestamp').drop_duplicates(subset=['osm_id'], keep='last')
to_osm_ids = points_final.osm_id.values[1:].tolist()
points_final = points_final[:-1].copy()
turn_angle, turn_class = turn_type(gdf_ride,points_final)
points_final['to_osm_id'] = to_osm_ids
points_final['turn_class'] = turn_class
points_final['turn_angle'] = turn_angle
We can even plot and visualize the output.
plt.figure(figsize=(20,20))
ax = plt.axes()
# osm_ping_count.plot(ax =ax,column= 'way_id')
final_way_ids.plot(column= 'osm_id',ax =ax,cmap='jet')
# gdf_ride_proj.plot(ax =ax,color='c')
points_final.plot(ax =ax,linewidth=5,column= 'osm_id')
# for x, y, label in zip(gdf_proj.geometry.x, gdf_proj.geometry.y, gdf_proj.bearing):
# ax.annotate(label, xy=(x, y), xytext=(3, 3), textcoords="offset points",rotation=0, size=8)
for x, y, label in zip(points_final.geometry.x, points_final.geometry.y, points_final.turn_class):
    ax.annotate(label, xy=(x, y), xytext=(3, 3), textcoords="offset points",rotation=45, size=8)
# for x, y, label in zip(final_way_ids.geometry.centroid.x, final_way_ids.geometry.centroid.y, final_way_ids.osm_id):
# ax.annotate(label, xy=(x, y), xytext=(3, 3), textcoords="offset points",rotation=45, size=8)
ctx.add_basemap(ax,crs=26916, source=ctx.providers.OpenStreetMap.Mapnik)
Time to get the result of the turn classification:
result = points_final[['longitude', 'latitude','osm_id','to_osm_id', 'turn_class', 'turn_angle']]
result.columns = ['longitude', 'latitude','from_way_id','to_way_id', 'turn_class', 'turn_angle']
Result
Note that to scale this up to large datasets, we will need the support of technologies such as PySpark or Apache Sedona.
Conclusion
We rely on maps and navigation systems for almost everything. With advances in technology, they have become an integral part of our lifestyle. Even small features, like turn restriction and identification, require considerable effort and mathematical calculation. Math concepts and visualizations like these apply across location and data analytics.
Data scientists work on enormous datasets to solve both simple and complex problems. Our dedicated team of engineers, data scientists and GIS professionals performs tasks that impact the end user silently, behind the scenes.
We at OLA Campus Pune work with many exciting technologies and solve real-world problems like the identification and restriction of turns. We are building next-gen solutions to existing mobility problems, aiming to make a consumer's journey a lot easier. A big thanks to Aazad Patle, our expert in data science, for helping us throughout this article. We look forward to sharing more of our workflows in the coming days.
If you have feedback or found this blog post interesting, do connect with us!