# Matching geographic coordinates between two data frames

## Issue

I have two data frames, DF1 and DF2, that each have longitude and latitude columns:

``````
import pandas as pd

DF1 = pd.DataFrame([[19.827658,-20.372238,8614], [19.825407,-20.362608,7412], [19.081514,-17.134456,8121]], columns=['Longitude1', 'Latitude1','Echo_top_height'])
DF2 = pd.DataFrame([[19.083727, -17.151207, 285.319994], [19.169403, -17.154144, 284.349994], [19.081514,-17.154456, 285.349994]], columns=['Longitude2', 'Latitude2','BT'])
``````

I need to find a match for each longitude/latitude pair in DF1 among the pairs in DF2. Where the coordinates match, I want to add the corresponding value from the BT column of DF2 to DF1.

I used the code from here and managed to check if there is a match:

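``````
import numpy as np
from sklearn.metrics.pairwise import haversine_distances

threshold = 5000  # meters
DF1['nearby'] = (
    # get the distance between all points of each DF
    haversine_distances(
        # note that you need to convert to radians with *np.pi/180
        X=DF1[['Latitude1', 'Longitude1']].to_numpy() * np.pi / 180,
        Y=DF2[['Latitude2', 'Longitude2']].to_numpy() * np.pi / 180)
    # scale radians to meters (mean Earth radius of about 6371 km), then compare
    * 6371000 < threshold
# True where any DF2 point lies within the threshold of the DF1 point
).any(axis=1)
``````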

So the result I need would look like this:

``````
Longitude1  Latitude1  Echo_top_height  BT
19.82       -20.37     8614             290.345
19.82       -20.36     7412             289.235
and so on...
``````

## Solution

You can use `BallTree` with the haversine metric to find, for each point in DF1, the nearest point in DF2:

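``````
import numpy as np
from sklearn.neighbors import BallTree
# Newer versions of sklearn:
from sklearn.metrics import DistanceMetric
# Older versions of sklearn:
# from sklearn.neighbors import DistanceMetric

# Build the tree on DF2 (haversine expects [lat, lon] in radians)
coords2 = np.radians(DF2[['Latitude2', 'Longitude2']].to_numpy())
dist = DistanceMetric.get_metric('haversine')
tree = BallTree(coords2, metric=dist)

# Query the nearest DF2 point for every row of DF1
coords1 = np.radians(DF1[['Latitude1', 'Longitude1']].to_numpy())
distances, indices = tree.query(coords1, k=1)
DF1['BT'] = DF2['BT'].iloc[indices.flatten()].values
DF1['Distance'] = distances.flatten()  # angular distance in radians
``````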

Output:

| Longitude1 | Latitude1 | Echo_top_height | BT | Distance |
|---|---|---|---|---|
| 19.8277 | -20.3722 | 8614 | 284.35 | 0.0572097 |
| 19.8254 | -20.3626 | 7412 | 284.35 | 0.0570377 |
| 19.0815 | -17.1345 | 8121 | 285.32 | 0.000294681 |
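
The `Distance` column is an angular distance in radians on the unit sphere, so the 5000 m threshold from the question cannot be compared with it directly. Below is a minimal sketch of one way to convert it to meters and keep `BT` only for points that are close enough. It assumes a mean Earth radius of 6371 km. The names `EARTH_RADIUS_M`, `THRESHOLD_M`, and `Distance_m` are illustrative and not part of the original answer, and the metric is passed to `BallTree` by name rather than as a `DistanceMetric` object.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import BallTree

EARTH_RADIUS_M = 6_371_000  # assumed mean Earth radius in meters
THRESHOLD_M = 5_000         # same 5 km threshold as in the question

DF1 = pd.DataFrame(
    [[19.827658, -20.372238, 8614],
     [19.825407, -20.362608, 7412],
     [19.081514, -17.134456, 8121]],
    columns=['Longitude1', 'Latitude1', 'Echo_top_height'])
DF2 = pd.DataFrame(
    [[19.083727, -17.151207, 285.319994],
     [19.169403, -17.154144, 284.349994],
     [19.081514, -17.154456, 285.349994]],
    columns=['Longitude2', 'Latitude2', 'BT'])

# Build the tree on DF2; haversine expects [lat, lon] in radians.
tree = BallTree(np.radians(DF2[['Latitude2', 'Longitude2']].to_numpy()),
                metric='haversine')

# Nearest DF2 point for every DF1 row.
dist_rad, idx = tree.query(
    np.radians(DF1[['Latitude1', 'Longitude1']].to_numpy()), k=1)

# Convert the angular distance (radians) to meters.
DF1['Distance_m'] = dist_rad.ravel() * EARTH_RADIUS_M

# Keep BT only where the nearest DF2 point is within the threshold.
nearest_bt = DF2['BT'].to_numpy()[idx.ravel()]
DF1['BT'] = np.where(DF1['Distance_m'] <= THRESHOLD_M, nearest_bt, np.nan)

print(DF1)
```

With the sample data, only the third row of DF1 has a DF2 point within 5 km (roughly 1.9 km away), so the first two rows end up with `NaN` in `BT`.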