Hi there stack overflow community,
I have the following dataframe in an excel:
sparte sparten status stati gesellschaft gesellschaften 10 Krankenvoll B beantragt 0 - Allgemein - 11 Reisekranken A aktiv 10000 nordinvest 12 Krankenkasse N beitragsfrei M552D SV SparkassenVersicherung
and the folliwing column for merging in a csv:
sparten status gesellschaft 10 B 0 11 A 10000 12 N M552D
to merge some columns from an excel and a csv file I’m using the following code:
df1 = pd.read_csv(r'path', sep=',').drop(columns = ['risiko']) df2 = pd.read_excel(r'path') df3 = pd.merge(df1,df2[['status','stati']],on='status', how='left').drop(columns = ['status']) df4 = df3.merge(df2[['sparte','sparten']],on='sparte', how='left').drop(columns = ['sparte'])
It works fine for me, but now i want to me merge the following column:
df4 = df3.merge(df2[['gesellschaft','gesellschaften']],on='gesellschaft', how='left') print(df4)
…and it does not work. It merges only the cells with this format
M552D, but leaves the cells with numbers untouched. I don’t understand what I’m doing wrong. If I try to put
how='right' the merge works, but the other columns disappear.
Maybe someone has an idea what is happening here! Thanks for any hint!
The problem is that the
geselschaft column contains only strings in
df1 which is loaded with
read_csv, because the column is not fully numeric. But in
df2 which is loaded with
read_excel, it contains a mix of int and string values. And at Pandas level and int and a string cannot be equal.
A possible workaround is to force a string conversion at merge time:
df4 = df3.merge(df2[['gesellschaft','gesellschaften']], left_on='gesellschaft', right_on = df2['gesellschaft'].astype('str'), how='left')
Answered By – Serge Ballesta