Issue
I have some -np.inf and np.inf values in my dataframe.
I would like to replace them with the respective minimum and maximum values of the dataframe.
I thought it should be possible with something like this:
df.replace([np.inf, -np.inf], [df.max, df.min], axis=1, inplace = True)
But it didn’t work. I got the idea because I can do something similar to replace NaNs with fillna().
What is an effective way to go about it?
Is there a numpy version?
Thanks for any tips!
Solution
You can use .replace(), as follows:
df = df.replace({np.inf: df[np.isfinite(df)].max().max(),
                 -np.inf: df[np.isfinite(df)].min().min()})
Here, df[np.isfinite(df)] masks the non-finite entries as NaN, which max() and min() skip by default. So df[np.isfinite(df)].max().max() and df[np.isfinite(df)].min().min() are the finite maximum and minimum of the whole dataframe, and we replace np.inf and -np.inf with them respectively.
Demo
Data Input
df = pd.DataFrame({'Col1': [np.inf, -2000.0, 345.0], 'Col2': [1234.0, -np.inf, 890.0]})
     Col1    Col2
0     inf  1234.0
1 -2000.0    -inf
2   345.0   890.0
Output:
print(df)
     Col1    Col2
0  1234.0  1234.0
1 -2000.0 -2000.0
2   345.0   890.0
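On the question of a numpy version: one possible sketch, assuming the data is available as a plain float array (for example via df.to_numpy()), is to pass the finite extremes to np.nan_to_num. The names arr, finite and result are illustrative only, and the posinf/neginf keywords assume NumPy 1.17 or later.

```python
import numpy as np

# Same values as the demo dataframe, as a plain float array
arr = np.array([[np.inf, 1234.0],
                [-2000.0, -np.inf],
                [345.0, 890.0]])

# Keep only the finite entries to find the global finite min and max
finite = arr[np.isfinite(arr)]

# nan=np.nan leaves any NaN untouched (the default would turn NaN into 0.0)
result = np.nan_to_num(arr, nan=np.nan,
                       posinf=finite.max(), neginf=finite.min())
print(result)
```

With the demo data this should give the same values as the dataframe output above.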
Edit
If you want to replace with the min and max of each particular column instead of the global min and max of the dataframe, you can use a nested dict in .replace(), as follows:
col_min_max = {np.inf: df[np.isfinite(df)].max(),    # column-wise max
               -np.inf: df[np.isfinite(df)].min()}   # column-wise min
df = df.replace({col: col_min_max for col in df.columns})
Demo
Data Input
df = pd.DataFrame({'Col1': [np.inf, -2000.0, 345.0], 'Col2': [1234.0, -np.inf, 890.0]})
     Col1    Col2
0     inf  1234.0
1 -2000.0    -inf
2   345.0   890.0
Output:
print(df)
     Col1    Col2
0   345.0  1234.0
1 -2000.0   890.0
2   345.0   890.0
inf and -inf are replaced by the max and min of their respective columns.
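As an alternative to the nested dict (not part of the original answer), the same column-wise result can probably be reached by clipping each column to its own finite range. This is only a sketch: it assumes every column is numeric, and the names finite and clipped are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col1': [np.inf, -2000.0, 345.0],
                   'Col2': [1234.0, -np.inf, 890.0]})

# Mask non-finite entries as NaN so min()/max() skip them
finite = df.where(np.isfinite(df))

# axis=1 aligns the per-column bounds with the dataframe's columns,
# so inf is pulled down to the column max and -inf up to the column min
clipped = df.clip(lower=finite.min(), upper=finite.max(), axis=1)
print(clipped)
```

Finite values already lie inside their column's range, so clipping only changes the infinite entries. np.isfinite raises on non-numeric columns, so a mixed dataframe would need something like df.select_dtypes(include='number') first.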
Answered By – SeaBean
This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.