How to Binned data frame in row and columns

Issue

I have a dataframe with datetime as index, and the columns "Direction", "Shear", "Ratio".
Like

                    Wspd_avg_122  Wspd_max_122  ...     Ratio     Shear
Date                                             ...                    
2017-12-31 12:30:00         12.42         16.24  ...  1.061997  0.109330
2017-12-31 12:40:00         13.61         17.66  ...  0.918442  0.114139
2017-12-31 12:50:00         12.94         18.21  ...  1.071097  0.027630
2017-12-31 13:00:00         11.49         17.24  ...  1.085292  0.071180
2017-12-31 13:10:00         12.20         16.14  ...  0.999180  0.070978

I would like to create a dataframe with binned Shear as index and Binned Direction as Column. And the values be Ratio mean.

I tried:

def Dir_binned(frame):
    Dir_bins = np.linspace(0,360,37)
    binned = frame.groupby(pd.cut(frame['Dir_117m'],Dir_bins))
    bins = binned.mean().fillna(0)
    Count = binned.count()
    return bins, Count

def Shear_binned(frame):
    Shear_bins = np.linspace(0.00,0.80,17)
    binned = frame.groupby(pd.cut(frame['Shear'],Shear_bins))
    bins = binned.mean().fillna(0)
    Count = binned.count()
    return bins, Count 

But in this case I only binning my Shear or Direction as index and the others columns appear the mean value of each bin.

Example:

                  0-10          10-20     ...   350-360   
Shear                  
0.00-0.05         1.061684      1.913588  ...  1.061459
0.05-0.10         1.061847      1.164875  ...  0.918442  
0.10-0.15         1.062354      1.485624  ...  1.071097  
0.15-0.20         1.063094      1.485254  ...  1.085292 
0.20-0.25         1.078492      1.060152  ...  0.999180  

Solution

It looks like you want a pivot table with a mean aggfunc. Here’s a mockup with some sample data.

df = pd.DataFrame({'Shear':[0.109330,0.114139,0.027630,0.071180,0.070978],
                   'Dir_117m':[1,2,10,12,150],
                   'Ratio':[1,2,3,4,5]
                  })


df['shear_bin'] = pd.cut(df['Shear'], bins = np.linspace(0,0.80,17))
df['dir_bin'] = pd.cut(df['Dir_117m'], bins= np.linspace(0,360,37))


df.pivot_table(index='shear_bin', columns='dir_bin', values='Ratio', aggfunc='mean', fill_value=0)

Output

dir_bin      (0.0, 10.0]  (10.0, 20.0]  (140.0, 150.0]
shear_bin                                             
(0.0, 0.05]          3.0             0               0
(0.05, 0.1]          0.0             4               5
(0.1, 0.15]          1.5             0               0

Answered By – Chris

This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply

(*) Required, Your email will not be published