Take the difference between pivot table columns and compare to last column in previous row

Issue

I have a pivot table. Columns represent years, rows month. I want to create two tables containing the percent changes between every value and its counterpart for the previous month.
I have managed to create a pivot table with the percentage changes, but logically, data is missing for January.
Instead, I would like to compare January with December, i.e. the last row of the previous column.
Thank you in advance.

df = pd.DataFrame(np.random.randint(0,100,size=(12, 3)), columns=('2016', '2017', '2018'))
df.index.name = 'month'
df.index = df.index +1
print(df)   
         2016  2017  2018
month                  
1        49    98     7
2        72    60    67
3        64    71    53
4        71    75    91
5        68    96    48
6        35    21    54
7        14    98     3
8        62    38    64
9        68    92    58
10       64    95    94
11       54    81     8
12       86    18    90

my current solution:

df_month_pctchange = df.pct_change(axis=0).mul(100)
print(df_month_pctchange)
        2016   2017    2018
month                      
1        NaN    NaN     NaN
2      46.94 -38.78  857.14
3     -11.11  18.33  -20.90
4      10.94   5.63   71.70
5      -4.23  28.00  -47.25
6     -48.53 -78.12   12.50
7     -60.00 366.67  -94.44
8     342.86 -61.22 2033.33
9       9.68 142.11   -9.38
10     -5.88   3.26   62.07
11    -15.62 -14.74  -91.49
12     59.26 -77.78 1025.00

Desired result:

        2016   2017    2018
month                      
1        NaN   7.35  -61.11
2      46.94 -38.78  857.14
3     -11.11  18.33  -20.90
4      10.94   5.63   71.70
5      -4.23  28.00  -47.25
6     -48.53 -78.12   12.50
7     -60.00 366.67  -94.44
8     342.86 -61.22 2033.33
9       9.68 142.11   -9.38
10     -5.88   3.26   62.07
11    -15.62 -14.74  -91.49
12     59.26 -77.78 1025.00

Solution

you can select both first and last row of df with iloc, use shift on the last row to report value from 2016 to 2017 and so on, and do the calculation. Then set the result in the first row of df_month_pctchange

# your code
df_month_pctchange = df.pct_change(axis=0).mul(100)

# what to add to fill the missing values
df_month_pctchange.iloc[0] = (df.iloc[0]/df.iloc[-1].shift()-1)*100

print(df_month_pctchange)
#              2016        2017         2018
# month                                     
# 1             NaN   13.953488   -61.111111 # note it is 13.95 and not 7.35 in 2017
# 2       46.938776  -38.775510   857.142857
# 3      -11.111111   18.333333   -20.895522
# 4       10.937500    5.633803    71.698113
# 5       -4.225352   28.000000   -47.252747
# 6      -48.529412  -78.125000    12.500000
# 7      -60.000000  366.666667   -94.444444
# 8      342.857143  -61.224490  2033.333333
# 9        9.677419  142.105263    -9.375000
# 10      -5.882353    3.260870    62.068966
# 11     -15.625000  -14.736842   -91.489362
# 12      59.259259  -77.777778  1025.000000

Answered By – Ben.T

This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply

(*) Required, Your email will not be published