Could you please tell me what is wrong with the following code? I get this error:
ValueError: cannot reindex a non-unique index with a method or limit
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import pandas_datareader as web

data = web.get_data_yahoo("BTC-USD", start="2015-01-01", end="2021-01-01")
btc_dailly_return = data['Adj Close'].pct_change()
btc_monthly_returns = data['Adj Close'].resample('M').ffill().pct_change()
When you use resample, you have to tell it how to combine all the entries within the timeframe you chose. In your example, you're combining all the values within one month. You could combine them by adding them together, or by taking the average, the standard deviation, the maximum value, etc. So you have to tell pandas what you would like to do by calling an additional method:
data['col'].resample('M').sum()
data['col'].resample('M').max()
data['col'].resample('M').mean()
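To make the effect of each aggregator concrete, here is a minimal sketch on made-up data (the `prices` series is hypothetical, not the Yahoo data from the question). It uses the `'MS'` (month start) alias instead of `'M'` only because recent pandas versions deprecate `'M'` in favor of `'ME'`; the aggregation logic is the same, only the bin labels differ.

```python
import pandas as pd

# Hypothetical daily series: values 1.0, 2.0, ... over Jan-Mar 2020 (91 days).
idx = pd.date_range("2020-01-01", "2020-03-31", freq="D")
prices = pd.Series(range(1, len(idx) + 1), index=idx, dtype="float64")

# Downsample to one row per month; each call reduces a month's rows to one value.
monthly_last = prices.resample("MS").last()   # last observation in each month
monthly_mean = prices.resample("MS").mean()   # average of each month
monthly_max = prices.resample("MS").max()     # largest value in each month

# Monthly returns computed from the month-end observations.
monthly_returns = monthly_last.pct_change()

print(monthly_last)
print(monthly_returns)
```

Each aggregator returns one row per month, so `pct_change()` on the result compares consecutive months rather than consecutive days.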
In your case, last() is probably the most reasonable, so just change your last line to:
btc_monthly_returns = data['Adj Close'].resample('M').last().ffill().pct_change()
As to why the error only pops up with BTC-USD: that particular table has a duplicate date entry, causing ffill() to throw an error. last() (or any other reduction-type aggregator) doesn't care about the duplicate.
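The duplicate-date behaviour can be reproduced on a small made-up series. This is a sketch under assumptions: the dates and values are invented, `'MS'` stands in for `'M'` to avoid the deprecation warning in newer pandas, and the exact wording of the error message differs between pandas versions, so only the exception type is checked.

```python
import pandas as pd

# Hypothetical prices with 2020-01-31 appearing twice.
idx = pd.to_datetime(
    ["2020-01-30", "2020-01-31", "2020-01-31", "2020-02-01", "2020-02-28"]
)
prices = pd.Series([100.0, 110.0, 111.0, 120.0, 132.0], index=idx)

# The index is not unique, which is what trips up ffill().
has_duplicates = not prices.index.is_unique
duplicated_dates = prices.index[prices.index.duplicated()].unique()

# ffill() after resample reindexes onto the new bins, which fails on duplicates.
try:
    prices.resample("MS").ffill()
    ffill_error = None
except ValueError as exc:
    ffill_error = str(exc)

# A reducing aggregator simply groups rows into bins, so duplicates are fine.
monthly_last = prices.resample("MS").last()

print(has_duplicates, list(duplicated_dates))
print(ffill_error)
print(monthly_last)
```

Checking `data.index.is_unique` on the downloaded table is a quick way to confirm whether this is what is happening in your case.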
resample('<freq>').ffill() should be used for upsampling data, i.e. turning a list of months into a list of days. In that case, ffill() fills all the newly generated timestamps with the value from the previous valid timestamp. Your example downsamples, so a reducing aggregator like last() or mean() should be called instead.
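For contrast, here is a small sketch of the upsampling case where ffill() is the right tool. The `monthly` series is made up for illustration: three monthly values are expanded to daily frequency, and every new day inherits the most recent monthly value.

```python
import pandas as pd

# Hypothetical monthly observations, one per month start.
monthly = pd.Series(
    [10.0, 20.0, 30.0],
    index=pd.to_datetime(["2020-01-01", "2020-02-01", "2020-03-01"]),
)

# Upsample to daily frequency; the new timestamps have no data of their own,
# so ffill() copies the previous valid value forward.
daily = monthly.resample("D").ffill()

print(len(daily))
print(daily["2020-01-15"], daily["2020-02-29"], daily.iloc[-1])
```

Here the index is unique and there are more output rows than input rows, so there is nothing to aggregate and forward-filling is meaningful.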
Answered By – pvandyken