Issue
I am currently working with PyTorch Forecasting and I want to create a dataset with TimeSeriesDataSet. My original data lives in a pandas DataFrame and looks like this:
date amount location
2014-01-01 5 A
2014-01-01 7 B
... ... ...
2017-12-30 4 H
2017-12-31 8 I
So in total I have nine unique values in "location" and an amount for each location per date. Now I am wondering what the group_ids parameter of the TimeSeriesDataSet class does and what its exact behaviour is. I am not really getting the idea from the documentation.
Thanks a lot in advance!
Solution
A time-series dataset usually contains multiple time series for different entities/individuals. group_ids is a list of columns which together uniquely identify the entity each time series belongs to. In your example that is the location column, i.e. group_ids=["location"]. From the documentation:
group_ids (List[str]) – list of column names identifying a time series. This means that the group_ids identify a sample together with the time_idx. If you have only one timeseries, set this to the name of column that is constant.
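To make this concrete, here is a minimal sketch using a toy frame shaped like the one in the question. The three locations, the five dates, the amount values, the derived time_idx column and the encoder/prediction lengths are all assumptions made up for illustration. The pandas part only shows what "group_ids identify a sample together with the time_idx" means; the helper build_dataset is hypothetical and shows one plausible way to pass the same columns to TimeSeriesDataSet.

```python
import pandas as pd

# Toy data (assumed values): one row per (date, location), as in the question.
dates = pd.date_range("2014-01-01", periods=5, freq="D")
locations = ["A", "B", "C"]
df = pd.DataFrame(
    [(d, float(i + j), loc)
     for i, d in enumerate(dates)
     for j, loc in enumerate(locations)],
    columns=["date", "amount", "location"],
)

# TimeSeriesDataSet needs an integer time index; here: days since the first date.
df["time_idx"] = (df["date"] - df["date"].min()).dt.days

group_ids = ["location"]

# Each distinct combination of the group_ids columns is one time series.
n_series = df[group_ids].drop_duplicates().shape[0]

# Within a series, every time_idx should occur at most once, so
# (group_ids + time_idx) identifies exactly one row.
has_duplicates = bool(df.duplicated(subset=group_ids + ["time_idx"]).any())


def build_dataset(frame):
    """Hypothetical helper: wrap the frame in a TimeSeriesDataSet.

    The import is local so the pandas part above runs without
    pytorch_forecasting installed. The lengths are arbitrary choices.
    """
    from pytorch_forecasting import TimeSeriesDataSet

    return TimeSeriesDataSet(
        frame,
        time_idx="time_idx",
        target="amount",
        group_ids=group_ids,  # one series per location
        max_encoder_length=3,
        max_prediction_length=1,
        time_varying_unknown_reals=["amount"],
    )
```

With the real data from the question, the same check should report nine series, one per location. Samples are then drawn as windows within a single location's series and never span two locations.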
Answered By – iacob
This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0