What does the group_ids parameter of the TimeSeriesDataSet class specifically do in PyTorch Forecasting?

Issue

I am currently working with PyTorch Forecasting and want to create a dataset with TimeSeriesDataSet. My original data is in a pandas DataFrame and looks like this:

date         amount        location 
2014-01-01     5               A
2014-01-01     7               B
    ...       ...             ...
2017-12-30     4               H
2017-12-31     8               I

So in total I have nine unique values in "location" and an amount for each location per date. Now I am wondering what the group_ids parameter of the TimeSeriesDataSet class does and what its exact behaviour is. I am not really getting the idea from the documentation.

Thanks a lot in advance!

Solution

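A time-series dataset usually contains multiple time series, one for each entity/individual.

group_ids is a list of columns that together uniquely identify each entity and thus its time series. In your example that is the location column, i.e. group_ids=["location"], so each of the nine locations is treated as its own series. From the documentation: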

group_ids (List[str]) – list of column names identifying a time series. This means that the group_ids identify a sample together with the time_idx. If you have only one timeseries, set this to the name of column that is constant.
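
As a rough sketch of how this could look for data shaped like yours (not a definitive setup): the values, the three locations, the integer time_idx column derived from the date, and the window lengths below are all assumptions made for illustration, and make_example_frame and build_dataset are hypothetical helper names. pytorch_forecasting is imported inside build_dataset so the pandas preparation can be run on its own.

```python
import pandas as pd


def make_example_frame() -> pd.DataFrame:
    """Build a small frame shaped like the one in the question (made-up values)."""
    dates = pd.date_range("2014-01-01", periods=10, freq="D")
    rows = [
        {"date": date, "location": location, "amount": float(day + offset)}
        for offset, location in enumerate(["A", "B", "C"])
        for day, date in enumerate(dates)
    ]
    frame = pd.DataFrame(rows)
    # TimeSeriesDataSet expects an integer time index; here: days since the first date.
    frame["time_idx"] = (frame["date"] - frame["date"].min()).dt.days
    return frame


def build_dataset(frame: pd.DataFrame):
    """Treat every distinct location as its own time series."""
    # Imported here so the pandas-only part runs without the library installed.
    from pytorch_forecasting import TimeSeriesDataSet

    return TimeSeriesDataSet(
        frame,
        time_idx="time_idx",
        target="amount",
        group_ids=["location"],   # one series per unique location
        max_encoder_length=5,     # illustrative window sizes, not from the question
        max_prediction_length=2,
        time_varying_unknown_reals=["amount"],
    )


frame = make_example_frame()

# Each (location, time_idx) pair should occur only once: together they
# identify a single observation of a single series.
has_duplicates = bool(frame.duplicated(["location", "time_idx"]).any())
```

Calling build_dataset(frame) should then give a dataset whose samples never mix rows from different locations. With only a single series, the same call would work after adding a constant column (for example frame["series"] = "all") and passing group_ids=["series"].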

Answered By – iacob

This answer was collected from Stack Overflow and is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
