Why is my TensorFlow LSTM time-series model returning only one value into the future?

Issue

I have a TensorFlow model that predicts time-series values with an LSTM. It trains fine, but when I ask it to predict future values it only gives me the T+1 value.

How can I make it return the values from T+1 to T+n instead of just T+1?

I thought about feeding the predicted value back into the model in a loop, e.g.:

We look back at 20 samples for this example

T±0 = now value
T-k = value k steps into the past (known)
T+n = value n steps into the future (unknown at the start)

--- Algorithm

T+1 = model.predict(data from T-20 to T±0)
T+2 = model.predict(data from T-19 to T+1) #using the previously found T+1 value
T+3 = model.predict(data from T-18 to T+2) #using the previously found T+1 and T+2 values
.
.
.
T+n = model.predict(data from T-(21-n) to T+(n-1)) # using the previously found T+1 .. T+(n-1) values
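
The loop above could be sketched roughly as follows. This is only an illustration: `recursive_forecast` and `dummy_predict` are hypothetical names, and `dummy_predict` stands in for `model.predict` so the snippet runs without a trained model. It assumes the model maps an input of shape (1, window_size, n_features) to an output of shape (1, n_features), as the model in this question does.

```python
import numpy as np

def recursive_forecast(predict_next, history, window_size, n_future):
    """Roll a one-step model forward n_future steps by feeding predictions back in.

    predict_next: callable taking an array of shape (1, window_size, n_features)
                  and returning shape (1, n_features), e.g. model.predict.
    history:      array of shape (n_samples, n_features), already standardized.
    """
    window = np.asarray(history, dtype="float32")[-window_size:].copy()
    predictions = []
    for _ in range(n_future):
        next_row = np.asarray(predict_next(window[np.newaxis, ...]))[0]
        predictions.append(next_row)
        # Drop the oldest row and append the prediction as the newest one.
        window = np.vstack([window[1:], next_row])
    return np.stack(predictions)


# Hypothetical stand-in for model.predict: returns the last row of the window plus one.
def dummy_predict(batch):
    return batch[:, -1, :] + 1.0

history = np.arange(30, dtype="float32").reshape(15, 2)
forecast = recursive_forecast(dummy_predict, history, window_size=5, n_future=3)
print(forecast.shape)
```

With a real model, `predict_next` would be `model.predict`, and every prediction after the first depends on earlier predictions rather than on measured data.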

The thing is that T+1 has a mean absolute error of around 0.75%. Doesn't the error propagate and compound through the predictions? If it does, then predicting T+10 would carry an error of roughly (1 + 0.0075)^10 - 1 ≈ 7.8%, which is not very good in my case. So I'm looking for other ways to predict up to T+n values.
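
As a rough check of that arithmetic, here is a small sketch. It assumes the worst case where each step multiplies the relative error of the previous one, which is a pessimistic simplification rather than a property of the model; the variable names are made up for the example.

```python
one_step_error = 0.0075  # 0.75 % error on the T+1 prediction (figure from the question)
n_steps = 10

# Pessimistic assumption: each step compounds the relative error of the previous one.
compounded = (1 + one_step_error) ** n_steps - 1
# Simpler alternative: the per-step errors just add up.
linear = one_step_error * n_steps

print(f"compounded: {compounded:.2%}, linear: {linear:.2%}")
```

Both estimates land between 7.5% and 7.8% after ten steps, so the concern about accumulated error holds under either assumption.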

I’ve looked at a few YouTube tutorials, but each time their call to model.predict(X) already seems to return multiple values, and I have no idea which parameters I could have missed.

Code :

from tensorflow import keras as tfkr #alias must match the tfkr.* calls below
import pandas as pd
import numpy as np

def model_training(dataframe,folder_name, window_size=40, epochs=100, batch_size=64):
    """Function to start training the model on given data
    
    Parameters
    ----------
    dataframe : `pandas.DataFrame` The dataframe to train the model on
    folder_name : `str` The name of the folder the model checkpoint is saved to
    window_size : `int` The size of the lookback window to use
    epochs : `int` The number of epochs to train for
    batch_size : `int` The batch size to use for training

    Returns
    -------
    None
    """
    dataframe,_ = Process.pre(dataframe) #function to standardize each column of the data

    TRAIN_SIZE = 0.7
    VAL_SIZE = 0.2
    TEST_SIZE = 0.1

    #Splitting the data into train, validation and test sets
    x,y = dataframe_to_xy(dataframe, window_size) #converts pandas dataframe to numpy array

    #Split on the number of windows (len(x)), not on the number of dataframe rows
    train_end = int(len(x)*TRAIN_SIZE)
    val_end = int(len(x)*(TRAIN_SIZE+VAL_SIZE))

    x_train,y_train = x[:train_end],y[:train_end]
    x_val,y_val = x[train_end:val_end],y[train_end:val_end]
    x_test,y_test = x[val_end:],y[val_end:] #the remaining TEST_SIZE share

    #Creating the model base
    model = tfkr.models.Sequential()
    model.add(tfkr.layers.InputLayer(input_shape=(window_size, 10)))
    model.add(tfkr.layers.LSTM(64))
    model.add(tfkr.layers.Dense(8, 'relu'))
    model.add(tfkr.layers.Dense(10, 'linear'))

    model.summary()

    #Compiling and saving the model
    cp = tfkr.callbacks.ModelCheckpoint('ai\\models\\'+folder_name+'\\', save_best_only=True)
    model.compile(loss=tfkr.losses.MeanSquaredError(), optimizer=tfkr.optimizers.Adam(learning_rate=0.0001), metrics=[tfkr.metrics.RootMeanSquaredError()])

    model.fit(x_train, y_train, epochs=epochs, batch_size=batch_size, validation_data=(x_val, y_val), callbacks=[cp])

def predict_data(model,data_pre,window_size,n_future):
    '''Function to predict data using the model
    
    Parameters
    ----------
    model : `tensorflow.keras.models.Sequential` The model to use for prediction
    data_pre : `pandas.DataFrame` The dataframe to predict on
    window_size : `int` The size of the lookback window to use
    n_future : `int` Number of values to predict in the future

    Returns
    -------
    data_pred : `pandas.DataFrame` The dataframe containing the predicted values
    '''

    time_interval = data_pre.index[1] - data_pre.index[0] 

    #Setting up the dataframe to predict on
    data_pre, proc_params = Process.pre(data_pre) #function to standardize each column of the data
    data_pred = data_pre.iloc[-window_size:]
    data_pred = data_pred.to_numpy().astype('float32')
    data_pred = data_pred.reshape(1,window_size,10)

    #Predicting the data
    data_pred = model.predict(data_pred)
    
    #Converting the data from numpy array to pandas dataframe and doing some formatting + post-processing/reversing standardization
    #yada yada pandas dataframe

    return data_pred

If there is no way to prevent the error propagation, do you have any tips for reducing the model's error?

Thanks in advance.

Solution

You could let your LSTM return the full sequences, not only the last output, like this:

model = tfkr.models.Sequential()
model.add(tfkr.layers.InputLayer(input_shape=(window_size, 10)))
model.add(tfkr.layers.LSTM(64, return_sequences=True))
model.add(tfkr.layers.Dense(8, 'relu'))
model.add(tfkr.layers.Dense(10, 'linear'))

Each LSTM output would go through the same dense weights (because we did not flatten). Your output would then have the shape

(None, window_size, 10)

This means that for k input time points you would get k output time points, so the training targets y need that shape as well.
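
The answer does not say how the training targets should be built for this output shape. One possible reading, shown here purely as an assumption, is to pair each input window with the window_size rows that follow it. `make_sequence_windows` is a hypothetical helper, not part of the question's code or of Keras, and the random data is only there to show the shapes.

```python
import numpy as np

def make_sequence_windows(data, window_size):
    """Pair each input window with the window_size rows that follow it.

    data: array of shape (n_samples, n_features).
    Returns x and y, both of shape (n_windows, window_size, n_features).
    """
    data = np.asarray(data, dtype="float32")
    n_windows = len(data) - 2 * window_size + 1
    if n_windows <= 0:
        raise ValueError("need at least 2 * window_size rows of data")
    # x[i] covers rows i .. i+window_size-1, y[i] covers the next window_size rows.
    x = np.stack([data[i:i + window_size] for i in range(n_windows)])
    y = np.stack([data[i + window_size:i + 2 * window_size] for i in range(n_windows)])
    return x, y

data = np.random.rand(100, 10)  # placeholder data: 100 time steps, 10 features
x, y = make_sequence_windows(data, window_size=20)
print(x.shape, y.shape)
```

Both arrays come out as (n_windows, window_size, 10), which matches the (None, window_size, 10) output of the model above.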

The downside is that the first output would be computed from the first input only, the second output from the first two inputs only, and so on. So I would suggest using a bidirectional LSTM and combining both directions, maybe like this:

model = tfkr.models.Sequential()
model.add(tfkr.layers.InputLayer(input_shape=(window_size, 10)))
model.add(tfkr.layers.Bidirectional(tfkr.layers.LSTM(64, return_sequences=True), merge_mode='sum'))
model.add(tfkr.layers.Dense(8, 'relu'))
model.add(tfkr.layers.Dense(10, 'linear'))

Answered By – AndrzejO

This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
