Convert CNN-LSTM model to 1D-CNN model: dimension error – `logits` and `labels` must have the same shape

Issue

I have a CNN-LSTM model which I want to convert into a simple CNN model for results comparison. This is the original CNN-LSTM model:

        # define model CNN-LSTM
        model = Sequential()
        model.add(TimeDistributed(Conv1D(filters=16, kernel_size=2, activation='relu'),
                                  input_shape=(None, 30, 40)))
        model.add(TimeDistributed(Conv1D(filters=16, kernel_size=2, activation='relu')))
        model.add(TimeDistributed(Dropout(0.2)))
        model.add(TimeDistributed(MaxPooling1D(pool_size=3)))
        model.add(TimeDistributed(Flatten()))
        model.add(LSTM(10))
        model.add(Dropout(dropout))
        model.add(Dense(10, activation='relu'))
        model.add(Dense(1, activation='sigmoid'))
        model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=["accuracy"])

After the LSTM layer the output shape is (None, 10), which the final dense layers reduce to (None, 1):

model.summary()
Model: "sequential_15"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 time_distributed_67 (TimeDi  (None, None, 14, 16)     1296      
 stributed)                                                      
                                                                 
 time_distributed_68 (TimeDi  (None, None, 13, 16)     528       
 stributed)                                                      
                                                                 
 time_distributed_69 (TimeDi  (None, None, 13, 16)     0         
 stributed)                                                      
                                                                 
 time_distributed_70 (TimeDi  (None, None, 4, 16)      0         
 stributed)                                                      
                                                                 
 time_distributed_71 (TimeDi  (None, None, 64)         0         
 stributed)                                                      
                                                                 
 lstm_2 (LSTM)               (None, 10)                3000      
                                                                 
 dropout_17 (Dropout)        (None, 10)                0         
                                                                 
 dense_22 (Dense)            (None, 10)                110       
                                                                 
 dense_23 (Dense)            (None, 1)                 11        
                                                                 
=================================================================
Total params: 4,945
Trainable params: 4,945
Non-trainable params: 0
_________________________________________________________________

Now I want to remove the LSTM part and just keep the CNN part for comparison:

        # define model CNN
        model = Sequential()
        model.add(TimeDistributed(Conv1D(filters=16, kernel_size=2, activation='relu'),
                                  input_shape=(None, 30, 40)))
        model.add(TimeDistributed(Conv1D(filters=16, kernel_size=2, activation='relu')))
        model.add(TimeDistributed(Dropout(0.2)))
        model.add(TimeDistributed(MaxPooling1D(pool_size=3)))
        model.add(TimeDistributed(Flatten()))
        model.add(Dense(10, activation='relu'))
        model.add(Dense(1, activation='sigmoid'))
        model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=["accuracy"])


model.summary()
Model: "sequential_16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 time_distributed_72 (TimeDi  (None, None, 14, 16)     1296      
 stributed)                                                      
                                                                 
 time_distributed_73 (TimeDi  (None, None, 13, 16)     528       
 stributed)                                                      
                                                                 
 time_distributed_74 (TimeDi  (None, None, 13, 16)     0         
 stributed)                                                      
                                                                 
 time_distributed_75 (TimeDi  (None, None, 4, 16)      0         
 stributed)                                                      
                                                                 
 time_distributed_76 (TimeDi  (None, None, 64)         0         
 stributed)                                                      
                                                                 
 dense_24 (Dense)            (None, None, 10)          650       
                                                                 
 dense_25 (Dense)            (None, None, 1)           11        
                                                                 
=================================================================
Total params: 2,485
Trainable params: 2,485
Non-trainable params: 0
_________________________________________________________________

Without the LSTM, the unknown time dimension is never collapsed, so the shape reaching the dense layers is (None, None, ...) instead of (None, ...), and Keras raises a shape mismatch between the logits and the labels:

ValueError: `logits` and `labels` must have the same shape, received ((None, None, 2) vs (None,)).

I am struggling to reshape the model so that the final dense layer produces the desired (None, 1) output.

Solution

You actually need a 2D tensor with the shape (batch_size, features), and a Flatten layer cannot collapse a None (unknown) dimension. Instead, remove the last TimeDistributed(Flatten()) layer and add a GlobalMaxPool2D (or GlobalAvgPool2D) layer, which pools over both unknown axes, and it will work:

import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.TimeDistributed(tf.keras.layers.Conv1D(filters=16, kernel_size=2, activation='relu'),
                                          input_shape=(None, 30, 40)))
model.add(tf.keras.layers.TimeDistributed(tf.keras.layers.Conv1D(filters=16, kernel_size=2, activation='relu')))
model.add(tf.keras.layers.TimeDistributed(tf.keras.layers.Dropout(0.2)))
model.add(tf.keras.layers.TimeDistributed(tf.keras.layers.MaxPooling1D(pool_size=3)))
model.add(tf.keras.layers.GlobalAvgPool2D())
model.add(tf.keras.layers.Dense(10, activation='relu'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.summary()
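As a quick sanity check, the same architecture can be built and run end to end on random dummy data (a sketch: the `adam` optimizer and the batch/sequence sizes here are placeholders, not the asker's actual setup). The point is that `GlobalAvgPool2D` reduces the 4D `(batch, time, steps, channels)` tensor to `(batch, channels)`, so the final output is `(None, 1)` and `binary_crossentropy` no longer complains about mismatched shapes:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(None, 30, 40)),  # unknown number of sub-sequences
    layers.TimeDistributed(layers.Conv1D(filters=16, kernel_size=2, activation='relu')),
    layers.TimeDistributed(layers.Conv1D(filters=16, kernel_size=2, activation='relu')),
    layers.TimeDistributed(layers.Dropout(0.2)),
    layers.TimeDistributed(layers.MaxPooling1D(pool_size=3)),
    # Pools over the time axis AND the remaining steps axis:
    # (batch, time, steps, 16) -> (batch, 16)
    layers.GlobalAvgPool2D(),
    layers.Dense(10, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Dummy batch: 2 samples, each with 5 sub-sequences of shape (30, 40).
x = np.random.rand(2, 5, 30, 40).astype('float32')
y = np.array([0., 1.], dtype='float32')

model.fit(x, y, epochs=1, verbose=0)   # trains without the ValueError
preds = model.predict(x, verbose=0)
print(preds.shape)                     # one sigmoid probability per sample
```

Swapping `GlobalAvgPool2D` for `GlobalMaxPool2D` is a one-line change; both collapse the two unknown axes, they just differ in whether they average or take the maximum.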

Answered By – AloneTogether

This answer was collected from Stack Overflow and is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0.
