## Issue

I’m creating a model using the Keras functional API.

The layer architecture is as follows:

```
n = tf.keras.layers.Dense(1)(input)  # predicted number of iterations
for i in tf.range(n):
    output = tf.keras.layers.Dense(4)(input)  # one row of 4 values per iteration
```

I then concatenate the outputs and return a tensor of shape [1, None, 4], where 1 is the batch dimension, None is n, and 4 is the output size of the second dense layer.

My loss function compares the shape of the output with the shape of the expected output, and also compares their values.

```
loss = tf.convert_to_tensor(abs(tf.shape(logits)[1] - tf.shape(expected)[1])) * 100.
```

When I run this in a custom training loop, I get the following error:

```
ValueError: No gradients provided for any variable: (['while/dense/kernel:0',
'while/dense/bias:0', 'while/while/dense_1/kernel:0', 'while/while/dense_1/bias:0'],).
Provided `grads_and_vars` is ((None, <tf.Variable 'while/dense/kernel:0' shape=(786432, 1)
```

## Solution

Shape is not differentiable, so you cannot do things like this with gradient-based learning. Problems like this need to be tackled with more powerful tools, e.g. reinforcement learning, where one treats n as an action and obtains a policy gradient for it.
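To make the policy-gradient idea a bit more concrete, here is a minimal sketch of a score-function (REINFORCE-style) update for a discrete n. Everything in it is an assumption for illustration: `MAX_N`, `TARGET_N`, the reward, and the single-vector `policy_logits` are hypothetical and not part of the original model.

```python
import tensorflow as tf

MAX_N = 8        # hypothetical upper bound on n
TARGET_N = 3.0   # hypothetical "correct" length for this toy example

# Hypothetical policy: one learnable logit per possible value of n.
policy_logits = tf.Variable(tf.zeros([MAX_N + 1]))

with tf.GradientTape() as tape:
    # Sampling n is a discrete step and carries no gradient by itself.
    n = tf.random.categorical(policy_logits[tf.newaxis, :], num_samples=1)[0, 0]
    # The reward is just a number: higher when n is closer to the target.
    reward = -tf.abs(tf.cast(n, tf.float32) - TARGET_N)
    # Log-probability of the sampled action, selected with a one-hot mask.
    log_probs = tf.nn.log_softmax(policy_logits)
    log_prob = tf.reduce_sum(tf.one_hot(n, MAX_N + 1) * log_probs)
    # The estimator differentiates log pi(n), never n itself.
    loss = -tf.stop_gradient(reward) * log_prob

grad = tape.gradient(loss, policy_logits)
```

The gradient exists here because it flows through the smooth log-probability of the action rather than through the integer n. In practice such estimators are noisy and usually need many samples and a baseline.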

A rule of thumb to remember is that you cannot really backprop through **discrete** objects. You need to produce floats, because gradients require **smooth** functions. In your case n has to be an integer (what would a loop over a float mean?), so this should be your first warning sign. The second is the shape itself, which is also an integer. A **target** can be discrete, but the prediction cannot. Note that even in classification we do not output a **class**; we output a **probability**, because probability is smooth.
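The effect can be reproduced in a few lines. The sketch below uses a hypothetical toy variable `w` and input `x` (not the model from the question) and compares a smooth loss with a loss built from `tf.shape`:

```python
import tensorflow as tf

# Hypothetical toy variable and input, chosen only for illustration.
w = tf.Variable([[1.0], [2.0]])
x = tf.constant([[3.0, 4.0]])

with tf.GradientTape(persistent=True) as tape:
    y = tf.matmul(x, w)                   # smooth function of w
    smooth_loss = tf.reduce_sum(y ** 2)
    # The shape is an integer that does not depend on the values in w.
    shape_loss = tf.cast(tf.shape(y)[1], tf.float32) * 100.0

grad_smooth = tape.gradient(smooth_loss, w)  # a regular float tensor
grad_shape = tape.gradient(shape_loss, w)    # no path back to w

print(grad_shape)  # None
```

A `None` gradient like this for every variable is what makes the optimizer raise "No gradients provided for any variable".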

You could build your model by assuming some maximum N for n and treating the count more like a classification target that you supervise directly, with some form of masking to keep all N results around while ignoring the unused ones.
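One possible way to set this up is sketched below. It is only an illustration under assumptions: `MAX_N`, `FEATURES`, the two-head layout, and `loss_fn` are hypothetical names and choices, not something taken from the question.

```python
import tensorflow as tf

MAX_N = 8       # assumed upper bound on n (hypothetical)
FEATURES = 16   # assumed input size (hypothetical)

inputs = tf.keras.Input(shape=(FEATURES,))
# Head 1: logits over the possible counts 0..MAX_N.
count_logits = tf.keras.layers.Dense(MAX_N + 1)(inputs)
# Head 2: always produce MAX_N rows of 4 values each.
rows = tf.keras.layers.Dense(MAX_N * 4)(inputs)
rows = tf.keras.layers.Reshape((MAX_N, 4))(rows)
model = tf.keras.Model(inputs, [count_logits, rows])

def loss_fn(count_logits, rows, true_n, true_rows):
    # Supervise the count directly with a smooth classification loss.
    count_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=true_n, logits=count_logits)
    # 1.0 for the first true_n rows, 0.0 for the padded rows.
    mask = tf.sequence_mask(true_n, maxlen=MAX_N, dtype=tf.float32)
    sq_err = tf.reduce_sum(tf.square(rows - true_rows), axis=-1)
    row_loss = tf.reduce_sum(sq_err * mask, axis=-1) / tf.maximum(
        tf.reduce_sum(mask, axis=-1), 1.0)
    return tf.reduce_mean(count_loss + row_loss)

# Toy batch: two examples with 3 and 5 valid rows, targets padded to MAX_N.
x = tf.random.normal([2, FEATURES])
true_n = tf.constant([3, 5])
true_rows = tf.zeros([2, MAX_N, 4])

with tf.GradientTape() as tape:
    pred_logits, pred_rows = model(x)
    loss = loss_fn(pred_logits, pred_rows, true_n, true_rows)

grads = tape.gradient(loss, model.trainable_variables)
```

At inference time you would take `tf.argmax(pred_logits, axis=-1)` as the predicted n and keep only that many leading rows. The discrete choice then happens outside the loss, so every variable still receives a gradient during training.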

Answered By – lejlot

**This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.**