## Issue

My LSTM RNN has to predict a single letter (Y), given the preceding characters (X).

For example, if “Oh, say! can you see by the dawn’s early ligh” is given as X, then Y would be “t” (part of the U.S. national anthem). Each letter is one-hot encoded; “g”, for example, might be encoded as [0,0,0,1,0,0,0,0,0,0,0,0,…,0,0,0,0,0,0].

`dataX:[batch_size,20,num_of_classes], dataY:[batch_size,1,num_of_classes]`
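As a sketch of how one sample of X with that shape might be built (the alphabet and the `char_to_index` mapping below are illustrative assumptions, not taken from the question), a window of 20 characters can be one-hot encoded with NumPy:

```python
import numpy as np

# Hypothetical alphabet; num_of_classes would be its size.
alphabet = "abcdefghijklmnopqrstuvwxyz .,!'"
num_of_classes = len(alphabet)
char_to_index = {c: i for i, c in enumerate(alphabet)}

def one_hot_window(text, window=20):
    """Encode the last `window` characters as a [window, num_of_classes] array."""
    x = np.zeros((window, num_of_classes))
    for row, ch in enumerate(text[-window:].lower()):
        x[row, char_to_index[ch]] = 1.0
    return x

x = one_hot_window("can you see by the dawn's early ligh")
print(x.shape)  # one sample: (20, num_of_classes)
```

Stacking `batch_size` such windows gives the `[batch_size, 20, num_of_classes]` tensor described above.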

In this case, what loss function would be best for prediction?

Both X and Y are one-hot encoded; X is a sequence of many vectors and Y is a single one.

I rarely find loss functions that take one-hot vectors as parameters (e.g., as the logits or the target).

## Solution

What you are looking for is the cross entropy between

**Y_** (ground truth) and **Y** (probabilities)

You could use a basic hand-coded cross entropy:

```
y = tf.nn.softmax(logit_layer)
# Sum over classes for each example, then average over the batch.
loss = -tf.reduce_mean(tf.reduce_sum(y_ * tf.log(y), axis=1))
```
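For intuition, here is a NumPy sketch of the same idea, with made-up logits and one-hot labels: softmax over the logits, then cross entropy summed over classes and averaged over the batch.

```python
import numpy as np

def softmax(logits):
    # Subtract the row max for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Two examples, four classes; y_true is the one-hot ground truth.
logits = np.array([[2.0, 1.0, 0.1, 0.0],
                   [0.5, 2.5, 0.3, 0.1]])
y_true = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0]])

y = softmax(logits)                                   # predicted probabilities
loss = -np.mean(np.sum(y_true * np.log(y), axis=1))   # scalar batch loss
print(loss)
```

The loss is small when the probability mass sits on the true class and grows as it drifts away.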

Or you could use the built-in TensorFlow function:

```
# softmax_cross_entropy_with_logits returns one loss value per example,
# so reduce it to a scalar with tf.reduce_mean.
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logit_layer))
```

Your **Y** output would be something like [0.01, 0.02, 0.01, 0.98, 0.02, …], and your **logit_layer** is just the raw output before applying softmax.
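To illustrate that last point with invented numbers, applying softmax to a raw logit vector yields a probability vector that peaks at the predicted letter:

```python
import numpy as np

logits = np.array([-1.0, 0.2, -0.5, 4.0, 0.1])  # raw output of the last layer
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(probs.round(2))  # most of the mass lands on index 3
```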

Answered By – Anton Codes

**This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.**