Difference between categorical and binary cross-entropy

Issue

Using Keras, I have to train a model to predict whether an image belongs to class 0 or class 1. I am confused about the difference between binary_crossentropy and categorical_crossentropy. I have searched for this but I am still confused. Some have mentioned that categorical cross-entropy is only used when predicting multiple classes, and that the labels should then be one-hot encoded vectors. So it seems we don't need one-hot encoded labels when training with binary_crossentropy. Others have suggested representing the labels as one-hot vectors, [0. 1.] (if the class is 1) or [1. 0.] (if the class is 0), for binary_crossentropy.
I am using one-hot encoded labels, [0 1] or [1 0], with categorical cross-entropy. My last layer is:

model.add(Dense(num_classes, activation='softmax'))
  
# Compile model
model.compile(loss='categorical_crossentropy', 
              optimizer='adadelta', 
              metrics=['accuracy'])

Solution

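For 2 classes the two setups are mathematically equivalent, hence the name binary. In other words, 2-class categorical cross-entropy with a softmax output is the same as single-output binary cross-entropy with a sigmoid output: the softmax probability of class 1 equals the sigmoid of the difference between the two logits. To give a more tangible example, these are equivalent: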

model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', ...)
# is the same as
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', ...)
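
The equivalence can be checked numerically without Keras. The following is a minimal sketch in plain Python, not Keras' own implementation: the helper functions and the logit values `z0` and `z1` are made up for illustration. It assumes the single sigmoid unit receives the difference of the two softmax logits, which is the condition under which the two losses coincide.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    # Subtract the maximum for numerical stability
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def binary_cross_entropy(y_true, p):
    # y_true is a scalar 0 or 1; p is the predicted probability of class 1
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

def categorical_cross_entropy(one_hot, probs):
    # one_hot is a one-hot label vector; probs are the softmax outputs
    return -sum(t * math.log(q) for t, q in zip(one_hot, probs))

# Hypothetical logits of a 2-unit softmax layer for one sample
z0, z1 = 0.3, 1.7
probs = softmax([z0, z1])

# Assumed matching logit of the single sigmoid unit: z1 - z0
p1 = sigmoid(z1 - z0)

for label in (0, 1):
    one_hot = [1.0, 0.0] if label == 0 else [0.0, 1.0]
    bce = binary_cross_entropy(label, p1)
    cce = categorical_cross_entropy(one_hot, probs)
    print(label, bce, cce)  # the two loss values agree up to rounding
```

Two independently trained networks will not produce exactly the same numbers, since the softmax version has one more output unit and therefore a different parameterization; the point is only that both express the same loss.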

Which one to use? If you only have 2 classes, binary cross-entropy is easier from a coding perspective, because you can keep the labels as plain 0/1 values and avoid one-hot encoding them. The binary case might also be computationally more efficient, depending on the implementation.
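
To make the label formats concrete, here is a small sketch of what each setup expects. The `to_one_hot` helper and the example labels are hypothetical and written in plain Python; in a Keras project, `keras.utils.to_categorical` serves the same purpose.

```python
def to_one_hot(labels, num_classes=2):
    # Turn integer class labels into one-hot vectors
    return [[1.0 if i == y else 0.0 for i in range(num_classes)]
            for y in labels]

# Hypothetical integer labels for four images
labels = [0, 1, 1, 0]

# Dense(1, activation='sigmoid') + binary_crossentropy:
# use the 0/1 labels directly, one value per sample
binary_targets = labels

# Dense(2, activation='softmax') + categorical_crossentropy:
# use one-hot vectors, one row of length 2 per sample
categorical_targets = to_one_hot(labels)

print(binary_targets)
print(categorical_targets)
```

If the labels are already one-hot encoded, as in the question, keeping the 2-unit softmax with categorical_crossentropy is a valid choice as well.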

Answered By – nuric

This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
