In this case, the activation function does not depend on scores of other classes in \(C\) more than \(C_1 = C_i\). So the gradient with respect to each score \(s_i\) in \(s\) will only depend on the loss given by its binary problem, as the sketch after the list below illustrates.
- Caffe: Sigmoid Cross-Entropy Loss Layer
- Pytorch: BCEWithLogitsLoss
- TensorFlow: sigmoid_cross_entropy
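A quick way to check this is to compute the gradient with one of the implementations above. A minimal sketch, assuming PyTorch's BCEWithLogitsLoss and a made-up vector of scores and multi-label targets, verifies that \(\frac{\partial L}{\partial s_i} = \sigma(s_i) - t_i\), i.e. each score's gradient only depends on its own binary problem:

```python
import torch
import torch.nn as nn

# Hypothetical scores s for C = 3 classes of one sample, with multi-label targets t.
s = torch.tensor([1.2, -0.7, 0.3], requires_grad=True)
t = torch.tensor([1.0, 0.0, 1.0])

# Sum the C independent binary Cross-Entropy terms of this sample.
loss = nn.BCEWithLogitsLoss(reduction='sum')(s, t)
loss.backward()

# dL/ds_i = sigmoid(s_i) - t_i: each gradient component only sees its own score.
print(s.grad)
print(torch.sigmoid(s.detach()) - t)  # same values as s.grad
```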
Focal Loss
Focal Loss was introduced by Lin et al., from Facebook, in this paper. They claim to improve one-stage object detectors by using Focal Loss to train a detector they name RetinaNet. Focal Loss is a Cross-Entropy Loss that weighs the contribution of each sample to the loss based on the classification error. The idea is that, if a sample is already classified correctly by the CNN, its contribution to the loss decreases.
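As an illustration of that weighting, here is a minimal sketch of a binary Focal Loss with logits, following the paper's formulation \(FL(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)\); the function name and the mean reduction are my own choices, not part of the paper:

```python
import torch

def binary_focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Sketch of the binary Focal Loss FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    p = torch.sigmoid(logits)
    # p_t: predicted probability assigned to the ground truth class of each sample
    p_t = torch.where(targets == 1, p, 1 - p)
    alpha_t = torch.where(targets == 1, torch.full_like(p, alpha), torch.full_like(p, 1 - alpha))
    # (1 - p_t)**gamma shrinks the loss of samples that are already well classified
    return (-alpha_t * (1 - p_t) ** gamma * torch.log(p_t)).mean()

# An easy positive, an easy negative and a hard positive: the first two contribute little.
logits = torch.tensor([3.0, -2.5, 0.1])
targets = torch.tensor([1.0, 0.0, 1.0])
print(binary_focal_loss(logits, targets))
```

With \(\gamma = 0\) the modulating factor disappears and this reduces to an \(\alpha\)-balanced Cross-Entropy Loss.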