-
Mean vs. Total Cross-Entropy Loss
Mainly a reminder to myself after wasting a full day debugging gradients
-
Initializing the Bias in Output Layers
Should you initialize the bias in the output layer to predict the positive rate?
-
Why is Numpy Faster than Pure Python?
TLDR; Numpy leverages contiguous memory allocation and vectorizes operations over entire arrays.