How to Choose Loss Functions When Training Deep Learning Neural Networks
    Mean Squared Error Loss
    Mean Squared Logarithmic Error Loss
    Mean Absolute Error Loss
    Binary Classification Loss Functions
        Binary Cross-Entropy Loss
        Hinge Loss
        Squared Hinge Loss
    Multi-Class Classification Loss Functions
        Multi-Class Cross-Entropy Loss
        Sparse Multiclass Cross-Entropy Loss
        Kullback Leibler Divergence Loss

How to Fix Vanishing Gradients Using the Rectified Linear Activation Function
    Deeper MLP Model with ReLU for Two Circles Problem
    Review Average Gradient Size During Training

How to Develop a Weighted Average Ensemble for Deep Learning Neural Networks
    Multilayer Perceptron Model
    Model Averaging Ensemble
    Grid Search Weighted Average Ensemble
    Weighted Average MLP Ensemble