Machine Learning Loss Functions

Regression, classification, and task-specific losses: what each one measures and what to watch out for.

Updated Apr 19, 2026

Regression

Loss	Formula	Notes
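MSE / L2	(1/N) Σ (y − ŷ)²	Smooth; penalizes outliers heavily
MAE / L1	(1/N) Σ |y − ŷ|	Robust to outliers; gradient has constant magnitude and is undefined at 0
Huber	½e² if |e| ≤ δ, else δ·(|e| − ½δ), with e = y − ŷ	Quadratic near 0, linear in the tails: smooth + robust
Log-cosh	Σ log(cosh(e))	Smooth everywhere, outlier-resistant
Quantile	Σ max(q·e, (q−1)·e)	Regression for a specific quantile q ∈ (0, 1); q = 0.5 recovers MAE up to a factor of ½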
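
The formulas in the table above can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation: the function names, the sample data, and the choice to average over N for every loss (the table writes log-cosh and quantile as plain sums) are assumptions made for the example.

```python
import math

def mse(y, y_hat):
    # Mean of squared errors
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)

def mae(y, y_hat):
    # Mean of absolute errors
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)

def huber(y, y_hat, delta=1.0):
    # Quadratic for |e| <= delta, linear beyond it
    total = 0.0
    for a, b in zip(y, y_hat):
        e = abs(a - b)
        total += 0.5 * e * e if e <= delta else delta * (e - 0.5 * delta)
    return total / len(y)

def log_cosh(y, y_hat):
    # log(cosh(e)) rewritten as |e| + log1p(exp(-2|e|)) - log(2)
    # so that large |e| does not overflow cosh
    total = 0.0
    for a, b in zip(y, y_hat):
        e = abs(a - b)
        total += e + math.log1p(math.exp(-2.0 * e)) - math.log(2.0)
    return total / len(y)

def quantile(y, y_hat, q=0.5):
    # Pinball loss with e = y - y_hat
    return sum(max(q * (a - b), (q - 1) * (a - b)) for a, b in zip(y, y_hat)) / len(y)

# Hypothetical data: the last target is an outlier
y = [1.0, 2.0, 3.0, 100.0]
y_hat = [1.5, 2.0, 2.5, 4.0]
for name, fn in [("MSE", mse), ("MAE", mae), ("Huber", huber),
                 ("Log-cosh", log_cosh), ("Quantile", quantile)]:
    print(name, fn(y, y_hat))
```

On this data the single outlier dominates MSE, while MAE and Huber grow only linearly with it, which is the trade-off the Notes column describes.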

Classification

Loss	Formula	Notes
Binary cross-entropy	−[y·log(p) + (1−y)·log(1−p)]	Use with sigmoid output
Categorical cross-entropy	−Σ yᵢ · log(pᵢ)	Use with softmax output
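Sparse categorical CE	−log(p_c), c = true class index	Same as above with integer labels
Hinge	max(0, 1 − y·ŷ)	SVMs; y ∈ {−1, +1}, ŷ is the raw score
Focal loss	−(1 − p_t)^γ · log(p_t)	Imbalanced classification; γ > 0 down-weights easy examples, γ = 0 is plain cross-entropy
Label smoothing	Replace the one-hot 1 with 1−ε and spread ε over the other classes	Prevents overconfidence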
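
A per-example sketch of several rows from the table above, in plain Python. The function names, the `eps` clipping constant, and the choice to spread ε evenly over the K−1 non-target classes are assumptions for illustration; libraries differ on the exact smoothing variant.

```python
import math

def binary_cross_entropy(y, p, eps=1e-12):
    # y in {0, 1}; p is the sigmoid output. Clip p so log() never sees 0.
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def focal_loss(y, p, gamma=2.0, eps=1e-12):
    # p_t is the probability assigned to the true class
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

def hinge(y, score):
    # y in {-1, +1}; score is the raw model output
    return max(0.0, 1.0 - y * score)

def smooth_labels(one_hot, epsilon=0.1):
    # 1 -> 1 - epsilon, 0 -> epsilon / (K - 1), so the targets still sum to 1
    k = len(one_hot)
    return [1.0 - epsilon if t == 1 else epsilon / (k - 1) for t in one_hot]

# An easy positive (p = 0.9) contributes far less under focal loss than under BCE
print(binary_cross_entropy(1, 0.9), focal_loss(1, 0.9))
print(hinge(-1, 0.5))
print(smooth_labels([0, 1, 0]))
```

With `gamma=0` the focal loss reduces to binary cross-entropy, which is a quick sanity check when wiring it into a training loop.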

Task-specific

Loss	Use
Triplet loss	Metric learning — embeddings
Contrastive (InfoNCE)	Self-supervised, CLIP-style training
Dice loss (1 − Dice coefficient)	Image segmentation (handles imbalance)
IoU / Jaccard loss (1 − IoU)	Segmentation / detection
CTC loss	Sequence prediction without alignment (speech, OCR)
DPO / reward modeling	Preference-based fine-tuning (RLHF reward models; DPO optimizes on preference pairs directly)
KL divergence	Distillation, variational methods
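
Three of the losses in this table are short enough to sketch directly. The version below works on flat Python lists and is only illustrative: the function names, the `smooth` constant that guards against empty masks, and the Euclidean distance in the triplet loss are assumptions, and real segmentation code would operate on batched tensors.

```python
import math

def dice_loss(pred, target, smooth=1e-7):
    # pred: probabilities or 0/1 values per pixel; target: 0/1 mask
    inter = sum(p * t for p, t in zip(pred, target))
    return 1.0 - (2.0 * inter + smooth) / (sum(pred) + sum(target) + smooth)

def iou_loss(pred, target, smooth=1e-7):
    inter = sum(p * t for p, t in zip(pred, target))
    union = sum(pred) + sum(target) - inter
    return 1.0 - (inter + smooth) / (union + smooth)

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Pull the positive closer to the anchor than the negative, by at least `margin`
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical 4-pixel mask: one true-positive pixel, one false positive
pred = [1.0, 1.0, 0.0, 0.0]
target = [1.0, 0.0, 0.0, 0.0]
print(dice_loss(pred, target), iou_loss(pred, target))
print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))
```

Both overlap losses are 0 for a perfect mask and approach 1 as the overlap vanishes, regardless of how many background pixels there are, which is why they cope with class imbalance.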

Notes

Add a small regularization term (L1 / L2 on weights) to reduce overfitting.
Log-likelihoods should be computed in log-space to avoid numerical underflow — use log_softmax + NLL instead of softmax + log.
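
The underflow problem in the last note is easy to reproduce. The sketch below, in plain Python with hypothetical function names, compares softmax followed by log against a log-softmax built on the log-sum-exp trick (subtract the maximum logit before exponentiating).

```python
import math

def log_softmax(logits):
    # log-sum-exp trick: shift by the max so exp() never overflows
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def nll(logits, target):
    # Negative log-likelihood of the true class, computed in log-space
    return -log_softmax(logits)[target]

def naive_cross_entropy(logits, target):
    # softmax first, then log: the probability can underflow to exactly 0
    exps = [math.exp(z) for z in logits]
    prob = exps[target] / sum(exps)
    return -math.log(prob)

logits = [0.0, -1000.0]  # the true class (index 1) is extremely unlikely
print(nll(logits, 1))    # finite loss
try:
    print(naive_cross_entropy(logits, 1))
except ValueError as err:
    # exp(-1000) underflows to 0.0, so log(0.0) fails
    print("naive version failed:", err)
```

Framework losses that take raw logits typically apply the same trick internally, so passing logits rather than probabilities is usually the safer choice.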
