Hosmer-Lemeshow Test
Definition
The Hosmer-Lemeshow Test is a statistical test for goodness of fit for logistic regression models. It assesses whether the observed event rates match expected event rates in subgroups of the model population.
1. Procedure
- Predict Probabilities: Calculate predicted probabilities for all observations.
- Group Data: Sort observations by predicted probability and divide them into G groups (typically deciles, G = 10).
- Compare: In each group, compare the observed number of events with the expected number of events.
- Chi-Square Statistic:

$$
H = \sum_{g=1}^{G} \frac{(O_g - E_g)^2}{N_g \bar{\pi}_g (1 - \bar{\pi}_g)}
$$

where $O_g$ is the observed number of events, $E_g = N_g \bar{\pi}_g$ is the expected number of events, $N_g$ is the number of observations, and $\bar{\pi}_g$ is the average predicted probability in group $g$.
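
For intuition, here is the contribution of a single group, using purely illustrative numbers (not from any dataset): suppose group $g$ has $N_g = 100$ observations, average predicted probability $\bar{\pi}_g = 0.30$ (so $E_g = 30$ expected events), and $O_g = 36$ observed events. Its term in the sum is

$$
\frac{(O_g - E_g)^2}{N_g \bar{\pi}_g (1 - \bar{\pi}_g)} = \frac{(36 - 30)^2}{100 \cdot 0.30 \cdot 0.70} = \frac{36}{21} \approx 1.71.
$$

Summing such terms over all $G$ groups gives the test statistic $H$.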
2. Hypothesis
$H_0$: The model fits the data well (no significant difference between observed and predicted events).
$H_1$: The model does not fit the data well.
Interpretation:
- p > 0.05: Consistent with good fit (fail to reject $H_0$).
- p < 0.05: Evidence of poor fit (reject $H_0$).
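
As a minimal sketch of this decision step (assuming the statistic $H$ and the number of groups $G$ have already been computed; the numeric values below are hypothetical), the p-value is taken from a chi-square distribution with $G - 2$ degrees of freedom:

```python
from scipy.stats import chi2

H, G = 12.4, 10                 # hypothetical test statistic and number of groups
p_value = chi2.sf(H, df=G - 2)  # P(chi-square with G-2 df >= H)

if p_value < 0.05:
    print(f"p = {p_value:.3f}: evidence of poor fit (reject H0)")
else:
    print(f"p = {p_value:.3f}: consistent with good fit (fail to reject H0)")
```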
Limitation
The result is sensitive to the grouping method (number and boundaries of groups) and to sample size: with very large samples even trivial deviations become significant, while with small samples the test has little power. It is therefore often recommended to use it alongside calibration plots.
3. Python Implementation
Note: There is no built-in implementation in standard scikit-learn, so a custom implementation or a dedicated statistical package is needed.
# Conceptual implementation
# Group data by deciles of predicted probability
# Calculate Chi-square between observed and expected counts
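
Below is a minimal, self-contained sketch of these steps using NumPy, pandas, and SciPy. The function name `hosmer_lemeshow`, the `qcut` decile grouping, and the simulated data are illustrative choices, not a library API.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2

def hosmer_lemeshow(y_true, y_prob, n_groups=10):
    """Hosmer-Lemeshow goodness-of-fit test with decile-of-risk grouping."""
    data = pd.DataFrame({"y": y_true, "p": y_prob})
    # Bin observations into (roughly) equal-sized groups by predicted probability.
    data["group"] = pd.qcut(data["p"], q=n_groups, duplicates="drop")

    grouped = data.groupby("group", observed=True)
    obs = grouped["y"].sum()        # O_g: observed events per group
    n = grouped["y"].count()        # N_g: observations per group
    pi_bar = grouped["p"].mean()    # average predicted probability per group
    exp = n * pi_bar                # E_g: expected events per group

    # Chi-square-style statistic summed over groups.
    H = float((((obs - exp) ** 2) / (n * pi_bar * (1 - pi_bar))).sum())
    dof = grouped.ngroups - 2
    p_value = chi2.sf(H, df=dof)
    return H, p_value

# Usage with simulated, well-calibrated probabilities (illustration only).
rng = np.random.default_rng(0)
p = rng.uniform(0.05, 0.95, size=2000)
y = rng.binomial(1, p)              # outcomes drawn from the predicted probabilities
H, p_value = hosmer_lemeshow(y, p)
print(f"H = {H:.2f}, p = {p_value:.3f}")  # a well-calibrated model usually yields p > 0.05
```

Because the statistic depends on how the groups are formed, different implementations (e.g. different tie handling in the quantile binning) can produce slightly different results.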
4. Related Concepts
- Binary Logistic Regression - The model being tested.
- ROC & AUC - Measures discrimination (distinguishing classes) rather than calibration (accuracy of probability).
- Confusion Matrix - Classification performance.