StepMix

For StepMixR, please refer to the StepMixR repository.

A Python package following the scikit-learn API for generalized mixture modeling. The package supports categorical data (Latent Class Analysis) and continuous data (Gaussian Mixtures/Latent Profile Analysis). StepMix can be used for both clustering and supervised learning.

Additional features include:

Support for missing values through Full Information Maximum Likelihood (FIML);
Multiple stepwise Expectation-Maximization (EM) estimation methods based on pseudolikelihood theory;
Covariates and distal outcomes;
Parametric and non-parametric bootstrapping.

Reference

If you find StepMix useful, please leave a ⭐ and consider citing our Journal of Statistical Software paper:

@Article{,
  title = {{StepMix}: A {Python} Package for Pseudo-Likelihood
    Estimation of Generalized Mixture Models with External
    Variables},
  author = {Sacha Morin and Robin Legault and F{\'e}lix Lalibert{\'e}
    and Zsuzsa Bakk and Charles-{\'E}douard Gigu{\`e}re and Roxane
    {de la Sablonni{\`e}re} and {\'E}ric Lacourse},
  journal = {Journal of Statistical Software},
  year = {2025},
  volume = {113},
  number = {8},
  pages = {1--39},
  doi = {10.18637/jss.v113.i08},
}

Install

You can install StepMix with pip, preferably in a virtual environment:

pip install stepmix

Quickstart

The example below fits a StepMix mixture with categorical variables to a preloaded data matrix. StepMix accepts either numpy.array or pandas.DataFrame. Categories should be integer-encoded and 0-indexed.

from stepmix.stepmix import StepMix

# Categorical StepMix Model with 3 latent classes
model = StepMix(n_components=3, measurement="categorical")
model.fit(data)

# Allow missing values
model_nan = StepMix(n_components=3, measurement="categorical_nan")
model_nan.fit(data_nan)
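
The quickstart assumes that `data` already holds integer codes starting at 0. As a minimal sketch, assuming the raw answers are string labels in a NumPy array (the `raw` array and the `encode_columns` helper are hypothetical and not part of StepMix), such a matrix could be prepared like this:

```python
import numpy as np

# Hypothetical raw survey answers: 5 respondents, 2 categorical items
raw = np.array([
    ["low", "yes"],
    ["high", "no"],
    ["medium", "yes"],
    ["low", "no"],
    ["high", "yes"],
])


def encode_columns(raw):
    """Map each column's labels to integer codes 0..K-1 (sorted label order)."""
    encoded = np.empty(raw.shape, dtype=int)
    categories = []
    for j in range(raw.shape[1]):
        # np.unique returns the sorted labels and, for each row, the index
        # of its label in that sorted list
        labels, codes = np.unique(raw[:, j], return_inverse=True)
        encoded[:, j] = codes.reshape(-1)
        categories.append(labels.tolist())
    return encoded, categories


data, categories = encode_columns(raw)
```

Codes follow the alphabetical order of the labels ("high" is 0, "low" is 1, "medium" is 2), which is enough for a categorical measurement model since the codes are treated as unordered categories. The `categories` list keeps the mapping so that results can be translated back to the original labels.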

For binary data you can also use measurement="binary" or measurement="binary_nan". For continuous data, you can fit a Gaussian Mixture with diagonal covariances using measurement="continuous" or measurement="continuous_nan".

Set verbose=1 for detailed output.

Please refer to the StepMix tutorials to learn how to combine continuous and categorical data in the same model.

Tutorials

Detailed tutorials are available in notebooks:

Generalized Mixture Models with StepMix: an in-depth look at how mixture models can be defined with StepMix. The tutorial uses the Iris Dataset as an example and covers:
1. Gaussian Mixtures (Latent Profile Analysis);
2. Binary Mixtures (LCA);
3. Categorical Mixtures (LCA);
4. Mixed Categorical and Continuous Mixtures;
5. Missing Values through Full-Information Maximum Likelihood.
Stepwise Estimation with StepMix: a tutorial demonstrating how to define measurement and structural models. The tutorial discusses:
1. LCA models with distal outcomes;
2. LCA models with covariates;
3. 1-step, 2-step and 3-step estimation;
4. Corrections (BCH or ML) and other options for 3-step estimation;
5. Putting it all together: a complete model with missing values.
Model Selection:
1. Selecting the number of components in a mixture model (n_components) with cross-validation;
2. Selecting the number of components with the Parametric Bootstrapped Likelihood Ratio Test (BLRT);
3. Fit indices: AIC, BIC and other metrics.
Parameters, Bootstrapping and CI: a tutorial discussing how to:
1. Access StepMix parameters;
2. Bootstrap StepMix estimators;
3. Quickly plot confidence intervals.
Supervised and Semi-Supervised Learning with StepMix:
1. Binary Classification;
2. Multiclass Classification;
3. Semi-Supervised Learning;
4. Cross-Validation.
Deriving p-values in StepMix: a tutorial demonstrating how to transform StepMix parameters into conventional regression coefficients and how to derive p-values. The tutorial covers models with:
1. Continuous covariate;
2. Binary covariate;
3. Categorical covariate;
4. Multiple covariates (different distributions);
5. Binary distal outcome.
