SEMBA


What is semba?

semba is a Python package for Bayesian and (soon) probabilistic structural equation modelling (SEM). The project is built on semopy, another SEM package, and on the probabilistic programming framework NumPyro. One can think of semba as a Bayesian offspring of semopy, and indeed there is little difference between the two in terms of usability. The aim of the project is to harness the power of MCMC samplers for Python-using SEM enthusiasts while keeping the coding experience as close to semopy as possible.

What does semba have to offer?

semba is a young package, and many features are still planned. At the moment, its selling points are:
  • Arbitrary priors on model parameters;
  • Efficient parameter estimation by means of MCMC methods;
  • Almost complete mimicry of semopy models and methods: no need to learn two different packages.
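To make the first two points concrete, here is a minimal, standard-library-only sketch of what "a prior on a parameter, estimated by MCMC" means for a single regression slope. It is not semba's API and not the sampler NumPyro uses: the random-walk Metropolis algorithm, the toy data, and every name below (log_posterior, metropolis, wide_prior, tight_prior) are assumptions made purely for illustration.

```python
import math
import random


def log_normal_pdf(value, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    z = (value - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def log_posterior(b, xs, ys, log_prior):
    """Unnormalised log posterior of slope b in y = b * x + e, e ~ N(0, 1)."""
    log_lik = sum(log_normal_pdf(y, b * x, 1.0) for x, y in zip(xs, ys))
    return log_prior(b) + log_lik


def metropolis(xs, ys, log_prior, n_samples=2000, n_warmup=500, step=0.2, seed=0):
    """Random-walk Metropolis sampler; returns the post-warmup draws of b."""
    rng = random.Random(seed)
    b = 0.0
    current = log_posterior(b, xs, ys, log_prior)
    draws = []
    for i in range(n_warmup + n_samples):
        proposal = b + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, xs, ys, log_prior)
        # Accept with probability min(1, exp(candidate - current)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            b, current = proposal, candidate
        if i >= n_warmup:
            draws.append(b)
    return draws


# Toy data: 50 observations generated with a true slope of 1.5.
data_rng = random.Random(1)
xs = [data_rng.gauss(0.0, 1.0) for _ in range(50)]
ys = [1.5 * x + data_rng.gauss(0.0, 1.0) for x in xs]


def wide_prior(b):
    """Weakly informative prior: b ~ N(0, 10)."""
    return log_normal_pdf(b, 0.0, 10.0)


def tight_prior(b):
    """Strongly informative prior that pulls b towards zero: b ~ N(0, 0.1)."""
    return log_normal_pdf(b, 0.0, 0.1)


wide_draws = metropolis(xs, ys, wide_prior)
tight_draws = metropolis(xs, ys, tight_prior)
wide_mean = sum(wide_draws) / len(wide_draws)
tight_mean = sum(tight_draws) / len(tight_draws)
print("posterior mean, wide prior: ", round(wide_mean, 2))
print("posterior mean, tight prior:", round(tight_mean, 2))
```

Swapping wide_prior for tight_prior changes only the prior term of the log posterior, yet the estimate shrinks noticeably towards zero; this is the kind of control over parameters that arbitrary priors provide.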

What will semba offer in the foreseeable future?

  • Closer mimicry of semopy;
  • Bayesian treatment of the Gaussian process SEM proposed here under the name ModelGeneralizedEffects, which can be used to model complex phenomena such as spatial data, temporal data, or both;
  • A probabilistic approach to SEM that lets the user impose arbitrary distributional assumptions on variables and introduce complex nonlinearity.

Where to start?

The best place to start is to get familiar with semopy first at its website, as there is no difference in core syntax.

Then, one can proceed to installing semba via pip:

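pip install semba

Using semba is no different from using semopy:

from semopy.examples import political_democracy as ex
import semba

# Model description (semopy syntax) and the matching example dataset
desc, data = ex.get_model(), ex.get_data()
model = semba.Model(desc)
# Estimate the parameters from the data
model.fit(data)
# Table of parameter estimates and diagnostics
ins = model.inspect()
print(ins.head())

This produces output similar to the following:
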
       lval op   rval   Estimate       5.0%      95.0%        std      n_eff      r_hat
_b12  dem60  ~  ind60       1.32       0.72       1.86       0.35   1,098.38       1.00
_b13  dem65  ~  ind60       0.52       0.14       0.89       0.23   1,119.16       1.00
_b14  dem65  ~  dem60       0.90       0.71       1.07       0.11     479.93       1.00
_b1      x1  ~  ind60       1.00          -          -          -          -          -
_b2      x2  ~  ind60       2.17       1.94       2.39       0.14   1,028.65       1.00
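The column names above (5.0%, 95.0%, std, r_hat) are typical of summaries computed from posterior draws. As a rough illustration of how such columns can be derived, the sketch below computes them from simulated chains using only the standard library. It assumes that Estimate is a posterior mean and uses the classic (non-split) Gelman-Rubin statistic; semba and NumPyro may compute these quantities differently, and the helper names and toy chains are hypothetical.

```python
import random
import statistics


def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def gelman_rubin(chains):
    """Classic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


def summarise(chains):
    """Posterior mean, 5% and 95% quantiles, std and r_hat of one parameter."""
    draws = sorted(d for chain in chains for d in chain)
    return {
        "Estimate": statistics.fmean(draws),
        "5.0%": quantile(draws, 0.05),
        "95.0%": quantile(draws, 0.95),
        "std": statistics.stdev(draws),
        "r_hat": gelman_rubin(chains),
    }


# Toy stand-in for sampler output: four chains drawn from N(1.32, 0.35).
rng = random.Random(0)
good_chains = [[rng.gauss(1.32, 0.35) for _ in range(1000)] for _ in range(4)]
summary = summarise(good_chains)
good_rhat = summary["r_hat"]

# Shifting one chain mimics a sampler that has not converged.
bad_chains = [list(c) for c in good_chains]
bad_chains[0] = [d + 2.0 for d in bad_chains[0]]
bad_rhat = gelman_rubin(bad_chains)

print({name: round(value, 2) for name, value in summary.items()})
print("r_hat with one shifted chain:", round(bad_rhat, 2))
```

An r_hat close to 1 indicates that the chains agree with each other, while a value well above 1 suggests that the sampler should run longer or that the model needs attention.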
For more details, see the tutorial on the Bayesian SEM page.

What to cite?

For the time being, you can cite the semopy 2 paper:

@misc{semopy2,
      title={semopy 2: A Structural Equation Modeling Package with Random Effects in Python}, 
      author={Georgy Meshcheryakov and Anna A. Igolkina and Maria G. Samsonova},
      year={2021},
      eprint={2106.01140},
      archivePrefix={arXiv},
      primaryClass={stat.AP}
}