SEMBA


Priors and syntax

This tutorial assumes that a reader is familiar with core semopy concepts, such as syntax and models. If that's not the case, read Introduction and Syntax sections of the semopy tutorial. semba uses exactly the same syntax as semopy for specifying SEM models, however there are new operations that are used to impose custom priors on parameters.

There are default priors for loadings and (co)variances that are assumed automatically. Namely,

Regression coefficients
    \( \mathcal{N} (x_0 ,0.75^2)\) if \(x_0 \neq 0\), else \( \mathcal{N} (0 , 1.5^2)\)
Variance of observed variables
\(\mathcal{HN} (x_0^2 {\frac{\pi}{2}})\)
Covariance of observed variables
    \(\mathcal{N} (x_0, 1 )\)
Variance of latent variables
\(\mathcal{HN} ({\frac{\pi}{2}})\)
Covariance if one variable is latent
    \(\mathcal{N} (0, 1 )\)
Random effects variance
\(\mathcal{HN} ({\frac{\pi}{2^3}})\)
Random effects covariance
    \(\mathcal{N} (0, 0.5^2 )\)
where \(x_0\) is a starting value and \(\mathcal{HN}\) is a half-normal distribution. The latter two are applicable only for ModelEffects.

To override default priors and to impose a custom priot on a specific parameter, a following set of operations can be used:

PRIOR(dist, args)
Imposes a custom prior on parameters that are listed after operation. For example,
y ~ a*x1 + b*x2 + c*x3
PRIOR(Normal, loc=0, scale=2) a b
PRIOR(Exponential, rate=0.5) c
PRIOR_LOADING(dist, args)
Overrides default prior for regression coefficients. An example:
PRIOR_LOADING(Normal, loc=0, scale=1)
PRIOR_VARIANCE(dist, args)
Overrides default prior for variances of observed variables. An example:
PRIOR_VARIANCE(HalfCauchy, scale=1)
PRIOR_COVARIANCE(dist, args)
Overrides default prior for covariances of observed variables.
PRIOR_VARIANCE_LATENT(dist, args)
Overrides default prior for variances of latent variables.
PRIOR_COVARIANCE_LATENT(dist, args)
Overrides default prior for covariances of latent variables.
PRIOR_VARIANCE_RF(dist, args)
Overrides default prior for variances of random effects.
PRIOR_COVARIANCE_RF(dist, args)
Overrides default prior for covariances of random effects.

Any Numpyro distribution can be passed to the operations above. For a detailed list with possible parameters, see official Numpyro documentation here.

Fitting model to data

Like in semopy, in semba one first choses a model from the list:
Model
Conventional SEM model.
ModelMeans
SEM model with a fixed effects part that can be used to model intercepts.
ModelEffects
SEM model with random effects distributed with a static covariance matrix.
(soon) ModelGeneralizedEffects
SEM model with random effects distributed with a parametrised covariance matrix, aka. a Gaussian Process SEM.
then fits it to a data by means of the fit method. See the example below where we first export the Political Democracy dataset from semopy and then we fit it with semba's model:
from semopy.examples import political_democracy as ex
import semba

desc, data = ex.get_model(), ex.get_data()
model = semba.Model(desc)
model.fit(data)
ins = model.inspect()
print(ins)
This produces an output:
       lval  op   rval   Estimate       5.0%      95.0%        std      n_eff      r_hat
_b12  dem60   ~  ind60      -1.57      -2.37      -0.88       0.47   1,225.98       1.00
_b13  dem65   ~  ind60      -0.65      -1.15      -0.15       0.32   1,035.10       1.00
_b14  dem65   ~  dem60       0.91       0.71       1.08       0.11     612.31       1.00
_b1      x1   ~  ind60       1.00          -          -          -          -          -
_b2      x2   ~  ind60      -2.48      -2.99      -1.94       0.32     867.74       1.00
_b3      x3   ~  ind60      -2.28      -2.77      -1.79       0.30     696.56       1.00
_b4      y1   ~  dem60       1.00          -          -          -          -          -
_b5      y2   ~  dem60       1.41       1.07       1.75       0.20     818.40       1.00
_b6      y3   ~  dem60       1.15       0.88       1.41       0.16     915.11       1.00
_b7      y4   ~  dem60       1.42       1.14       1.67       0.17     514.43       1.00
_b8      y5   ~  dem65       1.00          -          -          -          -          -
_b9      y6   ~  dem65       1.22       0.96       1.48       0.16     856.59       1.00
_b10     y7   ~  dem65       1.30       1.07       1.57       0.15     869.51       1.00
_b11     y8   ~  dem65       1.29       1.03       1.54       0.16     800.41       1.00
_c15  dem60  ~~  dem60       2.99       2.11       3.99       0.58     765.03       1.00
_c17  dem65  ~~  dem65       0.33       0.01       0.64       0.22     725.92       1.00
_c20  ind60  ~~  ind60       0.30       0.17       0.42       0.08     731.05       1.00
_c1      y1  ~~     y5       0.70       0.13       1.23       0.34     872.95       1.00
_c2      y1  ~~     y1       2.23       1.43       2.92       0.47     874.95       1.00
_c3      y2  ~~     y4       0.79      -0.10       1.57       0.52     685.30       1.00
_c4      y2  ~~     y6       1.42       0.53       2.14       0.49   1,343.32       1.00
_c5      y2  ~~     y2       6.47       4.72       8.12       1.02   1,227.05       1.00
_c6      y3  ~~     y7       0.72      -0.12       1.50       0.51   1,285.40       1.00
_c7      y3  ~~     y3       5.10       3.59       6.38       0.86   1,342.31       1.00
_c8      y4  ~~     y8       0.35      -0.25       1.09       0.40   1,043.56       1.00
_c9      y4  ~~     y4       3.03       1.87       4.17       0.70     675.44       1.00
_c10     y6  ~~     y8       1.04       0.37       1.84       0.47     976.38       1.00
_c11     y6  ~~     y6       4.52       3.24       5.73       0.77     914.27       1.00
_c12     x2  ~~     x2       0.38       0.07       0.65       0.18     430.96       1.00
_c13     y7  ~~     y7       3.44       2.38       4.45       0.66   1,143.13       1.00
_c14     x1  ~~     x1       1.35       0.99       1.67       0.21   1,372.02       1.00
_c16     x3  ~~     x3       0.39       0.15       0.66       0.16     481.36       1.00
_c18     y5  ~~     y5       2.56       1.71       3.36       0.51   1,020.10       1.00
_c19     y8  ~~     y8       3.21       2.32       4.31       0.62     792.91       1.00

Here, "Estimate" is a mean across chain values after burn-in samples, "5.0%" and "95.0%" are confidence intervals, "std" is standard deviation of estimates, "n_eff" is an effective number of samples and "r_hat" is a Rubin-Gelman statistics that quantifies a divergence between different chains. Values of "r_hat" that are strongly different from 1.00 may indicate a presense of local minima and/or that a model is overidentified. In the example above, we've used the default number of chains, which is 1, and hence it is not informative.

Hyperparameters of interest that can be passed to fit are:

num_samples
Number of MCMC samples. The default value is number of parameters times 30. Note that this is a very generous default value, usually it is suffice a much smaller value, so if a performance of concern, consider setting a custom smaller value.
num_warmup
Number of warmup/burn-in samples. The default values is num_samples/5.
num_chains
Number of MCMC chains. The default is 1.
solver
Name of solver. At the moment, it is reserved for MCMC kernels. All kernels that are available in Numpyro can be utilised. The default is "NUTS".
seed
Seed for pseudo random value generator. The default is 0.

Multiple chains behavior

Actually, there is an extra argument to the fit method, chains_mode, that is set to "best". This makes semba to use only a single best (in terms of likelihood) chain for parameter estimation. Such behavior handles a problem of local minima to a much greater extent than averaging over all chains. If you still want to join all chains, pass "mean" instead.

For example, the article_example with default priors and seed value exhibits local minimum that is far off from "true" parameter estimates. This is effectively solved by introducing extra chains:

from semopy.examples import example_article as ex
from semopy.utils import compare_results
from semba import Model
import numpy as np
desc, data = ex.get_model(), ex.get_data()
true = ex.get_params()
m = Model(desc)
r = m.fit(data, num_chains=5)
print('{:.2f}%'.format(np.mean(compare_results(m, true)) * 100))
Mean absolute percentage error is:
18.79%

Limitations

We aim to make all features of semopy available to semba, however some things are offl-limits due to the bayesian nature of the package, namely:

  1. Different estimators such as ULW, WLS, GLS, etc.
  2. Although conventional SEM fit indices can be computed via calc_stats function, but they have no theoretical grounding.
  3. Bounds and constraints.