Fitting distributions to data

This tutorial explains how to fit the distributions provided by the projnormal package to data. The classes and methods used for fitting the distributions data are available in projnormal.models.

We’ll demonstrate fitting two different projected normal distributions:

The standard projected normal distribution (ProjNormal).
The projected normal distribution with a constant term (ProjNormalConst).

Fitting the Projected Normal Distribution

Let’s start by defining some (true) parameters for the distribution, and generating synthetic data that we can then use to fit the distribution.

import projnormal
import torch

# Distribution parameters
N_DIM = 3
mean_x = projnormal.param_sampling.make_mean(N_DIM)
cov_x = projnormal.param_sampling.make_spdm(N_DIM)

# Initialize the projected normal distribution class for sampling
pn_sample = projnormal.classes.ProjNormal(
    n_dim=N_DIM,
    mean_x=mean_x,
    covariance_x=cov_x,
)

# Generate samples
samples = pn_sample.sample(2000)

Next, we initialize a new instance of the ProjNormal class, but without the true parameters. This instance will be used to fit the model to the generated samples via Maximum Likelihood Estimation (MLE).

# Fit the model using Maximum Likelihood Estimation (MLE)
pn_fit_mle = projnormal.classes.ProjNormal(n_dim=N_DIM)
pn_fit_mle.max_likelihood(samples, show_progress=False)

# Print the fitted parameters
print("True mean:\n", mean_x.numpy())
print("Fitted mean (MLE):\n", pn_fit_mle.mean_x.detach().numpy())

print("True covariance:\n", cov_x.numpy())
print("Fitted covariance (MLE):\n", pn_fit_mle.covariance_x.detach().numpy())

True mean:
 [0.8175252  0.5126397  0.26239854]
Fitted mean (MLE):
 [0.8300413  0.52009124 0.20133701]
True covariance:
 [[0.77011466 0.0586206  0.15711969]
 [0.05862059 1.0186366  0.4096994 ]
 [0.15711969 0.4096994  1.2112486 ]]
Fitted covariance (MLE):
 [[0.897333   0.07702081 0.20384723]
 [0.07702082 1.159111   0.49285483]
 [0.20384721 0.49285477 1.3652866 ]]

Now, we can also fit the model using Moment Matching, which uses the sample moments to estimate the parameters of the distribution.

# Fit the model using Moment Matching
mean_data = torch.mean(samples, dim=0)
cov_data = torch.cov(samples.T)
data_moments = {'mean': mean_data, 'covariance': cov_data}

# Fit the model using Moment Matching
pn_fit_mm = projnormal.classes.ProjNormal(n_dim=N_DIM)
pn_fit_mm.moment_match(data_moments=data_moments, show_progress=False)

# Print the fitted parameters
print("Fitted mean (Moment Matching):\n", pn_fit_mm.mean_x.detach().numpy())
print("Fitted covariance (Moment Matching):\n", pn_fit_mm.covariance_x.detach().numpy())

Fitted mean (Moment Matching):
 [0.7100353  0.5894815  0.38517708]
Fitted covariance (Moment Matching):
 [[0.73495376 0.16749287 0.23493129]
 [0.1674929  1.1156119  0.6227633 ]
 [0.23493129 0.62276334 1.3276391 ]]

Fitting the Projected Normal Distribution with a Constant Term

For illustration purposes, the example code below shows how to fit one of the variants of the projected normal distribution, that includes a constant term in the denominator.

# Define the constant term
const = torch.tensor(3.0)

# Initialize the projected normal distribution with constant term for sampling
pnc_sample = projnormal.classes.ProjNormalConst(
    n_dim=N_DIM,
    mean_x=mean_x,
    covariance_x=cov_x,
    const=const,
)

# Generate samples
samples_const = pnc_sample.sample(2000)

# Fit the model using Maximum Likelihood Estimation (MLE)
pnc_fit_mle = projnormal.classes.ProjNormalConst(n_dim=N_DIM)
pnc_fit_mle.max_likelihood(samples_const, show_progress=False)

# Print the fitted parameters
print("True mean:\n", mean_x.numpy())
print("Fitted mean (MLE):\n", pnc_fit_mle.mean_x.detach().numpy())

print("True covariance:\n", cov_x.numpy())
print("Fitted covariance (MLE):\n", pnc_fit_mle.covariance_x.detach().numpy())

print("True constant:\n", const.numpy())
print("Fitted constant (MLE):\n", pnc_fit_mle.const.detach().numpy())

True mean:
 [0.8175252  0.5126397  0.26239854]
Fitted mean (MLE):
 [0.7861881  0.52978915 0.3181694 ]
True covariance:
 [[0.77011466 0.0586206  0.15711969]
 [0.05862059 1.0186366  0.4096994 ]
 [0.15711969 0.4096994  1.2112486 ]]
Fitted covariance (MLE):
 [[0.6466029  0.06528498 0.13363323]
 [0.06528497 0.910243   0.3763635 ]
 [0.1336332  0.37636352 0.9838667 ]]
True constant:
 3.0
Fitted constant (MLE):
 2.4810271

Finally, we can fit the ProjNormalConst model using Moment Matching as well.

# Fit the model using Moment Matching
mean_data_const = torch.mean(samples_const, dim=0)
cov_data_const = torch.cov(samples_const.T)
data_moments_const = {'mean': mean_data_const, 'covariance': cov_data_const}

# Fit the model using Moment Matching
pnc_fit_mm = projnormal.classes.ProjNormalConst(n_dim=N_DIM)
pnc_fit_mm.moment_match(data_moments=data_moments_const, show_progress=False)

# Print the fitted parameters
print("Fitted mean (Moment Matching):\n", pnc_fit_mm.mean_x.detach().numpy())
print("Fitted covariance (Moment Matching):\n", pnc_fit_mm.covariance_x.detach().numpy())
print("Fitted constant (Moment Matching):\n", pnc_fit_mm.const.detach().numpy())

Fitted mean (Moment Matching):
 [0.73310655 0.56390226 0.38022247]
Fitted covariance (Moment Matching):
 [[0.6126298  0.175609   0.24141623]
 [0.17560898 1.0007228  0.56634337]
 [0.24141625 0.5663434  1.1068846 ]]
Fitted constant (Moment Matching):
 2.2650936