EDAspy.optimization.custom.probabilistic_models package
Submodules
EDAspy.optimization.custom.probabilistic_models.adaptive_univariate_gaussian module
- class EDAspy.optimization.custom.probabilistic_models.adaptive_univariate_gaussian.AdaptUniGauss(variables: list, lower_bound: float, alpha: float = 0.5)[source]
Bases:
ProbabilisticModel
This class implements adaptive univariate Gaussians. With this implementation, N univariate Gaussians are updated in each iteration. When a dataset is given, each column is updated independently. The implementation involves a matrix with two rows, in which the first row contains the means and the second the standard deviations. Each Gaussian mean is updated as follows, where the two best individuals and the worst one are considered.
\[\mu_{l+1} = (1 - \alpha) \mu_l + \alpha (x^{best, 1}_l + x^{best, 2}_l - x^{worst}_l)\]
- sample(size: int) array [source]
Samples new solutions from the probabilistic model. In each solution, each variable is sampled from its respective normal distribution.
- Parameters:
size – number of samplings of the probabilistic model.
- Returns:
array with the dataset sampled
- Return type:
np.array
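The mean-update rule above can be sketched in plain NumPy. This is a stand-alone illustration of the update equation, with hypothetical function and variable names; it is not EDAspy's internal code:

```python
import numpy as np

def adaptive_mean_update(mu, best1, best2, worst, alpha=0.5):
    """Update each univariate Gaussian mean using the two best
    individuals and the worst one of the current generation."""
    mu = np.asarray(mu, dtype=float)
    return (1 - alpha) * mu + alpha * (best1 + best2 - worst)

# Toy example with 3 variables
mu = np.zeros(3)
best1 = np.array([1.0, 2.0, 3.0])
best2 = np.array([0.5, 1.5, 2.5])
worst = np.array([2.0, 2.0, 2.0])
new_mu = adaptive_mean_update(mu, best1, best2, worst, alpha=0.5)
# new_mu is 0.5 * (best1 + best2 - worst) since mu starts at zero
```

With alpha = 0.5 the new mean sits halfway between the old mean and the combination of the two best individuals pushed away from the worst one.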
EDAspy.optimization.custom.probabilistic_models.discrete_bayesian_network module
- class EDAspy.optimization.custom.probabilistic_models.discrete_bayesian_network.BN(variables: list)[source]
Bases:
ProbabilisticModel
This probabilistic model is a discrete Bayesian network. This implementation uses the pgmpy library [1].
References
[1]: Ankan, A., & Panda, A. (2015). pgmpy: Probabilistic graphical models using python. In Proceedings of the 14th python in science conference (scipy 2015) (Vol. 10). Citeseer.
- learn(dataset: array, score: str = 'bicscore', *args, **kwargs)[source]
Learn a discrete Bayesian network from the dataset passed as argument.
- Parameters:
dataset – dataset from which to learn the BN.
score – score used for the score-based structure learning algorithm
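Score-based structure learning evaluates candidate networks with a decomposable score such as BIC, which is the log-likelihood of a variable's conditional probability table minus a penalty for the number of free parameters. A minimal pure-NumPy sketch of the per-variable BIC term (illustrative only; pgmpy's implementation differs in detail):

```python
import numpy as np

def bic_term(child, parent=None):
    """BIC contribution of one discrete variable: CPT log-likelihood
    minus 0.5 * (free parameters) * log(sample size)."""
    child = np.asarray(child)
    n = len(child)
    states = np.unique(child)
    if parent is None:
        counts = np.array([(child == s).sum() for s in states])
        ll = np.sum(counts * np.log(counts / n))
        n_params = len(states) - 1
    else:
        parent = np.asarray(parent)
        ll, n_params = 0.0, 0
        for p in np.unique(parent):
            mask = parent == p
            counts = np.array([(child[mask] == s).sum() for s in states])
            nz = counts[counts > 0]
            ll += np.sum(nz * np.log(nz / mask.sum()))
            n_params += len(states) - 1
    return ll - 0.5 * n_params * np.log(n)

# A strongly dependent pair: adding the arc parent -> child raises BIC
rng = np.random.default_rng(0)
parent = rng.integers(0, 2, size=500)
child = parent ^ (rng.random(500) < 0.1)  # child mostly copies parent
```

A hill-climbing search repeatedly applies the arc addition, removal, or reversal that most improves the total of these per-variable terms.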
EDAspy.optimization.custom.probabilistic_models.gaussian_bayesian_network module
- class EDAspy.optimization.custom.probabilistic_models.gaussian_bayesian_network.GBN(variables: list, white_list: list | None = None, black_list: list | None = None, evidences: dict | None = None)[source]
Bases:
ProbabilisticModel
This probabilistic model is a Gaussian Bayesian network. All relationships between the variables in the model are defined to be linear Gaussian, and the variable distributions are assumed to be Gaussian. This is a very common approach when facing continuous data, as it is relatively easy and fast to learn Gaussian distributions between variables. This implementation uses the PyBNesian library [1].
References
[1]: Atienza, D., Bielza, C., & Larrañaga, P. (2022). PyBNesian: an extensible Python package for Bayesian networks. Neurocomputing, 504, 204-209.
- learn(dataset: array, *args, **kwargs)[source]
Learn a Gaussian Bayesian network from the dataset passed as argument.
- Parameters:
dataset – dataset from which to learn the GBN.
- print_structure() list [source]
Prints the arcs between the nodes that represent the variables in the dataset. This function must be used after the learning process.
- Returns:
list of arcs between variables
- Return type:
list
- logl(data: DataFrame)[source]
Returns the log-likelihood of some data in the model.
- Parameters:
data – dataset to evaluate its likelihood in the model.
- Returns:
log-likelihood of the instances in the model.
- Return type:
np.array
- get_mu(var_mus=None) array [source]
Computes the conditional mean of the Gaussians of each node in the GBN.
- Parameters:
var_mus (list) – Variables for which to compute the Gaussian means. If None, all variables are computed.
- Returns:
Array with the conditional Gaussian means.
- Return type:
np.array
- get_sigma(var_sigma=None) array [source]
Computes the conditional covariance matrix of the model for the variables in the GBN.
- Parameters:
var_sigma (list) – Variables for which to compute the conditional covariance. If None, all variables are computed.
- Returns:
Matrix with the conditional covariance matrix.
- Return type:
np.array
- inference(evidence, var_names) -> (array, array)[source]
Computes the posterior conditional probability distribution conditioned on some given evidences.
- Parameters:
evidence (list) – list of values fixed as evidences in the model.
var_names (list) – list of variables measured in the model.
- Returns:
(posterior mean, posterior covariance matrix)
- Return type:
(np.array, np.array)
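The posterior returned by inference follows the standard Gaussian conditioning formulas: partition the joint mean and covariance into observed and unobserved blocks, then shift the mean by the evidence residual and shrink the covariance. A self-contained NumPy sketch (illustrative only; PyBNesian handles this internally, and the function name here is hypothetical):

```python
import numpy as np

def condition_gaussian(mu, sigma, evid_idx, evid_val):
    """Condition a joint Gaussian N(mu, sigma) on observed variables.

    evid_idx: indices of the observed variables
    evid_val: their observed values
    Returns the posterior mean and covariance of the remaining variables.
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    free = [i for i in range(len(mu)) if i not in evid_idx]
    s11 = sigma[np.ix_(free, free)]          # unobserved block
    s12 = sigma[np.ix_(free, evid_idx)]      # cross-covariance
    s22 = sigma[np.ix_(evid_idx, evid_idx)]  # observed block
    k = s12 @ np.linalg.inv(s22)
    post_mu = mu[free] + k @ (np.asarray(evid_val) - mu[evid_idx])
    post_sigma = s11 - k @ s12.T
    return post_mu, post_sigma

# 2-variable example: correlated pair, observe the second variable
mu = np.array([0.0, 0.0])
sigma = np.array([[1.0, 0.8],
                  [0.8, 1.0]])
post_mu, post_sigma = condition_gaussian(mu, sigma, [1], [1.0])
# post_mu -> [0.8], post_sigma -> [[0.36]]
```

Observing a strongly correlated variable both pulls the posterior mean toward the evidence and reduces the remaining variance.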
EDAspy.optimization.custom.probabilistic_models.kde_bayesian_network module
- class EDAspy.optimization.custom.probabilistic_models.kde_bayesian_network.KDEBN(variables: list, white_list: list | None = None, black_list: list | None = None)[source]
Bases:
ProbabilisticModel
This probabilistic model is a Kernel Density Estimation Bayesian network [1]. It allows dependencies between variables which have been estimated using KDE.
References
[1]: Atienza, D., Bielza, C., & Larrañaga, P. (2022). PyBNesian: an extensible Python package for Bayesian networks. Neurocomputing, 504, 204-209.
- learn(dataset: array, num_folds: int = 10, *args, **kwargs)[source]
Learn a KDE Bayesian network from the dataset passed as argument.
- Parameters:
dataset – dataset from which to learn the KDEBN.
num_folds – number of folds used for the KDEBN learning. The higher the value, the more accurate the result, but the higher the CPU demand. By default, it is set to 10.
- sample(size: int) array [source]
Samples the KDE Bayesian network as many times as defined by the user. The dataset is returned as a NumPy matrix. The sampling process is implemented using probabilistic logic sampling.
- Parameters:
size – number of samplings of the KDE Bayesian network.
- Returns:
array with the dataset sampled.
- Return type:
np.array
EDAspy.optimization.custom.probabilistic_models.multivariate_gaussian module
- class EDAspy.optimization.custom.probabilistic_models.multivariate_gaussian.MultiGauss(variables: list, lower_bound: float, upper_bound: float)[source]
Bases:
ProbabilisticModel
This class implements all the code needed to learn and sample multivariate Gaussian distributions, defined by a vector of means and a covariance matrix among the variables. This is a simpler approach compared to Gaussian Bayesian networks, as multivariate Gaussian distributions do not identify conditional dependencies between the variables.
- sample(size: int) array [source]
Samples the multivariate Gaussian distribution as many times as defined by the user. The dataset is returned as a NumPy matrix.
- Parameters:
size – number of samplings of the multivariate Gaussian distribution.
- Returns:
array with the dataset sampled.
- Return type:
np.array
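The learn-and-sample loop of a multivariate Gaussian model reduces to estimating a mean vector and covariance matrix from the selected individuals and drawing new solutions from them. A minimal NumPy sketch (variable names are illustrative, not EDAspy's API):

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy "selected individuals" dataset: 200 rows, 3 variables
data = rng.normal(loc=[1.0, -2.0, 0.5], scale=0.3, size=(200, 3))

# learn: fit the mean vector and the full covariance matrix
means = data.mean(axis=0)
cov = np.cov(data, rowvar=False)

# sample: draw new candidate solutions from the fitted distribution
new_solutions = rng.multivariate_normal(means, cov, size=50)
```

Because the full covariance matrix is estimated, sampled solutions preserve linear correlations between variables, unlike the univariate models below.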
EDAspy.optimization.custom.probabilistic_models.semiparametric_bayesian_network module
- class EDAspy.optimization.custom.probabilistic_models.semiparametric_bayesian_network.SPBN(variables: list, white_list: list | None = None, black_list: list | None = None)[source]
Bases:
ProbabilisticModel
This probabilistic model is a Semiparametric Bayesian network [1]. It allows dependencies between variables which have been estimated using KDE with variables which fit a Gaussian distribution.
References
[1]: Atienza, D., Bielza, C., & Larrañaga, P. (2022). PyBNesian: an extensible Python package for Bayesian networks. Neurocomputing, 504, 204-209.
- learn(dataset: array, num_folds: int = 10, *args, **kwargs)[source]
Learn a semiparametric Bayesian network from the dataset passed as argument.
- Parameters:
dataset – dataset from which to learn the SPBN.
num_folds – number of folds used for the SPBN learning. The higher the value, the more accurate the result, but the higher the CPU demand. By default, it is set to 10.
max_iters – maximum number of iterations for the learning process.
- print_structure() list [source]
Prints the arcs between the nodes that represent the variables in the dataset. This function must be used after the learning process.
- Returns:
list of arcs between variables
- Return type:
list
- sample(size: int) array [source]
Samples the semiparametric Bayesian network as many times as defined by the user. The dataset is returned as a NumPy matrix. The sampling process is implemented using probabilistic logic sampling.
- Parameters:
size – number of samplings of the Semiparametric Bayesian network.
- Returns:
array with the dataset sampled.
- Return type:
np.array
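Probabilistic logic sampling, used by sample above, draws each node in topological order after its parents have been drawn. A compact sketch for a two-node linear Gaussian chain A -> B (illustrative only; PyBNesian applies the same ancestral scheme to Gaussian and KDE nodes alike, and the parameters here are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

def ancestral_sample(size):
    """Sample the chain A -> B in topological order: first the
    root A, then B conditioned on the drawn values of A."""
    a = rng.normal(0.0, 1.0, size)       # root node A ~ N(0, 1)
    b = rng.normal(2.0 * a + 1.0, 0.5)   # B | A ~ N(2A + 1, 0.5)
    return np.column_stack([a, b])

samples = ancestral_sample(1000)
```

The drawn dataset reproduces the dependence encoded by the arc: regressing the second column on the first recovers a slope close to 2.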
EDAspy.optimization.custom.probabilistic_models.univariate_binary module
- class EDAspy.optimization.custom.probabilistic_models.univariate_binary.UniBin(variables: list, upper_bound: float, lower_bound: float)[source]
Bases:
ProbabilisticModel
This is the simplest probabilistic model implemented in this package. It is used for binary EDAs, where all solutions are binary. The implementation involves a vector of independent probabilities in [0, 1]. When sampling, a random float in [0, 1] is drawn for each variable; if the float is below the corresponding probability, the sample is a 1. Thus, each entry of the vector is the probability of sampling a 1 for that variable.
- sample(size: int) array [source]
Samples new solutions from the probabilistic model. In each solution, each variable is sampled from its respective binary probability.
- Parameters:
size – number of samplings of the probabilistic model.
- Returns:
array with the dataset sampled.
- Return type:
np.array
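The sampling rule described above is a one-liner in NumPy: compare uniform draws against the probability vector. A stand-alone illustration (the probability values are made up):

```python
import numpy as np

rng = np.random.default_rng(7)

probs = np.array([0.9, 0.1, 0.5])  # P(variable == 1), one per variable

def sample_binary(size):
    """Each entry is 1 when its uniform draw falls below the
    corresponding probability, 0 otherwise."""
    u = rng.random((size, len(probs)))
    return (u < probs).astype(int)

solutions = sample_binary(1000)
```

Over many samples, the column means of the result converge to the probability vector itself.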
EDAspy.optimization.custom.probabilistic_models.univariate_categorical module
- EDAspy.optimization.custom.probabilistic_models.univariate_categorical.obtain_probabilities(array) dict [source]
- class EDAspy.optimization.custom.probabilistic_models.univariate_categorical.UniCategorical(variables: list)[source]
Bases:
ProbabilisticModel
This probabilistic model is discrete and univariate.
- learn(dataset: array, *args, **kwargs)[source]
Estimates the independent categorical probability distribution for each variable.
- Parameters:
dataset – dataset from which to learn the probabilistic model.
- sample(size: int) array [source]
Samples new solutions from the probabilistic model. In each solution, each variable is sampled from its respective categorical distribution.
- Parameters:
size – number of samplings of the probabilistic model.
- Returns:
array with the dataset sampled
- Return type:
np.array
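Learning and sampling an independent categorical model amounts to computing per-column value frequencies and resampling from them. A hedged NumPy sketch with hypothetical helper names (not the package's internal code):

```python
import numpy as np

rng = np.random.default_rng(1)

def learn(dataset):
    """Estimate an independent categorical distribution per column."""
    models = []
    for col in dataset.T:
        values, counts = np.unique(col, return_counts=True)
        models.append((values, counts / counts.sum()))
    return models

def sample(models, size):
    """Draw each variable from its own categorical distribution."""
    cols = [rng.choice(values, size=size, p=probs)
            for values, probs in models]
    return np.column_stack(cols)

data = np.array([["a", "x"], ["a", "y"], ["b", "x"], ["a", "x"]])
models = learn(data)
new = sample(models, 100)
```

Each column of the sampled matrix only contains values observed in the corresponding column of the training data, with frequencies matching the estimated probabilities.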
EDAspy.optimization.custom.probabilistic_models.univariate_gaussian module
- class EDAspy.optimization.custom.probabilistic_models.univariate_gaussian.UniGauss(variables: list, lower_bound: float)[source]
Bases:
ProbabilisticModel
This class implements univariate Gaussians. With this implementation, N univariate Gaussians are updated in each iteration. When a dataset is given, each column is updated independently. The implementation involves a matrix with two rows, in which the first row contains the means and the second the standard deviations.
- sample(size: int) array [source]
Samples new solutions from the probabilistic model. In each solution, each variable is sampled from its respective normal distribution.
- Parameters:
size – number of samplings of the probabilistic model.
- Returns:
array with the dataset sampled
- Return type:
np.array
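The two-row matrix representation described above (means in row 0, standard deviations in row 1) makes learning and sampling one NumPy call each. An illustrative sketch, not EDAspy's actual code:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy selected-individuals dataset: 100 rows, 4 variables
data = rng.normal([0.0, 5.0, -1.0, 2.0], 1.0, size=(100, 4))

# learn: row 0 holds the means, row 1 the standard deviations
model = np.vstack([data.mean(axis=0), data.std(axis=0)])

# sample: draw each column from its own normal distribution
new = rng.normal(model[0], model[1], size=(50, 4))
```

Since every column is sampled independently, any correlation present in the training data is discarded; that is the trade-off against the multivariate models above.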
EDAspy.optimization.custom.probabilistic_models.univariate_kde module
- class EDAspy.optimization.custom.probabilistic_models.univariate_kde.UniKDE(variables: list)[source]
Bases:
ProbabilisticModel
This class implements univariate Kernel Density Estimation. With this implementation, N univariate KDEs are updated in each iteration. When a dataset is given, each column is updated independently.
- sample(size: int) array [source]
Samples new solutions from the probabilistic model. In each solution, each variable is sampled from its respective estimated kernel density.
- Parameters:
size – number of samplings of the probabilistic model.
- Returns:
array with the dataset sampled
- Return type:
np.array
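Sampling a Gaussian-kernel density estimate of one column can be sketched as resampling observed values and jittering them with noise scaled by a bandwidth (Scott's rule is assumed here). This mirrors the idea behind the class above, though EDAspy's exact implementation and bandwidth choice may differ:

```python
import numpy as np

rng = np.random.default_rng(5)

def kde_sample_column(column, size):
    """Sample from a Gaussian-kernel density estimate of one column:
    pick observed points at random, then jitter by the bandwidth."""
    column = np.asarray(column, dtype=float)
    # Scott's rule bandwidth for a univariate Gaussian kernel
    h = column.std(ddof=1) * len(column) ** (-1 / 5)
    picks = rng.choice(column, size=size)
    return picks + rng.normal(0.0, h, size)

data = rng.normal(0.0, 1.0, size=500)
samples = kde_sample_column(data, 200)
```

Unlike a fitted Gaussian, this scheme also reproduces multimodal or skewed column distributions, which is the point of using KDE in the first place.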