beetroots.inversion.results package

Subpackages

beetroots.inversion.results.utils package

Submodules

beetroots.inversion.results.abstract_results module

class beetroots.inversion.results.abstract_results.ResultsExtractor

Bases: ABC

extractor of the results of an inversion

abstract main(**kwargs)
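The abstract-class contract described above can be illustrated with a minimal sketch. It does not import beetroots: `ResultsExtractorSketch` and `EchoExtractor` are hypothetical stand-ins that only mirror the documented shape (an `ABC` with one abstract `main(**kwargs)` method).

```python
from abc import ABC, abstractmethod


class ResultsExtractorSketch(ABC):
    """Hypothetical stand-in for an abstract results extractor."""

    @abstractmethod
    def main(self, **kwargs):
        """Run the extraction; concrete subclasses must implement this."""


class EchoExtractor(ResultsExtractorSketch):
    """Toy subclass: returns the sorted names of the keyword arguments."""

    def main(self, **kwargs):
        return sorted(kwargs)


# The abstract base cannot be instantiated, the concrete subclass can.
extractor = EchoExtractor()
received = extractor.main(model_name="toy", plot_ESS=False)
```

Instantiating `ResultsExtractorSketch` directly raises `TypeError`, which is how the `ABC` machinery enforces that each extractor provides its own `main`.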

beetroots.inversion.results.results_mcmc module

class beetroots.inversion.results.results_mcmc.ResultsExtractorMCMC(path_data_csv_out_mcmc: str, path_img: str, path_raw: str, N_MCMC: int, T_MC: int, T_BI: int, freq_save: int, max_workers: int)

Bases: ResultsExtractor

extractor of inference results for the Markov chain data that was saved.

Parameters:

path_data_csv_out_mcmc (str) – path to the csv file in which the performance of estimators is to be saved
path_img (str) – path to the folder in which images are to be saved
path_raw (str) – path to the raw .hdf5 files
N_MCMC (int) – number of Markov chains to run per posterior distribution
T_MC (int) – total size of each Markov chain
T_BI (int) – duration of the Burn-in phase
freq_save (int) – frequency of saved iterates, 1 means that all iterates were saved (used to show correct Markov chain sizes in chain plots)
max_workers (int) – maximum number of workers that can be used for results extraction
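To make the relation between T_MC, T_BI and freq_save concrete, here is a small bookkeeping sketch. It assumes that T_MC and T_BI are both counted in sampler iterations and that one iterate is stored every freq_save iterations; `saved_chain_sizes` is a hypothetical helper, not a beetroots function.

```python
from typing import Tuple


def saved_chain_sizes(T_MC: int, T_BI: int, freq_save: int) -> Tuple[int, int]:
    """Return (saved iterates, saved iterates kept after burn-in).

    Assumption: one iterate is stored every `freq_save` iterations, and
    T_MC and T_BI are both expressed in sampler iterations.
    """
    if freq_save < 1:
        raise ValueError("freq_save must be a positive integer")
    if not 0 <= T_BI <= T_MC:
        raise ValueError("T_BI must lie between 0 and T_MC")
    n_saved = T_MC // freq_save          # iterates written to disk
    n_burn_in = T_BI // freq_save        # stored iterates inside burn-in
    return n_saved, n_saved - n_burn_in


n_saved, n_kept = saved_chain_sizes(T_MC=10_000, T_BI=1_000, freq_save=10)
```

Under these assumptions, a chain of 10 000 iterations saved every 10 iterations holds 1 000 stored iterates, of which 900 remain once the burn-in phase is discarded.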

N_MCMC

number of Markov chains to run per posterior distribution

Type: int

T_BI

duration of the Burn-in phase

Type: int

T_MC

total size of each Markov chain

Type: int

freq_save

frequency of saved iterates, 1 means that all iterates were saved (used to show correct Markov chain sizes in chain plots)

Type: int

main(posterior: Posterior, model_name: str, scaler: Scaler, list_names: List[str], list_idx_sampling: List[int], list_fixed_values: List[float | None] | ndarray, plot_1D_chains: bool, plot_2D_chains: bool, plot_ESS: bool, plot_comparisons_yspace: bool, estimator_plot: PlotsEstimator | None = None, analyze_regularization_weight: bool = False, Theta_true_scaled: ndarray | None = None, list_lines_fit: List[str] | None = None, list_lines_valid: List[str] = [], y_valid: ndarray | None = None, sigma_a_valid: ndarray | None = None, omega_valid: ndarray | None = None, sigma_m_valid: ndarray | None = None, point_challenger: Dict = {}, list_CI: List[int] = [68, 90, 95, 99])

performs the data extraction, in this order:

step 1 : clppd (see beetroots.inversion.results.utils.clppd)
step 2 : kernel analysis (see beetroots.inversion.results.utils.kernel)
step 3 : objective evolution (see beetroots.inversion.results.utils.objective) (the objective is the negative log posterior pdf)
step 4 : MAP estimator from samples (see beetroots.inversion.results.utils.lowest_obj_estimator)
step 5 : process the whole Markov chain for the MMSE and histograms, pixel by pixel to avoid overloading the memory (see beetroots.inversion.results.utils.mc)
step 6 : save global MMSE performance (see beetroots.inversion.results.utils.mmse_ci)
step 7 (if map) : plot maps of ESS (see beetroots.inversion.results.utils.ess_plots)
step 8 : model checking with Bayesian p-value computation (see beetroots.inversion.results.utils.bayes_pval_plots)
step 9 (if the true value is known): plot how many components have their true value between min and max of Markov chain (see beetroots.inversion.results.utils.valid_mc)
step 10 : plot comparison of distributions of $y_n$ and $f(\theta_n)$ for all $n \in [\![1, N]\!]$ (see beetroots.inversion.results.utils.y_f_Theta)
step 11 (if analyze_regularization_weight) : analysis of the regularization weight $\tau$ automatic tuning (see beetroots.inversion.results.utils.regularization_weights)
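Steps 4 and 5 above can be sketched on synthetic data for a single pixel. This is only an illustration under stated assumptions: the chain, the quadratic `objective` and the sizes are made up, and restricting both estimators to post-burn-in iterates is an assumption rather than documented beetroots behaviour.

```python
import random
from statistics import fmean

random.seed(0)
T_MC, T_BI, D = 200, 50, 2  # toy sizes (assumptions), one pixel only

# Synthetic stand-in for the saved iterates of one pixel: T_MC vectors of size D.
chain = [[random.gauss(1.0, 0.5) for _ in range(D)] for _ in range(T_MC)]


def objective(theta):
    """Toy negative log posterior: quadratic bowl centred on (1, ..., 1)."""
    return sum((t - 1.0) ** 2 for t in theta)


# Discard the burn-in phase before computing any estimator.
kept = chain[T_BI:]

# Step 4 (sketch): MAP approximated by the kept iterate with the lowest objective.
theta_map = min(kept, key=objective)

# Step 5 (sketch): MMSE approximated by the component-wise mean of kept iterates.
theta_mmse = [fmean(theta[d] for theta in kept) for d in range(D)]
```

Looping over pixels and handling one such chain at a time, rather than loading all N chains at once, is what keeps the memory footprint bounded in a pixel-wise approach.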

Parameters:

posterior (Posterior) – probability distribution used to generate the Markov chain(s)
model_name (str) – name of the model, used to identify the posterior distribution
scaler (Scaler) – contains the transformation of the Theta values from their natural space to their scaled space (in which the sampling happens) and its inverse
list_names (List[str]) – names of the D physical parameters to appear in plots (for instance, $P_{th}$ for thermal pressure)
list_idx_sampling (List[int]) – indices of the physical parameters that were sampled (the other ones were fixed)
list_fixed_values (List[float]) – values used for the parameters that were fixed during the sampling
plot_1D_chains (bool) – whether to plot each of the $N \times D$ 1D chains and histograms for each physical parameter $\theta_{nd}$
plot_2D_chains (bool) – whether to plot each of the $N \times D(D-1)/2$ 2D chains and histograms for pairs of parameters $(\theta_{n d_1}, \theta_{n d_2})$ with $1 \leq d_1 < d_2 \leq D$
plot_ESS (bool) – whether to plot the effective sample size (ESS) maps (only used when $N > 1$)
plot_comparisons_yspace (bool) – whether to plot comparisons of the distribution on $y_n$ and $\mathcal{A}(f(\theta_n))$ (with $\mathcal{A}$ the noise model). Offers a visualization to understand the model checking based on the Bayesian p-value [Palud et al., 2023]
estimator_plot (Optional[PlotsEstimator], optional) – object used to plot the estimator figures, by default None
analyze_regularization_weight (bool, optional) – whether to analyze the automatic tuning of the regularization weight $\tau$, by default False
Theta_true_scaled (Optional[np.ndarray], optional) – true value for the inferred physical parameter $\Theta$ (only possible for toy cases), by default None
list_lines_fit (Optional[List[str]], optional) – names of the observables used for the inversion, by default None
list_lines_valid (List[str], optional) – names of the available observables not used for the inversion (can be used for model checking), by default []
y_valid (Optional[np.ndarray], optional) – observation values for the observables not used for inversion; if provided, must have shape (N, L_valid), by default None
sigma_a_valid (Optional[np.ndarray], optional) – additive noise standard deviation values for the observables not used for inversion; if provided, must have shape (N, L_valid), by default None
omega_valid (Optional[np.ndarray], optional) – censor threshold values for the observables not used for inversion; if provided, must have shape (N, L_valid), by default None
sigma_m_valid (Optional[np.ndarray], optional) – multiplicative noise standard deviation values for the observables not used for inversion; if provided, must have shape (N, L_valid), by default None
point_challenger (Dict, optional) – other estimator that can come from the literature, provided to be compared with the inference results, by default {}
list_CI (List[int], optional) – list of credibility intervals to evaluate (in percent), by default [68, 90, 95, 99]
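As an illustration of what the levels in list_CI mean, the sketch below computes credibility intervals from the samples of one scalar parameter. It assumes equal-tailed intervals obtained from empirical percentiles, which may differ from what beetroots computes; `percentile` and `credibility_intervals` are hypothetical helpers.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def credibility_intervals(samples, list_CI=(68, 90, 95, 99)):
    """Map each level (in percent) to an equal-tailed (lower, upper) interval."""
    ordered = sorted(samples)
    intervals = {}
    for level in list_CI:
        tail = (100 - level) / 2  # probability mass left out on each side
        intervals[level] = (
            percentile(ordered, tail),
            percentile(ordered, 100 - tail),
        )
    return intervals


# Toy samples 0, 1, ..., 100 so that percentiles are easy to read.
intervals = credibility_intervals([float(v) for v in range(101)])
```

With these toy samples the 90 % interval spans 5 to 95 and the 68 % interval spans 16 to 84; each wider level contains the narrower ones.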

max_workers

maximum number of workers that can be used for results extraction

Type: int

path_data_csv_out_mcmc

path to the csv file in which the performance of estimators is to be saved

Type: str

path_img

path to the folder in which images are to be saved

Type: str

path_raw

path to the raw .hdf5 files

Type: str

beetroots.inversion.results.results_optim_map module

class beetroots.inversion.results.results_optim_map.ResultsExtractorOptimMAP(path_data_csv_out_optim_map: str, path_img: str, path_raw: str, N_MCMC: int, T_MC: int, T_BI: int, freq_save: int, max_workers: int)

Bases: ResultsExtractor

extractor of inference results from the saved data of the optimization runs.

Parameters:

path_data_csv_out_optim_map (str) – path to the csv file in which the performance of estimators is to be saved
path_img (str) – path to the folder in which images are to be saved
path_raw (str) – path to the raw .hdf5 files
N_MCMC (int) – number of optimization procedures to run per posterior distribution
T_MC (int) – total size of each optimization procedure
T_BI (int) – duration of the Burn-in phase
freq_save (int) – frequency of saved iterates, 1 means that all iterates were saved (used to show correct optimization procedure sizes in chain plots)
max_workers (int) – maximum number of workers that can be used for results extraction

N_MCMC

number of optimization procedures to run per posterior distribution

Type: int

T_BI

duration of the Burn-in phase

Type: int

T_MC

total size of each optimization procedure

Type: int

freq_save

frequency of saved iterates, 1 means that all iterates were saved (used to show correct optimization procedure sizes in chain plots)

Type: int

main(posterior: Posterior, model_name: str, scaler: Scaler, list_idx_sampling: List[int], list_fixed_values: List[float | None] | ndarray, estimator_plot: PlotsEstimator | None, Theta_true_scaled: ndarray | None = None)

performs the data extraction, in this order:

step 1 : clppd
step 2 : kernel analysis
step 3 : objective evolution
step 4 : MAP estimator from samples
step 5 : model checking with Bayesian p-value
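The "objective evolution" step can be pictured as tracking the best objective value reached so far along one optimization run. The values below are made up for illustration, and the variable names are hypothetical rather than taken from beetroots.

```python
from itertools import accumulate

# Made-up negative log posterior values along one optimization run.
objective_values = [12.0, 9.5, 10.1, 7.3, 7.9, 6.8]

# Running minimum: the best objective reached up to each saved iterate.
best_so_far = list(accumulate(objective_values, min))

# Index of the iterate that would be retained as the MAP estimate.
idx_map = min(range(len(objective_values)), key=objective_values.__getitem__)
```

Plotting `best_so_far` against the iterate index gives a non-increasing curve, and a plateau at the end is a common informal sign that the run has stopped improving.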

Parameters:

posterior (Posterior) – probability distribution. The goal of the optimization procedure was to find its mode, i.e., the minimum of its negative log pdf.
model_name (str) – name of the model, used to identify the posterior distribution
scaler (Scaler) – contains the transformation of the Theta values from their natural space to their scaled space (in which the sampling happens) and its inverse
list_idx_sampling (List[int]) – indices of the physical parameters that were sampled (the other ones were fixed)
list_fixed_values (List[float]) – values used for the parameters that were fixed during the sampling
estimator_plot (PlotsEstimator) – object used to plot the estimator figures
Theta_true_scaled (Optional[np.ndarray], optional) – true value for the inferred physical parameter $\Theta$ (only possible for toy cases), by default None
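The interplay between list_idx_sampling and list_fixed_values can be sketched as follows. The convention that a `None` entry in list_fixed_values marks a free parameter is an assumption suggested by the `List[float | None]` annotation, and `assemble_theta` is a hypothetical helper, not part of beetroots.

```python
def assemble_theta(free_values, list_idx_sampling, list_fixed_values):
    """Merge free and fixed parameter values into one full vector of size D.

    Assumption: list_fixed_values has one entry per physical parameter,
    with None at the positions listed in list_idx_sampling.
    """
    if len(free_values) != len(list_idx_sampling):
        raise ValueError("one value is needed per free parameter index")
    theta = list(list_fixed_values)
    for value, idx in zip(free_values, list_idx_sampling):
        theta[idx] = value
    if any(v is None for v in theta):
        raise ValueError("some parameters are neither fixed nor free")
    return theta


# Three physical parameters: the second is fixed to 5.0, the others are free.
theta_full = assemble_theta([0.3, 2.0], [0, 2], [None, 5.0, None])
```

Here `theta_full` is `[0.3, 5.0, 2.0]`: the free values land at indices 0 and 2, and the fixed value stays at index 1.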

max_workers

maximum number of workers that can be used for results extraction

Type: int

path_data_csv_out_optim_map

path to the csv file in which the performance of estimators is to be saved

Type: str

path_img

path to the folder in which images are to be saved

Type: str

path_raw

path to the raw .hdf5 files

Type: str

classmethod read_estimator(path_data_csv_out_optim_map: str, model_name: str) → Tuple[ndarray, DataFrame]

reads the value of an already estimated MAP from a csv file.

Parameters:

path_data_csv_out_optim_map (str) – path to the csv file containing an already estimated MAP
model_name (str) – name of the model, used to identify the posterior distribution

Returns:

np.ndarray – MAP estimator
pd.DataFrame – original DataFrame read from the csv file
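The general idea of reading a stored MAP back from a csv file can be sketched with the standard library alone. Everything about the file layout here is an assumption: the column names `model_name`, `n`, `d` and `Theta_MAP` are hypothetical, the real file may be organized differently, and the actual method returns a NumPy array and a pandas DataFrame rather than plain lists.

```python
import csv
import io

# Hypothetical file content; the real column names are not documented here.
csv_text = """model_name,n,d,Theta_MAP
model_a,0,0,1.5
model_a,0,1,-0.2
model_b,0,0,3.0
"""


def read_estimator_sketch(file_obj, model_name):
    """Return (N x D nested list of MAP values, all rows of the table)."""
    all_rows = list(csv.DictReader(file_obj))
    rows = [r for r in all_rows if r["model_name"] == model_name]
    if not rows:
        raise ValueError(f"no estimate stored for model {model_name!r}")
    N = 1 + max(int(r["n"]) for r in rows)  # number of pixels
    D = 1 + max(int(r["d"]) for r in rows)  # number of physical parameters
    theta_map = [[None] * D for _ in range(N)]
    for r in rows:
        theta_map[int(r["n"])][int(r["d"])] = float(r["Theta_MAP"])
    return theta_map, all_rows


theta_map, table = read_estimator_sketch(io.StringIO(csv_text), "model_a")
```

Filtering on the model name before reshaping mirrors the documented role of model_name, which identifies the posterior distribution whose estimate is requested.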
