beetroots.inversion.results package

Subpackages

Submodules

beetroots.inversion.results.abstract_results module

class beetroots.inversion.results.abstract_results.ResultsExtractor

Bases: ABC

extractor of the results of an inversion

abstract main(**kwargs)
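As an illustration of the contract implied by the abstract main method, here is a minimal sketch of how a concrete extractor could subclass an abstract base. The classes ResultsExtractorSketch and KeywordListingExtractor are hypothetical stand-ins written for this example; they are not part of beetroots.

```python
from abc import ABC, abstractmethod


class ResultsExtractorSketch(ABC):
    """Hypothetical stand-in mirroring the documented interface (not the library class)."""

    @abstractmethod
    def main(self, **kwargs):
        """Run the extraction; concrete subclasses must override this."""


class KeywordListingExtractor(ResultsExtractorSketch):
    """Toy subclass: returns the sorted names of the keyword arguments it received."""

    def main(self, **kwargs):
        return sorted(kwargs)


extractor = KeywordListingExtractor()
received = extractor.main(model_name="toy", T_BI=10)
```

Instantiating the abstract stand-in directly raises a TypeError, which is the usual way an ABC enforces that subclasses provide main.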

beetroots.inversion.results.results_mcmc module

class beetroots.inversion.results.results_mcmc.ResultsExtractorMCMC(path_data_csv_out_mcmc: str, path_img: str, path_raw: str, N_MCMC: int, T_MC: int, T_BI: int, freq_save: int, max_workers: int)

Bases: ResultsExtractor

extractor of inference results for the Markov chain data that was saved.

Parameters:
  • path_data_csv_out_mcmc (str) – path to the csv file in which the performance of estimators is to be saved

  • path_img (str) – path to the folder in which images are to be saved

  • path_raw (str) – path to the raw .hdf5 files

  • N_MCMC (int) – number of Markov chains to run per posterior distribution

  • T_MC (int) – total size of each Markov chain

  • T_BI (int) – duration of the Burn-in phase

  • freq_save (int) – frequency of saved iterates, 1 means that all iterates were saved (used to show correct Markov chain sizes in chain plots)

  • max_workers (int) – maximum number of workers that can be used for results extraction
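The interplay between T_MC, T_BI and freq_save can be sketched as follows. This is only an illustration under stated assumptions: it assumes that T_MC and T_BI are counted in sampler iterations and that the saved iterates are those at iterations 0, freq_save, 2 * freq_save, and so on. The helper saved_iterate_axis is hypothetical and is not a beetroots function.

```python
def saved_iterate_axis(T_MC: int, T_BI: int, freq_save: int):
    """Return the sampler iteration of each saved iterate and the position
    of the first saved iterate located after the burn-in phase.

    Assumption (not confirmed by the documentation): iterates are saved at
    iterations 0, freq_save, 2 * freq_save, ...
    """
    if freq_save < 1:
        raise ValueError("freq_save must be a positive integer")
    iterations = list(range(0, T_MC, freq_save))
    # position of the first saved iterate whose iteration index is >= T_BI
    first_kept = next(
        (i for i, t in enumerate(iterations) if t >= T_BI),
        len(iterations),
    )
    return iterations, first_kept


iterations, first_kept = saved_iterate_axis(T_MC=100, T_BI=20, freq_save=10)
```

With freq_save = 1 every iteration appears on the axis, which matches the documented meaning of that value.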

N_MCMC

number of Markov chains to run per posterior distribution

Type:

int

T_BI

duration of the Burn-in phase

Type:

int

T_MC

total size of each Markov chain

Type:

int

freq_save

frequency of saved iterates, 1 means that all iterates were saved (used to show correct Markov chain sizes in chain plots)

Type:

int

main(posterior: Posterior, model_name: str, scaler: Scaler, list_names: List[str], list_idx_sampling: List[int], list_fixed_values: List[float | None] | ndarray, plot_1D_chains: bool, plot_2D_chains: bool, plot_ESS: bool, plot_comparisons_yspace: bool, estimator_plot: PlotsEstimator | None = None, analyze_regularization_weight: bool = False, Theta_true_scaled: ndarray | None = None, list_lines_fit: List[str] | None = None, list_lines_valid: List[str] = [], y_valid: ndarray | None = None, sigma_a_valid: ndarray | None = None, omega_valid: ndarray | None = None, sigma_m_valid: ndarray | None = None, point_challenger: Dict = {}, list_CI: List[int] = [68, 90, 95, 99])

performs the data extraction, in this order:

  • step 1 : clppd (see beetroots.inversion.results.utils.clppd)

  • step 2 : kernel analysis (see beetroots.inversion.results.utils.kernel)

  • step 3 : objective evolution (see beetroots.inversion.results.utils.objective) (the objective is the negative log posterior pdf)

  • step 4 : MAP estimator from samples (see beetroots.inversion.results.utils.lowest_obj_estimator)

  • step 5 : deal with whole Markov chain for MMSE and histograms, in a pixel-wise approach to avoid overloading the memory (see beetroots.inversion.results.utils.mc)

  • step 6 : save global MMSE performance (see beetroots.inversion.results.utils.mmse_ci)

  • step 7 (if the observation is a map, i.e., \(N > 1\)) : plot maps of ESS (see beetroots.inversion.results.utils.ess_plots)

  • step 8 : model checking with Bayesian p-value computation (see beetroots.inversion.results.utils.bayes_pval_plots)

  • step 9 (if the true value is known): plot how many components have their true value between min and max of Markov chain (see beetroots.inversion.results.utils.valid_mc)

  • step 10 : plot comparison of distributions of \(y_n\) and \(f(\theta_n)\) for all \(n \in [\![1, N]\!]\) (see beetroots.inversion.results.utils.y_f_Theta)

  • step 11 (if analyze_regularization_weight) : analysis of the regularization weight \(\tau\) automatic tuning (see beetroots.inversion.results.utils.regularization_weights)
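To make steps 5 and 6 more concrete, here is a minimal sketch of how an MMSE estimate and credibility intervals could be computed from a single 1D chain. It assumes that the MMSE is approximated by the mean of the iterates after burn-in and that the intervals are equal-tailed; the library may proceed differently. The helpers percentile and mmse_and_intervals are hypothetical, and plain Python is used to keep the sketch dependency-free.

```python
from statistics import fmean


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of an already sorted list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def mmse_and_intervals(chain, T_BI, list_CI=(68, 90, 95, 99)):
    """Posterior mean and equal-tailed credibility intervals after burn-in.

    Assumption: the first T_BI entries of `chain` are burn-in iterates.
    """
    kept = sorted(chain[T_BI:])
    mmse = fmean(kept)
    intervals = {}
    for level in list_CI:
        tail = (100 - level) / 2  # probability mass left out on each side, in percent
        intervals[level] = (percentile(kept, tail), percentile(kept, 100 - tail))
    return mmse, intervals


# toy chain: two burn-in iterates followed by the values 0, 1, ..., 100
chain = [50.0, 40.0] + [float(v) for v in range(101)]
mmse, intervals = mmse_and_intervals(chain, T_BI=2)
```

Applying such a function to one component \(\theta_{nd}\) at a time is one way to follow the pixel-wise approach mentioned in step 5, since only one chain needs to be held in memory at once.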

Parameters:
  • posterior (Posterior) – probability distribution used to generate the Markov chain(s)

  • model_name (str) – name of the model, used to identify the posterior distribution

  • scaler (Scaler) – contains the transformation of the Theta values from their natural space to their scaled space (in which the sampling happens) and its inverse

  • list_names (List[str]) – names of the D physical parameters to appear in plots (for instance, $P_{th}$ for thermal pressure)

  • list_idx_sampling (List[int]) – indices of the physical parameters that were sampled (the other ones were fixed)

  • list_fixed_values (List[float]) – list of used values for the parameters fixed during the sampling

  • plot_1D_chains (bool) – whether to plot each of the \(N \times D\) 1D chains and histograms for each physical parameter \(\theta_{nd}\)

  • plot_2D_chains (bool) – whether to plot each of the \(N \times D (D-1) / 2\) 2D chains and histograms for pairs of parameters \((\theta_{n d_1}, \theta_{n d_2})\) with \(1 \leq d_1 < d_2 \leq D\)

  • plot_ESS (bool) – whether to plot the effective sample size (ESS) maps (only used when \(N > 1\))

  • plot_comparisons_yspace (bool) – whether to plot comparisons of the distribution on \(y_n\) and \(\mathcal{A}(f(\theta_n))\) (with \(\mathcal{A}\) the noise model). Offers a visualization to understand the model checking based on the Bayesian p-value [Palud et al., 2023]

  • estimator_plot (Optional[PlotsEstimator], optional) – object used to plot the estimator figures, by default None

  • analyze_regularization_weight (bool, optional) – whether to analyze the evaluation of the regularization weight \(\tau\), by default False

  • Theta_true_scaled (Optional[np.ndarray], optional) – true value for the inferred physical parameter \(\Theta\) (only possible for toy cases), by default None

  • list_lines_fit (Optional[List[str]], optional) – names of the observables used for the inversion, by default None

  • list_lines_valid (List[str], optional) – names of the available observables not used for the inversion (can be used for model checking), by default []

  • y_valid (Optional[np.ndarray], optional) – observation values for the observables not used for inversion. If provided, must have shape (N, L_valid), by default None

  • sigma_a_valid (Optional[np.ndarray], optional) – additive noise standard deviation values for the observables not used for inversion. If provided, must have shape (N, L_valid), by default None

  • omega_valid (Optional[np.ndarray], optional) – censor threshold values for the observables not used for inversion. If provided, must have shape (N, L_valid), by default None

  • sigma_m_valid (Optional[np.ndarray], optional) – multiplicative noise standard deviation values for the observables not used for inversion. If provided, must have shape (N, L_valid), by default None

  • point_challenger (Dict, optional) – other estimator that can come from the literature, provided to be compared with the inference results, by default {}

  • list_CI (List[int], optional) – list of credibility intervals to evaluate (in percent), by default [68, 90, 95, 99]
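The roles of list_idx_sampling and list_fixed_values can be illustrated with a short sketch that rebuilds a full parameter vector from the sampled components. It assumes that list_fixed_values has one entry per physical parameter, with None at the positions that were sampled; the List[float | None] annotation suggests this layout but the documentation does not confirm it. The helper full_theta is hypothetical.

```python
def full_theta(sampled, list_idx_sampling, list_fixed_values):
    """Merge sampled values with fixed values into one vector of length D.

    Assumption: list_fixed_values has length D and holds None wherever the
    parameter was sampled rather than fixed.
    """
    theta = list(list_fixed_values)
    for value, idx in zip(sampled, list_idx_sampling):
        theta[idx] = value
    if any(v is None for v in theta):
        raise ValueError("some parameters are neither sampled nor fixed")
    return theta


# D = 3 parameters: indices 0 and 2 were sampled, index 1 was fixed to 7.0
theta = full_theta([1.5, 2.5], [0, 2], [None, 7.0, None])
```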

max_workers

maximum number of workers that can be used for results extraction

Type:

int

path_data_csv_out_mcmc

path to the csv file in which the performance of estimators is to be saved

Type:

str

path_img

path to the folder in which images are to be saved

Type:

str

path_raw

path to the raw .hdf5 files

Type:

str

beetroots.inversion.results.results_optim_map module

class beetroots.inversion.results.results_optim_map.ResultsExtractorOptimMAP(path_data_csv_out_optim_map: str, path_img: str, path_raw: str, N_MCMC: int, T_MC: int, T_BI: int, freq_save: int, max_workers: int)

Bases: ResultsExtractor

extractor of inference results for the data of the optimization runs that was saved.

Parameters:
  • path_data_csv_out_optim_map (str) – path to the csv file in which the performance of estimators is to be saved

  • path_img (str) – path to the folder in which images are to be saved

  • path_raw (str) – path to the raw .hdf5 files

  • N_MCMC (int) – number of optimization procedures to run per posterior distribution

  • T_MC (int) – total size of each optimization procedure

  • T_BI (int) – duration of the Burn-in phase

  • freq_save (int) – frequency of saved iterates, 1 means that all iterates were saved (used to show correct optimization procedure sizes in chain plots)

  • max_workers (int) – maximum number of workers that can be used for results extraction

N_MCMC

number of optimization procedures to run per posterior distribution

Type:

int

T_BI

duration of the Burn-in phase

Type:

int

T_MC

total size of each optimization procedure

Type:

int

freq_save

frequency of saved iterates, 1 means that all iterates were saved (used to show correct optimization procedure sizes in chain plots)

Type:

int

main(posterior: Posterior, model_name: str, scaler: Scaler, list_idx_sampling: List[int], list_fixed_values: List[float | None] | ndarray, estimator_plot: PlotsEstimator | None, Theta_true_scaled: ndarray | None = None)

performs the data extraction, in this order:

  • step 1 : clppd

  • step 2 : kernel analysis

  • step 3 : objective evolution

  • step 4 : MAP estimator from samples

  • step 5 : model checking with Bayesian p-value
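Step 4 can be pictured as selecting the saved iterate with the lowest objective value, the objective being the negative log posterior pdf. The following is a minimal sketch of that idea only; lowest_objective_iterate is a hypothetical helper and the actual library routine may differ.

```python
def lowest_objective_iterate(iterates, objectives):
    """Return the iterate with the smallest objective, and that objective.

    `iterates` and `objectives` are parallel sequences: objectives[t] is the
    negative log posterior pdf evaluated at iterates[t].
    """
    if len(iterates) != len(objectives) or not objectives:
        raise ValueError("need two non-empty sequences of equal length")
    best = min(range(len(objectives)), key=objectives.__getitem__)
    return iterates[best], objectives[best]


iterates = [[0.0, 1.0], [0.5, 0.8], [0.4, 0.9]]
objectives = [12.3, 4.1, 5.6]
theta_map, objective_map = lowest_objective_iterate(iterates, objectives)
```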

Parameters:
  • posterior (Posterior) – probability distribution. The goal of the optimization procedure was to find its mode, i.e., the minimum of its negative log pdf.

  • model_name (str) – name of the model, used to identify the posterior distribution

  • scaler (Scaler) – contains the transformation of the Theta values from their natural space to their scaled space (in which the sampling happens) and its inverse

  • list_idx_sampling (List[int]) – indices of the physical parameters that were sampled (the other ones were fixed)

  • list_fixed_values (List[float]) – list of used values for the parameters fixed during the sampling

  • estimator_plot (PlotsEstimator) – object used to plot the estimator figures

  • Theta_true_scaled (Optional[np.ndarray], optional) – true value for the inferred physical parameter \(\Theta\) (only possible for toy cases), by default None

max_workers

maximum number of workers that can be used for results extraction

Type:

int

path_data_csv_out_optim_map

path to the csv file in which the performance of estimators is to be saved

Type:

str

path_img

path to the folder in which images are to be saved

Type:

str

path_raw

path to the raw .hdf5 files

Type:

str

classmethod read_estimator(path_data_csv_out_optim_map: str, model_name: str) → Tuple[ndarray, DataFrame]

reads the value of an already estimated MAP from a csv file.

Parameters:
  • path_data_csv_out_optim_map (str) – path to the csv file containing an already estimated MAP

  • model_name (str) – name of the model, used to identify the posterior distribution

Returns:

  • np.ndarray – MAP estimator

  • pd.DataFrame – original DataFrame read from the csv file

Module contents