combine_solutions module

Take the best compensation settings among several runs.

Basic usage: ./combine_solutions.py -d /path/to/sim1/ /path/to/sim2/ /path/to/sim3

This is useful when different strategies are best on different fault scenarios.

The compensation settings are compared according to a column of the evaluations.csv files.

Reminder on the directory structure and conventions:

simulation_1                <- project folder
    000000_ref              <- simulation folder (holds a fault scenario)
        0_Envelope1D
        1_TraceWin
    000001
        0_Envelope1D
        1_TraceWin
    …
    evaluations.csv
simulation_2
    000000_ref
    000001
    …

combine_bests(paths: Sequence[Path | str], criterion_to_minimize: str = 'Lost power over whole linac in W.', out_folder: Path | str = '', copy: bool = False) → None

Compare several solutions and, for every fault scenario, gather the best one.

To determine the best solution for every fault scenario, we open each simulation’s evaluations.csv, look at the value of the column criterion_to_minimize and keep the simulation with the lowest.
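The selection rule described above can be illustrated with a minimal pandas sketch. It is not the module's implementation: the helper name `pick_best_project` and the sample values are hypothetical, and each project is assumed to have exactly one row per fault scenario, in the same order.

```python
import pandas as pd


def pick_best_project(criterion_by_project: dict[str, pd.Series]) -> pd.Series:
    """Return, for every fault scenario, the project with the lowest criterion.

    Hypothetical helper: keys are project names, values hold the criterion
    column of that project's evaluations.csv (one entry per fault scenario).
    """
    # Rows are fault scenarios, columns are projects.
    table = pd.DataFrame(criterion_by_project)
    # idxmin(axis=1) gives the column label of the smallest value in each row;
    # on a tie, the first column wins.
    return table.idxmin(axis=1)


# Made-up lost powers in W for three fault scenarios and two projects.
lost_power = {
    "sim1": pd.Series([0.0, 12.5, 3.0]),
    "sim2": pd.Series([0.0, 4.0, 7.5]),
}
best = pick_best_project(lost_power)
print(best.tolist())
```

With these values, the second fault scenario is taken from sim2 and the other two from sim1.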

Parameters:

paths (Sequence[pathlib.Path | str]) – Project folders (each one holds an evaluations.csv file and its simulation folders).
criterion_to_minimize (str, optional) – The evaluations.csv column against which simulations are compared. The default is "Lost power over whole linac in W." (we keep simulations with the lowest lost power).
out_folder (pathlib.Path | str, optional) – Where every best simulation folder will be gathered. If not provided, we create a combined/ folder in the last common ancestor of all provided paths.
copy (bool, optional) – If True, create hard copies of the original simulation folders instead of symlinks. The default is False.

_infer_an_output_folder(paths: Sequence[Path], output_folder_name: str = 'combined') → Path: Return output folder in the last common parent of paths.
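One possible way to find the last common parent is `os.path.commonpath` from the standard library. The sketch below is an assumption about how such a helper could work, not the module's code; the function name `infer_output_folder` and the example paths are hypothetical.

```python
import os
from pathlib import Path
from typing import Sequence


def infer_output_folder(
    paths: Sequence[Path], output_folder_name: str = "combined"
) -> Path:
    """Return ``output_folder_name`` placed in the deepest folder shared by ``paths``.

    Hypothetical helper. ``os.path.commonpath`` raises ValueError if ``paths``
    is empty or mixes absolute and relative paths.
    """
    common = Path(os.path.commonpath([str(path) for path in paths]))
    return common / output_folder_name


# Hypothetical project folders.
out = infer_output_folder([Path("/data/linac/sim1"), Path("/data/linac/sim2")])
print(out)
```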

_select_best_simulations(paths: Sequence[Path], criterion_to_minimize: str) → tuple[Series, DataFrame]

Give the name of the best solution according to criterion_to_minimize.

Parameters:

paths (Sequence[pathlib.Path]) – Path to project folders to be compared.
criterion_to_minimize (str) – The quantity that we want to minimize. It must be a column name in evaluations.csv.

Returns:

best_simulation_folders (pandas.Series) – For each fault scenario, holds the path to the best simulation folder.
combined (pandas.DataFrame) – A table with the same layout as evaluations.csv, where each row holds the values for the best simulation. You must ensure that all evaluations.csv files have the same columns.

_reconstruct_folder_names(n_simulations: int) → Series: Reconstruct the name of the simulation folders.
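Judging from the directory structure shown earlier, simulation folders appear to be named with a six-digit, zero-padded index, the first one carrying a `_ref` suffix. The sketch below rebuilds names under that assumption only; the function name is hypothetical, and whether `n_simulations` counts the reference folder is also an assumption.

```python
import pandas as pd


def reconstruct_folder_names(n_simulations: int) -> pd.Series:
    """Rebuild folder names such as 000000_ref, 000001, 000002.

    Hypothetical helper: assumes that ``n_simulations`` includes the
    reference simulation, which is the only folder with a ``_ref`` suffix.
    """
    names = [f"{index:06d}" for index in range(n_simulations)]
    if names:
        names[0] += "_ref"
    return pd.Series(names, dtype="object")


names = reconstruct_folder_names(3)
print(names.tolist())
```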

_concat_evaluations_files(evaluations: dict[str, DataFrame], best_solutions: Sequence[Path]) → DataFrame

Concatenate the evaluations, taking only the best.

Parameters:

evaluations (dict[str, pandas.DataFrame]) – Keys are user-defined names for every simulation. Values are corresponding evaluations.csv files; their columns must be the same. They must have the same indexes.
best_solutions (Sequence[str]) – For every calculation, the key of the evaluation we want to keep. Its length must match the number of rows of every DataFrame in evaluations, and every string it contains must be a key of evaluations.

Returns:

combined – Same columns as evaluations. Every row contains the evaluations.csv row of the best solution.

Return type:

pandas.DataFrame
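The row-by-row selection described above can be sketched as follows. This is an illustration under the stated constraints (same columns, same indexes), not the actual implementation; the name `concat_best_rows`, the column names and the values are hypothetical.

```python
from typing import Sequence

import pandas as pd


def concat_best_rows(
    evaluations: dict[str, pd.DataFrame], best_solutions: Sequence[str]
) -> pd.DataFrame:
    """Build a table whose i-th row is the i-th row of the chosen evaluation.

    Hypothetical helper: ``best_solutions[i]`` is the key of ``evaluations``
    to use for the i-th row.
    """
    rows = [
        evaluations[name].iloc[position]
        for position, name in enumerate(best_solutions)
    ]
    # Each selected row is a Series; stacking them restores the table layout
    # and keeps the original index labels.
    return pd.DataFrame(rows)


# Made-up evaluations with two fault scenarios each.
evaluations = {
    "sim1": pd.DataFrame({"lost power": [0.0, 12.5], "mismatch": [1.0, 2.0]}),
    "sim2": pd.DataFrame({"lost power": [0.0, 4.0], "mismatch": [3.0, 4.0]}),
}
combined = concat_best_rows(evaluations, ["sim1", "sim2"])
print(combined)
```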

_gather_best_simulations_in_same_place(best_simulation_folders: Collection[Path], out_folder: Path, create_symlinks_instead_of_hard_copies: bool, dirs_exist_ok: bool = True) → None: Gather the best simulation folders in a single place.

_copy_or_create_symlink(src: Path, dst: Path, create_symlinks_instead_of_hard_copies: bool, dirs_exist_ok: bool) → None: Copy src to dst or create symlinks.
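The copy-or-symlink choice could be implemented with `shutil.copytree` and `Path.symlink_to`, as in the sketch below. The behavior when `dst` already exists in symlink mode is an assumption made for this example, and the function name and folder layout are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def copy_or_symlink(
    src: Path,
    dst: Path,
    create_symlinks_instead_of_hard_copies: bool,
    dirs_exist_ok: bool = True,
) -> None:
    """Copy the ``src`` folder to ``dst``, or make ``dst`` a symlink to ``src``."""
    if not create_symlinks_instead_of_hard_copies:
        shutil.copytree(src, dst, dirs_exist_ok=dirs_exist_ok)
        return
    if dst.is_symlink() or dst.exists():
        # Assumption: an existing destination is left untouched when allowed.
        if not dirs_exist_ok:
            raise FileExistsError(dst)
        return
    dst.symlink_to(src.resolve(), target_is_directory=True)


# Demonstration in a temporary directory with a dummy simulation folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    src = root / "sim1" / "000001"
    src.mkdir(parents=True)
    (src / "results.txt").write_text("dummy")
    out = root / "combined"
    out.mkdir()

    copy_or_symlink(src, out / "copied", create_symlinks_instead_of_hard_copies=False)
    copy_or_symlink(src, out / "linked", create_symlinks_instead_of_hard_copies=True)

    copied_ok = (out / "copied" / "results.txt").read_text() == "dummy"
    linked_ok = (out / "linked").is_symlink()
```

Symlinks keep the combined folder small, but they break if the original project folders are moved or deleted; hard copies avoid this at the cost of disk space.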

_load_evaluation(evaluation_folder: Path, evaluation_namecol: Sequence[str] | None = None, new_name: Sequence[str] | None = None) → DataFrame

Load the evaluations.csv file and rename the column headers.

Parameters:

evaluation_folder (pathlib.Path) – Folder where the evaluations.csv file is.
evaluation_namecol (Sequence[str] | None, optional) – Names of the evaluations.csv columns to load, i.e. the ones used for the sorting; if not provided, we keep all the columns. The default is None.
new_name (Sequence[str] | None, optional) – If provided, the loaded columns will be renamed with this. The length of new_name must match evaluation_namecol.

Returns:

Holds all the values of evaluation_namecol found in the evaluations.csv file of evaluation_folder.

Return type:

pandas.DataFrame

main()