domainlab.utils package¶
Submodules¶
domainlab.utils.flows_gen_img_model module¶
domainlab.utils.generate_benchmark_plots module¶
generate the benchmark plots by calling the gen_benchmark_plots(…) function
- domainlab.utils.generate_benchmark_plots.boxplot(dataframe_in, obj, file=None)[source]¶
generate the boxplots
dataframe_in: dataframe containing the data with columns
[param_idx, task, algo, epos, te_d, seed, params, obj1, …, obj2]
obj: objective to be considered in the plot (needs to be contained in dataframe_in)
file: folder name to save the plots (if None, the plot will not be saved)
- domainlab.utils.generate_benchmark_plots.boxplot_stochastic(dataframe_in, obj, file=None)[source]¶
generate boxplot for stochastic variation
dataframe_in: dataframe containing the data with columns
[param_idx, task, algo, epos, te_d, seed, params, obj1, …, obj2]
obj: objective to be considered in the plot (needs to be contained in dataframe_in)
file: folder name to save the plots (if None, the plot will not be saved)
- domainlab.utils.generate_benchmark_plots.boxplot_systematic(dataframe_in, obj, file=None)[source]¶
generate boxplot for systematic variation
dataframe_in: dataframe containing the data with columns
[param_idx, task, algo, epos, te_d, seed, params, obj1, …, obj2]
obj: objective to be considered in the plot (needs to be contained in dataframe_in)
file: folder name to save the plots (if None, the plot will not be saved)
- domainlab.utils.generate_benchmark_plots.gen_benchmark_plots(agg_results: str, output_dir: str, use_param_index: bool = True)[source]¶
generate the benchmark plots from a csv file containing the aggregated results. The csv file must have the columns [param_index, task, algo, epos, te_d, seed, params, …]; all columns after seed are interpreted as objectives of the results, e.g. acc, precision, recall, specificity, f1, auroc.
agg_results: path to the csv file
output_dir: path to a folder which shall contain the results
skip_gen: skips the actual plotting, used to speed up testing
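A minimal usage sketch; the file paths are placeholders and the csv must follow the column layout described above:

    from domainlab.utils.generate_benchmark_plots import gen_benchmark_plots

    gen_benchmark_plots(
        agg_results="results.csv",   # csv with columns [param_index, task, algo, epos, te_d, seed, params, ...]
        output_dir="plots",          # folder that will receive the generated plots
        use_param_index=True,        # label hyperparameter setups by their index instead of exact values
    )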
- domainlab.utils.generate_benchmark_plots.gen_plots(dataframe: DataFrame, output_dir: str, use_param_index: bool)[source]¶
dataframe: dataframe with columns [‘param_index’, ‘task’, ‘algo’, ‘epos’, ‘te_d’, ‘seed’, ‘params’, ‘acc’, ‘precision’, …]
- domainlab.utils.generate_benchmark_plots.radar_plot(dataframe_in, file=None, distinguish_hyperparam=True)[source]¶
- dataframe_in: dataframe containing the data with columns
[algo, epos, te_d, seed, params, obj1, …, obj2]
file: filename to save the plots (if None, the plot will not be saved)
distinguish_param_setups: if True the plot will not only distinguish between models,
but also between the parameter setups
- domainlab.utils.generate_benchmark_plots.round_vals_in_dict(df_column_in, use_param_index)[source]¶
replaces the dictionary by a string containing only the significant digits of the hyperparameters or (if use_param_index = True) by the parameter index
df_column_in: columns of the dataframe containing the param index and the dictionary of
hyperparams in the form [param_index, params]
use_param_index: use the param index instead of the exact values
- domainlab.utils.generate_benchmark_plots.scatterplot(dataframe_in, obj, file=None, kde=True, distinguish_hyperparam=False)[source]¶
- dataframe_in: dataframe containing the data with columns
[algo, epos, te_d, seed, params, obj1, …, obj2]
obj1 & obj2: names of the objectives which shall be plotted against each other
file: filename to save the plots (if None, the plot will not be saved)
kde: if True the distribution of the points will be estimated and plotted as a kde plot
distinguish_param_setups: if True the plot will not only distinguish between models,
but also between the parameter setups
- domainlab.utils.generate_benchmark_plots.scatterplot_matrix(dataframe_in, use_param_index, file=None, kind='reg', distinguish_param_setups=True)[source]¶
- dataframe_in: dataframe containing the data with columns
[algo, epos, te_d, seed, params, obj1, …, obj2]
file: filename to save the plots (if None, the plot will not be saved)
kind: plot kind; with the default ‘reg’ a regression line is plotted over the data
distinguish_param_setups: if True the plot will not only distinguish between models,
but also between the parameter setups
domainlab.utils.get_git_tag module¶
domainlab.utils.hyperparameter_gridsearch module¶
gridsearch for the hyperparameter space
add_next_param_from_list is a recursive function that builds the cartesian product along all the scalar hyper-parameters; this recursive function is used in grid_task
- domainlab.utils.hyperparameter_gridsearch.add_next_param_from_list(param_grid: dict, grid: dict, grid_df: DataFrame)[source]¶
can be used in a recursive fashion to add all combinations of the parameters in param_grid to grid_df
param_grid: dictionary with all possible values for each parameter,
{‘p1’: [1, 2, 3], ‘p2’: [0, 5], …}
grid: a grid which builds itself up during the recursion; start with grid = {},
after one step grid = {p1: 1}
grid_df: dataframe which saves the finished grids
task_name: task name, also the G_MODEL_NA name
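For intuition, the set of grid points this recursion enumerates is equivalent to a plain cartesian product of the per-parameter value lists; a small illustration (not the internal implementation):

    from itertools import product

    # the per-parameter value lists from the docstring example above
    param_grid = {"p1": [1, 2, 3], "p2": [0, 5]}

    # all combinations, i.e. what ends up as rows of the grid:
    # {'p1': 1, 'p2': 0}, {'p1': 1, 'p2': 5}, ..., {'p1': 3, 'p2': 5} -> 6 grid points
    grid_points = [dict(zip(param_grid, values)) for values in product(*param_grid.values())]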
- domainlab.utils.hyperparameter_gridsearch.add_references_and_check_constraints(grid_df_prior, grid_df, referenced_params, config, task_name)[source]¶
in the last step, all parameters which are referenced need to be added to the grid. All grid points not satisfying the constraints are removed afterwards.
use the parameters in the dataframe of shared parameters and add them to the dictionary of parameters for the current task; only the shared parameters specified in the config are respected
shared_df: dataframe of shared hyperparameters
dict_param_grids: dictionary of the parameter grids
config: config for the current task
go back from the dataframe format of the shared hyperparameters to a list format
- domainlab.utils.hyperparameter_gridsearch.grid_task(grid_df: DataFrame, task_name: str, config: dict, shared_df: DataFrame)[source]¶
create grid for one sampling task for a method and add it to the dataframe
- domainlab.utils.hyperparameter_gridsearch.lognormal_grid(param_config)[source]¶
get a lognormally distributed grid given the specifications in the param_config
param_config: config which needs to contain ‘num’, ‘mean’, ‘std’
- domainlab.utils.hyperparameter_gridsearch.loguniform_grid(param_config)[source]¶
get a loguniform distributed grid given the specifications in the param_config
param_config: config which needs to contain ‘num’, ‘max’, ‘min’
- domainlab.utils.hyperparameter_gridsearch.normal_grid(param_config, lognormal=False)[source]¶
get a normally distributed grid given the specifications in the param_config
param_config: config which needs to contain ‘num’, ‘mean’, ‘std’
- domainlab.utils.hyperparameter_gridsearch.rais_error_if_num_not_specified(param_name: str, param_config: dict)[source]¶
for each parameter, a number of grid points needs to be specified; this function raises an error if this is not the case
param_name: parameter name under consideration
param_config: config of this parameter
- domainlab.utils.hyperparameter_gridsearch.round_to_discreate_grid_normal(grid, param_config)[source]¶
round the values of the grid to the grid spacing specified in the config for normal and lognormal grids
- domainlab.utils.hyperparameter_gridsearch.round_to_discreate_grid_uniform(grid, param_config)[source]¶
round the values of the grid to the grid spacing specified in the config for uniform and loguniform grids
- domainlab.utils.hyperparameter_gridsearch.sample_grid(param_config)[source]¶
given the parameter config, this function samples all parameters which are distributed according to the categorical, uniform, loguniform, normal or lognormal distribution.
- domainlab.utils.hyperparameter_gridsearch.sample_gridsearch(config: dict, dest: str | None = None) DataFrame [source]¶
create the hyperparameter grid according to the given config, which should be the dictionary of the full benchmark config yaml. The result is saved to ‘output_dir/hyperparameters.csv’ of the config if dest is not specified explicitly.
Note: Parts of the yaml content are executed. Thus use this only with trusted config files.
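A usage sketch, assuming the benchmark config has already been written as a yaml file (the file name is hypothetical; only use trusted config files, see the note above):

    import yaml
    from domainlab.utils.hyperparameter_gridsearch import sample_gridsearch

    with open("benchmark_config.yaml", "r", encoding="utf8") as stream:
        config = yaml.safe_load(stream)   # dictionary of the full benchmark config yaml

    # with dest=None the result would instead go to output_dir/hyperparameters.csv from the config
    grid_df = sample_gridsearch(config, dest="hyperparameters.csv")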
domainlab.utils.hyperparameter_retrieval module¶
retrieval for hyperparameters
domainlab.utils.hyperparameter_sampling module¶
Samples the hyperparameters according to a benchmark configuration file.
Structure of this file:
- Class Hyperparameter
- Inherited classes
- Functions to sample hyper-parameters and log them into a csv file
- class domainlab.utils.hyperparameter_sampling.CategoricalHyperparameter(name: str, config: dict)[source]¶
Bases: Hyperparameter
A sampled hyperparameter, which is constrained to fixed, user-given values and datatype
- class domainlab.utils.hyperparameter_sampling.Hyperparameter(name: str)[source]¶
Bases: object
Represents a hyperparameter. The datatype of .val is int if step and p1 are integer valued, else float.
p1: min or mean
p2: max or scale
reference: None or name of the referenced hyperparameter
- class domainlab.utils.hyperparameter_sampling.ReferenceHyperparameter(name: str, config: dict)[source]¶
Bases: Hyperparameter
Hyperparameter that references only a different one. Thus, this parameter is not sampled but set after sampling.
- class domainlab.utils.hyperparameter_sampling.SampledHyperparameter(name: str, config: dict)[source]¶
Bases: Hyperparameter
A numeric hyperparameter that shall be sampled
- domainlab.utils.hyperparameter_sampling.check_constraints(params: List[Hyperparameter], constraints) bool [source]¶
Check if the constraints are fulfilled.
add information like task, G_MODEL_NA and constraints to the shared samples
Parameters:
shared_samples: pd DataFrame with columns [G_METHOD_NA, G_MODEL_NA, ‘params’]
config: dataframe with the yaml configuration of the current task
task_name: name of the current task
- domainlab.utils.hyperparameter_sampling.get_hyperparameter(name: str, config: dict) Hyperparameter [source]¶
Factory function. Instantiates the correct Hyperparameter
creates a dataframe with columns [task, G_MODEL_NA, params];
task and G_MODEL_NA are the same for all rows, while params is filled with the shared parameters of shared_samples_full requested by task_config.
creates a shared config containing only information about the
shared hyperparameters requested by the task_config
- domainlab.utils.hyperparameter_sampling.is_dict_with_key(input_dict, key) bool [source]¶
Determines whether the input argument is a dictionary and has the given key
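A small sketch of the expected behaviour (the dictionary contents are arbitrary examples):

    from domainlab.utils.hyperparameter_sampling import is_dict_with_key

    is_dict_with_key({"distribution": "uniform"}, "distribution")   # True
    is_dict_with_key({"distribution": "uniform"}, "num")            # False
    is_dict_with_key("not a dict", "distribution")                  # False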
- domainlab.utils.hyperparameter_sampling.sample_hyperparameters(config: dict, dest: str | None = None, sampling_seed: int | None = None) DataFrame [source]¶
Samples the hyperparameters according to the given config, which should be the dictionary of the full benchmark config yaml. The result is saved to ‘output_dir/hyperparameters.csv’ of the config if dest is not specified explicitly.
Note: Parts of the yaml content are executed. Thus use this only with trusted config files.
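A usage sketch analogous to sample_gridsearch above (file name and seed are placeholders; only use trusted config files, see the note above):

    import yaml
    from domainlab.utils.hyperparameter_sampling import sample_hyperparameters

    with open("benchmark_config.yaml", "r", encoding="utf8") as stream:
        config = yaml.safe_load(stream)   # dictionary of the full benchmark config yaml

    # a fixed sampling_seed makes the drawn hyperparameters reproducible
    samples_df = sample_hyperparameters(config, dest="hyperparameters.csv", sampling_seed=0)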
- domainlab.utils.hyperparameter_sampling.sample_parameters(init_params: List[Hyperparameter], constraints, shared_config=None, shared_samples=None) dict [source]¶
Tries to sample from the hyperparameter list.
Raises an error if no sample complying with the constraints is found within 10_0000 attempts.
- domainlab.utils.hyperparameter_sampling.sample_task(num_samples: int, task_name: str, conf_samp: tuple, shared_conf_samp: tuple)[source]¶
Sample one task and add it to the dataframe
sample one task and add it to the dataframe for task descriptions which only contain shared hyperparameters
domainlab.utils.logger module¶
A logger for our software
- class domainlab.utils.logger.Logger[source]¶
Bases: object
static logger class
- static get_logger(logger_name='logger_6676', loglevel='INFO')[source]¶
returns a logger. If no logger was created yet, it creates one with the name specified in logger_name and the level specified in loglevel. Once the logger has been created, subsequent calls return the same instance and the arguments no longer change its behaviour.
- logger = None¶
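A minimal sketch, assuming the returned object behaves like a standard logging.Logger:

    from domainlab.utils.logger import Logger

    # the first call creates the logger; later calls return the same instance,
    # so these arguments only take effect on the first call
    logger = Logger.get_logger(logger_name="logger_6676", loglevel="INFO")
    logger.info("starting benchmark run")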
domainlab.utils.override_interface module¶
- domainlab.utils.override_interface.override_interface(interface_class)[source]¶
decorator to mark that a method overrides one declared in the interface class. :param interface_class: the interface class; always specify this explicitly, since otherwise interface_class will be the nearest function the decorator is applied to, and the argument “method2override” accepted by the returned function “overrider” will be the current child class
Example:
    class BaseClass():
        def fun(self):
            pass

    class Child(BaseClass):
        @overrides(BaseClass)
        def fun(self):
            pass
domainlab.utils.perf module¶
Classification Performance
- class domainlab.utils.perf.PerfClassif[source]¶
Bases: object
Classification Performance
- classmethod cal_acc(model, loader_te, device)[source]¶
- Parameters:
model –
loader_te –
device – for final test, GPU can be used
domainlab.utils.perf_metrics module¶
Classification Performance
domainlab.utils.sanity_check module¶
This class is used to perform the sanity check on a task description
- class domainlab.utils.sanity_check.SanityCheck(args, task)[source]¶
Bases: object
Performs a sanity check on the given args and the task when running dataset_sanity_check(self)
domainlab.utils.test_img module¶
domainlab.utils.u_import module¶
domainlab.utils.u_import_net_module module¶
import external neural network implementation
- domainlab.utils.u_import_net_module.build_external_obj_net_module_feat_extract(mpath, dim_y, remove_last_layer)[source]¶
The user provides a function to instantiate an object of the neural network, which is fine for training but problematic for persistence of the trained model since it is created externally.
:param mpath: path of the external python file where the neural network architecture is defined
:param dim_y: dimension of features
domainlab.utils.utils_class module¶
domainlab.utils.utils_classif module¶
- domainlab.utils.utils_classif.get_label_na(tensor_ind, list_str_na)[source]¶
given a list of label names as strings, map a tensor of indices to label names
domainlab.utils.utils_cuda module¶
choose devices