Sample

class kerch.feature.Sample(*args, **kwargs)[source]

Bases: Stochastic

Parameters:

sample (Tensor(num_sample, dim_input), optional) – Sample points used to compute the kernel matrix. When an out-of-sample computation is requested, it is computed relative to these sample points. Defaults to None.
sample_trainable (bool, optional) – True if the gradients of the sample points are to be computed. If so, a graph is computed and the sample can be updated. False leads to a static computation. Defaults to False.
num_sample (int, optional) – Number of sample points. This parameter is ignored if sample is not None and is overwritten by the number of points contained in sample. Defaults to 1.
dim_input (int, optional) – Dimension of each sample point. This parameter is ignored if sample is not None and is overwritten by the dimension of the sample points. Defaults to 1.
idx_sample (int[], optional) – Initializes the indices of the samples to be updated. All indices are considered if both idx_sample and prop_sample are None. Defaults to None.
prop_sample – Instead of giving indices, specifying a proportion of the original sample set is also possible. The indices are then chosen uniformly at random without replacement. The value must satisfy \(0 <\) prop_sample \(\leq 1\). All indices are considered if both idx_sample and prop_sample are None. Defaults to None.
sample_transform (List[str]) – TODO
cache_level (str, optional) – Cache level for saving temporary execution results during the execution. The higher the cache, the more is saved. Defaults to 'normal'. We refer to the Cache Management documentation for further information.
logging_level (int, optional) – Logging level for this specific instance. If the value is None, the current default kerch global log level will be used. Defaults to None (default kerch logging level). We refer to the Logging in Kerch documentation for further information.

cache_keys(private: bool = False) → Iterable[str]

Returns an iterable containing the different cache keys. We refer to the Cache Management documentation for more information.

Parameters: private (bool, optional) – Some cache elements are private and are not returned unless set to True. Defaults to False.

property cache_level: str

Cache level for saving temporary execution results during the execution. The higher the cache, the more is saved. Defaults to 'normal' unless set otherwise during instantiation. The different possible values are:

"none": the cache is non-existent and everything is computed on the go.
"light": the cache is very light. For example, only the kernel matrix and statistics of the sample points are saved.
"normal": same as light, but the statistics of the out-of-sample points are also saved.
"heavy": in addition to the statistics, the final kernel matrices of the out-of-sample points are saved.
"total": every step of any computation is saved.

We refer to the Cache Management documentation for further information.
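The levels listed above are cumulative: each level keeps everything the previous one keeps. The following is a minimal sketch of that ordering only. It is not kerch's cache implementation; the names CACHE_LEVELS and is_saved are hypothetical and introduced purely for illustration.

```python
# Hypothetical sketch of the cumulative ordering of cache levels.
# This is NOT kerch's implementation; names are illustrative only.
CACHE_LEVELS = ["none", "light", "normal", "heavy", "total"]


def is_saved(item_level: str, cache_level: str) -> bool:
    """Return True if an item tagged with `item_level` is kept when the
    module runs with `cache_level`."""
    if item_level == "none":
        raise ValueError("an item cannot be tagged with the 'none' level")
    return CACHE_LEVELS.index(item_level) <= CACHE_LEVELS.index(cache_level)


# Following the list above: sample statistics are tagged "light" and
# out-of-sample kernel matrices are tagged "heavy".
print(is_saved("light", "normal"))  # sample statistics under 'normal'
print(is_saved("heavy", "normal"))  # out-of-sample kernel matrices under 'normal'
```

Under this reading, a module set to "none" keeps nothing, and a module set to "total" keeps items of every level.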

property current_sample: Tensor

property current_sample_projected: Tensor: Returns the sample that is currently used in the computations and for the normalizing and centering statistics if relevant.

property dim_input: int: Dimension of each datapoint.

property empty_sample: bool: Boolean specifying if the sample is empty or not.

property hparams_fixed: dict: Fixed hyperparameters of the module. By contrast with hparams_variable, these are the values that are fixed and cannot possibly change during the training. If applicable, these can be specific architecture values for example. We refer to the documentation of Kerch Module for further information.

property hparams_variable: dict

Variable hyperparameters of the module. By contrast with hparams_fixed, these are the values that may change during the training and may be monitored at various stages of it. If applicable, these can be kernel bandwidth parameters for example.

Note

All parameters that are potentially trainable, like a kernel bandwidth \(\sigma\) for example, are included in this dictionary, even if the corresponding trainable argument is set to False. In the latter case, they will not evolve during training iterations, but will still be included in this dictionary.

We refer to the documentation of Kerch Module for further information.

property idx: Tensor: Indices used when performing various operations. This is only relevant in the case of stochastic training.

init_sample(sample=None, idx_sample=None, prop_sample=None)[source]

Initializes the sample set (and the stochastic indices).

Parameters:

sample (Tensor, optional) – Sample points used for the various computations. When an out-of-sample computation is requested, it is computed relative to these sample points. When an existing sample is overwritten, num_sample and dim_input are overwritten as well. If None is specified, the sample data is initialized according to the num_sample and dim_input specified during construction. If a previous sample set has been used, the new one consequently keeps the same dimensions. A last case occurs when sample is of the class torch.nn.Parameter: the sample then uses those values directly, so they can be shared with the level calling this method. Defaults to None.
idx_sample (int[], optional) – Initializes the indices of the samples to be updated. All indices are considered if both idx_sample and prop_sample are None. Defaults to None.
prop_sample – Instead of giving indices, specifying a proportion of the original sample set is also possible. The indices are then chosen uniformly at random without replacement. The value must satisfy \(0 <\) prop_sample \(\leq 1\). All indices are considered if both idx_sample and prop_sample are None. Defaults to None.

property num_idx: int: Number of selected indices when performing various operations. This is only relevant in the case of stochastic training.

property num_sample: int: Number of datapoints in the sample set.

print_cache(private: bool = False) → None

Prints the cache content. We refer to the Cache Management documentation for further information.

Parameters: private (bool, optional) – Some cache elements are private and are not printed unless set to True. Defaults to False.

reset(recurse=False, reset_persisting=True) → None

Resets the cache to be empty. We refer to the Cache Management documentation for more information.

Parameters:

recurse (bool, optional) – If True, resets the cache of this module and also of its potential children. Otherwise, it only resets the cache of this module. Defaults to False.
reset_persisting (bool, optional) – Persisting elements are meant to survive a cache reset (see _save()). If True, this option resets them as well. Defaults to True.

property sample: Parameter: Full original raw sample without any transform or potential stochastic selection.

property sample_trainable: bool: Boolean if the sample data can be trained.

property sample_transform: TransformTree: Default transform used by the sample.

stochastic(idx=None, prop=None)

Resets which subset of the samples is to be used until the next call of this function. This is relevant in the case of stochastic training.

Parameters:

idx (int[], optional) – Indices of the sample subset relative to the original sample set. Defaults to None.
prop (double, optional) – Instead of giving indices, passing a proportion of the original sample set is also possible. The indices are then chosen uniformly at random without replacement. The value must satisfy \(0 <\) prop \(\leq 1\). Defaults to None.

If None is specified for both idx and prop, all samples are used and the subset equals the original sample set. This is also the default behavior if this function is never called and the corresponding parameters were not specified during initialization.

Note

idx and prop cannot both be specified, as a conflict would arise.
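The selection rules described for this method (explicit indices, a proportion drawn uniformly without replacement, or the full set when neither is given) can be sketched in plain Python. This is an illustrative reading of the documented behavior and not kerch's implementation: the helper resolve_indices, the rounding rule for the subset size, and the sorting of the result are all assumptions.

```python
import random
from typing import List, Optional, Sequence


def resolve_indices(num_sample: int,
                    idx: Optional[Sequence[int]] = None,
                    prop: Optional[float] = None,
                    rng: Optional[random.Random] = None) -> List[int]:
    """Hypothetical helper: turn (idx, prop) into a list of sample indices."""
    if idx is not None and prop is not None:
        # Both cannot be given together, as stated in the note above.
        raise ValueError("idx and prop cannot both be specified")
    if idx is not None:
        return list(idx)
    if prop is not None:
        if not 0 < prop <= 1:
            raise ValueError("prop must satisfy 0 < prop <= 1")
        rng = rng or random.Random()
        # Assumption: the subset size is the rounded proportion, at least 1.
        k = min(num_sample, max(1, round(prop * num_sample)))
        # random.sample draws uniformly without replacement.
        return sorted(rng.sample(range(num_sample), k))
    # Neither given: the subset equals the original sample set.
    return list(range(num_sample))


print(resolve_indices(5))                # all indices
print(resolve_indices(5, idx=[1, 3]))    # explicit indices
print(len(resolve_indices(10, prop=0.3)))  # size of a random subset
```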

train(mode=True): Activates the training mode, which disables the gradients computation and disables stochasticity. For the gradients and other things, we refer to the torch.nn.Module documentation. For the stochastic part, when put in evaluation mode (False), all the sample points are used for the computations, regardless of the previously specified indices.

transform_input(data) → Tensor | None[source]: Applies to data the same transform as the one applied to the sample.

transform_sample_revert(data) → Tensor[source]: Recovers the original value from a projected value by applying the sample's transform in reverse. This is not always feasible, depending on the transform used (normalizations are typically not invertible, as they are not bijective).
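A small numerical illustration of why some transforms cannot be reverted: scaling a vector to unit norm maps distinct inputs to the same output, whereas subtracting a stored offset can be undone exactly. The functions below are generic examples written for this note and are not kerch transforms.

```python
import math
from typing import List


def unit_norm(x: List[float]) -> List[float]:
    """Scale x to unit Euclidean norm (not bijective: the norm is lost)."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]


def shift(x: List[float], offset: List[float]) -> List[float]:
    """Subtract a stored offset (bijective once the offset is known)."""
    return [v - o for v, o in zip(x, offset)]


def shift_revert(y: List[float], offset: List[float]) -> List[float]:
    """Undo `shift` by adding the same offset back."""
    return [v + o for v, o in zip(y, offset)]


# Two different inputs collapse to the same normalized vector, so no
# revert function can tell which one was the original.
a = unit_norm([3.0, 4.0])
b = unit_norm([6.0, 8.0])
print(a, b)

# The shift, by contrast, is recovered from the stored offset.
offset = [0.5, 0.5]
print(shift_revert(shift([1.0, 2.0], offset), offset))
```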

update_sample(sample_values, idx_sample=None)[source]

Updates the sample set. In contrast to init_sample, this only updates the values of the sample and sets the gradients of the updated values to zero if relevant.

Parameters:

sample_values (Tensor) – Values given to the updated samples.
idx_sample (int[], optional) – Indices of the samples to be updated. All indices are considered if None. Defaults to None.
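The update semantics described above (overwrite the selected rows in place, keep the shape of the sample set, and zero the gradients of the rows that changed) can be sketched with nested lists. This is only an illustration under those assumptions: the function update_rows and the explicit grads argument are hypothetical, and kerch itself operates on tensors.

```python
from typing import List, Optional, Sequence

Matrix = List[List[float]]


def update_rows(values: Matrix, grads: Matrix, new_values: Matrix,
                idx: Optional[Sequence[int]] = None) -> None:
    """Hypothetical helper: overwrite rows of `values` in place and zero
    the matching rows of `grads`. The shape of `values` never changes."""
    rows = list(range(len(values))) if idx is None else list(idx)
    if len(new_values) != len(rows):
        raise ValueError("one new row is required per updated index")
    for i, row in zip(rows, new_values):
        if len(row) != len(values[i]):
            raise ValueError("the input dimension cannot change in an update")
        values[i] = list(row)
        grads[i] = [0.0] * len(row)  # gradients of updated rows are reset


values = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
grads = [[0.1, 0.1], [0.2, 0.2], [0.3, 0.3]]
update_rows(values, grads, [[9.0, 9.0]], idx=[1])
print(values)
print(grads)
```

Only the row at index 1 is replaced here. The other rows and their gradients are left untouched.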