Ridge Regression

class kerch.level.Ridge(*args, **kwargs)[source]

Bases: LSSVM

property C: Tensor

Returns the explicit matrix on the sample datapoints.

\[C = \frac{1}{\texttt{num_idx}}\sum_{i=1}^{\texttt{num_idx}} \phi(x_i)\phi(x_i)^\top.\]
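As a rough illustration of the formula above, the following NumPy sketch computes the matrix for a linear feature map \(\phi(x) = x\). The feature map, the data, and the names phi_sample and C_sketch are assumptions made for the example; this is not the library's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
x_sample = rng.normal(size=(5, 3))   # 5 datapoints, dim_input = 3

# Assumed feature map for illustration: phi(x) = x, so dim_feature = 3.
phi_sample = x_sample

# C = (1 / num_idx) * sum_i phi(x_i) phi(x_i)^T, written as one matrix product.
num_idx = phi_sample.shape[0]
C_sketch = phi_sample.T @ phi_sample / num_idx   # [dim_feature, dim_feature]
```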
property Corr: Tensor

Returns the correlation matrix of the sample. Same as calling self.corr().

property Cov: Tensor

Returns the covariance matrix of the sample. Same as calling self.cov().

property H: Tensor
property K: Tensor

Returns the kernel matrix on the sample data. Same result as calling k(), but faster. It is loaded from memory if already computed and unchanged since then, to avoid re-computation when called repeatedly.

\[K_{ij} = k(x_i,x_j).\]
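To make the definition concrete, here is a minimal NumPy sketch that builds \(K\) for an RBF kernel. The kernel choice, the bandwidth sigma, and the helper name rbf_kernel_matrix are assumptions for the example and are not taken from the library.

```python
import numpy as np

def rbf_kernel_matrix(x, y, sigma=1.0):
    """K[i, j] = exp(-||x_i - y_j||^2 / (2 sigma^2)); an assumed kernel choice."""
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)
x_sample = rng.normal(size=(4, 2))                  # 4 sample points, dim_input = 2
K_sketch = rbf_kernel_matrix(x_sample, x_sample)    # [4, 4] kernel matrix on the sample
```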
property Phi: Tensor

Returns the explicit feature map \(\phi(\cdot)\) of the sample datapoints. Same as calling phi(), but slightly faster. It is loaded from memory if already computed and unchanged since then, to avoid re-computation when called repeatedly.

property PhiW: Tensor
property W: Tensor
after_step() None

Performs after-step operations, for example a transform of the parameters onto some manifold.

attach_to(weight_fn) None
property attached: bool

Boolean indicating whether the view is attached to another multi_view.

before_step() None

Performs steps required before each training step.

property bias: Tensor
property bias_trainable: bool
c(x=None, transform=None) Tensor

Out-of-sample explicit matrix.

\[C = \frac{1}{\texttt{num_x}}\sum_{i=1}^{\texttt{num_x}} \phi(x_i)\phi(x_i)^\top.\]
Parameters:

x (Tensor[num_x, dim_input], optional) – Out-of-sample points (first dimension). If None, the default sample will be used. Defaults to None.

Returns:

Explicit matrix

Return type:

Tensor[dim_feature, dim_feature]

cache_keys(private: bool = False) Iterable[str]

Returns an iterable containing the different cache keys. We refer to the Cache Management documentation for more information.

Parameters:

private (bool, optional) – Some cache elements are private and are not returned unless set to True. Defaults to False.

property cache_level: str

Cache level for saving temporary results during execution. The higher the cache level, the more is saved. Defaults to 'normal' unless set otherwise during instantiation. The different possible values are:

  • "none": the cache is non-existent and everything is computed on the go.

  • "light": the cache is very light. For example, only the kernel matrix and statistics of the sample points are saved.

  • "normal": same as light, but the statistics of the out-of-sample points are also saved.

  • "heavy": in addition to the statistics, the final kernel matrices of the out-of-sample points are saved.

  • "total": every step of any computation is saved.

We refer to the Cache Management documentation for further information.

property centered: bool

Indicates whether the feature map is centered relative to its sample or, equivalently, whether the kernel is centered in its RKHS, spanned by the sample.

corr(x=None) Tensor

Returns the correlation matrix of the provided input.

Parameters:

x (Tensor[N, dim_input], optional) – Out-of-sample points (first dimension). If None, the default sample will be used. Defaults to None.

Returns:

Correlation matrix

Return type:

Tensor[dim_feature, dim_feature]

cov(x=None) Tensor

Returns the covariance matrix of the provided input.

Parameters:

x (Tensor[N, dim_input], optional) – Out-of-sample points (first dimension). If None, the default sample will be used. Defaults to None.

Returns:

Covariance matrix

Return type:

Tensor[dim_feature, dim_feature]
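For orientation, the sketch below computes a covariance and a correlation matrix in feature space with NumPy. It assumes the feature map values are already available as an array phi_x and that the covariance is normalized by N rather than N - 1; both are assumptions for the example, and the library may use a different convention.

```python
import numpy as np

rng = np.random.default_rng(0)
phi_x = rng.normal(size=(8, 3))   # assumed feature map values, [N, dim_feature]

# Covariance: average outer product of the centered feature vectors.
mean = phi_x.mean(axis=0)
centered = phi_x - mean
cov_sketch = centered.T @ centered / phi_x.shape[0]   # assumption: divide by N

# Correlation: covariance rescaled by the per-feature standard deviations.
std = np.sqrt(np.diag(cov_sketch))
corr_sketch = cov_sketch / np.outer(std, std)
```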

property current_sample: Tensor
property current_sample_projected: Tensor

Returns the sample that is currently used in the computations and for the normalizing and centering statistics if relevant.

property current_target: Tensor

Returns the targets that are currently used in the computations, taking the stochastic aspect into account if relevant.

detach() None
property dim_feature: int

Dimension of the explicit feature map if relevant.

property dim_input: int

Dimension of each datapoint.

property dim_output: int

Output dimension.

property dual_correlation: Tensor

Correlation of the hidden variables \(\mathbf{h}^\top \mathbf{h}\). This should be the identity provided that the hidden variables lie on the Stiefel manifold.

property dual_param: Tensor

Dual parameter of size [num_idx, dim_output].

property dual_projector: Tensor

Projector on the subspace spanned by the hidden variables \(\mathbf{h}\mathbf{h}^\top\). This is a rigorous projector provided its determinant is unity, e.g. when the hidden variables lie on the Stiefel manifold.
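The two quantities \(\mathbf{h}^\top \mathbf{h}\) and \(\mathbf{h}\mathbf{h}^\top\) can be checked numerically. The sketch below builds hidden variables with orthonormal columns through a QR decomposition, which is one assumed way of obtaining a point on the Stiefel manifold, and forms both matrices; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
num_idx, dim_output = 6, 2

# Orthonormal columns (a point on the Stiefel manifold), obtained here by QR.
h, _ = np.linalg.qr(rng.normal(size=(num_idx, dim_output)))

correlation = h.T @ h   # [dim_output, dim_output], identity for orthonormal columns
projector = h @ h.T     # [num_idx, num_idx], projects onto the span of the columns
```

With orthonormal columns, the correlation is the identity and the projector is symmetric and idempotent.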

property dual_trainable: bool

Returns whether the hidden variables are trainable (a gradient can be computed on them).

property empty_sample: bool

Boolean specifying if the sample is empty or not.

property eta: float

Level weight \(\eta\), e.g. for the weight of the loss.

abstract property explicit: bool

True if the method has an explicit formulation, False otherwise.

explicit_preimage(phi_image: Tensor | None = None, method: str = 'explicit', **kwargs) Tensor

Computes a pre-image of an explicit feature map of the kernel, given by phi_image. Different methods are available and are selected through the method parameter.

Parameters:
  • phi_image (torch.Tensor [num_points, dim_feature], optional) – Explicit feature map image to be inverted. If not specified (None), the explicit feature map on the sample is used.

  • method (str, optional) – Pre-image method to be used. Defaults to 'explicit'.

  • **kwargs (dict, optional) – Additional parameters of the pre-image method used. Please refer to its documentation for further details.

Returns:

Pre-image

Return type:

torch.Tensor [num_points, dim_input]

forward(x=None, representation=None) Tensor

Passes datapoints through the kernel.

Parameters:
  • x (Tensor(,dim_input)) – Datapoints to be passed through the kernel.

  • representation (str, optional) – Chosen representation. If dual, an out-of-sample kernel matrix is returned. If primal is specified, the explicit feature map is returned. Defaults to dual.

Returns:

Out-of-sample kernel matrix or explicit feature map depending on representation.

Raises:

RepresentationError

property gamma: float
property hparams_fixed: dict

Dictionary containing the hyper-parameters and their values. This can be relevant for monitoring.

property hparams_variable: dict

Dictionary containing the parameters and their values. This can be relevant for monitoring.

property idx: Tensor

Indices used when performing various operations. This is only relevant in the case of stochastic training.

implicit_preimage(k_image: Tensor | None = None, method: str = 'knn', **kwargs)

Computes a pre-image of coefficients in the RKHS of the kernel, given by k_image. Different methods are available and are selected through the method parameter.

Parameters:
  • k_image (torch.Tensor [num_points, num_idx], optional) – RKHS coefficients to be inverted. If not specified (None), the kernel matrix on the sample is used.

  • method (str, optional) – Pre-image method to be used. Defaults to 'knn'.

  • **kwargs (dict, optional) – Additional parameters of the pre-image method used. Please refer to its documentation for further details.

Returns:

Pre-image

Return type:

torch.Tensor [num_points, dim_input]
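One common reading of a nearest-neighbor pre-image is sketched below: for each row of coefficients, the sample points with the largest kernel values are averaged, weighted by those values. This is an assumption about what such a method may look like, not a description of the library's 'knn' method; the RBF kernel and the names rbf_kernel, knn_preimage_sketch and num_neighbors are hypothetical.

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    """Assumed kernel for the example: RBF with bandwidth sigma."""
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def knn_preimage_sketch(k_image, sample, num_neighbors=3):
    """Weighted average of the sample points with the largest coefficients."""
    # Indices of the num_neighbors largest coefficients in each row.
    nearest = np.argsort(-k_image, axis=1)[:, :num_neighbors]
    weights = np.take_along_axis(k_image, nearest, axis=1)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # sample[nearest] has shape [num_points, num_neighbors, dim_input].
    return np.einsum("pk,pkd->pd", weights, sample[nearest])

rng = np.random.default_rng(0)
sample = rng.normal(size=(6, 2))
k_image = rbf_kernel(sample, sample)   # [num_points, num_idx]
preimage = knn_preimage_sketch(k_image, sample, num_neighbors=1)
```

With a single neighbor and the sample's own kernel matrix as input, each point is mapped back to itself, since the largest entry of each row is on the diagonal.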

init_parameters(representation=None, overwrite=True) None

Initializes the model parameters: the weight in primal and the hidden values in dual. This is suitable for gradient-based training.

Parameters:
  • representation (str, optional) – ‘primal’ or ‘dual’

  • overwrite (bool, optional) – Does not initialize already initialized parameters if False. Defaults to True.

init_sample(sample=None, idx_sample=None, prop_sample=None)

Initializes the sample set (and the stochastic indices).

Parameters:
  • sample (Tensor, optional) – Sample points used for the various computations. When an out-of-sample computation is requested, it will be given relative to these samples. When overwriting a current sample, num_sample and dim_input are also overwritten. If None is specified, the sample data will be initialized according to the num_sample and dim_input specified during construction. If a previous sample set has been used, it will consequently keep the same dimensions. A last case occurs when sample is of the class torch.nn.Parameter: the sample will then use those values, and they can thus be shared with the level calling this method. Defaults to None.

  • idx_sample (int[], optional) – Initializes the indices of the samples to be updated. All indices are considered if both idx_sample and prop_sample are None. Defaults to None.

  • prop_sample – Instead of giving indices, specifying a proportion of the original sample set is also possible. The indices will be chosen uniformly at random without replacement. The value must be chosen such that \(0 <\) prop_sample \(\leq 1\). All indices are considered if both idx_sample and prop_sample are None. Defaults to None.

k(x=None, y=None, explicit=None, transform=None) Tensor

Returns a kernel matrix, either of the sample, out-of-sample, or fully out-of-sample.

\[K = [k(x_i,y_j)]_{i,j=1}^{\texttt{num_x}, \texttt{num_y}},\]

with \(\{x_i\}_{i=1}^{\texttt{num_x}}\) the out-of-sample points (x) and \(\{y_j\}_{j=1}^{\texttt{num_y}}\) the sample points (y).

Note

In the case of centered kernels on out-of-sample points, this computation is more expensive as it requires centering according to the sample data, which implies computing a statistic on the out-of-sample kernel matrix and thus also computing that matrix.

Parameters:
  • x (Tensor[num_x, dim_input], optional) – Out-of-sample points (first dimension). If None, the default sample will be used. Defaults to None.

  • y (Tensor[num_y, dim_input], optional) – Out-of-sample points (second dimension). If None, the default sample will be used. Defaults to None.

Returns:

Kernel matrix

Return type:

Tensor[num_x, num_y]

Raises:

ExplicitError
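The extra cost of centering on out-of-sample points can be seen in a small sketch. Assuming the feature map is centered on the mean of the sample, the centered kernel between x and y needs the kernel between each of them and the sample, plus the kernel matrix of the sample itself. The linear kernel and the function names below are assumptions for the example, not the library's implementation.

```python
import numpy as np

def linear_kernel(a, b):
    """Assumed kernel for the example: k(a, b) = a^T b."""
    return a @ b.T

def centered_kernel(x, y, sample, kernel):
    """Kernel between x and y after centering the feature map on `sample`."""
    k_xy = kernel(x, y)
    k_xs = kernel(x, sample).mean(axis=1, keepdims=True)   # [num_x, 1]
    k_sy = kernel(sample, y).mean(axis=0, keepdims=True)   # [1, num_y]
    k_ss = kernel(sample, sample).mean()                   # scalar
    return k_xy - k_xs - k_sy + k_ss

rng = np.random.default_rng(0)
sample = rng.normal(size=(6, 3))
x_new = rng.normal(size=(2, 3))
K_centered = centered_kernel(x_new, sample, sample, linear_kernel)   # [2, 6]
```

For the linear kernel, the result matches the inner products of the explicitly centered points, which gives a simple way to check the formula.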

property kappa: float
property kernel: Kernel
property kernel_transform: TransformTree

Default transform performed on the kernel

property level_trainable: bool

Specifies whether the parameters weight and hidden are trainable or not.

loss(representation=None) Tensor
losses(representation=None) dict

Different components of the losses.

property normalized: bool

Indicates whether the feature map is centered relative to its sample or, equivalently, whether the kernel is centered in its RKHS, spanned by the sample.

property num_idx: int

Number of selected indices when performing various operations. This is only relevant in the case of stochastic training.

property num_sample: int

Number of datapoints in the sample set.

property param_trainable: bool

Specifies whether the parameters weight and hidden are trainable or not.

phi(x=None, transform=None) Tensor

Returns the explicit feature map \(\phi(x)\) of the specified points.

Parameters:

x (Tensor[num_x, dim_input], optional) – The datapoints serving as input of the explicit feature map. If None, the sample will be used. Defaults to None.

Returns:

Explicit feature map \(\phi(x)\) of the specified points.

Return type:

Tensor[num_x, dim_feature]

Raises:

ExplicitError

phiw(x=None, representation=None) Tensor
Parameters:
  • x (torch.Tensor [num, dim_input]) – Input, defaults to the sample (None).

  • representation (str, optional) – ‘primal’ or ‘dual’.

Returns:

\(\phi(x)^\top W\) or \(k(x)^\top H\)

Return type:

torch.Tensor [num, dim_output]
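The two returned expressions agree when the primal weights are expressed through the dual variables. The sketch below assumes the usual relation \(W = \Phi^\top H\) and a linear kernel, so that \(k(x) = \Phi\,\phi(x)\); both are assumptions for the example, and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
phi_sample = rng.normal(size=(5, 3))   # feature map of the sample, [num_idx, dim_feature]
phi_new = rng.normal(size=(2, 3))      # feature map of new points, [num, dim_feature]
H_dual = rng.normal(size=(5, 1))       # dual variables, [num_idx, dim_output]

# Assumed relation between primal and dual parameters: W = Phi^T H.
W_primal = phi_sample.T @ H_dual       # [dim_feature, dim_output]

primal_out = phi_new @ W_primal                 # phi(x)^T W
dual_out = (phi_new @ phi_sample.T) @ H_dual    # k(x)^T H
```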

property primal_correlation: Tensor

Correlation of the weights \(\mathbf{w}^\top \mathbf{w}\). This should be the identity provided that the weights lie on the Stiefel manifold.

property primal_param: Tensor

Primal parameters of size [dim_feature, dim_output].

property primal_projector: Tensor

Projector on the subspace spanned by the weights \(\mathbf{w}\mathbf{w}^\top\). This is a rigorous projector provided its determinant is unity, e.g. when the weights lie on the Stiefel manifold.

print_cache(private: bool = False) None

Prints the cache content. We refer to the Cache Management documentation for further information.

Parameters:

private (bool, optional) – Some cache elements are private and are not returned unless set to True. Defaults to False.

property representation: str
property requires_bias: bool
reset(recurse=False, reset_persisting=True) None

Resets the cache to be empty. We refer to the Cache Management documentation for more information.

Parameters:
  • recurse (bool, optional) – If True, resets the cache of this module and also of its potential children. Otherwise, it only resets the cache for this module. Defaults to False.

  • reset_persisting (bool, optional) – Persisting elements are meant to survive a cache reset (see _save()). This option also resets them if True. Defaults to True.

property sample: Parameter

Full original raw sample without any transform or potential stochastic selection.

property sample_trainable: bool

Boolean indicating whether the sample data can be trained.

property sample_transform: TransformTree

Default transform used by the sample.

solve(sample=None, target=None, representation=None, **kwargs) None

Fits the model according to the input sample and output target. Many models have both a primal and a dual formulation to be fitted.

Parameters:
  • sample (Matrix, optional) – Input sample of the model. Defaults to the sample provided by the model.

  • target (Matrix or vector, optional) – Target sample of the model. Defaults to None.

  • representation (str, optional) – Representation of the model ("primal" or "dual"). Defaults to "dual".
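To illustrate what fitting in the primal or in the dual means for ridge regression, the sketch below solves both linear systems with NumPy for a linear feature map and no bias term. The regularization strength reg is a hypothetical name whose relation to the gamma property is not specified here, and the exact formulation solved by the class may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
phi_sample = rng.normal(size=(20, 3))   # [num_sample, dim_feature]
true_w = np.array([[1.0], [-2.0], [0.5]])
target = phi_sample @ true_w + 0.01 * rng.normal(size=(20, 1))
reg = 0.1   # hypothetical regularization strength

# Primal: solve (Phi^T Phi + reg I) W = Phi^T y, a dim_feature-sized system.
W_ridge = np.linalg.solve(
    phi_sample.T @ phi_sample + reg * np.eye(3), phi_sample.T @ target
)

# Dual: solve (K + reg I) H = y with K = Phi Phi^T, a num_sample-sized system.
K_ridge = phi_sample @ phi_sample.T
H_ridge = np.linalg.solve(K_ridge + reg * np.eye(20), target)

pred_primal = phi_sample @ W_ridge
pred_dual = K_ridge @ H_ridge
```

Both systems give the same predictions. The primal system scales with the feature dimension and the dual one with the number of sample points, which is the usual reason to choose one representation over the other.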

stochastic(idx=None, prop=None)

Resets which subset of the samples are to be used until the next call of this function. This is relevant in the case of stochastic training.

Parameters:
  • idx (int[], optional) – Indices of the sample subset relative to the original sample set., defaults to None

  • prop (double, optional) – Instead of giving indices, passing a proportion of the original sample set is also possible. The indices will be chosen uniformly at random without replacement. The value must be chosen such that \(0 <\) prop \(\leq 1\). Defaults to None.

If None is specified for both idx and prop, all samples are used and the subset equals the original sample set. This is also the default behavior if this function is never called and the parameters are not specified during initialization.

Note

idx and prop cannot both be specified, as a conflict would arise.
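The selection rules for idx and prop described above can be sketched as follows. The helper select_subset and the rounding rule used to turn a proportion into a number of indices are assumptions for the example; the library's own behavior may differ in these details.

```python
import numpy as np

def select_subset(num_sample, idx=None, prop=None, rng=None):
    """Return the indices of the sample subset to use (illustrative only)."""
    if idx is not None and prop is not None:
        raise ValueError("idx and prop cannot both be specified")
    if idx is not None:
        return np.asarray(idx, dtype=int)
    if prop is not None:
        if not 0 < prop <= 1:
            raise ValueError("prop must satisfy 0 < prop <= 1")
        rng = np.random.default_rng() if rng is None else rng
        # Assumed rounding rule: nearest integer, with at least one index.
        num_idx = max(1, int(round(prop * num_sample)))
        return np.sort(rng.choice(num_sample, size=num_idx, replace=False))
    # Neither given: the subset equals the original sample set.
    return np.arange(num_sample)

all_idx = select_subset(5)
some_idx = select_subset(10, prop=0.4, rng=np.random.default_rng(0))
```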

property target: Tensor

Target to be matched.

train(mode=True)

Sets the training mode. Evaluation mode (mode=False) disables the gradient computation and stochasticity. For the gradients and other aspects, we refer to the torch.nn.Module documentation. For the stochastic part, when put in evaluation mode (False), all the sample points are used for the computations, regardless of the previously specified indices.

transform_input(data) Tensor | None

Applies to data the same transform as on the sample.

transform_sample_revert(data) Tensor

Gets back the original value from a projected value, by using the same transform as the sample, but in reverse. This is not always feasible, depending on the transform used (normalizations are typically not invertible, as they are not bijective transforms).
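The difference between an invertible and a non-invertible transform can be illustrated with two simple cases. In the sketch below, centering with stored sample statistics can be reverted exactly, whereas scaling each point to unit norm maps distinct points to the same value. Both transforms are chosen for the example and are not necessarily the ones the library applies.

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(loc=3.0, scale=2.0, size=(6, 2))

# Invertible transform: centering with statistics stored from the sample.
mean = sample.mean(axis=0)
projected = sample - mean
reverted = projected + mean   # recovers the original values

# Non-invertible transform: scaling to unit norm discards the length.
a = np.array([[1.0, 2.0]])
b = 3.0 * a
unit_a = a / np.linalg.norm(a, axis=1, keepdims=True)
unit_b = b / np.linalg.norm(b, axis=1, keepdims=True)   # same as unit_a
```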

update_dual(val: Tensor, idx_sample=None) None
update_sample(sample_values, idx_sample=None)

Updates the sample set. In contrast to init_sample, this only updates the values of the sample and sets the gradients of the updated values to zero if relevant.

Parameters:
  • sample_values (Tensor) – Values given to the updated samples.

  • idx_sample (int[], optional) – Indices of the samples to be updated. All indices are considered if None. Defaults to None.

property watch_properties: list[str]

Properties to be watched (monitored). Their values can be accessed through the property watched_properties. This is relevant e.g. in case of training.

property watched_properties: dict

A dictionary containing the values of the properties that are specified in watch_properties.