Multi-View Kernel Principal Component Analysis

class kerch.level.MVKPCA(*args, **kwargs)

Bases: _KPCA, MVLevel

Multi-View Kernel Principal Component Analysis.

property C: Tensor
property H: Tensor
property K: Tensor
property Ks: Tensor
property Phi: Tensor
property PhiW: Tensor
property W: Tensor
after_step() None

Performs after-step operations, for example a transform of the parameters onto some manifold.

attach_to(weight_fn) None
attach_view(view) None

Adds a view

property attached: bool

Boolean indicating whether the view is attached to another multi_view.

before_step() None

Performs steps required before each training step.

c(x=None) Tensor
cache_keys(private: bool = False) Iterable[str]

Returns an iterable containing the different cache keys. We refer to the Cache Management documentation for more information.

Parameters:

private (bool, optional) – Some cache elements are private and are not returned unless set to True. Defaults to False.

property cache_level: str

Cache level for saving temporary results during execution. The higher the cache level, the more is saved. Defaults to "normal" unless set otherwise during instantiation. The possible values are:

  • "none": the cache is non-existent and everything is computed on the go.

  • "light": the cache is very light. For example, only the kernel matrix and statistics of the sample points are saved.

  • "normal": same as light, but the statistics of the out-of-sample points are also saved.

  • "heavy": in addition to the statistics, the final kernel matrices of the out-of-sample points are saved.

  • "total": every step of any computation is saved.

We refer to the Cache Management documentation for further information.

detach() None
detach_all() None

Detaches all the known views.

property dim_feature: int

Dimension of the explicit feature map if relevant.

property dim_output: int

Output dimension.

property dims_feature: List[int]

List containing the feature dimension of each view.

property dims_feature_cumulative: List[int]

List containing the cumulative feature dimensions of each view together with the previous ones.
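As a minimal sketch of the relation between dims_feature and dims_feature_cumulative, the running total can be computed with the standard library. The dimensions below are hypothetical, and whether the library's list starts with a leading 0 is not stated here, so this is an illustration only.

```python
from itertools import accumulate

# Hypothetical feature dimensions of three views (illustrative values only).
dims_feature = [3, 5, 2]

# Running total: each entry adds the view's dimension to those before it.
dims_feature_cumulative = list(accumulate(dims_feature))

# Slices locating each view inside a concatenated feature vector.
starts = [0] + dims_feature_cumulative[:-1]
view_slices = [slice(a, b) for a, b in zip(starts, dims_feature_cumulative)]

print(dims_feature_cumulative)  # [3, 8, 10]
```

The cumulative values are convenient for slicing a concatenated multi-view feature vector back into its per-view blocks.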

draw_h(num: int = 1) Tensor

Draws an \(h^\star\) from a normal distribution.

Parameters:

num (int, optional) – Number of \(h^\star\) to be sampled, defaults to 1.

Returns:

Latent representation.

Return type:

torch.Tensor [num, dim_output]
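The shape convention can be illustrated with a short sketch. It assumes a standard normal distribution (zero mean, unit variance), which the description above does not specify, and it uses a NumPy array in place of a torch tensor. The sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 latent samples of dimension 2.
num, dim_output = 4, 2

# Assumption: standard normal entries; one row per sampled latent point.
h_star = rng.standard_normal((num, dim_output))

print(h_star.shape)  # (4, 2)
```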

draw_k(num: int = 1, posterior: bool = False) Tensor

Draws a dual representation k given its posterior distribution.

Parameters:
  • posterior (bool, optional) – Indicates whether k has to be drawn from its posterior distribution or its conditional given the prior of h. Defaults to False.

  • num (int, optional) – Number of k to be sampled, defaults to 1.

Returns:

Dual representation.

Return type:

Tensor[num, num_idx]

draw_phi(num: int = 1, posterior: bool = True) Tensor

Draws a primal representation phi given its posterior distribution.

Parameters:
  • posterior (bool, optional) – Indicates whether phi has to be drawn from its posterior distribution or its conditional given the prior of h. Defaults to True.

  • num (int, optional) – Number of phi to be sampled, defaults to 1.

Returns:

Primal representation.

Return type:

Tensor[num, dim_input]

property dual_correlation: Tensor

Correlation of the hidden variables \(\mathbf{h}^\top \mathbf{h}\). This should be the identity provided that the hidden variables lie on the Stiefel manifold.

property dual_param: Tensor

Dual parameter of size [num_idx, dim_output].

property dual_projector: Tensor

Projector on the subspace spanned by the hidden variables \(\mathbf{h}\mathbf{h}^\top\). This is a rigorous projector provided its determinant is unity, e.g. when the hidden variables lie on the Stiefel manifold.
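The two properties above can be checked numerically. The sketch below builds a hypothetical matrix of hidden variables with orthonormal columns (a point on the Stiefel manifold) via a QR factorisation, then verifies that \(\mathbf{h}^\top \mathbf{h}\) is the identity and that \(\mathbf{h}\mathbf{h}^\top\) is symmetric and idempotent. NumPy stands in for torch, and the sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 6 sample points, 2 latent dimensions.
num_idx, dim_output = 6, 2

# The reduced QR factorisation yields a matrix with orthonormal columns.
H, _ = np.linalg.qr(rng.standard_normal((num_idx, dim_output)))

correlation = H.T @ H  # [dim_output, dim_output]
projector = H @ H.T    # [num_idx, num_idx]

# Orthonormal columns give an identity correlation.
print(np.allclose(correlation, np.eye(dim_output)))

# An orthogonal projector is symmetric and idempotent.
print(np.allclose(projector, projector.T))
print(np.allclose(projector @ projector, projector))
```

The same check applies to the primal weights, with the weight matrix in place of the hidden variables.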

property dual_trainable: bool

Returns whether the hidden variables are trainable (a gradient can be computed on them).

property eta: float

Level weight \(\eta\), e.g. for the weight of the loss.

forward(x=None, representation=None)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

h(phi: Tensor | None = None, k: Tensor | None = None) Tensor

Draws an h given the maximum a posteriori of the distribution. The input you provide determines whether the primal or the dual representation is used.

Parameters:
  • phi (Tensor[N, dim_input], optional) – Primal representation.

  • k (Tensor[N, num_idx], optional) – Dual representation.

Returns:

MAP of h given phi or k.

Return type:

Tensor[N, dim_output]

property hparams_fixed: dict

Fixed hyperparameters of the module. By contrast with hparams_variable, these are the values that are fixed and cannot possibly change during the training. If applicable, these can be specific architecture values for example. We refer to the documentation of Kerch Module for further information.

property hparams_variable: dict

Variable hyperparameters of the module. By contrast with hparams_fixed, these are the values that may change during the training and may be monitored at various stages of it. If applicable, these can be kernel bandwidth parameters for example.

Note

All parameters that are potentially trainable, like a kernel bandwidth \(\sigma\) for example, are included in this dictionary, even if the corresponding trainable argument is set to False. In the latter case, they will not evolve during training iterations, but will still be included in this dictionary.

We refer to the documentation of Kerch Module for further information.

property idx: Tensor

Indices used when performing various operations. This is only relevant in the case of stochastic training.

init_parameters(representation=None, overwrite=True) None

Initializes the model parameters: the weight in primal and the hidden values in dual. This is suitable for gradient-based training.

Parameters:
  • representation (str, optional) – ‘primal’ or ‘dual’

  • overwrite (bool, optional) – If False, parameters that are already initialized are not re-initialized. Defaults to True.

k(x: Tensor | Parameter | dict | list | str | None = None) Tensor
k_map(h: Tensor) Tensor

RKHS representation \(k(x^\star,\mathtt{sample})\) given a latent representation \(h^\star\).

\[k(x^\star, x_j) = K H^\top h^\star,\]

with \(K\) the kernel matrix on the sample self.K and \(H\) the hidden vectors self.hidden.

Parameters:

h (Tensor[N, dim_output]) – Latent representation \(h^\star\).

Returns:

RKHS representation \(k(x^\star,\mathtt{sample})\).

Return type:

Tensor[N, num_idx]

ks(x: Tensor | Parameter | dict | list | str | None = None) Iterator[Tensor]
property level_trainable: bool

Specifies whether the parameters weight and hidden are trainable or not.

loss(representation=None) Tensor

Reconstruction error on the sample.

losses(representation=None) dict

Different components of the losses.

model_variance(as_tensor=False, normalize=True) float | Tensor

Total variance learnt by the model given by the sum of the eigenvalues.

Parameters:

as_tensor (bool, optional) – Indicates whether the variance has to be returned as a float or a torch.Tensor. Defaults to False.

Warning

For this value to strictly be interpreted as a variance, the corresponding kernel (or feature map) has to be normalized. We refer to the remark of total_variance.

property named_views: Iterator[Tuple[str, View]]
property num_idx: int

Number of selected indices when performing various operations. This is only relevant in the case of stochastic training.

property num_views: int
phi(x: Tensor | Parameter | dict | list | str | None = None) Tensor
phi_map(h: Tensor) Tensor

Feature representation \(\phi(x^\star)\) given a latent representation \(h^\star\).

\[\phi(x^\star) = W h^\star.\]
Parameters:

h (Tensor[N, dim_output]) – Latent representation \(h^\star\).

Returns:

Feature representation \(\phi(x^\star)\).

Return type:

Tensor[N, dim_feature]
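A minimal sketch of this map on a batch of latent points follows. It assumes W has shape [dim_feature, dim_output], as stated for primal_param, and that latent points are stored row-wise, so the batched form of \(W h^\star\) is a product with the transpose of W. NumPy stands in for torch, and all values are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes.
dim_feature, dim_output, N = 5, 2, 3

W = rng.standard_normal((dim_feature, dim_output))  # primal weights
h = rng.standard_normal((N, dim_output))            # one latent point per row

# Row-wise application of phi = W h: each row of the result is W @ h[i].
phi = h @ W.T

print(phi.shape)  # (3, 5)
```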

phis(x: Tensor | Parameter | dict | list | str | None = None) Iterator[Tensor]
phiw(x=None, representation=None) Tensor
Parameters:
  • x (torch.Tensor [num, dim_input]) – Input, defaults to the sample (None).

  • representation (str, optional) – ‘primal’ or ‘dual’.

Returns:

\(\phi(x)^\top W\) or \(k(x)^\top H\)

Return type:

torch.Tensor [num, dim_output]

predict(known, **kwargs)
property primal_correlation: Tensor

Correlation of the weights \(\mathbf{w}^\top \mathbf{w}\). This should be the identity provided that the weights lie on the Stiefel manifold.

property primal_param: Tensor

Primal parameters of size [dim_feature, dim_output].

property primal_projector: Tensor

Projector on the subspace spanned by the weights \(\mathbf{w}\mathbf{w}^\top\). This is a rigorous projector provided its determinant is unity, e.g. when the weights lie on the Stiefel manifold.

print_cache(private: bool = False) None

Prints the cache content. We refer to the Cache Management documentation for further information.

Parameters:

private (bool, optional) – Some cache elements are private and are not returned unless set to True. Defaults to False.

project(known: dict, representation: str = None) Tensor
relative_variance(as_tensor=False) float | Tensor

Relative variance learnt by the model, given by model_variance/total_variance. This number always lies between 0 and 1 and avoids any considerations on normalization.

Parameters:

as_tensor (bool, optional) – Indicates whether the variance has to be returned as a float or a torch.Tensor. Defaults to False.
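The ratio can be illustrated with plain numbers. The eigenvalues below are hypothetical, and the sketch assumes that the model keeps the dim_output largest ones, so that model_variance is their sum while total_variance is the sum of all eigenvalues (the trace).

```python
# Hypothetical eigenvalues of a kernel matrix (illustrative values only).
eigenvalues = [4.0, 2.0, 1.0, 0.5, 0.5]
dim_output = 2  # number of components assumed to be kept by the model

# Sum of the largest eigenvalues retained by the model.
model_variance = sum(sorted(eigenvalues, reverse=True)[:dim_output])

# Sum of all eigenvalues, which equals the trace of the matrix.
total_variance = sum(eigenvalues)

relative_variance = model_variance / total_variance
print(relative_variance)  # 0.75
```

Rescaling every eigenvalue by the same factor leaves the ratio unchanged, which is why normalization does not matter here.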

property representation: str
reset(recurse=False, reset_persisting=True) None

Resets the cache to be empty. We refer to the Cache Management documentation for more information.

Parameters:
  • recurse (bool, optional) – If True, resets the cache of this module and also of its potential children. Otherwise, it only resets the cache for this module. Defaults to False.

  • reset_persisting (bool, optional) – Persisting elements are meant to survive a cache reset (see _save()). If True, this option resets them as well. Defaults to True.

solve(sample=None, target=None, representation=None, **kwargs) None

Solves the model by decomposing the kernel matrix or the covariance matrix in principal components (eigendecomposition).

Fits the model according to the input sample and output target. Many models have both a primal and a dual formulation to be fitted.

Parameters:

representation (str, optional) – Representation of the model ("primal" or "dual"). Defaults to "dual".
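The equivalence between the two formulations can be sketched outside the library. The example below assumes an explicit, uncentered feature matrix Phi (one sample per row) and uses NumPy's symmetric eigensolver rather than the library's own routine. It shows that the covariance matrix (primal) and the kernel matrix (dual) share their leading eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 8 samples, 3 features, 2 principal components.
n, dim_feature, dim_output = 8, 3, 2
Phi = rng.standard_normal((n, dim_feature))

C = Phi.T @ Phi  # covariance matrix, primal: [dim_feature, dim_feature]
K = Phi @ Phi.T  # kernel matrix, dual: [n, n]

# eigh returns eigenvalues in ascending order; reverse and keep the largest.
vals_c, vecs_c = np.linalg.eigh(C)
vals_k, vecs_k = np.linalg.eigh(K)
vals_primal, W = vals_c[::-1][:dim_output], vecs_c[:, ::-1][:, :dim_output]
vals_dual, H = vals_k[::-1][:dim_output], vecs_k[:, ::-1][:, :dim_output]

print(np.allclose(vals_primal, vals_dual))  # leading eigenvalues coincide
print(W.shape, H.shape)                     # (3, 2) (8, 2)
```

The primal problem scales with the feature dimension and the dual with the number of samples, which is typically what guides the choice of representation.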

property sqrt_vals: Tensor
stochastic(idx=None, prop=None)

Resets which subset of the samples are to be used until the next call of this function. This is relevant in the case of stochastic training.

Parameters:
  • idx (int[], optional) – Indices of the sample subset relative to the original sample set. Defaults to None.

  • prop (double, optional) – Instead of giving indices, passing a proportion of the original sample set is also possible. The indices will be chosen uniformly at random without replacement. The value must be chosen such that \(0 <\) prop_stochastic \(\leq 1\). Defaults to None.

If None is specified for both idx_stochastic and prop_stochastic, all samples are used and the subset equals the original sample set. This is also the default behavior if this function is never called, nor the parameters specified during initialization.

Note

idx_stochastic and prop_stochastic cannot both be specified, as they would conflict.
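The selection rules described above can be sketched with the standard library. The helper choose_indices is hypothetical and not part of the library. In particular, the rounding of the subset size and the sorting of the result are assumptions made for this illustration.

```python
import random

def choose_indices(num_samples, idx=None, prop=None, seed=None):
    """Hypothetical helper mimicking the described idx/prop selection rules."""
    if idx is not None and prop is not None:
        raise ValueError("idx and prop cannot both be specified")
    if idx is not None:
        return list(idx)
    if prop is None:
        # Neither argument given: the subset is the full sample set.
        return list(range(num_samples))
    if not 0 < prop <= 1:
        raise ValueError("prop must satisfy 0 < prop <= 1")
    # Assumption: the subset size is the rounded proportion, at least 1.
    k = max(1, round(prop * num_samples))
    # Uniform sampling without replacement.
    return sorted(random.Random(seed).sample(range(num_samples), k))

subset = choose_indices(10, prop=0.3, seed=0)
print(len(subset))        # 3
print(choose_indices(5))  # [0, 1, 2, 3, 4]
```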

total_variance(as_tensor=False, normalize=True, representation=None) float | Tensor

Total variance contained in the feature map. In primal formulation, this is given by \(\operatorname{tr}(C)\), where \(C = \sum\phi(x)\phi(x)^\top\) is the covariance matrix on the sample. In dual, this is given by \(\operatorname{tr}(K)\), where \(K_{ij} = k(x_i,x_j)\) is the kernel matrix on the sample.

Parameters:

as_tensor (bool, optional) – Indicates whether the variance has to be returned as a float or a torch.Tensor. Defaults to False.

Warning

For this value to strictly be interpreted as a variance, the corresponding kernel (or feature map) has to be normalized. In that case however, the total variance will amount to the dimension of the feature map in primal and the number of datapoints in dual.
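The two traces can be compared in a short sketch. It assumes an explicit feature matrix Phi with one sample per row, and it takes normalization to mean rescaling each feature vector to unit norm, which is one possible reading of the warning above. Under that assumption the kernel matrix has a unit diagonal, so its trace equals the number of datapoints.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 6 samples with 4 explicit features.
n, dim_feature = 6, 4
Phi = rng.standard_normal((n, dim_feature))

C = Phi.T @ Phi  # covariance matrix on the sample (primal)
K = Phi @ Phi.T  # kernel matrix on the sample (dual)

# Both formulations give the same total variance.
print(np.isclose(np.trace(C), np.trace(K)))

# Assumption: normalization rescales each feature vector to unit norm.
Phi_unit = Phi / np.linalg.norm(Phi, axis=1, keepdims=True)
K_unit = Phi_unit @ Phi_unit.T

# The normalized kernel has ones on its diagonal, so its trace is n.
print(np.isclose(np.trace(K_unit), n))
```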

train(mode=True)

Sets the module in training mode (mode=True) or evaluation mode (mode=False). Evaluation mode disables stochasticity: all the sample points are used for the computations, regardless of the previously specified indices. For the effect on gradients and other aspects, we refer to the torch.nn.Module documentation.

update_dual(val: Tensor, idx_sample=None) None
property vals: Tensor

Eigenvalues of the model. The model has to be fitted for these values to exist.

view(id) View
property views: Iterator[View]
views_by_name(names: List[str]) Iterator[View]
property watch_properties: list[str]

Properties to be watched (monitored). Their values can be accessed through the property watched_properties. This is relevant e.g. in case of training.

property watched_properties: dict

A dictionary containing the values of the properties that are specified in watch_properties.

weights_by_name(names: List[str])