Multi-View Kernel Principal Component Analysis

class kerch.level.MVKPCA(*args, **kwargs)[source]

Bases: _KPCA, MVLevel

Multi-View Kernel Principal Component Analysis.

property C: Tensor

property H: Tensor

property K: Tensor

property Ks: Tensor

property Phi: Tensor

property PhiW: Tensor

property W: Tensor

after_step() → None: Performs after-step operations, for example a transform of the parameters onto some manifold.

attach_to(weight_fn) → None

attach_view(view) → None: Adds a view

property attached: bool: Boolean indicating whether the view is attached to another multi_view.

before_step() → None: Performs steps required before each training step.

c(x=None) → Tensor

cache_keys(private: bool = False) → Iterable[str]

Returns an iterable containing the different cache keys. We refer to the Cache Management documentation for more information.

Parameters:: private (bool, optional) – Some cache elements are private and are not returned unless set to True. Defaults to False.

property cache_level: str

Cache level that controls which temporary results are saved during execution. The higher the cache level, the more is saved. Defaults to 'normal' unless set otherwise during instantiation. The possible values are:

"none": the cache is non-existent and everything is computed on the go.
"light": the cache is very light. For example, only the kernel matrix and statistics of the sample points are saved.
"normal": same as light, but the statistics of the out-of-sample points are also saved.
"heavy": in addition to the statistics, the final kernel matrices of the out-of-sample points are saved.
"total": every step of any computation is saved.

We refer to the Cache Management documentation for further information.

detach() → None

detach_all() → None: Detaches all the known.

property dim_feature: int: Dimension of the explicit feature map if relevant.

property dim_output: int: Output dimension.

property dims_feature: List[int]: List containing the feature dimension of each view.

property dims_feature_cumulative: List[int]: List containing the cumulative feature dimensions of each view with the previous ones.
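
To illustrate how dims_feature and dims_feature_cumulative might relate, the sketch below builds a running sum with itertools.accumulate. The per-view dimensions are made up, and the sketch assumes that each cumulative entry includes the view's own dimension, which the description suggests but does not state outright.

```python
from itertools import accumulate

# Hypothetical feature dimensions of three views (illustrative values only).
dims_feature = [4, 2, 3]

# Assumption: each entry adds the view's own dimension to those of the
# previous views (an inclusive running sum).
dims_feature_cumulative = list(accumulate(dims_feature))

# Under that assumption, consecutive entries delimit each view's slice
# of the concatenated feature vector.
starts = [0] + dims_feature_cumulative[:-1]
slices = list(zip(starts, dims_feature_cumulative))

print(dims_feature_cumulative)
print(slices)
```

If the library instead uses an exclusive running sum, the starts list above would be the relevant quantity.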

draw_h(num: int = 1) → Tensor

Draws \(h^\star\) from a normal distribution.

Parameters:: num (int, optional) – Number of \(h^\star\) to be sampled, defaults to 1.
Returns:: Latent representation.
Return type:: torch.Tensor [num, dim_output]
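
The following is a minimal sketch of what such a draw could look like, written with the standard library rather than torch. It assumes independent entries with zero mean and unit variance; the actual parameters of the distribution are not stated here. The function name draw_h_sketch and the seed argument are hypothetical.

```python
import random

def draw_h_sketch(num=1, dim_output=2, seed=None):
    """Sample `num` latent vectors with i.i.d. standard normal entries.

    Assumption: zero mean and unit variance; the library may differ.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim_output)] for _ in range(num)]

# The result mimics the documented shape [num, dim_output] as nested lists.
h_star = draw_h_sketch(num=3, dim_output=2, seed=0)
print(len(h_star), len(h_star[0]))
```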

draw_k(num: int = 1, posterior: bool = False) → Tensor

Draws a dual representation k given its posterior distribution.

Parameters:

posterior (bool, optional) – Indicates whether k has to be drawn from its posterior distribution or its conditional given the prior of h. Defaults to False.
num (int, optional) – Number of k to be sampled, defaults to 1.

Returns:

Dual representation.

Return type:

Tensor[num, num_idx]

draw_phi(num: int = 1, posterior: bool = True) → Tensor

Draws a primal representation phi given its posterior distribution.

Parameters:

posterior (bool, optional) – Indicates whether phi has to be drawn from its posterior distribution or its conditional given the prior of h. Defaults to True.
num (int, optional) – Number of phi to be sampled, defaults to 1.

Returns:

Primal representation.

Return type:

Tensor[num, dim_input]

property dual_correlation: Tensor: Correlation of the hidden variables \(\mathbf{h}^\top \mathbf{h}\). This should be the identity provided that the hidden variables lie on the Stiefel manifold.

property dual_param: Tensor: Dual parameter of size [num_idx, dim_output].

property dual_projector: Tensor: Projector on the subspace spanned by the hidden variables \(\mathbf{h}\mathbf{h}^\top\). This is a rigorous projector provided its determinant is unity, e.g. when the hidden variables lie on the Stiefel manifold.
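
The two statements about the Stiefel manifold can be checked on a toy example. The sketch below uses plain Python lists in place of tensors and a hand-picked matrix with orthonormal columns; the helper names matmul and transpose are illustrative and not part of the library.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

# Toy hidden variables with orthonormal columns (num_idx = 3, dim_output = 2).
h = [[0.6, 0.0],
     [0.8, 0.0],
     [0.0, 1.0]]

correlation = matmul(transpose(h), h)   # h^T h, shape [2, 2]
projector = matmul(h, transpose(h))     # h h^T, shape [3, 3]

# A projector is idempotent: applying it twice changes nothing.
projector_squared = matmul(projector, projector)

print(correlation)
print(projector)
```

Up to floating-point rounding, correlation is the 2 x 2 identity and projector_squared matches projector.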

property dual_trainable: bool: Returns whether the hidden variables are trainable (a gradient can be computed on them).

property eta: float: Level weight \(\eta\), e.g. for the weight of the loss.

forward(x=None, representation=None)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

h(phi: Tensor | None = None, k: Tensor | None = None) → Tensor

Returns the maximum a posteriori (MAP) estimate of h. Depending on which input is given, either the primal or the dual representation is used.

Parameters:

phi (Tensor[N, dim_input], optional) – Primal representation.
k (Tensor[N, num_idx], optional) – Dual representation.

Returns:

MAP of h given phi or k.

Return type:

Tensor[N, dim_output]

property hparams_fixed: dict: Fixed hyperparameters of the module. By contrast with hparams_variable, these are the values that are fixed and cannot possibly change during the training. If applicable, these can be specific architecture values for example. We refer to the documentation of Kerch Module for further information.

property hparams_variable: dict

Variable hyperparameters of the module. By contrast with hparams_fixed, these are the values that may change during the training and may be monitored at various stages of it. If applicable, these can be kernel bandwidth parameters for example.

Note

All parameters that are potentially trainable, like a kernel bandwidth \(\sigma\) for example, are included in this dictionary, even if the corresponding trainable argument is set to False. In the latter case, they will not evolve during training iterations, but will still be included in this dictionary.

We refer to the documentation of Kerch Module for further information.

property idx: Tensor: Indices used when performing various operations. This is only relevant in the case of stochastic training.

init_parameters(representation=None, overwrite=True) → None

Initializes the model parameters: the weight in primal and the hidden values in dual. This is suitable for gradient-based training.

Parameters:

representation (str, optional) – ‘primal’ or ‘dual’
overwrite (bool, optional) – If False, parameters that are already initialized are left untouched. Defaults to True.

k(x: Tensor | Parameter | dict | list | str | None = None) → Tensor

k_map(h: Tensor) → Tensor

RKHS representation \(k(x^\star,\mathtt{sample})\) given a latent representation \(h^\star\).

\[k(x^\star, x_j) = K H^\top h^\star,\]

with \(K\) the kernel matrix on the sample self.K and \(H\) the hidden vectors self.hidden.

Parameters:: h (Tensor[N, dim_output]) – Latent representation \(h^\star\).
Returns:: RKHS representation \(k(x^\star,\mathtt{sample})\).
Return type:: Tensor[N, num_idx]

ks(x: Tensor | Parameter | dict | list | str | None = None) → Iterator[Tensor]

property level_trainable: bool: Specifies whether the parameters weight and hidden are trainable or not.

loss(representation=None) → Tensor: Reconstruction error on the sample.

losses(representation=None) → dict: Different components of the losses.

model_variance(as_tensor=False, normalize=True) → float | Tensor

Total variance learnt by the model given by the sum of the eigenvalues.

Parameters:: as_tensor (bool, optional) – Indicates whether the variance is returned as a float or a torch.Tensor. Defaults to False.

Warning

For this value to strictly be interpreted as a variance, the corresponding kernel (or feature map) has to be normalized. We refer to the remark of total_variance.

property named_views: Iterator[Tuple[str, View]]

property num_idx: int: Number of selected indices when performing various operations. This is only relevant in the case of stochastic training.

property num_views: int

phi(x: Tensor | Parameter | dict | list | str | None = None) → Tensor

phi_map(h: Tensor) → Tensor

Feature representation \(\phi(x^\star)\) given a latent representation \(h^\star\).

\[\phi(x^\star) = W h^\star.\]

Parameters:: h (Tensor[N, dim_output]) – Latent representation \(h^\star\).
Returns:: Feature representation \(\phi(x^\star)\).
Return type:: Tensor[N, dim_feature]
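
As a rough illustration of the shapes involved, the sketch below applies \(\phi(x^\star) = W h^\star\) to each row of a batch, using plain Python lists instead of tensors. It assumes W has the shape [dim_feature, dim_output] given for primal_param; the function name phi_map_sketch and all numeric values are made up.

```python
def phi_map_sketch(h, w):
    """Apply phi = W h to every row of h.

    h: list of N latent vectors, each of length dim_output.
    w: weights as a list of dim_feature rows, each of length dim_output.
    Returns a list of N feature vectors, each of length dim_feature.
    """
    return [[sum(w_ij * h_j for w_ij, h_j in zip(w_row, h_row)) for w_row in w]
            for h_row in h]

# Toy weights with dim_feature = 3 and dim_output = 2.
w = [[1.0, 0.0],
     [0.0, 2.0],
     [1.0, 1.0]]

# Batch of N = 2 latent representations.
h_star = [[1.0, 1.0],
          [2.0, -1.0]]

phi = phi_map_sketch(h_star, w)
print(phi)
```

For a whole batch this is the matrix product of the latent batch with \(W^\top\), which is how the [N, dim_output] input becomes an [N, dim_feature] output.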

phis(x: Tensor | Parameter | dict | list | str | None = None) → Iterator[Tensor]

phiw(x=None, representation=None) → Tensor

Parameters:

x (torch.Tensor [num, dim_input]) – Input, defaults to the sample (None).
representation (str, optional) – ‘primal’ or ‘dual’.

Returns:

\(\phi(x)^\top W\) or \(k(x)^\top H\)

Return type:

torch.Tensor [num, dim_output]

predict(known, **kwargs)[source]

property primal_correlation: Tensor: Correlation of the weights \(\mathbf{w}^\top \mathbf{w}\). This should be the identity provided that the weights lie on the Stiefel manifold.

property primal_param: Tensor: Primal parameters of size [dim_feature, dim_output].

property primal_projector: Tensor: Projector on the subspace spanned by the weights \(\mathbf{w}\mathbf{w}^\top\). This is a rigorous projector provided its determinant is unity, e.g. when the weights lie on the Stiefel manifold.

print_cache(private: bool = False) → None

Prints the cache content. We refer to the Cache Management documentation for further information.

Parameters:: private (bool, optional) – Some cache elements are private and are not returned unless set to True. Defaults to False.

project(known: dict, representation: str = None) → Tensor[source]

relative_variance(as_tensor=False) → float | Tensor

Relative variance learnt by the model, given by model_variance/total_variance. This number always lies between 0 and 1 and sidesteps any normalization considerations.

Parameters:: as_tensor (bool, optional) – Indicates whether the variance is returned as a float or a torch.Tensor. Defaults to False.
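
The ratio can be sketched on a toy spectrum. The example assumes non-negative eigenvalues (as for a positive semi-definite kernel matrix) and that the total variance equals the sum of all eigenvalues; the function name and the values are illustrative only.

```python
def relative_variance_sketch(kept_vals, all_vals):
    """Ratio of the variance captured by the kept components to the total."""
    model_variance = sum(kept_vals)   # sum of the eigenvalues kept by the model
    total_variance = sum(all_vals)    # sum of all eigenvalues (the trace)
    return model_variance / total_variance

# Toy spectrum, sorted in decreasing order.
all_vals = [4.0, 2.0, 1.0, 1.0]

# Keep the two leading components.
kept_vals = all_vals[:2]

ratio = relative_variance_sketch(kept_vals, all_vals)
print(ratio)
```

Rescaling every eigenvalue by the same positive factor leaves the ratio unchanged, which is why normalization does not matter for this quantity.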

property representation: str

reset(recurse=False, reset_persisting=True) → None

Resets the cache to be empty. We refer to the Cache Management documentation for more information.

Parameters:

recurse (bool, optional) – If True, resets the cache of this module and also of its potential children. Otherwise, it only resets the cache of this module. Defaults to False.
reset_persisting (bool, optional) – Persisting elements are meant to survive a cache reset (see _save()). If True, this option resets them as well. Defaults to True.

solve(sample=None, target=None, representation=None, **kwargs) → None

Solves the model by decomposing the kernel matrix or the covariance matrix in principal components (eigendecomposition).

Fits the model according to the input sample and output target. Many models have both a primal and a dual formulation to be fitted.

Parameters:: representation (str, optional) – Representation of the model ("primal" or "dual"). Defaults to "dual".
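
To give a feel for what decomposing a kernel matrix into principal components means, the sketch below extracts the leading eigenpair of a small symmetric matrix. It deliberately swaps in plain power iteration to stay dependency-free; this is not a description of the solver the library uses, and the function name top_eigenpair is hypothetical.

```python
import math

def top_eigenpair(matrix, num_iter=200):
    """Leading eigenvalue and eigenvector of a symmetric matrix by power iteration."""
    n = len(matrix)
    # Start vector; it must not be orthogonal to the leading eigenvector.
    v = [1.0] + [0.5] * (n - 1)
    for _ in range(num_iter):
        w = [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v of the unit-norm vector gives the eigenvalue.
    mv = [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in matrix]
    val = sum(a * b for a, b in zip(v, mv))
    return val, v

# Toy symmetric matrix standing in for a kernel matrix; its eigenvalues are 3 and 1.
K = [[2.0, 1.0],
     [1.0, 2.0]]

val, vec = top_eigenpair(K)
print(val, vec)
```

In dual form, the leading eigenvectors of the kernel matrix play the role of the hidden variables, and the eigenvalues are what vals exposes once the model is fitted.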

property sqrt_vals: Tensor

stochastic(idx=None, prop=None)

Resets which subset of the samples are to be used until the next call of this function. This is relevant in the case of stochastic training.

Parameters:

idx (int[], optional) – Indices of the sample subset relative to the original sample set. Defaults to None.
prop (double, optional) – Instead of giving indices, passing a proportion of the original sample set is also possible. The indices will be chosen uniformly at random without replacement. The value must be chosen such that \(0 <\) prop \(\leq 1\). Defaults to None.

If None is specified for both idx and prop, all samples are used and the subset equals the original sample set. This is also the default behavior if this function is never called and the corresponding parameters are not specified during initialization.

Note

idx and prop cannot both be specified, as they would conflict.
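
The selection rules above can be sketched with the standard library. The function name choose_indices, the seed argument and the rounding of the subset size are assumptions made for the illustration; the library's own implementation may differ in each of these respects.

```python
import random

def choose_indices(num_sample, idx=None, prop=None, seed=None):
    """Return the indices of the sample subset, following the rules above."""
    if idx is not None and prop is not None:
        raise ValueError("idx and prop cannot both be specified")
    if idx is not None:
        return list(idx)
    if prop is None:
        # Neither argument given: use the whole sample set.
        return list(range(num_sample))
    if not 0 < prop <= 1:
        raise ValueError("prop must satisfy 0 < prop <= 1")
    # Assumption: the subset size is the rounded proportion, at least one point.
    num = max(1, round(prop * num_sample))
    rng = random.Random(seed)
    # Uniform sampling without replacement.
    return sorted(rng.sample(range(num_sample), num))

subset = choose_indices(10, prop=0.5, seed=0)
everything = choose_indices(4)
print(subset, everything)
```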

total_variance(as_tensor=False, normalize=True, representation=None) → float | Tensor

Total variance contained in the feature map. In primal formulation, this is given by \(\operatorname{tr}(C)\), where \(C = \sum\phi(x)\phi(x)^\top\) is the covariance matrix on the sample. In dual, this is given by \(\operatorname{tr}(K)\), where \(K_{ij} = k(x_i,x_j)\) is the kernel matrix on the sample.

Parameters:: as_tensor (bool, optional) – Indicates whether the variance is returned as a float or a torch.Tensor. Defaults to False.

Warning

For this value to strictly be interpreted as a variance, the corresponding kernel (or feature map) has to be normalized. In that case however, the total variance will amount to the dimension of the feature map in primal and the number of datapoints in dual.
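
The dual part of this remark can be checked on a small example. The sketch takes "normalized" to mean a unit diagonal, \(k(x,x) = 1\), and uses an RBF kernel as one kernel with that property; the sample points and the bandwidth are made up.

```python
import math

def rbf(x, y, sigma=1.0):
    """RBF kernel, which satisfies k(x, x) = 1 for every x."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * sigma ** 2))

# Toy sample of four two-dimensional points.
sample = [[0.0, 0.0], [1.0, 0.5], [2.0, -1.0], [0.5, 3.0]]

# Kernel matrix on the sample, K_ij = k(x_i, x_j).
K = [[rbf(xi, xj) for xj in sample] for xi in sample]

# Total variance in dual form: the trace of K.
trace_K = sum(K[i][i] for i in range(len(sample)))
print(trace_K, len(sample))
```

With a unit diagonal, the trace equals the number of datapoints whatever the off-diagonal entries are.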

train(mode=True): Activates the training mode if mode is True, which enables the gradient computation and stochasticity. For the gradients and other things, we refer to the torch.nn.Module documentation. For the stochastic part, when put in evaluation mode (False), all the sample points are used for the computations, regardless of the previously specified indices.

update_dual(val: Tensor, idx_sample=None) → None

property vals: Tensor: Eigenvalues of the model. The model has to be fitted for these values to exist.

view(id) → View

property views: Iterator[View]

views_by_name(names: List[str]) → Iterator[View]

property watch_properties: list[str]: Properties to be watched (monitored). Their values can be accessed through the property watched_properties. This is relevant e.g. in case of training.

property watched_properties: dict: A dictionary containing the values of the properties that are specified in watch_properties.

weights_by_name(names: List[str])