Kernel Smoother

Functions

Coefficients-Based Smoother

kerch.method.smoother(coefficients: Tensor, observations: Tensor, num: int | str = 'all') → Tensor

Returns the sum of the observations weighted by the coefficients, normalized by the sum of the coefficients for each point.

\[\mathtt{out}_{i,j} = \frac{\sum_l\mathtt{coefficients}_{i,l} * \mathtt{observations}_{l,j}}{\sum_l\mathtt{coefficients}_{i,l}}.\]
Parameters:
  • coefficients (torch.Tensor [num_points, num_observations]) – coefficients used in the smoother.

  • observations (torch.Tensor [num_observations, dim_observations]) – observation corresponding to each column of the coefficients.

  • num (int or str, optional) – Number of closest points to be used. Either an integer representing the number or the string 'all'. Defaults to 'all'.

Returns:

Weighted observations

Return type:

torch.Tensor [num_points, dim_observations]
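
The normalized weighted sum above can be illustrated without any dependency. The following is a minimal plain-Python sketch of the formula only, not kerch's implementation: it works on nested lists instead of tensors, ignores the num argument, and the function name is reused purely for illustration.

```python
def smoother(coefficients, observations):
    """Row-normalized weighted sum of the observations (illustrative sketch)."""
    dim = len(observations[0])  # dim_observations
    out = []
    for row in coefficients:  # one row of coefficients per output point
        total = sum(row)  # normalizing constant for this point
        out.append([
            sum(c * obs[j] for c, obs in zip(row, observations)) / total
            for j in range(dim)
        ])
    return out


# 2 points, 3 observations of dimension 1
coefficients = [[1.0, 1.0, 0.0],
                [0.0, 1.0, 3.0]]
observations = [[0.0], [2.0], [4.0]]

result = smoother(coefficients, observations)
print(result)  # [[1.0], [3.5]]
```

The first point averages the first two observations equally, while the second point leans towards the third observation because its coefficient is three times larger.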

Kernel Smoother

kerch.method.kernel_smoother(domain: Tensor, observations: Tensor, num: int | str = 'all', kernel_type: str = 'rbf', **kwargs) → Tensor

Smooths the function \(f(\mathtt{domain}_{i,:}) = \mathtt{observations}_{i,:}\) with weights given by a kernel evaluated on the domain.

\[\mathtt{out}_{i,j} = \frac{\sum_l k\left(\mathtt{domain}_{i,:},\mathtt{domain}_{l,:}\right) * \mathtt{observations}_{l,j}}{\sum_l k\left(\mathtt{domain}_{i,:},\mathtt{domain}_{l,:}\right)}.\]

The kernel is defined as in kerch.kernel.factory().

Parameters:
  • domain (torch.Tensor [num_observations, dim_domain]) – domain corresponding to each observation.

  • observations (torch.Tensor [num_observations, dim_observations]) – observation corresponding to each domain entry.

  • num (int or str, optional) – Number of closest points to be used. Either an integer representing the number or the string 'all'. Defaults to 'all'.

  • kernel_type (str, optional) – Type of kernel chosen. For the possible choices, please refer to the Factory Type column of the Kernel Module documentation. Defaults to kerch.DEFAULT_KERNEL_TYPE.

  • **kwargs (dict, optional) – Arguments to be passed to the kernel constructor, such as sample or sigma. An argument that does not apply to the chosen kernel (e.g. sigma for a linear kernel) is ignored. For the default values, please refer to the documentation of the requested kernel.

Returns:

Smoothed function \(f\) according to kernel \(k\).

Return type:

torch.Tensor [num_observations, dim_observations]
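
To make the role of the kernel weights concrete, here is a minimal plain-Python sketch of the formula above for a one-dimensional domain. It is not kerch's implementation: the helper names rbf and kernel_smoother_1d are hypothetical, the num argument is left out, and the RBF kernel is assumed to take the common form exp(-(x - y)^2 / (2 sigma^2)).

```python
import math


def rbf(x, y, sigma):
    """Assumed RBF form: exp(-(x - y)^2 / (2 * sigma^2))."""
    return math.exp(-((x - y) ** 2) / (2 * sigma ** 2))


def kernel_smoother_1d(domain, observations, sigma=1.0):
    """Kernel-weighted average of the observations at each domain point."""
    out = []
    for xi in domain:
        weights = [rbf(xi, xl, sigma) for xl in domain]
        weighted = sum(w * y for w, y in zip(weights, observations))
        out.append(weighted / sum(weights))
    return out


domain = [0.0, 1.0, 2.0, 3.0]
observations = [0.0, 1.0, 0.0, 1.0]

# A very small bandwidth barely smooths; a very large one averages everything.
smooth_narrow = kernel_smoother_1d(domain, observations, sigma=0.1)
smooth_wide = kernel_smoother_1d(domain, observations, sigma=100.0)
print(smooth_narrow)
print(smooth_wide)
```

With sigma=0.1 each point is dominated by its own observation, so the output stays close to the input. With sigma=100 all weights are nearly equal and every output approaches the global mean of 0.5.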

Example

Different Kernels, Same Bandwidth

The following example shows that different kernels behave differently. They all use the same bandwidth: when none is specified, the bandwidth is derived from the distance matrix, which is identical here because every kernel is built on the same domain. A single bandwidth is not always appropriate for every kernel.

import kerch
import torch
from matplotlib import pyplot as plt

# data
fun = lambda x: torch.sin(x ** 2)

x_equal = torch.linspace(0, 2, 100)
x_nonequal = 2 * torch.sort(torch.rand(40)).values

y_original = fun(x_equal)
y_noisy = fun(x_nonequal) + .2 * torch.randn_like(x_nonequal)

plt.plot(x_equal, y_original, label="Original Data", color="black", linestyle='dotted')
plt.scatter(x_nonequal, y_noisy, label="Noisy Data", color="black")

# kernels
kernels = [('RBF', 'red'),
           ('Laplacian', 'orange'),
           ('Logistic', 'olive'),
           ('Epanechnikov', 'gold'),
           ('Quartic', 'chartreuse'),
           ('Silverman', 'green'),
           ('Triangular', 'teal'),
           ('Tricube', 'cyan'),
           ('Triweight', 'royalblue'),
           ('Uniform', 'purple')]

# kernel smoother
for name, c in kernels:
    y_reconstructed = kerch.method.kernel_smoother(domain=x_nonequal, observations=y_noisy, kernel_type=name.lower())
    plt.plot(x_nonequal, y_reconstructed, label=name, color=c)

# plot
plt.title('Kernel Smoothing')
plt.xlabel('x')
plt.ylabel('y')
plt.legend(loc='lower center', ncol=3)


../_images/smoother-1.png

Same Kernels, Different Bandwidths

In this example, we show how two kernels react differently to the prescribed bandwidth. We consider the Laplacian kernel, which has a very heavy tail, and the Triweight kernel, which has a narrowly restricted support. We first look at their respective shapes.

import kerch
import torch
from matplotlib import pyplot as plt

# domain
x = torch.linspace(-3, 3, 500)

# define the kernels
k_l1 = kerch.kernel.Laplacian(sample=x, sigma=1)
k_l2 = kerch.kernel.Laplacian(sample=x, sigma=2)
k_t1 = kerch.kernel.Triweight(sample=x, sigma=1)
k_t2 = kerch.kernel.Triweight(sample=x, sigma=2)

# plot the shapes
# raw f-strings keep the backslash in \sigma from being read as an escape sequence
plt.plot(x, k_l1.k(y=0).squeeze(), label=rf"Laplacian with $\sigma$={k_l1.sigma}", color='black')
plt.plot(x, k_l2.k(y=0).squeeze(), label=rf"Laplacian with $\sigma$={k_l2.sigma}", color='black', linestyle='dashed')
plt.plot(x, k_t1.k(y=0).squeeze(), label=rf"Triweight with $\sigma$={k_t1.sigma}", color='red')
plt.plot(x, k_t2.k(y=0).squeeze(), label=rf"Triweight with $\sigma$={k_t2.sigma}", color='red', linestyle='dashed')

# annotate the plot
plt.title('Kernel Shape')
plt.xlabel('x')
plt.ylabel('k(x,y=0)')
plt.ylim(-.25, 1.1)
plt.legend(loc='lower center', ncol=2)


../_images/smoother-2.png
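
The difference between a heavy tail and a restricted support can also be checked numerically. The sketch below uses textbook forms of the two kernels, scaled so that k(x, x) = 1; this scaling and the helper names laplacian and triweight are assumptions made for illustration, and kerch's own definitions may differ in their normalization.

```python
import math


def laplacian(x, y, sigma):
    """Assumed form exp(-|x - y| / sigma): positive everywhere."""
    return math.exp(-abs(x - y) / sigma)


def triweight(x, y, sigma):
    """Assumed form (1 - u^2)^3 for u = |x - y| / sigma < 1, zero beyond."""
    u = abs(x - y) / sigma
    return (1 - u ** 2) ** 3 if u < 1 else 0.0


# Evaluate both kernels at distance 1.5 from the origin for two bandwidths.
for sigma in (1.0, 2.0):
    print(sigma, laplacian(1.5, 0.0, sigma), triweight(1.5, 0.0, sigma))
```

At distance 1.5, the Triweight kernel with sigma=1 contributes nothing, whereas the Laplacian kernel still assigns a non-zero weight. This is why a small bandwidth affects the two smoothers so differently.
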
import kerch
import torch
from matplotlib import pyplot as plt

# data
fun = lambda x: torch.sin(x ** 2)

x_equal = torch.linspace(0, 2, 100)
x_nonequal = 2 * torch.sort(torch.rand(40)).values

y_original = fun(x_equal)
y_noisy = fun(x_nonequal) + .2 * torch.randn_like(x_nonequal)

# plot
fig, axs = plt.subplots(1, 2)
for ax in axs.flatten():
    ax.plot(x_equal, y_original, label="Original Data", color="black", linestyle='dotted')
    ax.scatter(x_nonequal, y_noisy, label="Noisy Data", color="black")
    ax.set_xlabel('x')
    ax.set_ylabel('y')

# kernel smoother
sigmas = [(0.05, 'red'),
          (0.2, 'green'),
          (0.5, 'cyan'),
          (1.0, 'purple')]
for s, c in sigmas:
    y_laplacian = kerch.method.kernel_smoother(domain=x_nonequal, observations=y_noisy, kernel_type='laplacian', sigma=s)
    y_triweight = kerch.method.kernel_smoother(domain=x_nonequal, observations=y_noisy, kernel_type='triweight', sigma=s)
    axs[0].plot(x_nonequal, y_laplacian, color=c, label=rf"Bandwidth $\sigma$={s}")
    axs[1].plot(x_nonequal, y_triweight, color=c)

# titles and shared legend
fig.suptitle('Kernel Smoothing')
axs[0].set_title('Laplacian')
axs[1].set_title('Triweight')
fig.legend(*axs[0].get_legend_handles_labels(), loc='lower center', ncol=3)


../_images/smoother-3.png