gp-zoo
Implementations of major papers on Gaussian Process regression, written from scratch in Python, notably including Stochastic Variational Gaussian Processes.
GP-Zoo is a collection of readable, simple implementations of significant Gaussian Process papers from the past decade, ranging from Sparse Gaussian Process Regression (Titsias, 2009) to Stochastic Variational Gaussian Processes for classification and regression (Hensman et al., 2014).
Implementations
We start with the standard GPJax dataset for regression.
import jax
import jax.numpy as jnp
import jax.random as jr
import matplotlib.pyplot as plt

key = jr.PRNGKey(0)
n = 1000
noise = 0.2

# Noisy observations of a smooth function on [-5, 5].
x = jr.uniform(key=key, minval=-5.0, maxval=5.0,
               shape=(n,)).sort().reshape(-1, 1)
def f(x): return jnp.sin(4 * x) + jnp.cos(2 * x)
signal = f(x)
y = signal + jr.normal(key, shape=signal.shape) * noise

# Test grid and 50 evenly spaced inducing-point locations.
xtest = jnp.linspace(-5.5, 5.5, 500).reshape(-1, 1)
z = jnp.linspace(-5.0, 5.0, 50).reshape(-1, 1)

fig, ax = plt.subplots(figsize=(12, 5))
ax.plot(x, y, "o", alpha=0.3, label="Samples")
ax.plot(xtest, f(xtest), label="True function")
for z_i in z:  # mark the inducing-point locations
    ax.axvline(x=z_i, color="black", alpha=0.3, linewidth=1)
ax.legend()
plt.show()
GP-Zoo's regression implementations are all based on this dataset, although it can easily be swapped out by replacing x and y. For example, Stochastic Variational Gaussian Processes for Classification (Hensman et al., 2014) is demonstrated on the moons dataset:
import jax.numpy as jnp
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons
from sklearn.preprocessing import StandardScaler

n_samples = 100
noise = 0.1
random_state = 0
shuffle = True

X, y = make_moons(
    n_samples=n_samples, random_state=random_state, noise=noise, shuffle=shuffle
)
X = StandardScaler().fit_transform(X)  # Yes, this is useful for GPs
X, y = map(jnp.array, (X, y))

plt.scatter(X[:, 0], X[:, 1], c=y)
plt.show()
Examples of regression implementations
Stochastic Variational Gaussian Process
A parallelizable algorithm for Gaussian Processes that scales to large datasets by training on minibatches. Crosses are inducing points, while translucent blue dots are the original full dataset.
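To make the objective concrete, here is a minimal sketch of the SVGP training loop for regression under a Gaussian likelihood. This is an illustration, not GP-Zoo's exact code: the names rbf, elbo, and grad_fn are illustrative, kernel hyperparameters are held fixed, and optax with a batch size of 64 is an arbitrary choice. A variational posterior q(u) = N(m, LLᵀ) is kept over the function values at the inducing points z, and the ELBO (expected log-likelihood minus KL(q(u) || p(u))) is estimated on minibatches.

import jax
import jax.numpy as jnp
import jax.random as jr
import optax
from jax.scipy.linalg import solve_triangular

def rbf(x1, x2, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel matrix between two sets of inputs.
    d = (x1[:, None, :] - x2[None, :, :]) / lengthscale
    return variance * jnp.exp(-0.5 * jnp.sum(d**2, axis=-1))

def elbo(params, x_batch, y_batch, z, n_total, noise=0.2, jitter=1e-6):
    # Hensman-style SVGP bound with a Gaussian likelihood.
    m, L = params["m"], jnp.tril(params["L"])  # q(u) = N(m, L L^T)
    M = z.shape[0]
    Kzz = rbf(z, z) + jitter * jnp.eye(M)
    Lz = jnp.linalg.cholesky(Kzz)
    A = solve_triangular(Lz, rbf(z, x_batch), lower=True)  # Lz^{-1} Kzx
    alpha = solve_triangular(Lz, m, lower=True)            # Lz^{-1} m
    C = solve_triangular(Lz, L, lower=True)                # Lz^{-1} L
    # Marginals of q(f) at the batch inputs.
    mean = A.T @ alpha
    var = (rbf(x_batch, x_batch).diagonal()
           - jnp.sum(A**2, axis=0) + jnp.sum((A.T @ C)**2, axis=1))
    # Expected Gaussian log-likelihood, rescaled from batch to full dataset.
    ell = (-0.5 * jnp.log(2 * jnp.pi * noise**2)
           - 0.5 * ((y_batch.ravel() - mean)**2 + var) / noise**2)
    ell = n_total / x_batch.shape[0] * jnp.sum(ell)
    # KL(q(u) || p(u)) between N(m, LL^T) and N(0, Kzz).
    kl = 0.5 * (jnp.sum(C**2) + jnp.sum(alpha**2) - M
                + 2 * jnp.sum(jnp.log(jnp.diag(Lz)))
                - 2 * jnp.sum(jnp.log(jnp.abs(jnp.diag(L)))))
    return ell - kl

# Minibatch gradient ascent on the ELBO over the regression data above.
params = {"m": jnp.zeros(z.shape[0]), "L": jnp.eye(z.shape[0])}
opt = optax.adam(1e-2)
state = opt.init(params)
grad_fn = jax.jit(jax.grad(lambda p, xb, yb: -elbo(p, xb, yb, z, n)))
for step in range(500):
    key, sub = jr.split(key)
    idx = jr.choice(sub, n, shape=(64,), replace=False)
    grads = grad_fn(params, x[idx], y[idx])
    updates, state = opt.update(grads, state)
    params = optax.apply_updates(params, updates)

A safer parameterization would constrain the diagonal of L to be positive (e.g. via softplus); it is kept raw here for brevity.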
Sparse Gaussian Process
Based on the same dataset, the sparse Gaussian Process implementations include the following:
Titsias’s Sparse Gaussian Process
Based on Titsias (2009), a greedy algorithm for selecting inducing points from the training data yields a close approximation to the exact Gaussian Process posterior.
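As an illustration of the selection step (a sketch assuming an RBF kernel with fixed hyperparameters, not GP-Zoo's exact code; titsias_bound and greedy_select are illustrative names), the greedy loop scores a random working set of candidates with Titsias's collapsed bound and keeps the best one. The O(n³) bound evaluation here is naive; practical code would use low-rank (Woodbury-style) updates.

import jax.numpy as jnp
import jax.random as jr

def rbf(x1, x2, lengthscale=1.0, variance=1.0):
    # Same RBF kernel as in the SVGP sketch above.
    d = (x1[:, None, :] - x2[None, :, :]) / lengthscale
    return variance * jnp.exp(-0.5 * jnp.sum(d**2, axis=-1))

def titsias_bound(x, y, z, noise=0.2, jitter=1e-6):
    # Collapsed ELBO: log N(y | 0, Qnn + noise^2 I) - tr(Knn - Qnn) / (2 noise^2),
    # where Qnn = Kxz Kzz^{-1} Kzx is the Nystrom approximation of Knn.
    Kzz = rbf(z, z) + jitter * jnp.eye(z.shape[0])
    Kzx = rbf(z, x)
    Qnn = Kzx.T @ jnp.linalg.solve(Kzz, Kzx)
    cov = Qnn + noise**2 * jnp.eye(x.shape[0])
    quad = y.ravel() @ jnp.linalg.solve(cov, y.ravel())
    _, logdet = jnp.linalg.slogdet(cov)
    log_marg = -0.5 * (quad + logdet + x.shape[0] * jnp.log(2 * jnp.pi))
    trace = jnp.trace(rbf(x, x) - Qnn)
    return log_marg - trace / (2 * noise**2)

def greedy_select(key, x, y, num_inducing=10, num_candidates=30):
    # Greedily grow Z, scoring a random subset of candidates each round.
    selected = [0]  # start from an arbitrary training point
    for _ in range(num_inducing - 1):
        key, sub = jr.split(key)
        pool = jnp.array([i for i in range(x.shape[0]) if i not in selected])
        cand = jr.choice(sub, pool, shape=(num_candidates,), replace=False)
        scores = jnp.array([titsias_bound(x, y, x[jnp.array(selected + [int(i)])])
                            for i in cand])
        selected.append(int(cand[jnp.argmax(scores)]))
    return x[jnp.array(selected)]

z_greedy = greedy_select(key, x, y)  # inducing points chosen from the data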
Variational training of inducing points
A technique that maximizes the Evidence Lower Bound (ELBO), which measures how well the inducing points summarize the full dataset.
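A minimal sketch of that idea, reusing titsias_bound from the greedy-selection sketch above: the inducing locations z are treated as free parameters and moved by gradient ascent on the collapsed bound. The initialization, optax, and the Adam step size are illustrative choices, not necessarily what GP-Zoo uses.

import jax
import optax

z_init = x[:: n // 20]                  # 20 initial locations spread over the data
opt = optax.adam(1e-2)
state = opt.init(z_init)
bound_grad = jax.jit(jax.grad(lambda z_: -titsias_bound(x, y, z_)))

z_opt = z_init
for step in range(200):
    grads = bound_grad(z_opt)           # d(-ELBO)/dz
    updates, state = opt.update(grads, state)
    z_opt = optax.apply_updates(z_opt, updates)

Because the collapsed bound integrates out the variational distribution in closed form, z is the only free parameter here; kernel hyperparameters could be optimized jointly in the same loop.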