
uqregressors.bayesian.sklearn_gp

This is a compatibility wrapper around scikit-learn's GaussianProcessRegressor. Its predict method returns the mean prediction together with the lower and upper bounds of a prediction interval, rather than a mean and/or standard deviation. For more advanced use cases, the GPyTorch-based GP implemented in gp is more flexible.

SklearnGP

A wrapper for scikit-learn's GaussianProcessRegressor with prediction intervals.

This class provides a simplified interface to fit a Gaussian Process (GP) regressor, make predictions with uncertainty intervals, and save/load the model configuration.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the model. | 'GP_Regressor' |
| kernel | Kernel | Kernel to use for the GP model. | RBF() |
| alpha | float | Significance level for the prediction interval. | 0.1 |
| gp_kwargs | dict | Additional keyword arguments for GaussianProcessRegressor. | None |

Attributes:

| Name | Type | Description |
|---|---|---|
| name | str | Name of the model. |
| kernel | Kernel | Kernel to use in the GP model. |
| alpha | float | Significance level for confidence intervals (e.g., 0.1 for 90% CI). |
| gp_kwargs | dict | Additional keyword arguments for the GaussianProcessRegressor. |
| model | GaussianProcessRegressor | Fitted scikit-learn GP model. |
| fitted | bool | Whether fit has been successfully called. |
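
The following is a minimal sketch of what the wrapper does end to end, written directly against scikit-learn and SciPy so that it runs without uqregressors installed. It mirrors the fit and predict logic in the source listing on this page; the toy data and the `normalize_y` entry in `gp_kwargs` are illustrative assumptions, not part of the library.

```python
import numpy as np
import scipy.stats as st
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Toy 1-D regression data (illustrative only).
X_train = np.linspace(0.0, 5.0, 20).reshape(-1, 1)
y_train = np.sin(X_train).ravel()
X_test = np.linspace(0.5, 4.5, 5).reshape(-1, 1)

alpha = 0.1                        # significance level -> 90% interval
gp_kwargs = {"normalize_y": True}  # forwarded unchanged to GaussianProcessRegressor

# What fit does: build the scikit-learn regressor and fit it.
gp = GaussianProcessRegressor(kernel=RBF(), **gp_kwargs)
gp.fit(X_train, y_train)

# What predict does: turn the predictive mean/std into a two-sided Gaussian interval.
mean, std = gp.predict(X_test, return_std=True)
z = st.norm.ppf(1 - alpha / 2)
lower, upper = mean - z * std, mean + z * std
```

With the wrapper itself, the same steps would presumably read `SklearnGP(alpha=0.1, gp_kwargs={"normalize_y": True}).fit(X_train, y_train).predict(X_test)`. Note that the wrapper's `alpha` is the significance level of the interval; scikit-learn's own `alpha` argument (the value added to the kernel diagonal) would have to be passed through `gp_kwargs`.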

Source code in uqregressors/bayesian/sklearn_gp.py
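# Imports required by the class below (not shown in the rendered listing).
import json
import pickle
from pathlib import Path

import scipy.stats as st
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF


class SklearnGP: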
    """
    A wrapper for scikit-learn's GaussianProcessRegressor with prediction intervals.

    This class provides a simplified interface to fit a Gaussian Process (GP) regressor,
    make predictions with uncertainty intervals, and save/load the model configuration.

    Args:
        name (str): Name of the model.
        kernel (sklearn.gaussian_process.kernels.Kernel): Kernel to use for the GP model.
        alpha (float): Significance level for the prediction interval.
        gp_kwargs (dict, optional): Additional keyword arguments for GaussianProcessRegressor.

    Attributes:
        name (str): Name of the model.
        kernel (sklearn.gaussian_process.kernels.Kernel): Kernel to use in the GP model.
        alpha (float): Significance level for confidence intervals (e.g., 0.1 for 90% CI).
        gp_kwargs (dict): Additional keyword arguments for the GaussianProcessRegressor.
        model (GaussianProcessRegressor): Fitted scikit-learn GP model.
        fitted (bool): Whether fit has been successfully called. 
    """
    def __init__(self, name="GP_Regressor", kernel = RBF(), 
                 alpha=0.1, 
                 gp_kwargs=None):

        self.name = name
        self.kernel = kernel 
        self.alpha = alpha 
        self.gp_kwargs = gp_kwargs or {}
        self.model = None
        self.fitted = False

    def fit(self, X, y): 
        """
        Fits the GP model to the input data.

        Args:
            X (np.ndarray): Feature matrix of shape (n_samples, n_features).
            y (np.ndarray): Target values of shape (n_samples,).
        Returns:
            (SklearnGP): The fitted model (self).
        """
        model = GaussianProcessRegressor(kernel=self.kernel, **self.gp_kwargs)
        model.fit(X, y)
        self.model = model
        self.fitted = True
        return self 

    def predict(self, X):
        """
        Predicts the target values with uncertainty estimates.

        Args:
            X (np.ndarray): Feature matrix of shape (n_samples, n_features).

        Returns:
            (Tuple[np.ndarray, np.ndarray, np.ndarray]): Tuple containing:
                mean predictions,
                lower bound of the prediction interval,
                upper bound of the prediction interval.

        !!! note
            All returned arrays are NumPy arrays. This wrapper has no
            `requires_grad` option and never returns PyTorch tensors.
        """
        if not self.fitted: 
            raise ValueError("Model not yet fit. Please call fit() before predict().")

        preds, std = self.model.predict(X, return_std=True)
        z_score = st.norm.ppf(1 - self.alpha / 2)
        mean = preds
        lower = mean - z_score * std
        upper = mean + z_score * std
        return mean, lower, upper

    def save(self, path): 
        """
        Saves the model and its configuration to disk.

        Args:
            path (Union[str, Path]): Directory where model and config will be saved.
        """
        if not self.fitted: 
            raise ValueError("Model not yet fit. Please call fit() before save().")

        path = Path(path)
        path.mkdir(parents=True, exist_ok=True) 

        # Keep only plain attributes; the kernel is recorded by class name below,
        # and the fitted model is stored in the pickle.
        config = {
            k: v for k, v in self.__dict__.items()
            if k not in ["kernel", "model"]
            and not callable(v)
        }
        config["kernel"] = self.kernel.__class__.__name__

        with open(path / "config.json", "w") as f:
            json.dump(config, f, indent=4)

        with open(path / "model.pkl", 'wb') as file: 
            pickle.dump(self, file)

    @classmethod
    def load(cls, path, device="cpu", load_logs=False): 
        """
        Loads a previously saved SklearnGP from disk.

        Args:
            path (Union[str, Path]): Path to the directory containing the saved model.
            device (str, optional): Unused, included for compatibility. Defaults to "cpu".
            load_logs (bool, optional): Unused, included for compatibility. Defaults to False.

        Returns:
            (SklearnGP): The loaded model instance.
        """
        path = Path(path)

        with open(path / "model.pkl", 'rb') as file: 
            model = pickle.load(file)

        return model

fit(X, y)

Fits the GP model to the input data.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| X | ndarray | Feature matrix of shape (n_samples, n_features). | required |
| y | ndarray | Target values of shape (n_samples,). | required |

Returns: (SklearnGP): The fitted model (self).

Source code in uqregressors/bayesian/sklearn_gp.py
def fit(self, X, y): 
    """
    Fits the GP model to the input data.

    Args:
        X (np.ndarray): Feature matrix of shape (n_samples, n_features).
        y (np.ndarray): Target values of shape (n_samples,).
    Returns:
        (SklearnGP): The fitted model (self).
    """
    model = GaussianProcessRegressor(kernel=self.kernel, **self.gp_kwargs)
    model.fit(X, y)
    self.model = model
    self.fitted = True
    return self 

load(path, device='cpu', load_logs=False) classmethod

Loads a previously saved SklearnGP from disk.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| path | Union[str, Path] | Path to the directory containing the saved model. | required |
| device | str | Unused, included for compatibility. | 'cpu' |
| load_logs | bool | Unused, included for compatibility. | False |

Returns:

| Type | Description |
|---|---|
| SklearnGP | The loaded model instance. |
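
As a rough illustration of the on-disk round trip, the sketch below writes and reads a `model.pkl` file inside a directory, which is the layout the source listing uses. A plain dict stands in for the fitted SklearnGP instance so the snippet runs with the standard library alone; the directory name `gp_model` and the dict contents are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical placeholder for a fitted SklearnGP object.
stand_in = {"name": "GP_Regressor", "alpha": 0.1, "fitted": True}

with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "gp_model"
    model_dir.mkdir(parents=True, exist_ok=True)

    # save() pickles the whole object to <path>/model.pkl ...
    with open(model_dir / "model.pkl", "wb") as f:
        pickle.dump(stand_in, f)

    # ... and load() simply unpickles that file and returns the result.
    with open(model_dir / "model.pkl", "rb") as f:
        restored = pickle.load(f)
```

Because loading relies on `pickle.load`, which can execute arbitrary code, only directories from trusted sources should be loaded. Pickled scikit-learn estimators are also not guaranteed to load correctly under a different scikit-learn version.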

Source code in uqregressors/bayesian/sklearn_gp.py
@classmethod
def load(cls, path, device="cpu", load_logs=False): 
    """
    Loads a previously saved SklearnGP from disk.

    Args:
        path (Union[str, Path]): Path to the directory containing the saved model.
        device (str, optional): Unused, included for compatibility. Defaults to "cpu".
        load_logs (bool, optional): Unused, included for compatibility. Defaults to False.

    Returns:
        (SklearnGP): The loaded model instance.
    """
    path = Path(path)

    with open(path / "model.pkl", 'rb') as file: 
        model = pickle.load(file)

    return model

predict(X)

Predicts the target values with uncertainty estimates.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| X | ndarray | Feature matrix of shape (n_samples, n_features). | required |

Returns:

| Type | Description |
|---|---|
| Tuple[ndarray, ndarray, ndarray] | Tuple containing: mean predictions, lower bound of the prediction interval, upper bound of the prediction interval. |

Note

All returned arrays are NumPy arrays. This wrapper has no requires_grad option and never returns PyTorch tensors.
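
The interval itself is a symmetric Gaussian interval around the predictive mean: lower and upper are mean ∓ z · std, where z is the standard normal quantile at 1 - alpha / 2. The sketch below reproduces that arithmetic with the standard library's `statistics.NormalDist` in place of the `scipy.stats` call used in the source; the helper name `normal_interval` and the sample numbers are assumptions for illustration.

```python
from statistics import NormalDist


def normal_interval(mean, std, alpha=0.1):
    """Return (lower, upper) bounds of a two-sided (1 - alpha) Gaussian interval."""
    # Two-sided critical value: alpha / 2 of the probability mass sits in each tail.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lower = [m - z * s for m, s in zip(mean, std)]
    upper = [m + z * s for m, s in zip(mean, std)]
    return lower, upper


mean = [1.0, 2.0]   # illustrative predictive means
std = [0.5, 0.1]    # illustrative predictive standard deviations
lower, upper = normal_interval(mean, std, alpha=0.1)
```

For alpha = 0.1 the critical value is roughly 1.645, so each bound sits about 1.645 standard deviations from the mean.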

Source code in uqregressors/bayesian/sklearn_gp.py
def predict(self, X):
    """
    Predicts the target values with uncertainty estimates.

    Args:
        X (np.ndarray): Feature matrix of shape (n_samples, n_features).

    Returns:
        (Tuple[np.ndarray, np.ndarray, np.ndarray]): Tuple containing:
            mean predictions,
            lower bound of the prediction interval,
            upper bound of the prediction interval.

    !!! note
        All returned arrays are NumPy arrays. This wrapper has no
        `requires_grad` option and never returns PyTorch tensors.
    """
    if not self.fitted: 
        raise ValueError("Model not yet fit. Please call fit() before predict().")

    preds, std = self.model.predict(X, return_std=True)
    z_score = st.norm.ppf(1 - self.alpha / 2)
    mean = preds
    lower = mean - z_score * std
    upper = mean + z_score * std
    return mean, lower, upper

save(path)

Saves the model and its configuration to disk.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| path | Union[str, Path] | Directory where model and config will be saved. | required |
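
To give a sense of what ends up in config.json, here is a minimal sketch of the filtering step from the source listing, applied to a hand-built attribute dict. The placeholder `RBF` class and the attribute values are assumptions standing in for a real fitted model.

```python
import json


class RBF:
    """Hypothetical placeholder for sklearn's RBF kernel class."""


# Stand-in for the instance attributes (self.__dict__) of a fitted model.
state = {
    "name": "GP_Regressor",
    "kernel": RBF(),
    "alpha": 0.1,
    "gp_kwargs": {"normalize_y": True},
    "model": object(),  # placeholder for the fitted GaussianProcessRegressor
    "fitted": True,
}

# Drop the kernel and model objects, keep the remaining non-callable attributes.
config = {
    k: v for k, v in state.items()
    if k not in ("kernel", "model") and not callable(v)
}
# The kernel is recorded by class name only.
config["kernel"] = type(state["kernel"]).__name__

text = json.dumps(config, indent=4)
```

Only the kernel's class name reaches the JSON file, so kernel hyperparameters are recoverable only from model.pkl. If `gp_kwargs` holds values that are not JSON-serializable, `json.dump` raises a `TypeError`.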
Source code in uqregressors/bayesian/sklearn_gp.py
def save(self, path): 
    """
    Saves the model and its configuration to disk.

    Args:
        path (Union[str, Path]): Directory where model and config will be saved.
    """
    if not self.fitted: 
        raise ValueError("Model not yet fit. Please call fit() before save().")

    path = Path(path)
    path.mkdir(parents=True, exist_ok=True) 

    # Keep only plain attributes; the kernel is recorded by class name below,
    # and the fitted model is stored in the pickle.
    config = {
        k: v for k, v in self.__dict__.items()
        if k not in ["kernel", "model"]
        and not callable(v)
    }
    config["kernel"] = self.kernel.__class__.__name__

    with open(path / "config.json", "w") as f:
        json.dump(config, f, indent=4)

    with open(path / "model.pkl", 'wb') as file: 
        pickle.dump(self, file)