uqregressors.bayesian.dropout

Monte Carlo Dropout is implemented as described in Gal and Ghahramani (2016).

Monte Carlo Dropout

This module implements a Monte Carlo (MC) Dropout regressor for regression with a one-dimensional output and uncertainty quantification. It estimates predictive uncertainty by performing multiple stochastic forward passes through a dropout-enabled neural network.
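
Concretely, given T stochastic forward passes \(\hat{y}_1(x), \dots, \hat{y}_T(x)\) at an input x, the predictive mean, variance, and interval are formed as follows (this mirrors the predict implementation shown below, with tau the precision hyperparameter and alpha the significance level):

$$
\hat{\mu}(x) = \frac{1}{T}\sum_{t=1}^{T} \hat{y}_t(x), \qquad
\hat{\sigma}^2(x) = \operatorname{Var}_t\!\big(\hat{y}_t(x)\big) + \tau^{-1}, \qquad
\big[\,\hat{\mu}(x) - z_{1-\alpha/2}\,\hat{\sigma}(x),\;\; \hat{\mu}(x) + z_{1-\alpha/2}\,\hat{\sigma}(x)\,\big]
$$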

Key features:
  • Customizable neural network architecture
  • Aleatoric uncertainty incorporated via the hyperparameter tau
  • Prediction intervals based on a Gaussian assumption
  • Customizable optimizer and loss function
  • Optional input/output normalization

Warning

Hyperparameter optimization of the aleatoric uncertainty hyperparameter tau is often necessary to obtain well-calibrated prediction intervals!
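
A minimal usage sketch (the data and hyperparameter values, including the chosen tau, are illustrative only; the import path follows the module shown above):

import numpy as np
from uqregressors.bayesian.dropout import MCDropoutRegressor

# Toy regression data with a one-dimensional target
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(500)

reg = MCDropoutRegressor(
    hidden_sizes=[64, 64],
    dropout=0.1,
    tau=1e-2,        # illustrative value; tau usually needs tuning (see Warning above)
    alpha=0.1,       # 90% prediction intervals
    n_samples=100,   # stochastic forward passes at predict time
    epochs=200,
    random_seed=42,
)
reg.fit(X, y)

mean, lower, upper = reg.predict(X)   # NumPy arrays because requires_grad defaults to False
coverage = np.mean((y >= lower) & (y <= upper))
print(f"Empirical coverage at alpha=0.1: {coverage:.2f}")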

MCDropoutRegressor

Bases: BaseEstimator, RegressorMixin

MC Dropout Regressor with uncertainty estimation using neural networks.

This class trains a dropout MLP and takes stochastic forward passes to provide predictive uncertainty intervals. It makes a Gaussian assumption on the output distribution and often requires tuning of the hyperparameter tau.

Parameters:

  • name (str): Name of the model instance. Default: 'MC_Dropout_Regressor'
  • hidden_sizes (List[int]): Hidden layer sizes for the MLP. Default: [64, 64]
  • dropout (float): Dropout rate to apply after each hidden layer. Default: 0.1
  • tau (float): Precision parameter (used in predictive variance). Default: 1e-06
  • use_paper_weight_decay (bool): Whether to use the paper's theoretical weight decay. Default: True
  • alpha (float): Significance level (1 - confidence level) for prediction intervals. Default: 0.1
  • requires_grad (bool): Whether to track gradients in the prediction output. Default: False
  • activation_str (str): Activation function name (e.g., "ReLU", "Tanh"). Default: 'ReLU'
  • n_samples (int): Number of stochastic forward passes for prediction. Default: 100
  • learning_rate (float): Learning rate for the optimizer. Default: 0.001
  • epochs (int): Number of training epochs. Default: 200
  • batch_size (int): Batch size for training. Default: 32
  • optimizer_cls (Optimizer): PyTorch optimizer class. Default: Adam
  • optimizer_kwargs (dict): Optional kwargs to pass to the optimizer. Default: None
  • scheduler_cls (Optional[Callable]): Optional learning rate scheduler class. Default: None
  • scheduler_kwargs (dict): Optional kwargs for the scheduler. Default: None
  • loss_fn (Callable): Loss function for training. Default: mse_loss
  • device (str): Device to run training/prediction on ("cpu" or "cuda"). Default: 'cpu'
  • use_wandb (bool): If True, logs training to Weights & Biases. Default: False
  • wandb_project (str): W&B project name. Default: None
  • wandb_run_name (str): W&B run name. Default: None
  • random_seed (Optional[int]): Seed for reproducibility. Default: None
  • scale_data (bool): Whether to standardize inputs and outputs. Default: True
  • input_scaler (Optional[TorchStandardScaler]): Custom input scaler. Default: None
  • output_scaler (Optional[TorchStandardScaler]): Custom output scaler. Default: None
  • tuning_loggers (List[Logger]): External loggers returned from hyperparameter tuning. Default: []
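
A hedged sketch of customizing the optimizer-related parameters (the torch classes shown are standard PyTorch; the learning rate is passed separately via learning_rate, so it should not be repeated in optimizer_kwargs):

import torch
from uqregressors.bayesian.dropout import MCDropoutRegressor

reg = MCDropoutRegressor(
    learning_rate=1e-3,
    optimizer_cls=torch.optim.AdamW,
    optimizer_kwargs={"weight_decay": 1e-4},
    scheduler_cls=torch.optim.lr_scheduler.StepLR,
    scheduler_kwargs={"step_size": 50, "gamma": 0.5},
    loss_fn=torch.nn.functional.smooth_l1_loss,   # any callable loss(preds, targets)
    device="cuda" if torch.cuda.is_available() else "cpu",
)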

Attributes:

  • model (MLP): Trained PyTorch MLP model.
  • input_dim (int): Dimensionality of input features.
  • _loggers (Logger): Training logger.
  • training_logs: Logs from training.
  • tuning_logs: Logs from hyperparameter tuning.

Source code in uqregressors\bayesian\dropout.py
class MCDropoutRegressor(BaseEstimator, RegressorMixin):
    """ 
    MC Dropout Regressor with uncertainty estimation using neural networks. 

    This class trains a dropout MLP and takes stochastic forward passes to provide predictive uncertainty intervals.
    It makes a Gaussian assumption on the output distribution, and often requires tuning of the hyperparameter tau

    Args:
        name (str): Name of the model instance.
        hidden_sizes (List[int]): Hidden layer sizes for the MLP.
        dropout (float): Dropout rate to apply after each hidden layer.
        tau (float): Precision parameter (used in predictive variance).
        use_paper_weight_decay (bool): Whether to use paper's theoretical weight decay.
        alpha (float): Significance level (1 - confidence level) for prediction intervals.
        requires_grad (bool): Whether to track gradients in prediction output.
        activation_str (str): Activation function name (e.g., "ReLU", "Tanh").
        n_samples (int): Number of stochastic forward passes for prediction.
        learning_rate (float): Learning rate for the optimizer.
        epochs (int): Number of training epochs.
        batch_size (int): Batch size for training.
        optimizer_cls (Optimizer): PyTorch optimizer class.
        optimizer_kwargs (dict): Optional kwargs to pass to optimizer.
        scheduler_cls (Optional[Callable]): Optional learning rate scheduler class.
        scheduler_kwargs (dict): Optional kwargs for the scheduler.
        loss_fn (Callable): Loss function for training (default: MSE).
        device (str): Device to run training/prediction on ("cpu" or "cuda").
        use_wandb (bool): If True, logs training to Weights & Biases.
        wandb_project (str): W&B project name.
        wandb_run_name (str): W&B run name.
        random_seed (Optional[int]): Seed for reproducibility.
        scale_data (bool): Whether to standardize inputs and outputs.
        input_scaler (Optional[TorchStandardScaler]): Custom input scaler.
        output_scaler (Optional[TorchStandardScaler]): Custom output scaler.
        tuning_loggers (List[Logger]): External loggers returned from hyperparameter tuning.

    Attributes:
        model (MLP): Trained PyTorch MLP model.
        input_dim (int): Dimensionality of input features.
        _loggers (Logger): Training logger.
        training_logs: Logs from training.
        tuning_logs: Logs from hyperparameter tuning.
    """
    def __init__(
        self,
        name="MC_Dropout_Regressor",
        hidden_sizes=[64, 64],
        dropout=0.1,
        tau=1.0e-6,
        use_paper_weight_decay=True,
        alpha=0.1,
        requires_grad=False, 
        activation_str="ReLU",
        n_samples=100,
        learning_rate=1e-3,
        epochs=200,
        batch_size=32,
        optimizer_cls=torch.optim.Adam,
        optimizer_kwargs=None,
        scheduler_cls=None,
        scheduler_kwargs=None,
        loss_fn=torch.nn.functional.mse_loss,
        device="cpu",
        use_wandb=False,
        wandb_project=None,
        wandb_run_name=None,
        random_seed=None,
        scale_data=True, 
        input_scaler=None,
        output_scaler=None, 
        tuning_loggers = []
    ):
        self.name=name
        self.hidden_sizes = hidden_sizes
        self.dropout = dropout
        self.tau = tau
        self.use_paper_weight_decay = use_paper_weight_decay
        self.alpha = alpha
        self.requires_grad = requires_grad
        self.activation_str = activation_str
        self.n_samples = n_samples
        self.learning_rate = learning_rate
        self.epochs = epochs
        self.batch_size = batch_size
        self.optimizer_cls = optimizer_cls
        self.optimizer_kwargs = optimizer_kwargs or {}
        self.scheduler_cls = scheduler_cls
        self.scheduler_kwargs = scheduler_kwargs or {}
        self.loss_fn = loss_fn

        self.device = device
        self.model = None

        self.use_wandb = use_wandb
        self.wandb_project = wandb_project
        self.wandb_run_name = wandb_run_name
        self.random_seed = random_seed
        self.input_dim = None

        self.scale_data = scale_data
        self.input_scaler = input_scaler or TorchStandardScaler()
        self.output_scaler = output_scaler or TorchStandardScaler()

        self._loggers = [] 
        self.training_logs = None
        self.tuning_loggers = tuning_loggers 
        self.tuning_logs = None

    def fit(self, X, y): 
        """
        Fit the MC Dropout model on training data.

        Args:
            X (array-like): Training features of shape (n_samples, n_features).
            y (array-like): Target values of shape (n_samples,).
        """
        X_tensor, y_tensor = validate_and_prepare_inputs(X, y, device=self.device)
        input_dim = X_tensor.shape[1]
        self.input_dim = input_dim

        if self.scale_data: 
            X_tensor = self.input_scaler.fit_transform(X_tensor)
            y_tensor = self.output_scaler.fit_transform(y_tensor)

        model, logger = self._fit_single_model(X_tensor, y_tensor)
        self._loggers.append(logger)

    def _fit_single_model(self, X_tensor, y_tensor):
        if self.random_seed is not None: 
            torch.manual_seed(self.random_seed)
            np.random.seed(self.random_seed)

        config = {
            "learning_rate": self.learning_rate,
            "epochs": self.epochs,
            "batch_size": self.batch_size,
        }

        logger = Logger(
            use_wandb=self.use_wandb,
            project_name=self.wandb_project,
            run_name=self.wandb_run_name,
            config=config,
        )

        activation = get_activation(self.activation_str)

        model = MLP(self.input_dim, self.hidden_sizes, self.dropout, activation)
        self.model = model.to(self.device)

        optimizer = self.optimizer_cls(
            self.model.parameters(), lr=self.learning_rate, **self.optimizer_kwargs
        )

        scheduler = None
        if self.scheduler_cls is not None:
            scheduler = self.scheduler_cls(optimizer, **self.scheduler_kwargs)

        dataset = TensorDataset(X_tensor, y_tensor)
        dataloader = DataLoader(dataset, batch_size=self.batch_size, shuffle=True)

        self.model.train()
        for epoch in range(self.epochs):
            epoch_loss = 0.0
            for xb, yb in dataloader: 
                optimizer.zero_grad()
                preds = self.model(xb)
                loss = self.loss_fn(preds, yb)
                loss.backward()
                optimizer.step()
                epoch_loss += loss

            if scheduler is not None:
                scheduler.step()

            if epoch % (self.epochs / 20) == 0:
                logger.log({"epoch": epoch, "train_loss": epoch_loss})

        logger.finish()

        return self, logger

    def predict(self, X):
        """
        Predicts the target values with uncertainty estimates.

        Args:
            X (np.ndarray): Feature matrix of shape (n_samples, n_features).

        Returns:
            (Union[Tuple[np.ndarray, np.ndarray, np.ndarray], Tuple[torch.Tensor, torch.Tensor, torch.Tensor]]): Tuple containing:
                mean predictions,
                lower bound of the prediction interval,
                upper bound of the prediction interval.

        !!! note
            If `requires_grad` is False, all returned arrays are NumPy arrays.
            Otherwise, they are PyTorch tensors with gradients.
        """
        X_tensor = validate_X_input(X, input_dim=self.input_dim, device=self.device, requires_grad=self.requires_grad)
        if self.random_seed is not None: 
            torch.manual_seed(self.random_seed)
            np.random.seed(self.random_seed)

        if self.scale_data: 
            X_tensor = self.input_scaler.transform(X_tensor)

        self.model.train()
        preds = []
        with torch.no_grad():
            for _ in range(self.n_samples):
                preds.append(self.model(X_tensor))
        preds = torch.stack(preds)
        mean = preds.mean(dim=0)
        variance = torch.var(preds, dim=0) + 1 / self.tau 
        std = variance.sqrt()
        std_mult = torch.tensor(st.norm.ppf(1 - self.alpha / 2), device=mean.device)
        lower = mean - std * std_mult
        upper = mean + std * std_mult

        if self.scale_data: 
            mean = self.output_scaler.inverse_transform(mean).squeeze()
            lower = self.output_scaler.inverse_transform(lower).squeeze()
            upper = self.output_scaler.inverse_transform(upper).squeeze()

        else: 
            mean = mean.squeeze() 
            lower = lower.squeeze() 
            upper = upper.squeeze() 

        if not self.requires_grad: 
            return mean.detach().cpu().numpy(), lower.detach().cpu().numpy(), upper.detach().cpu().numpy()

        else: 
            return mean, lower, upper

    def save(self, path):
        """
        Save model weights, config, and scalers to disk.

        Args:
            path (str or Path): Directory to save model components.
        """
        path = Path(path)
        path.mkdir(parents=True, exist_ok=True)

        # Save config (exclude non-serializable or large objects)
        config = {
            k: v for k, v in self.__dict__.items()
            if k not in ["optimizer_cls", "optimizer_kwargs", "scheduler_cls", "scheduler_kwargs", "input_scaler", 
                         "output_scaler", "_loggers", "training_logs", "tuning_loggers", "tuning_logs"]
            and not callable(v)
            and not isinstance(v, (torch.nn.Module,))
        }

        config["optimizer"] = self.optimizer_cls.__class__.__name__ if self.optimizer_cls is not None else None
        config["scheduler"] = self.scheduler_cls.__class__.__name__ if self.scheduler_cls is not None else None
        config["input_scaler"] = self.input_scaler.__class__.__name__ if self.input_scaler is not None else None 
        config["output_scaler"] = self.output_scaler.__class__.__name__ if self.output_scaler is not None else None

        with open(path / "config.json", "w") as f:
            json.dump(config, f, indent=4)

        with open(path / "extras.pkl", 'wb') as f: 
            pickle.dump([self.optimizer_cls, 
                         self.optimizer_kwargs, self.scheduler_cls, self.scheduler_kwargs, self.input_scaler, self.output_scaler], f)

        # Save model weights
        torch.save(self.model.state_dict(), path / f"model.pt")

        for i, logger in enumerate(getattr(self, "_loggers", [])):
            logger.save_to_file(path, idx=i, name="estimator")

        for i, logger in enumerate(getattr(self, "tuning_loggers", [])): 
            logger.save_to_file(path, name="tuning", idx=i)

    @classmethod
    def load(cls, path, device="cpu", load_logs=False):
        """
        Load a saved MC dropout regressor from disk.

        Args:
            path (str or pathlib.Path): Directory path to load the model from.
            device (str or torch.device): Device to load the model onto.
            load_logs (bool): Whether to load training and tuning logs.

        Returns:
            (MCDropoutRegressor): Loaded model instance.
        """
        path = Path(path)

        # Load config
        with open(path / "config.json", "r") as f:
            config = json.load(f)
        config["device"] = device

        config.pop("optimizer", None)
        config.pop("scheduler", None)
        config.pop("input_scaler", None)
        config.pop("output_scaler", None)

        input_dim = config.pop("input_dim", None)
        model = cls(**config)

        with open(path / "extras.pkl", 'rb') as f: 
            optimizer_cls, optimizer_kwargs, scheduler_cls, scheduler_kwargs, input_scaler, output_scaler = pickle.load(f)

        # Recreate models
        model.input_dim = input_dim
        activation = get_activation(config["activation_str"])

        model.model = MLP(model.input_dim, config["hidden_sizes"], model.dropout, activation).to(device)
        model.model.load_state_dict(torch.load(path / f"model.pt", map_location=device))

        model.optimizer_cls = optimizer_cls 
        model.optimizer_kwargs = optimizer_kwargs 
        model.scheduler_cls = scheduler_cls 
        model.scheduler_kwargs = scheduler_kwargs
        model.input_scaler = input_scaler 
        model.output_scaler = output_scaler

        if load_logs: 
            logs_path = path / "logs"
            training_logs = [] 
            tuning_logs = []
            if logs_path.exists() and logs_path.is_dir(): 
                estimator_log_files = sorted(logs_path.glob("estimator_*.log"))
                for log_file in estimator_log_files:
                    with open(log_file, "r", encoding="utf-8") as f:
                        training_logs.append(f.read())

                tuning_log_files = sorted(logs_path.glob("tuning_*.log"))
                for log_file in tuning_log_files: 
                    with open(log_file, "r", encoding="utf-8") as f: 
                        tuning_logs.append(f.read())

            model.training_logs = training_logs
            model.tuning_logs = tuning_logs

        return model

    def _predictive_log_likelihood(self, X, y_true, tau):
        self.model.train()

        X_tensor = torch.tensor(X, dtype=torch.float32).to(self.device)
        y_true = y_true.reshape(-1)
        preds = []

        with torch.no_grad():
            for _ in range(self.n_samples):
                pred = self.model(X_tensor).cpu().numpy().squeeze()
                preds.append(pred)
        preds = np.stack(preds, axis=0)  # Shape: (T, N)

        mean_preds = preds.mean(axis=0)
        var_preds = preds.var(axis=0) + (1 / tau)

        log_likelihoods = (
            -0.5 * np.log(2 * np.pi * var_preds)
            - 0.5 * ((y_true - mean_preds) ** 2) / var_preds
        )

        return np.mean(log_likelihoods)

fit(X, y)

Fit the MC Dropout model on training data.

Parameters:

  • X (array-like): Training features of shape (n_samples, n_features). Required.
  • y (array-like): Target values of shape (n_samples,). Required.
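
For example (the data is illustrative; shapes follow the argument descriptions above):

import numpy as np
from uqregressors.bayesian.dropout import MCDropoutRegressor

X_train = np.random.rand(200, 5)   # shape (n_samples, n_features)
y_train = np.random.rand(200)      # shape (n_samples,)

reg = MCDropoutRegressor(epochs=50)
reg.fit(X_train, y_train)
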
Source code in uqregressors\bayesian\dropout.py
def fit(self, X, y): 
    """
    Fit the MC Dropout model on training data.

    Args:
        X (array-like): Training features of shape (n_samples, n_features).
        y (array-like): Target values of shape (n_samples,).
    """
    X_tensor, y_tensor = validate_and_prepare_inputs(X, y, device=self.device)
    input_dim = X_tensor.shape[1]
    self.input_dim = input_dim

    if self.scale_data: 
        X_tensor = self.input_scaler.fit_transform(X_tensor)
        y_tensor = self.output_scaler.fit_transform(y_tensor)

    model, logger = self._fit_single_model(X_tensor, y_tensor)
    self._loggers.append(logger)

load(path, device='cpu', load_logs=False) classmethod

Load a saved MC dropout regressor from disk.

Parameters:

  • path (str or Path): Directory path to load the model from. Required.
  • device (str or torch.device): Device to load the model onto. Default: 'cpu'
  • load_logs (bool): Whether to load training and tuning logs. Default: False

Returns:

  • MCDropoutRegressor: Loaded model instance.
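
A loading sketch (the directory path is hypothetical and must contain the files previously written by save):

import numpy as np
from uqregressors.bayesian.dropout import MCDropoutRegressor

# "saved_models/mc_dropout" is a hypothetical directory written earlier by save()
reg = MCDropoutRegressor.load("saved_models/mc_dropout", device="cpu", load_logs=True)

X_new = np.random.rand(10, reg.input_dim)   # input_dim is restored from the saved config
mean, lower, upper = reg.predict(X_new)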

Source code in uqregressors\bayesian\dropout.py
@classmethod
def load(cls, path, device="cpu", load_logs=False):
    """
    Load a saved MC dropout regressor from disk.

    Args:
        path (str or pathlib.Path): Directory path to load the model from.
        device (str or torch.device): Device to load the model onto.
        load_logs (bool): Whether to load training and tuning logs.

    Returns:
        (MCDropoutRegressor): Loaded model instance.
    """
    path = Path(path)

    # Load config
    with open(path / "config.json", "r") as f:
        config = json.load(f)
    config["device"] = device

    config.pop("optimizer", None)
    config.pop("scheduler", None)
    config.pop("input_scaler", None)
    config.pop("output_scaler", None)

    input_dim = config.pop("input_dim", None)
    model = cls(**config)

    with open(path / "extras.pkl", 'rb') as f: 
        optimizer_cls, optimizer_kwargs, scheduler_cls, scheduler_kwargs, input_scaler, output_scaler = pickle.load(f)

    # Recreate models
    model.input_dim = input_dim
    activation = get_activation(config["activation_str"])

    model.model = MLP(model.input_dim, config["hidden_sizes"], model.dropout, activation).to(device)
    model.model.load_state_dict(torch.load(path / f"model.pt", map_location=device))

    model.optimizer_cls = optimizer_cls 
    model.optimizer_kwargs = optimizer_kwargs 
    model.scheduler_cls = scheduler_cls 
    model.scheduler_kwargs = scheduler_kwargs
    model.input_scaler = input_scaler 
    model.output_scaler = output_scaler

    if load_logs: 
        logs_path = path / "logs"
        training_logs = [] 
        tuning_logs = []
        if logs_path.exists() and logs_path.is_dir(): 
            estimator_log_files = sorted(logs_path.glob("estimator_*.log"))
            for log_file in estimator_log_files:
                with open(log_file, "r", encoding="utf-8") as f:
                    training_logs.append(f.read())

            tuning_log_files = sorted(logs_path.glob("tuning_*.log"))
            for log_file in tuning_log_files: 
                with open(log_file, "r", encoding="utf-8") as f: 
                    tuning_logs.append(f.read())

        model.training_logs = training_logs
        model.tuning_logs = tuning_logs

    return model

predict(X)

Predicts the target values with uncertainty estimates.

Parameters:

  • X (ndarray): Feature matrix of shape (n_samples, n_features). Required.

Returns:

  • Union[Tuple[ndarray, ndarray, ndarray], Tuple[Tensor, Tensor, Tensor]]: Tuple containing the mean predictions, the lower bound of the prediction interval, and the upper bound of the prediction interval.

Note

If requires_grad is False, all returned arrays are NumPy arrays. Otherwise, they are PyTorch tensors with gradients.
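
A short sketch of consuming the three returned arrays (assumes a fitted regressor reg and held-out arrays X_test, y_test; with the default alpha=0.1 the intervals target 90% coverage):

mean, lower, upper = reg.predict(X_test)

interval_width = upper - lower                    # width of each prediction interval
covered = (y_test >= lower) & (y_test <= upper)   # elementwise coverage indicator
print("Mean interval width:", interval_width.mean())
print("Empirical coverage:", covered.mean())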

Source code in uqregressors\bayesian\dropout.py
def predict(self, X):
    """
    Predicts the target values with uncertainty estimates.

    Args:
        X (np.ndarray): Feature matrix of shape (n_samples, n_features).

    Returns:
        (Union[Tuple[np.ndarray, np.ndarray, np.ndarray], Tuple[torch.Tensor, torch.Tensor, torch.Tensor]]): Tuple containing:
            mean predictions,
            lower bound of the prediction interval,
            upper bound of the prediction interval.

    !!! note
        If `requires_grad` is False, all returned arrays are NumPy arrays.
        Otherwise, they are PyTorch tensors with gradients.
    """
    X_tensor = validate_X_input(X, input_dim=self.input_dim, device=self.device, requires_grad=self.requires_grad)
    if self.random_seed is not None: 
        torch.manual_seed(self.random_seed)
        np.random.seed(self.random_seed)

    if self.scale_data: 
        X_tensor = self.input_scaler.transform(X_tensor)

    self.model.train()
    preds = []
    with torch.no_grad():
        for _ in range(self.n_samples):
            preds.append(self.model(X_tensor))
    preds = torch.stack(preds)
    mean = preds.mean(dim=0)
    variance = torch.var(preds, dim=0) + 1 / self.tau 
    std = variance.sqrt()
    std_mult = torch.tensor(st.norm.ppf(1 - self.alpha / 2), device=mean.device)
    lower = mean - std * std_mult
    upper = mean + std * std_mult

    if self.scale_data: 
        mean = self.output_scaler.inverse_transform(mean).squeeze()
        lower = self.output_scaler.inverse_transform(lower).squeeze()
        upper = self.output_scaler.inverse_transform(upper).squeeze()

    else: 
        mean = mean.squeeze() 
        lower = lower.squeeze() 
        upper = upper.squeeze() 

    if not self.requires_grad: 
        return mean.detach().cpu().numpy(), lower.detach().cpu().numpy(), upper.detach().cpu().numpy()

    else: 
        return mean, lower, upper

save(path)

Save model weights, config, and scalers to disk.

Parameters:

  • path (str or Path): Directory to save model components. Required.
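
A saving sketch (the directory name is arbitrary; the expected files follow from the implementation shown below):

reg.save("saved_models/mc_dropout")   # "reg" is a fitted MCDropoutRegressor
# Files written by save (per the source below):
#   saved_models/mc_dropout/config.json  - JSON-serializable hyperparameters
#   saved_models/mc_dropout/extras.pkl   - optimizer/scheduler classes, kwargs, and scalers
#   saved_models/mc_dropout/model.pt     - trained MLP weights (state_dict)
# plus any log files written by the attached training/tuning loggers
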
Source code in uqregressors\bayesian\dropout.py
def save(self, path):
    """
    Save model weights, config, and scalers to disk.

    Args:
        path (str or Path): Directory to save model components.
    """
    path = Path(path)
    path.mkdir(parents=True, exist_ok=True)

    # Save config (exclude non-serializable or large objects)
    config = {
        k: v for k, v in self.__dict__.items()
        if k not in ["optimizer_cls", "optimizer_kwargs", "scheduler_cls", "scheduler_kwargs", "input_scaler", 
                     "output_scaler", "_loggers", "training_logs", "tuning_loggers", "tuning_logs"]
        and not callable(v)
        and not isinstance(v, (torch.nn.Module,))
    }

    config["optimizer"] = self.optimizer_cls.__class__.__name__ if self.optimizer_cls is not None else None
    config["scheduler"] = self.scheduler_cls.__class__.__name__ if self.scheduler_cls is not None else None
    config["input_scaler"] = self.input_scaler.__class__.__name__ if self.input_scaler is not None else None 
    config["output_scaler"] = self.output_scaler.__class__.__name__ if self.output_scaler is not None else None

    with open(path / "config.json", "w") as f:
        json.dump(config, f, indent=4)

    with open(path / "extras.pkl", 'wb') as f: 
        pickle.dump([self.optimizer_cls, 
                     self.optimizer_kwargs, self.scheduler_cls, self.scheduler_kwargs, self.input_scaler, self.output_scaler], f)

    # Save model weights
    torch.save(self.model.state_dict(), path / f"model.pt")

    for i, logger in enumerate(getattr(self, "_loggers", [])):
        logger.save_to_file(path, idx=i, name="estimator")

    for i, logger in enumerate(getattr(self, "tuning_loggers", [])): 
        logger.save_to_file(path, name="tuning", idx=i)

MLP

Bases: Module

A simple feedforward neural network with dropout for regression.

This MLP supports customizable hidden layer sizes, activation functions, and dropout. It outputs a single scalar per input — the predictive mean.

Parameters:

  • input_dim (int): Number of input features. Required.
  • hidden_sizes (list of int): Sizes of the hidden layers. Required.
  • dropout (float): Dropout rate (applied after each activation). Required.
  • activation (callable): Activation function (e.g., nn.ReLU). Required.
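
For illustration, the MLP can be instantiated directly with the constructor arguments listed above (the values are arbitrary):

import torch
import torch.nn as nn
from uqregressors.bayesian.dropout import MLP

net = MLP(input_dim=8, hidden_sizes=[64, 64], dropout=0.1, activation=nn.ReLU)
x = torch.randn(4, 8)    # batch of 4 inputs with 8 features
print(net(x).shape)      # torch.Size([4, 1]) - one scalar output per input
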
Source code in uqregressors\bayesian\dropout.py
class MLP(nn.Module):
    """
    A simple feedforward neural network with dropout for regression.

    This MLP supports customizable hidden layer sizes, activation functions,
    and dropout. It outputs a single scalar per input — the predictive mean.

    Args:
        input_dim (int): Number of input features.
        hidden_sizes (list of int): Sizes of the hidden layers.
        dropout (float): Dropout rate (applied after each activation).
        activation (callable): Activation function (e.g., nn.ReLU).
    """
    def __init__(self, input_dim, hidden_sizes, dropout, activation):
        super().__init__()
        layers = []
        for h in hidden_sizes:
            layers.append(nn.Linear(input_dim, h))
            layers.append(activation())
            layers.append(nn.Dropout(dropout))
            input_dim = h
        layers.append(nn.Linear(hidden_sizes[-1], 1))
        self.model = nn.Sequential(*layers)

    def forward(self, x):
        return self.model(x)