My stuff
Probabilistic Safety for Bayesian Neural Networks

We study probabilistic safety for Bayesian Neural Networks (BNNs) under adversarial input perturbations. Given a compact set of input points,  T ⊆ Rm, we study the probability w.r.t. the BNN posterior that all the points in T are mapped to the same region S in the output space. In particular, this can be used to evaluate the probability that a network sampled from the BNN is vulnerable to adversarial attacks. We rely on relaxation techniques from non-convex optimization to develop a method for computing a lower bound on probabilistic safety for BNNs, deriving explicit procedures for the case of interval and linear function propagation techniques. We apply our methods to BNNs trained on a regression task, airborne collision avoidance, and MNIST, empirically showing that our approach allows one to certify probabilistic safety of BNNs with millions of parameters.

Although Neural Networks (NNs) have recently achieved state-of-the-art performance [14], they are susceptible to several vulnerabilities [3], with adversarial examples being one of the most prominent among them [28]. Since adversarial examples are arguably intuitively related to uncertainty [17], Bayesian Neural Networks (BNNs), i.e. NNs with a probability distribution placed over their weights and biases [24], have recently been proposed as a potentially more robust learning paradigm [6]. While retaining the advantages intrinsic to deep learning (e.g. representation learning), BNNs also enable principled evaluation of model uncertainty, which can

Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI), PMLR volume 124, 2020.


be taken into account at prediction time to enable safe decision making.

Many techniques have been proposed for the evaluation of the robustness of BNNs, including generalisation of NN gradient-based adversarial attacks to BNN posterior distribution [19], statistical verification techniques for adversarial settings [7], as well as pointwise (i.e. for a specific test point  x∗) uncertainty evaluation techniques [27]. However, to the best of our knowledge, methods that formally (i.e., with certified bounds) give guarantees on the behaviour of BNNs against adversarial input perturbations in probabilistic settings are missing.

In this paper we aim at analysing probabilistic safety for BNNs. Given a BNN, a set  T ⊆ Rmin the input space, and a set  S ⊆ Rncin the output space, probabilistic safety is defined as the probability that for all  x ∈ Tthe prediction of the BNN is in S. In adversarial settings, the input region T is built around the neighborhood of a given test point  x∗, while the output region S is defined around the BNN prediction on  x∗, so that probabilistic safety translates into computing the probability that adversarial perturbations of  x∗cause small variations in the BNN output. Note that probabilistic safety represents a probabilistic variant of the notion of safety commonly used to certify deterministic NNs [26].

Unfortunately, computation of probabilistic safety for a BNN over a compact set T and for an output region S is not trivial, as it involves computing the probability that a deterministic NN sampled from the BNN posterior is safe (i.e., all the points in a given set are mapped by the network to a given output set), which is known to be NPcomplete [16]. Nevertheless, we derive a method for the computation of a certified lower bound for probabilistic safety. In particular, we show that the computation of probabilistic safety for BNNs is equivalent to computing the measure, w.r.t. BNN posterior, of the set of weights for which the resulting deterministic NN is safe, i.e., robust to adversarial perturbations. We compute a subset of such weights, ˆH, and build on relaxation techniques from non-linear optimisation to check whether, given a compact set T and a safe output region S, all the networks instantiated by weights in ˆHare safe. We provide lower bounds for the case of Interval Bound Propagation (IBP) and Linear Bound Propagation (LBP) for BNNs trained with variational inference (VI) [5]. However, we note that the derived bounds can be extended to other approximate Bayesian inference techniques.

We experimentally investigate the behaviour of our method on a regression task, the VCAS dataset and the MNIST dataset, for a range of properties and different BNN architectures, and applying bounds obtained both via IBP and LBP. We show that our method allows one to compute non-trivial, certified lower bounds on the probabilistic safety of BNNs with millions of weights in a few minutes.

In summary, this paper makes the following main contributions.1

We propose a framework based on relaxation techniques from non-convex optimisation for the analysis of probabilistic safety for BNNs with general activation functions and multiple hidden layers.

We derive explicit procedures based on IBP and LBP for the computation of the set of weights for which the corresponding NN sampled from the BNN posterior distribution is safe.

On various datasets we show that our algorithm can efficiently verify multi-layer BNNs with millions of parameters in a few minutes.

Related Work. Most existing certification methods are designed for deterministic neural networks (see e.g., [16]) and cannot be used for verification of Bayesian models against probabilistic properties. In particular, in [30, 32, 13] Interval Bound Propagation (IBP) and Linear Bound Propagation (LBP) approaches have been employed for certification of deterministic neural networks. However, these methods cannot be used for BNNs because they all assume that the weights of the networks are deterministic, i.e fixed to a given value, while in the Bayesian setting we need to certify the BNN for a continuous range of values for weights that are not fixed, but distributed according to the BNN posterior. We extend IBP and LBP to BNNs in Section 4.1.

Bayesian uncertainty estimates have been shown to empirically flag adversarial examples in [25, 27]. However, these approaches only consider pointwise uncertainty estimates, that is, specific to a particular test point. In contrast, probabilistic safety aims at computing the uncertainty for compact subspaces of input points, thus taking into account worst-case adversarial perturbations of the input point when considering its neighbourhood. A probabilistic property analogous to that considered in this paper has been studied for BNNs in [7, 23]. However, the solution methods presented in [7, 23] are statistical, with confidence bounds, and in practice cannot give certified guarantees for the computed values, which are important for safety-critical applications. Our approach, instead building on non-linear optimisation relaxation techniques, computes a certified lower bound.

Methods to compute probabilistic adversarial measures of robustness in Bayesian learning settings have been explored for Gaussian Processes (GPs), both for regression [8] and classification tasks [27, 4]. However, the vast majority of inference methods employed for BNNs do not have a Gaussian approximate distribution in latent/function space, even if they assume a Guassian distribution in the weight space [5]. Hence, certifica-tion techniques for GPs cannot be directly employed for BNNs. In fact, because of the non-linearity of the BNN structure, even in this case the distribution of the BNN is not Gaussian in function space. Though methods to approximate BNN inference with GP-based inference have been proposed [10], the guarantees obtained in this way would apply to the approximation and not the actual BNN, and would not provide error bounds. In contrast, our method is specifically tailored to take into account the non-linear nature of BNNs and can be directly applied to a range of approximate Bayesian inference techniques.


In this section we review BNNs. We write  f w(x) =[f w1 (x), . . . , f wnc(x)]to denote a BNN with  ncoutput units and an unspecified number of hidden layers, where w is the weight vector random variable. Given a distribution over w and  w ∈ Rnw, a weight vector sampled from the distribution of w, we denote with  f w(x)the corresponding deterministic neural network with weights fixed to w. Let  D = {(xi, yi), i ∈ {1, ..., ND}}be the training set. In Bayesian settings, we assume a prior distribution over the weights, i.e.  w ∼ p(w), so that learning amounts to computing the posterior distribution over the weights, p(w|D), via the application of Bayes rule. Unfortunately, because of the non-linearity introduced by the neural network architecture, the computation of the posterior cannot be done analytically [20].

In this work we focus on Gaussian Variational Inference (VI) approximations via Bayes by Backprop [5]. In particular, VI proceed by finding a suitable Gaussian approximating distribution  q(w) = N(w|µ, Σ)for the posterior distribution, i.e. such that  q(w) ≈ p(w|D). The core idea is that the mean and covariance of q(w) are iteratively optimized by minimizing a divergence measure between q(w) and p(w|D). In particular, in Section 4 we give explicit bounds for BNN certification for variational inference; however, we remark that the techniques developed in this paper also extend to other classes of distributions and approximate inference methods, such as Hamiltonian Monte Carlo [24] or dropout [12], with the caveat that in those cases the integral in Eqn. (2) (Section 4) will not be Gaussian and will need to be computed by means of Monte Carlo sampling techniques.

A BNN is a stochastic process whose randomness comes from the weight distribution. Therefore, in order to study robustness, its probabilistic nature should be considered. In this paper we focus on probabilistic safety, which is a widely employed measure to characterize the robustness of stochastic models [1, 18], and also represents a probabilistic generalization of the notion of safety commonly used to guarantee the robustness of deterministic neural networks against adversarial examples [26]. In particular, probabilistic safety for BNNs is defined as follows.

Definition 1. Let  f wbe a BNN, D a dataset,  T ⊂ Rma compact set of input points, and  S ⊆ Rnca safe set. Then, probabilistic safety is defined as


Psafe(T, S)is the probability that for all input points in T the output of the BNN belongs to a given output set S. Note that the probabilistic behaviour in  Psafe(T, S)is uniquely determined from the distribution over the weights random variable w. In particular, no distribution is assumed in the input space. Hence,  Psafe(T, S)represents a probabilistic measure of robustness. We make the following assumption on S.

Assumption 1. We assume that S is described by  nSlinear constraints on the values of the final layer of the BNN, that is,

S = {y ∈ Rnc|  CSy+dS ≥ 0, CS ∈ RnS×nc, dS ∈ RnS}

We stress that, as discussed in [13], this assumption encompasses most properties of interest for the verification of neural networks. We refer to  CSy + dS ≥ 0as the safety specification associated to the safe set S. Below, in Example 1, we illustrate the notion of probabilistic safety on an example.

Example 1. We consider a regression task where we learn a BNN from noisy data centred around the function y = x3, as illustrated in Figure 1. We let  T = [−ϵ, ϵ]and  S = [−δ, δ], with  ϵ = 0.2and  δ = 5be two intervals. Then, we aim at computing  Psafe(T, S), that is, the probability that for all  x ∈ [−ϵ, ϵ], f w(x) ∈ [−δ, δ].


Figure 1: Left: 50 points sampled uniformly from  [−4, 4] andtheir corresponding outputs according to  y = x3 + N(0, 5).Right: For  T = [−ϵ, ϵ] and S = [−δ, δ], with ϵ = 0.2 andδ = 5, we visualize the property  Psafe(T, S). The red dot and bar depicts the mean and the standard deviation of the BNN prediction in  x∗ = 0.

Probabilistic safety can be used in a regression setting to formally account for the uncertainty in the learned model. For instance, it can be employed in a model-based reinforcement learning scenario to ensure the safety of the learned model under uncertain environments [2]. In a classification problem,  Psafe(T, S)could be used to evaluate the uncertainty around the model predictions in adversarial settings [11]. We remark that probabilistic safety is also related to other adversarial robustness measures in the literature [9, 4]. In particular, it is straightforward to show (see Proposition 4 in the Supplementary Material in [31]) that


Moreover, if for  i ∈ {1, ..., nc}and  a ∈ R>0we assume that  S = {y ∈ Rnc | yi > a}, then it holds that


Approach Outline Probabilistic safety,  Psafe(T, S), is not trivial to compute for a given compact set T in the input space and safe set S that satisfies Assumption 1. In fact, already in the case of deterministic neural networks, safety has been shown to pose an NP-complete problem [32]. Therefore, in Section 4 we derive a method for lower-bounding (i.e., for the worst-case analysis) of Psafe(T, S), which can be used for certification of the probabilistic safety of BNNs. We first show that the computation of  Psafe(T, S)is equivalent to computing the maximal set of weights H such that the corresponding deterministic neural network is safe, i.e.,  H = {w ∈Rnw|∀x ∈ T, f w(x) ∈ S}. The computation of H is unfortunately itself not straightforward. Instead, in Section 5, we design a method to compute a subset of safe weights ˆH ⊆ H, and discuss how ˆHcan be used to compute a certified lower bound to  Psafe(T, S). Our method (detailed in Algorithm 1) works by sampling weights w from the posterior BNN distribution and iteratively building safe weight regions, in the form of disjoint hyper-rectangles, around the sampled weights. This requires a subroutine that, given ˆH, checks whether it constitutes a safe set of weights or not, that is, verifies if the statement ˆH ⊆ His true. In Section 4.1, we derive two alternative approaches for the solution of this problem, one based on Interval Bound Propagation (IBP) (introduced in Section 4.1.1) and the other on Linear Bound Propagation (LBP) (introduced in Section 4.1.2). These work by propagating the input region T and the weight rectangle ˆHthrough the neural network, in the form of intervals (in the case of IBP) and lines (in the case of LBP) and checking whether the resulting output region is a subset of S, thus deciding if ˆHis a safe set of weights or not.

We show that the computation of  Psafe(T, S)reduces to computing the maximal set of weights for which the corresponding deterministic neural network is safe. To formalize this concept, consider the following definition.

Definition 2. We say that  H ⊆ Rnwis the maximal safe set of weights from T to S, or simply the maximal safe set of weights, iff  H = {w ∈ Rnw | ∀x ∈ T, f w(x) ∈ S}.Furthermore, we say that ˆHis a safe set of weights from T to S, or simply a safe set of weights, iff ˆH ⊆ H.

If ˆHis a safe set of weights, then Definition 2 implies that for any  w ∈ ˆHthe corresponding neural network is safe, i.e.,  ∀x ∈ T, f w(x) ∈ S.Then, a trivial consequence of the definition of probabilistic safety is the following proposition. Proposition 1. Let H be the maximal safe set of weights from T to S. Assume that  w ∼ q(w). Then, it holds that


Proposition 1 simply translates the safety property from the function space to an integral computation on the weight space. As a consequence of Proposition 1, in the case of  q(w) = N(w|µ, Σ)with diagonal covariance, i.e., of posterior distribution computed by diagonal Gaussian VI, we obtain the following corollary.

Corollary 1. Assume that  Σ, the covariance matrix of the posterior distribution of the weights, is diagonal with diagonal elements  Σ1, ..., Σnw. Let ˆH1, ..., ˆHMbe M safe sets of weights such that, for  i ∈ {1, .., M}, ˆHi =[li1, ui1] × ... × [linw, uinw]and  ˆHi ∩ ˆHj = ∅for  i ̸= j. Then, it holds that


Proposition 1 and Corollary 1 guarantee that the computation of  Psafeis equivalent to the characterization of H, the maximal safe set of weights, and a lower bound for Psafecan be computed by considering the union of M safe sets of weights. In what follows, in Section 4.1, we derive a framework to efficiently check if a given set of weights is safe. Then, in Section 5 we present a method to generate safe sets of weights, which will be integrated in an algorithm for the computation of  Psafe(S, T)by making use of Proposition 1.


In this subsection we derive schemes for checking whether a given hyper-rectangle, ˆH, in the weight space is such that ˆH ⊆ H, that is, for  S = {y ∈ Rnc | CSy +dS ≥ 0, CS ∈ RnS×nc, dS ∈ RnS}, we want to check whether  CSf w(x) + dS ≥ 0, ∀x ∈ T, ∀w ∈ ˆH.This is equivalent to check:


In the following we show how IBP (Section 4.1.1) and LBP (Section 4.1.2) can be used to find a lower bound on the solution of the problem posed by Equation (3). The basic principles behind the two methods are depicted in Figure 2 for an illustrative one-dimensional case (IBP shown in plots (a)–(c), LBP in plots (d)–(f)). Given a bounding box in the input of each BNN layer (plot (a), as for a deterministic NN), and an interval in the weight space, ˆH(plot (b), which is due to the fact that the BNN has probabilistic weights), IBP propagates the two bounding boxes, as detailed in Section 4.1.1, to obtain a bounding box on the network output (plot (c)). The pro- cess is then iterated for each layer. In LBP, instead, the linear function that bounds the input at each layer (plot (d)) is combined with the weight space interval ˆH(plot (e)) to obtain a linear bound on the layer output (plot (f)) as detailed in Section 4.1.2. Intuitively, as LBP allows for a linear bound, it mimics more closely the behaviour of the network, thus giving better bounds, albeit at an increased computational cost. This is further investigated in the experiments in Section 6.


Figure 2: Example of IBP (first row) and LBP (second row) for BNNs. For IBP we consider an interval in the input space (a) together with an interval in the weight space (b). The two intervals are combined to obtain an interval in the output, which is guaranteed to contain the network output (c). In the LBP case, the input bound and the weight interval are combined to obtain linear bounds that contain the network output at any layer (d)-(e)-(f). In the last column we show the application of Algorithm 1 for the computation of the property illustrated in Example 1. We consider different values of the parameters required by the algorithm for both IBP and LBP. Notice that LBP tends to give tighter bounds (i.e. higher values) compared to IBP.

Before discussing IBP and LBP in detail, we first introduce common notation for the rest of the section. We consider fully-connected networks of the form:2


for k = 1, . . . , K, where K is the number of hidden layers,  σ(·)is a pointwise activation function,  W (k) ∈Rnk×nk−1and  b(k) ∈ Rnkare the matrix of weights and vector of biases that correspond to the kth layer of the network and  nkis the number of neurons in the kth hidden layer. We write  W (k)i:for the vector comprising the elements from the ith row of  W (k), and similarly W (k):jfor that comprising the elements from the jth column.  ζ(K+1)represents the final output of the network (or the logit in the case of classification networks), that is,  ζ(K+1) = f w(x). We write  W (k),Land  W (k),Ufor the lower and upper bound induced by ˆHfor  W (k)and b(k),Land  b(k),Ufor those of  b(k), for k = 0, . . . , K. Notice that  z(0), ζ(k+1)iand  z(k)iare all functions of the input point x and of the combined vector of weights w = [W (0), b(0), . . . , W (K), b(K)]. We omit the explicit dependency for simplicity of notation. Finally, we remark that, as both the weights and the input vary in a given set, Equation (5) defines a quadratic form.

4.1.1 Interval Bound Propagation

IBP has already been employed for fast certification of deterministic neural networks [13]. For a deterministic network, the idea is to propagate the input box around x, i.e.,  T = [xL, xU],3 through the first layer, so as to find values  z(1),Land  z(1),Usuch that  z(1) ∈ [z(1),L, z(1),U], and then iteratively propagate the bound through each consecutive layer for k = 1, . . . , K. The final box constraint in the output layer can then be used to check for the property of interest [13]. The only adjustment needed in our setting (that is, for the bounding of Equation (3)) is that at each layer we also need to propagate the interval on the weight matrix  [W (k),L, W (k),U]and that on the bias vector  [b(k),L, b(k),U]. This can be done by noticing that the minimum and maximum of each term of the bi-linear form of Equation (5), that is, of each monomial W (k)ij z(k)j, lies in one of the four corners of the interval [W (k),Lij , W (k),Uij ] × [z(k),Lj , z(k),Uj ], and by adding the minimum and maximum values respectively attained by b(k)i. As in the deterministic case, interval propagation through the activation function proceeds by observing that generally employed activation functions are monotonic, which permits the application of Equation (6) to the bounding interval. This is summarised in the following proposition.

Proposition 2. Let  f w(x)be the network defined by the set of Equations (4)(6), let for k = 0, . . . , K:


where i = 1, . . . , nk+1, j = 1, . . . , nk, and z(k),L =σ(ζ(k),L), z(k),U = σ(ζ(k),U)and:


Then we have that  ∀x ∈ Tand  ∀w ∈ ˆH:


The proposition above, whose proof is in the Supplementary Material (found in [31]), yields a bounding box for the output of the neural network in T and ˆH. This can be used directly to find a lower bound for Equation (3), and hence checking whether ˆHis a safe set of weights. As noticed in [13] and discussed in the Supplementary Material, for BNNs a slightly improved IBP bound can be obtained by eliding the last layer with the linear formula of the safety specification of set S.

4.1.2 Linear Bound Propagation

We now discuss how LBP can be used to lower-bound the solution of Equation (3), as an alternative to IBP. In LBP, instead of propagating bounding boxes, one finds lower and upper Linear Bounding Functions (LBFs) for each layer and then propagates them through the network. As the bounding function has an extra degree of freedom w.r.t. the bounding boxes obtained through IBP, LBP usually yields tighter bounds, though at an increased computational cost. Since in deterministic networks non-linearity comes only from the activation functions, LBFs in the deterministic case are computed by bounding the activation functions, and propagating the bounds through the affine function that defines each layer.

Similarly, in our setting, given T in the input space and ˆHfor the first layer in the weight space, we start with the observation that LBFs can be obtained and propagated through commonly employed activation functions for Equation (6), as discussed in [32].

Lemma 1. Let  f w(x)be defined by Equations (4)(6). For each hidden layer k = 1, . . . , K, consider a bounding box in the pre-activation function, i.e. such that ζ(k)i ∈ [ζ(k),Li , ζ(k),Ui ]for  i = 1, . . . , nk. Then there exist coefficients  α(k),Li , β(k),Li , α(k),Uiand  β(k),Uiof lower and upper LBFs on the activation function such that for all  ζ(k)i ∈ [ζ(k),Li , ζ(k),Ui ]it holds that:


The lower and upper LBFs can thus be minimised and maximised to propagate the bounds of  ζ(k)in order to compute a bounding interval  [z(k),L, z(k),U]for  z(k) =σ(ζ(k)). Then, LBFs for the monomials of the bi-linear form of Equation (5) can be derived using McCormick’s inequalities [22]:


for every  i = 1, . . . , nk, j = 1, . . . , nk−1and k = 1, . . . , K. The bounds of Equations (7)(8) can thus be used in Equation (5) to obtain LBFs on the pre-activation function of the following layer, i.e.  ζ(k+1). The final linear bound can be obtained by iterating the application of Lemma 1 and Equations (7)(8) through every layer. This is summarised in the following proposition, which is proved in the Supplementary Material (see [31]) along with an explicit construction of the LBFs.

Proposition 3. Let  f w(x)be the network defined by the set of Equations (4)(6). Then for every k = 0, . . . , K there exist lower and upper LBFs on the pre-activation function of the form:


where  ⟨·, ·⟩is the Frobenius product between matrices,  ·is the dot product between vectors, and the explicit formulas for the LBF coefficients, i.e.,  µ(k+1),Li , ν(l,k+1),Li, λ(k+1),Li , µ(k+1),Ui , ν(l,k+1),Ui, are given in the Supplementary Material [31].

Now let  ζ(k),Liand  ζ(k),Uirespectively be the minimum and the maximum of the right-hand side of the two equations above; then we have that  ∀x ∈ Tand  ∀w ∈ ˆH:


Again, while the lower and upper bounds on  f w(x)computed by Proposition 3 can be directly used to check whether ˆHis a safe set of weights, an improved bound can be obtained by directly considering the linear constraint form of S when computing the LBFs. This is further described in the Supplementary Material [31].


are added to ˆHin line 7. In line 10 we merge overlapping rectangles generated in the main algorithm loop, to guarantee that ˆHis a set of non-overlapping hyper-rectangles of weights (as for Corollary 1). This is done simply by iterating over the previously computed safe rectangles, and by merging them iteratively over each dimension. Finally, in lines 1215, by employing Corollary 1, we compute a lower bound for  Psafe(S, T). The following theorem, which is a straightforward consequence of Proposition 1, 2, 3 and Corollary 1, guarantees that Algorithm 1 returns a certified lower bound for  Psafe(S, T).

Theorem 1. Let p be as computed by Algorithm 1. Then, it holds that


In the general case, when a non-Gaussian distribution or full covariance matrix is employed (e.g. with SWAG [21] or HMC) the only modifications to make to the algorithm are in line 4, where an estimation of the weight variance can be used, and in line 14, which needs to be computed with Monte Carlo techniques. In this case, the weights sampled during integration can be utilised in line 3 of Algorithm 1. We expect that different posterior approximation methods with non-diagonal distributions will yield different probabilistic safety profiles, but stress that our method and its computational complexity is independent of the inference method used.

Example 2. Lower bounds for the property discussed in Example 1 are depicted in Figure 2 (rightmost plots). We analyse the influence of parameters involved in Algorithm 1 (the number of samples N and the weight margin γ), and the use of either IBP or LBP. As expected, we obtain higher values (that is, a tighter bound) as the number of samples increases, as this implies a better chance of finding a safe weight interval, and we observe similar behaviour for  γ. Notice also that the bounds provided by LBP (bottom row) are always slightly more tight than those provided by IBP (top row).

We explore the empirical performance of our framework. We begin with an extended analysis of the regression setting introduced in Example 1. We then turn our attention to an airborne collision avoidance scenario (VCAS) [16, 29]. Finally, we explore the scalability of our approach on the MNIST dataset, and show that we compute non-trivial lower bounds on probabilistic safety even for networks with over 1M parameters.

Regression Task In Figure 3 we explore the regression problem introduced in Example 1. We train a BNN with 128 hidden neurons for 10 epochs. We check the per-


Figure 3: Probabilistic safety for the regression task on various input regions. Each box in the plot represents a safety specification and is colored with the lower bound returned by our method on the probability that the specification is met. The purple region represents 2 standard deviations about the mean of the BNN.

formance of our methodology on this BNN under different properties by considering many input regions T and output sets S. In particular, we extend the property in Example 1 to multiple intervals along the  x−axis for different values of  ϵand  δand use LBP for the certifi-cation of safe weights regions. Namely, we explore four different property specifications for the combination of ϵ ∈ {0.1, 0.25}and  δ ∈ {2, 6}, where every box in the plot represents both an input (range along the x axis) and output region (range along the y axis) and is colored accordingly with the lower bound obtained (which hence represents a lower bound on the probability that the samples from the BNN will remain inside that box for each specific range of x-axis values). Naturally, we obtain higher values in regions in which the BNN is flat. Also the bounds decreases as soon as  ϵor  δincreases as these imply a tighter property specification.

Airborne Collision Avoidance We empirically evaluate probabilistic safety for the vertical collision avoidance system dataset (VCAS) [15]. The task of the original NN is to take as input the information about the geometric layout (heading, location, and speed) of the ownship and intruder, and return a warning if the ownship’s current heading puts it on course for a near midair collision (NMAC). There are four input variables describing the scenario ( Figure 4) and nine possible advisories corresponding to nine output dimensions. Each output is assigned a real-valued reward. The maximum reward advisory indicates the safest warning given the current intruder geometry. The three most prominent advisories are clear of conflict (COC), descend at a rate ≥1500 ft/min (DES1500), and climb at a rate  ≥1500 ft/min (CLI1500). We train a BNN with one hidden layer


Table 1: VCAS probabilistic safety. We see that, though LBP is 5 times slower than IBP, it gives slightly tighter lower bounds, similarly to the observations in Figure 2.

with 512 hidden neurons that focuses on the situation in which the ownship’s previous warning was COC, where we would like to predict if the intruder has moved into a position which requires action. This scenario is represented by roughly 5 million entries in the VCAS dataset and training our BNN with VI results in test accuracy of 91%. We use probabilistic safety to evaluate whether the network is robust to four properties, referred to as φ1, φ2, φ3and  φ4, which comprise a probabilistic extension of those considered for NNs in [16, 29]. Properties  φ1and  φ2test the consistency of DES1500 and CLI1500 advisories: given a region in the input space, φ1and  φ2ensure that the output is constrained such that DES1500 and CLI1500 are the maximal advisories for all points in the region, respectively. On the other hand, φ3and  φ4test that, given a hyper-rectangle in the input space, no point in the hyper-rectangle causes DES1500 or CLI1500 to be the maximal advisory. The properties we test are depicted in the centre and right plot of Figure 4. In Table 1 we report the results of the verification of the above properties, along with their computational times and the number of weights sampled for the veri-fication. Our implementation of Algorithm 1 with both LBP and IBP is able to compute a lower bound for probabilistic safety with these properties in a few hundreds of seconds.5 Notice that, also in this case, LBP gives tighter bounds than IBP, though it takes around 5 times longer to run, which is a similar trade-off to what is observed for deterministic NNs [32].

Image Classification on MNIST We train several BNNs on MNIST to study how our method scales with the number of neurons, considering networks that are hundreds of times larger than those for the VCAS dataset. For this analysis we consider image classifica-tion with the MNIST dataset. In order to model manipulations of an image we use the  l∞-norm  ϵ-ball around


Figure 4: VCAS encounter geometry and properties under consideration. Left: Taken from [15], a visualization of the encounter geometry and the four variables that describe it (distance  τ, ownship heading ˙hown, intruder heading ˙hint, and vertical separation h). Center: Visualization of ground truth labels (in color); red boxes indicate hyper-rectangles that make up the input areas for property φ1(red boxes in the blue area) and  φ2(red boxes in the green area). Right: Hyper-rectangle for visualization of properties  φ3 andφ4: we ensure that DES1500 is not predicted in the green striped box and CLI1500 is not predicted in the blue striped box.

test points. For all manipulations of magnitude up to  ϵ, we would like to check that the classification remains the same as that given by the (argmax of the) posterior predictive distribution. This can be done by first taking the classification according to the posterior on the test point x∗. Let i be the predicted class index, then we create a nC × nCmatrix,  CS, where the ith column is populated with ones and the diagonal is populated with negative ones, save for the ith entry which is set to one. Finally, ensuring that all entries of the vectors given by  CSf w(x)for all  x ∈ Tare positive indicates that the classification does not change.

Evaluation. Using IBP, we are able to certify that the probabilistic safety of more than half of the tested 100 images is greater than 0.9 in the case of a 2 hidden layer BNN with 256 nodes per layer. In Figure 5, we compare the lower bounds obtained with our approach with an empirical estimate obtained by sampling 500 posterior weights from the BNN posterior, so as to evaluate the tightness of the bounds obtained in practice. The results are given for the average over the same 100 images employed for the results reported in Figure 5. For 1 layer BNNs we use  ϵ = 0.025and for 2 layer BNNs we use ϵ = 0.001. Notice that tight bounds can be obtained even for BNNs with almost a million of parameters, in particular, for a two-layer BNN with 512 neurons per layer, we have that our bound is within 5% of the empirical results.

We considered probabilistic safety for BNNs, a worst-case probabilistic adversarial robustness measure, which can be used to certify a BNN against adversarial perturbations. We developed an algorithmic framework for the computation of probabilistic safety based on techniques from non-convex optimization, which computes a certi-fied lower bound of the measure. On experiments on various datasets we showed that our methods allows one to


Figure 5: Performance of our framework on BNN architectures with varying numbers of neurons. The height of each bar represents the mean of the probabilistic safety value computed on 100 test set images.

certify BNNs with general activation functions, multiple layers, and millions of parameters.

Acknowledgements This project was funded by the EU’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie (grant agreement No. 722022), the ERC under the European Unions Horizon 2020 research and innovation programme (grant agreement No. 834115) and the EPSRC Programme Grant on Mobile Autonomy (EP/M019918/1).

[1] Abate, A., Prandini, M., Lygeros, J., and Sastry, S. (2008). Probabilistic reachability and safety for controlled discrete time stochastic hybrid systems. Automatica, 44(11):2724–2734.

[2] Berkenkamp, F., Turchetta, M., Schoellig, A., and Krause, A. (2017). Safe model-based reinforcement learning with stability guarantees. In NeurIPS.

[3] Biggio, B. and Roli, F. (2018). Wild patterns: Ten years after the rise of adversarial machine learning. Pattern Recognition, 84:317–331.


[6] Carbone, G., Wicker, M., Laurenti, L., Patane, A., Bortolussi, L., and Sanguinetti, G. (2020). Robustness of Bayesian neural networks to gradient-based attacks. arXiv preprint arXiv:2002.04359.

[7] Cardelli, L., Kwiatkowska, M., Laurenti, L., Pao- letti, N., Patane, A., and Wicker, M. (2019). Statistical guarantees for the robustness of Bayesian neural networks. IJCAI.

[8] Cardelli, L., Kwiatkowska, M., Laurenti, L., and Patane, A. (2018). Robustness guarantees for Bayesian inference with Gaussian processes. In AAAI.

[9] Dvijotham, K., Garnelo, M., Fawzi, A., and Kohli, P. (2018). Verification of deep probabilistic models. arXiv preprint arXiv:1812.02795.

[10] Emtiyaz Khan, M., Immer, A., Abedi, E., and Ko- rzepa, M. (2019). Approximate inference turns deep networks into Gaussian processes. NeurIPS.

[11] Gal, Y. (2016). Uncertainty in deep learning. PhD thesis, University of Cambridge.

[12] Gal, Y. and Ghahramani, Z. (2016). Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In ICML, pages 1050–1059.

[13] Gowal, S., Dvijotham, K., Stanforth, R., Bunel, R., Qin, C., Uesato, J., Arandjelovic, R., Mann, T., and Kohli, P. (2018). On the effectiveness of interval bound propagation for training verifiably robust models. SecML 2018.

[14] He, K., Zhang, X., Ren, S., and Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In ICCV.

[15] Julian, K. D. and Kochenderfer, M. J. (2019). Guar- anteeing safety for neural network-based aircraft collision avoidance systems. DASC.

[16] Katz, G., Barrett, C., Dill, D. L., Julian, K., and Kochenderfer, M. J. (2017). Reluplex: An efficient SMT solver for verifying deep neural networks. In CAV.

[17] Kendall, A. and Gal, Y. (2017). What uncertainties do we need in Bayesian deep learning for computer vision? In NeurIPS.

[18] Laurenti, L., Lahijanian, M., Abate, A., Cardelli, L., and Kwiatkowska, M. (2020). Formal and efficient

synthesis for continuous-time linear stochastic hybrid processes. IEEE Transactions on Automatic Control.

[19] Liu, X., Li, Y., Wu, C., and Hsieh, C.-J. (2019). Adv-bnn: Improved adversarial defense through robust Bayesian neural network. ICLR.

[20] MacKay, D. J. (1992). A practical Bayesian frame- work for backpropagation networks. Neural computation, 4(3):448–472.

[21] Maddox, W. J., Izmailov, P., Garipov, T., Vetrov, D. P., and Wilson, A. G. (2019). A simple baseline for Bayesian uncertainty in deep learning. In Advances in Neural Information Processing Systems, pages 13132–13143.

[22] McCormick, G. P. (1976). Computability of global solutions to factorable nonconvex programs: Part i convex underestimating problems. Mathematical programming, pages 147–175.

[23] Michelmore, R., Wicker, M., Laurenti, L., Cardelli, L., Gal, Y., and Kwiatkowska, M. (2019). Uncertainty quantification with statistical guarantees in end-to-end autonomous driving control. ICRA.

[24] Neal, R. M. (2012). Bayesian learning for neural networks. Springer Science & Business Media.

[25] Rawat, A., Wistuba, M., and Nicolae, M.-I. (2017). Adversarial phenomenon in the eyes of Bayesian deep learning. arXiv preprint arXiv:1711.08244.

[26] Ruan, W., Huang, X., and Kwiatkowska, M. (2018). Reachability analysis of deep neural networks with provable guarantees. IJCAI.

[27] Smith, L. and Gal, Y. (2018). Understanding mea- sures of uncertainty for adversarial example detection.

[28] Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., and Fergus, R. (2014). Intriguing properties of neural networks. ICLR.

[29] Wang, S., Pei, K., Whitehouse, J., Yang, J., and Jana, S. (2018). Formal security analysis of neural networks using symbolic intervals. In USENIX Security 18.

[30] Weng, T.-W., Zhang, H., Chen, H., Song, Z., Hsieh, C.-J., Boning, D., Dhillon, I. S., and Daniel, L. (2018). Towards fast computation of certified robustness for relu networks. ICML.

[31] Wicker, M., Laurenti, L., Patane, A., and Kwiatkowska, M. (2020). Probabilistic safety for Bayesian neural networks. arXiv:2004.10281.

[32] Zhang, H., Weng, T.-W., Chen, P.-Y., Hsieh, C.- J., and Daniel, L. (2018). Efficient neural network robustness certification with general activation functions. In NeurIPS, pages 4939–4948.

In Proposition 4 we show that probabilistic safety, as de-fined in Definition 1, gives a lower bound to other measures commonly used to guarantee the absence of adversarial examples.

Proposition 4. For  S ⊆ Rncit holds that


Moreover, if for  i ∈ {1, ..., nc}, we assume that S = {y ∈ Rnc | yi > a}. Then, it holds that


Proof of Proposition 4


where the last inequality is due to Markov’s inequality.

Algorithm 1 has a complexity which is linear in the number of samples, N, taken from the posterior distribution of w. The computational complexity of the method is then determined by the computational complexity of the method used to propagate a given interval ˆH(that is, line 5 in Algorithm 1). The cost of performing IBP is O(Knm) where K is the number of hidden layers and n × mis the size of the largest weight matrix  W (k), for k = 0, . . . , K. LBP is instead  O(K2nm).

In this section of the Supplementary Material we provide proofs for the main propositions stated in the paper.

C.1 Proposition 2

The bounding box can be computed iteratively in the number of hidden layers of the network, K. We show how to compute the lower bound of the bounding box; the computation for the maximum is analogous.

Consider the k-th network layer, for k = 0, . . . , K, we want to find for  i = 1, . . . nk+1:


As the activation function  σis monotonic, it suffice to find the minimum of: �nkj=1 W (k)ij z(k)j + b(k)i. Since W (k)ij z(k)jis a bi-linear form defined on an hyper-rectangle, it follows that it obtains its minimum in one of the four corners of the rectangle  [W (k),Lij , W (k),Uij ] ×[z(k),Lj , z(k),Uj ].


that is  z(k+1),Li = σ�ζ(k+1),Li �is a lower bound to the solution of the minimisation problem posed above.

C.2 Proposition 3

We first state the following Lemma that follows directly from the definition of linear functions:

Lemma 2. Let  f L(t) = �j aLj tj + bLand  f U(t) =�j aUj tj + bUbe lower and upper LBFs to a function g(t) ∀t ∈ T, i.e.  f L(t) ≤ g(t) ≤ f U(t) ∀t ∈ T. Consider two real coefficients  α ∈ Rand  β ∈ R. Define




That is, LBFs can be propagated through linear transformation by redefining the coefficients through Equations (9)(10).

We now proof Proposition 3 iteratively on k = 1, . . . , K that is that for  i = 1, . . . , nkthere exist  f (k),Li (x, W)and  f (k),Ui (x, W)lower and upper LBFs such that:


and iteratively find valid values for the LBFs coefficients, i.e.,  µ(k),Li , ν(l,k),Li , λ(k),Li , µ(k),Ui , ν(l,k),Uiand  λ(k),Ui.

For the first hidden-layer we have that  ζ(1)i =�j W (0)ij xj +b(0)i. By inequality (7) and using the lower bound for  b(0)iwe have:


which is a lower LBF on  ζ(1). Similarly using Equation (8) we obtain:


which is an upper LBF on  ζ(1). By setting:


we obtains LBFs  f (1),Li (x, W)and  f (1),Ui (x, W)of the form (11)(12).

Given the validity of Equations (11)(12) up to a certain k we now show how to compute the LBF for layer k + 1, that is given  f (k),Li (x, W)and f (k),Ui (x, W)we explicitly compute  f (k+1),Li (x, W)and  f (k+1),Ui (x, W). Let  ζ(k),Li = min f (k),Li (x, W)and  ζ(k),Ui = max f (k),Ui (x, W)the minimum and maximum of the two LBFs (that can be computed analytically as the functions are linear). For Lemma 1 there exists a set of coefficients such that  z(k)i = σ(ζ(k)i ) ≥α(k),Li ζ(k)i + β(k),Li. By Lemma 2 we know that there exists ¯f (k),Li (x, W)with coefficients  ¯µ(k),Li , ¯ν(l,k),Li, ¯λ(k),Liobtained through Equations 9–10 such that:


that is ¯f (k),Li (x, W)is a lower LBF on  z(k)iwith coefficients  ¯µ(k),Li , ¯ν(l,k),Li, ¯λ(k),Li. Analogously let ¯f (k),Ui (x, W)be the upper LBF on  z(k)icomputed in a similar way.

Consider now the bi-linear layer ζ(k+1)i =�j W (k)ij z(k)j + b(k)i. From Equation (7) we know that: W (k)ij z(k)j ≥ W (k),Lij z(k)j + W (k)ij z(k),Lj − W (k),Lij z(k),Lj. By applying Lemma 2 with  α = W (k),Lijand  β = 0we know that there exists a lower LBF ˆf (k),Lij (x, W)with a set of coefficients  a(k),Lij , b(l,k),Lijand  c(k),Lijcomputed applying Equations (9)(10) to  ¯µ(k),Li , ¯ν(l,k),Li, ¯λ(k),Lisuch that:  W (k),Lij z(k)j ≥ ˆf (k),Lij (x, W). Hence we have:


By setting


and re-arranging the elements in the above inequality, we finally obtain:


which is of the form of Equation (11) for the lower LBF for the k+1-th layer. Similarly an upper LBF of the form of Equation (12) can be obtained by using Equation (8) in the chain of inequalities above.

In this section of the Supplementary Material we discuss how the output of IBP and LBP can be used to check against specification of the form of Assumption 1, that is of the form:


with  CS ∈ RnS×ncand  dS ∈ RnS, ¯H = [wL, wU]and T = [xL, xU]. Let  i = 1, . . . , nSthen we need to check for every i whether:  CS,i · f w(x) + dS,i ≥ 0. Which is equivalent to compute


and checking whether that is greater or equal to zero or not. As presented in the main paper, Propositions 2 and 3 return a bounding box for the final output. Though this bounding box can be directly used to compute the minimum in Equation (13) (as this entails simply the minimisation of a linear function on a rectangular space), tighter bounds can be obtained both for IBP and for LBP. This is described in the following two subsections.


For IBP we can do something similar to what is done in the case of IBP for deterministic NN [13]. Instead of propagating the bounding box up until the very last layer to  f wi (x) = ζK+1i, we stop at the last hidden activation function values  z(K)i. We thus have:


Notice that �ncj=1 CS,ijW (K)jland �ncj=1 CS,ijb(K)jare linear transformation, of the weights and biases. The lower and upper bounds on  W (K)and  b(K)can hence be propagated through this two functions to obtain lower and upper bounds that account for the specification W (K),LS , W (K),US , b(K),LSand  b(K),US. Propagating that interval through the layer:


gives a solution to Equation (13).


For LBP one can simply proceed by propagating the linear bound obtained. In fact, Proposition 3 yields an upper and lower LBFs,  f (K+1),L(x, w)and  f (K+1),U(x, w)on f w(x)for every  x ∈ Tand  w ∈ ˆH. By Lemma 2 those two LBFs can simply be propagated through the linear specification of Equation (13) hence obtaining lower and upper LBFs on the full specification, which can then be minimised to checked for the validity of the interval ˆH.

All our experiments were conducted on a server equipped with two 24 core Intel Xenon 6252 processors and 256GB of RAM. For VCAS experiments no parallelization was necessary, whereas MNIST was parallelized over 25 concurrent threads.

Designed for Accessibility and to further Open Science