Probabilistic Safety Constraints for Learned High Relative Degree System Dynamics

2019·Arxiv

Abstract


This paper focuses on learning a model of system dynamics online while satisfying safety constraints. Our motivation is to avoid offline system identification or hand-specified dynamics models and allow a system to safely and autonomously estimate and adapt its own model during online operation. Given streaming observations of the system state, we use Bayesian learning to obtain a distribution over the system dynamics. In turn, the distribution is used to optimize the system behavior and ensure safety with high probability, by specifying a chance constraint over a control barrier function.

Keywords: Gaussian Process, high relative-degree system safety, control barrier function

1. Introduction

Unmanned vehicles promise to transform many aspects of our lives, including transportation, agriculture, mining, and construction. Successful use of autonomous robots in these areas critically depends on the ability of robots to safely adapt to changing operational conditions. Existing systems, however, rely on brittle hand-designed dynamics models and safety rules that often fail to account for both the complexity and uncertainty of real-world operation. Recent work (Deisenroth and Rasmussen, 2011; Dean et al., 2019; Sarkar et al., 2019; Coulson et al., 2019; Chen et al., 2018; Khojasteh et al., 2018; Liu et al., 2019; Umlauft and Hirche, 2019; Fan et al., 2020; Chowdhary et al., 2014) has demonstrated that learning-based system identification and control techniques may be successful at complex tasks and control objectives. However, two critical considerations for applying these techniques onboard autonomous systems remain unattended: learning online, relying on streaming data, and guaranteeing safe operation, despite the uncertainty inherent to learning algorithms.

Motivated by the utility of Lyapunov functions for certifying stability properties, (Ames et al., 2016; Xu et al., 2017; Xu et al., 2015; Prajna et al., 2007; Ames et al., 2019) proposed Control Barrier Functions (CBFs) as a tool for characterizing the long-term safety of dynamical systems. A CBF certifies whether a control policy achieves forward invariance of a safe set C by evaluating whether the system trajectory remains away from the boundary of C. Most of the literature on CBFs considers systems with known dynamics, low relative degree, no disturbances, and time-triggered control, in which the control inputs are recalculated at a fixed and sufficiently small period. This is limiting because a low control frequency in a time-triggered setting may lead to safety constraint violations between sampling times. On the other hand, a high control frequency leads to inefficient use of computational resources and actuators. (Yang et al., 2019) extend the CBF framework to a self-triggered setup in which the longest time until a control input needs to be recomputed to guarantee safety is provided. CBF techniques handle nonlinear control-affine systems, but many existing results apply only to relative-degree-one systems, in which the first time derivative of the CBF depends on the control input. This requirement is violated by many underactuated robot systems and has motivated extensions to relative-degree-two systems, such as bipedal and car-like robots (Hsu et al., 2015; Nguyen and Sreenath, 2016b). (Nguyen and Sreenath, 2016a) generalized these ideas by designing an exponential control barrier function (ECBF) capable of handling control-affine systems with any relative degree.

Providing safety guarantees for learning-based control techniques has lately been the focus of research (Koller et al., 2018; Berkenkamp et al., 2016; Fisac et al., 2018; Bastani, 2019; Wabersich and Zeilinger, 2018; Biyik et al., 2019). In particular, the CBF framework has been extended to systems with unknown dynamics. For example, techniques for handling additive disturbances have been proposed in (Clark, 2019; Santoyo et al., 2019), while CBF conditions for systems with uncertain dynamics have been proposed in (Fan et al., 2019; Wang et al., 2018; Taylor and Ames, 2019; Cheng et al., 2019; Salehi et al., 2019). Furthermore, (Fan et al., 2019) study time-triggered CBF-based controllers for control-affine systems with relative degree one, where the input gain part of the dynamics is known and invertible. Bayesian learning is used in (Fan et al., 2019) to determine a distribution over the drift term of the dynamics. In particular, (Fan et al., 2019) compared the performance of Gaussian Process regression (Williams and Rasmussen, 2006), Dropout neural networks (Gal and Ghahramani, 2016), and ALPaCA (Harrison et al., 2018) in simulations. (Wang et al., 2018), (Cheng et al., 2019), and (Taylor and Ames, 2019) have studied time-triggered CBF-based control of relative-degree-one systems in the presence of additive uncertainty in the drift part of the dynamics. In (Wang et al., 2018), GP regression is used to approximate the unknown part of the 3D nonlinear dynamics of a quadrotor. (Cheng et al., 2019) proposed a two-layer control design architecture that integrates CBF-based controllers with model-free reinforcement learning. (Taylor and Ames, 2019) proposed adaptive CBFs to deal with parameter uncertainty. (Salehi et al., 2019) study nonlinear systems with drift terms only and use Extreme Learning Machines to approximate the dynamics.

Our work proposes a learning approach for estimating the posterior distribution of robot dynamics from online data in order to design a control policy that guarantees safe operation. We make the following contributions. First, we develop a matrix variate Gaussian Process (GP) regression approach with efficient covariance factorization to learn the drift and input gain terms of a nonlinear control-affine system. Second, we use the GP posterior to specify a probabilistic safety constraint and determine the longest time until a control input needs to be recomputed to guarantee safety with high probability. Finally, we extend our formulation to dynamical systems with arbitrary relative degree and show that a safety constraint can be specified only in terms of the mean and variance of the Lie derivatives of the CBF. Notation, proofs, and additional remarks are available in the appendix at arXiv (Khojasteh et al., 2019).

2. Background

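Consider a control-affine nonlinear system:

$$\dot{x} = f(x) + g(x)u = \begin{bmatrix} f(x) & g(x) \end{bmatrix} \begin{bmatrix} 1 \\ u \end{bmatrix} =: F(x)\underline{u}, \qquad (1)$$

where $x(t) \in \mathbb{R}^n$ and $u(t) \in \mathbb{R}^m$ are the system state and control input, respectively, at time t. Assume that the drift term $f : \mathbb{R}^n \to \mathbb{R}^n$ and the input gain $g : \mathbb{R}^n \to \mathbb{R}^{n \times m}$ are locally Lipschitz. We study the problem of enforcing probabilistic safety properties via CBF when f and g are unknown. We first review key results on CBF-based safety for known dynamics (Ames et al., 2019).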

2.1. Known Dynamics: Control Barrier Functions for Safety

Let $C \subset \mathbb{R}^n$ be a safe set of system states. Assume $C = \{x \in \mathbb{R}^n \mid h(x) \geq 0\}$ is specified as the superlevel set of a continuously differentiable function $h : \mathbb{R}^n \to \mathbb{R}$, such that $\nabla_x h(x) \neq 0$ for all $x \in \partial C$. For any initial condition x(0), there exists a maximum time interval I(x(0)) on which x(t) is the unique solution to (1) (Khalil, 2002). System (1) is safe with respect to set C if C is forward invariant, i.e., for any x(0) in C, x(t) remains in C for all t in I(x(0)). System safety may be asserted as follows.

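Definition 1 A function $h : \mathbb{R}^n \to \mathbb{R}$ is a control barrier function (CBF) for the system in (1) if the control barrier condition (CBC), $\sup_{u \in \mathcal{U}} \mathrm{CBC}(x, u) \geq 0$, is satisfied for all x in the state domain, where $\mathcal{U} \subseteq \mathbb{R}^m$ is the set of admissible control inputs, $\mathrm{CBC}(x, u) := L_f h(x) + L_g h(x) u + \alpha(h(x))$, $\alpha$ is any extended class $\mathcal{K}_\infty$ function, and $L_f h(x)$ and $L_g h(x)$ are the Lie derivatives of h along f and g, respectively.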
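As a minimal sketch of Definition 1, the snippet below evaluates the control barrier condition in its standard form from Ames et al. (2019), CBC(x, u) = L_f h(x) + L_g h(x) u + α(h(x)), for a system with known dynamics. The single-integrator dynamics, the barrier h(x) = 1 − ‖x‖², and the linear choice α(h) = alpha · h are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def cbc(x, u, f, g, h, grad_h, alpha=1.0):
    """CBC(x, u) = L_f h(x) + L_g h(x) u + alpha * h(x), with a linear alpha (assumed)."""
    dh = grad_h(x)
    Lf_h = dh @ f(x)   # Lie derivative of h along the drift f
    Lg_h = dh @ g(x)   # Lie derivative of h along the input gain g (row vector)
    return Lf_h + Lg_h @ u + alpha * h(x)

# Hypothetical example: single integrator x_dot = u, safe set = unit disk.
f = lambda x: np.zeros(2)
g = lambda x: np.eye(2)
h = lambda x: 1.0 - x @ x
grad_h = lambda x: -2.0 * x

x = np.array([0.9, 0.0])                                       # close to the boundary
cbc_outward = cbc(x, np.array([1.0, 0.0]), f, g, h, grad_h)    # input pushes outward
cbc_inward = cbc(x, np.array([-1.0, 0.0]), f, g, h, grad_h)    # input pushes inward
print(cbc_outward, cbc_inward)
```

Near the boundary, the outward input violates the condition (negative value) while the inward input satisfies it, which is the behavior the definition is meant to capture.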

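Theorem 1 (Sufficient Condition for Safety (Ames et al., 2019)) Consider a safe set C with associated function $h : \mathbb{R}^n \to \mathbb{R}$. If $\nabla_x h(x) \neq 0$ for all $x \in \partial C$, then any Lipschitz continuous control policy $\pi(x) \in \{u \mid \mathrm{CBC}(x, u) \geq 0\}$ renders the system in (1) safe.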

Ames et al. (2019) also provide a necessary condition for safety, allowing a concise characterization:

2.2. Known Dynamics: Optimization-based Safe Control

The results in Sec. 2.1 allow designing a control policy that guarantees system safety as long as CBC remains positive at all times. In practice, this is achieved by solving a quadratic program (QP) repeatedly at triggering times

where . While the QP above cannot be solved infinitely fast, Theorem 3 of Ames et al. (2016) shows that if the functions defining the QP are locally Lipschitz, then the resulting control policy and CBC are locally Lipschitz. Thus, for a sufficiently small triggering period, safety is ensured during the inter-triggering times as well.
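To illustrate the optimization-based safe control idea, here is a small sketch of a CBF-constrained QP with a single affine constraint, for which the minimizer has a closed form (a projection onto a half-space). The objective ‖u − u_ref‖², the absence of input bounds, and the names `cbf_qp`, `a`, and `b` are assumptions made for this example; the paper's own objective is not reproduced here.

```python
import numpy as np

def cbf_qp(u_ref, a, b):
    """Solve min ||u - u_ref||^2  s.t.  a + b @ u >= 0.

    Here a stands for L_f h(x) + alpha(h(x)) and b for L_g h(x) at the current state.
    """
    slack = a + b @ u_ref
    if slack >= 0.0:
        return u_ref.copy()              # reference input is already safe
    if not np.any(b):
        raise ValueError("infeasible: the input cannot affect the constraint")
    return u_ref - slack * b / (b @ b)   # closest input on the constraint boundary

# Hypothetical numbers: single integrator near the boundary of the unit disk.
a = 0.19                       # alpha * h(x) with h(x) = 1 - 0.9**2
b = np.array([-1.8, 0.0])      # L_g h(x) = -2 x^T
u_ref = np.array([1.0, 0.0])   # reference input pointing outward
u_safe = cbf_qp(u_ref, a, b)
print(u_safe, a + b @ u_safe)
```

With several constraints or input bounds the closed form no longer applies and a general QP solver would be needed.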

3. Problem Statement

Consider a control-affine nonlinear system (1), where F(x) is unknown. Our objective is to estimate F(x) from online observations of the system state and control trajectory and ensure that (1) remains safe with respect to a set C.

Problem 1 Given a prior Gaussian Process distribution on vec(F(x)) for the unknown system dynamics and a training set, compute the posterior Gaussian Process distribution of vec(F(x)) conditioned on the training set.

Problem 2 Given a safe set C, a safe system state , and the distribution of vec(F(x)) at time , choose a control input and triggering period such that:

where x(t) follows the dynamics in (1), and is a user-specified risk tolerance.

4. Matrix Variate Gaussian Process Regression of System Dynamics

We propose an efficient Gaussian Process (GP) regression approach to estimate a posterior distribution over the dynamics F(x) of the nonlinear control-affine system (1). The posterior will be used to determine the distribution of CBC(x, u) in Sec. 5. Since F(x) is matrix-valued, we define a GP over its columnwise vectorization, vec(F(x)). The controller can observe $x$ and $u$ without noise, but the measurements of $\dot{x}$ might be noisy. As the controller observes f(x) and g(x) together via $\dot{x}$, there may be a correlation between their different components. Thus, we develop an efficient factorization of the GP covariance based on the Matrix Variate Gaussian (MVG) distribution (Sun et al., 2017; Louizos and Welling, 2016) to learn f(x) and g(x) together. We provide the definition and properties of the MVG distribution in Appendix B.1. Two alternative approaches to infer a posterior over F(x) and their drawbacks are also discussed in Appendix B.1.

Note that if . Based on this observation, we propose the following GP parameterization for the vector-valued functions vec(F(x)):

The above parameterization is efficient as compared to learning the full covariance , because we need to learn smaller matrices, and . Fortunately, this parameterization also preserves its structure on inference.
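The saving from a factored covariance can be made concrete with a short sketch. It assumes that F(x) has n rows and 1 + m columns, that a matrix A describes covariance across rows and a matrix B across columns, and that the covariance of the columnwise vectorization is the Kronecker product of B and A. The symbols A and B follow the MVG notation of Appendix B.1; the dimensions and random matrices are purely illustrative.

```python
import numpy as np

n, m = 4, 2                       # state and control dimensions (illustrative)
rng = np.random.default_rng(0)

def random_spd(d):
    """Random symmetric positive-definite matrix, a stand-in for a learned covariance."""
    L = rng.standard_normal((d, d))
    return L @ L.T + d * np.eye(d)

A = random_spd(n)                 # covariance across the n rows of F(x)
B = random_spd(1 + m)             # covariance across the 1 + m columns of F(x)
K = np.kron(B, A)                 # implied covariance of the columnwise vec(F(x))

def sym_params(d):
    """Number of free entries in a symmetric d x d matrix."""
    return d * (d + 1) // 2

full_params = sym_params(n * (1 + m))                    # unstructured covariance
factored_params = sym_params(n) + sym_params(1 + m)      # A and B only
print(K.shape, full_params, factored_params)
```

The gap between the two counts widens quickly as n and m grow, which is the efficiency argument made above.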

that the inter-triggering times are sufficiently small.

3. We consider only epistemic and no aleatoric uncertainty. Namely, while F(x) is sampled from a GP, no additive disturbances are considered for the dynamics (1).

Consider the training set and a query test point . The train and test data are jointly Gaussian:

In the above formulation, the resulting posterior is independent of the query control input, , which allows us to use this posterior in Sec. 5 to efficiently compute a safe control input. To simplify notation, let be a matrix with elements and define and . Applying a Schur complement, we can derive the posterior distribution of vec(F(x)) conditioned on the training data as a Gaussian Process with parameters:


This inference has a computational complexity of while the same for independent GPs is . Since k >> m is common, the proposed model has almost the same inference cost as independent GPs. Step-by-step details are provided in Appendix C.1.2. For a given query control input , the posterior of

5. Self-triggered Control with Probabilistic Safety Constraints

Sec. 4 addressed Problem 1 by proposing an efficient Gaussian Process inference algorithm for nonlinear control-affine systems. Now, we consider Problem 2. As discussed in Sec. 2.1, if f and g are locally Lipschitz, then system (1) has a unique solution for any x(0) for all time t in I(x(0)). We assume the sample paths of the GP used to model the dynamics (1) are locally Lipschitz with high probability. A similar smoothness assumption was made previously in Srinivas et al. (2010). As mentioned in Problem 2, we use a zero-order hold (ZOH) control mechanism in the inter-triggering time, i.e., for . In detail, we assume that for any , and triggering time , there exists a constant , such that,

This assumption is valid for a large class of GPs, e.g., those with stationary kernels that are four times differentiable, such as squared exponential and some Matérn kernels (Ghosal et al., 2006; Shekhar et al., 2018). However, it may not hold for GPs with highly erratic sample paths.

The posterior of F(x)u in (6) induces a distribution over CBC(x, u). To ensure that safety in the sense of (4) is preserved over a period of time , we enforce a tighter constraint at time and determine the time for which it remains valid. In detail, we solve a chance-constrained version of (3) at time

where . The choice of and its effect on is discussed next.

Lemma 2 Consider the dynamics in (1) with posterior distribution in (6). Given and , CBC is a Gaussian random variable with the following parameters:

Using Lemma 2, we can rewrite the safety constraint as

where $\Phi(\cdot)$ is the cumulative distribution function of the standard Gaussian. Note that if the control input is chosen so that , then as the posterior variance of CBC tends to zero, the probability tends to one. Namely, as the uncertainty about the system dynamics tends to zero, our results reduce to the setting of Sec. 2.1, and safety can be ensured with probability one. Noting that , controller (8) can be rewritten as

The program (12) provides probabilistic safety constraints at the triggering times . Next, we extend our analysis to the inter-triggering times . We continue by restating Proposition 1 of (Yang et al., 2019) for our setup.

Proposition 1 Consider the system in (1) with zero-order hold control in inter-triggering times. If the event (7) occurs at the kth triggering time, then for all

Recall from Sec. 2.1 that h is a continuously differentiable function. Thus, using Proposition 1, we notice that for any inter-triggering time , there exists a constant

This is used in the next theorem which concerns Problem 2.

Theorem 3 Consider the system in (1) with safe set C. Assume the program (8) has a solution at triggering time , event (7) occurs at least with probability , and for all , satisfies the following Lipschitz property

Then (4) is valid for is given in (14).

Remark 4 Assuming in Theorem 3 does not restrict our results, since if the state of the system is safe and does not change, it remains safe.

6. Extension to Higher Relative-degree Systems

Next, we extend the probabilistic safety constraint formulation to systems with arbitrary relative degree, using an exponential control barrier function (ECBF) (Nguyen and Sreenath, 2016a; Ames et al., 2019).

Let be the relative degree of h(x), that is, and , . Define traverse dynamics with traverse vector

where are defined in Appendix A.

Definition 2 A function is an exponential control barrier function (ECBF) for the system in (1) if there exists a row vector such that the rth order condition CBC, which results in

If is chosen appropriately (see Appendix B.2), a control policy that ensures $\mathrm{CBC}^{(r)} \geq 0$ renders the dynamics (1) safe with respect to set C. Thus, as in (8), we are interested in solving

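Proposition 2 For a control-affine system of relative degree r, the expectation $\mathbb{E}[\mathrm{CBC}^{(r)}(x, u)]$ is affine in u and the variance $\mathrm{Var}[\mathrm{CBC}^{(r)}(x, u)]$ is quadratic in u (proof in Sec. C.2.3).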

Proposition 3 For a control-affine system of relative degree r, as defined in (1), the system stays in the safe set C with ECBF h if the control is determined from the following Quadratically Constrained Quadratic Program (QCQP) (Proof in Sec C.2.4),

Solving the program (18) requires knowledge of the mean and variance of $\mathrm{CBC}^{(r)}$ (see Thm. 8 in Appendix C.3.1 for $\mathrm{CBC}^{(2)}$). In general, Monte Carlo sampling could be used to estimate these quantities. The chance constraint in (18) can be interpreted as requiring that the standard deviation of $\mathrm{CBC}^{(r)}$ be smaller than the mean by a factor of

7. Simulations

We evaluate the proposed approach on a pendulum with mass m and length l with state and control-affine dynamics as depicted in Fig 1. A safe set is chosen as the complement of a radial region that needs to be avoided.

Figure 1: Top left: Pendulum simulation (left) with an unsafe (red) region. Top right: The pendulum trajectory (middle) resulting from the application of safe control inputs (right) is shown. Bottom row: Learned vs true pendulum dynamics using matrix variate Gaussian Process regression

The controller knows a priori that the system is control-affine with relative degree two, but it is not aware of f and g. The control barrier function is thus . We formulate a quadratically constrained quadratic program as in (18) for r = 2. We specify a task requiring the pendulum to track a reference control signal and specify the optimization objective as . We initialize the system with parameters . The system dynamics are approximated accurately (see Fig. 1) while the system remains in the safe region (see Fig. 1). An ε-greedy exploration strategy is used to sample . We use an exponentially decreasing ε-greedy scheme going from 1 to 0.01 in 100 steps. Negative control inputs get rejected by the CBF-based constraint, while positive inputs allow the pendulum to bounce back from the unsafe region.
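The exploration schedule described above could be sketched as follows. The geometric interpolation between the two endpoints, the uniform sampling range, and the function names are assumptions; the text only specifies an exponential decrease from 1 to 0.01 over 100 steps.

```python
import numpy as np

def epsilon_schedule(step, eps_start=1.0, eps_end=0.01, n_steps=100):
    """Geometric decay from eps_start at step 0 to eps_end at step n_steps - 1."""
    frac = min(step, n_steps - 1) / (n_steps - 1)
    return eps_start * (eps_end / eps_start) ** frac

def epsilon_greedy(u_greedy, step, rng, u_low=-1.0, u_high=1.0):
    """With probability epsilon return a uniformly random input, else the greedy one."""
    if rng.random() < epsilon_schedule(step):
        return rng.uniform(u_low, u_high)
    return u_greedy

rng = np.random.default_rng(0)
eps_values = [epsilon_schedule(t) for t in range(100)]
u_explore = epsilon_greedy(u_greedy=0.5, step=0, rng=rng)   # epsilon = 1 at step 0, so random
```

In a safety-filtered loop, the sampled input would serve only as the reference that the constrained program then corrects.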

8. Conclusion

Allowing artificial systems to safely adapt their own models during online operation will have significant implications for their successful use in unstructured, changing real-world environments. This paper developed a Bayesian inference approach to approximate system dynamics and their uncertainty from online observations. The posterior distribution over the dynamics may be used to enforce probabilistic constraints that guarantee safe online operation with high probability. Our results offer a promising approach for controlling complex systems in challenging environments. Future work will focus on extending the self-triggering time analysis to systems with higher relative degree and on applications of the proposed approach to real robot systems.

Acknowledgments

We gratefully acknowledge support from NSF awards CNS-1446891 and ECCS-1917177, and support from ARL DCIST CRA W911NF-17-2-0181.

References

Prithvi Akella, Mohamadreza Ahmadi, Richard M Murray, and Aaron D Ames. Formal test synthesis for safety-critical autonomous systems based on control barrier functions. arXiv preprint arXiv:2004.04227, 2020.

Mauricio A Alvarez, Lorenzo Rosasco, and Neil D Lawrence. Kernels for vector-valued functions: A review. Foundations and Trends in Machine Learning, 4(3):195–266, 2012.

A. D. Ames, S. Coogan, M. Egerstedt, G. Notomista, K. Sreenath, and P. Tabuada. Control barrier functions: Theory and applications. In 2019 18th European Control Conference (ECC), pages 3420–3431, June 2019. doi: 10.23919/ECC.2019.8796030.

Aaron D Ames, Xiangru Xu, Jessy W Grizzle, and Paulo Tabuada. Control barrier function based quadratic programs for safety critical systems. IEEE Transactions on Automatic Control, 62(8): 3861–3876, 2016.

Osbert Bastani. Safe planning via model predictive shielding. arXiv preprint arXiv:1905.10691, 2019.

Felix Berkenkamp, Angela P. Schoellig, and Andreas Krause. Safe controller optimization for quadrotors with Gaussian processes. In Proc. of the IEEE International Conference on Robotics and Automation (ICRA), pages 493–496, 2016.

Erdem Biyik, Jonathan Margoliash, Shahrouz Ryan Alimo, and Dorsa Sadigh. Efficient and safe exploration in deterministic markov decision processes with unknown transition models. In 2019 American Control Conference (ACC), pages 1792–1799. IEEE, 2019.

Steven Chen, Kelsey Saulnier, Nikolay Atanasov, Daniel D Lee, Vijay Kumar, George J Pappas, and Manfred Morari. Approximating explicit model predictive control using constrained neural networks. In 2018 Annual American Control Conference (ACC), pages 1520–1527. IEEE, 2018.

Richard Cheng, Gábor Orosz, Richard M Murray, and Joel W Burdick. End-to-end safe reinforcement learning through barrier functions for safety-critical continuous control tasks. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 3387–3395, 2019.

Girish Chowdhary, Hassan A Kingravi, Jonathan P How, and Patricio A Vela. Bayesian nonparametric adaptive control using Gaussian processes. IEEE transactions on neural networks and learning systems, 26(3):537–550, 2014.

Andrew Clark. Control barrier functions for complete and incomplete information stochastic systems. In 2019 American Control Conference (ACC), pages 2928–2935. IEEE, 2019.

Jeremy Coulson, John Lygeros, and Florian Dörfler. Data-enabled predictive control: in the shallows of the deepc. In 2019 18th European Control Conference (ECC), pages 307–312. IEEE, 2019.

Sarah Dean, Horia Mania, Nikolai Matni, Benjamin Recht, and Stephen Tu. On the sample complexity of the linear quadratic regulator. Foundations of Computational Mathematics, Aug 2019.

Marc Deisenroth and Carl E Rasmussen. PILCO: A model-based and data-efficient approach to policy search. In Proceedings of the 28th International Conference on machine learning (ICML-11), pages 465–472, 2011.

David D Fan, Jennifer Nguyen, Rohan Thakker, Nikhilesh Alatur, Ali-akbar Agha-mohammadi, and Evangelos A Theodorou. Bayesian learning-based adaptive control for safety critical systems. arXiv preprint arXiv:1910.02325, 2019.

David D Fan, Ali-akbar Agha-mohammadi, and Evangelos A Theodorou. Deep learning tubes for tube MPC. arXiv preprint arXiv:2002.01587, 2020.

Jaime F Fisac, Anayo K Akametalu, Melanie N Zeilinger, Shahab Kaynama, Jeremy Gillula, and Claire J Tomlin. A general safety framework for learning-based control in uncertain robotic systems. IEEE Transactions on Automatic Control, 2018.

Yarin Gal and Zoubin Ghahramani. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In international conference on machine learning, pages 1050–1059, 2016.

Subhashis Ghosal, Anindya Roy, et al. Posterior consistency of Gaussian process prior for nonparametric binary regression. The Annals of Statistics, 34(5):2413–2429, 2006.

James Harrison, Apoorva Sharma, and Marco Pavone. Meta-learning priors for efficient online Bayesian regression. arXiv preprint arXiv:1807.08912, 2018.

Shao-Chen Hsu, Xiangru Xu, and Aaron D Ames. Control barrier function based quadratic programs with application to bipedal robotic walking. In 2015 American Control Conference (ACC), pages 4542–4548. IEEE, 2015.

Hassan K Khalil. Nonlinear systems; 3rd ed. Prentice-Hall, Upper Saddle River, NJ, 2002.

Mohammad Javad Khojasteh, Anatoly Khina, Massimo Franceschetti, and Tara Javidi. Learning-based attacks in cyber-physical systems. arXiv preprint arXiv:1809.06023, 2018.

Mohammad Javad Khojasteh, Vikas Dhiman, Massimo Franceschetti, and Nikolay Atanasov. Probabilistic safety constraints for learned high relative degree system dynamics. arXiv preprint arXiv:1912.10116, 2019.

Torsten Koller, Felix Berkenkamp, Matteo Turchetta, and Andreas Krause. Learning-based model predictive control for safe exploration. In 2018 IEEE Conference on Decision and Control (CDC), pages 6059–6066. IEEE, 2018.

Anqi Liu, Guanya Shi, Soon-Jo Chung, Anima Anandkumar, and Yisong Yue. Robust regression for safe exploration in control. arXiv preprint arXiv:1906.05819, 2019.

Christos Louizos and Max Welling. Structured and efficient variational deep learning with matrix Gaussian posteriors. In International Conference on Machine Learning, pages 1708–1716, 2016.

Quan Nguyen and Koushil Sreenath. Exponential control barrier functions for enforcing high relative-degree safety-critical constraints. In 2016 American Control Conference (ACC), pages 322–328. IEEE, 2016a.

Quan Nguyen and Koushil Sreenath. Optimal robust control for constrained nonlinear hybrid systems with application to bipedal locomotion. In 2016 American Control Conference (ACC), pages 4807–4813. IEEE, 2016b.

Stephen Prajna, Ali Jadbabaie, and George J Pappas. A framework for worst-case and stochastic safety verification using barrier certificates. IEEE Transactions on Automatic Control, 52(8): 1415–1428, 2007.

Alexander Robey, Haimin Hu, Lars Lindemann, Hanwen Zhang, Dimos V Dimarogonas, Stephen Tu, and Nikolai Matni. Learning control barrier functions from expert demonstrations. arXiv preprint arXiv:2004.03315, 2020.

Iman Salehi, Gang Yao, and Ashwin P Dani. Active sampling based safe identification of dynamical systems using extreme learning machines and barrier certificates. In 2019 International Conference on Robotics and Automation (ICRA), pages 22–28. IEEE, 2019.

Cesar Santoyo, Maxence Dutreix, and Samuel Coogan. A barrier function approach to finite-time stochastic system verification and control. arXiv preprint arXiv:1909.05109, 2019.

Tuhin Sarkar, Alexander Rakhlin, and Munther A Dahleh. Finite-time system identification for partially observed LTI systems of unknown order. arXiv preprint arXiv:1902.01848, 2019.

Shayle R Searle and Marvin HJ Gruber. Linear models. John Wiley & Sons, 1971.

Shubhanshu Shekhar, Tara Javidi, et al. Gaussian process bandits with adaptive discretization. Electronic Journal of Statistics, 12(2):3829–3874, 2018.

Niranjan Srinivas, Andreas Krause, Sham M. Kakade, and Matthias Seeger. Gaussian process optimization in the bandit setting: no regret and experimental design. In International Conference on Machine Learning, 2010.

Shengyang Sun, Changyou Chen, and Lawrence Carin. Learning Structured Weight Uncertainty in Bayesian Neural Networks. In International Conference on Artificial Intelligence and Statistics (AISTATS), pages 1283–1292, 2017.

Andrew J Taylor and Aaron D Ames. Adaptive safety with control barrier functions. arXiv preprint arXiv:1910.00555, 2019.

Jonas Umlauft and Sandra Hirche. Feedback linearization based on Gaussian processes with event-triggered online learning. IEEE Transactions on Automatic Control, 2019.

Kim P Wabersich and Melanie N Zeilinger. Safe exploration of nonlinear dynamical systems: A predictive safety filter for reinforcement learning. arXiv preprint arXiv:1812.05506, 2018.

Li Wang, Evangelos A Theodorou, and Magnus Egerstedt. Safe learning of quadrotor dynamics using barrier certificates. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages 2460–2465. IEEE, 2018.

Christopher KI Williams and Carl Edward Rasmussen. Gaussian processes for machine learning, volume 2. MIT press Cambridge, MA, 2006.

X. Xu, T. Waters, D. Pickem, P. Glotfelter, M. Egerstedt, P. Tabuada, J. W. Grizzle, and A. D. Ames. Realizing simultaneous lane keeping and adaptive speed regulation on accessible mobile robot testbeds. In IEEE Conference on Control Technology and Applications (CCTA), pages 1769–1775, 2017.

Xiangru Xu, Paulo Tabuada, Jessy W Grizzle, and Aaron D Ames. Robustness of control barrier functions for safety critical control. IFAC-PapersOnLine, 48(27):54–61, 2015.

Guang Yang, Calin Belta, and Roberto Tron. Self-triggered control for safety critical systems using control barrier functions. In 2019 American Control Conference (ACC), pages 4454–4459. IEEE, 2019.

Appendix A. Notations

We use $\otimes$ for the Kronecker product, and erf for the Gauss error function. $\partial C$ denotes the boundary of the set C. Let vec(X) be the vectorization of a matrix X, obtained by stacking the columns of X. For the functions and the Hessian and Jacobian are defined as follows

The following parameters are useful for our analysis in Sec. 6

Appendix B. Remarks

B.1. Matrix Variate Gaussians

Definition 3 The Matrix Variate Gaussian (MVG) distribution is a three-parameter distribution MN(M, A, B) describing a random matrix with probability density function:

where M is the mean, and A and B encode the covariance of the rows and columns of X, respectively.
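A short numerical sketch can make the roles of the three MVG parameters concrete. It uses the standard construction X = M + L_A Z L_B^T with A = L_A L_A^T, B = L_B L_B^T, and Z a matrix of independent standard normals, and checks that the columnwise vectorization then has covariance equal to the Kronecker product of B and A. The dimensions and random matrices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 3, 2                                   # rows and columns of X (illustrative)

def random_spd(d):
    G = rng.standard_normal((d, d))
    return G @ G.T + np.eye(d)

def vec(X):
    return X.flatten(order="F")               # stack the columns of X

M = rng.standard_normal((n, p))               # mean matrix
A, B = random_spd(n), random_spd(p)           # row and column covariances
LA, LB = np.linalg.cholesky(A), np.linalg.cholesky(B)

Z = rng.standard_normal((n, p))               # i.i.d. standard normal entries
X = M + LA @ Z @ LB.T                         # one sample of X ~ MN(M, A, B)

T = np.kron(LB, LA)                           # vec(LA Z LB^T) = (LB kron LA) vec(Z)
vec_X_via_kron = vec(M) + T @ vec(Z)
cov_vec_X = T @ T.T                           # covariance of vec(X)
```

Because vec(X) is an affine map of vec(Z), its covariance is T T^T, which the test below compares with `np.kron(B, A)`.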

The two following lemmas are used to derive our results in Sec. 4.

Lemma 6 (Linear form on MVG) Let X follow a MVG distribution

and

For the proof of (23), we refer the reader to (Sun et al., 2017); the proof of (24) is included in Appendix C.1.1.
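A standard consequence of the MVG structure, useful when reading Sec. 4, is that multiplying an MVG matrix X ~ MN(M, A, B) by a fixed vector c gives a Gaussian vector with mean M c and covariance (c^T B c) A. The sketch below checks this identity exactly through Kronecker algebra rather than by sampling. Taking c to be the homogeneous input [1; u] is an assumption made here to mirror the product F(x) times the augmented input; the lemma's exact statement is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 2                                    # state and control dimensions (illustrative)

def random_spd(d):
    G = rng.standard_normal((d, d))
    return G @ G.T + np.eye(d)

A = random_spd(n)                              # row covariance
B = random_spd(1 + m)                          # column covariance
M = rng.standard_normal((n, 1 + m))            # mean matrix

u = np.array([0.5, -1.0])
u_bar = np.concatenate(([1.0], u))             # homogeneous input [1; u]

J = np.kron(u_bar[None, :], np.eye(n))         # J @ vec(X) equals X @ u_bar
cov_direct = J @ np.kron(B, A) @ J.T           # covariance of X @ u_bar from vec(X)
cov_closed = (u_bar @ B @ u_bar) * A           # closed form: scalar times A
mean_closed = M @ u_bar
```

The scalar factor `u_bar @ B @ u_bar` is quadratic in u while the mean is affine in u, which matches the structure exploited by the safety constraints.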

Here we discuss two alternative approaches to infer a posterior over F(x), and mention the benefits of the proposed MVG framework with respect to them. The first alternative approach is to develop a decoupled GP regression per system dimension, which, unlike our MVG approach, does not model the dependencies between different components of f(x) and g(x). Moreover, the inference computational complexity of our MVG approach is while that of independent GPs is , where k is the number of data points and m is the control dimension. Thus, for k >> m, which is common, the MVG has similar computational complexity to the independent GP approach but provides greater inference flexibility. The second alternative approach is the coregionalization model (Alvarez et al., 2012), in which the covariance structure is simplified by assuming that it decomposes into a scalar state-dependent kernel and an output-dimension-dependent covariance matrix. Our MVG approach is more efficient because, for systems with large state or control dimensions, learning the parameters of this covariance matrix may require large amounts of training data. Moreover, the convenient matrix-times-scalar-kernel structure is not preserved in the posterior under these coregionalization models.

B.2. Further discussion about ECBF

If satisfies the properties stated in the next theorem, any control policy that ensures $\mathrm{CBC}^{(r)} \geq 0$ renders the dynamics (1) safe with respect to set C.

Theorem 7 (Designing (Nguyen and Sreenath, 2016a; Ames et al., 2019)) The function h(x) is an exponential CBF if is chosen to satisfy the following conditions.

1. is Hurwitz and total negative (resulting in negative real poles).

2. The eigenvalues of the system (16) satisfy , where is recursively defined as of the characteristic polynomial of the matrix

For relative degree one systems reduces to with . Thus, , the safety condition based on ECBF, is equivalent to CBC in Def. 1 when , the extended class function, is linear.
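As a relative-degree-two counterpart, the sketch below evaluates the second-order condition L_f² h + L_g L_f h · u + k₀ h + k₁ L_f h for a pendulum with state (θ, ω). The barrier h = cos(Δ) − cos(θ − θ_c), the physical parameters, and the gains (4, 4), which place a double pole at −2, are all hypothetical choices for illustration and are not the values used in the paper.

```python
import numpy as np

GRAV, MASS, LENGTH = 9.81, 1.0, 1.0          # illustrative pendulum parameters
THETA_C, DELTA = np.pi / 4, np.pi / 8        # hypothetical unsafe-wedge center and half-width

def cbc2(x, u, k_alpha=(4.0, 4.0)):
    """Second-order ECBF condition for theta_dot = omega,
    omega_dot = -(g/l) sin(theta) + u / (m l^2)."""
    theta, omega = x
    s, c = np.sin(theta - THETA_C), np.cos(theta - THETA_C)
    h = np.cos(DELTA) - c                    # h >= 0 outside the unsafe wedge
    Lf_h = s * omega                         # first Lie derivative; L_g h = 0
    Lf2_h = c * omega**2 - s * (GRAV / LENGTH) * np.sin(theta)
    LgLf_h = s / (MASS * LENGTH**2)          # nonzero, hence relative degree two
    return Lf2_h + LgLf_h * u + k_alpha[0] * h + k_alpha[1] * Lf_h

# At rest on the boundary, the condition asks the input to at least cancel gravity.
x_boundary = np.array([THETA_C + DELTA, 0.0])
cbc_no_input = cbc2(x_boundary, u=0.0)
cbc_strong_input = cbc2(x_boundary, u=10.0)
```

Since the condition is affine in u for known dynamics, it can be used directly as a linear constraint in a QP; the uncertainty-aware version discussed in Sec. 6 replaces it with a constraint on the mean and variance.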

Appendix C. Proofs

C.1. Matrix variate Gaussian distributions

Let , such that . Let where are column vectors of Y.

Note that

Second part can be proved by noticing that,

We start by noticing that the Kronecker product satisfies the following properties.

variance and removed from mean of GP posterior. First we consider the term that appears in both mean and variance,

Now we consider the mean,

Now, we can compute the variance of the posterior,

C.2. Relative degree one

We start by re-writing the definition of CBC as follows.

Since are jointly Gaussian, their linear combination is also Gaussian. We further notice conditioned on

and the result follows.
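Once CBC is known to be Gaussian, a chance constraint on it has a deterministic equivalent. The sketch below assumes CBC ~ N(mu, sigma²), a margin zeta, and a required probability p; these names are placeholders rather than the paper's notation. It uses P(CBC ≥ zeta) = Φ((mu − zeta)/sigma), so the constraint holds exactly when mu − zeta ≥ Φ⁻¹(p) · sigma.

```python
from statistics import NormalDist

def prob_cbc_at_least(mu, sigma, zeta):
    """P(CBC >= zeta) for CBC ~ N(mu, sigma^2), with sigma > 0."""
    return 1.0 - NormalDist(mu, sigma).cdf(zeta)

def chance_constraint_holds(mu, sigma, zeta, p):
    """Deterministic equivalent of P(CBC >= zeta) >= p."""
    return mu - zeta >= NormalDist().inv_cdf(p) * sigma

prob = prob_cbc_at_least(mu=1.0, sigma=0.4, zeta=0.2)
ok_95 = chance_constraint_holds(mu=1.0, sigma=0.4, zeta=0.2, p=0.95)
ok_99 = chance_constraint_holds(mu=1.0, sigma=0.4, zeta=0.2, p=0.99)
print(prob, ok_95, ok_99)
```

As sigma shrinks, the condition approaches the deterministic requirement mu ≥ zeta, consistent with the remark in Sec. 5 that the known-dynamics setting is recovered when the uncertainty vanishes.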

Since the program (8) has solutions at the triggering times, we know and . Thus, if we prove that, conditioned on the event (7), we have CBC , the result follows.

Using Cauchy-Schwarz inequality, and the Lipschitz assumptions (7) and (15), and Proposition 1, for all

Thus, using (14) and (13), we notice the right-hand side of (55) is upper-bounded by

The result follows by noticing for ) is less than or equal to

C.2.3. PROOF OF PROPOSITION 2 (CBC(r) MEAN IS AFFINE AND ITS VARIANCE IS QUADRATIC)

This proof uses the linearity of expectation,

Unlike CBC for relative degree one, the distribution of $\mathrm{CBC}^{(r)}$ is not Gaussian for $r \geq 2$. Hence, instead of analytically computing the probability distribution, we use Cantelli's inequality to obtain a bound in terms of the mean and variance of $\mathrm{CBC}^{(r)}$: for any scalar

Since we want the probability to be greater than , we ensure its lower bound is greater than

The terms can be rearranged into,

Substituting the constraint in (61), we get (17).
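The Cantelli step can be checked numerically. For a random variable X with mean mu > 0 and standard deviation sigma, the one-sided inequality gives P(X > 0) ≥ mu² / (mu² + sigma²), so requiring mu ≥ sqrt(p / (1 − p)) · sigma is sufficient for P(X > 0) ≥ p. The sketch below uses this basic form without any additional margin; the skewed test distribution and the function names are assumptions made for illustration.

```python
import numpy as np

def cantelli_lower_bound(mu, sigma):
    """Distribution-free bound P(X > 0) >= mu^2 / (mu^2 + sigma^2), valid for mu > 0."""
    return mu**2 / (mu**2 + sigma**2) if mu > 0 else 0.0

def cantelli_constraint_holds(mu, sigma, p):
    """Sufficient condition for P(X > 0) >= p using only the mean and std."""
    return mu >= np.sqrt(p / (1.0 - p)) * sigma

# Monte Carlo sanity check on a skewed, non-Gaussian variable.
rng = np.random.default_rng(0)
samples = 3.0 - rng.exponential(scale=1.0, size=200_000)   # mean 2, std 1
mu_hat, sigma_hat = samples.mean(), samples.std()
empirical = np.mean(samples > 0)
bound = cantelli_lower_bound(mu_hat, sigma_hat)
print(empirical, bound)
```

The bound is conservative compared with the Gaussian case, which is the price of not knowing the distribution of the higher-order condition.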

The mean, variances and covariances of different components are given by (using Lemma 6),

Theorem 8 The mean and variance of $\mathrm{CBC}^{(2)}$ for a relative-degree-two system (

where each term in the above equations can be computed step by step by Algorithm 1.

We will compute the mean and variance of each of the term one by one.

We derive the mean, variance, and bounds over $\mathrm{CBC}^{(2)}$ for a Gaussian Process . Then the condition can be written as a quadratic form in the random vectors

The challenge is to compute the mean and variance of vector-variate Gaussian process z(x; u). We compute it in three main steps, computation of mean and variance of (1) and (3)

Step 1: Mean and variance of We start by observing that a scalar value Gaussian process because is deterministic,

Step 2: Mean and variance of

Lemma 9 (Differentiating a Gaussian Process) Let q be a scalar valued Gaussian Process with mean function and kernel function , then is a vector-valued Gaussian Process,

For proof see Section C.3.2. Note that the expressions for mean, variance and covariance are valid for any random process, even if not Gaussian.
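The covariance structure of a differentiated process can be illustrated in one dimension. For a process q with kernel k(x, y), the cross-covariance between q(x) and q'(y) is ∂k/∂y and the covariance between q'(x) and q'(y) is ∂²k/∂x∂y. The sketch below compares these closed forms for a squared-exponential kernel with the covariance of finite-difference quotients, following the finite-difference argument used in the proof. The kernel choice and the length scale are assumptions.

```python
import numpy as np

ELL = 0.7   # kernel length scale (illustrative)

def k(x, y):
    """Squared-exponential kernel."""
    return np.exp(-(x - y) ** 2 / (2 * ELL**2))

def dk_dy(x, y):
    """cov(q(x), q'(y)) = dk/dy."""
    return (x - y) / ELL**2 * k(x, y)

def d2k_dxdy(x, y):
    """cov(q'(x), q'(y)) = d^2 k / (dx dy)."""
    return (1.0 / ELL**2 - (x - y) ** 2 / ELL**4) * k(x, y)

x, y, d = 0.3, 1.1, 1e-4
# Covariance between q(x) and the quotient (q(y + d) - q(y)) / d.
fd_cross = (k(x, y + d) - k(x, y)) / d
# Covariance between the quotients (q(x + d) - q(x)) / d and (q(y + d) - q(y)) / d.
fd_cov = (k(x + d, y + d) - k(x + d, y) - k(x, y + d) + k(x, y)) / d**2
print(fd_cross, dk_dy(x, y), fd_cov, d2k_dxdy(x, y))
```

In several dimensions the same computation yields the mixed second derivatives of the kernel entry by entry.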

Using Lemma 9 is a vector-variate Gaussian Process,

The above derivatives can be expanded as,


Step 3: Mean and variance of Now we have all terms to write the mean and covariance of the Gaussian Process

Note that is also quadratic in the control, with the term being affine in u and the quadratic term . To compute the mean and variance of $\mathrm{CBC}^{(2)}$ from equation (73), we need the following lemma.
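The first two moments of a Gaussian quadratic form, which are the ingredients needed here, are standard: for x ~ N(mu, Sigma) and symmetric A, the mean of x^T A x is tr(A Sigma) + mu^T A mu and its variance is 2 tr(A Sigma A Sigma) + 4 mu^T A Sigma A mu. The sketch below evaluates these formulas and compares them with Monte Carlo estimates; the particular mu, Sigma, and A are arbitrary illustrative values.

```python
import numpy as np

def quad_form_moments(mu, Sigma, A):
    """Mean and variance of x^T A x for x ~ N(mu, Sigma) and symmetric A."""
    AS = A @ Sigma
    mean = np.trace(AS) + mu @ A @ mu
    var = 2.0 * np.trace(AS @ AS) + 4.0 * mu @ A @ Sigma @ A @ mu
    return mean, var

mu = np.array([1.0, -0.5])
Sigma = np.array([[1.0, 0.3], [0.3, 0.5]])
A = np.array([[2.0, 0.5], [0.5, 1.0]])

mean, var = quad_form_moments(mu, Sigma, A)

# Monte Carlo comparison.
rng = np.random.default_rng(0)
xs = rng.multivariate_normal(mu, Sigma, size=400_000)
q = np.einsum("ni,ij,nj->n", xs, A, xs)      # x^T A x for every sample
print(mean, q.mean(), var, q.var())
```

The mean formula holds for any distribution with the given mean and covariance, while the variance formula relies on Gaussianity.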

Lemma 10 (Mean and cumulants of Quadratic form) Let x a Gaussian random variable with mean be symmetric. The mean of the Quadratic form can be computed by

The above mean is valid for any random variable with finite mean and variance. The rth cumulant for Gaussian Random Variables is given by (Searle and Gruber, 1971, p55)

And the covariance of the quadratic form with the original variable is given by,

We begin by noting that the variance of quadratic form is the second cumulant r = 2,

Consider two Gaussian random vectors x and y with mean and respectively and variances Var(x) and Var(y) respectively. Let the covariance between x and y be given by cov(x, y). We want to find out the mean and variance of . First note that it can be written in quadratic form,

Hence the mean and variance of is given by,

Now we are equipped to compute mean and variance of

Let q(x) be a random process with finite mean and finite variance we compute the mean and variance of random process

To compute variance, consider a random process with zero mean, variance as q. We write differentiation as limit of finite differences,

Considering only i, jth element of variance

Hence, the variance is the Hessian of the kernel. Var can be written as a Vector-variate Gaussian Process,

We assume another vector-valued Gaussian process and known covariance with . To prove that , we start with

Considering matrix forms of each of the terms in above equation,

Hence,
