1. Introduction. Data-driven discovery or identification of unknown governing equations has attracted a growing amount of attention recently, from earlier attempts using symbolic regression [1, 33] to more recent work using techniques such as Gaussian processes [22], artificial neural networks [24, 25], and group sparsity [27]. Most of the recent work transforms the problem into an approximation problem and develops various techniques to create parsimonious models [35], to discover partial differential equations [28, 30], and to deal with noise in data [3, 31], corruptions in data [36], or limited amounts of data [32]. Methods have also been developed in conjunction with model selection approaches [15], Koopman theory [2], and Gaussian process regression [23]. Results from approximation theory have been borrowed to justify the use of multiple short bursts of trajectories [37]. More recently, machine learning methods, particularly deep neural networks, are being investigated to aid the task of equation discovery, for ODEs [7, 26, 29, 4, 19] and PDEs [16, 13, 11, 8, 21, 12, 38].
A related class of problems is to identify unknown parameters or processes embedded in a given system of governing equations. This is sometimes referred to as “system identification”. Many efforts have been devoted to this line of research; see, e.g., [14, 10, 18, 9, 5, 34, 39, 6, 17, 20], and more recently, [22, 25, 27]. The majority of the existing work focuses on the identification of unknown parameters, which take constant values throughout the domain of interest. The focus and contribution of this paper is the identification of unknown processes, which are functions, embedded in a given system of governing equations. In particular, we use an advection-diffusion type partial differential equation (PDE) as our primary application. Existing work on system identification for advection-diffusion problems includes [14, 10, 18, 9, 34, 39], most of which focuses on the identification of constant parameters.
The technical contributions of this paper include the following. We first present an analysis of the uniqueness of the system identification problem for convection-diffusion type equations. We show that separability of the solution is key to guaranteeing uniqueness. We then present a general numerical framework for identifying unknown functions embedded in given governing equations using observational data of the state variable. This framework is based on seeking an approximation of the unknown functions in a properly defined finite-dimensional linear space, which can be taken conveniently as the same linear space used for the discretization of the governing equation. The identification of the unknown processes is then conducted by minimizing the residual of the discretized equations in a certain space-time norm. Under the framework, two types of algorithms, “collocation” and “Galerkin”, are proposed, depending on the way the residuals are defined and minimized. The Galerkin algorithm utilizes a weak formulation and can avoid using information on the derivatives/gradients of the solution states. Consequently, it is more suitable for practical computation than the collocation algorithm, especially when measurement data contain noise. We remark that the proposed numerical framework and algorithms are applicable to general classes of PDEs. Our focus on the advection-diffusion type PDE in this paper is to have a concrete model on which to conduct theoretical analysis.
This paper is organized as follows. After the basic problem setup in Section 2, we present its uniqueness analysis in Section 3. The numerical approaches are discussed in Section 4, with both a general framework and two types of algorithms, Galerkin and collocation. An extensive set of numerical examples is presented in Section 5.
2. Problem Setup. Let $D \subset \mathbb{R}^d$, $d \ge 1$, be a spatial domain with coordinate $x = (x_1, \dots, x_d)$, and T > 0 be a real number. Let u(t, x) be a state variable, governed by a time-dependent partial differential equation (PDE), denoted (2.1), where L is a known differential operator, and Γ(x) is a set of unknown functions depending only on the spatial variable x. Suppose observation data of the solution state u are available; our goal is to identify the unknown functions Γ(x) embedded in the governing equation (2.1).
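In order to conduct concrete theoretical and numerical analysis, we focus on the advection-diffusion type PDE
$$\partial_t u + \nabla \cdot \big(\alpha(x) F(u)\big) = \nabla \cdot \big(\kappa(x) \nabla u\big), \qquad (2.2)$$
where $F(u)$ is the flux function, $\alpha(x) = (\alpha_1(x), \dots, \alpha_d(x))$ the velocity field, and $\kappa(x)$ the diffusivity field. The flux F is assumed to be known, and the unknown processes to be recovered are Γ(x) = $(\alpha(x), \kappa(x))$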
. Throughout this paper, we assume Γ
). Note that even though our theoretical analysis applies to this advection-diffusion PDE, the proposed numerical algorithms are applicable to general type operator L.
3. Uniqueness Analysis. In this section, we present theoretical analysis for the aforementioned recovery problem. We restrict our analysis to one spatial dimension with d = 1, as multi-dimensional analysis becomes more challenging and remains open. We also break down the analysis into three sub-problems: for advection equation, for diffusion equation, and finally for advection-diffusion equation.
3.1. Advection equation. We now consider the following linear advection equation with unknown variable velocity field
$$u_t + \big(\alpha(x) F(u)\big)_x = 0, \qquad (3.1)$$
where $F(u)$ is known and Γ(x) = $\alpha(x)$ is unknown.
Lemma 3.1. Let $u$ be a given solution of the equation (3.1) on $[0, T] \times D$. A sufficient and necessary condition for (3.1) to uniquely determine $\alpha(x)$ is that there does not exist a nonzero function $\eta(x)$ such that $\eta(x) F(u(t, x))$ is independent of x.
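Proof. We first prove the sufficiency by contradiction. Assume that there is another function $\tilde\alpha(x)$ such that $\tilde\alpha \not\equiv \alpha$ and
$$u_t + \big(\tilde\alpha(x) F(u)\big)_x = 0.$$
Combining it with (3.1) gives
$$\big((\tilde\alpha(x) - \alpha(x)) F(u(t, x))\big)_x = 0,$$
which implies that $(\tilde\alpha(x) - \alpha(x)) F(u(t, x))$ does not depend on x. Therefore, by the hypothesis, $\tilde\alpha(x) - \alpha(x) = 0$ for all $x \in D$, which leads to the contradiction. Hence the parameter function $\alpha(x)$ to be recovered is unique.
We now prove necessity by contradiction. Assume that there is a nonzero function $\eta(x)$ such that $\eta(x) F(u(t, x))$ does not depend on x. Then
$$\big(\eta(x) F(u(t, x))\big)_x = 0.$$
This, together with (3.1), implies
$$u_t + \big((\alpha(x) + \eta(x)) F(u)\big)_x = 0,$$
which is contradictory to the uniqueness of $\alpha(x)$.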
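Definition 3.2. Consider a bivariate function $h : (T_1, T_2) \times \Omega \to \mathbb{R}$. It is called separable if it can be written as a product of two univariate functions,
$$h(t, x) = f(t)\, g(x), \qquad (t, x) \in (T_1, T_2) \times \Omega,$$
with $f$ defined on $(T_1, T_2)$ and $g$ defined on $\Omega$.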
Theorem 3.3. Let $u$ be a given solution of the equation (3.1) on $[0, T] \times D$. If there exists no open interval $\Omega \subseteq D$ and $(T_1, T_2) \subseteq (0, T]$ such that F(u(t, x)) is separable on $(T_1, T_2) \times \Omega$, then $\alpha(x)$ is unique.
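Proof. We prove it by contradiction. Assume that $\alpha(x)$ is not unique. Then, according to Lemma 3.1, there exists a nonzero function $\eta(x)$ such that $\eta(x) F(u(t, x))$ does not depend on x. Thus we have $\eta(x_0) \neq 0$ for some $x_0$ in the interior of D, and
$$\eta(x) F(u(t, x)) = f(t), \qquad (t, x) \in [0, T] \times D,$$
for some single-variable function $f$ on $[0, T]$. Due to the sign-preserving property of $\eta(x)$, there exists an open interval $\Omega \subseteq D$ containing the point $x_0$ such that $\eta(x) \neq 0$ for all $x \in \Omega$. Hence
$$F(u(t, x)) = \frac{f(t)}{\eta(x)}, \qquad (t, x) \in [0, T] \times \Omega.$$
This means F(u(t, x)) is separable on $[0, T] \times \Omega$, which contradicts the assumption on F(u(t, x)). Therefore, $\alpha(x)$ is unique.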
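To see why separability is the obstruction, the following worked example may help. It is a sketch under stated assumptions: the advection equation is taken in the conservative form $u_t + (\alpha(x) F(u))_x = 0$, the function $g$ is assumed not to vanish on $D$, and the symbols $f$, $g$, $c$ and $\tilde\alpha$ are introduced here purely for illustration.

```latex
\begin{aligned}
&F(u(t,x)) = f(t)\, g(x) \ \text{on } [0,T] \times D, \qquad
\tilde\alpha(x) := \alpha(x) + \frac{c}{g(x)}, \quad c \neq 0, \\
&\partial_x\big(\tilde\alpha(x) F(u)\big)
 = \partial_x\big(\alpha(x) F(u)\big) + \partial_x\big(c\, f(t)\big)
 = \partial_x\big(\alpha(x) F(u)\big).
\end{aligned}
```

In this situation the same data $u$ satisfies the equation with both $\alpha$ and $\tilde\alpha$, so no method based on $u$ alone could distinguish the two velocity fields.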
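3.2. Diffusion equation. We now consider the following diffusion equation
$$u_t = \big(\kappa(x)\, u_x\big)_x, \qquad (3.2)$$
where Γ(x) = $\kappa(x)$ is unknown.
Lemma 3.4. Let $u$ be a given solution of the equation (3.2) on $[0, T] \times D$. A sufficient and necessary condition for the uniqueness of the function $\kappa(x)$ is that there is no nonzero function $\eta(x)$ such that $\eta(x)\, u_x(t, x)$ is independent of x.
Theorem 3.5. Let $u$ be a given solution of the equation (3.2) on $[0, T] \times D$. If for any given open interval $\Omega \subseteq D$, there exists a temporal interval $(T_1, T_2) \subseteq [0, T]$ such that $u_x(t, x)$ is not separable on $(T_1, T_2) \times \Omega$, then $\kappa(x)$ is unique.
Proof. The proof is similar to that of Theorem 3.3 and omitted here.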
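3.3. Advection-Diffusion equation. We now consider the one-dimensional advection-diffusion equation
$$u_t + \big(\alpha(x)\, u\big)_x = \big(\kappa(x)\, u_x\big)_x, \qquad (3.3)$$
where Γ(x) = $(\alpha(x), \kappa(x))$ is unknown.
Definition 3.6. Consider a bivariate function $h : (T_1, T_2) \times \Omega \to \mathbb{R}$. It is called weakly separable if it can be written as
$$h(t, x) = f_1(t)\, g_1(x) + f_2(t)\, g_2(x), \qquad (t, x) \in (T_1, T_2) \times \Omega,$$
with $f_i$ defined on $(T_1, T_2)$ and $g_i$ defined on $\Omega$, i = 1, 2.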
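Theorem 3.7. Let $u$ be a given solution of the equation (3.3) on $[0, T] \times D$. If there is no open interval $\Omega \subseteq D$ and $(T_1, T_2) \subseteq (0, T]$ such that u(t, x) is weakly separable on $(T_1, T_2) \times \Omega$, then the functions $\alpha(x)$ and $\kappa(x)$ are unique.
Proof. Assume that there are another two functions $\tilde\alpha(x)$ and $\tilde\kappa(x)$ such that
$$u_t + \big(\tilde\alpha(x)\, u\big)_x = \big(\tilde\kappa(x)\, u_x\big)_x,$$
which, along with (3.3), implies
$$\big(\eta_\alpha(x)\, u - \eta_\kappa(x)\, u_x\big)_x = 0, \qquad (3.4)$$
where $\eta_\alpha := \tilde\alpha - \alpha$ and $\eta_\kappa := \tilde\kappa - \kappa$. Note that (3.4) further implies that
$$\eta_\alpha(x)\, u(t, x) - \eta_\kappa(x)\, u_x(t, x) = f(t), \qquad (3.5)$$
for some single-variable function $f$ on $[0, T]$. Next, we only need to show that $\eta_\alpha(x) = \eta_\kappa(x) = 0$ for all $x \in D$.
Let us first prove $\eta_\kappa(x) = 0$ for all $x \in D$, by contradiction. Assume that $\eta_\kappa \not\equiv 0$. According to the continuity of $\eta_\kappa$ on D, we have $\eta_\kappa(x_0) \neq 0$ for some $x_0$ belonging to the interior of D. Due to the sign-preserving property of $\eta_\kappa(x)$, there exists an open interval $\Omega \subseteq D$ containing $x_0$ such that $\eta_\kappa(x) \neq 0$ for all $x \in \Omega$. Let us introduce an auxiliary positive function
$$\mu(x) := \exp\left(-\int_{x_0}^{x} \frac{\eta_\alpha(s)}{\eta_\kappa(s)}\, ds\right), \qquad x \in \Omega.$$
It then follows from (3.5) that
$$u_x(t, x) - \frac{\eta_\alpha(x)}{\eta_\kappa(x)}\, u(t, x) = -\frac{f(t)}{\eta_\kappa(x)}, \qquad (t, x) \in [0, T] \times \Omega.$$
Or, equivalently, we have
$$\big(\mu(x)\, u(t, x)\big)_x = -f(t)\, \frac{\mu(x)}{\eta_\kappa(x)}. \qquad (3.6)$$
By integrating (3.6) from $x_0$ to $x$ we have
$$\mu(x)\, u(t, x) - \mu(x_0)\, u(t, x_0) = -f(t) \int_{x_0}^{x} \frac{\mu(s)}{\eta_\kappa(s)}\, ds.$$
Note that $\mu(x_0) = 1$. We then obtain
$$u(t, x) = u(t, x_0)\, \frac{1}{\mu(x)} - f(t)\, \frac{1}{\mu(x)} \int_{x_0}^{x} \frac{\mu(s)}{\eta_\kappa(s)}\, ds, \qquad (t, x) \in [0, T] \times \Omega.$$
This implies that u(t, x) is weakly separable on $[0, T] \times \Omega$ and contradicts the hypothesis on u(t, x). Therefore, the assumption that $\eta_\kappa \not\equiv 0$ is incorrect. Hence we complete the proof of $\eta_\kappa(x) = 0$ for all $x \in D$.
By substituting $\eta_\kappa \equiv 0$ into (3.5), we then have
$$\eta_\alpha(x)\, u(t, x) = f(t).$$
We now prove $\eta_\alpha \equiv 0$ by contradiction. Assume $\eta_\alpha \not\equiv 0$. According to the continuity of $\eta_\alpha$ on D, we have $\eta_\alpha(x_0) \neq 0$ for some $x_0$ belonging to the interior of D. Due to the sign-preserving property of $\eta_\alpha(x)$, there exists an open interval $\Omega \subseteq D$ containing the point $x_0$ such that $\eta_\alpha(x) \neq 0$ for all $x \in \Omega$. Hence
$$u(t, x) = \frac{f(t)}{\eta_\alpha(x)}, \qquad (t, x) \in [0, T] \times \Omega.$$
This means u(t, x) is separable, and subsequently weakly separable, on $[0, T] \times \Omega$. This is a contradiction to the hypothesis on u(t, x). Therefore, the assumption that $\eta_\alpha \not\equiv 0$ is incorrect. Hence we have $\eta_\alpha(x) = 0$ for all $x \in D$.
In summary, we have proved that $\eta_\alpha(x) = \eta_\kappa(x) = 0$ for all $x \in D$. In other words, $\tilde\alpha(x) = \alpha(x)$ and $\tilde\kappa(x) = \kappa(x)$ for all $x \in D$. The proof is completed.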
4. Numerical Methods. In this section, we present our numerical methods for recovery of unknown functions embedded in PDE by using data of the state variables. We focus on the advection-diffusion type equations (2.2) discussed in the previous section, although the methods are applicable for general PDEs.
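4.1. General Framework. We seek to approximate/represent the unknown functions Γ(x) in a finite N-dimensional linear subspace $V_N$. Let $\Phi(x) := (\phi_1(x), \dots, \phi_N(x))^T$ be a basis for $V_N$. Denote by $\alpha_N(x)$ and $\kappa_N(x)$ the finite-dimensional representations of the velocity field $\alpha(x)$ and the diffusivity field $\kappa(x)$, respectively. They can be expressed as linear combinations of the basis functions; for instance,
$$\kappa_N(x) = \mathbf{k}^T \Phi(x) = \sum_{j=1}^{N} k_j\, \phi_j(x), \qquad (4.1)$$
and similarly for each component of $\alpha_N(x)$ with coefficient vector $\mathbf{a}$, where the coefficient vectors $\mathbf{a}$ and $\mathbf{k}$ are to be determined. A straightforward approach to determine these finite-dimensional unknown functions is to minimize the residual of the governing equation, with Γ replaced by its finite-dimensional representation, in the $L^2(0, T; L^2(D))$ norm; we refer to this minimization problem as (4.2).
However, this minimization problem is challenging to solve, as it involves complicated temporal and spatial integrals, as well as the derivatives of u. We now discuss how to transform this problem into a tractable one via proper discretization.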
4.1.1. Time Discretization. Let $\{t_m\}_{m=1}^{M}$ denote a set of time instances in [0, T], where the data of the state variable u are collected. We replace the time integral in (4.2) by a weighted sum. Subsequently, the optimization problem (4.2) can be transformed into a problem (4.3), in which the squared spatial norm of the residual at each $t_m$ is multiplied by $w_m$ and summed over m, where $\{w_m\}_{m=1}^{M}$ are a set of weights. Note that with a given time instance set $\{t_m\}_{m=1}^{M}$, one can choose a proper set of weights $\{w_m\}_{m=1}^{M}$ such that the weighted sum in (4.3) is a good approximation to the time integral in (4.2).
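As one illustration of how such weights might be chosen, the sketch below builds trapezoidal-rule weights for an arbitrary increasing set of time instances. The function name `trapezoid_weights` and the choice of the trapezoidal rule are assumptions made for illustration; the text only requires that the weighted sum approximate the time integral.

```python
import numpy as np

def trapezoid_weights(t):
    """Weights w_j such that sum_j w_j * g(t_j) approximates the integral
    of g over [t[0], t[-1]] (composite trapezoidal rule, t increasing)."""
    t = np.asarray(t, dtype=float)
    dt = np.diff(t)
    w = np.zeros_like(t)
    w[:-1] += 0.5 * dt  # left endpoint share of each subinterval
    w[1:] += 0.5 * dt   # right endpoint share of each subinterval
    return w

# Example: 51 uniform time instances on [0, 1].
t = np.linspace(0.0, 1.0, 51)
w = trapezoid_weights(t)
approx = np.sum(w * t**2)  # approximates the integral of t^2 over [0, 1]
```

The weights sum to the length of the time interval, and any other quadrature rule matched to the available time instances could be substituted.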
4.1.2. Space Discretization. Upon discretization in time, we now discuss two approaches to simplify the spatial integral in (4.3).
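• In the collocation approach, we seek to minimize (4.3) at selected nodes in the spatial domain, i.e., at collocation points. Given such a set of nodes, we further transform (4.3) into problem (4.4), in which the spatial norm of the residual is replaced by the sum of the squared residuals at the collocation points.
• In the Galerkin approach, we choose a finite-dimensional linear space as our testing space for the residual. Our Galerkin type method then transforms (4.3) into problem (4.5), in which the spatial norm of the residual is replaced by the norm of its projection onto the testing space.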
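The collocation idea can be sketched on a small manufactured problem. The example below is not the paper's implementation and makes several simplifying assumptions: one spatial dimension, flux F(u) = u so that the equation reads u_t + (α(x) u)_x = 0, a monomial basis {1, x, x²}, equal weights at all time instances, and exact derivatives of u in place of estimated ones. The exact solution used here corresponds to α(x) = 1 + x.

```python
import numpy as np

# Manufactured solution of u_t + (alpha(x) u)_x = 0 with alpha(x) = 1 + x:
# u(t, x) = sin(s) / (1 + x), where s = log(1 + x) - t.
t, x = np.meshgrid(np.linspace(0.0, 1.0, 20),
                   np.linspace(0.0, 1.0, 30), indexing="ij")
t, x = t.ravel(), x.ravel()
s = np.log1p(x) - t
u = np.sin(s) / (1.0 + x)
u_t = -np.cos(s) / (1.0 + x)
u_x = (np.cos(s) - np.sin(s)) / (1.0 + x) ** 2

# Monomial basis {1, x, x^2} and its derivatives at the collocation points.
phi = np.stack([np.ones_like(x), x, x**2], axis=1)
dphi = np.stack([np.zeros_like(x), np.ones_like(x), 2.0 * x], axis=1)

# The residual u_t + sum_k a_k (phi_k u)_x is linear in the coefficients a,
# so minimizing its squares over all points is the least-squares problem A a = b.
A = dphi * u[:, None] + phi * u_x[:, None]
b = -u_t
a, *_ = np.linalg.lstsq(A, b, rcond=None)  # expected close to (1, 1, 0)
```

Because this solution is not separable, the least-squares system has full column rank and the coefficients of 1 + x are recovered. With noisy data the derivatives `u_t` and `u_x` would have to be estimated, which is where the collocation approach becomes sensitive.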
4.2. Application to Advection-Diffusion Equation (2.2). We now discuss the detailed formulation when applying the aforementioned approaches to the advection-diffusion equation (2.2). The collocation approach (4.4) requires direct evaluations of the equation (2.2) at the collocation points. This is straightforward to implement and requires no further discussion. On the other hand, the implementation of the Galerkin approach (4.5) requires further discussion. First, we show that the Galerkin minimization problem (4.5) for the advection-diffusion (2.2) can be re-written into the minimization problem for the expansion coefficients (4.1).
Theorem 4.1. Let Then the problem (4.5) for (2.2) is equivalent to
where for 1 and 1
) =
, and
with $n = (n_1, \dots, n_d)$ denoting the outward unit normal vector along $\partial D$.
Proof. We first establish (4.8) and then substitute (4.8) into (4.5) to obtain (4.6). To show (4.8), we split the residual term of (4.5) into three terms: the time-derivative term, the advection term, and the diffusion term, as follows.
By using integration by parts for the advection and diffusion terms, we have
Note that in (4.11), the spatial derivatives of u are not required in the interior of D. Let =: (
)
and k =: (
be the coefficients in (4.1). We obtain
Hence, for the advection term we have
where ), 1
, are defined in (4.7). Combining (4.12) and (4.13) into (4.9) gives
for 1 . Substituting (4.14) into (4.9) gives (4.8), with which (4.6) follows immediately. The proof is complete.
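The role of integration by parts in the proof can be illustrated in one spatial dimension. The identities below are a sketch under assumptions not fixed by the text above: the domain is $D = (a, b)$, $\varphi$ is a smooth test function, and $\alpha$, $\kappa$ denote the velocity and diffusivity fields. They are not a reproduction of (4.9)-(4.14).

```latex
\begin{aligned}
\int_a^b \varphi\, \big(\alpha F(u)\big)_x \, dx
 &= \big[\varphi\, \alpha F(u)\big]_a^b - \int_a^b \varphi_x\, \alpha F(u)\, dx, \\
\int_a^b \varphi\, \big(\kappa u_x\big)_x \, dx
 &= \big[\varphi\, \kappa u_x\big]_a^b - \big[\varphi_x\, \kappa u\big]_a^b
    + \int_a^b \big(\varphi_x \kappa\big)_x\, u \, dx.
\end{aligned}
```

After integrating the diffusion term by parts twice, derivatives of $u$ appear only in the boundary terms, while the interior integrals involve $u$ itself together with derivatives of $\varphi$, $\alpha$ and $\kappa$, which are known once a basis is chosen.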
We now derive a uniqueness condition for the solution to the minimization problem (4.6).
and define :=
), which is a symmetric positive semidefinite matrix. A solution to the minimization problem (4.6) satisfies
Furthermore, if the matrix is nonsingular, then the problem (4.6) has a unique solution
Proof. We immediately have
which is a positive semidefinite quadratic form in the variables c. Thus, the minima of J(c) satisfy (4.16). If the matrix is nonsingular, then the linear system (4.16) for c has a unique solution, which is given by (4.17).
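The structure of this argument can be sketched with a generic weighted least-squares problem. The code below assumes the objective has the form J(c) = Σ_m w_m ‖A_m c − b_m‖², with hypothetical per-time-instance matrices `A_list` and vectors `b_list`; the sign convention and block layout of (4.6) are not reproduced here.

```python
import numpy as np

def solve_normal_equations(A_list, b_list, weights):
    """Minimize sum_m w_m * ||A_m c - b_m||^2 by solving the normal equations
    (sum_m w_m A_m^T A_m) c = sum_m w_m A_m^T b_m."""
    n = A_list[0].shape[1]
    G = np.zeros((n, n))   # symmetric positive semidefinite Gram matrix
    rhs = np.zeros(n)
    for A, b, w in zip(A_list, b_list, weights):
        G += w * (A.T @ A)
        rhs += w * (A.T @ b)
    # np.linalg.solve raises LinAlgError when G is exactly singular,
    # i.e. when the minimizer is not unique.
    return np.linalg.solve(G, rhs)

# Synthetic check with a known coefficient vector.
rng = np.random.default_rng(1)
c_true = np.array([1.0, -2.0, 0.5])
A_list = [rng.standard_normal((4, 3)) for _ in range(5)]
b_list = [A @ c_true for A in A_list]
weights = np.full(5, 1.0 / 5)
c = solve_normal_equations(A_list, b_list, weights)
```

When the Gram matrix is ill-conditioned, a solver such as `numpy.linalg.lstsq` applied to the stacked system may be numerically preferable to forming the normal equations.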
Remark 4.1. Although the collocation approach (4.4) is straightforward to implement, we advocate the use of the Galerkin approach (4.5). The reason can be seen from its implementation for the advection-diffusion equation. Using the weak form and integration by parts, the Galerkin algorithm avoids using derivatives of the state variable in the interior of the domain. For many practical problems where the spatial derivatives are not directly available and need to be estimated from data, this is preferred because estimating derivatives can induce more numerical errors, especially when the data contain noise.
4.3. Implementation Detail. Assume that a large set of time instances in [0, T] is given, at which the state u is measurable. We set the weights $w_m = 1/M$ and choose the time instances $t_m$, $1 \le m \le M$, as M i.i.d. (independent and identically distributed) uniform random samples from this set. Let a properly selected numerical quadrature rule, with nodes $x_q$, $1 \le q \le Q$, be used for computing the spatial integrals on D and $\partial D$. Based on the formulations derived in Theorems 4.1 and 4.2, the implementation of our method proceeds as follows.
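Step 1: Sample Data. Collect the data of u(t, x) at the points $(t_m, x_q)$, $1 \le m \le M$, $1 \le q \le Q$. Let us denote the sampled data as $u_{m,q} := u(t_m, x_q) + \epsilon_{m,q}$, where $\epsilon_{m,q}$ are possible noises. We assume that the noises are i.i.d. random variables.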
Step 2: Filter Data. If the data are noisy, we propose to use a filter. For each m = 1, . . . , M, q = 1, . . . , Q, we locally construct a polynomial function $p(x)$ in the neighborhood of $x_q$, and obtain filtered data $\bar u_{m,q} := p(x_q)$. To do so, we use the standard least squares minimization method and sample extra data in the neighborhood of $x_q$.
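A minimal sketch of the filtering idea in Step 2 follows, assuming one spatial dimension and using NumPy's `Polynomial.fit` for the local least-squares fit. The function name `local_polynomial_filter`, the neighborhood half-width 0.05, the test signal sin(x) and the low polynomial degree are illustrative choices, not values taken from the text.

```python
import numpy as np
from numpy.polynomial import Polynomial

def local_polynomial_filter(x_nbr, u_nbr, x0, degree):
    """Fit a polynomial to noisy samples (x_nbr, u_nbr) near x0 by least
    squares and return the fitted value at x0 as the filtered datum."""
    p = Polynomial.fit(x_nbr, u_nbr, degree)
    return p(x0)

# Synthetic check: 300 noisy samples of sin(x) in a neighborhood of x0.
rng = np.random.default_rng(0)
x0, sigma = 0.3, 1e-2
x_nbr = x0 + 0.05 * rng.uniform(-1.0, 1.0, size=300)
u_nbr = np.sin(x_nbr) + sigma * rng.standard_normal(300)
u_filtered = local_polynomial_filter(x_nbr, u_nbr, x0, degree=3)
```

Averaging over many neighboring samples reduces the effect of independent noise at the target point; the degree and neighborhood size trade off smoothing against approximation error.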
Step 3: Estimate Derivatives. We evaluate the time derivative $\partial_t u(t_m, x_q)$ by locally constructing a polynomial function $p(t)$ near $t_m$. To do so, we use the standard least squares minimization method and sample extra data in the neighborhood of $t_m$ from the available time instances. We obtain the time derivative estimate $\partial_t u(t_m, x_q) \approx p'(t_m)$. Similarly, for the gradients $\nabla u$ on the domain boundary, we construct a (local) polynomial function $p(x)$ for each m = 1, . . . , M, and take $\nabla p(x)$ as the gradient estimate.
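The derivative estimate in Step 3 can be sketched in the same way: fit a local polynomial in time and differentiate it. The helper name `local_time_derivative`, the noiseless test signal exp(-t), and the neighborhood of 21 points are assumptions for illustration only.

```python
import numpy as np
from numpy.polynomial import Polynomial

def local_time_derivative(t_nbr, u_nbr, t0, degree):
    """Estimate du/dt at t0 by differentiating a least-squares polynomial
    fitted to the samples (t_nbr, u_nbr) taken near t0."""
    p = Polynomial.fit(t_nbr, u_nbr, degree)
    return p.deriv()(t0)

# Synthetic check with u(t) = exp(-t), whose derivative is -exp(-t).
t0 = 0.5
t_nbr = t0 + np.linspace(-0.05, 0.05, 21)
u_nbr = np.exp(-t_nbr)
dudt = local_time_derivative(t_nbr, u_nbr, t0, degree=3)
```

`Polynomial.fit` rescales the samples to a reference interval internally, and `deriv()` accounts for that scaling, so the returned value is the derivative with respect to the original variable t.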
We compute the quantities defined in (4.7) at each time instance by using the filtered data on D and the gradient estimates on $\partial D$ with suitable numerical quadratures. We then compute the time-derivative terms using the derivative estimates from Step 3. The matrices and vectors in (4.15) are then formed immediately.
We then compute the symmetric positive semidefinite matrix defined in Theorem 4.2. If this matrix is nonsingular, we obtain the unique expansion coefficient vectors in (4.1) by (4.17), which is the minimizer of the least-squares problem (4.6).
5. Numerical Examples. In this section, we present numerical examples to demonstrate the performance of the proposed numerical methods for advection-diffusion problem (2.2). Our examples include both 1D and 2D cases, as well as a nonlinear Burgers’ equation that does not fall into the category of linear advection-diffusion.
For benchmarking purposes, we use synthetic data generated by solving known advection-diffusion equations with high resolution. The data are then collected over uniformly distributed time instances in the time domain and Gauss points in the spatial domain, both in the interior and along the boundary. This results in our sets of noiseless data. To generate noisy data, we add i.i.d. Gaussian noise to the clean data, with $\sigma$ denoting the noise level.
We use normalized Legendre polynomials as the basis functions. For the noisy data cases, we employ the filtering procedure described in the previous section. In all the examples here, we build local polynomials in x of degree 10 using 300 noisy data points drawn from the neighborhood of x. These local polynomials are also used to estimate the spatial derivatives (when required by the algorithms). The temporal derivatives are estimated in a similar way, by first building local polynomials in t of degree 10 using 300 neighboring data points and then taking their derivative. For noiseless cases, all derivatives are computed via second-order finite differences.
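For concreteness, a normalized Legendre basis can be evaluated as in the sketch below. It assumes the reference interval [-1, 1] and normalization in the L² sense; the computational domains of the examples and the helper name `normalized_legendre` are not taken from the text.

```python
import numpy as np
from numpy.polynomial import legendre

def normalized_legendre(x, degree):
    """Return the matrix whose k-th column is the degree-k Legendre
    polynomial at x, scaled to unit L2 norm on [-1, 1]."""
    V = legendre.legvander(np.asarray(x, dtype=float), degree)
    scale = np.sqrt((2.0 * np.arange(degree + 1) + 1.0) / 2.0)
    return V * scale

# Check orthonormality with a 20-point Gauss-Legendre quadrature.
nodes, weights = legendre.leggauss(20)
Phi = normalized_legendre(nodes, 5)
gram = Phi.T @ (weights[:, None] * Phi)  # approximately the 6 x 6 identity
```

With an orthonormal basis, the norm of the projection of a residual onto the testing space equals the Euclidean norm of its inner products with the basis functions, which is convenient for the Galerkin formulation.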
The recovered velocity and diffusivity fields are evaluated over another set of grids and then compared to the true values. We then report the relative errors. The sets of evaluation grids are uniform in 1D and tensor grids in 2D.
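The error metric can be sketched as follows, assuming a discrete relative ℓ² error over the evaluation grid; the text does not specify the norm, so this choice, the function name `relative_l2_error`, and the test field 1 + 0.5x are illustrative assumptions.

```python
import numpy as np

def relative_l2_error(recovered, exact):
    """Discrete relative l2 error between recovered and exact field values
    sampled on the same evaluation grid."""
    recovered = np.asarray(recovered, dtype=float)
    exact = np.asarray(exact, dtype=float)
    return np.linalg.norm(recovered - exact) / np.linalg.norm(exact)

# Example on a uniform 1D evaluation grid.
x_eval = np.linspace(-1.0, 1.0, 201)
exact = 1.0 + 0.5 * x_eval
recovered = exact * (1.0 + 1e-3)  # a field that is off by 0.1 percent everywhere
err = relative_l2_error(recovered, exact)
```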
5.1. Example 1: Advection Equation. We first consider 1D advection equation
where
with = 0
= 0
. The clean data of u is obtained by solving the equation numerically with initial condition
The details of the numerical solver are listed in Table 5.1.
Table 5.1 PDE solver information for convection equation in Example 1.
Our data of u are uniformly sampled at 50 points in time, and over 50 Gauss points in space, along with 2 boundary points. Our goal is to recover the velocity field. We choose a polynomial space of order n as the testing space.
We first consider the clean, noiseless data case. On the left of Fig. 5.1, the clean data u(0, x) and u(1, x) are presented. On the right of Fig. 5.1, the relative errors in the recovered velocity field versus the polynomial order n are shown. We observe exponential decay of the errors until they saturate for n > 10. The comparison of the exact and recovered velocity fields is shown in Fig. 5.2, for n = 6 and n = 30.
Next we consider the noisy data case. We add i.i.d. Gaussian noise to the clean data u at two noise levels. The comparison of filtered and unfiltered results is shown in Fig. 5.3. We clearly observe that the filtered results are significantly better than the unfiltered results, with errors one order of magnitude smaller. This example demonstrates the necessity of employing filtering for noisy data.
Fig. 5.1. Example 1 with noiseless data. Left: Solution state u; Right: Relative errors in the recovered velocity field versus polynomial order n.
5.2. Example 2: Diffusion Equation. We now consider a 1D diffusion equation
Fig. 5.2. Example 1 with noiseless data. Exact and recovered velocity field for n = 6 and n = 30.
Fig. 5.3. Example 1 with noise in data. Relative errors in the recovered velocity field versus polynomial order n.
where
with = 0
= 0
= 4
. The initial condition is set as
The details of our numerical solver are listed in Table 5.2. Upon solving the equation, we collect solution data over 50 uniform points in the temporal domain and 50 Gauss points plus 2 boundary points in the spatial domain. We then choose a polynomial space in which to recover the diffusivity field.
We first consider the noiseless data case. On the left of Fig. 5.4, we plot the recovered diffusivity field with polynomial order n = 30, along with the exact one. On the right of Fig. 5.4, we plot the error convergence and observe fast exponential error decay.
We then consider noisy data, with noise levels $\sigma = 10^{-5}$, $10^{-4}$, and $10^{-3}$. The comparison between recovery with filtering and without filtering is shown in Fig. 5.5. It is clearly seen that the recovery results with filtering are noticeably more accurate than those without filtering.
Table 5.2 PDE solver information for diffusion equation in Example 2.
Fig. 5.4. Example 2: Recovery of the diffusivity field with noiseless data. Left: Result with n = 30; Right: Error vs. polynomial order n.
It should be mentioned that the results shown so far are obtained via the Galerkin method. We now compare the Galerkin method and the collocation method for this example, with noisy data at a fixed noise level. Filtering is applied in both approaches. The results are shown in Figs. 5.6 and 5.7. Fig. 5.6 shows the results obtained with a high-order polynomial of degree n = 30. While the Galerkin method produces a highly accurate recovery result, the results from the collocation method show visible errors and are unsatisfactory. The error convergence with respect to increasing polynomial order is shown in Fig. 5.7. We can see that the collocation method fails to converge properly, unlike the Galerkin method. The primary reason for the lack of accuracy in the collocation method is that it requires derivative estimation in the solver. Computing derivatives with noisy data inevitably induces additional numerical errors. On the other hand, the Galerkin method avoids much of the derivative requirement due to its weak formulation and is able to maintain high accuracy.
5.3. Example 3: 1D Convection-Diffusion Equation. We now consider a 1D advection-diffusion equation
where
Fig. 5.5. Example 2: Comparison of recovery results with filtering and without filtering, using noisy data at different noise levels (panels: $10^{-5}$, $10^{-4}$, $10^{-3}$).
Fig. 5.6. Example 2: recovery using polynomial order n = 30 with noisy data. Filtering applied. Left: Galerkin method; Right: Collocation method.
with = 1,
= 0
= 0
= 10
. The data set is obtained by solving the equation numerically with initial condition
Details of the numerical solver are listed in Table 5.3. Data are collected over 50 uniform grid points in the temporal domain and 200 Gauss points plus the 2 boundary points in the spatial domain. Our goal is to recover the velocity field and the diffusivity field. We use a polynomial space as the approximation and testing space.
The recovery result for noiseless data is shown in Fig. 5.8. We observe excellent visual agreement between the recovered fields and their true counterparts.
Fig. 5.7. Example 2 with noisy data. Filtering applied. Error vs. polynomial order.
Table 5.3 PDE solver information for convection-diffusion equation in Example 3.
Closer
examination reveals that the relative errors are 2.9335 10
for
) and 3.4908
10
for
). The results obtained with noisy data are not shown, as they are visually similar to the noiseless case and with errors dominated by the input data noise.
Fig. 5.8. Example 3 with noiseless data. Recovered velocity and diffusivity fields.
5.4. Example 4: Viscous Burgers’ Equation. We now consider the 1D viscous Burgers’ equation
where
with = 0
= 0.2, and
= 3
. The initial condition is set as u(0, x) =
sin(
This nonlinear equation represents a departure from the linear advection-diffusion equation discussed in the paper. Although our theoretical results do not apply here, the proposed numerical approaches still apply. We focus on the noiseless data case with the Galerkin method. Similar to the other examples, data are collected over 50 uniform grid points in the temporal domain and 100 Gauss points plus the boundary points in the spatial domain. A polynomial space is used as the approximation and testing space for both unknown functions. The recovered results are shown in Fig. 5.9. Good agreement with the true functions can be seen. The relative errors are 9.7347
10
for
) and 4.0632
10
for
).
Fig. 5.9. Example 4 with noiseless data and polynomial order n = 40. Recovery of the two unknown functions.
5.5. Example 5: 2D Advection-Diffusion Equation. We finally consider a 2D advection-diffusion equation
where
with = 1,
= 1,
= 0
= 0.02,
= 1, and
. The initial condition is set as
with =
= 0.2. The details of the numerical solver are in Table 5.4. The solutions of the state variable at the initial and final time are shown in Fig. 5.10, for demonstration purpose.
Table 5.4 PDE solver information for 2-D convection-diffusion equation in Example 5.
Fig. 5.10. Example 5: State variable u at Initial and final stage.
To recover the velocity and diffusivity fields, we collect solution data over 200 uniformly distributed grid points in the temporal domain and 80 × 80 tensor Gauss points in the interior of the spatial domain, along with 80 Gauss points on each of the boundary edges. We use a polynomial space as the approximation and testing space.
The recovered results for the two velocity components and the diffusivity field are shown in Fig. 5.11, obtained via the Galerkin method using noiseless data. Visual comparison with the true functions shows good agreement. More detailed examination shows that the relative errors in the recovered solutions are 5.5235
for
), 4.0274
for
), and 6.9119
10
for
). Results of noisy data case are not shown, as they are visually similar to the noiseless case and with errors dominated by the data noise.
Fig. 5.11. Example 5: Comparison of true (left column) and recovered (right column) parameter functions.
6. Conclusion. In this paper, we studied the problem of identifying unknown parameter functions embedded in time-dependent partial differential equations (PDEs) using observational data of the state variables. Using linear advection-diffusion type equations, we conducted theoretical analysis on the solvability of the problem and derived conditions under which unique recovery can be obtained. We then presented numerical approaches applicable to general PDEs. Two types of approaches, Galerkin and collocation, are presented. While the collocation approach is straightforward to implement, the Galerkin method is preferred because its use of the weak form avoids many of the spatial derivatives of the state variables. In many practical cases where only data of the state variables are available, estimating derivatives often induces additional errors, especially when the data contain noise.
[1] J. Bongard and H. Lipson. Automated reverse engineering of nonlinear dynamical systems. Proc. Natl. Acad. Sci. U.S.A., 104(24):9943–9948, 2007.
[2] S. L. Brunton, B. W. Brunton, J. L. Proctor, Eurika Kaiser, and J. N. Kutz. Chaos as an intermittently forced linear system. Nature Communications, 8, 2017.
[3] S. L. Brunton, J. L. Proctor, and J. N. Kutz. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proc. Natl. Acad. Sci. U.S.A., 113(15):3932–3937, 2016.
[4] R. T. Q. Chen, Y. Rubanova, J. Bettencourt, and D. Duvenaud. Neural ordinary differential equations. arXiv preprint arXiv:1806.07366, 2018.
[5] J. H. Crews, R. C. Smith, K. M. Pender, J. C. Hannen, and G. D. Buckner. Data-driven techniques to estimate parameters in the homogenized energy model for shape memory alloys. Journal of Intelligent Material Systems and Structures, 23(17):1897–1920, 2012.
[6] M. Dam, M. Brøns, J. J. Rasmussen, V. Naulin, and J. S. Hesthaven. Sparse identification of a predator-prey system from simulation data of a convection model. Physics of Plasmas, 24(2):022310, 2017.
[7] W. E. A proposal on machine learning via dynamical systems. Communications in Mathematics and Statistics, 5(1):1–11, Mar 2017.
[8] J. Han, A. Jentzen, and W. E. Solving high-dimensional partial differential equations using deep learning. Proceedings of the National Academy of Sciences, 115(34):8505–8510, 2018.
[9] M. Karalashvili, S. Groß, W. Marquardt, A. Mhamdi, and A. Reusken. Identification of transport coefficient models in convection-diffusion equations. SIAM Journal on Scientific Computing, 33(1):303–327, 2011.
[10] M. Karalashvili, S. Groß, A. Mhamdi, A. Reusken, and W. Marquardt. Incremental identification of transport coefficients in convection-diffusion systems. SIAM Journal on Scientific Computing, 30(6):3249–3269, 2008.
[11] Y. Khoo, J. Lu, and L. Ying. Solving parametric pde problems with artificial neural networks. arXiv preprint arXiv:1707.03351, 2018.
[12] Z. Long, Y. Lu, and B. Dong. Pde-net 2.0: Learning pdes from data with a numeric-symbolic hybrid deep network. arXiv preprint arXiv:1812.04426, 2018.
[13] Z. Long, Y. Lu, X. Ma, and B. Dong. PDE-Net: learning PDEs from data. arXiv preprint arXiv:1710.09668, 2017.
[14] B. Malengier and R. V. Keer. Parameter estimation in convection dominated nonlinear convection-diffusion problems by the relaxation method and the adjoint equation. Journal of Computational and Applied Mathematics, 215(2):477 – 483, 2008.
[15] N. M. Mangan, J. N. Kutz, S. L. Brunton, and J. L. Proctor. Model selection for dynamical systems via sparse regression and information criteria. Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 473(2204), 2017.
[16] A. Mardt, L. Pasquali, H. Wu, and F. Noe. VAMPnets for deep learning of molecular kinetics. Nature Comm., 9:5, 2018.
[17] A. Narasingam and J. S. Kwon. Data-driven identification of interpretable reduced-order models using sparse regression. Computers & Chemical Engineering, 119:101–111, 2018.
[18] T. K. Nilssen, K. H. Karlsen, T. Mannseth, and X.-C. Tai. Identification of diffusion parameters in a nonlinear convection-diffusion equation using the augmented Lagrangian method. Computational Geosciences, 13(3):317–329, Sep 2009.
[19] T. Qin, K. Wu, and D. Xiu. Data driven governing equations approximation using deep neural networks. J. Comput. Phys., 395:620–635, 2019.
[20] M. Quade, M. Abel, J. N. Kutz, and S. L. Brunton. Sparse identification of nonlinear dynamics for rapid model recovery. Chaos: An Interdisciplinary Journal of Nonlinear Science, 28(6):063116, 2018.
[21] M. Raissi. Deep hidden physics models: Deep learning of nonlinear partial differential equations. arXiv preprint arXiv:1801.06637, 2018.
[22] M. Raissi and G. E. Karniadakis. Hidden physics models: Machine learning of nonlinear partial differential equations. J. Comput. Phys., 357:125–141, 2018.
[23] M. Raissi, P. Perdikaris, and G. E. Karniadakis. Machine learning of linear differential equations using gaussian processes. Journal of Computational Physics, 348:683–693, 2017.
[24] M. Raissi, P. Perdikaris, and G. E. Karniadakis. Physics informed deep learning (part i): Data-driven solutions of nonlinear partial differential equations. arXiv preprint arXiv:1711.10561, 2017.
[25] M. Raissi, P. Perdikaris, and G. E. Karniadakis. Physics informed deep learning (part ii): data-driven discovery of nonlinear partial differential equations. arXiv preprint arXiv:1711.10566, 2017.
[26] M. Raissi, P. Perdikaris, and G. E. Karniadakis. Multistep neural networks for data-driven discovery of nonlinear dynamical systems. arXiv preprint arXiv:1801.01236, 2018.
[27] S. Rudy, A. Alla, S. L. Brunton, and J. N. Kutz. Data-driven identification of parametric partial differential equations. arXiv preprint arXiv:1806.00732, 2018.
[28] S. H. Rudy, S. L. Brunton, J. L. Proctor, and J. N. Kutz. Data-driven discovery of partial differential equations. Science Advances, 3(4):e1602614, 2017.
[29] S. H. Rudy, J. N. Kutz, and S. L. Brunton. Deep learning of dynamics and signal-noise decomposition with time-stepping constraints. arXiv preprint arXiv:1808.02578, 2018.
[30] H. Schaeffer. Learning partial differential equations via data discovery and sparse optimization. Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 473(2197), 2017.
[31] H. Schaeffer and S. G. McCalla. Sparse model selection via integral terms. Phys. Rev. E, 96(2):023302, 2017.
[32] H. Schaeffer, G. Tran, and R. Ward. Extracting sparse high-dimensional dynamics from limited data. arXiv preprint arXiv:1707.08528, 2017.
[33] M. Schmidt and H. Lipson. Distilling free-form natural laws from experimental data. Science, 324(5923):81–85, 2009.
[34] J. Schorsch, M. Gilson, and H. Garnier. Identification of advection-diffusion equation from a limited number of spatial locations. IFAC Proceedings Volumes, 46(11):193 – 198, 2013. 11th IFAC Workshop on Adaptation and Learning in Control and Signal Processing.
[35] R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), pages 267–288, 1996.
[36] G. Tran and R. Ward. Exact recovery of chaotic systems from highly corrupted data. Multiscale Model. Simul., 15(3):1108–1129, 2017.
[37] K. Wu and D. Xiu. Numerical aspects for approximating governing equations using data. J. Comput. Phys., 384:200–221, 2019.
[38] K. Wu and D. Xiu. Data-driven deep learning of partial differential equations in modal space. J. Comput. Phys., 408:109307, 2020.
[39] S. Zhuk, T. T. Tchrakian, S. Moore, R. Ordóñez-Hurtado, and R. Shorten. On source-term parameter estimation for linear advection-diffusion equations with uncertain coefficients. SIAM Journal on Scientific Computing, 38(4):A2334–A2356, 2016.