Inductive logic programming (ILP) has been a deeply influential paradigm in AI, enjoying decades of research on its theory and implementations (Muggleton and De Raedt 1994; De Raedt et al. 2008; Muggleton et al. 2012). ILP continues to be applied in domains ranging from robotics to biology. As a natural descendent of the fields of logic programming and machine learning, it admits the incorporation of complex background knowledge, which can be very useful in domains where prior knowledge from experts is available and can lead to a more data-efficient learning regime. In essence, the semantic theory attempts to construct a hypothesis based on entailment judgements wrt a (possibly small) set of examples. The construction itself may appeal to principled notions such as inverse entailment (Muggleton 1991).
Be that as it may, the limitation to Horn clauses composed over Boolean variables is a very serious one. Many phenomena occurring in the real-world, from gravitational and quantum mechanics to stock price fluctuations, are best characterized using continuous entities, and more generally, mixtures of discrete and continuous entities. While many notable proposals are treating the issue of modeling hybrid phenomena in logic programming and inductive logic programming settings (Nitti 2016; Speichert and Belle 2018), the underpinning semantics still largely reduces to classical notions with continuous concepts carefully (but not generally) integrated. Thus, we consider whether we should upgrade the logical basis for ILP to natively handle continuous concepts.
In recent years, satisfiability modulo theories (SMT) has emerged as a pragmatic logical framework for reasoning about complex terms, inequalities and other arithmetic operations (Barrett et al. 2009), such as testing the satisfiability of linear constraints. For example, in (Chistikov, Dimitrova, and Majumdar 2015; Belle, Passerini, and Van den Broeck 2015), SMT solvers were used to generalize model counting to hybrid domains by computing the volume of the polytope encoded as a linear constraint over the reals. Other significant advances in SMT have included the handling of inductive constraints and non-linear theories (Reynolds and Kuncak 2015; Gao, Kong, and Clarke 2013). Thus, we argue that SMT and similar technologies could serve as a reasonable basis to upgrade ILP for hybrid domains. We do not suggest the proposal should aim to subsume classical ILP, as it is very likely that Horn logic over Boolean atoms will be both sufficient and efficient for many problem domains. Rather, it is meant to be complementary, to potentially tackle a different set of problem domains while benefiting from decades of developments in ILP theory.
It is possible, of course, that a reasonable middle ground could be achieved by appealing to constraint logic programming (Jaffar and Maher 1994), as seen in some early work (Martin and Vrain 1997; Sebag and Rouveirol 1996). However, it is not entirely obvious such proposals capture the entire range of expressivity that one would with a SMT basis. In that regard, note that because we are not insisting on logic programming syntax, we are essentially motivating a reconsideration of inductive declarative programming, as opposed to purely inductive logic programming.
It is also worth remarking that a number of recent developments relate to the motivation here, such as in the field of constraint learning (De Raedt, Passerini, and Teso 2018), all of which could be leveraged to strengthen the theoretical and algorithmic foundations of the proposed framework. In (Kolb et al. 2018), a heuristic approach to learn SMT formulas capturing positive-only examples is considered. In (Mocanu, Belle, and Juba 2019), the implicit learning of SMT formulas is investigated via a PAC formulation, while also allowing for noisy examples. The work in (Molina et al. 2018; Bueff, Speichert, and Belle 2018) can be seen as learning weighted SMT formulas, albeit simple ones corresponding to the difference logic fragment over binary connectives.
The basic concepts of logic programming are defined wrt a first-order language, where we have: atoms p(t1, . . ., tn), consisting of predicates p and terms t1, . . . , tb, understood as usual. Literals, clauses, definite clauses and grounding are understood as usual. Then, in the simplest instance, we have:
Given a set of examples (or observations) E = {e1, . . . , en}, where ei is a ground fact for the unknown target predicate p, a background theory B as a set of definition clauses, a space of clauses specified using a declarative bias, find a hypothesis H
where B
The hypothesis would additionally need to satisfy certain rational generality properties, in the sense of maximally compressing E relative to B (Muggleton 1991). For example, consider an empty background theory with observations: parent(f,c), parent(m,c), parent(g,f),grandparent(g,c), and being the set of definite clauses. Then we may obtain the hypothesis: grandparent(x, y)
parent(x, z), parent(z, y).
Satisfiability modulo theories (SMT) is a generalization to SAT for deciding satisfiability for fragments and extensions of first-order logics with equality, for which specialized decision procedures have been developed. Deciding satisfia-bility for these logics is done with respect to some decidable background theory which fixes the interpretations of functions and predicates (Barrett et al. 2009). Briefly, we have:
Syntax We assume a logical signature consisting of the set of predicates denoted as P, and a set of functions symbols F , including 0-ary functions, logical variables, and standard connectives. An atom is one of the form: b (a propositional symbol), p(t1, ..., tk), t1 = t2, (false),
(true). Literals and clauses are understood as usual. A ground expression is one where all the variables are replaced by the domain of discourse (e.g., integers, reals, finite set of named objects).
Semantics Formulas are given a truth value from the set {true, false} by means of first order models. A model is a pair consisting of a non-empty set
, the universe of the model and a mapping assigning to each constant symbol a an element a
(the domain), to each function symbol f
of arity n > 0 a total function f :
, to each propositional symbol b an element b
true, false} and to each predicate p
of arity n > 0 a total function p :
true, false}. Terms are interpreted as usual, as is the satisfaction relation that is defined inductively. We assume entailment wrt a suitable background theory (e.g., reals).
A general setting for induction can be taken to be de-fined over the following languages (Muggleton 1991): (the language of examples),
(the language for the background knowledge) and
(the language for the hypothesis), and as can be inferred from above, given B
and E
, the task is to find H
, such that H
As far as the background knowlege and hypothesis is concerned, for the SMT setting, we could now imagine
being fragments of linear real arithmetic, for example, but this is not necessary.
could involve inductive constraints, and both
and
could involve non-linear constraints. The search for the hypothesis could be achieved in the first instance by appealing to reductions of the induction step to satisfiability (Evans and Grefenstette 2018). So, not very much changes at first glance, which is a positive development for bridging existing ILP theory and frameworks with this new setting. Moreover, other entailment judgements for strengtening ILP, e.g., that negative examples where provided should never be entailed by B
H could be applied here too. Nonetheless, note that
are richer in some regards, and not restricted to Horn clauses. We will now discuss some variants below for
partial models In the simplest instance, we have observations that are partial models. For example, in a language with 0-ary functions {x, y, . . ., z} over the reals, a full model may be of the form (x = 1, y = 2.3, . . ., z = 6). In this case, an example might be a partial model of the form (x = 1), and another might be of the form (y = 2.3). Clearly, if the conjunction of the partial models is taken as H, then trivially B
But this is a not a very interesting hypothesis. Like in the classical setting, we would need to specify
in a way to maximally compress the examples wrt B; so, for example, if B is x + y > z, we might infer H as y > x by specifying length/syntax restrictions on H wrt E.
sets of partial models It is natural to imagine that the observations are, in fact, sets of partial models. For example, if we were unable to measure x precisely, we may need to contend with x > 0. So, given examples x > 0 and z = 0, we might infer the hypothesis x > z.
k-ary functions We have discussed the use of 0-ary functions above, and that is the case for much of the learning literature (Kolb et al. 2018; Mocanu, Belle, and Juba 2019). The natural analogue of the kind of examples seen in classical ILP might better correspond to the use of k-ary functions with logical variables. Incidentally, this level of expressiveness is supported by SMT solvers (Barrett et al. 2009) for capturing arrays, and so on. As an example, suppose B includes the expression bmi(x) = weight(x)/height(x)2, and given examples (partial models) weight(john) = 100, height(john) = 1.9, weight(mary) = 70, height(mary) = 1.8, we might infer a hypothesis bmi(x) > 21.
We motivated a reconsideration of inductive declarative programming by leveraging SMT. SMT solvers have emerged to deftly handle arithmetic constraints and inductive defi-nitions. As a result, the languages that we could consider for B and H would be significantly richer so as to capture a broad range of problems, and potentially impact the numerous applications areas of SMT (Barrett et al. 2009). Of course, understanding how ideas from ILP systems can be lifted for this methodology is an open but exciting question. Treating probabilistic concepts (De Raedt et al. 2008) is another exciting direction, which would relate this framework to learning in hybrid probabilistic models (Belle, Passerini, and Van den Broeck 2015) and statistical relational learning (Kersting, Natarajan, and Poole 2011) more generally. It would allow one to express that each learned clause holds with a certain probability, but not categorically, for example.
[Barrett et al. 2009] Barrett, C.; Sebastiani, R.; Seshia, S. A.; and Tinelli, C. 2009. Satisfiability modulo theories. In Handbook of Satisfiability. IOS Press. chapter 26, 825–885.
[Belle, Passerini, and Van den Broeck 2015] Belle, V.; Passerini, A.; and Van den Broeck, G. 2015. Probabilistic inference in hybrid domains by weighted model integration. In IJCAI.
[Bueff, Speichert, and Belle 2018] Bueff, A.; Speichert, S.; and Belle, V. 2018. Tractable querying and learning in hybrid domains via sum-product networks. In KR Workshop on Hybrid Reasoning and Learning.
[Chistikov, Dimitrova, and Majumdar 2015] Chistikov, D.; Dimitrova, R.; and Majumdar, R. 2015. Approximate counting in SMT and value estimation for probabilistic programs. In TACAS, volume 9035. 320–334.
[De Raedt et al. 2008] De Raedt, L.; Frasconi, P.; Kersting, K.; and Muggleton, S., eds. 2008. Probabilistic inductive logic programming: theory and applications. Berlin, Heidelberg: Springer-Verlag.
[De Raedt, Passerini, and Teso 2018] De Raedt, L.; Passerini, A.; and Teso, S. 2018. Learning constraints from examples. In Thirty-Second AAAI Conference on Artificial Intelligence.
[Evans and Grefenstette 2018] Evans, R., and Grefenstette, E. 2018. Learning explanatory rules from noisy data. Journal of Artificial Intelligence Research 61:1–64.
[Gao, Kong, and Clarke 2013] Gao, S.; Kong, S.; and Clarke, E. M. 2013. dreal: An smt solver for nonlinear theories over the reals. In CADE, 208–214.
[Jaffar and Maher 1994] Jaffar, J., and Maher, M. J. 1994. Constraint logic programming: A survey. The journal of logic programming 19:503–581.
[Kersting, Natarajan, and Poole 2011] Kersting, K.; Natara- jan, S.; and Poole, D. 2011. Statistical relational AI: Logic, probability and computation.
[Kolb et al. 2018] Kolb, S.; Teso, S.; Passerini, A.; and De Raedt, L. 2018. Learning smt (lra) constraints using smt solvers. In IJCAI, 2333–2340.
[Martin and Vrain 1997] Martin, L., and Vrain, C. 1997. Learning linear constraints in inductive logic programming. In Proceedings of the 9th European Conference on Machine Learning, ECML ’97, 162–169. London, UK, UK: SpringerVerlag.
[Mocanu, Belle, and Juba 2019] Mocanu, I. G.; Belle, V.; and Juba, B. 2019. Pac+ smt. In NeurIPS Workshop
on Knowledge Representation & Reasoning Meets Machine Learning.
[Molina et al. 2018] Molina, A.; Vergari, A.; Di Mauro, N.; Natarajan, S.; Esposito, F.; and Kersting, K. 2018. Mixed sum-product networks: A deep architecture for hybrid domains. In Thirty-second AAAI conference on artificial intelligence.
[Muggleton and De Raedt 1994] Muggleton, S., and De Raedt, L. 1994. Inductive logic programming: Theory and methods. The Journal of Logic Programming 19:629–679.
[Muggleton et al. 2012] Muggleton, S.; De Raedt, L.; Poole, D.; Bratko, I.; Flach, P.; Inoue, K.; and Srinivasan, A. 2012. Ilp turns 20. Machine learning 86(1):3–23.
[Muggleton 1991] Muggleton, S. 1991. Inductive logic pro- gramming. New Gen. Comput. 8(4):295–318.
[Nitti 2016] Nitti, D. 2016. Hybrid Probabilistic Logic Programming. Ph.D. Dissertation, KU Leuven.
[Reynolds and Kuncak 2015] Reynolds, A., and Kuncak, V. 2015. Induction for smt solvers. In International Workshop on Verification, Model Checking, and Abstract Interpretation, 80–98. Springer.
[Sebag and Rouveirol 1996] Sebag, M., and Rouveirol, C. 1996. Constraint inductive logic programming.
[Speichert and Belle 2018] Speichert, S., and Belle, V. 2018. Learning probabilistic logic programs in continuous domains.