Planning and Synthesis Under Assumptions

2018·Arxiv

Abstract

Abstract

In Reasoning about Action and Planning, one synthesizes the agent plan by taking advantage of the assumption on how the environment works (that is, one exploits the environment’s effects, its fairness, its trajectory constraints). In this paper we study this form of synthesis in detail. We consider assumptions as constraints on the possible strategies that the environment can have in order to respond to the agent’s actions. Such constraints may be given in the form of a planning domain (or action theory), as linear-time formulas over infinite or finite runs, or as a combination of the two. We argue though that not all assumption specifications are meaningful: they need to be consistent, which means that there must exist an environment strategy fulfilling the assumption in spite of the agent actions. For such assumptions, we study how to do synthesis/planning for agent goals, ranging from a classical reachability to goal on traces specified in LTL and LTLf/LDLf, characterizing the problem both mathematically and algorithmically.

1 Introduction

Reasoning about actions and planning concern the representation of a dynamic system. This representation consists of a description of the interaction between an agent and its environment and aims at enabling reasoning and deliberation on the possible course of action for the agent (Reiter 2001). Planning in fully observable nondeterministic domains (FOND), say in Planning Domain Defini-tion Language (PDDL), (Ghallab, Nau, and Traverso 2004; Geffner and Bonet 2013) exemplifies the standard methodology for expressing dynamic systems: it represents the world using finitely many fluents under the control of the environment and a finitely many actions under the control of the agent. Using these two elements a model of the dynamics of world is given. Agent goals, e.g., reachability objectives, or, say, temporally extended objectives written in LTL (Bacchus and Kabanza 2000; Camacho et al. 2017; De Giacomo and Rubin 2018), are expressed over such models in terms of such fluents and actions. An important observation is that, in devising plans, the agent takes advantage of such a representation of the world. Such a representation corresponds to knowledge that the agent has of the world. In other words, the agent assumes

that the world works in a certain way, and exploits such an assumption in devising its plans. A question immediately comes to mind:

Obviously the planning domain itself (including the initial state) with its preconditions and effects is such an assumption. That is, as long as the agent sticks to its preconditions, the environment acts as described by the domain. So, the agent can exploit the effect of its actions in order to reach a certain goal (state of affairs). Another common assumption is to assume the domain is fair, i.e., so-called fair FOND (Daniele, Traverso, and Vardi 1999; Pistore and Traverso 2001; Cimatti et al. 2003; Camacho et al. 2017; D’Ippolito, Rodr´ıguez, and Sardi˜na 2018). In this case the agent can exploit not only the effects, but also the guarantee that by continuing to execute an action from a given state the environment will eventually respond with all its possible nondeterministic effects.1 More recently (Bonet and Geffner 2015; Bonet et al. 2017) trajectory constraints over the domain, expressed in LTL, have been proposed to model general restrictions on the possible environment behavior. But is any kind of LTL formula on the fluents and actions of the domain a possible trajectory constraint for the environment? The answer is obviously not! To see this, consider a formula expressing that eventually a certain possible action must actually be performed (the agent may decide not to do it). But then

Which trajectory constraints are suitable as assumptions in a given domain? Focusing on LTL, the question can be rephrased as:

Can any linear-time specification be used as an assumption for the environment? We can summarize these questions, ultimately, by asking:

This is what we investigate in this paper. We take the view that environment assumptions are ways to talk about the set of strategies the environment can enact. Moreover, the plan for the goal, i.e., the agent strategy for fulfilling the goal, need only fulfill the goal against the strategies of the environment from the given set of environment strategies. We formalize this insight and define synthesis/planning under assumptions and the relationship between the two in a general linear-time setting. In particular, our definitions only allow linear-time properties to be assumptions if the environment can enforce them. In doing this we answer the above questions.

We also concretize the study and express goals and assumptions in LTL, automata over infinite words (deterministic parity word automata) (Gr¨adel, Thomas, and Wilke 2002), as well as formalisms over finite traces, i.e., LTLf/LDLf (De Giacomo and Vardi 2013; De Giacomo and Rubin 2018) and finite word automata. This allows us to study how to solve synthesis/planning under assumptions problems. One may think that the natural way to solve such synthesis problems is to have the agent synthesize a strategy for the implication

where both Assumption and Goals are expressed, say, in LTL. A first problem with such an implication is that the agent should not devise strategies that make Assumption false, because in this case the agent would lose its model of the world without necessarily fulfilling its Goal. This undesirable situation is avoided by our very notion of environment assumption. A second issue is this:

We show that this is not the case. Note that an agent that synthesizes for the implication is too pessimistic: the agent, having chosen a candidate agent strategy, considers as possible all environment strategies that satisfy Assumption against the specific candidate strategy it is analyzing. But, in this way the agent gives too much power to the environment, since, in fact, the environment does not know the agent’s chosen strategy. On the other hand, surprisingly, we show that if there is an agent strategy fulfilling Goal under Assumption, then also there exists one that indeed enforces the implication. Thus, even if the implication cannot be used for characterizing the problem of synthesis/planning under assumptions, it can be used to solve it. Exploiting this result, we give techniques to solve synthesis/planning under assumptions, and study the worst case complexity of the problems when goals and assumptions are expressed in the logics and automata mentioned above.

2 Synthesis and Linear-time speciﬁcations

Synthesis is the problem of producing a module that sat-isfies a given property no matter how the environment behaves (Pnueli and Rosner 1989). Synthesis can be thought of in the terminology of games. Let Var be a finite set of Boolean variables (also called atoms), and assume it is partitioned into two sets: A, those controllable by the agent, and E, those controllable by the environment. Let be the set of actions and the set of environment states (note the symmetry: we think of A as a set of actions that are compactly represented as assignments of the variables in A). The game consists of infinitely many phases. In each phase of the game, both players assign values to their variables, with the environment going first. These assignments are given by strategies: an agent strategy and an environment strategy . The resulting infinite sequence of assignments is denoted .2

In classic synthesis the agent is trying to ensure that the produced sequence satisfies a given linear-time property. In what follows we write L to denote a generic formalism for defining linear-time properties. Thus, the reader may substitute their favorite formalism for L, e.g., one can take L to be linear temporal logic, or deterministic parity automata. We use logical notation throughout. For instance, when refers to a logical formula, then refers to conjunction of formulas, but when refers to an automaton then refers to intersection of automata. If write for the set it defines. For instance, if LTL then is the set of infinite sequences that satisfy , but when is an automaton operating on infinite sequences, then is the set of infinite sequences accepted by the automaton. Moreover, in both cases we say that the sequence satisfies .

We say that realizes (written ) if , i.e., if no matter which strategy the environment uses, the resulting sequence satisfies . Similarly, we say that realizes (written ) if . We write (resp. ) for the set of environment (resp. agent) strategies that realize , and in case this set is non-empty we say that is environment (resp. agent) realizable. We write (resp. ) for the set of all environment (resp. agent) strategies.

Solving L environment- (resp. agent-) synthesis asks, given to decide if is environment- (resp. agent-) realizable, and to return such a finite-state strategy (if one exists). In other words, realizability is the recognition problem associated to synthesis. We now recall two concrete speci-fication formalisms L, namely LTL (linear temporal logic) and DPW (deterministic parity word automata), and then state that results about solving LTL/DPW synthesis.

Linear temporal Logic (LTL) Formulas of LTL(Var), or simply LTL, are generated by the following grammar:

where . The size of a formula is the number of symbols in it. LTL formulas are interpreted over infinite sequences . Define the satisfaction relation |= as follows: iff iff for some iff it is not the case that iff iff there exists such that and for all , . Write if and say that satisfies and is a model of . An LTL formula de-fines the set . We use the usual abbreviations, , . Write Bool(Var) for the set of Boolean formulas over Var. We remark that every result in this paper that mentions LTL also holds for LDL (linear dynamic logic) (Vardi 2011; Eisner and Fisman 2006).

Deterministic Parity Word Automata (DPW) A DPW over Var is a tuple where Q is a fi-nite set of states, is an initial state, is the transition function, and Z is the coloring. The run of M on the input word is the infinite sequence of transitions such that . A run is successful if the largest color occurring infinitely often is even. In this case, we say that the input word is accepted. The DPW M defines the set [[M]] consisting of all input words it accepts. The size of M, written |M|, is the cardinality of Q. The number of colors of M is the cardinality of col(Q).

DPWs are effectively closed under Boolean operations, see e.g., (Gr¨adel, Thomas, and Wilke 2002):

Lemma 1. Let be DPW with states and colors, respectively.

1. One can effectively form a DPW with states and colors for the complement of .

2. One can effectively form a DPW with with many states and O(d) many colors, where , for the disjunction .

Thus, e.g., from DPW one can build a DPW for whose number of states is and whose number of colors is O(d).

Every LTL formula can be translated into an equivalent DPW M, i.e., , see e.g. (Vardi 1995; Piterman 2007). Moreover, the cost of this translation and the size of M are at most doubly exponential in the size of , and the number of colors of M is at most singly exponential in the size of .

Here is a summary of the complexity of solving synthesis:

Theorem 2 (Solving Synthesis).

1. Solving LTL environment (resp. agent) synthesis is -complete (Pnueli and Rosner 1989).

2. Solving DPW environment (resp. agent) synthesis is PTIME in the size of the automaton and EXPTIME in the number of its colors (Pnueli and Rosner 1989; Finkbeiner 2016).

3 Synthesis under Assumptions

In this section we give core definitions of environment assumptions and synthesis under such assumptions. Intuitively, the assumptions are used to select the environment strategies that the agent considers possible, i.e., although the agent does not know the particular environment strategy it will encounter, it knows that it comes from such a set. We begin in the abstract, and then move to declarative specifi-cations. Unless explicitly specified, we assume fixed sets E and A of environment and agent atoms.

Here are the main definitions of this paper:

Definition 1 (Environment Assumptions – abstract). We call any non-empty set of environment strategies an environment assumption.

Informally, the set represents the set of environment strategies that the agent considers possible.

Definition 2 (Agent Goals – abstract). We call any set of traces an agent goal.

Definition 3 (Synthesis under assumptions – abstract). Let be an environment assumption and an agent goal. We say that an agent strategy realizes assuming if

Remark 1 (On the non-emptiness of ). Note that the requirement that be non-empty is a consistency requirement; if it were empty then there would be no to test for membership in and so synthesis under assumptions would trivialize and all agent strategies would realize all goals.

For the rest of this paper we will specify agent goals and environment assumptions as linear-time properties. In particular, we assume that L is a formalism for specifying linear-time properties over Var, e.g., L = LTL or L = DPW.

How should determine an assumption ? In general, talks about the interaction between the agent and the environment. However, we want that the agent can be guaranteed that whatever it does the resulting play satisfies . Thus, a given induces the set consisting of all environment strategies such that for all agent strategies the resulting trace satisfies . In particular, for to be non-empty (as required for it to be an environment assumption) we must have that is environment realizable. This justifies the following definitions.

Definition 4 (Synthesis under Assumptions – linear-time).

1. We call an environment assumption if it is environment realizable.

2. We call any an agent goal.

3. An L synthesis under assumptions problem is a tuple P = where is an environment assumption and is an agent goal.

4. We say that an agent strategy realizes assuming , or that it solves P, if .

5. The corresponding decision problem is to decide, given P, if there is an agent strategy solving P.

For instance, solving LTL synthesis under assumptions means, given with environment assumption LTLand agent goal LTL, to decide if there is an agent strategy solving P, and to return such a finite-state strategy (if one exists). We remark that solving LTL synthesis under assumptions is not immediate; we will provide algorithms in the next section. For

now, we point out that deciding whether is an environment assumption amounts to checking if is environment realizable, itself a problem that can be solved by known results (i.e., Theorem 2).

Theorem 3. 1. Deciding if an LTL formula is an environment assumption is -complete.

2. Deciding if a DPW is an environment assumption is in PTIME in the size of the DPW and exponential in its number of colors.

We illustrate such notions with some examples.

Example 1. 1. The set , definable in LTL by the formula , is an environment assumption. It captures the situation that the agent assumes that the environment will use any of the strategies in .

2. In robot-action planning problems, typical environment assumptions encode the physical space, e.g., “if robot is in Room 1 and does action Move then in the next step it can only be in Rooms 1 or 4”. The set of environment strategies that realize these properties is an environment assumption, definable in LTL by a conjunction of formulas of the form . We will generalize this example by showing that the set of environment strategies in a planning domain D can be viewed as an environment assumption definable in LTL.

4 Solving Synthesis under Assumptions

In this section we show how to solve synthesis under assumptions when the environment assumptions and agent goals are given in LTL or by DPW. The general idea is to reduce synthesis under assumptions to ordinary synthesis, i.e., synthesis of the implication . Although correct, understanding why it is correct is not immediate.

Lemma 4. Let be an environment assumption and an agent goal. Then, every agent strategy that realizes also realizes assuming .

Proof. Let be an agent strategy realizing (a). To show that realizes assuming let be an environment strategy realizing (b). Now consider the trace . We must show that satisfies . By (a) satisfies and by (b) satisfies .

We now observe that the converse is not true. Consider A .= {x} and E .= {y}, and let and . First note that is an environment assumption formula (indeed, the environment can realize by playing at the first step). Moreover, every environment strategy realizing begins by playing (since otherwise the agent could play on its first turn and falsify ). Thus, every agent strategy realizes assuming (since the environment’s first move is to play which makes true no matter what the agent does). On the other hand, not every agent strategy realizes (indeed, the strategy which plays x on its first turn fails to satisfy the implication on the trace in which the environment plays y on its first turn). In spite of the failure of the converse, the realizability problems are inter-reducible:3

Theorem 5. Suppose is an environment assumption. The following are equivalent: 1. There is an agent strategy realizing . 2. There is an agent strategy realizing assuming .

Proof. The previous lemma gives us . For the converse, suppose 1 does not hold, i.e., is not agentrealizable. Now, an immediate consequence of Martin’s Borel Determinacy Theorem (Martin 1975) is that for every in any reasonable specification formalism (including all the ones mentioned in this paper), is not agent realizable iff is environment realizable. Thus, is environment-realizable, i.e., . Note in particular that realizes , i.e., . Now, suppose for a contradiction that 2 holds, and take realizing assuming . Then by definition of realizability under assumptions and using the fact that we have that . On the other hand, we have already seen that , a contradiction.

Moreover, we see that one can actually extract a strategy solving synthesis by assumptions simply by extracting a strategy for solving the implication , which itself can be done by known results, i.e., for LTL use Theorem 2 (part 1), and for DPW use Lemma 1 and Theorem 2 (part 2).

Theorem 6. 1. Solving LTL synthesis under assumptions is -complete.

2. Solving DPW synthesis under assumptions is in PTIME in the size of the automata and in EXPTIME in the number of colors of the automata.

5 Planning under Assumptions

In this section we define planning under assumptions, that is synthesis wrt a domain4. We begin with a representation of fully-observable non-deterministic (FOND) domains (Ghal- lab, Nau, and Traverso 2004; Geffner and Bonet 2013). Our representation considers actions symmetrically to fluents, i.e., as assignments to certain variables. A domain consists of:

• a non-empty set E of environment Boolean variables, also called fluents; the elements of are called environment states,

• a non-empty set A (disjoint from E) of action Boolean variables; the elements of are called actions,

• a non-empty set of initial environment states,

• a relation of available actions such that for every there is an with (we say that a is available in s), and

• a relation such that implies that .

As is customary in planning and reasoning about actions, we assume domains are represented compactly by tuples where , and (here ). This data induces the domain where

1. s I iff s |= init,

2. (s, aPre iff s pre, 3. (s, a, tiff s δ.

We emphasize that when measuring the size of D we use this compact representation:

Definition 5. The size of D, written |D|, is |E| + |A| + .

We remark that in PDDL action preconditions are declared using :precondition, conditional effects using the when operator, and nondeterministic outcomes using the oneof operator (note that we code actions with action variables).

Example 2 (Universal Domain). Given E and A define the universal domain where I .= E, and .

We now define the set of environment strategies induced by a domain. We do this by describing a property , that itself can be represented in LTL and DPW, as shown below.

Definition 6. Fix a domain D. Define a property (over atoms ) as consisting of all traces such that

Observe that is an environment assumption since, by the definition of domain, whenever an action is available in a state there is at least one possible successor state. Intuitively, an environment strategy is in if i) its first move is to pick an initial environment state, and ii) thereafter, if the current action a is available in the current environment state x (and the same holds in all earlier steps) then the next environment state is constrained so that . Notice that is unconstrained the moment a is not available in x, e.g., in PDDL these would be actions for which the preconditions are not satisfied. Intuitively, this means that it is in the interest of the agent to play available actions because otherwise the agent can’t rely on the fact that the trace comes from the domain.

Remark 2. The reader may be wondering why the above definition does not say i’) and ii’) for all , . Consider the linear-time property consisting of traces satisfying i’ and ii’. Observe that, in general, is not environment realizable. Indeed, condition ii’ implies that is available in E. However, no environment strategy can force the agent to play an available action.

We now observe that one can express in LTL.

Lemma 7. For every domain D there is an LTL formula equivalent to . Furthermore, the size of the LTL formula is linear in the size of D.

sented compactly by . For the LTL formula, let be the LTLformula formed from the formula by replacing every term of the form by X e. Note that iff . The promised LTLformula is

One can also express directly by a DPW.

Lemma 8. For every domain D there is a DPW equivalent to . Furthermore, the size of the DPW is at most exponential in the size of D and has two colors.

To do this we define the DPW directly rather than translate the LTL formula (which would give a double exponential bound). Define the DPW over A as follows. Introduce fresh symbols . Let be the initial state. Define . Define , and col(q) = 0 for all . For all the transitions are given in Table 1. Intuitively, on reading the input the DPW goes to the rejecting sink if (resp. I) is not respected, it goes to the accepting sink if (resp. I) is respected but Pre is not, and otherwise it continues (and accepts).

Table 1: Transitions for DPW for

Definition 7. Let D be a domain.

• A set is an environment assumption for the domain D if is non-empty.

• is an environment assumption for the domain D if is non-empty, i.e., if is an environment assumption for the domain D. We illustrate the notion with some examples.

Example 3.

1. is an environment assumption for D since is environment realizable.

2. Let denote the following property: iff for all , if there are infinitely many n such that and , then for every

with there are infinitely many n such that and . In words, this says that if a state-action pair occurs infinitely often, then infinitely often this is followed by every possible effect.

Note that is an environment assumption for domain D since, e.g., the strategy that resolves the effects in a round-robin way realizes . Note that is definable in LTL by a formula of size exponential in D:

3. In planning, trajectory constraints, e.g., expressed in LTL, have been introduced for expressing temporally extended goals (Bacchus and Kabanza 2000; Gerevini et al. 2009). More recently, especially in the context of generalized planning, they have been used to describe restrictions on the environment as well (Bonet and Geffner 2015; De Giacomo et al. 2016; Bonet et al. 2017). However, not all trajectory constraints can be used as assumptions. In fact, Definition 7, which says that a formula is an environment assumption for the domain D if is environment realizable, characterizes those formulas that can serve as trajectory constraints.

We can check if LTL is an environment assumption for D by converting it to a DPW , converting D into the DPW (as above), and then checking if the DPW is environment realizable. Hence we have:

Theorem 9. 1. Deciding if an LTL formula is an environment assumption for the domain D is -complete. Moreover, it can be solved in EXPTIME in the size of D and in the size of .

2. Deciding if a DPW is an environment assumption for the domain D is in EXPTIME. Moreover, it can be solved in EXPTIME in the size of D and PTIME in the size of and EXPTIME in the number of colors of . For the lower bound take D .= U to be the universal do-

main and apply the lower bound from Theorem 2. Now we turn to planning under assumptions.

Definition 8 (Planning under Assumptions – abstract).

1. A planning under assumptions problem P is a tuple where • D is a domain, • is an environment assumption for D, and • is an agent goal.

2. We say that an agent strategy solves P if

We can instantiate this definition to environment assumptions and agent goals definable in L.

Definition 9 (Planning under Assumptions – linear-time).

1. An L planning under assumptions problem is a tuple P = where is an environment assumption for D and is an agent goal.

2. We say that an agent strategy realizes assuming , or that it solves P, if

The corresponding decision problem asks, given an L planning under assumptions problem P to decide whether there is an agent strategy that solves P. For instance, LTL planning under assumptions asks, given with LTL, to decide if there is an agent strategy that solves P, and to return such a finite-state strategy (if one exists). Similar definitions apply to DPW planning under assumptions, etc.

It turns out that virtually all forms of planning (with linear-time temporally extended goals) in the literature are special cases of planning under LTL assumptions, i.e., the set of strategies that solve a given planning problem are exactly the set of strategies that solve the corresponding planning under assumptions problem. In the following, , and Exec is the LTL formula expressing that if an action is done then its precondition holds.

Example 4. 1. FOND planning with reachability goals (Rintanen 2004) corresponds to LTL planning under assumptions with and .

2. FOND planning with LTL (temporally extended) goals (Bacchus and Kabanza 2000; Pistore and Traverso 2001; Camacho et al. 2017). corresponds to LTL planning under assumptions with and goal .

3. FOND planning with LTL trajectory constraints and LTL (temporally extended) goals (Bonet and Geffner 2015; De Giacomo et al. 2016; Bonet et al. 2017) corresponds to LTL planning under assumptions with assumptions and goal .

4. Fair FOND planning with reachability goals (Daniele, Traverso, and Vardi 1999; Geffner and Bonet 2013; D’Ippolito, Rodr´ıguez, and Sardi˜na 2018) corresponds to planning under assumptions with and Exec .

5. Fair FOND planning with (temporally extended) goals as defined in (Patrizi, Lipovetzky, and Geffner 2013; Camacho et al. 2017) corresponds to planning under assumptions with and goal .

6. Obviously adding LTL trajectory constraints to fair FOND planning with (temporally extended) goals corresponds to planning under assumptions with and goal .

We also observe that the Fair FOND planning problems just mentioned can be captured by LTL planning under assumptions since can be written in LTL (see Example 3).

6 Translating between planning and synthesis

In this section we ask the question if there is a fundamental difference between synthesis and planning in our setting (i.e., assumptions and goals given as linear-time properties).

We answer by observing that there are translations between them. The next two results follow immediately from the definitions: Theorem 10 (Synthesis to Planning). Let be a synthesis under Assumptions problem, and let P = be the corresponding Planning under Assumptions problem where U is the universal domain. Then, for every agent strategy we have that solves P iff realizes assuming . Theorem 11 (Planning to Synthesis). Let D = be a domain and let be a Planning under Assumptions problem. Let be the corresponding Synthesis under Assumptions problem. Then, for every agent strategy we have that solves P iff realizes assuming .

Thus, we can solve LTL planning under assumptions by reducing to LTL synthesis under assumptions, which itself can be solved by known results (i.e., Theorem 2): Corollary 12. Solving LTL planning under assumptions is -complete.

However, this does not distinguish the complexity measured in the size of the domain from that in the size of the assumption and goal formulas. We take this up next.

7 Solving Planning under Assumptions

In this section we show how to solve Planning under Assumptions for concrete specification languages L, i.e., L = LTL and L = DPW. We measure the complexity in two different ways: we fix the domain D and measure the complexity with respect to the size of the formulas or automata for the environment assumption and the agent goal, this is called goal/assumption complexity; and we fix the formulas/automata and measure the complexity with respect to the size of the domain, this is called the domain complexity. 5 We begin with L = DPW and consider the following algorithm: Given in which is represented by a DPW and is represented by a DPW , perform the following steps:

Alg 1. Solving DPW planning under assumptions Given domain D, assumption , goal . 1: Form DPW equivalent to . 2: Form DPW M for . 3: Solve the parity game on M.

The first step results in a DPW whose size is exponential in the size of D and with a constant number of colors (Lemma 8). The second step results in a DPW whose size is polynomial in the number of states of the DPWs involved (i.e., and ), and exponential in the number of their colors (Lemma 1). For the third step, the think of the DPW M as a parity game: play starts in the initial state, and at each step, if q is the current state of M, first the environment picks and then the agent picks an action , i.e., an evaluation of the action variables. The subsequent step starts in the state of M resulting from taking the unique transition from q labeled . This produces a run of the DPW which the agent is trying to ensure is successful (i.e., the largest color occurring infinitely often is even).

Formally, we say that an agent strategy is winning if for every environment strategy , the unique run of the DPW on input word is successful. Deciding if the a player has a winning strategy, and returning a finite-state strategy (it one exists), is called solving the game. Parity games can be solved in time polynomial in the size of M and exponential in the number of colors of M (Gr¨adel, Thomas, and Wilke 2002).6

The analysis of the above algorithm shows the following.

Theorem 13. 1. The domain complexity of solving DPW planning under assumptions is in EXPTIME.

2. The goal/assumption complexity of solving DPW planning under assumptions is in PTIME in their sizes and EX- PTIME in the number of their colors. Moreover, by converting LTL formulas to DPW with ex-

ponentially many colors and double-exponential many states

(Vardi 1995; Piterman 2007), we get the upper bounds in the following:

Theorem 14. 1. The domain complexity of solving LTL planning under assumptions is EXPTIME-complete.

2. The goal/assumption complexity of solving LTL planning under assumptions is -complete. For the matching lower-bounds, we have that the domain

complexity is EXPTIME-hard follows from the fact that plan-

ning with reachability goals and no assumptions is EXP- TIME-hard (Rintanen 2004); to see that the goal/assumption complexity is -hard note that LTL synthesis, known to be -hard (Pnueli and Rosner 1989; Rosner 1992), is a special case (take and D to

be the universal domain). Similarly, one can apply this technique to solving Fair

LTL planning under assumptions. The exact complexity, however, is open. See the conclusion for a discussion.

8 Focusing on ﬁnite traces

In this section we revisit the definitions and results in case that assumptions and goals are expressed as linear-time properties over finite traces. There are two reasons to do this. First, in AI and CS applications executions of interest are often finite (De Giacomo and Vardi 2013). Second, the algorithms presented for the infinite-sequence case involve complex constructions on automata/games that are notoriously hard to optimize (Fogarty et al. 2013). Thus, we will not simply reduce the finite-trace case to the infinite-trace case (De Giacomo, Masellis, and Montali 2014). We begin by carefully defining the setting.

Synthesis and linear-time specifications over finite traces We define synthesis over finite traces in a similar way to the infinite-trace case, cf. (De Giacomo and Vardi 2015; Camacho, Bienvenu, and McIlraith 2018). The main difference is that agent strategies can be partial. This represents the situation that the agent stops the play. Environment strategies are total (as before). Thus, the resulting play may be finite, if the agent chooses to stop, as well as infinite.7 Objectives may be expressed in general specification formalisms Lf for fi-nite traces, e.g., Lf = LTLf (LTL over finite traces8), Lf = DFA (deterministic finite word automata). For , we overload notation and write for the set of finite traces defines.

We now define realizability in the finite-trace case:

Definition 10. Let .

1. We say that realizes (written ) if is finite and .

2. We say that realizes (written ) if is finite, then . The asymmetry in the definition results from the fact that

stopping is controlled by the agent. Duality still holds, and is easier to prove since it amounts

to determinacy of reachability games (Gr¨adel, Thomas, and Wilke 2002):

Lemma 15 (Duality). For every we have that is not agent realizable iff is environment realizable.

Linear temporal logic on finite traces (LTLf) The logic LTLf has the same syntax as LTL but is interpreted on finite traces . Formally, for (the length of ) we only reinterpret the temporal operators:

• (π, n) |= X ϕ iff n < lenand (π, n + 1) |= ϕ;

• iff there exists i with

such that and for all . Let denote the dual of X, i.e., . Semantically we have that

• (π, nϕ iff n < lenimplies (π, n + 1) |= ϕ.

Deterministic finite automata (DFA) A DFA over Var is a tuple which is like a DPW except that col is replaced by a set of final states. The run on a finite input trace is successful if it ends in a final state. We recall that DFA are closed under Boolean operations using classic algorithms (e.g., see (Vardi 1995)). Also, LTLf formulas (and also LDLf formulas) can be effectively translated into DFA. This is done in three classic simple steps that highlight the power of the automata-theoretic approach: convert to an alternating automaton (poly), then into a nondeterministic finite automaton (exp), and then into a DFA (exp). These steps are outlined in detail in, e.g., (De Giacomo and Vardi 2013).

Solving Synthesis over finite traces Lf agent synthesis is the problem, given , of deciding if the agent can realize . Now, solving DFA agent synthesis is PTIME-complete: it amounts to solving a reachability game on the given DFA M, which can be done with an algorithm that captures how close the agent is to a final state, i.e., a least-fixpoint of the operation. Finally, to solve LTLf agent synthesis first translate the LTLf formula to a DFA and then run the fix-point algorithm (also, LTLf agent synthesis is -complete) (De Giacomo and Vardi 2015).

Note that, by Duality, solving LTLf environment realizability and solving LTLf agent realizability are inter-reducible (and thus the former is also -complete). Thus, to decide if is environment realizable we simply negate the answer to whether is agent realizable. However, to extract an environment strategy, one solves the dual safety game.

Synthesis under assumptions We say that is an environment assumption if is environment realizable. Solving Lf synthesis under assumptions means to decide if there is an agent strategy such that

We now consider the case that Lf = LTLf. Checking if LTLf is an environment assumption is, by definition, the problem of deciding if is environment realizable, as just discussed. Hence we can state the following:

Theorem 16.

1. Deciding if an LTLf formula is an environment assumption is -complete.

2. Deciding if a DFA is an environment assumption is PTIME-compete (cf. (Gr¨adel, Thomas, and Wilke 2002)). Turning to LTLf synthesis under assumptions we have

that synthesis under assumptions and synthesis of the im-

plication are equivalent. Indeed, as before, the key point is the duality which we have in Lemma 15:

Theorem 17. Suppose is an environment assumption. The following are equivalent:

1. There is an agent strategy realizing .

2. There is an agent strategy realizing assuming . Hence to solve synthesis under assumptions we simply

solve agent synthesis for the implication. Hence we have:

Theorem 18.

1. Solving LTLf synthesis under assumptions is -complete.

2. Solving DFA synthesis under assumptions is PTIME-complete.

Planning under assumptions Planning and fair planning have recently been studied for LTLf goals (De Giacomo and Rubin 2018; Camacho, Bienvenu, and McIlraith 2018; Camacho et al. 2018). Here we define and study how to add environment assumptions.

Recall that we represent a planning domain D by the linear-time property (Definition 6) which itself was de-fined as those infinite traces satisfying two conditions. The exact same conditions determine a set of finite traces, also denoted . Moreover, this is equivalent to an LTLf formula of size linear in D and a DFA of size at most exponential in D. To see this, replace X by in the LTL formula from Lemma 7. That is, let be the LTLf formula formed from by replacing every term of the form by . Note that if then iff , and if then iff . The promised LTLfformula is . Also, similar to the DPW before there is a DFA of size at most exponential in the size of D equivalent to . To see this, take the DPW from Lemma 8 and instead of col define the set of final states to be the set .

As before, say that is an environment assumption for the domain D if is environment realizable. De-fine an Lf planning under assumptions problem to be a tuple with such that is an environment assumption for D. To decide if LTLf/DFA is an environment assumption for D we use the next algorithm:

Alg 2. Deciding if is an environment assumption for D Given domain D, and DFA . 1: Convert D into a DFA equivalent to . 2: Form the DFA M for . 3: Decide if M is environment realizable. Further, if is given as an LTLf formula, first convert it to a DFA and then run the algorithm. We then have: Theorem 19. 1. Deciding if LTLf formula is an environment assumption for the domain D is -complete. Moreover, it can be solved in EXPTIME in the size of D and in the size of .

2. Deciding if DFA is an environment assumption for the domain D is in EXPTIME. Moreover, it can be solved in EXPTIME in the size of D and PTIME in the size of .

Solving Planning under Assumptions As before, there are simple translations between Lf planning under assumptions and Lf synthesis under assumptions. And again, solving LTLf planning under assumptions via such a translation is not fine enough to analyze the complexity in the domain vs the goal/assumption. To solve DFA/LTLf planning under assumptions use the following simple algorithm:

Alg 3. Solving DFA planning under assumptions Given domain D, assumption , goal . 1: Convert D into a DFA equivalent to . 2: Form the DFA M for . 3: Solve the reachability game on DFA M. Further, if is given as an LTLf formula, first convert it to a DFA and then run the algorithm. This gives the upper bounds in the following: Theorem 20. 1. The domain complexity of solving DFA (resp. LTLf) planning under assumptions is EXPTIME-complete.

2. The goal/assumption complexity of solving DFA (resp. LTLf) planning under assumptions is PTIME-complete (resp. -complete).

For the lower bounds, setting results in FOND with reachability goals, known to be EXPTIME-hard (Rin- tanen 2004); and additionally taking the domain D to be the universal domain results in DFA (resp. LTLf) synthesis, known to be PTIME-hard (Gr¨adel, Thomas, and Wilke 2002) (resp. -hard (De Giacomo and Vardi 2015)).

Finally, if is an LTLf planning under assumptions problem, say that fairly solves P if for every we have that if then is finite and satisfies (here from Example 3 is defined so that it now also includes all finite traces). We remark that Alg 2 applies unchanged. However, to solve the fair LTLf planning problem, we do not know a better way, in general, than translating the problem into one over infinite traces and applying the techniques from the previous section.

9 Conclusion and Outlook

While we illustrate synthesis and planning under assumptions expressed in linear-time specifications, our definitions immediately apply to assumptions expressed in branchingtime specifications, e.g., -calculus, and tree automata. As future work, it is of great interest to study synthesis under assumptions in the branching time setting so as to devise restrictions on possible agent behaviors with certain guarantees, e.g., remain in an area from where the agent can enforce the ability to reach the recharging doc, whenever it needs to, in the spirit of (Dal Lago, Pistore, and Traverso 2002). Although our work is in the context of reasoning about actions and planning, we expect it can also provide insights to verification and to multi-agent systems. In particular, the undesirable drawback of the agent being able to falsify an assumption when synthesizing Goal is well known, and it has been observed that it can be overcome when the Assumption is environment realizable (D’Ippolito et al. 2013; Brenguier, Raskin, and Sankur 2017). Our Theorem 5 provides the principle for such a solution. Interestingly, various degrees of cooperation to fulfill assumptions among adversarial agents has been considered, e.g., (Chatterjee and Henzinger 2007; Bloem, Ehlers, and K¨onighofer 2015; Brenguier, Raskin, and Sankur 2017) and we believe that a work like present one is needed to establish similar principled foundations. Turning to the multi-agent setting, there, agents in a common environment interact with each other and may have their own objectives. Thus, it makes sense to model agents not as hostile to each other, but as rational, i.e., agents that act to achieve their own objectives. Rational synthesis (Kupferman, Perelli, and Vardi 2014) (as compared to classic synthesis) further requires that the strategy profile chosen by the agents is in equilibrium (various notions of equilibrium may be used). It would be interesting to investigate rational synthesis under environment assumptions, in the sense that all agents also make use of their own assumptions about their common environment. We believe that considering assumptions as sets of strategies rather than sets of traces will serve as a clarifying framework also for the multi-agent setting.

Finally, there are a number of open questions regarding the computational complexity of solving synthesis/planning under assumptions, i.e., what is the exact complexity of Fair LTL/LTLf planning under assumptions? what is the assumption complexity of LTL/LTLf synthesis under assumptions? Here, the assumption complexity is the complexity of the problem assuming the domain and goal are fixed, and the only input to the problem is the assumption formula/automaton.

References

[Bacchus and Kabanza 2000] Bacchus, F., and Kabanza, F. 2000. Using temporal logics to express search control knowledge for planning. Artif. Intell. 116(1-2):123–191.

[Bloem, Ehlers, and K¨onighofer 2015] Bloem, R.; Ehlers, R.; and K¨onighofer, R. 2015. Cooperative reactive synthesis. In Proc. of ATVA 2015. 394–410.

[Bonet and Geffner 2015] Bonet, B., and Geffner, H. 2015. Policies that generalize: Solving many planning problems with the same policy. In IJCAI, 2798–2804. AAAI Press.

[Bonet et al. 2017] Bonet, B.; De Giacomo, G.; Geffner, H.; and Rubin, S. 2017. Generalized planning: Nondeterministic abstractions and trajectory constraints. In IJCAI, 873–879.

[Brenguier, Raskin, and Sankur 2017] Brenguier, R.; Raskin, J.; and Sankur, O. 2017. Assume-admissible synthesis. Acta Inf. 54(1):41–83.

[Calude et al. 2017] Calude, C. S.; Jain, S.; Khoussainov, B.; Li, W.; and Stephan, F. 2017. Deciding parity games in quasipolynomial time. In STOC, 252–263. ACM.

[Camacho et al. 2017] Camacho, A.; Triantafillou, E.; Muise, C.; Baier, J. A.; and McIlraith, S. 2017. Nondeterministic planning with temporally extended goals: LTL over finite and infinite traces. In AAAI.

[Camacho et al. 2018] Camacho, A.; Baier, J. A.; Muise, C. J.; and McIlraith, S. A. 2018. Finite LTL synthesis as planning. In ICAPS, 29–38. AAAI Press.

[Camacho, Bienvenu, and McIlraith 2018] Camacho, A.; Bi- envenu, M.; and McIlraith, S. A. 2018. Finite LTL synthesis with environment assumptions and quality measures. In KR, 454–463. AAAI Press.

[Chatterjee and Henzinger 2007] Chatterjee, K., and Hen- zinger, T. A. 2007. Assume-guarantee synthesis. In Tools and Algorithms for the Construction and Analysis of Systems. 261–275.

[Cimatti et al. 2003] Cimatti, A.; Pistore, M.; Roveri, M.; and Traverso, P. 2003. Weak, strong, and strong cyclic planning via symbolic model checking. Artif. Intell. 1–2(147).

[Dal Lago, Pistore, and Traverso 2002] Dal Lago, U.; Pis- tore, M.; and Traverso, P. 2002. Planning with a language for extended goals. In AAAI, 447–454.

[Daniele, Traverso, and Vardi 1999] Daniele, M.; Traverso, P.; and Vardi, M. Y. 1999. Strong cyclic planning revisited. In ECP, 35–48.

[De Giacomo and Rubin 2018] De Giacomo, G., and Rubin, S. 2018. Automata-theoretic foundations of fond planning for LTL/LDLgoals. In IJCAI.

[De Giacomo and Vardi 2013] De Giacomo, G., and Vardi, M. Y. 2013. Linear temporal logic and linear dynamic logic on finite traces. In IJCAI.

[De Giacomo and Vardi 2015] De Giacomo, G., and Vardi, M. Y. 2015. Synthesis for LTL and LDL on finite traces. In IJCAI.

[De Giacomo et al. 2016] De Giacomo, G.; Murano, A.; Ru- bin, S.; and Stasio, A. D. 2016. Imperfect-information games and generalized planning. In IJCAI, 1037–1043.

[De Giacomo, Masellis, and Montali 2014] De Giacomo, G.; Masellis, R. D.; and Montali, M. 2014. Reasoning on ltl on finite traces: Insensitivity to infiniteness. In AAAI.

[D’Ippolito et al. 2013] D’Ippolito, N.; Braberman, V. A.; Piterman, N.; and Uchitel, S. 2013. Synthesizing nonanomalous event-based controllers for liveness goals. ACM Trans. Softw. Eng. Methodol. 22(1):9:1–9:36.

[D’Ippolito, Rodr´ıguez, and Sardi˜na 2018] D’Ippolito, N.; Rodr´ıguez, N.; and Sardi˜na, S. 2018. Fully observable non-deterministic planning as assumption-based reactive synthesis. J. Artif. Intell. Res. 61:593–621.

[Eisner and Fisman 2006] Eisner, C., and Fisman, D. 2006. A practical introduction to PSL. Springer.

[Finkbeiner 2016] Finkbeiner, B. 2016. Synthesis of reactive systems. Dependable Software Systems Eng. 45:72–98.

[Fogarty et al. 2013] Fogarty, S.; Kupferman, O.; Vardi, M. Y.; and Wilke, T. 2013. Profile trees for B¨uchi word automata, with application to determinization. In GandALF.

[Geffner and Bonet 2013] Geffner, H., and Bonet, B. 2013. A Coincise Introduction to Models and Methods for Automated Planning. Morgan & Claypool.

[Gerevini et al. 2009] Gerevini, A.; Haslum, P.; Long, D.; Saetti, A.; and Dimopoulos, Y. 2009. Deterministic planning in the fifth international planning competition: PDDL3 and experimental evaluation of the planners. Artif. Intell. 173(5-6):619–668.

[Ghallab, Nau, and Traverso 2004] Ghallab, M.; Nau, D. S.; and Traverso, P. 2004. Automated planning – Theory and Practice. Elsevier.

[Gr¨adel, Thomas, and Wilke 2002] Gr¨adel, E.; Thomas, W.; and Wilke, T., eds. 2002. Automata, Logics, and Infinite Games: A Guide to Current Research, LNCS 2500.

[Kupferman, Perelli, and Vardi 2014] Kupferman, O.; Perelli, G.; and Vardi, M. Y. 2014. Synthesis with rational environments. In Multi-Agent Systems. 219–235.

[Martin 1975] Martin, D. A. 1975. Borel determinacy. Annals of Mathematics 363–371.

[Patrizi, Lipovetzky, and Geffner 2013] Patrizi, F.; Lipovet- zky, N.; and Geffner, H. 2013. Fair LTL synthesis for non-deterministic systems using strong cyclic planners. In IJCAI, 2343–2349.

[Pistore and Traverso 2001] Pistore, M., and Traverso, P.

2001. Planning as model checking for extended goals in non-deterministic domains. In IJCAI.

[Piterman 2007] Piterman, N. 2007. From nondeterministic b¨uchi and streett automata to deterministic parity automata. Logical Methods in Computer Science 3(3).

[Pnueli and Rosner 1989] Pnueli, A., and Rosner, R. 1989. On the synthesis of a reactive module. In POPL.

[Pnueli and Zuck 1993] Pnueli, A., and Zuck, L. D. 1993. Probabilistic verification. Inf. Comput. 103(1):1–29.

[Pnueli 1983] Pnueli, A. 1983. On the extremely fair treat- ment of probabilistic algorithms. In STOC.

[Reiter 2001] Reiter, R. 2001. Knowledge in Action: Logical Foundations for Specifying and Implementing Dynamical Systems. The MIT Press.

[Rintanen 2004] Rintanen, J. 2004. Complexity of planning with partial observability. In ICAPS.

[Rosner 1992] Rosner, R. 1992. Modular synthesis of reactive systems. Ph.D. Dissertation, PhD thesis, Weizmann.

[Vardi 1995] Vardi, M. Y. 1995. An automata-theoretic ap- proach to linear temporal logic. In Moller, F., and Birtwistle, G. M., eds., Logics for Concurrency - Structure versus Automata, volume 1043 of LNCS, 238–266. Springer.

[Vardi 2011] Vardi, M. Y. 2011. The rise and fall of linear time logic. In GandALF.

designed for accessibility and to further open science