b

DiscoverSearch
About
My stuff
Incremental Monoidal Grammars
2020·arXiv
Abstract
Abstract

In this work we define formal grammars in terms of free monoidal categories, along with a functor from the category of formal grammars to the category of automata. Generalising from the Booleans to arbitrary semirings, we extend our construction to weighted formal grammars and weighted automata. This allows us to link the categorical viewpoint on natural language to the standard machine learning notion of probabilistic language model.

Recurrent neural networks and probabilistic language models [BDVJ03] have become standard tools in the natural language processing (NLP) community. These networks are inherently incremental, scanning through the sequence of words they are given and updating their prediction for what comes next. Despite their practical success on hard language tasks such as translation and question answering, the structure underlying these machine learning models is yet poorly understood, and they are generally only used as black boxes.

On the other hand, the categorical compositional distributional (DisCoCat) models of Coecke et al. [CCS08, CCS10, CGS13] use grammatical structure, explicitly encoded as string diagrams in a free monoidal category, to compute natural language semantics. DisCoCat models have received experimental support on small-scale tasks [GS11, KSP12, KSP13], but the extra mathematical structure makes them hard to scale to the billions of words used in the training of modern NLP models.

In this work, we aim to bridge this gap by constructing a functor which sends formal grammars (encoded as monoidal categories) to a state automaton that parses the grammar in an incremental way, reading one word at a time and updating the set of possible parsings. In section 1 we introduce some core definitions and preliminaries. In section 2 we introduce monoidal grammars and S-monoidal grammars for an arbitrary semiring S. In section 3 we define the functor I, from the category of S-monoidal grammars to coalgebras of the weighted automata functor  W(X) = S × (SX)V . Insection 4, we describe an algorithm for learning  R≥0-monoidal grammars from a probabilistic language model. This paper represents a work in progress, and we conclude with a discussion of future work.

1.1 Formal Grammars

Given a pair of sets X and Y , we write  X × Y and X + Yfor their Cartesian product and disjoint union respectively. We use the Kleene star  X∗ = �n∈N Xnto denote the free monoid, with concatenation as product and the empty sequence  ϵ ∈ X0 as unit.Similarly, we write  X+ = �n>0 Xnfor the free semigroup (i.e. non-empty sequences).

If we fix a finite set of words V called the vocabulary, then a the language of a formal grammar over this set of words is a subset  L ⊆ V ⋆: the set of sequences of words that form grammatical sentences accoridng to the rules of the grammar. A wellknown class of formal grammars is context-free grammars, introduced by Chomsky as the first level of his complexity-theoretic hierarchy [Cho56, Cho57, Cho65].

Definition 1.1. A context-free grammar is given by a 4-tuple (V, X, R, s) where:

V is a finite set called the vocabulary (or terminal symbols),

X is a finite set of syntactic variables (or nonterminal symbols),

 R ⊆ X × (V + X)∗ is a finite set ofproduction rules,

 s ∈ Xis a syntactic variable called the start symbol.

We define the language  L(G) ⊆ V ⋆ of a context-free grammar G = (V, X, R, s) as follows: We first construct a binary relation (→R) ⊆ (V + X)∗ × (V + X)∗ wherefor any pair of strings  s, t ∈ (V + X)∗ we have s →R tif and only if there is some production rule (x, v) ∈ Rand a pair  u, w ∈ (V + X)∗ such that s = uxw andt = uvw. We then compute its reflexive transitive closure  →∗R and define L(G) ={ u ∈ V ∗ | s →∗R u }. Note that this corresponds precisely to the preordered monoid generated by the production rules R, i.e. with generators V + X and relations  x ≤ ufor each (x, u) ∈ R.

Definition 1.2. A preordered monoid is a monoid X equipped with a reflexive transitive relation  ≤ such that v ≤ v′ implies uvw ≤ uv′w for all u, v, v′, w ∈ X.

If we generalise this to arbitrary preordered monoids, i.e. to grammars with production rules that have arbitrary sources R (and not just the ones with an R with restrictions as defined above), we obtain abstract rewriting systems, also known as semi-Thue systems [Thu14, Pow13]. If we forbid rules with empty source or target we obtain the word-problem for semigroups, shown to be undecidable independently by Markov [Mar47] and Post [Pos47]. Both are equivalent in computational power to the last level of Chomsky’s hierarchy: unrestricted grammars (which are thus undecidable).

Definition 1.3. An unrestricted grammar is a tuple G = (V, X, R, s) where V and X are finite sets called the terminals and non-terminals with  R ⊆ (V + X)+ × (V + X)∗and  s ∈ X. We define L(G) = { u ∈ V ∗ | s ≤ u }the language of G, where the order is given by the preordered monoid generated by R.

Empirically, the complexity of human languages has been shown to lie somewhere in between that of context-free and unrestricted grammars [Shi87], see the line of work on midly context-sensitive grammars [Jos85, Wei88, VW94].

We cannot infer the underlying structure of human languages from complexity-theoretic arguments: indeed, the same class of languages can be generated by nonisomorphic grammar formalisms. As our main example, Lambek’s pregroup grammars [Lam99, Lam01, Lam08] have the same expressive power as context-free grammars [BM08], i.e. there are back-and-forth translations which preserve the generated languages. However, pregroup grammars and context-free grammars are only weakly equivalent: these translations do not give an isomorphism.

Definition 1.4. A pregroup1 is a preordered monoid P equipped with a pair of functions (−)l, (−)r : P → P such that tlt ≤ ϵ ≤ ttl and ttr ≤ ϵ ≤ trt for all t ∈ P.

Definition 1.5. A pregroup grammar is a tuple G = (V, B, D, s) where V, B and D ⊆ V ×PBare finite sets and s is an element of B, for B a set of basic grammatical types and  PBthe free pregroup generated by B.

In the above, V is the Vocabulary, B a set of basic grammatical types such as {n, s}, for n the type of a noun phrase and s that of a sentence, and D a relation known as the type dictionary or the lexicon. The language L(G) of a pregroup grammar G = (V, B, D, s) is the set of all strings  w1w2 · · · wn ∈ V ∗ such that t1 . . . tn ≤ s foreach (wi, ti) ∈ D.

1.2 Monoidal Categories

We assume familiarity with monoidal categories, see e.g. [Awo06] for an introduction. In order to fix some notation, we define monoidal signatures and presentations.

Definition 1.6. A monoidal signature is a tuple Σ = (Σ0, Σ1, dom, cod) where Σ0and Σ1are sets of generating objects and arrows respectively, and dom, cod : Σ1 → Σ∗0are pairs of functions called domain and codomain.

A homomorphism of monoidal signatures  h : Σ → Σ′ is given by a pair of functions h0 : Σ0 → Σ′0 and h1 : Σ1 → Σ′1 such that the following diagram commutes:

image

We write MonSig for the category of monoidal signatures and their homomorphisms. The free monoidal category  CΣgenerated by a monoidal signature Σ is the image of the left adjoint to the forgetful functor from MonCat to MonSig. Explicitly, the arrows  f : x → y in CΣare given by (planar progressive) string diagrams with arrows in Σ1as nodes and  x, y ∈ Σ⋆0 as input/output, see [JS88, JS91, Sel10]. In order to impose equations on diagrams, we need some extra data: a presentation.

Definition 1.7. A presentation for a monoidal category is given by a monoidal signature Σ and a set of relations  R ⊆ �u,t∈Σ∗0 CΣ(u, t) × CΣ(u, t)between parallel arrows of the free monoidal category.

The monoidal category generated by (Σ, R) is the quotient  CΣ/R, see [Lan98][II.8].

1.3 Coalgebras

Coalgebras are a useful framework for categorically modeling dynamical systems. In functional programming terminology, some functor F defines a type signature for a mapping on Set, and an F-coalgebra is a particular implementation of this signature.

Definition 1.8. For an endofunctor  F : Set → Set, an F-coalgebra is a pair (X, g),where X is a set and  g : X → FXis a function that defines structure over X.

Example 1.8.1. For X a set of states and F the identity functor, an F-coalgebra (X, g) maps states to states, computing a function over states such as permutations.

Example 1.8.2. The powerset functor F(X) = P(X) maps a state to a set of states, computing the accessibility relation of a Kripke structure or the transitions of an automata.

Example 1.8.3. The functor  D(X) = B × XV maps states in X to the product of {false, true} and the set of functions from some set V back to X. This signature is the curried form of a deterministic automata transition function: each state in X is tagged with a boolean value indicating whether or not it is an accept state, and each state defines a function from a vocabulary V to the next state.

Our goal in this section will be to formally describe a characterization of formal grammars in terms of monoidal categories. We will begin with the following definition:

Definition 2.1. A monoidal grammar is a tuple  G = (Σ, R, s) where Σ, R is thepresentation of a monoidal category  CG with V ∪ {s} ∈ Σ0. We define L(G) ={ u ∈ V ∗ | ∃ f ∈ CG(u, s) }.

Given two monoidal grammars  G and G′ over a common vocabulary V , we will define the morphisms between them to be the monoidal functors  F : CG → CG′ suchthat the following diagram commutes, where  inc and inc′ are the inclusion maps from V ∗ ∪ {s}into the objects of  CG and CG′respectively:

image

Monoidal grammars and their morphisms trivially form a subcategory of (V + { s })∗/MonCat, which we will denote Grammar.

We can characterize the formal grammars that we described earlier as monoidal grammars. For example:

Proposition 2.1. A pregroup grammar G = (V, B, D, s) is a monoidal grammar with Σ0 = V + B × Z, Σ1 = D + �x∈B×Z { cupx, capx }2and the relations R given by the snake equations, see [Sel10][4.1].

Proposition 2.2. A context-free grammar  G = (V, X, RG, s)is a monoidal grammar with Σ0 = V + X, Σ1 = {u → x| (x, u) ∈ RG}, and R = ∅.

We can now express the languages of these constructions in a unified manner:

Definition 2.2. For w1w2...wn ∈ V ∗ we call an arrow  w1w2...wn → r where r ∈Obj(CG) aparse state of  w1w2...wn. Whenever in a parse state we have r = s, we call that parse state a parsing or a derivation.

Definition 2.3. The language of a monoidal grammar is the set of all strings  w1 ⊗w2...wnthat have a parsing.

We can think of the string diagrams of parse states and parsings as akin to partially constructed and fully constructed syntax trees respectively.

Example 2.3.1. Let G = (V, B, D, s) be a pregroup grammar with V = { Alice, loves, Bob }, B = { s, n } and D = { (Alice, n), (loves, nrsnl), (Bob, n) }. Then the following string diagram represents the arrow in  CG from Alice ⊗ loves ⊗ Bob to s:

image

Example 2.3.2. Let G = (V, X, R, s) be a context-free grammar with V and X defined as:

image

for s the non-terminal representing a sentence, np the non-terminal representing a noun phrase, similarly, adj an adjective, itv an intransitive verb, tv a transitive verb, and vp a verb phrase. Where the production rules are:

image

Then the following string diagrams represent the arrows in  CG from Complex ⊗

image

2.1 S-Monoidal Grammars

We can generalize monoidal grammars to associate weights (such as likelihoods or probabilities) with each parse state.

Definition 2.4. For some semiring S, an S-monoidal signature is a monoidal signature Σ = (Σ0, Σ1, dom, cod)equipped with a mapping  r : Σ1 → S.

A homomorphism of S-monoidal signatures  h : (Σ, r) → (Σ′, r′) is given by a monoidal signature homomorphism  h0, h1such that the following diagram commutes:

image

Definition 2.5. An S-monoidal grammar (G, r) is a tuple , R, s, r) such that (Σ, r)is an S-monoidal signature and  G = (Σ, R, s)is a monoidal grammar.

We can extend r into a function from  Ar(CG) → Sby defining  r(g◦f) = r(g)∗r(f)and  r(f1 ⊗ f2) = r(f1) ∗ r(f2). If we do this, then r defines a functor from  CG to thetrivial monoidal category where arrows are the elements of S and both composition and tensor product are semiring multiplication.

We will refer to the category of S-monoidal grammars over a common vocabulary as  GrammarS. The morphisms in this category are monoidal functors  F : CG → CG′such that the following diagram commutes:

image

Let us note that we can characterize any monoidal grammar G as the B-monoidal grammar (G, r) where B is the semiring of Booleans and r is the constant functor that maps all arrows in  CG to true in B.

In an S-monoidal grammar, we utilize the categorical composition to model the sequential nature of the parsing process and we use the tensor-product to model both the adjacency of words/types in a sentence and the parallel application of processing steps to different parts of a sentence. In this section we will explore how we can study the behavior of the tensor-product as a dynamic construct.

We can define an action of the free monoid  V ∗ on the category of endofunctors of CGto model the process of adding a “new” word to our system. For any word  w ∈ V ,the endofunctor  Wwmaps the object  o to o⊗wand the arrow  a to a⊗idw. In order to model the interpretation of this new word in context, we look at the interaction of this endofunctor with the arrows in  CGby defining a mapping  W ∗w(a) : Ar(CG) → SAr(CG)as follows.

image

Intuitively,  W ∗w(a)(a′) is the generalized “likelihood” of the parse state  a′ ◦ Ww(a),given  Ww(a). Naturally, parse states that cannot be expressed as  a′ ◦ Ww(a) have alikelihood of 0 given  Ww(a).

If we define a “listener” to be an entity that assigns the parse state a to the string of words  w1w2...wn, then W ∗w defines how a new word may update a listener’s parse state. We can represent this process with an S-weighted automaton. The states in the machine are parse states  a : w1w2...wn → o, and the transition function is  W ∗w.S-weighted automaton are also equipped with a function  r0that maps states to S. For example, in the  B case r0picks out the “accept states” of the automaton. For

some S-monoidal grammar (CG, r) we will define this function  r0as follows:

image

In order to formalize this relationship, we will use the theory of F-coalgebras on Set [Rut96]. An S-weighted automaton is a coalgebra of the functor W(X) = S×(SX)V and by [SBBR13] a given transition function will uniquely define coalgebras of W up to bisimulation, so we simply need to define a mapping  I between GrammarSand coalgebras of W.

Definition 3.1. The Incremental Functor I is a map on the objects and arrows of GrammarSthat acts as follows:

For some object (CG, r) in GrammarS, we define I(CG, r)to be the coalgebra (Ar(CG), r0 × W ∗w).

For some morphism  F between (CG, r0) and (C′G, r′0) in GrammarS, we defineI(F) to be the map  hF : Ar(CG) → Ar(C′G) that uses F’s mapping on arrows.

Lemma 3.1. The map I is a functor from  GrammarSto the category of coalgebras of W.

Proof. Since I(F) uses the functor  F ′smapping on arrows, I trivially preserves identity and composition. All that remains is to prove that  I(F) = hFis indeed a coalgebra homomorphism. To do this we need to show that the following diagram commutes:

image

When we apply the function  S × (ShF )V to r0 × W ∗w we get the following function:

image

Here  r′0(hF (a)) = r0(a) because F and rcommute in  GrammarS. When we use the fact that  hFpreserves composition and tensor product we can rewrite  X′ as:

image

This makes it clear that:

image

Therefore, we can conclude that:

image

So  hFis indeed a coalgebra homomorphism.

image

The incremental functor reveals an interesting duality. Given a monoidal category that describes how laterally composed (tensor product) objects are processed vertically (by composition of arrows), we get for free an automaton that defines a laterally evolving process upon vertically composed arrows. This automaton models how the process that the monoidal category describes will behave in the face of new information, and it reveals the monoidal category’s underlying incremental structure. This suggests the following corollary:

Corollary 3.1.1. If the S-monoidal grammars (CG, r) and (C′G, r′)have functors between them, then the coalgebras  I(CG, r) and I(C′G, r′)are bisimulatable.

Proof. If (CG, r) and (C′G, r′) have functors between them, then  I(CG, r) and I(C′G, r′)have coalgebra homomorphisms between them. By [Lev], this implies that  I(CG, r)and  I(C′G, r′) are bisimulatable.

For some  R≥0-monoidal grammar (G, r), we can think of r as assigning “confidence values” or “likelihoods” to the arrows in G. This allows us to model the relative probability of different interpretations of the same string of words and incorporate probabilistic and distributional language models into the parsing process. Let us explore what this means.

4.1 Maximum Parse State Likelihood

Definition 4.1. A language model over the monoidal grammar G is the following family of probability distributions:

For any prefix  w1 ⊗ ...wn, a probability distribution with finite support over the possible completions  wn+1⊗...wmsuch that there is a parsing  w1⊗...wn⊗wn+1⊗...wm → s. We will denote this as  Pr(wn+1 ⊗ ...wm|w1 ⊗ ...wn).

For any sentence  w1 ⊗ ...wmwith at least one parsing  a : w1 ⊗ ...wm → s, aprobability distribution with finite support over the set of all such parsings. We will denote this as  Pr(a|w1 ⊗ ...wm).

We would like to develop a mapping from language models over a monoidal grammars  G to R≥0-monoidal grammars (G, r) such that r(p) is indicative of the probability that a particular parse state is “optimal” according to that language model. We can formalize this notion with the following concepts:

Definition 4.2. For some prefix  w1 ⊗ ...wnequipped with a parse state  p : w1 ⊗...wn → oand sentence  w1 ⊗ ...wn ⊗ wn+1 ⊗ ...wmequipped with a parsing  a : w1 ⊗...wn ⊗ wn+1 ⊗ ...wm → s, we say that a is p-compliant if there exist some arrows ah : wn+1 ⊗ ...wm → o′, av : o ⊗ o′ → s such that a = av ◦ (p ⊗ ah).

Definition 4.3. Given some prefix  w1 ⊗...wnand sentence  w1 ⊗...wn ⊗wn+1 ⊗...wmsuch that there exists a parsing  a : w1 ⊗ ...wn ⊗ wn+1 ⊗ ...wm → s, we will define a maximal parse state for  w1 ⊗ ...wn on ato be a parse state  p : w1 ⊗ ...wn → o suchthat a is p-compliant and there exists no  c such that c ◦ pis not isomorphic to p and a is (c ◦ p)-compliant.

Intuitively, the parsing a is p-compliant if the parse state p factors a, and p is a maximal parse state for a if p cannot be developed any farther while still maintaining the possibility to extend it into a. We can use these constructs to define the following quantity:

Definition 4.4. Given a parse state p : w1 ⊗ ...wn → o, itsmaximal parse state likelihood  prM(p) is:

image

Where  Apis the set of all parsings a such that p is a maximal parse state for  w1⊗...wnon a.

Intuitively, given a prefix  w1 ⊗ ...wn, a parse state  p : w1 ⊗ ...wn → o, and alanguage model over the possible fully parsed sentences,  prM(p) is the probability that p will be the maximal parse state for the completed sentence.

Now say we want to construct an  R≥0-monoidal grammar (G, r) such that for any parse state, r(p) is as close as possible to  prM(p). By the definition of S-monoidal grammars, r is defined by the values of r(g) for all generating arrows  g ∈ Σ1. Forexample, for any parse state p in G, we have that:

image

Here  Bpis any bag of generating arrows in Σ1 that form pwhen combined by composition and tensor product.

Therefore, our objective is to find r to minimize the following expression, where PGis the set of all parse states in G:

image

Because log is a monotonic function, we can rewrite this as the following, where  Bp(g)is the number of times that g appears in  Bp:

image

This is a standard linear approximation of the coefficients  log(r(g)). If PG is finite,we can use the normal equations or gradient descent to learn the values of r(g). If not, we can use stochastic gradient descent [Bis06].

We will close with some remarks on future directions.

In section 4 we describe a simple algorithm for constructing an  R≥0-monoidalgrammar from a language model over a monoidal grammar. However, since we do not describe a method for constructing such language model from a dataset and choice of monoidal grammar, it is not immediately clear how we can implement this algorithm in practice. Such a method would enable us to efficiently implement our algorithm on top of any linguistic parsing library for CFGs, pregroups, or other monoidal grammars.

In section 3 we introduce a functor  I from GrammarSto the coalgebras of the functor  W = S × (SX)V . A natural question is whether there are functors in the reverse direction as well. In particular, we are curious as to whether I has left/right adjoints. If such functors exist, this would be further evidence of a fundamental connection between dynamical systems and monoidal categories. Such a discovery could help us unite these fields of study, as well as enable us to reduce questions about automata into questions about monoidal categories.

Our framework is fully syntax-based and is currently agnostic to semantic meaning. In [CSC10], the authors define a functorial framework for affiliating syntactic structure with vector space semantics, and in [SPHK18] the authors project vector space semantics onto dynamic syntax trees in order to model incrementality. This raises the question of how we can best incorporate semantic information into our framework. For example, if we extend the functor from [CSC10] to operate over monoidal grammars in general, it would be interesting to see how the interaction between this functor and I compares to the construction in [SPHK18].

[Awo06] Steve Awodey. Category Theory. Ebsco Publishing, May 2006.

[BDVJ03] Yoshua Bengio, R´ejean Ducharme, Pascal Vincent, and Christian Janvin. A Neural Prob- abilistic Language Model. J. Mach. Learn. Res., 3:1137–1155, 2003.

[Bis06] Christopher M. Bishop. Pattern Recognition and Machine Learning (Information Science and Statistics). Springer-Verlag, Berlin, Heidelberg, 2006.

[BM08] Wojciech Buszkowski and Katarzyna Moroz. Pregroup Grammars and Context-free Grammars. Computational Algebraic Approaches to Natural Language, Polimetrica, 121, 2008.

[CCS08] Stephen Clark, Bob Coecke, and Mehrnoosh Sadrzadeh. A Compositional Distributional Model of Meaning. In Proceedings of the Second Symposium on Quantum Interaction (QI-2008), pages 133–140, 2008.

[CCS10] Stephen Clark, Bob Coecke, and Mehrnoosh Sadrzadeh. Mathematical foundations for a compositional distributional model of meaning. In J. van Benthem, M. Moortgat, and W. Buszkowski, editors, A Festschrift for Jim Lambek, volume 36 of Linguistic Analysis, pages 345–384. 2010.

[CGS13] B. Coecke, E. Grefenstette, and M. Sadrzadeh. Lambek vs. Lambek: Functorial Vector Space Semantics and String Diagrams for Lambek Calculus. ArXiv e-prints, 2013.

[Cho56] Noam Chomsky. Three models for the description of language. IRE Transactions on Information Theory, 2(3):113–124, September 1956.

[Cho57] Noam Chomsky. Syntactic Structures. Mouton and Co., The Hague, 1957.

[Cho65] Noam Chomsky. Aspects of the Theory of Syntax. MIT Press, 1965.

[CSC10] Bob Coecke, Mehrnoosh Sadrzadeh, and Stephen Clark. Mathematical foundations for a compositional distributional model of meaning. CoRR, abs/1003.4394, 2010.

[GS11] Edward Grefenstette and Mehrnoosh Sadrzadeh. Experimental Support for a Categorical Compositional Distributional Model of Meaning. In The 2014 Conference on Empirical Methods on Natural Language Processing., pages 1394–1404, 2011.

[Jos85] Aravind K. Joshi. How much context sensitivity is required to provide a reasonable structural description. 1985.

[JS88] Andr´e Joyal and Ross Street. Planar diagrams and tensor algebra. Unpublished manuscript, available from Ross Street’s website, 1988.

[JS91] Andr´e Joyal and Ross Street. The geometry of tensor calculus, I. Advances in Mathematics, 88(1):55–112, July 1991.

[KSP12] Dimitri Kartsaklis, Mehrnoosh Sadrzadeh, and Stephen G. Pulman. A Unified Sentence Space for Categorical Distributional-Compositional Semantics: Theory and Experiments. In COLING, 2012.

[KSP13] Dimitri Kartsaklis, Mehrnoosh Sadrzadeh, and Stephen Pulman. Separating Disambiguation from Composition in Distributional Semantics. page 10, 2013.

[Lam99] Joachim Lambek. Type Grammar Revisited. In Alain Lecomte, Fran¸cois Lamarche, and Guy Perrier, editors, Logical Aspects of Computational Linguistics, pages 1–27, Berlin, Heidelberg, 1999. Springer Berlin Heidelberg.

[Lam01] Joachim Lambek. Type Grammars as Pregroups. Grammars, 4:21–39, 2001.

[Lam08] Joachim Lambek. From Word to Sentence: A Computational Algebraic Approach to Grammar. Open Access Publications. Polimetrica, 2008.

[Lan98] S.M. Lane. Categories for the Working Mathematician. Graduate Texts in Mathematics. Springer New York, 1998.

[Lev] P. Levy. Coalgebra: Basic concepts.

[Mar47] A Markov. On certain insoluble problems concerning matrices. In Doklady Akad. Nauk SSSR, volume 57, pages 539–542, 1947.

[Pos47] Emil L. Post. Recursive Unsolvability of a problem of Thue. Journal of Symbolic Logic, 12(1):1–11, March 1947.

[Pow13] James F. Power. Thue’s 1914 paper: A translation. arXiv:1308.5858 [cs], August 2013.

[Rut96] J. J.M.M. Rutten. Universal coalgebra: A theory of systems. Technical report, CWI (Centre for Mathematics and Computer Science), Amsterdam, The Netherlands, The Netherlands, 1996.

[SBBR13] Alexandra Silva, Filippo Bonchi, Marcello Bonsangue, and Jan Rutten. Generalizing de- terminization from automata to coalgebras. Logical Methods in Computer Science, 9(1):9, March 2013.

[Sel10] P. Selinger. A Survey of Graphical Languages for Monoidal Categories. New Structures for Physics, pages 289–355, 2010.

[Shi87] Stuart M. Shieber. Evidence Against the Context-Freeness of Natural Language. In Walter J. Savitch, Emmon Bach, William Marsh, and Gila Safran-Naveh, editors, The Formal Complexity of Natural Language, Studies in Linguistics and Philosophy, pages 320–334. Springer Netherlands, Dordrecht, 1987.

[SPHK18] Mehrnoosh Sadrzadeh, Matthew Purver, Julian Hough, and Ruth Kempson. Exploring Semantic Incrementality with Dynamic Syntax and Vector Space Semantics. arXiv:1811.00614 [cs], November 2018.

[Thu14] Axel Thue. Probleme ¨Uber Ver¨anderungen von Zeichenreihen Nach Gegebenen Regeln. na, 1914.

[VW94] K. Vijay-Shanker and D. J. Weir. The equivalence of four extensions of context-free grammars. Mathematical Systems Theory, 27(6):511–546, November 1994.

[Wei88] David Jeremy Weir. Characterizing Mildly Context-Sensitive Grammar Formalisms. PhD thesis, University of Pennsylvania, Philadelphia, PA, USA, 1988. AAI8908403.


Designed for Accessibility and to further Open Science