VC-dimensions of nondeterministic finite automata for words of equal length

2020·Arxiv

Abstract

Abstract

Let NFA) denote the set of languages accepted by nondeterministic finite automata with q states over an alphabet with denote the set of words of length n. We give a quadratic lower bound on the VC dimension of

Next, the work of Gruber and Holzer (2007) gives an upper bound for the nondeterministic state complexity of finite languages contained in which we strengthen using our methods.

Finally, we give some theoretical and experimental results on the dependence on n of the VC dimension and testing dimension of NFA

Keywords: Vapnik-Chervonenkis dimension, testing dimension, finite automata, nondeterminism, state complexity.

1 Introduction

In this article we shall improve some results on the nondeterministic state complexity of finite languages, and investigate the VC dimension associated to fixed numbers of states for NFAs. Our methods build on work from the last three decades: by Ishigami and Tani [6] for VC dimension of DFAs; Gruber and Holzer [2] for nondeterministic state complexity; and Shallit and Wang [11] and Hyde and Kjos-Hanssen [3] for automatic complexity and languages consisting of words of the same length.

1.1 Dimension

The Vapnik–Chervonenkis dimension is an important tool in machine learning. We fix notation and give the definition:

Definition 1. Let H be a set family (a set of sets) and C a set. Their intersection is the set family := . A set C is shattered by H if contains all the subsets of C, i.e.: = 2. The VC dimension of H is dim) = max{|C| : C is shattered by H}.

The less-known testing dimension was introduced by Kathleen Romanik at COLT’92 [8, 9] and is the result of replacing an “” by a “”:

Definition 2. Let Q be a concept class defined over a set X, i.e., a family of subsets of X. The testing dimension of Q is

Unlike VC dimension, testing dimension on its face depends on X as well as Q. We typically have X = , but note Theorem 7 below.

A visualization of the relationship between dimand dimover for n = 2 and n = 3 can be found in Figures 4 and 5. There, we make use of Theorems 3 and 4.

Theorem 3 (Sauer–Shelah Lemma [10, 12]). There is no family of subsets of U, u = |U|, of cardinality with VC dimension .

Theorem 4. Let c and u be nonnegative integers, and let U be a set of cardinality u. The following are equivalent:

1. For each set family H over dim.

2. c ≥ 2− 2

Proof. (2) =(1): Assume (2), let H be given, and let with |C| = m. We must show that there is some such that . The number of sets H with is 2. By (2), , hence the complement of H satisfies = 2. Hence , as desired.

(1) =(2): Suppose . Let with |C| = m. Let . Then |H| = 2, and C and D witness that T(H) < m.

Remark 5. By Theorem 4, letting x = (c + 1)2+ 2= 1 2+ 2we obtain a lower bound used in Figures 4 and 5. A basic and obvious upper bound on dimused there is: dimlog.

1.2 Automata

For a nonnegative integer k, we let [k] = . Thus [2]= is the set of all finite binary words.

Definition 6. Let nfa) (nfa)) denote the class of all nondeterministic finite automata with q states and 1 accept state (arbitrary number of accept states) over an alphabet of cardinality b. Let Σ be a set. Then nfa) is the set of all NFAs with q states over the alphabet Σ.

The language accepted by the automaton M is L(M). Let

NFA) = nfa, and similarly define NFA) and NFA).

For example, NFAis a set of “slices” of languages accepted by q-state NFAs, where .

Theorem 7. Let S = NFA

Proof. S shatters : for any n, but does not shatter {1}.

2 Main results

Gruber and Holzer [1, 2] gave the following construction. Intuitively, we have states indicating that we have seen the symbols in w, and states indicating that the symbols in w remain to be seen.

Definition 8. Let denote the empty word. Let or . Let 1)and 1). We construct a nondeterministic finite automaton A = (), where (disjoint union) with and and and , and the set of final states . (Note that 1 2, and |F| = 2 .) The transition function is specified as follows:

1. For all and , if then the set ) contains the element .

2. For all , if w = xay is the unique decomposition where |x| = 1)is a single letter, and 1), then let ) contain the element .

3. For all and , the set contains the element .

As a first approximation to the construction we will employ in Theorem 14, let us restate Definition 8 for the case with n an odd number.

Definition 9. Let , where n = 2m + 1 is odd. We construct a nondeterministic finite automaton A = (), where (disjoint union) with and and and , and . The transition function is specified as follows:

1. For all and , if |w| < m then the set ) contains the element .

2. For all , if w = xay is the unique decomposition where |x| = |y| = m and a is a single letter, then let ) contain the element .

3. For all and , the set ) contains the element .

If x and y are words and x is a prefix of y, so that y = xz for some word z, then we write . We denote the reversal of x by . If x is a suffix of y then consequently we may write .

Our strategy for obtaining lower bounds for VC dimension in Theorem 14 will be to remove some states , and then to also remove any state where all with are removed. The counting of the states removed this way will turn out to hinge on Definition 10.

Definition 10. The function a is defined by a(0) = 0 and for 0, a(n+1) = a(n) + 1 + t where n + 1 = (2k + 1)2for some k.

Lemma 11. a(n) = 2) where w(n) is the binary Hamming weight of n.

Proof. By induction. Base case: For n = 0, a(0) = 0 = 0 0 = 2 0), as desired.

Induction step: Assume a(n) = 2). By definition, a(n + 1) = a(n) + 1 + t where n + 1 = (2k + 1)2for some k. The Hamming weight of

compared to that of

is 1 smaller, i.e.,

So then

a(n + 1) = a(n) +1 + t (by definition)

log+ 1)for all n.

Proof. Let w(n) be the binary Hamming weight of n. Note that 1) = s for each integer 0 and . In particular, letting log+1), we have n < n + 1 , and hence 1) = log+ 1)for each 0. By Lemma 11 we are done.

We denote the binary numerical value of a word , where each , by int). For example,

For any integer 0 , we write bin) for the word w of length n with int. For example, bin(3) = 0011.

Lemma 13. Let : int,

and ) = Then = a from Definition 10.

Proof. Since and each [2]has some with , we have

For each removed word in we remove it from , and additionally, for the (2k + 1)2th removed word from we remove t more words from . For instance, when the 4th word is removed from we remove 3 words from . Thus, +1) = )+1+t where n+1 = (2k+1)2for some k as desired.

For functions f(q), g(q) we write if for all 0, (1 + ) for all large enough q.

Theorem 14 is our main theorem, and uses a modification inspired by Hyde and Kjos-Hanssen [3] of the Gruber–Holzer construction to get information about VC dimension over .

Theorem 14. Let 0 and 1 be integers. Then

Proof. Let , so that + 1}. Let r be the least integer such that 2. Let ˜q = 2. Let int. We claim that we can shatter the following sets S using NFA):

Thus, let . We define an automaton that accepts every words in and rejects every words in . To do so, it suffices to

• construct an automaton M with ˜q states and (hence ) and then

• extend M to an automaton with q states and L(M) = ), by adding inaccessible states.

and contains some states that are identified with (equal to) states of the form as follows.

• If n is odd then for each w with and .

• If n is even, let for each w with 1.

The set of final states is

is the number of words such that all its

extensions to length m are in E. So

Case 1: n is odd. The lower bound on VC dimension witnessed by M is then

Asymptotically the VC dimension achievable d for n = 2m + 1 length satisfies

Theorem 14 can be compared with a result from the literature:

Theorem 15 (Ishigami and Tani ALT’93 [5, Theorem 4.1]). Let Σ be a finite alphabet, , and let 1 be an integer. Then

Remark 16. The construction in Theorem 14 is sharp for the number of states in the case q = 3, n = 3. Consider the set {000, 011, 110, 111}. There is a 3-state solution produced by our construction:

The nondeterministic state complexity of a language L is denoted nsc(L).

Remark 17. Gruber and Holzer [2, Lemma 12] claimed that for all , nsc(. The proof in [2] contains the following mistake. If 1)and 1)then they claimed that

A correct estimate would be

We state Theorem 18 in a corrected form.

Theorem 18 (Gruber and Holzer [2, Lemma 12])

Theorem 19. Assume . Then nsc(.

Proof. Let m be such that n = 2m+1 or n = 2m, i.e., . The number of states required in Theorem 14 with r = 0 is

3 Increasing the word length

Table 1 contains some results of exhaustive search by computer as well as some consequences of our theoretical results. Looking at the (incomplete) table suggests the following.

Conjecture 20. For all integers 0 and 1,

Informally, Conjecture 20 says that automata with a fixed number of states can shatter sets of long words just as well as they can shatter sets of short words. In particular, we do not know whether

Resolving this would take about a month of continuous computation with our current code and computer. Although stated as a conjecture, we suspect the negation of Conjecture 20 is true and that for n = 6 and q = 7, q-state NFAs cannot be used to shatter a set of size 32. In the remainder of the paper we prove some results related to Conjecture 20.

Theorem 21. For all integers 0 and 1,

Proof. Let s = min(2dim(NFA)). Since , to show

given with |S| = s, we must demonstrate how to shatter S using NFA). Let . Let T = S0 = 0 : and similarly = 0. Since and dim(NFA), there exists nfa) with ), () = . Let nfa) be obtained from M by letting the final states of N be exactly those of M that have a transition using 0 to a final state of M. Then ) and () = , as desired.

We do not know whether Theorem 21 is true with NFAin place of NFA; see Table 1.

Theorem 22. For all integers 0 and 1,

dim(NFA) = 2dim(NFA) = 2dim(NFA) = 2dim(NFA) = 2

Proof. Since dim(NFA) = 2, we also have dim(NFA) = 2. By Theorem 21, dim(NFA. Since dimdimin general, this gives dim(NFAdim(NFAas desired.

Theorem 23. For all integers 0 and 1,

Proof. Let be a set that is shattered by NFA). Let . Then |T| = |S| and T is shattered by NFA+ 1).

Definition 24. For a fixed q let us say that two sequences of words and are -similar if for each ] there is an automaton M with q states such that

Let us say that two sets of words are -similar if there exist orderings of them that are -similar.

For instance, because ().

One way to prove that dim(NFAdim(NFA) would be if each shattered set of words of length n is similar to a shattered set of words of length n + 1. Things turn out not to be that simple:

Proposition 25 ([7]). For all with |S| = 3, .

Proposition 25 says that for any such S, the number of subsets [3] for which a suitable automaton M exists is always smaller than 8. In fact, our computer calculation shows that the greatest number of sets F for which M exists is 6, which is achieved for S = {000, 001, 100}.

Some weak support for Conjecture 20 is given by Theorem 26.

Theorem 26. For all 1 and 0,

Proof. Suppose dim(NFAfor . Thus there exists a set of words of length n that can be shattered by q-state NFAs over a finite alphabet Σ . We may assume the NFAs only have transitions for the symbols occurring in words in S. Let where b does not occur in any word in S. Let = . We shall show how to shatter .

Fix any sets of words , with . Let = and = . By assumption, there exists nfa) with L(M) and . Thus, M accepts all the words in A and rejects all the words in R.

Let be formed from M by adding a loop labeled b to the start state . Then ) = . Thus the automata witness that dim(NFA, as desired.

References

[1] Hermann Gruber and Markus Holzer. Results on the average state and tran- sition complexity of finite automata accepting finite languages (extended abstract). In Hing Leung and Giovanni Pighizzini, editors, 8th International Workshop on Descriptional Complexity of Formal Systems - DCFS 2006, Las Cruces, New Mexico, USA, June 21 - 23, 2006. Proceedings, pages 267–275. New Mexico State University, Las Cruces, New Mexico, USA, 2006.

[2] Hermann Gruber and Markus Holzer. On the average state and transi- tion complexity of finite languages. Theoret. Comput. Sci., 387(2):155–166, 2007.

[3] Kayleigh K. Hyde and Bjørn Kjos-Hanssen. Nondeterministic automatic complexity of overlap-free and almost square-free words. Electron. J. Combin., 22(3):Paper 3.22, 18, 2015.

[4] OEIS Foundation Inc. The on-line encyclopedia of integer sequences. http: //oeis.org/A005187, 2021.

[5] Yoshiyasu Ishigami and Sei’ichi Tani. The VC-dimensions of finite automata with n states. In Algorithmic learning theory (Tokyo, 1993), volume 744 of Lecture Notes in Comput. Sci., pages 328–341. Springer, Berlin, 1993.

[6] Yoshiyasu Ishigami and Sei’ichi Tani. VC-dimensions of finite automata and commutative finite automata with k letters and n states. Discrete Appl. Math., 74(2):123–134, 1997.

[7] Bjørn Kjos-Hanssen and Davin K. Takahashi. Code for “VC-dimensions of finite automata for words of equal length”. https://github.com/ bjoernkjoshanssen/vc, 2019.

Figure 1: The Gruber–Holzer construction for length n = 5 and the language L = {10011, 10010, 00100}, and our modified construction.

Figure 2: The even-length construction in Theorem 14 examplified at n = 6 with q = 11 states. Various layouts of transitions here will shatter : 011 or 1 = 40 (Table 1).

Figure 3: The odd-length construction in Theorem 14. The states shown here suffice to shatter the set S of all words of length 7 that do not begin or end with 000, 001, or 010. In particular, the automaton M shown here has = = 1011101011.

Table 1: Cells annotated with a/b or just a indicate a = dim(NFA) or b = dim(NFA). When these coincide the value a may be indicated. For 3 and 2 see [7]. Numbers in {braces} come from Theorem 14. Bold numbers indicate results derived from Theorem 14 with r = 0. * = result also correct for NFA in place of NFA(all numbers for 3 or 2 are also correct for NFA in place of NFA).

• Upper bound 4 + logfor dimin red.

• Lower bound log(1 + 2) for dimin blue from Remark 5.

• Lower bound x = 2+for dimfrom Theorem 3 in black.

• dimdimof NFAfor q = 1, 2 in green.

Figure 4: VC and testing dimension of NFA: upper and lower bounds. y = dim) as a function of x = c/16, where is a set family over .

• Upper bound 2+ logfor dimin red.

• Lower bound log(1 + 2) for dimin blue.

• Lower bound x = 2+ for dimfrom Theorem 3 in black.

• dimdimof NFAfor q = 1, 2, 3 in green.

Figure 5: VC and testing dimension of NFA: upper and lower bounds. y = dim) as a function of , where and C is a set family over .

[8] Kathleen Romanik. Approximate testing and learnability. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, COLT ’92, page 327–332, New York, NY, USA, 1992. Association for Computing Machinery.

[9] Kathleen Romanik. Approximate testing and its relationship to learning. Theoret. Comput. Sci., 188(1-2):79–99, 1997.

[10] N. Sauer. On the density of families of sets. J. Combinatorial Theory Ser. A, 13:145–147, 1972.

[11] Jeffrey Shallit and Ming-Wei Wang. Automatic complexity of strings. vol- ume 6, pages 537–554. 2001. 2nd Workshop on Descriptional Complexity of Automata, Grammars and Related Structures (London, ON, 2000).

[12] Saharon Shelah. A combinatorial problem; stability and order for models and theories in infinitary languages. Pacific J. Math., 41:247–261, 1972.