8 Mar 2016

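"A New Technique for Proving Non-Regularity Based on the Measure of a Language": explains a technique for showing non-regularity by considering the size of a language. Slides from a presentation at PPL 2016.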

Ryoma Sin'ya

Associate Professor at Akita University
- A New Technique for Proving Non-Regularity Based on the Measure of a Language. Ryoma Sin'ya, Tokyo Institute of Technology, Department of Mathematics and Computer Science.
- Infinite Monkey Theorem (a.k.a. Borges's theorem) http://en.wikipedia.org/wiki/Infinite_monkey_theorem
- The main topic of this talk is the "inverse direction" of the Infinite Monkey Theorem. For regular languages, the condition of the Infinite Monkey Theorem turns out to be both necessary and sufficient for "almost sureness".
- Notation. $A$: an alphabet (a finite set of letters). $A^n$: the set of all words over $A$ of length $n$. $A^*$: the set of all words over $A$ ($A^* = \bigcup_{n \in \mathbb{N}} A^n$). A language over $A$ is a subset of $A^*$.
- For languages $L$ and $K$, we define the following three operations. Union: $L \cup K$. Concatenation: $LK = \{vw \mid v \in L, w \in K\}$. Kleene star: $L^* = \bigcup_{n \in \mathbb{N}} L^n = \{\varepsilon\} \cup L \cup LL \cup LLL \cup \cdots$. The class of regular languages is the smallest class that includes all finite languages and is closed under these three operations.
- Language Hierarchy (figure). Annotations on the figure: "Weak?" and "Beautiful!"
- Measure & Zero-One Theorem
- For a language $L$ over $A$, its probability function is the fraction defined by $\mu_n(L) = \frac{\text{number of words of length } n \text{ in } L}{\text{number of all words of length } n} = \frac{|L \cap A^n|}{|A^n|}$. This is exactly the probability that a randomly chosen word of length $n$ belongs to $L$. The measure $\mu(L)$ of a language $L$ is the limit of its probability function: $\mu(L) = \lim_{n \to \infty} \mu_n(L)$.
- Example. The full language is almost full, and the empty language is almost empty: the set $A^*$ of all words over $A$ satisfies $\mu(A^*) = 1$, and its complement $\emptyset$ satisfies $\mu(\emptyset) = 0$. Consider $aA^*$, the set of all words that start with the letter $a \in A$. Then $\mu_n(aA^*) = \frac{|aA^{n-1}|}{|A^n|} = \frac{1}{|A|}$ for $n \geq 1$. Hence $\mu(aA^*) = 1/|A|$, and $aA^*$ is not zero-one if $|A| \geq 2$. Consider $(AA)^*$, the set of all words of even length. Then $\mu_n((AA)^*) = 1$ if $n$ is even and $0$ if $n$ is odd. Hence its limit $\mu((AA)^*)$ does not exist.
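The probability function in these examples can be checked by brute force for small $n$. The following is a minimal sketch, not part of the original slides: the helper name `mu_n` and the choice to represent a language as a Python predicate are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def mu_n(in_language, alphabet, n):
    """Fraction of the |A|^n words of length n that satisfy `in_language`."""
    words = ("".join(t) for t in product(alphabet, repeat=n))
    hits = sum(1 for w in words if in_language(w))
    return Fraction(hits, len(alphabet) ** n)


# The three example languages, written as membership predicates (illustrative names).
full = lambda w: True                      # A*
starts_with_a = lambda w: w[:1] == "a"     # aA*
even_length = lambda w: len(w) % 2 == 0    # (AA)*

A = "ab"
for n in range(1, 7):
    print(n, mu_n(full, A, n), mu_n(starts_with_a, A, n), mu_n(even_length, A, n))
```

The first column stays at 1, the second stays at 1/2 (that is, $1/|A|$), and the third alternates between 0 and 1, which is why $\mu((AA)^*)$ has no limit.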
- Forbidden Word. A word $w$ is forbidden for a language $L$ over $A$ if $A^* w A^* \cap L = \emptyset$ (equivalently, $A^* w A^* \subseteq \overline{L}$) holds. More intuitively, $w$ is a forbidden word of $L$ if and only if no word in $L$ contains $w$ as a factor.
- Infinite Monkey Theorem (formal statement). Let $L$ be a language over $A$. If $L$ contains a language of the form $A^* w A^*$, then $L$ is almost full (i.e., $A^* w A^* \subseteq L \Rightarrow \mu(L) = 1$).
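The statement can be observed numerically for a small case. This is an illustrative sketch only: the function name `mu_n_contains` and the choice of $w = ab$ over $A = \{a, b\}$ are assumptions, and the enumeration is exponential in $n$.

```python
from fractions import Fraction
from itertools import product


def mu_n_contains(w, alphabet, n):
    """mu_n(A* w A*): fraction of length-n words that have w as a factor."""
    hits = sum(1 for t in product(alphabet, repeat=n) if w in "".join(t))
    return Fraction(hits, len(alphabet) ** n)


for n in (2, 4, 8, 12):
    print(n, float(mu_n_contains("ab", "ab", n)))
```

Over $\{a, b\}$ the words of length $n$ avoiding $ab$ are exactly those of the form $b^i a^{n-i}$, so $\mu_n(A^* ab A^*) = 1 - (n+1)/2^n$, which tends to 1.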
- Zero-One Theorem. Theorem [S. 2015]: Let $L$ be a regular language. Then the following are equivalent: 1. $L$ is almost empty (i.e., $\mu(L) = 0$). 2. $L$ has a forbidden word. The implication (2) → (1) is nothing but the well-known Infinite Monkey Theorem.
- Zero-One Theorem. The implication (2) → (1) is nothing but the well-known Infinite Monkey Theorem: (2) $L$ has a forbidden word $\Rightarrow \exists w \in A^* \, (A^* w A^* \cap L = \emptyset)$ $\Leftrightarrow \exists w \in A^* \, (A^* w A^* \subseteq \overline{L})$ $\Rightarrow \mu(\overline{L}) = 1$ $\Rightarrow \mu(L) = 1 - \mu(\overline{L}) = 0$ $\Leftrightarrow$ $L$ is almost empty (1).
- Zero-One Theorem. The remarkable fact about this theorem is that, for regular languages, the converse (1) → (2) is also true!
- Zero-One Theorem. Theorem [S. 2015] (complete version): Let $L$ be a regular language. Then the following are equivalent: 1. $L$ is almost empty ($\mu(L) = 0$) or almost full ($\mu(L) = 1$). 2. $L$ or its complement has a forbidden word. 3. The syntactic monoid of $L$ has a zero element. 4. The minimal automaton of $L$ is zero. 5. $L$ is recognised by a quasi-zero automaton.
- An Automata Theoretic Approach to the Zero-One Law for Regular Languages: Algorithmic and Logical Aspects. Ryoma Sin'ya, Tokyo Institute of Technology (shinya.r.aa@m.titech.ac.jp) and École Nationale Supérieure des Télécommunications (rshinya@enst.fr). A zero-one language L is a regular language whose asymptotic probability converges to either zero or one. In this case, we say that L obeys the zero-one law. We prove that a regular language obeys the zero-one law if and only if its syntactic monoid has a zero element, by means of Eilenberg's variety theoretic approach. Our proof gives an effective automata characterisation of the zero-one law for regular languages, and it leads to a linear time algorithm for testing whether a given regular language is zero-one. In addition, we discuss the logical aspects of the zero-one law for regular languages. For more details, see arXiv:1509.07209.
- Technique for Proving Non-Regularity. Zero Lemma (corollary of the Zero-One Theorem): Let $L$ be an almost empty language over $A$. If $L$ does not have a forbidden word, then $L$ is not regular. This is a new necessary condition for regularity.
- Palindromes. Recall that the set $P$ of all palindromes over $A$ is defined as $P = \{w \in A^* \mid w = w^r\}$, where $w^r$ is the reverse of $w$. Note that if $A$ is a singleton ($|A| = 1$), then $P = A^*$ and hence $P$ is regular. In general, $\mu_n(P) = \frac{|A|^{n/2}}{|A|^n} = \frac{1}{|A|^{n/2}}$ if $n$ is even, and $\mu_n(P) = \frac{|A| \times |A|^{(n-1)/2}}{|A|^n} = \frac{1}{|A|^{(n-1)/2}}$ if $n$ is odd, so $\mu(P) = 0$ when $|A| \geq 2$. Moreover, $\forall w \in A^* \, (w w^r \in P)$ (i.e., $P$ does not have a forbidden word). Hence, for $|A| \geq 2$, $P$ is not regular by the Zero Lemma!
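Both ingredients of this argument can be checked on small inputs. The sketch below is illustrative only: the helper names (`is_palindrome`, `mu_n_palindromes`, `closed_form`) are assumptions, and the brute-force enumeration is only practical for short words.

```python
from fractions import Fraction
from itertools import product


def is_palindrome(w):
    """True if w reads the same forwards and backwards (works for str or tuple)."""
    return w == w[::-1]


def mu_n_palindromes(alphabet, n):
    """mu_n(P) by enumerating all |A|^n words of length n."""
    hits = sum(1 for t in product(alphabet, repeat=n) if is_palindrome(t))
    return Fraction(hits, len(alphabet) ** n)


def closed_form(k, n):
    """1/k^(n/2) for even n and 1/k^((n-1)/2) for odd n, where k = |A|."""
    return Fraction(1, k ** (n // 2))


for n in range(1, 9):
    print(n, mu_n_palindromes("ab", n), closed_form(2, n))

# Every word w is a factor of the palindrome w w^r, so no word is forbidden for P.
w = "abbab"
print(w + w[::-1], is_palindrome(w + w[::-1]))
```

The integer division `n // 2` covers both cases of the formula at once, since it equals $n/2$ for even $n$ and $(n-1)/2$ for odd $n$.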
- Dyck Language. Recall that the Dyck language $D$ over $A = \{[, ]\}$ is the set of all balanced square brackets: $D = \{\varepsilon, [], [[]], [][], [[[]]], [[][]], [[]][], [][[]], [][][], \ldots\}$. Its probability function satisfies $\mu_n(D) = \Theta(1/n^{3/2})$ if $n$ is even and $\mu_n(D) = 0$ if $n$ is odd. Moreover, $\forall w \in A^* \, \exists n, m \in \mathbb{N} \, ([^n w ]^m \in D)$ (i.e., $D$ does not have a forbidden word). Hence $D$ is not regular by the Zero Lemma!
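A small sketch of both facts, under the standard assumption (not stated on the slide) that the number of Dyck words of length $2m$ is the Catalan number $\binom{2m}{m}/(m+1)$. The function names are illustrative.

```python
from fractions import Fraction
from math import comb, pi, sqrt


def mu_n_dyck(n):
    """mu_n(D) over A = {[, ]}: Catalan(n/2) / 2^n for even n, 0 for odd n."""
    if n % 2:
        return Fraction(0)
    m = n // 2
    catalan = comb(2 * m, m) // (m + 1)
    return Fraction(catalan, 2 ** n)


def is_dyck(w):
    """True if the brackets in w are balanced."""
    depth = 0
    for c in w:
        depth += 1 if c == "[" else -1
        if depth < 0:
            return False
    return depth == 0


def complete(w):
    """Pad w to [^n w ]^m in D, showing that w is a factor of some Dyck word."""
    depth = lowest = 0
    for c in w:
        depth += 1 if c == "[" else -1
        lowest = min(lowest, depth)
    n = -lowest       # opening brackets needed in front
    m = depth + n     # closing brackets needed behind
    return "[" * n + w + "]" * m


# mu_n(D) * n^(3/2) should approach a constant, illustrating Theta(1/n^(3/2)).
for n in (10, 100, 1000):
    print(n, float(mu_n_dyck(n)) * n ** 1.5, "limit:", 2 ** 1.5 / sqrt(pi))

print(complete("]]["), is_dyck(complete("]][")))
```

The limit constant $2^{3/2}/\sqrt{\pi}$ follows from the usual Catalan asymptotics $C_m \sim 4^m / (\sqrt{\pi}\, m^{3/2})$ with $n = 2m$.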
- Primes. Consider the set of all prime numbers as a language. It is almost empty by the Prime Number Theorem, and it has no forbidden word by Dirichlet's theorem. Hence it is not regular by the Zero Lemma!
- The Zero Lemma: • states a necessary condition for regular languages. • can only be applied to almost empty languages. • is useful, since the assumption "L is almost empty" is often intuitively clear.
- However, even though "L is almost empty" is often intuitively clear, proving it requires extra work: one has to determine the asymptotic behaviour of the probability function of L. Motivation: can we find a simple sufficient condition for almost emptiness?
- Sufficient Condition for Almost Emptiness
- Dense (image: http://www.newscientist.com/article/dn10521-forest-growth-is-encouraging-say-researchers/)
- Sparse (image: http://www.evs.anl.gov/news/2014/03-31-mapping-ephemeral-streams.cfm)
- Idea: if no element has a neighbouring element, the set looks sparse, i.e., it has measure zero. To formalise this idea, we have to introduce a distance between words!
- Hamming Distance. The Hamming distance is a distance between words of the same length. The Hamming distance $d(u, v)$ between two words $u, v \in A^n$ is the number of positions at which the corresponding letters differ: $d(u, v) = |\{i \in [0, n-1] \mid u_i \neq v_i\}|$, where $w_i$ is the $i$-th letter of $w$. For example, $d(0001, 0000) = 1$, $d(1111, 0000) = 4$, $d(1101, 0110) = 3$, $d(1001, 1111) = 2$.
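The definition translates directly into code. A minimal sketch (the function name `hamming` is an illustrative choice) that reproduces the four example values:

```python
def hamming(u, v):
    """Number of positions at which two equal-length words differ."""
    if len(u) != len(v):
        raise ValueError("Hamming distance needs words of the same length")
    return sum(1 for a, b in zip(u, v) if a != b)


for u, v in [("0001", "0000"), ("1111", "0000"), ("1101", "0110"), ("1001", "1111")]:
    print(u, v, hamming(u, v))
```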
- For a word $w \in A^n$, its set of distance-one neighbours $B(w)$ is defined by $B(w) = \{u \in A^n \mid d(w, u) \leq 1\}$. For example, over $A = \{0, 1\}$, $B(000) = \{000, 100, 010, 001\}$. Note: the size of $B(w)$ satisfies $|B(w)| = n(|A| - 1) + 1$ for every word $w \in A^n$.
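The set $B(w)$ and its size formula can be checked with a short enumeration. This is a sketch under the assumption that `alphabet` lists distinct letters and that every letter of `w` belongs to it; the name `ball` is illustrative.

```python
def ball(w, alphabet):
    """B(w): all words of the same length at Hamming distance at most 1 from w."""
    neighbours = {w}  # distance 0: w itself
    for i, letter in enumerate(w):
        for a in alphabet:
            if a != letter:
                # distance 1: replace the letter at position i
                neighbours.add(w[:i] + a + w[i + 1:])
    return neighbours


print(sorted(ball("000", "01")))
w, A = "abcab", "abc"
print(len(ball(w, A)), len(w) * (len(A) - 1) + 1)
```

Each of the $n$ positions can be changed to any of the $|A| - 1$ other letters, and $w$ itself is included, which gives $n(|A| - 1) + 1$.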
- Sufficient Condition for Almost Emptiness. Lemma 1: Let $L$ be a language over $A$. If the number of distance-one neighbours that are in $L$ is bounded by some constant for every sufficiently long word, then $L$ is almost empty. Namely, if $L$ satisfies the following condition, then $L$ is almost empty: $\exists C, N \in \mathbb{N} \; \forall n \in \mathbb{N} \; \forall w \in A^n \, (n > N \Rightarrow |B(w) \cap L| \leq C)$.
- Palindromes (Revisited). Recall that the set $P$ of all palindromes over $A$ is $P = \{w \in A^* \mid w = w^r\}$. By Lemma 1, $\mu(P) = 0$ when $|A| \geq 2$, since the number of distance-one neighbours that are in $P$ is bounded by $|A|$ for every word.
- Palindromes (Revisited). Example: madamimadam $\in P$ ("Madam, I'm Adam"). At distance one, badamimadam $\notin P$, whereas madambmadam $\in P$. Starting from this palindrome, we obtain another palindrome only if we change the central letter "i" to another letter.
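The bound used here can be verified exhaustively for short words. The sketch below is illustrative (all names are assumptions): it counts, for every word $w$ of a given length, how many words of $B(w)$ are palindromes, and reports the maximum.

```python
from itertools import product


def is_palindrome(w):
    return w == w[::-1]


def ball(w, alphabet):
    """All words at Hamming distance at most 1 from w (including w itself)."""
    neighbours = {w}
    for i, letter in enumerate(w):
        for a in alphabet:
            if a != letter:
                neighbours.add(w[:i] + a + w[i + 1:])
    return neighbours


def palindromic_neighbours(w, alphabet):
    """|B(w) ∩ P| for a single word w."""
    return sum(1 for u in ball(w, alphabet) if is_palindrome(u))


def max_palindromic_neighbours(alphabet, n):
    """Maximum of |B(w) ∩ P| over all words w of length n."""
    return max(
        palindromic_neighbours("".join(t), alphabet)
        for t in product(alphabet, repeat=n)
    )


print(max_palindromic_neighbours("abc", 5))   # odd length, |A| = 3
print(max_palindromic_neighbours("ab", 4))    # even length, |A| = 2
print(palindromic_neighbours("madamimadam", "abcdefghijklmnopqrstuvwxyz"))
```

For an odd-length palindrome only the central letter can be changed freely, giving $|A|$ palindromes in $B(w)$; a non-palindrome with exactly one mismatched pair of positions has two palindromic neighbours. In every case the count stays at most $|A|$ when $|A| \geq 2$.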
- The proof of Lemma 1 is not so difficult. It uses a result from coding theory (Cohen's theorem); please see my paper for the details.
- Theorem [Cohen et al. 1986]. A language $L$ over $A$ is said to be a covering of $A^n$ if $\bigcup_{w \in L} B(w) = A^n$ holds. We denote the minimal size of a covering of $A^n$ by $K_A(n) = \min \{|L| \mid L \text{ is a covering of } A^n\}$. For any alphabet $A$, there exists some constant $C$ such that $\limsup_{n \to \infty} \frac{K_A(n) \times (n(|A| - 1) + 1)}{|A^n|} < C$.
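For very small $n$, $K_A(n)$ can be found by exhaustive search, which makes the quantity in the theorem concrete. This is only an illustrative sketch: it assumes coverings are chosen among words of length $n$, the names are hypothetical, and the search is exponential, so it says nothing about the limit itself.

```python
from fractions import Fraction
from itertools import combinations, product


def ball(w, alphabet):
    """All words at Hamming distance at most 1 from w (including w itself)."""
    neighbours = {w}
    for i, letter in enumerate(w):
        for a in alphabet:
            if a != letter:
                neighbours.add(w[:i] + a + w[i + 1:])
    return neighbours


def min_covering_size(alphabet, n):
    """K_A(n) by exhaustive search; only feasible for very small n."""
    words = ["".join(t) for t in product(alphabet, repeat=n)]
    balls = {w: ball(w, alphabet) for w in words}
    everything = set(words)
    for size in range(1, len(words) + 1):
        for code in combinations(words, size):
            if set().union(*(balls[w] for w in code)) == everything:
                return size


A = "01"
for n in range(1, 5):
    k = min_covering_size(A, n)
    ratio = Fraction(k * (n * (len(A) - 1) + 1), len(A) ** n)
    print(n, k, ratio)
```

For $A = \{0, 1\}$ and $n = 3$, the two words 000 and 111 already cover $A^3$, and the ratio $K_A(n) \times (n(|A|-1)+1) / |A^n|$ equals exactly 1, the smallest value it can take.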
- Future Work
- Our Lemma 1 is: • a sufficient condition for almost emptiness. • general: it can be applied to any language. • but not strong enough: we cannot prove the almost emptiness of the Dyck language by Lemma 1. • We want to improve Lemma 1. Some conjectures are written in my paper.
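- Conjecture. Problem 1. Let $L$ be a language over an alphabet $A$ containing two or more letters. Define two functions $f, g : \mathbb{N} \to \mathbb{N}$ determined by $L$ as $f(n) = \max\{|L \cap B_A(w, n)| \mid w \in L \cap A^n\}$ and $g(n) = \min\{|L \cap B_A(w, n)| \mid w \notin L \cap A^n\}$. Do the following statements hold? (a) If $f(n)$ is bounded above by a constant ($f(n) \in O(1)$) and $g(n)$ is bounded below by a linear function ($g(n) \in \Omega(n)$), then $L$ is almost empty. (b) If $\lim_{n \to \infty} f(n)/g(n) = 0$, then $L$ is almost empty. The Dyck language is a concrete example of case (a) of Problem 1. Intuitively, (b) describes the situation where "there are many members of the majority around the majority (elements of L), and few members of the minority around the minority (elements of L)".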
- Tokyo Tech Official Mascot: "Tech-chan". Thank you♪ Any questions or comments?