\hfill \thepage} %} \newtheorem{theorem}{Theorem} \newtheorem{acknowledgement}[theorem]{Acknowledgement} \newtheorem{algorithm}[theorem]{Algorithm} \newtheorem{axiom}[theorem]{Axiom} \newtheorem{case}[theorem]{Case} \newtheorem{claim}[theorem]{Claim} \newtheorem{conclusion}[theorem]{Conclusion} \newtheorem{condition}[theorem]{Condition} \newtheorem{conjecture}[theorem]{Conjecture} \newtheorem{corollary}[theorem]{Corollary} \newtheorem{criterion}[theorem]{Criterion} \newtheorem{definition}[theorem]{Definition} \newtheorem{example}[theorem]{Example} \newtheorem{exercise}[theorem]{Exercise} \newtheorem{lemma}[theorem]{Lemma} \newtheorem{notation}[theorem]{Notation} \newtheorem{problem}[theorem]{Problem} \newtheorem{proposition}[theorem]{Proposition} \newtheorem{remark}[theorem]{Remark} \newtheorem{solution}[theorem]{Solution} \newtheorem{summary}[theorem]{Summary} \newenvironment{proof}[1][Proof]{\noindent\textbf{#1.} }{\ \rule{0.5em}{0.5em}} \input{tcilatex} \begin{document} \section{Turing Machines} \section{Rosen, Fifth Edition: Chapter 11.5; Sixth Edition: Chapter 12.5} \subsection{\protect\vspace{1pt}Section 11.5 , Problems:1,2, 3,5, 13, 17 (fifth edition); Section 12.5, Problems:\ 1,2,3,7,15,19 (sixth edition);% \newline Section 2.5, Problems: 1, 3 (fifth edition); Section 3.6 (sixth edition), Problems: 1, 3} We have seen that finite state machines don't recognize sets that otherwise are easy to describe, like $a^{n}b^{n},n\in %TCIMACRO{\U{2115} }% %BeginExpansion \mathbb{N} %EndExpansion .$This set can be described by a context free grammar. On the other hand there are sets, again not too complicated, that are not recognized by any context free grammar. A prototype is the set $a^{n}b^{n}c^{n},n\in %TCIMACRO{\U{2115} }% %BeginExpansion \mathbb{N} %EndExpansion .$The fact that this set is not describable by a context-free grammar is more difficult to prove. Alan Turing came up in 1936 with the definition of a most general machine, namely one that can perform any conceivable mechanical computation. Its definition resembles finite state machines. The difference is that a Turing machine allows for a potentially infinite memory. The ingredients of a Turing machine are \begin{enumerate} \item A \textbf{finite} set $S$ of states. There is one state designated as the starting state $s_{0}.$Final states will be defined below. \item A \textbf{finite} set $I$ which is called the tape alphabet. Some characters of $I,$for example $0,1,a,b,c,d,\ldots $ are used to form input~words, like $w=adam,$ while others, like $\sqcup ,|,\#\ldots $ are used as Blank, delimiters etc. The Rosen book uses $B$ for blank, while $% \sqcup $ is more common. Let $\Sigma $ be the subset of $I$ from which input words are formed. Thus $I~\backslash ~\Sigma $ consists of auxiliary tape symbols, like the blank. \item A\textbf{\ partial} function $f:S\times I\rightarrow S\times I\times \{R,L\}.$Here $R$ stands for right and $L$ stands for left. \end{enumerate} \vspace{1pt} This abstract definition can be interpreted in various ways. In the Rosen book, a Turing machine acts on an \textbf{two-sided infinite tape} consisting of individual cells. Every cell contains a symbol from $I$, but for only finitely many cells the content is different from $B.$ It is thought that a Turing machine has a \textbf{control unit} which at every step during a computation is in some state $s\in S.$ If $f(s,x)=(t,y,d),$and if the head of the control unit is on top of a cell with content $x,$ then the content of the cell is changed from $x$ to $y,$ the control unit changes its state from $s$ to $t$ and the unit moves one step into the direction $d,$ that is either to the next cell to the right, if $d=R,$ or to the left, if $% d=L$. In case that $f(s,x)$ is not defined, the machine \textbf{halts}. The function $f$ can be identified with a set $F$ of $5-$tuples $(s,x,t,y,d)$ where $(s,x,t,y,d)\in F$ iff $f(s,x)=(t,y,d).$That is, $F$ is the graph of $% f.$ The state $s$ is a \textbf{final} state of the Turing machine $% T=(S,I,f,s_{0})$ if $s$ is not the first coordinate of any element of $F.$ The machine stops operating if it reaches a final state, regardless what the control unit reads. Of course, it may also halt at a state $t$ where $f(t,x)$ is not defined. Let $T$ be a Turing machine and $w$ be a word over $\Sigma $, that is $w\in \Sigma ^{\ast }.$ In its \textbf{initial position}, the machine $T$ is in state $s_{0}$ and all cells are blank, with the exception of cells which contain the symbols of $w.$ Say if $w=adam$, the the tape looks like% \begin{equation*} \ldots BBBadamBBB\ldots \ldots \end{equation*} To be in its initial position, the control unit must be over the \textbf{% left-most non-blank cell}. We use the following convention to depict the operations of a Turing machine. If the control unit is in state $s$ and over a particular cell, we put a dot over that cell and $s$ underneath it. Thus for the initial state we have% \begin{equation*} \ldots \ldots BBB% \begin{array}{c} \underset{s_{0}}{\overset{\cdot }{a}}% \end{array}% damBBB\ldots \ldots \end{equation*} and if $(s_{0},a,s_{1},e,R)$ then we describe the next transition like% \begin{equation*} \ldots \ldots BBB% \begin{array}{c} \underset{s_{0}}{\overset{\cdot }{a}}% \end{array}% damBBB\ldots \ldots \Rightarrow \ldots \ldots BBBe% \begin{array}{c} \underset{s_{1}}{\overset{\cdot }{d}}% \end{array}% amBBB\ldots \ldots \end{equation*} The word $adam$ is\textbf{\ recognized} by the Turing machine $T,$ in case that with this word as initial input, the machine halts in a final state. We could describe this first transition \begin{equation*} \begin{array}{c} \underset{s_{0}}{\overset{\cdot }{a}}% \end{array}% dam\overset{s_{1}eR}{\Rightarrow }e% \begin{array}{c} \underset{s_{1}}{\overset{\cdot }{d}}% \end{array}% am \end{equation*} \textbf{Exercise} Define a Turing machine that recognizes $w=adam$ and halts in a final state with $eve$ on its tape. \vspace{1pt} \begin{definition} Let $T$ be a Turing machine with input alphabet $\Sigma .$Then% \begin{equation*} L(T)=\{w|w\in \Sigma ^{\ast },w~\text{is recognized by }T\} \end{equation*} \end{definition} \begin{definition} A subset $A$ of $\Sigma ^{\ast }$ is \textbf{recognizable} if there is a Turing machine with input alphabet $I\supseteq \Sigma $ such that $A=L(T).$ \end{definition} \begin{definition} A subset $A$ of $\Sigma ^{\ast }$ is \textbf{decidable} if $A$ as well as its complement $\Sigma ^{\ast }~\backslash ~A$ is recognizable. \end{definition} We can represent natural numbers as a string of units. Like $0\equiv 1,1\equiv 11,2\equiv 111,\ldots $that is $n\equiv 1^{n+1},$ that is the natural number is represented by $n+1$ units. \begin{definition} A function $f:% %TCIMACRO{\U{2115} }% %BeginExpansion \mathbb{N} %EndExpansion \rightarrow %TCIMACRO{\U{2115} }% %BeginExpansion \mathbb{N} %EndExpansion $ is \textbf{recursive }if there is a Turing machine $T$ with input alphabet $\Sigma =\{1\}$ which recognizes $% %TCIMACRO{\U{2115} }% %BeginExpansion \mathbb{N} %EndExpansion =\Sigma ^{\ast }$ and with $n$ as initial input, halts in a final state with $f(n)$ on its tape. \end{definition} We can represent \ $n-$tuples by using an auxiliary symbol, like $\ast $ as a separator for components. For example, $(2,1)\equiv 111\ast 11.$To define a recursive function in two variables, we put $\Sigma =\{1,\ast \}.$ \begin{definition} A function $f:% %TCIMACRO{\U{2115} }% %BeginExpansion \mathbb{N} %EndExpansion ^{2}\rightarrow %TCIMACRO{\U{2115} }% %BeginExpansion \mathbb{N} %EndExpansion $ is recursive if there is a Turing machine $T$ which has $% %TCIMACRO{\U{2115} }% %BeginExpansion \mathbb{N} %EndExpansion ^{2}$ as its recognized set and given an initial input $(n,m)$ it halts in a final state with $f(n,m)$ on its tape. \end{definition} \textbf{Exercise:} Carry out the computation of $2+3$ according to Example 4 of the Rosen book. \vspace{1pt} \subsubsection{G\"{o}del Numbers.} As we have seen, a Turing machine $T=(S,I,f,s_{0})$ is essentially a set $F$ of $5-$tuples where $(s,x,t,y,d)\in F$ iff $f(s,x)=(t,y,d).$ This set $F$ can be encoded in a way that it becomes a number whose binary representation describes $T.$ We explain this procedure in case where $\ $% \begin{equation*} I=\{0,1,B\},\Sigma =\{0,1\},S=\{s_{0},s_{1},\ldots ,s_{n-1}\} \end{equation*} \vspace{1pt}We encode the $n-$many states as strings of zeros% \begin{equation*} s_{0}=0,s_{1}=00=0^{2},\ldots ,s_{n-1}=00\ldots 0=0^{n} \end{equation*} The elements of $I$ are also encoded as strings of zeros:% \begin{equation*} 0=0,1=00=0^{2},B=000=0^{3} \end{equation*} Finally, \begin{equation*} L=0,R=00=0^{2} \end{equation*} In order to describe the transition function $f$ more conveniently, we use variables for $I$ and the directions$% :X_{1}=0,X_{2}=0^{2},X_{3}=0^{3},D_{1}=0,D_{2}=00.$ A transition \begin{equation*} f(s_{i},X_{j})=(s_{k},X_{l},D_{m})\text{ is encoded as }% 0^{i}10^{j}10^{k}10^{l}10^{m} \end{equation*} where $1$ serves as a separator. For example:% \begin{equation*} f(s_{0},1)=(s_{2},0,R)\text{ has code }0100100010100 \end{equation*} \begin{equation*} f(s_{1},B)=(s_{0},1,L)\text{ has code }0010001010010 \end{equation*} We can separate codes by two units, $11,$thus: \begin{equation*} 0100100010100110010001010010 \end{equation*} is the code for the transitions $% f(s_{0},1)=(s_{2},0,R),f(s_{1},B)=(s_{0},1,L).$ A Turing machine is given by a finite sequence of $5-$tuples. Each $5-$tuple looks like $(s,x,t,y,d)=(state,content,state,content,direction).$ We allow for Turing machines arbitrarily many states, that is $s=0^{i}$ where $i$ is between $0$ and $n-1$, if our machine has $n-$many states. A cell content is either $0,1,$or $B$ for which we use codes $0=0,1=00,B=000.$That is, the code for a particular $5-$tuple $(s,x,t,y,d)~$looks like $% 0^{s}1^{x}0^{t}10^{y}10^{d}$ where, by abuse of language we used as exponents variables for which they stand for. For example $% 0^{5}10^{4}0^{23}10^{2}10^{2}$ is not the code of a Turing machine because $% t=4$ is not allowed. The exponent $t$ which stand for cell content must be between $1$ and $3.$ In order for a sequence of $0$'s and $1^{\prime }$s to be a code for a single instance of $f(s_{i},X_{j})=(s_{k},X_{l},D_{m}),$ we need necessarily exactly $4$ \ occurrences of $1.$Thus $000010010010$ is not such a code. It is also clear that within each code, every $1$ is followed by one or more $0^{\prime }s.$ \vspace{1pt}Given a sequence of $0^{\prime }s$ and $1^{\prime }s$, it is clear whether it describes an instance for $F$ or not. A Turing machine is a set of such $5-$ tuples. We can write them as a sequence of $0^{\prime }s$ and $1^{\prime }s$ , each one now separated by two units, that is by $11.$That is by \begin{equation*} \text{code}_{1}11\text{code}_{2}11\ldots 11\text{code}_{k} \end{equation*} if for our Turing machine the set $F$ has k-many $5-$tuples. \textbf{Exercise }Write down the code $C$ for the Turing machine on page 830 (780 fifth edition) that performs addition of natural numbers. This machine has $3$ states, and $I=\{1,\ast ,B\}$ which we can encode as $0,00,000$ . \textbf{Exercise }Read the section 3.6 on Representation of Integers, (2.5 fifth edition) and calculate the value of $C.$ \vspace{1pt}If $M$ is any Turing machine with code code$_{1}11$code$% _{2}11\ldots 11$code$_{k}$ then in the binary system, this string of zeros and units stands for a certain number which we call the \textbf{G\"{o}del number} $g(M)$ of $M.$ \vspace{1pt}The\textbf{\ Church-Turing} thesis states that any computational process can be implemented by a Turing machine. For example, let $n$ be any number. If we expand $n$ in the binary system then $n$ is given as a sequence of zeros and units. We clearly can read off this representation whether this sequence is the code of a Turing machine, that is whether there is some $M$ such that $n=g(M).$ \vspace{1pt}The process of finding the binary representation of a given number is algorithmic. For the decimal system we learned this at middle school. What the Church Turing thesis implies is that we can find a Turing machine that, given as input any number $n$ as a sequence of $n+1$ many zeros, then the machine halts with a sequence of zeros and units on the tape which is the binary representation of $n.$The machine then can decide whether $n=g(M).$ \vspace{1pt}Such an $M,$ if it exists, is uniquely determined by $n.$On the other hand, the codes of a Turing machine $M~$\ can be differently ordered. Thus a Turing machine $M$ may have different codes. \vspace{1pt} An input for a Turing machine can be any string $w$ of $0^{\prime }s$ and $% 1^{\prime }s.$ Like $w=1110010,$ but it can be the code of any other Turing machine, like $w=010010001010011001000101001.$ \vspace{1pt} Given a Turing machine $M$ and in input $w$ then the pair $(M,w)$ of the machine $M$ and input $w$ can be encoded by \begin{equation*} 111\text{code}_{1}11\text{code}_{2}11\ldots 11\text{code}_{k}111w=111g(M)111w \end{equation*} Which is just a number in binary notation. Of course, given any number $\ m$ we can expand it in binary notation and decide, by inspection, whether it is of the form $g(M,w).$ \vspace{1pt} Let $A\subseteq \Sigma ^{\ast }$ be a recognizable set. That is, $A=L(M)$ for some Turing machine $M.$ We can generate $A$ in the following way. Let $% w_{1},w_{2},\ldots ,w_{n},\ldots $ be an enumeration of the set $\Sigma ^{\ast }$ of all words. We then imagine that the machine gets as inputs batches of words: $w_{1};w_{1},w_{2};\ldots ;w_{1},w_{2},\ldots ,w_{n};\ldots $ That is, the first $n-$strings, for $n=1,2,\ldots $ If the input contains $n-$many words, then the machine stops after $n-$many steps and prints out those $k-$many words that have been accepted by $M$ after $\leq n-$steps. \ Clearly, $M$ prints out exactly the elements of $A$ and puts them in a list. Because of this, recognizable sets are also called \textbf{recursively enumerable. }If $w$ is a word over $\Sigma $ we have in general no way of telling whether it will eventually be recognized by $M,$or not$.$Of course, if $w\in A,$ then for some $n$, the word $w$ will be printed amongst the word for which $M$ needed $\leq n$ many steps to recognize it. If $w\notin A, $ then no matter how large $n$ is, $w$ will not show up in any printout. Now, if there is also a machine $N$ for $\Sigma ^{\ast }~\backslash ~A$ then we can run both machines $M$ and $N$ simultaneously. Then given any word $w$% , there will be some $n$ such that either $w$ is printed by $M$ or by $N.$ In other words, for any given word $w$ we can decide in a finite amount of time whether $w$ belongs to $A$ or not. That explains the terminology: A set $A$ is decidable if $A$ and its complement $\Sigma ^{\ast }~\backslash ~A$ are recognizable. Using arguments along these lines it is not difficult to see that the union and intersection of decidable sets is decidable. We may think that for the union Turing machines are working parallel and for the intersection in series. \newline \newline \textbf{Exercise} Find a Turing machine for $\emptyset $ and one for $\Sigma ^{\ast }.$That is, $\emptyset $ and $\Sigma ^{\ast }$ are decidable. \begin{theorem} \vspace{1pt}The decidable subsets of $\Sigma ^{\ast }$ form a boolean algebra. \end{theorem} Let $G$ be the set of binary numbers which are of the form $g(M)$ for some Turing machine $M.$Then $G$ is decidable. \begin{theorem} \emph{(G\"{o}del-Turing) }There is a machine $U$ such that $L(U)=\{n|\ n=g(M,w)$ where $w\in L(M)\}.$ \end{theorem} We may think that $U$ has two modules. The first module takes any input $n$ and decides whether it is of the form $n=g(M,w)$. If not, then $n$ is rejected. Otherwise, $n$ is fed into a second module. The second module takes the first $k$ non-rejected words $% n_{1}=g(M_{1},w_{1}),n_{2}=g(M_{2},w_{2}),\ldots n_{k}=g(M_{k},w_{k})$ and prints $n_{i}=g(M_{i},w_{i})$ if for the input $w_{i},$ the machine $M_{i}$ halts after $j-$many steps where $j\leq k.$This is done for $k=1,2,\ldots $ \vspace{1pt}This informal description how $U$ works describes an algorithmic process and can be formalized as a Turing machine. $U$ is called the \textbf{% Universal Turing Machine.} It is nowadays realized by programmable computers. A program is just some binary number $n$. It is accepted if it is syntactically correct, that is, if it is of the form $n=g(M)$ . Given any $w$ then $U$ runs $M$ with input $w.$The code for $U$ is called an \textbf{operating system.} It is just another program. \vspace{1pt} The previous theorem says that the set $L(U)=\{n|\ n=g(M,w)$ where $w\in L(M)\}$ is recognizable. But is it decidable? We are going to show that it is not. For this we need a preliminary \begin{lemma} $G=\{w|~w=g(M),w\notin L(M\}$ is not recognizable. \begin{proof} Assume that $G=L(M)$ for some Turing machine $M=M(G),$that is $G=L(M(G)).$ Then let $w_{g}=g(M(G)).$But then we would get: $w_{g}\in G=L(M(G))\Leftrightarrow w_{g}=g(M(G))$ and $w_{g}\notin L(M(G))=G.$Thus $% w_{g}\in G\Leftrightarrow w_{g}\notin G$, a contradiction. \end{proof} \end{lemma} Let us recast the definition of $G.$This set consists of all those binary numbers which are G\"{o}del numbers. Whether a given number $w$ is such a number is decidable. If it is of the form $g(M)$ then we take $w$ as initial input for $M$ The machine may accept $w$, namely if after finitely many steps it get's into a final state. But we put $w$ into $G$ if this is not the case. What the Lemma says is that there is no machine that recognizes $% G. $ The complement $L(U)^{c}$ of $L(U)$ in $\Sigma ^{\ast }$ is \begin{eqnarray*} L(U)^{c} &=&\{n|~n\neq g(M,w)\text{ for any Turing machine }M\} \\ \cup \{n|~n &=&g(M,w)\text{ for some }M\text{ but }w\notin L(M)\}=K\cup G^{^{\prime }} \end{eqnarray*} Where $K$ and $G^{^{\prime }}$ are disjoint and where of course $K$ is decidable. Assume now that \begin{equation*} G^{\prime }=\{n|~n=g(M,w)\text{ for some }M\text{ but }w\notin L(M)\} \end{equation*} is recognizable. It elements then could be put into a sequence $% n_{1},n_{2},\ldots .$Each $n_{i}$ is of the form $n_{i}=g(M_{i},w_{i})$ where we know that $w_{i}\notin L(M_{i}).$We now create the sequence of those $w_{i_{k}}$ where in $n_{i_{k}}=g(M_{i_{k}},w_{i_{k}})$ one has that $% w_{i_{k}}=g(M_{i_{k}}).$The elements $w_{i_{k}}$ make up the set $G.$Because $G$ is not recognizable, $G^{^{\prime }}$ is not recognizable. \begin{theorem} The set $L(U)^{c}$ is not recognizable. \end{theorem} \begin{proof} $G^{^{\prime }}=L(U)^{c}\backslash ~K$ where $K$ is decidable. But $% G^{^{\prime }}$ is not recognizable. So, $L(U)^{c}$ is not decidable. \end{proof} \vspace{1pt} There is no universal algorithm that decides whether a program accepts (halts) on a given input $w$ or not. That is: \textbf{The Halting Problem is undecidable. } \end{document}