NOTE: See also Notes on Unsolvability and Intractability
In the previous lecture, we saw the connection between languages and machines. We also know that some machines have intrinsic limitations in the types of languages that they can recognize. (FSA's cannot distinguish between bit strings with an equal number of 0's and 1's and those without.) A natural question is whether other machines (PDA, Turing machine, your computer) also suffers from fundamental limitations.
Besides the given explanation, another way to think of this problem is to consider a set of dominoes with strings on each half and consider whether there is a way to line them up so that the complete string on the top matches the one on the bottom. The strings may consist of any letters in the alphabet. They may be binary, A's and B's, or any other letters.
The point of this slide is that this seems like a fairly easy problem. For a small number of dominoes, you could certainly try all the combinations. However, you are given a limitless supply of dominoes. Notice that for the example given, the domino with A on the top and ABA on the bottom is used twice. It turns out that you cannot possibly write a C program to determine the answer! This is an incredibly strong claim, and requires some serious effort to prove (see COS 487). The very statement should be mind-bending! It has profound computational and philosophical consequences.
A Turing Machine is very similar to a FSA and Pushdown Automata (PDA). Recall a PDA was essentially a FSA with the extra ability to store things on a stack. The PDA is not allowed to look at the entire contents of the stack, only the topmost element. Also, the PDA can insert or delete items from the stack, but again, only from the top. A Turing machine is essentially a FSA with the extra ability to store things in an (infinitely long) array. It can move up and down the array, reading and writing along the way.
The input is on a 'tape.' The tape is an infinitely long 'strip of paper' (conceptually) that can be read and written to. The symbols on the tape are the empty symbol and the symbols of the alphabet. Think of a Turing Machine as consisting of a head (like you'd find on a typewriter) and then a system of states and transition-arrows, just like in a FSA. The head of the machine starts off pointing to the first symbol in the input string. In parallel, you start at the start state of the diagram. Arrows leaving this start state will have three parts, just like in a PDA. The first part corresponds to the input symbol - what is currently under the tape head. The second part tells you what to write on the tape: it may be exactly what you just read. The third part tells you whether to move to the left or right on the tape. After you write to the tape and move to the left or right, you should also move to the next state (in the diagram) as indicated by the arrow. You repeat this process until you reach either an accept or reject state. It may be non-deterministic. If there is no way to get to the accept state, you reject. Usually, you will also stop when you get to an empty symbol (following the input string.) At this point, you will probably know whether or not to accept the string.
You will see many variations on the definition of a Turing Machine. The most important aspects are that the input is written on a tape, the machine has a read/write head which moves along the tape, and the Turing Machine has the same sort of state transition diagram as the FSA and PDA models.
Some details that aren't written on the slide: the start state is the 'mark' state, the alphabet consists of 1's and 2's, zeroes are considered to be 'blank' symbols - there are an infinite number of them to the left and right of the input string, which is 1221, and an * designates the location of the read/write head.
So, initially, the head is positioned over the 1, and we are in state 'mark.' We follow the arrow coming out of the top of the mark state, labeled 1/0 R. We write over the 1 with a 0 and move to the right. We also follow the arrow to the move1 state.
Now, the tape is positioned over the first 2, so we follow the 2/2 R arrow of state move1. We write over the 2 with another 2 and move to the right on the tape. We stay in the same state. The same thing happens for the second 2, and the tape head is now positioned over the last 1. We follow the same arrow and stay in this state. Notice that we won't move out of this state until we get to a blank symbol ( a 0.)
When we read a 0, we follow the 0/0 L arrow from state move1 to state test1. This is the first time we move the tape head left instead of right (L instead of R.) What we've done so far is read an initial '1' and move to the end of the tape. Now, we need to check that the last symbol on the tape is also a 1 (since we're trying to determine whether the string is a palindrome.) Since 1 is the next symbol we read, we follow the 1/0 L arrow to the back state. So, we write over the last 1 with a zero and move the tape head to the left. We stay in state back until we read a 0, without changing the symbols on the tape (always write over a 1 with a 1, a 2 with a 2.)
When we've moved the tape head all the way to the left (we know we've reached the end, because we've reached a blank symbol,) we follow the 0/0 R arrow to the mark state. If you're following the steps listed below the diagram, you should see that we now read a 2, so we follow the 2/0 R arrow to the move2 state. We write over the 2 with a 0 and move the tape head to the right. It is now positioned over the second 2. Next, follow the 2/2 R arrow and stay in state move 2. We've now reached the right end of the input string.
We read a 0 and follow the 0/0 L arrow to the test2 state. The head is now positioned over a 2. We follow the 2/0 L arrow to the back state. So, we have now written over all the input symbols (1's and 2's) with blank symbols (0's). We follow the arrow from state back to state mark. Since we can only read a 0 here, we follow the 0/0 L arrow to the yes state. We stop, concluding that the string 1221 is indeed a palindrome.
This program is set up to simulate Turing Machines that have 2 input symbols and less than 100 states. The tape array holds the characters on the tape, including blank symbols; it can hold a maximum of 2000 symbols. The head starts at location 1000 (so the input string must be less than 1000 characters long.) The value of the variable N isn't indicated. Basically, the loop should continue until either an ACCEPT or REJECT state is reached.
The first line of the loop sets the 'in' variable, which is a integer. It is calculated by subtracting the character value of '0' from the symbol read by the tape head. So, it should always have either 0, 1, or 2 as its value. The next state is determined by looking up the correct entry in the 'next' 2-dimensional array. The values are set when we 'read in machine.' Basically, this array corresponds to all of the arrows in the diagram. For each state, there are 3 arrows, one for each input symbol. The destination of each of these arrows will be stored in the 'next' array.
The second part of the entry on each arrow corresponds to what to write on the tape. These values are stored in the 'out' array. The 'tape' array is updated by writing the correct output to the location referenced by the head of the tape. This is done in the third line of the while loop.
Finally, the move array holds values 1 (for right) and -1 (for left) corresponding to the third part of the arrow for the each state/input combination. In the last line of the while loop, the head value is updated, moving the head to the left or right.
Just like with the C program simulating an FSA, the while loop continues, updating the state at each iteration, until we've decided whether the string is in the language. The important difference is that there's no guarantee that this loop will terminate without crashing (array-index-out-of-bounds error.)
In the previous slide, we saw that everyday computers are at least as powerful as a Turing machine (since we can always write a C program to simulate any Turing machine). The converse is also true: Turing machines are as powerful everyday computers. Most of you will probably be willing to accept this claim. For doubters, a proof sketch is outlined.
Note that Turing machines are more powerful than FSA, PDA, and LBA (linear-bounded automata) - recall the Chomsky hierarchy. By power, we mean the ability to recognize more languages. A more powerful machine can recognize any language recognizable by a less powerful machine (i.e., it can solve all the problems solvable by a less powerful machine). In addition, a more powerful machine can recognize languages not recognizable by a less powerful machine. FSA's are less powerful than PDA's because PDA's can recognize palindromes (in addition to regular languages), but FSA's can't.
One limitation of the Turing machine that you may have noticed is that it applies to only a single problem. Computers have many different programs. They can execute any of them; (they execute whichever one is currently in memory.) The idea of a universal Turing machine is that it consists of many specific Turing machines. The input to a universal Turing machine specifies which Turing machine and the input string. The universal Turing machine is basically one which acts like the C program simulator. It can simulate the actions of any other Turing machine. So, the universal Turing machine corresponds more closely to computers as we know them.
Should be pretty self-explanatory. The thesis states that a Turing machine is as powerful as any computer. It implies that our intuitive notion of an algorithm coincides exactly with a Turing machine,
In terms of Turing machines, the Halting Problem can be expressed as follows:
Given a Turing machine T and a string w, does T halt on input w?
This is a difficult problem, since the tapes are infinitely long. Unlike finite state machines, there is no guaranteed end. Even if the Turing machine is deterministic, it may never crash and at the same time, never end. What the halting problem expresses is that we can't create a second Turing machine, which takes some encoding of the Turing machine in question, and determine whether that first Turing machine will halt.
C compilers can check syntax errors and even alert us to possibly uninitialized values. However, no compiler will be able to tell you whether you have an infinite loop in your program. If you had a few core dumps in your Star Tours programming assignment, you can probably appreciate the value such a compiler would have.
A proof outline is given to show you that the halting problem is unsolvable. It's difficult to follow this unless you've had some experience with proofs.
In addition to the Halting problem, many other problems are known to be unsolvable. Some have serious practical consequences.
The overall idea is that if you know that one problem is unsolvable (like the Halting Problem), you can show that a second problem is also unsolvable by showing that if you can solve the second problem, you can solve the first. (This is a major simplification.) This method is called reduction and is useful in a wide variety of contexts.
Computers have fundamental limitations. If human beings function anything like computers, then human beings also have intrinsic limitations!