CISC 3220 - Homework #8

String Matching:

(1) Give an algorithm to fill up the FAIL array for the Knuth-Morris-Pratt pattern matching algorithm. Specifically, FAIL[j] should contain the length of the longest proper suffix of P[1...j] that is also a prefix of P.

Analyze the time complexity of your algorithm. As long as it is correct, it does not have to take linear time.

(2) Given a pattern P = abaaba . Draw the finite automaton for this pattern, assuming the alphabet is {a, b, c}. Recall that there should be 3 transitions leaving each state in the FA since the size of the alphabet is 3.

(3) Construct the FAIL array of the KMP algorithm for the pattern P in the previous question. Run the KMP text scanning algorithm on the following text: T= aababacabaabccaabaaba

(4) Let P = aaaaaaab . Draw the KMP automaton for this pattern, and then run the KMP text scanning algorithm on T=aaaaaaaaaaaaaaa. (Note: To show your work, write the states underneath the text characters, as we did in class.)

(5) Compute the edit distance between the following two strings, using the dynamic programming matrix.

S1 = THURSDAY
S2 = TUESDAY

What is the optimal alignment of these two strings?