The concept of Z Algorithm is quite simple.
First of all, let us see what is a Z function or a Z array. At any index i, Z[i] stores longest match between
String[i,n] and String[0,n].
For example,
So, I hope after reading these examples i hope, its quite clear what a Z array is.
Let us redefine it, Z array stores longest match between all suffixes of a string and string itself. Or we can say if for any i, Z[i]>0 then that suffix is also a prefix of that string with length Z[i].
It has various applications in substring matching and finding count of a substring in a string.
Now let us see how to compute this array.
O(n^2) Algorithm : The Brute Force One.
Here, we one by one see each and every suffix and match it with the string.
For example if string is ababa
First of all, let us see what is a Z function or a Z array. At any index i, Z[i] stores longest match between
String[i,n] and String[0,n].
For example,
String Z array
a 1
aa 21
aba 301
ababa 50301
tests 50010
So, I hope after reading these examples i hope, its quite clear what a Z array is.
Let us redefine it, Z array stores longest match between all suffixes of a string and string itself. Or we can say if for any i, Z[i]>0 then that suffix is also a prefix of that string with length Z[i].
It has various applications in substring matching and finding count of a substring in a string.
Now let us see how to compute this array.
O(n^2) Algorithm : The Brute Force One.
Here, we one by one see each and every suffix and match it with the string.
for(i=0 to n-1) // assuming 0 indexing, n=string length
{
String suffix=str.substring(i,n); // suffix starting from i
int m=suffix.length(); // m=suffix length
int match=0; // for storing current longest match.
for(int j=0;j<m;j++)
{
if(str.charAt(j)==suffix.charAt(j)) match++; // if match found increment match
else break; // else break
}
Z[i]=match; // store the match in Z[i].
}
{
String suffix=str.substring(i,n); // suffix starting from i
int m=suffix.length(); // m=suffix length
int match=0; // for storing current longest match.
for(int j=0;j<m;j++)
{
if(str.charAt(j)==suffix.charAt(j)) match++; // if match found increment match
else break; // else break
}
Z[i]=match; // store the match in Z[i].
}
In this way we can get the Z function in O(n^2) time, but as we can see all later suffixes are also suffixes of previous ones.We can use this fact to come up with a better approach.
O(n) Algorithm : The Smart One.
As, I have said earlier we can use a fact that we can use the knowledge of previous computed longest match of a previous suffix.To do this we will use two variables L and R (interval) .These two variables for each index will store length of longest substring string[L..R] that is also a prefix of the string.Now how are these two values useful, as we can see in case of a match we can Z[i]=R-L+1; and if no match we need to make R such that Z[i] is 0 and L will be start of that match.
For example if string is ababa
Index L:R
0 0:4
1 1:0
2 2:4
3 3:2
4 4:4
Now let us see how the algorithm works,suppose we have calculated the values of L and R for an index i, then for an index i+1, we have two simple conditions,if(i>R) then we need to compute new values of L and R,else we can either use a precomputed Z value or compute a new interval.
Now let us see how the algorithm works (source : codeforces blog)
int L=0,R=0; // interval L and R initialize
Z[0]=n; // this will always be n
for(i=1 to n) // start with 1 now.
{
if(i>R) // our initial case
{
L=R=i; // set L and R initially before matching
while(R<n&&str.charAt(R)==str.charAt(R-i)) R++; //new R
Z[i]=R-L; // store Z
Z[0]=n; // this will always be n
for(i=1 to n) // start with 1 now.
{
if(i>R) // our initial case
{
L=R=i; // set L and R initially before matching
while(R<n&&str.charAt(R)==str.charAt(R-i)) R++; //new R
Z[i]=R-L; // store Z
R--; // bring R back, since last R did not matched.
}
else
{
int k=i-L; //length of current block of L...R
if(Z[k]<R-i+1) Z[i]=Z[k]; //since L...R is equal to 0...R-L+1 (match thats why L,R)
else
{
L=i; // in this case a greater match is possible
while(R<n&&str.charAt(R)==str.charAt(R-i)) R++; //compute a new R
Z[i]=R-L;R--; // store the new match
}
}
}
}
else
{
int k=i-L; //length of current block of L...R
if(Z[k]<R-i+1) Z[i]=Z[k]; //since L...R is equal to 0...R-L+1 (match thats why L,R)
else
{
L=i; // in this case a greater match is possible
while(R<n&&str.charAt(R)==str.charAt(R-i)) R++; //compute a new R
Z[i]=R-L;R--; // store the new match
}
}
}
In this way, we can compute the Z array.
Problems : CHSTR
No comments:
Post a Comment