#P8643. Counting Common DNA Substrings
Counting Common DNA Substrings
Counting Common DNA Substrings
Biologists are studying n species, each with a DNA sequence. For the ith species the DNA sequence is s[i], and its jth base is s[i][j] (each base is one of A, G, C, T). To find some common features among the species, scientists are interested in all contiguous DNA subsequences of length k that appear in at least m species.
More formally, a contiguous subsequence is represented by a 2m-tuple \[ (i_1, p_1, i_2, p_2, \ldots, i_m, p_m) \] which satisfies the following conditions:
- \(1 \le i_1 < i_2 < \cdots < i_m \le n\)
- For every offset \(q\) with \(0 \le q < k\), the equality \[ s[i_1][p_1+q] = s[i_2][p_2+q] = \cdots = s[i_m][p_m+q] \] holds, meaning that the same contiguous DNA string of length \(k\) appears in all the chosen species at positions \(p_1, p_2, \ldots, p_m\) respectively.
Your task is to compute how many such 2m-tuples exist. Two tuples are considered different if any of the positions in the tuple differs.
Note: The contiguous subsequence in each species can start at any valid position as long as the entire length \(k\) fits within the sequence.
inputFormat
The first line contains three integers \(n\), \(k\), and \(m\) separated by spaces.
Then \(n\) lines follow, where the \(i\)-th line contains the DNA sequence of the \(i\)-th species. Each sequence is a non-empty string consisting only of the characters A, G, C, and T.
It is guaranteed that each DNA sequence has length at least \(k\).
outputFormat
Output a single integer representing the number of valid 2m-tuples.
sample
2 2 2
AGCT
TGCA
1
</p>