#P4235. Expected Distinct Hash Count

    ID: 17482 Type: Default 1000ms 256MiB

Expected Distinct Hash Count

Expected Distinct Hash Count

You are given a set of n strings and an integer parameter x. For each string s, we define a set of hash values computed by a randomized polynomial hash function:

For every integer r in [0, x), the hash value of s is computed as

[ H(s, r)=\Bigl(\sum_{i=1}^{|s|} (s_i-\mathtt{'a'}+1) \cdot r^{|s|-i} \Bigr) \bmod 65537, ]

where the modulus 65537 = 2^{16}+1 is prime. Note that when r = 0, the hash reduces to the value of the last character (converted to a number in the range 1 to 26).

For each string, a random integer r is chosen uniformly at random from [0, x) (using a procedure analogous to Rand(x)), and then the hash H(s, r) is computed. The ans is defined as the number of distinct hash values produced over all the strings.

Your task is to compute the expected value of ans over the randomness of the independent choices of r for each string.

Hint: For each possible hash value v in the union of all possible outcomes, the probability that v is not produced by a given string s is \[ \begin{cases} 1-\frac{1}{x} & \text{if } v \in \{H(s, r): 0\le r<x\}\\ 1 & \text{otherwise.} \end{cases} \] Thus, the expected contribution of v is \[ 1-\prod_{s: v \in \{H(s, r)\}}\Bigl(1-\frac{1}{x}\Bigr). \] Summing over all v gives the final expected number of distinct hash values.

inputFormat

The input consists of the following:

  1. A line containing two integers n and x (1 ≤ n, x ≤ reasonable limits), where n is the number of strings and x is the upper bound for the random base.
  2. Then n lines follow, each containing a non-empty string consisting of lowercase English letters.

Note: Although in the sample code the constant x is fixed to 10, in this problem, x is given as part of the input.

outputFormat

Output the expected value of the distinct hash count (ans) as a floating point number. The answer will be considered correct if it has an absolute or relative error of at most 10-6.

sample

3 10
a
b
c
0.300000