DNA Sequence Completion and Lexicographical Ordering

ID: 16875

Type: Default

1000ms

256MiB

In biology, a DNA sequence is a chain of nucleotides represented by the characters A, C, G, and T. Sometimes, however, some nucleotides may be uncertain and are represented by the character N. A DNA sequence that does not contain any N is called a completed sequence, and an unfinished sequence is one that contains one or more N characters.

A completed sequence is said to suit an unfinished sequence if it can be obtained by replacing every N in the unfinished sequence by one of A, C, G, or T. Formally, given an unfinished sequence S and a completed sequence C of the same length, C suits S if for every index \(i\), either \(S[i]=N\) (in which case any nucleotide is allowed) or \(S[i]=C[i]\).

Researchers order the nucleotides by the following priority: \(A \prec C \prec G \prec T\). Thus, lexicographical order of DNA sequences is defined using this ordering. For example, consider the unfinished sequence ACANNCNNG. The first 7 completed sequences that suit it in lexicographical order are:

ACAAACAAG
ACAAACACG
ACAAACAGG
ACAAACCAG
ACAAACCCG
ACAAACCGG
ACAAACCTG

Your task is to generate the first \(K\) lexicographically smallest completed sequences that suit a given unfinished DNA sequence.

Input Format (see below).

Note: All formulas are given in \(\LaTeX\) format. For example, the condition for a completed sequence \(C\) to suit an unfinished sequence \(S\) can be written as: \(\forall i,\; (S[i] \neq N \implies C[i]=S[i])\).

inputFormat

The input consists of two lines:

The first line contains a non-empty string representing the unfinished DNA sequence. The string only contains the characters 'A', 'C', 'G', 'T', and 'N'.
The second line contains an integer \(K\) (\(K \ge 1\)), which indicates the number of completed sequences to output.

outputFormat

Output exactly \(K\) lines. Each line should contain one completed DNA sequence (i.e. a sequence with no 'N') that suits the given unfinished sequence. The sequences must be listed in lexicographical order based on the ordering \(A \prec C \prec G \prec T\).

sample

ACANNCNNG
7

ACAAACAAG
ACAAACACG
ACAAACAGG
ACAAACCAG
ACAAACCCG
ACAAACCGG
ACAAACCTG

</p>

#P3624. DNA Sequence Completion and Lexicographical Ordering

DNA Sequence Completion and Lexicographical Ordering