#P8013. Minimizing Insertion Cost to Embed a Target Substring
Minimizing Insertion Cost to Embed a Target Substring
Minimizing Insertion Cost to Embed a Target Substring
Given a DNA string S composed solely of the characters A
, C
, G
, and T
, you are allowed to insert additional characters from the set {A, C, G, T} at any positions in S. The goal is to ensure that the resulting string contains the target substring ACGT
as a contiguous block while minimizing the total insertion cost. The cost for inserting each character is given and may differ from one character to another.
Formally, let S be the original string and let T be the fixed target string "ACGT". You are also provided four integers representing the insertion cost for A, C, G, and T respectively. You must insert extra letters (if needed) into S such that there exists a contiguous substring in the resulting string equal to T. When choosing to align part of T with a segment of S, note that because S cannot be modified or deleted, the matched segment must appear as a contiguous substring in S and must exactly match a contiguous segment of T. In other words, if you choose a contiguous segment T[i...j] that appears in S, then you need to insert the missing characters T[0...i-1] before it and T[j+1...end] after it. Your task is to compute the minimum total cost required to achieve this. If S already contains T, the cost is 0.
The cost of inserting a sequence of characters is simply the sum of insertion costs of each character inserted. You may also opt not to use any part of S to match T, which is equivalent to inserting the entire target string T at an appropriate position.
Note on formula:
If you use a contiguous matching segment T[i...j] found in S, then the total insertion cost is given by:
[ \text{cost} = \sum_{k=0}^{i-1} c(T[k]) + \sum_{k=j+1}^{3} c(T[k]) ]
where \(c(x)\) is the insertion cost of character \(x\), and T = "ACGT" (with indices 0 to 3).
inputFormat
The input consists of two lines:
- The first line contains the string S composed only of the characters A, C, G, and T.
- The second line contains four space-separated integers representing the insertion cost for A, C, G, and T respectively.
outputFormat
Output a single integer representing the minimum total insertion cost required to ensure that S (after insertions) contains the contiguous substring "ACGT".
sample
ACGT
1 2 3 4
0