#C13257. Fraudulent Transaction Detection via Simulated Preprocessing and Classification
You are given a dataset representing financial transactions. Each transaction comes with d numerical features and a binary label indicating whether the transaction is fraudulent (1) or not (0). Your task is to simulate a machine learning workflow:
- Preprocessing: For each feature, compute the mean \(\mu\) and standard deviation \(\sigma\). Then, standardize each feature value \(x\) using the formula \(x' = \frac{x - \mu}{\sigma}\). (If \(\sigma = 0\), use \(1\) instead to avoid division by zero.)
- Classification: For each sample, compute the sum of its standardized feature values. If the sum is greater than or equal to 0, predict the label as 1; otherwise, predict 0.
- Evaluation: Using your predictions and the true labels, compute the confusion matrix.
The confusion matrix is a \(2 \times 2\) matrix defined as follows:
\[ CM = \begin{pmatrix} \text{TN} & \text{FP} \\ \text{FN} & \text{TP} \end{pmatrix} \]
Here,
- \(\text{TN}\): True Negatives (actual 0 and predicted 0)
- \(\text{FP}\): False Positives (actual 0 and predicted 1)
- \(\text{FN}\): False Negatives (actual 1 and predicted 0)
- \(\text{TP}\): True Positives (actual 1 and predicted 1)
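The three steps above can be sketched in Python as follows. This is a minimal illustration, not a reference solution: the function names `standardize`, `predict`, and `confusion_matrix` are hypothetical, and the population standard deviation (dividing by \(n\)) is an assumption, since the statement does not say which variant to use. For \(n \ge 2\), switching to the sample standard deviation would rescale every sum by the same positive factor, so the predictions would not change.

```python
import math

def standardize(rows):
    """Column-wise z-scores; a zero standard deviation is replaced by 1."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    # Assumption: population standard deviation (divide by n).
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
            for j in range(d)]
    stds = [s if s > 0 else 1.0 for s in stds]  # avoid division by zero
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]

def predict(z_rows):
    """Predict 1 when the sum of standardized features is >= 0, else 0."""
    return [1 if sum(z) >= 0 else 0 for z in z_rows]

def confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]]: row = actual label, column = prediction."""
    cm = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

# Data taken from the sample at the end of the statement.
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
labels = [0, 1, 1]
cm = confusion_matrix(labels, predict(standardize(features)))
print(cm)  # [[1, 0], [0, 2]]
```

Indexing the matrix as `cm[actual][predicted]` reproduces the TN/FP/FN/TP layout directly. One caveat: a sum that is exactly zero mathematically can come out slightly negative in floating-point arithmetic, and the statement does not specify a tolerance for that case.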
Your program should read from standard input and output the resulting confusion matrix as described below.
## inputFormat
The first line of input contains two integers n and d, where n is the number of samples and d is the number of features.
Then, n lines follow. Each of these lines contains d space-separated real numbers (the features) followed by an integer (0 or 1) which represents the target label.
## outputFormat
Print the confusion matrix in two lines. The first line should contain two space‐separated integers: the number of true negatives (TN) and false positives (FP). The second line should contain two space‐separated integers: the number of false negatives (FN) and true positives (TP).
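One possible way to handle this input and output format in Python is shown below. The helper names `parse_input` and `format_matrix` are hypothetical, and the sketch assumes the input is well formed, with each label written as a plain integer token. Splitting on arbitrary whitespace means the parser does not depend on exact line breaks.

```python
def parse_input(text):
    """Split the whole input into tokens and rebuild features and labels."""
    tokens = text.split()
    n, d = int(tokens[0]), int(tokens[1])
    features, labels = [], []
    pos = 2
    for _ in range(n):
        # d real-valued features followed by one integer label
        features.append([float(t) for t in tokens[pos:pos + d]])
        labels.append(int(tokens[pos + d]))
        pos += d + 1
    return features, labels

def format_matrix(cm):
    """Render [[TN, FP], [FN, TP]] as two lines of space-separated integers."""
    return "\n".join(" ".join(str(v) for v in row) for row in cm)

sample = "3 2\n1 2 0\n3 4 1\n5 6 1\n"
features, labels = parse_input(sample)
print(features, labels)
print(format_matrix([[1, 0], [0, 2]]))
```

In an actual submission, `text` would typically come from `sys.stdin.read()`.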
## sample
3 2
1 2 0
3 4 1
5 6 1
1 0
0 2
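The sample can be checked by hand. The feature means are \(\mu_1 = 3\) and \(\mu_2 = 4\). Both features have deviations \(-2, 0, 2\) from their means, so they share one standard deviation \(\sigma > 0\), whose exact value does not affect the sign of the sums \(s_i\) (here \(s_i\) is shorthand, not notation from the statement, for the sum of standardized features of sample \(i\)):

\[
\begin{aligned}
s_1 &= \frac{1 - 3}{\sigma} + \frac{2 - 4}{\sigma} = -\frac{4}{\sigma} < 0 &&\Rightarrow \text{predict } 0 \\
s_2 &= \frac{3 - 3}{\sigma} + \frac{4 - 4}{\sigma} = 0 \ge 0 &&\Rightarrow \text{predict } 1 \\
s_3 &= \frac{5 - 3}{\sigma} + \frac{6 - 4}{\sigma} = \frac{4}{\sigma} > 0 &&\Rightarrow \text{predict } 1
\end{aligned}
\]

The predictions \(0, 1, 1\) match the true labels \(0, 1, 1\), so \(\text{TN} = 1\), \(\text{FP} = 0\), \(\text{FN} = 0\), and \(\text{TP} = 2\).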