#C12745. Text Message Classification Accuracy
You are given a CSV file (via standard input) containing text messages and their associated labels. The CSV has a header row with two columns: `message` and `label`. Your task is to build a text classification system using the following process:
- Extract features from the text messages using the TF-IDF method.
- Split the dataset into training and testing sets (use a test size of 20% and fix `random_state=42` for reproducibility).
- Train a Logistic Regression model on the training set.
- Predict the labels on the test set and compute the accuracy.
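The four steps above can be sketched as follows. This is a minimal sketch, assuming scikit-learn is available; the function name `evaluate` is hypothetical, the default hyperparameters of `TfidfVectorizer` and `LogisticRegression` are an assumption (the statement does not specify any), and TF-IDF is fitted on the full dataset before splitting because that is the order in which the steps are listed.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def evaluate(messages, labels):
    """Return the test-set accuracy for parallel lists of messages and labels."""
    # Step 1: TF-IDF features, computed on all messages before splitting
    # (assumption: this mirrors the order of the steps in the statement).
    features = TfidfVectorizer().fit_transform(messages)

    # Step 2: 80/20 split with the fixed seed required by the statement.
    X_train, X_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.2, random_state=42
    )

    # Step 3: Logistic Regression with default settings (an assumption).
    model = LogisticRegression()
    model.fit(X_train, y_train)

    # Step 4: fraction of test messages whose predicted label is correct.
    return float(accuracy_score(y_test, model.predict(X_test)))
```

With the eight-row sample, a 20% test size leaves two messages in the test set, so the result can only be 0.0, 0.5, or 1.0.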
The accuracy is given by the formula:
$$\text{accuracy}=\frac{\text{number of correct predictions}}{\text{total number of predictions}}.$$
Your program should output the computed accuracy as a floating-point number to standard output.
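As a small worked instance of the formula, the sketch below counts matching labels by hand. The helper name `accuracy` and the example labels are illustrative only, not part of the task data.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("accuracy is undefined for an empty prediction list")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Three of the four predictions match, so the accuracy is 3 / 4.
print(accuracy(["ham", "spam", "spam", "ham"],
               ["ham", "spam", "ham", "ham"]))  # 0.75
```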
inputFormat
The input is provided via standard input and consists of CSV data. The first line is a header: `message,label`. Each subsequent line contains a text message and its corresponding label separated by a comma. For example:
message,label
hello,ham
buy now,spam
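One way to read this format is with Python's standard `csv` module, sketched below. The helper name `parse_messages` is hypothetical. Using `csv.DictReader` rather than splitting on commas is a design choice: it also copes with quoted messages that contain commas, although the statement does not say whether such rows occur.

```python
import csv
import io


def parse_messages(csv_text):
    """Split CSV text with a `message,label` header into two parallel lists."""
    reader = csv.DictReader(io.StringIO(csv_text))
    messages, labels = [], []
    for row in reader:
        messages.append(row["message"])
        labels.append(row["label"])
    return messages, labels


example = "message,label\nhello,ham\nbuy now,spam\n"
print(parse_messages(example))  # (['hello', 'buy now'], ['ham', 'spam'])
```

In a full solution the text would come from standard input, for example `parse_messages(sys.stdin.read())` after `import sys`.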
outputFormat
Output a single floating-point number representing the accuracy of the Logistic Regression model on the test data. The result should be printed to standard output.
## sample
message,label
hello,ham
hello friend,ham
greetings,ham
good day,ham
buy now,spam
limited offer,spam
discount available,spam
cheap price,spam
1.0