#C12745. Text Message Classification Accuracy

    ID: 42206 Type: Default 1000ms 256MiB

You are given CSV data on standard input containing text messages and their associated labels. The data has a header row with two columns: message and label. Your task is to build a text classification system using the following process:

  • Extract features from the text messages using the TF-IDF method.
  • Split the dataset into training and testing sets (use a test size of 20% and fix random_state=42 for reproducibility).
  • Train a Logistic Regression model on the training set.
  • Predict the labels on the test set and compute the accuracy.
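
The four steps above can be sketched as follows. This is a minimal sketch, assuming scikit-learn is available on the judge and that default parameters are intended for both `TfidfVectorizer` and `LogisticRegression`; the statement does not specify either. The function name `evaluate` is hypothetical. The vectorizer is fitted on all messages before the split because the steps list feature extraction first.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def evaluate(messages, labels):
    """Return test-set accuracy for a TF-IDF + Logistic Regression classifier.

    `evaluate` is a hypothetical helper name; default hyperparameters are an
    assumption, since the statement does not list any.
    """
    # Step 1: turn each message into a TF-IDF feature vector.
    features = TfidfVectorizer().fit_transform(messages)

    # Step 2: hold out 20% of the rows, with the seed fixed by the statement.
    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.2, random_state=42
    )

    # Step 3: fit Logistic Regression on the training rows only.
    model = LogisticRegression()
    model.fit(x_train, y_train)

    # Step 4: predict the held-out labels and score them.
    return float(accuracy_score(y_test, model.predict(x_test)))
```

A full submission would read the CSV from standard input, pass the two columns to `evaluate`, and print the returned value.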

The accuracy is given by the formula:

$$\text{accuracy}=\frac{\text{number of correct predictions}}{\text{total number of predictions}}.$$
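
As a small illustration of the formula, the sketch below counts matching positions in two label lists. The helper name `accuracy` and the example labels are hypothetical and are not part of the statement.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one prediction is required")
    # Number of correct predictions divided by total number of predictions.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Three of the four predictions match, so the result is 3 / 4.
print(accuracy(["ham", "spam", "spam", "ham"], ["ham", "spam", "ham", "ham"]))  # 0.75
```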

Your program should output the computed accuracy as a floating-point number to standard output.

## inputFormat

The input is provided via standard input and consists of CSV data. The first line is a header: message,label. Each subsequent line contains a text message and its corresponding label separated by a comma. For example:

message,label
hello,ham
buy now,spam
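
One way to read this format is sketched below with Python's standard `csv` module. The helper name `parse_messages` is hypothetical. Treating the last field as the label and rejoining any earlier fields is an assumption made here so that a message containing a comma still parses; the statement does not say whether such messages occur.

```python
import csv
import io


def parse_messages(text):
    """Split CSV text with a `message,label` header into two parallel lists."""
    reader = csv.reader(io.StringIO(text, newline=""))
    next(reader, None)  # skip the header row
    messages, labels = [], []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        # Assumption: the label is the last field; anything before it
        # belongs to the message, even if the message contained commas.
        messages.append(",".join(row[:-1]))
        labels.append(row[-1])
    return messages, labels
```

In a submission, the input text would come from `sys.stdin.read()`.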

## outputFormat

Output a single floating-point number representing the accuracy of the Logistic Regression model on the test data. The result should be printed to standard output.

## sample
message,label
hello,ham
hello friend,ham
greetings,ham
good day,ham
buy now,spam
limited offer,spam
discount available,spam
cheap price,spam
1.0