#C14783. Document Classification

    ID: 44470 Type: Default 1000ms 256MiB

You are required to implement a simple text classification system. Given an input document, your task is to classify it into one of the three categories:

  • rec.sport.baseball
  • sci.med
  • rec.motorcycles

The classification should be based on keyword matching according to the following rules:

  • If the document contains the word baseball (case-insensitive), classify it as rec.sport.baseball.
  • If the document contains any of the words medicine, clinical, or cancer (case-insensitive), classify it as sci.med.
  • If the document contains the word motorcycle (case-insensitive), classify it as rec.motorcycles.

If none of these keywords appear, output Unknown as the category. The output should be formatted as detailed below.
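The rules above can be sketched in Python as follows. This is a minimal sketch, not a reference solution, and it rests on two assumptions the statement does not settle: the rules are checked in the order listed and the first match wins, and "contains the word" is treated as a plain substring test (so "motorcycles" also matches motorcycle). The names `RULES` and `classify` are illustrative.

```python
# Rules in the order the statement lists them. Checking them in this order
# (first match wins) is an assumption; the statement does not say what to do
# when keywords from several categories appear in the same document.
RULES = [
    ("rec.sport.baseball", ("baseball",)),
    ("sci.med", ("medicine", "clinical", "cancer")),
    ("rec.motorcycles", ("motorcycle",)),
]


def classify(document: str) -> str:
    """Return the category of the first rule whose keyword occurs in the text."""
    text = document.lower()  # lowercase once for case-insensitive comparison
    for category, keywords in RULES:
        # Substring matching is an assumption: "motorcycles" also matches.
        if any(keyword in text for keyword in keywords):
            return category
    return "Unknown"  # no keyword from any rule was found
```

If whole-word matching turns out to be required, the substring test could be swapped for a regular expression with word boundaries without changing the rest of the function.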

## Input Format

Input is provided via standard input (stdin) as a single document. The document is a string that can span multiple lines and contains spaces and punctuation.

## Output Format

Output to standard output (stdout) a single line in the following format:

The document is classified as: <category>

where <category> is one of: rec.sport.baseball, sci.med, or rec.motorcycles. If none of the keywords is found, <category> is Unknown.

## Sample

Input:
The baseball game was amazing and the team performed very well.

Output:
The document is classified as: rec.sport.baseball
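
The input and output handling described in this statement might look like the sketch below. It reads the whole document from a stream (the input may span several lines) and writes the single result line. The function name `run` and its parameters are hypothetical, and the classifier is passed in as a callable so the sketch stays independent of any particular matching logic. It also assumes that Unknown is printed after the same prefix as the other categories.

```python
import io
import sys


def run(classify, in_stream, out_stream):
    """Read one document from in_stream and write its classification line."""
    document = in_stream.read()  # the document may span multiple lines
    category = classify(document)
    # Assumption: "Unknown" uses the same prefix as the named categories.
    out_stream.write(f"The document is classified as: {category}\n")


if __name__ == "__main__":
    # Placeholder classifier for demonstration only; it knows a single rule.
    demo = lambda text: "rec.sport.baseball" if "baseball" in text.lower() else "Unknown"
    sample = io.StringIO("The baseball game was amazing and the team performed very well.\n")
    run(demo, sample, sys.stdout)
```

In an actual submission, `run(classify, sys.stdin, sys.stdout)` would be called with a classifier that implements all of the keyword rules.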