#C752. Text Evaluation Metrics

    ID: 51400 Type: Default 1000ms 256MiB

You are given a block of text. Compute the following metrics from it:

  • Total Words: The total number of words in the text.
  • Unique Words: The number of distinct words (case-insensitive, punctuation ignored).
  • Lexical Diversity: The ratio \(\frac{\text{unique words}}{\text{total words}}\) (if there are no words, output 0).
  • Most Common Words: The five most frequent words, ordered from highest to lowest frequency. Words that share the same frequency are sorted in ascending lexicographical (alphabetical) order. If there are fewer than five unique words, output all of them.
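
The four metrics above can be sketched in Python as follows. This is a minimal illustration rather than a reference solution: the function name `compute_metrics` is hypothetical, and it assumes the words have already been extracted and lowercased.

```python
from collections import Counter

def compute_metrics(words):
    """Return (total, unique, diversity, top five) for a list of lowercased words."""
    counts = Counter(words)
    total = len(words)
    unique = len(counts)
    # Lexical diversity is defined as 0 when the text contains no words.
    diversity = unique / total if total else 0
    # Highest count first; ties broken by ascending alphabetical order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return total, unique, diversity, ranked[:5]

print(compute_metrics(["hello", "hello", "world", "world"]))
# (4, 2, 0.5, [('hello', 2), ('world', 2)])
```

Sorting on the key `(-count, word)` handles both ordering rules in one pass, which avoids relying on the insertion-order tie-breaking of `Counter.most_common`.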

Your program should read the input text from standard input and output the computed metrics in JSON format.

The output JSON object should contain the following keys:

  • total_words : An integer representing the total number of words.
  • unique_words : An integer representing the number of distinct words.
  • lexical_diversity : A floating-point number representing the lexical diversity.
  • most_common_words : A list of pairs, where each pair contains a word (string) and its corresponding count (integer). The list should contain at most five such pairs.

inputFormat

The input is a single block of text, which may span multiple lines and may be empty. A word is a contiguous sequence of alphanumeric characters (letters and digits); any other character separates words. Words are compared case-insensitively.
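
One possible way to extract words under this definition is a regular expression, sketched below. The helper name `tokenize` is hypothetical, and the pattern assumes that "letters and digits" means ASCII characters only, which the statement does not confirm. Lowercasing each word follows the sample, where the output words appear in lowercase.

```python
import re

def tokenize(text):
    """Split text into lowercased words made of ASCII letters and digits."""
    # Each match is a maximal run of alphanumeric characters;
    # punctuation and whitespace act as separators.
    return [word.lower() for word in re.findall(r"[A-Za-z0-9]+", text)]

print(tokenize("Hello, hello!\nWorld-world 42"))
# ['hello', 'hello', 'world', 'world', '42']
```

Under this reading an apostrophe also separates words, so `don't` yields the two words `don` and `t`.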

outputFormat

The output should be a JSON object (printed as a single line) with the following keys:

  • total_words
  • unique_words
  • lexical_diversity
  • most_common_words (a list of [word, count] pairs)
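
A single-line JSON object in the shape of the sample can be produced with Python's standard `json` module, as in the sketch below. The function name `format_metrics` is hypothetical. The compact separators are an assumption based on the sample output, which contains no spaces; the statement does not say whether whitespace or float formatting is checked exactly.

```python
import json

def format_metrics(total, unique, diversity, most_common):
    """Serialize the metrics as a compact single-line JSON object."""
    payload = {
        "total_words": total,
        "unique_words": unique,
        "lexical_diversity": diversity,
        # Each (word, count) pair becomes a two-element JSON array.
        "most_common_words": [[word, count] for word, count in most_common],
    }
    # separators=(",", ":") removes the spaces json.dumps adds by default.
    return json.dumps(payload, separators=(",", ":"))

print(format_metrics(4, 2, 0.5, [("hello", 2), ("world", 2)]))
# {"total_words":4,"unique_words":2,"lexical_diversity":0.5,"most_common_words":[["hello",2],["world",2]]}
```

The keys appear in the order they were inserted into the dictionary, which matches the key order listed above.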
## sample
Hello hello World world
{"total_words":4,"unique_words":2,"lexical_diversity":0.5,"most_common_words":[["hello",2],["world",2]]}