#K34732. Email Subject Word Frequency
You are given an integer n representing the number of email subjects, followed by n lines, each containing one email subject. Your task is to parse these email subjects and compute the frequency of each word that is not a common stopword.
The words should be treated in a case-insensitive manner (e.g. "Quick" is equivalent to "quick"). Once the frequency is computed, output a JSON-formatted dictionary with the words sorted in lexicographical order. Each key in the dictionary should be a word and its corresponding value should be the frequency of that word.
Notes:
- Common stopwords to ignore are: \( \{\texttt{the, is, in, and, an, a, to, of, for, on, at, with, by, from, as, this, that, these, those} \} \).
- The output must be valid JSON in compact form, with no spaces after commas or colons.
Hint: Consider using a dictionary (or map) to store word counts and then sort the keys before printing.
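The hint can be sketched in Python as follows. This is a minimal sketch, not a reference solution: the names `STOPWORDS` and `count_words` are hypothetical, and it assumes that words are separated by whitespace and that punctuation is left attached to the word, since the statement does not define tokenization.

```python
from collections import Counter

# Stopword list copied from the Notes section above.
STOPWORDS = frozenset(
    "the is in and an a to of for on at with by from as this that these those".split()
)


def count_words(subjects):
    """Return a dict of word counts whose keys are in lexicographical order."""
    counts = Counter()
    for subject in subjects:
        # Assumption: whitespace-separated words; punctuation is not stripped.
        for word in subject.lower().split():
            if word not in STOPWORDS:
                counts[word] += 1
    # Python dicts keep insertion order, so inserting keys in sorted order
    # fixes the order in which they will later be printed.
    return {word: counts[word] for word in sorted(counts)}
```

Lowercasing before the stopword check means that "AND" and "The" are filtered out just like "and" and "the".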
Input Format
The first line of input contains an integer n (\(0 \leq n \leq 10^5\)), the number of email subjects. The following n lines each contain a non-empty string representing an email subject.
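Reading this input format might look like the sketch below. The function name `read_subjects` is hypothetical, and the code assumes well-formed input. Note that n may be 0, in which case no subject lines follow.

```python
def read_subjects(text):
    """Split the raw input text into the list of n subject lines."""
    lines = text.splitlines()
    if not lines:
        return []
    n = int(lines[0].strip())
    # Take exactly n lines after the count; with n == 0 this is an empty list.
    return lines[1:1 + n]
```

With n up to 10^5, reading the whole input at once, for example `read_subjects(sys.stdin.read())`, is usually faster in Python than calling `input()` once per line.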
Output Format
Output a single line containing a JSON object. The keys of the JSON object are the non-stopword words (in lowercase) sorted in lexicographical order, and the values are the respective frequency counts.
For example: {"away":1,"brown":1,...}
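One way to produce this compact form in Python is `json.dumps` with explicit separators. This is a sketch; the helper name `to_compact_json` is hypothetical.

```python
import json


def to_compact_json(counts):
    """Serialize word counts as a single-line JSON object without spaces."""
    # By default json.dumps writes ", " and ": " between items; passing
    # separators=(",", ":") removes those spaces. sort_keys=True orders
    # the keys lexicographically.
    return json.dumps(counts, sort_keys=True, separators=(",", ":"))


print(to_compact_json({"brown": 1, "away": 1}))  # {"away":1,"brown":1}
```

An empty dictionary serializes to `{}`, which covers the n = 0 case.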
Sample Input
3
The quick brown fox
Jumped over the lazy dog
AND ran away quickly
Sample Output
{"away":1,"brown":1,"dog":1,"fox":1,"jumped":1,"lazy":1,"over":1,"quick":1,"quickly":1,"ran":1}