#C7301. Classify Books by Genre
You are given a list of books along with associated keywords, and a list of genres with their specific keywords. For each book, determine the genre based on the number of matching keywords between the book's keywords and each genre's keywords. The process is case-insensitive and ignores non-alphabetical characters in keywords. If no genre keywords match, classify the book as Unknown.
The matching is done by cleaning each keyword (converting to lowercase and removing any characters other than a-z) and then counting intersecting keywords between the book and each genre. The genre with the highest count is chosen. If multiple genres have the same count, the one encountered first in the input is selected.
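The cleaning and counting rules above can be sketched in Python as follows. This is one possible reading, not a reference solution: the function names `clean` and `classify` are made up for illustration, and the sketch assumes that duplicate keywords count once and that keywords which become empty after cleaning are ignored, two points the statement does not spell out.

```python
import re

def clean(keyword):
    # Lowercase first, then drop every character outside a-z.
    return re.sub(r"[^a-z]", "", keyword.lower())

def classify(book_keywords, genres):
    """Pick a genre for one book.

    genres is a list of (name, keywords) pairs in input order.
    """
    # Assumption: empty strings left over after cleaning are discarded.
    book = {clean(k) for k in book_keywords} - {""}
    best_name, best_count = "Unknown", 0
    for name, keywords in genres:
        genre = {clean(k) for k in keywords} - {""}
        count = len(book & genre)
        # Strict '>' keeps the earliest genre on ties and leaves
        # "Unknown" in place when no keyword matches.
        if count > best_count:
            best_name, best_count = name, count
    return best_name

genres = [
    ("Science Fiction", ["space", "alien", "future"]),
    ("Drama", ["love", "tragedy", "life"]),
]
print(classify(["space", "time", "machine"], genres))  # Science Fiction
print(classify(["sea", "house", "light"], genres))     # Unknown
```

Starting the best count at zero and comparing with a strict `>` handles both the tie-breaking rule and the Unknown fallback without a separate check.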
inputFormat
The input is read from stdin and is formatted as follows:
- An integer n representing the number of books.
- n lines, each containing a book record. Each record consists of the book title followed by a space and a comma-separated list of keywords. Note that the title itself might contain spaces.
- An integer m representing the number of genre-keyword pairs.
- m lines, each containing a genre and its keywords in the format: Genre: keyword1,keyword2,...
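Because a title may contain spaces, a book record cannot be split on its first space. A minimal parsing sketch is shown below; it assumes the keyword list itself contains no spaces (as in the sample), so the keywords are the last space-separated token on the line. The helper names `parse_book`, `parse_genre`, and `parse_input` are hypothetical.

```python
def parse_book(line):
    # Assumption: the keyword list is the last space-separated token;
    # everything before it is the title, which may contain spaces.
    title, _, keyword_field = line.strip().rpartition(" ")
    return title, keyword_field.split(",")

def parse_genre(line):
    # Split on the first colon only: "Genre: keyword1,keyword2,..."
    name, _, keyword_field = line.partition(":")
    return name.strip(), [k.strip() for k in keyword_field.split(",")]

def parse_input(text):
    lines = text.splitlines()
    n = int(lines[0])
    books = [parse_book(line) for line in lines[1:1 + n]]
    m = int(lines[1 + n])
    genres = [parse_genre(line) for line in lines[2 + n:2 + n + m]]
    return books, genres

sample = """3
The Time Machine space,time,machine
The War of the Worlds alien,war,planet
To the Lighthouse sea,house,light
2
Science Fiction: space,alien,future
Drama: love,tragedy,life
"""
books, genres = parse_input(sample)
print(books[1])   # ('The War of the Worlds', ['alien', 'war', 'planet'])
print(genres[0])  # ('Science Fiction', ['space', 'alien', 'future'])
```

In a real submission the text would come from `sys.stdin.read()` rather than a string literal.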
outputFormat
For each book, output a line to stdout containing the original book title followed by a space and the predicted genre. If no keywords match any genre, output Unknown as the genre.
Sample input:
3
The Time Machine space,time,machine
The War of the Worlds alien,war,planet
To the Lighthouse sea,house,light
2
Science Fiction: space,alien,future
Drama: love,tragedy,life

Sample output:
The Time Machine Science Fiction
The War of the Worlds Science Fiction
To the Lighthouse Unknown