#C13782. Log Analyzer and Anomaly Summarizer

    ID: 43358 Type: Default 1000ms 256MiB

You are given multiple log files, each containing several log entries. Every log entry has the format timestamp metadata, where timestamp is in the ISO format YYYY-MM-DDTHH:MM:SS and metadata is a string describing the event.

Your task is to:

  • Merge all the log entries from the given log files.
  • Sort the merged logs by their timestamps in ascending order.
  • Detect anomalies by matching the metadata against a given regular expression pattern.
  • Within a given time window, summarize the anomalies by counting the number of occurrences of each anomaly message.

    The regular expression uses the usual syntax. Timestamps have the format YYYY-MM-DDTHH:MM:SS. The summary must list each anomaly and its frequency, in the order in which the anomalies first appear in the sorted logs.
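The steps listed above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a reference solution: the name `summarize_anomalies` is hypothetical, the pattern is assumed to match anywhere in the metadata (`re.search`), the anomaly message is assumed to be the full metadata string (as the sample output suggests), and entries with equal timestamps are assumed to keep their input order.

```python
import re
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def summarize_anomalies(files, pattern, start_time, end_time):
    """Return [(message, count), ...] in order of first appearance.

    `files` is a list of log files, each a list of "timestamp metadata" strings.
    """
    start = datetime.strptime(start_time, TIME_FORMAT)
    end = datetime.strptime(end_time, TIME_FORMAT)
    regex = re.compile(pattern)

    # Merge: flatten all files, splitting each entry at the first space.
    merged = []
    for entries in files:
        for entry in entries:
            stamp, _, metadata = entry.partition(" ")
            merged.append((datetime.strptime(stamp, TIME_FORMAT), metadata))

    # Sort by timestamp only; list.sort is stable, so ties keep input order
    # (an assumption, the statement does not define tie-breaking).
    merged.sort(key=lambda item: item[0])

    # Count matches inside the inclusive window. A dict preserves insertion
    # order (Python 3.7+), which gives "order of first appearance" for free.
    counts = {}
    for stamp, metadata in merged:
        if start <= stamp <= end and regex.search(metadata):
            counts[metadata] = counts.get(metadata, 0) + 1
    return list(counts.items())
```

Parsing the timestamps into `datetime` objects is a design choice; because the format is fixed-width and zero-padded, comparing the timestamp strings directly would order them the same way.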

    inputFormat

    The input is read from standard input (stdin) and has the following format:

    N
    M1
    log_entry_1
    log_entry_2
    ... 
    log_entry_M1
    M2
    log_entry_1
    ... 
    log_entry_M2
    ...
    MN
    log_entry_1
    ... 
    log_entry_MN
    pattern
    start_time
    end_time
    

    where:

    • N is the number of log files.
    • For each log file, Mi is the number of log entries in that file.
    • Each log_entry is a string in the format: YYYY-MM-DDTHH:MM:SS metadata.
    • pattern is a regular expression used to detect anomalies in the metadata.
    • start_time and end_time define the time interval in ISO format (YYYY-MM-DDTHH:MM:SS) over which the anomalies should be summarized.
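The layout above could be read with a small parser such as the one below. It is a sketch only: `parse_input` is a hypothetical helper, and it assumes that the input contains no blank lines before the three trailing fields and that each count line holds a single integer.

```python
def parse_input(text):
    """Split the raw input into (files, pattern, start_time, end_time)."""
    lines = text.splitlines()
    pos = 0

    # First line: N, the number of log files.
    n = int(lines[pos])
    pos += 1

    files = []
    for _ in range(n):
        # Each file starts with Mi, followed by Mi log entries.
        m = int(lines[pos])
        pos += 1
        files.append(lines[pos:pos + m])
        pos += m

    # The last three fields: pattern, start_time, end_time.
    pattern, start_time, end_time = lines[pos:pos + 3]
    return files, pattern, start_time, end_time
```

In a full program the text would typically come from `sys.stdin.read()`.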

    outputFormat

    Print the output to standard output (stdout). For each distinct anomaly message that occurs between start_time and end_time (inclusive), in order of first appearance in the sorted logs, print a line containing the message and its frequency separated by a space. If no anomalies are detected in the given time window, print a single line: No anomalies found.
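A possible way to produce these output lines is sketched below. The name `format_summary` and the example messages are hypothetical. The statement does not make clear whether the final period belongs to the fallback line, so the sketch takes it as a parameter and assumes by default that it does not.

```python
def format_summary(counts, empty_message="No anomalies found"):
    """Turn [(message, count), ...] into the lines to print.

    `empty_message` defaults to the text without a trailing period; this is
    an assumption, adjust it if the judge expects the period.
    """
    if not counts:
        return [empty_message]
    return [f"{message} {count}" for message, count in counts]


# Hypothetical data: one message seen twice, another seen once.
for line in format_summary([("disk error", 2), ("timeout anomaly", 1)]):
    print(line)
```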

    ## sample
    2
    2
    2023-10-01T12:00:00 EventA error occurred
    2023-10-01T12:05:00 EventB occurred
    2
    2023-10-01T12:03:00 EventC anomaly detected
    2023-10-01T12:10:00 EventD occurred
    error|anomaly
    2023-10-01T12:00:00
    2023-10-01T12:10:00
    
    EventA error occurred 1
    EventC anomaly detected 1
