#C14822. CSV Customer Data Preprocessing
You are given a CSV dataset from standard input containing customer information. The CSV file includes a header row with the following columns:
CustomerID, Name, Age, Email, PurchaseAmount, JoinDate
Your task is to preprocess the data with the following steps:
- Replace missing values in the Age and PurchaseAmount columns with the arithmetic mean of the non-missing values in the respective column. Do not round the mean during computation; when printed, numeric values may be formatted to at most two decimal places as needed.
- Convert the JoinDate column (format: YYYY-MM-DD) into a date and compute a new column DaysSinceJoining, the number of days between a fixed current date and the JoinDate. For this problem, assume the current date is 2020-12-31.
- Based on the PurchaseAmount value, create a new column CustomerCategory with the following rules: Low: $\text{PurchaseAmount} < 100$; Medium: $100 \le \text{PurchaseAmount} < 500$; High: $\text{PurchaseAmount} \ge 500$.
- Output the processed dataset as CSV to standard output, including the original columns and the two new columns (DaysSinceJoining and CustomerCategory). The output CSV must contain a header row.
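The three transformation steps above can be sketched as small helper functions. This is only one possible approach, not a reference solution: the names `fill_with_mean`, `days_since_joining`, `categorize`, and `CURRENT_DATE` are hypothetical, and the behaviour when every value in a column is missing is an assumption, since the statement does not cover that case.

```python
from datetime import date

# Fixed "current date" given in the problem statement.
CURRENT_DATE = date(2020, 12, 31)


def fill_with_mean(values):
    """Replace None entries with the unrounded mean of the non-missing entries."""
    present = [v for v in values if v is not None]
    # Assumption: if the whole column is missing there is nothing to average,
    # so the values are returned unchanged.
    if not present:
        return list(values)
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]


def days_since_joining(join_date):
    """Number of days from a YYYY-MM-DD join date to the fixed current date."""
    return (CURRENT_DATE - date.fromisoformat(join_date)).days


def categorize(amount):
    """Map a purchase amount to Low, Medium, or High using the stated thresholds."""
    if amount < 100:
        return "Low"
    if amount < 500:
        return "Medium"
    return "High"
```

Note that the category is computed after imputation, so a record whose PurchaseAmount was missing is classified by the column mean.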
Note: The input is read from standard input (stdin) and the output must be written to standard output (stdout).
## Input Format
The input is provided via standard input. The first line is a header row with the column names: "CustomerID,Name,Age,Email,PurchaseAmount,JoinDate". Each subsequent line contains one record with the corresponding data fields. Missing values for Age or PurchaseAmount will be represented by an empty field.
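As an illustration of the input format, the records could be read with Python's standard `csv` module, turning empty Age and PurchaseAmount fields into `None`. The helper names `parse_optional_float` and `read_customers` are hypothetical, and treating whitespace-only fields as missing is an assumption beyond what the statement says.

```python
import csv
import io


def parse_optional_float(field):
    """Return None for an empty (missing) field, otherwise its numeric value."""
    field = field.strip()  # assumption: whitespace-only fields count as missing
    return float(field) if field else None


def read_customers(stream):
    """Read all records from a CSV text stream into dicts keyed by header name."""
    rows = list(csv.DictReader(stream))
    for row in rows:
        row["Age"] = parse_optional_float(row["Age"])
        row["PurchaseAmount"] = parse_optional_float(row["PurchaseAmount"])
    return rows


# Small in-memory example; an actual solution would pass sys.stdin instead.
example = io.StringIO(
    "CustomerID,Name,Age,Email,PurchaseAmount,JoinDate\n"
    "2,Jane Smith,,jane@example.com,200,2019-05-15\n"
)
rows = read_customers(example)
```

Reading the whole input before processing is needed here because the column means depend on every record.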
## Output Format
Output the processed CSV to standard output. The output CSV must include the original columns in the same order, followed by two new columns: "DaysSinceJoining" and "CustomerCategory". The header row must be printed as the first line.
## Sample
Input:
CustomerID,Name,Age,Email,PurchaseAmount,JoinDate
1,John Doe,25,john@example.com,150,2020-01-01
2,Jane Smith,,jane@example.com,200,2019-05-15
3,Bob Johnson,35,bob@example.com,,2018-08-20
4,Alice Martin,45,alice@example.com,50,2017-04-30

Output:
CustomerID,Name,Age,Email,PurchaseAmount,JoinDate,DaysSinceJoining,CustomerCategory
1,John Doe,25,john@example.com,150,2020-01-01,365,Medium
2,Jane Smith,35,jane@example.com,200,2019-05-15,596,Medium
3,Bob Johnson,35,bob@example.com,133.33,2018-08-20,864,Medium
4,Alice Martin,45,alice@example.com,50,2017-04-30,1341,Low
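In the sample, the missing Age is filled with (25 + 35 + 45) / 3 = 35 and the missing PurchaseAmount with (150 + 200 + 50) / 3 = 133.33 when printed. The sketch below shows one way to produce that formatting. The helper `format_number` is hypothetical, and dropping trailing zeros (so that 35.0 prints as `35`) is an assumption inferred from the sample output rather than a stated rule.

```python
import csv
import io


def format_number(value):
    """Format with at most two decimals, dropping trailing zeros (35.0 -> '35')."""
    text = f"{value:.2f}"
    return text.rstrip("0").rstrip(".")


# Write the third sample row to an in-memory buffer; an actual solution
# would pass sys.stdout to csv.writer instead.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow([
    "3", "Bob Johnson", format_number(35.0), "bob@example.com",
    format_number(400 / 3), "2018-08-20", 864, "Medium",
])
line = out.getvalue()
```

Formatting happens only at output time, so the unrounded mean is still the value used for categorization.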