#C13267. E-commerce Transaction Analysis
Given an e-commerce dataset in CSV format from standard input, you are required to perform the following tasks:
- Load and clean the dataset by removing any rows with missing values.
- Convert the InvoiceDate column into date objects and extract the month in the format \(\text{YYYY-MM}\) (the month is always two digits, zero-padded if necessary).
- Group the transactions by month and country, counting the number of transactions for each group.
- For each country, determine the peak month (i.e. the month with the maximum number of transactions). In case of a tie, choose the earliest month.
- Output the result for each country in the following format: "Peak month for {Country} is {Year-Month}". The results should be printed in alphabetical order by country.
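The grouping, peak-selection, and output steps above can be sketched in Python as follows. This is a minimal sketch rather than a reference solution: the names `peak_months` and `format_report` are hypothetical, the rows are assumed to be already cleaned and reduced to `(month, country)` pairs with months as `YYYY-MM` strings, and each row is assumed to count as one transaction.

```python
from collections import Counter, defaultdict


def peak_months(pairs):
    """Return {country: peak month} for an iterable of (month, country) pairs.

    Months are assumed to be 'YYYY-MM' strings, so plain string order
    matches chronological order.
    """
    counts = Counter(pairs)  # (month, country) -> number of rows

    by_country = defaultdict(dict)
    for (month, country), n in counts.items():
        by_country[country][month] = n

    peaks = {}
    for country, months in by_country.items():
        # Highest count first; on a tie, the smaller (earlier) month wins.
        peaks[country] = min(months, key=lambda m: (-months[m], m))
    return peaks


def format_report(peaks):
    """Build the output lines, sorted alphabetically by country."""
    return [f"Peak month for {c} is {peaks[c]}" for c in sorted(peaks)]
```

Sorting on the key `(-count, month)` folds the tie-break into a single `min` call, so no separate pass over tied months is needed.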
Note: The CSV file has a header and the following columns: InvoiceNo, StockCode, Description, Quantity, InvoiceDate, UnitPrice, CustomerID, Country. The InvoiceDate field is given in the format MM/DD/YYYY HH:MM and may omit leading zeros for months/days (for example, 12/1/2010 8:26). You must handle the date conversion accordingly.
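One way to handle this conversion in Python is `datetime.strptime`, whose numeric directives accept values with or without leading zeros. The helper below is only a sketch: the name `to_year_month` is hypothetical, and returning `None` for unparseable input is an assumed convention, not something the problem prescribes.

```python
from datetime import datetime


def to_year_month(raw):
    """Convert 'M/D/YYYY H:MM' text to 'YYYY-MM', or return None if invalid."""
    try:
        # %m, %d and %H all accept one- or two-digit values when parsing.
        parsed = datetime.strptime(raw.strip(), "%m/%d/%Y %H:%M")
    except ValueError:
        return None
    # strftime zero-pads the month, giving the required two digits.
    return parsed.strftime("%Y-%m")
```

For instance, `to_year_month("12/1/2010 8:26")` gives `"2010-12"`, while `to_year_month("invalidDate")` gives `None`.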
inputFormat
The input is provided via standard input (stdin) as raw CSV text. The first line is the header. Each subsequent line is a record containing 8 comma-separated fields: InvoiceNo, StockCode, Description, Quantity, InvoiceDate, UnitPrice, CustomerID, Country. Rows with any missing field or improperly formatted InvoiceDate should be ignored.
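The filtering rule might be implemented along the following lines. This is a hedged sketch: `clean_rows` and `EXPECTED_FIELDS` are hypothetical names, a field is treated as "missing" when it is empty or whitespace-only (an assumption), and the date format string is the one described in the note above.

```python
import csv
from datetime import datetime

EXPECTED_FIELDS = 8  # InvoiceNo ... Country


def clean_rows(stream):
    """Yield rows that have 8 non-empty fields and a parseable InvoiceDate."""
    reader = csv.reader(stream)
    next(reader, None)  # skip the header line
    for row in reader:
        if len(row) != EXPECTED_FIELDS:
            continue  # wrong number of fields
        if any(not field.strip() for field in row):
            continue  # at least one missing value
        try:
            datetime.strptime(row[4].strip(), "%m/%d/%Y %H:%M")
        except ValueError:
            continue  # improperly formatted InvoiceDate
        yield row
```

Passing `sys.stdin` as `stream` reads the input as specified. Using `csv.reader` instead of splitting on commas keeps quoted descriptions that contain commas in a single field.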
outputFormat
For each country found in the cleaned dataset, output a line in the format:
Peak month for {Country} is {Year-Month}
The countries should be output in alphabetical order. The month must be displayed in YYYY-MM format.
## sample
InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,12/1/2010 8:26,2.55,17850,United Kingdom
536365,71053,WHITE METAL LANTERN,6,12/1/2010 8:26,3.39,17850,United Kingdom
536365,XXX,Invalid,,invalidDate,,,

Peak month for United Kingdom is 2010-12