K-Means Clustering

ID: 43137

Type: Default

1000ms

256MiB

You are given a dataset of n points in d-dimensional space. Your task is to implement the K-means clustering algorithm. The input provides the number of points, the number of dimensions, and the number of clusters k. Using a deterministic initialization (by selecting the first k points as initial centroids), perform K-means clustering until convergence or until 100 iterations have been reached.

The K-means algorithm works as follows:

\( \textbf{Initialization:} \text{Choose } k \text{ initial centroids from the dataset (the first } k \text{ points).} \)

\( \textbf{Assignment Step:} For each point \( \mathbf{x}_i \), assign it to the cluster with the closest centroid (using Euclidean distance): \( \min_{1 \leq j \leq k} \|\mathbf{x}_i - \mathbf{c}_j\|_2 \) \).

\( \textbf{Update Step:} For each cluster, update the centroid by calculating the mean of all points assigned to that cluster. If no points are assigned to a centroid, keep it unchanged. \)

Repeat the assignment and update steps until the change in centroids (measured by the Euclidean norm of the difference) is less than \(10^{-4}\) or 100 iterations have been reached. Finally, output the final centroids and the cluster labels for each input point.

Note: Output the centroids first (each centroid on a new line, with coordinates separated by a space and rounded to one decimal place), followed by a single line containing the cluster assignments (0-indexed) for the points in the order they were given, separated by a space.

inputFormat

The first line contains three integers: n (the number of points), d (the dimension of the space), and k (the number of clusters).

Then follow n lines, each containing d floating-point numbers representing a point in d-dimensional space.

outputFormat

Output exactly k lines, where the i-th line contains the coordinates of the i-th centroid (in order from cluster index 0 to k-1), with each coordinate rounded to one decimal place and separated by a space.

Then on a new line, output n integers (0-indexed cluster labels) separated by spaces, corresponding to the cluster assignment of each input point in the order given.

## sample

2.0 3.0
6.0 7.0
0 0 1 1

</p>

#C13583. K-Means Clustering

K-Means Clustering