pylipid.func.cluster_KMeans
- pylipid.func.cluster_KMeans(data, n_clusters)
Cluster data using KMeans.
This function clusters the samples using the KMeans implementation provided by scikit-learn. KMeans separates the samples into n_clusters groups of equal variance by minimizing the inertia (the within-cluster sum of squares), which is defined as:
\[\sum_{i=1}^{n} \min _{u_{j} \in C}\left(\left\|x_{i}-u_{j}\right\|^{2}\right)\]
where \(x_{i}\) is the i-th of the n samples, \(C\) is the set of cluster centroids, and \(u_{j}\) is the centroid of cluster j. KMeans scales well to large datasets but performs poorly when clusters vary in size or density.
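To make the inertia concrete, here is a minimal NumPy sketch that evaluates it for a fixed set of centroids. The helper name `inertia` and the toy data are illustrative assumptions, not part of pylipid; the sketch only computes the objective and does not perform the KMeans optimization itself.

```python
import numpy as np

def inertia(data, centroids):
    """Sum over samples of the squared distance to the nearest centroid.

    Hypothetical helper for illustration; not a pylipid function.
    """
    # Pairwise squared distances, shape (n_samples, n_clusters)
    sq_dists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    # Each sample contributes the distance to its closest centroid only
    return float(sq_dists.min(axis=1).sum())

# Four points forming two vertical pairs, with a centroid at the midpoint of each pair
data = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
centroids = np.array([[0.0, 1.0], [10.0, 1.0]])

total = inertia(data, centroids)  # every point lies 1 unit from its centroid, so total is 4.0
```

Moving either centroid away from the midpoint of its pair increases the total, which is why KMeans places each centroid at the mean of its assigned samples.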
- Parameters
data (numpy.ndarray, shape=(n_samples, n_dims)) – Sample data to cluster.
n_clusters (int) – The number of clusters to form as well as the number of centroids to generate.
- Returns
labels – Cluster labels for each data point.
- Return type
array_like, shape=(n_samples,)
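The following sketch shows how a function with this signature might be used. It calls scikit-learn's `KMeans` directly through a stand-in named `cluster_kmeans_sketch`, which is a hypothetical wrapper written for this example; the settings `n_init=10` and `random_state=0` are assumptions, and pylipid's actual implementation may differ.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_kmeans_sketch(data, n_clusters):
    """Hypothetical stand-in with the same signature as cluster_KMeans."""
    # n_init and random_state are assumed here to make the result repeatable
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    # fit_predict returns one integer label per sample, shape (n_samples,)
    return model.fit_predict(data)

# Two tight, well-separated blobs of 20 points each in 2 dimensions
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.1, size=(20, 2))
blob_b = rng.normal(loc=5.0, scale=0.1, size=(20, 2))
data = np.vstack([blob_a, blob_b])

labels = cluster_kmeans_sketch(data, n_clusters=2)
```

The label values themselves are arbitrary integers; only the grouping is meaningful, so the first blob may receive label 0 or 1 depending on initialization.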