pymatgen.analysis.diffusion.aimd.clustering module
This module implements clustering algorithms to determine centroids, adapted for periodic boundary conditions. It can be used, for example, to determine likely atomic positions from MD trajectories.
- class Kmeans(max_iterations: int = 1000)
Bases:
object
Simple k-means clustering.
- Parameters:
max_iterations (int) – Maximum number of iterations of the k-means algorithm to run.
- cluster(points, k, initial_centroids=None)
- Parameters:
points (ndarray) – Data points as an m×n ndarray, where m is the number of features and n is the number of data points.
k (int) – Number of means.
initial_centroids (np.array) – Initial guess for the centroids. If None, a randomized array of points is used.
- Returns:
centroids are the final centroids, labels provide the centroid index for each point, and ss is the final sum of squared distances.
- Return type:
centroids, labels, ss
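The iteration that cluster describes can be sketched in plain Python as follows. This is a minimal illustration, not the pymatgen implementation: `kmeans_sketch` and `sq_dist` are hypothetical names, points are assumed to be a list of coordinate lists (one per data point), and an empty cluster simply keeps its previous centroid to keep the sketch short.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_sketch(points, k, initial_centroids=None, max_iterations=1000, seed=0):
    """Illustrative Lloyd iteration (hypothetical helper, not the pymatgen code)."""
    rng = random.Random(seed)
    # Without an initial guess, start from k randomly chosen data points.
    start = rng.sample(points, k) if initial_centroids is None else initial_centroids
    centroids = [list(c) for c in start]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: label each point with the index of its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        if new_labels == labels:
            break  # labels stopped changing, so the centroids are final
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # simplification: an empty cluster keeps its old centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    # Sum of squared distances from each point to its assigned centroid.
    ss = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, ss
```

The three return values mirror the documented return type: final centroids, one label per point, and the summed squared distances.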
- static get_centroids(points, labels, k, centroids)
Each centroid is the mean position of the points that have that centroid’s label. Important: if a centroid is empty (no points have that centroid’s label), it should be randomly re-initialized.
- Parameters:
points – List of points
labels – List of labels
k – Number of means
centroids – List of centroids
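A centroid update with the empty-cluster rule might look like the sketch below. `update_centroids` is a hypothetical name, and re-initializing an empty cluster to a randomly chosen data point is an assumption; the documentation above only says the centroid is re-initialized randomly, not how.

```python
import random


def update_centroids(points, labels, k, rng=random):
    """Mean of each cluster's points; empty clusters are re-initialized at random."""
    new_centroids = []
    for j in range(k):
        # Collect the points currently labelled with cluster j.
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            # Component-wise arithmetic mean of the member points.
            new_centroids.append([sum(col) / len(members) for col in zip(*members)])
        else:
            # Assumed re-initialization strategy: reuse a random data point.
            new_centroids.append(list(rng.choice(points)))
    return new_centroids
```

Without the re-initialization branch, a cluster that loses all its points would have no defined mean, and the next assignment step could not use it.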
- class KmeansPBC(lattice, max_iterations=1000)
Bases:
Kmeans
A version of k-means that works with periodic boundary conditions (PBC). Both the distance metric and the determination of new centroids have to change. The points supplied should be fractional coordinates.
- Parameters:
lattice – Lattice
max_iterations – Maximum number of iterations to run KMeans.
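To illustrate why the distance metric changes under PBC, the sketch below computes a minimum-image distance between two fractional coordinates. It is an assumption-laden illustration, not the library's code: `pbc_distance` is a hypothetical helper, the lattice is passed as a plain 3×3 list with lattice vectors as rows, and wrapping each fractional component independently is only guaranteed to give the shortest distance for orthogonal or mildly skewed cells (strongly skewed cells need a search over neighboring images).

```python
import math


def pbc_distance(frac_a, frac_b, lattice_matrix):
    """Minimum-image distance between two fractional coordinates.

    lattice_matrix holds the three lattice vectors as rows (Cartesian units).
    """
    # Wrap each fractional difference into the interval [-0.5, 0.5].
    d_frac = [(a - b) - round(a - b) for a, b in zip(frac_a, frac_b)]
    # Convert the wrapped difference to Cartesian: d_cart = d_frac @ lattice_matrix.
    d_cart = [sum(d_frac[i] * lattice_matrix[i][j] for i in range(3)) for j in range(3)]
    return math.sqrt(sum(x * x for x in d_cart))
```

In a 10 Å cubic cell, for example, fractional x coordinates 0.05 and 0.95 are 1 Å apart across the cell boundary, not 9 Å apart as a naive difference would suggest.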
- get_centroids(points, labels, k, centroids)
Each centroid is the mean position of the points that have that centroid’s label. Important: if a centroid is empty (no points have that centroid’s label), it should be randomly re-initialized.
- Parameters:
points – List of points
labels – List of labels
k – Number of means
centroids – List of centroids
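Averaging fractional coordinates directly fails for a cluster that straddles a cell boundary. One common remedy, sketched here under assumptions, is to shift every point to its periodic image nearest a reference (such as the previous centroid), average, and wrap the result back into the unit cell. `pbc_centroid` is a hypothetical helper and is not claimed to match how KmeansPBC does this internally; nearest images are chosen per fractional axis, which assumes a cell that is not strongly skewed.

```python
def pbc_centroid(frac_points, reference):
    """Mean of fractional points, each taken at its image nearest to `reference`."""
    dim = len(reference)
    total = [0.0] * dim
    for p in frac_points:
        for i in range(dim):
            # Shift by whole cells so the coordinate lies within 0.5 of the reference.
            total[i] += p[i] - round(p[i] - reference[i])
    # Average the unwrapped coordinates, then wrap back into the unit cell.
    return [(t / len(frac_points)) % 1.0 for t in total]
```

For two points at fractional x = 0.9 and x = 0.2 with a reference near x = 0, this yields a centroid near x = 0.05, whereas the naive mean of 0.55 would sit far from both points' nearest images.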