megnet.data.graph module

Abstract classes and utility operations for building graph representations and data loaders (known as Sequence objects in Keras). Most users will not need to interact with this module.

class BaseGraphBatchGenerator(dataset_size: int, targets: numpy.ndarray, sample_weights: Optional[numpy.ndarray] = None, batch_size: int = 128, is_shuffle: bool = True)

Bases: keras.utils.data_utils.Sequence

Base class for classes that generate batches of training data for MEGNet. It is based on the Sequence class, the Keras equivalent of a data loader. Subclasses must implement _generate_inputs(), which generates the lists of graph descriptions for a batch. The process_atom_feature(), process_bond_feature(), and process_state_feature() methods modify the atom, bond, and global state features when a batch is created.

Parameters

dataset_size (int) – Number of entries in dataset
targets (ndarray) – Feature to be predicted for each network
sample_weights (ndarray) – Sample weights
batch_size (int) – Maximum batch size
is_shuffle (bool) – Whether to shuffle the data after each epoch

on_epoch_end()

Code to be executed at the end of each epoch.

process_atom_feature(x: numpy.ndarray) → numpy.ndarray

Parameters: x (np.ndarray) – atom features
Returns: processed atom features

process_bond_feature(x: numpy.ndarray) → numpy.ndarray

Parameters: x (np.ndarray) – bond features
Returns: processed bond features

process_state_feature(x: numpy.ndarray) → numpy.ndarray

Parameters: x (np.ndarray) – state features
Returns: processed state features
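The index bookkeeping that a Sequence-style batch generator needs can be sketched without Keras. The class below is a hypothetical stand-in, not the MEGNet implementation: the name IndexBatcher, its seed argument, and the choice to reshuffle inside on_epoch_end() are assumptions made for illustration.

```python
import math

import numpy as np


class IndexBatcher:
    """Hypothetical stand-in that mimics only the index bookkeeping of a batch generator."""

    def __init__(self, dataset_size, batch_size=128, is_shuffle=True, seed=0):
        self.dataset_size = dataset_size
        self.batch_size = batch_size
        self.is_shuffle = is_shuffle
        self.rng = np.random.default_rng(seed)  # seed is an illustrative extra argument
        self.order = np.arange(dataset_size)
        if is_shuffle:
            self.order = self.rng.permutation(self.order)

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller than batch_size.
        return math.ceil(self.dataset_size / self.batch_size)

    def batch_indices(self, i):
        # Dataset entries that make up batch i.
        return self.order[i * self.batch_size:(i + 1) * self.batch_size]

    def on_epoch_end(self):
        # Assumed behavior: reshuffle so the next epoch visits entries in a new order.
        if self.is_shuffle:
            self.order = self.rng.permutation(self.order)


batcher = IndexBatcher(dataset_size=10, batch_size=4)
batches = [batcher.batch_indices(i) for i in range(len(batcher))]
```

With 10 entries and a batch size of 4, this yields three batches, the last of which holds only two entries, which is why batch_size is described as a maximum.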

class Converter

Bases: monty.json.MSONable

Base class for atom or bond converters.

convert(d: Any) → Any

Convert the object d.

Parameters: d (Any) – any object
Returns: the converted object

class DummyConverter

Bases: megnet.data.graph.Converter

Dummy converter used as a placeholder.

convert(d: Any) → Any

Dummy convert; returns the input unchanged.

Parameters: d (Any) – input object
Returns: d

class EmbeddingMap(feature_matrix: numpy.ndarray)

Bases: megnet.data.graph.Converter

Convert an integer to a row vector in a feature matrix.

Parameters: feature_matrix (np.ndarray) – a matrix of shape (N, M)

convert(int_array: numpy.ndarray) → numpy.ndarray

Convert atomic numbers to row vectors of the feature_matrix.

Parameters: int_array (1d array) – integer array of length L
Returns: (matrix) L*M matrix, with L the length of int_array and M the number of columns of feature_matrix
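The row lookup described above can be illustrated with plain NumPy integer-array indexing. This is a minimal sketch of the idea, not necessarily how EmbeddingMap is implemented; the toy matrix and the variable names are made up for the example.

```python
import numpy as np

# Toy feature matrix of shape (N, M) = (4, 3): row i is the feature vector for integer i.
feature_matrix = np.arange(12, dtype=float).reshape(4, 3)

# Integer array of length L = 3 (for example, atomic numbers used as row indices).
int_array = np.array([2, 0, 2])

# Integer-array indexing picks one row per entry, giving an (L, M) matrix.
embedded = feature_matrix[int_array]
```

Repeated integers map to identical rows, so the first and last rows of embedded are equal here.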

class GaussianDistance(centers: numpy.ndarray = numpy.linspace(0, 5, 100), width=0.5)

Bases: megnet.data.graph.Converter

Expand distances in a Gaussian basis located at centers, with a default width of 0.5.

Parameters

centers – (np.array) centers for the Gaussian basis
width – (float) width of Gaussian basis

convert(d: numpy.ndarray) → numpy.ndarray

Expand the distance vector d with the given parameters.

Parameters: d (1d array) – distance array
Returns: (matrix) N*M matrix, with N the length of d and M the length of centers
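A Gaussian basis expansion of this shape can be sketched with NumPy broadcasting. The functional form exp(-(d - c)^2 / width^2) is an assumption about the exact normalization, and gaussian_expand is a hypothetical helper name rather than part of the module.

```python
import numpy as np


def gaussian_expand(d, centers, width=0.5):
    """Expand a 1d distance array of length N into an (N, M) matrix (assumed functional form)."""
    d = np.asarray(d, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Broadcasting (N, 1) against (1, M) yields every distance-center difference.
    diff = d[:, None] - centers[None, :]
    return np.exp(-(diff ** 2) / width ** 2)


# 100 evenly spaced centers between 0 and 5, matching the default shown in the signature.
centers = np.linspace(0, 5, 100)
expanded = gaussian_expand([1.0, 2.5], centers)
```

Each row peaks at the center closest to the corresponding distance and decays smoothly on both sides, which gives the network a continuous encoding of bond length.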

class GraphBatchDistanceConvert(atom_features: List[numpy.ndarray], bond_features: List[numpy.ndarray], state_features: List[numpy.ndarray], index1_list: List[int], index2_list: List[int], targets: Optional[numpy.ndarray] = None, sample_weights: Optional[numpy.ndarray] = None, batch_size: int = 128, is_shuffle: bool = True, distance_converter: Optional[megnet.data.graph.Converter] = None)

Bases: megnet.data.graph.GraphBatchGenerator

Generate batches of structures with bond distances expanded by a distance converter.

Parameters

atom_features – (list of np.array) list of atom feature matrices
bond_features – (list of np.array) list of bond feature matrices
state_features – (list of np.array) list of [1, G] state features, where G is the global state feature dimension
index1_list – (list of integer) list of (M, ) atomic indices for one side of each bond; M differs between structures
index2_list – (list of integer) list of (M, ) atomic indices for the other side of each bond; M differs between structures but must match that of the corresponding index1
targets – (numpy array), N*1, where N is the number of structures
sample_weights – (numpy array), N*1, where N is the number of structures
batch_size – (int) number of samples in a batch
is_shuffle – (bool) whether to shuffle the structures, default to True
distance_converter – (Converter) converter for processing the distances

process_bond_feature(x) → numpy.ndarray

Convert bond distances into Gaussian expanded vectors.

Parameters: x (np.ndarray) – input distance array
Returns: expanded matrix

class GraphBatchGenerator(atom_features: List[numpy.ndarray], bond_features: List[numpy.ndarray], state_features: List[numpy.ndarray], index1_list: List[int], index2_list: List[int], targets: Optional[numpy.ndarray] = None, sample_weights: Optional[numpy.ndarray] = None, batch_size: int = 128, is_shuffle: bool = True)

Bases: megnet.data.graph.BaseGraphBatchGenerator

A generator class that assembles several structures (the number is set by batch_size) and forms (x, y) pairs for model training.

Parameters

atom_features – (list of np.array) list of atom feature matrices
bond_features – (list of np.array) list of bond feature matrices
state_features – (list of np.array) list of [1, G] state features, where G is the global state feature dimension
index1_list – (list of integer) list of (M, ) atomic indices for one side of each bond; M differs between structures
index2_list – (list of integer) list of (M, ) atomic indices for the other side of each bond; M differs between structures but must match that of the corresponding index1
targets – (numpy array), N*1, where N is the number of structures
sample_weights – (numpy array), N*1, where N is the number of structures
batch_size – (int) number of samples in a batch
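One common way to assemble several graphs into a single batch is to concatenate their features and shift the bond indices by the number of atoms that precede each graph. The sketch below shows that idea on two toy graphs. It is an illustration under that assumption, not the generator's actual code, and atom_to_graph is a hypothetical name.

```python
import numpy as np

# Two toy graphs: graph 0 has 2 atoms and 2 bonds, graph 1 has 3 atoms and 2 bonds.
atom_features = [np.array([1, 8]), np.array([6, 1, 1])]
bond_features = [np.array([0.96, 0.96]), np.array([1.09, 1.09])]
index1_list = [[0, 1], [0, 0]]
index2_list = [[1, 0], [1, 2]]

# Offset of each graph's first atom inside the merged batch: [0, 2].
n_atoms = [len(a) for a in atom_features]
offsets = np.cumsum([0] + n_atoms[:-1])

batch_atoms = np.concatenate(atom_features)
batch_bonds = np.concatenate(bond_features)
# Shift the per-graph atom indices so they point into the concatenated atom array.
batch_index1 = np.concatenate([np.array(i) + o for i, o in zip(index1_list, offsets)])
batch_index2 = np.concatenate([np.array(i) + o for i, o in zip(index2_list, offsets)])
# Record which graph each atom belongs to, so per-graph pooling stays possible.
atom_to_graph = np.repeat(np.arange(len(atom_features)), n_atoms)
```

Because index1 and index2 are shifted together, every bond still connects two atoms of the same structure after merging, which is why both lists must have the same length M for a given structure.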

class StructureGraph(nn_strategy: Optional[Union[str, pymatgen.analysis.local_env.NearNeighbors]] = None, atom_converter: Optional[megnet.data.graph.Converter] = None, bond_converter: Optional[megnet.data.graph.Converter] = None, **kwargs)

Bases: monty.json.MSONable

This is a base class for converting structures into graphs or model inputs. The methods to be implemented are as follows:

convert(self, structure)
Converts a structure into a graph dictionary.

get_input(self, structure)
Converts a structure directly into a model input.

get_flat_data(self, graphs, targets)
Processes pairs of graphs and targets and outputs a list of model inputs.

Parameters

nn_strategy (str or NearNeighbors) – NearNeighbor strategy
atom_converter (Converter) – atom converter
bond_converter (Converter) – bond converter
**kwargs –

as_dict() → Dict

Serialize to dict.

Returns: (dict) dictionary of information

convert(structure: pymatgen.core.structure.Structure, state_attributes: Optional[List] = None) → Dict

Take a pymatgen structure and convert it to an index-type graph representation. The graph will have node, distance, index1, and index2, where node is a vector of the atomic numbers (Z) of the atoms in the structure, and index1 and index2 mark the indices of the atoms that form each bond, separated by distance. For state attributes, you can set structure.state = [[xx, xx]] beforehand; otherwise the algorithm uses the default [[0, 0]].

Parameters

state_attributes (list) – state attributes
structure (pymatgen structure) – the structure to convert

Returns: (dictionary) graph representation of the structure

classmethod from_dict(d: Dict) → megnet.data.graph.StructureGraph

Initialization from dictionary.

Parameters: d (dict) – dictionary
Returns: StructureGraph object

static get_atom_features(structure) → List[Any]

Get atom features from the structure; may be overridden.

Parameters: structure (pymatgen.Structure) – pymatgen structure
Returns: List of atomic numbers

static get_flat_data(graphs: List[Dict], targets: Optional[List] = None) → tuple

Expand the graph dictionaries to form lists of feature and target tensors. This is useful when the model is trained on graphs assembled on the fly.

Parameters

graphs (list of dictionary) – list of graph dictionaries, one for each structure
targets (list of float or list) – optional, corresponding target values for each structure

Returns: tuple(node_features, edges_features, global_values, index1, index2, targets)
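The flattening step can be pictured as turning a list of per-structure dictionaries into one list per field. The sketch below is a hypothetical helper, not the library method: the dictionary keys ("atom", "bond", "state", "index1", "index2") and the name flatten_graphs are assumptions chosen for illustration.

```python
def flatten_graphs(graphs, targets=None):
    """Hypothetical helper: split a list of graph dicts into per-field lists."""
    keys = ("atom", "bond", "state", "index1", "index2")  # assumed key names
    columns = tuple([g[k] for g in graphs] for k in keys)
    if targets is None:
        return columns
    # Append the targets as a final column when they are supplied.
    return columns + (list(targets),)


graphs = [
    {"atom": [1, 8], "bond": [0.96, 0.96], "state": [[0, 0]], "index1": [0, 1], "index2": [1, 0]},
    {"atom": [6], "bond": [], "state": [[0, 0]], "index1": [], "index2": []},
]
flat = flatten_graphs(graphs, targets=[0.5, 1.5])
```

Each element of the returned tuple is a list with one entry per structure, which is the layout the batch generators above take as their atom_features, bond_features, state_features, index1_list, and index2_list arguments.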

get_input(structure: pymatgen.core.structure.Structure) → List[numpy.ndarray]

Turns a structure into model input.

graph_to_input(graph: Dict) → List[numpy.ndarray]

Turns a graph into model input.

Parameters: graph (dict) – Dictionary description of the graph
Returns: Inputs in the form needed by MEGNet
Return type: ([np.ndarray])

class StructureGraphFixedRadius(nn_strategy: Optional[Union[str, pymatgen.analysis.local_env.NearNeighbors]] = None, atom_converter: Optional[megnet.data.graph.Converter] = None, bond_converter: Optional[megnet.data.graph.Converter] = None, **kwargs)

Bases: megnet.data.graph.StructureGraph

This class uses a shortcut that calls the find_points_in_spheres Cython function in pymatgen. It is orders of magnitude faster than previous implementations.

Parameters

nn_strategy (str or NearNeighbors) – NearNeighbor strategy
atom_converter (Converter) – atom converter
bond_converter (Converter) – bond converter
**kwargs –

convert(structure: pymatgen.core.structure.Structure, state_attributes: Optional[List] = None) → Dict

Take a pymatgen structure and convert it to an index-type graph representation. The graph will have node, distance, index1, and index2, where node is a vector of the atomic numbers (Z) of the atoms in the structure, and index1 and index2 mark the indices of the atoms that form each bond, separated by distance. For state attributes, you can set structure.state = [[xx, xx]] beforehand; otherwise the algorithm uses the default [[0, 0]].

Parameters

state_attributes (list) – state attributes
structure (pymatgen structure) – the structure to convert

Returns: (dictionary) graph representation of the structure

classmethod from_structure_graph(structure_graph: megnet.data.graph.StructureGraph) → megnet.data.graph.StructureGraphFixedRadius

Initialize from a megnet.data.graph.StructureGraph object.

Parameters: structure_graph (StructureGraph) – StructureGraph object
Returns: StructureGraphFixedRadius object

itemgetter_list(data_list: List, indices: List) → tuple

Get the items of data_list at the given indices and return them as a tuple.

Parameters

data_list (list) – data list
indices (list) – indices

Returns: (tuple)
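The same kind of selection can be written with operator.itemgetter from the standard library. The function below, pick, is a hypothetical equivalent written for illustration; it is not the module's code, and its handling of empty and single-element index lists is an assumption.

```python
from operator import itemgetter


def pick(data_list, indices):
    """Hypothetical equivalent: return data_list[i] for each i in indices, as a tuple."""
    if len(indices) == 0:
        return ()
    picked = itemgetter(*indices)(data_list)
    # itemgetter returns a bare item, not a tuple, when given a single index.
    return picked if len(indices) > 1 else (picked,)


selected = pick(["a", "b", "c", "d"], [3, 0])
```

The single-index branch matters because callers that unpack or iterate over the result expect a tuple regardless of how many indices were requested.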
