megnet.utils.preprocessing module

Preprocessing utilities for scaling target properties

class DummyScaler[source]

Bases: monty.json.MSONable

Dummy scaler that does nothing: it returns the target unchanged

classmethod from_training_data(structures: List[Union[pymatgen.core.structure.Structure, pymatgen.core.structure.Molecule]], targets: Union[List[float], numpy.ndarray], is_intensive: bool = True)[source]
Parameters
  • structures (list) – list of structures/molecules

  • targets (list) – vector of target properties

  • is_intensive (bool) – whether the target is intensive

Returns: DummyScaler

static inverse_transform(transformed_target: numpy.ndarray, n: int = 1) numpy.ndarray[source]

Return the input as it is

Parameters
  • transformed_target (np.ndarray) – transformed target

  • n (int) – number of atoms

Returns

transformed_target

static transform(target: numpy.ndarray, n: int = 1) numpy.ndarray[source]
Parameters
  • target (np.ndarray) – target numerical value

  • n (int) – number of atoms

Returns

target

class Scaler[source]

Bases: monty.json.MSONable

Base Scaler class. It implements transform and inverse_transform. Both methods take the number of atoms as the second parameter, in addition to the target property

inverse_transform(transformed_target: numpy.ndarray, n: int = 1) numpy.ndarray[source]

Inverse transform of the target

Parameters
  • transformed_target (np.ndarray) – transformed target

  • n (int) – number of atoms

Returns

target

transform(target: numpy.ndarray, n: int = 1) numpy.ndarray[source]

Transform the target values into new target values

Parameters
  • target (np.ndarray) – target numerical value

  • n (int) – number of atoms

Returns

scaled target
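
The two-method interface described above can be illustrated with a minimal sketch. It deliberately uses a stand-in abstract base class rather than importing megnet, so `ScalerInterface` and `PerAtomScaler` are hypothetical names introduced only for this example; the sketch simply shows how the atom count `n` might be threaded through both methods.

```python
from abc import ABC, abstractmethod

import numpy as np


class ScalerInterface(ABC):
    """Stand-in for the documented base class (hypothetical, not megnet's Scaler)."""

    @abstractmethod
    def transform(self, target, n=1):
        """Map a raw target to a scaled target; n is the number of atoms."""

    @abstractmethod
    def inverse_transform(self, transformed_target, n=1):
        """Map a scaled target back to the raw target; n is the number of atoms."""


class PerAtomScaler(ScalerInterface):
    """Hypothetical scaler that divides an extensive target by the atom count."""

    def transform(self, target, n=1):
        return np.asarray(target, dtype=float) / n

    def inverse_transform(self, transformed_target, n=1):
        return np.asarray(transformed_target, dtype=float) * n


scaler = PerAtomScaler()
per_atom = scaler.transform(np.array([-12.0]), n=4)   # total value for a 4-atom cell
total = scaler.inverse_transform(per_atom, n=4)       # recovers the total value
```

Passing the same `n` to both calls is what makes the round trip recover the original value.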

class StandardScaler(mean: float = 0.0, std: float = 1.0, is_intensive: bool = True)[source]

Bases: megnet.utils.preprocessing.Scaler

Standard scaler that accounts for whether the quantity is extensive or intensive. For an intensive quantity, the mean is the mean of the training data, and std is the standard deviation of the training data. For an extensive quantity, the mean is the mean of target/atom, and std is the standard deviation of target/atom

transform(self, target, n=1)[source]

Standard scaling of the target

Parameters
  • mean (float) – mean value of target

  • std (float) – standard deviation of target

  • is_intensive (bool) – whether the target is already an intensive property

classmethod from_training_data(structures: List[Union[pymatgen.core.structure.Structure, pymatgen.core.structure.Molecule]], targets: Union[List[float], numpy.ndarray], is_intensive: bool = True) megnet.utils.preprocessing.StandardScaler[source]

Generate a target scaler from a list of input structures/molecules, a vector of target values, and a flag indicating whether the property is intensive

Parameters
  • structures (list) – list of structures/molecules

  • targets (list) – vector of target properties

  • is_intensive (bool) – whether the target is intensive

Returns: new StandardScaler instance
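
As a rough illustration of how such statistics could be derived from training data, the sketch below computes a mean and standard deviation, dividing each target by its atom count first when the target is extensive. The helper `fit_scaler_stats`, the sample values, and the use of a plain list of atom counts in place of structure objects are assumptions made for this example; it is not megnet's implementation.

```python
import numpy as np


def fit_scaler_stats(targets, n_atoms, is_intensive=True):
    """Return (mean, std) of the targets, taken per atom if the target is extensive.

    Hypothetical helper: n_atoms stands in for the atom count of each structure.
    """
    values = np.asarray(targets, dtype=float)
    if not is_intensive:
        # Extensive target: normalise by atom count before taking statistics.
        values = values / np.asarray(n_atoms, dtype=float)
    # np.std defaults to the population standard deviation (ddof=0).
    return float(np.mean(values)), float(np.std(values))


# Made-up total energies for cells containing 2, 4 and 8 atoms.
energies = [-2.0, -8.0, -24.0]
atom_counts = [2, 4, 8]

mean_ext, std_ext = fit_scaler_stats(energies, atom_counts, is_intensive=False)
mean_int, std_int = fit_scaler_stats(energies, atom_counts, is_intensive=True)
```

With these values the extensive branch takes statistics over the per-atom energies (-1, -2, -3), while the intensive branch uses the raw numbers directly.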

inverse_transform(transformed_target: numpy.ndarray, n: int = 1) numpy.ndarray[source]

Inverse transform of the target

Parameters
  • transformed_target (np.ndarray) – transformed target

  • n (int) – number of atoms

Returns

original target

transform(target: numpy.ndarray, n: int = 1) numpy.ndarray[source]

Transform numeric values according to the mean and std, plus a factor n (the number of atoms)

Parameters
  • target (np.ndarray) – target numerical value

  • n (int) – number of atoms

Returns

scaled target
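
The docstrings above do not spell out the exact formula, so the following is only a sketch under stated assumptions: an extensive target is divided by `n` before standardisation, an intensive target ignores `n`, and the inverse undoes those steps in reverse order. The free functions and their keyword arguments are hypothetical stand-ins for the documented methods and constructor parameters.

```python
import numpy as np


def transform(target, n=1, mean=0.0, std=1.0, is_intensive=True):
    """Assumed forward scaling: (target / n - mean) / std, with n ignored if intensive."""
    if is_intensive:
        n = 1
    return (np.asarray(target, dtype=float) / n - mean) / std


def inverse_transform(transformed_target, n=1, mean=0.0, std=1.0, is_intensive=True):
    """Assumed inverse: n * (transformed_target * std + mean), with n ignored if intensive."""
    if is_intensive:
        n = 1
    return n * (np.asarray(transformed_target, dtype=float) * std + mean)


# Made-up extensive target: total energy of an 8-atom cell.
total_energy = np.array([-24.0])
scaled = transform(total_energy, n=8, mean=-2.0, std=0.5, is_intensive=False)
restored = inverse_transform(scaled, n=8, mean=-2.0, std=0.5, is_intensive=False)
```

Under these assumptions the per-atom value is -3, which standardises to -2, and the inverse recovers the original total of -24 when the same `n` is supplied.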