Syllabus

Course

NANO 181/281 Data Science in Materials Science (4)

Description

This course provides a comprehensive introduction to the application of data science to materials science.

Prerequisites

Consent of Instructor

Textbooks and Required Materials

  • The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition [Free]
  • Python Data Science Handbook [Free]

Class/Laboratory Schedule

Two 80-minute lectures per week and three computational laboratory sessions

Course Topics

  • Introduction to Data Science in Materials Science
  • Python for Data Science
  • Linear Methods for Regression
    • Ordinary least squares
    • Subset selection
    • Regularization and Shrinkage
    • Derived input directions
    • Extending linear methods
    • Transformations of inputs
    • Piecewise polynomials
    • Basis expansion
  • Linear Methods for Classification
    • Discriminant Analysis
    • Logistic regression
  • Unsupervised learning
    • Principal Component Analysis
    • K-means
    • Hierarchical and density-based clustering
  • Kernel regression
    • k-nearest neighbors
    • Kernel density estimation and classification
  • Trees
    • Decision trees
    • Ensembles of trees
  • Neural networks

Course Objectives

  1. To provide students with a foundation in data science techniques, with practical examples rooted in materials science domain applications.
  2. To provide hands-on experience in the use of the Python programming language for data science.
  3. To inculcate best practices in developing and interpreting machine learning models for materials property predictions.

Methods of Evaluation

  • Jupyter notebook reports of three hands-on laboratory sessions

Performance Criteria

Objective 1: (basic engineering knowledge and applications)

1.1 Understand the fundamentals of data science methods.

1.2 Understand the successes and limitations of each method, and the tradeoff between accuracy and cost.

1.3 Understand how to derive material and nanoscale properties from first-principles calculations.

Objective 2: (methods and problem solving)

2.1 Choose the most appropriate data science method for a particular application.

2.2 Manage and analyze large materials datasets effectively.

2.3 Apply best practices in the construction and evaluation of machine learning models.


Copyright © 2019-2023 Shyue Ping Ong, Materials Virtual Lab