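材料信息學導論(上):機器學習基礎(英文版) [Introduction to Materials Informatics (Volume 1): Machine Learning Fundamentals (English Edition)]: Publication Information
- ISBN: 9787030728982
- Barcode: 9787030728982; 978-7-03-072898-2
- Binding: standard offset paper
- Number of volumes: not available
- Weight: not available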
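Introduction to Materials Informatics (Volume 1): Machine Learning Fundamentals (English Edition): Summary
Materials informatics is an emerging interdisciplinary field that offers a new approach to accelerating materials research and technology development under the materials genome concept. As a researcher in materials and mechanics, the author has done extensive work to advance materials informatics and has gained considerable experience in combining artificial intelligence (AI), machine learning (ML), and materials science and technology. The author aims to provide an accessible introduction to materials informatics and thereby further promote its development. To help readers grasp the core content quickly while keeping the work complete, the book is divided into two volumes: Volume 1 focuses on the fundamentals of machine learning, and Volume 2 focuses on deep learning and reviews the current state and prospects of materials informatics.
Volume 1 has twelve chapters covering linear regression and linear classification, support vector machines, decision trees and K-nearest neighbors (KNN), ensemble learning, the Bayesian theorem and the expectation-maximization (EM) algorithm, symbolic regression, neural networks, hidden Markov chains, data preprocessing and feature selection, interpretable machine learning, and more. The exposition starts from simple, clear mathematical definitions and physical pictures, ties closely to case studies from materials research, and gives detailed steps for each algorithm so that readers can learn and apply them.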
Introduction to Materials Informatics (Volume 1): Machine Learning Fundamentals (English Edition): Table of Contents
Foreword
Preface
Symbols and Notations
Chapter 1 Introduction
References
Chapter 2 Linear Regression
2.1 Least Squares Linear Regression
2.2 Principal Component Analysis and Principal Component Regression
2.3 Least Absolute Shrinkage and Selection Operator (L1)
2.4 Ridge Regression (L2)
2.5 Elastic Net Regression
2.6 Multiply Task LASSO (MultiTaskLASSO)
Homework
References
Chapter 3 Linear Classification
3.1 Perceptron
3.2 Logistic Regression
3.3 Linear Discriminant Analysis
Homework
References
Chapter 4 Support Vector Machine
4.1 SVC
4.2 Kernel Functions
4.3 Soft Margin
4.4 SVR
Homework
References
Chapter 5 Decision Tree and K-Nearest-Neighbors (KNN)
5.1 Classification Trees
5.2 Regression Tree
5.3 K-Nearest-Neighbors (KNN) Methods
Homework
References
Chapter 6 Ensemble Learning
6.1 Boosting
6.1.1 AdaBoost
6.1.2 Gradient Boosting Machine (GBM)
6.1.3 eXtreme Gradient Boosting (XGBoost)
6.2 Bagging
Homework
References
Chapter 7 Bayesian Theorem and Expectation-Maximization (EM) Algorithm
7.1 Bayesian Theorem
7.2 Naive Bayes Classifier
7.3 Maximum Likelihood Estimation
7.3.1 Gaussian distribution
7.3.2 Weibull distribution
7.4 Bayesian Linear Regression
7.5 Expectation-Maximization (EM) Algorithm
7.5.1 Gaussian mixture model (GMM)
7.5.2 The mixture of Lorentz and Gaussian distributions
7.6 Gaussian Process (GP) Regression
Homework
References
Chapter 8 Symbolic Regression
8.1 Overview of Evolutionary Computation
8.2 Genetic Programming
8.3 Grammar-Guided Genetic Programming and Grammatical Evolution
8.4 The Application of LASSO in Symbolic Regression
Homework
References
Chapter 9 Neural Networks
9.1 Neural Networks and Perceptron
9.2 Back Propagation Algorithm
9.3 Regularization in NNs
9.3.1 L1 regularization
9.3.2 L2 regularization
9.4 Classification NNs
9.4.1 Binary classification
9.4.2 Multiclassification of multiply grades in a category
9.5 Autoencoders
9.5.1 Introduction
9.5.2 Denoising autoencoder
9.5.3 Sparse autoencoder
9.5.4 Variational autoencoder
Homework
References
Chapter 10 Hidden Markov Chains
10.1 Markov Chain
10.2 Stationary Markov Chain
10.3 Markov Chain Monte Carlo Methods
10.3.1 Metropolis Hastings (M-H) algorithm
10.3.2 Gibbs sampling algorithm
10.4 Calculation Methods for the Probability of Observation Sequence
10.4.1 Direct method
10.4.2 Forward method
10.4.3 Backward method
10.5 Estimation of Optimal State Sequence
10.5.1 Direct method
10.5.2 Viterbi algorithm
10.6 Estimation of Intrinsic Parameters—The Baum-Welch Algorithm
Homework
References
Chapter 11 Data Preprocessing and Feature Selection
11.1 Reliable Data, Normals and Anomalies
11.1.1 Local outlier factor
11.1.2 Isolated forest
11.1.3 One-class support vector machine
11.1.4 Support vector data description
11.2 Feature Selection
11.2.1 Filter approach
11.2.2 Wrapper approach
11.2.3 Embedded approach
Homework
References
Chapter 12 Interpretative SHAP Value and Partial Dependence Plot
12.1 SHapley Additive exPlanation value
12.2 The joint SHAP value of two features
12.3 Partial Dependence Plot
Homework
References
Appendix 1 Vector and Matrix
A1.1 Definition
A1.1.1 Vector
A1.1.2 Matrix
A1.2 Matrix Algebra
A1.2.1 Inverse and transpose
A1.2.2 Trace
A1.2.3 Determinant
A1.2.4 Eigenvalues and eigenvectors
A1.2.5 Singular value decomposition (SVD)
A1.2.6 Pseudo inverse
A1.2.7 Some useful identities
A1.3 Matrix Analysis
A1.3.1 Derivative of matrix
A1.3.2 Derivative of the determinant of a matrix
A1.3.3 Derivative of an inverse matrix
A1.3.4 Jacobian matrix and Hessian matrix
A1.3.5 The chain rule
References
Appendix 2 Basic Statistics
A2.1 Probability
A2.1.1 Joint probability
A2.1.2 Bayesian theorem and conjugation
A2.1.3 Probability density of continuous variables
A2.1.4 Quantile function
A2.1.5 Expectation, variance and covariance of random variables
A2.2 Distributions
A2.2.1 Bernoulli distribution
A2.2.2 Binomial distribution
A2.2.3 Poisson distribution
A2.2.4 Gaussian distribution
A2.2.5 Weibull distribution
A2.2.6 The chi-square (χ2) distribution and χ2-test
A2.2.7 Th
Introduction to Materials Informatics (Volume 1): Machine Learning Fundamentals (English Edition): Excerpt
Chapter 1 Introduction
Materials informatics is an emerging and rapidly developing field, particularly since the launch of the Materials Genome Initiative (MGI) in 2011. Ramakrishna et al. (2019) defined materials informatics as follows: “the materials informatics employs techniques, tools, and theories drawn from the emerging fields such as data science, internet, computer science and engineering, and digital technologies to the materials science and engineering to accelerate materials, products and manufacturing innovations”. Agrawal and Choudhary (2016) proposed that materials informatics has become the “fourth paradigm” in the research and development of materials. Following the guidance of MGI, many free, publicly accessible materials databases have been built, and materials data are growing exponentially thanks to high-throughput experiments and high-throughput calculation techniques, as well as data sharing among the materials communities. A list of available materials databases can be found in the review paper of Ramakrishna et al. (2019).
Materials informatics integrates materials science and engineering with artificial intelligence (AI), databases, and machine learning (ML). The aim is twofold. The first goal is to accelerate innovation across the whole materials development continuum, from discovery, development, property optimization, and system design and integration to manufacturing and deployment. The second is to speed up the path from data to knowledge, so that the relationship between material microstructures and macroscopic properties can be understood and mastered on the basis of material composition, processing, and performance. By adding AI and ML to the toolbox of materials science and engineering, materials informatics strengthens the methodologies available for materials research and development.
Materials informatics uses AI and ML to analyze large collections of materials data from experiments, computations, manufacturing, industry, and daily life efficiently and cost-effectively, and to deliver materials knowledge and technology in user-friendly ways to designers, scientists, engineers, and manufacturers of materials and products. ML gives computers the ability to learn from data and make predictions based on data (Samuel 1967). Mitchell (1997) defined ML as follows: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.” In materials informatics, ML algorithms learn from existing material data on properties and/or performance of interest, in order to improve the targeted properties and performance of existing materials and/or to design and discover new materials with desired properties and performance.
Since material datasets are usually small, adaptive design with feedback from experiments and/or computations has been proposed to enhance the ML ability. Figure 1.1 shows the adaptive learning and design in materials informatics. Initially, data from both successful and failed trials are generated from experiments, calculations, and/or manufacturing. Each record pairs the material properties and performance of interest with adjustable parameters such as material composition, processing, testing conditions, and/or service environments. A dataset is usually built from one's own data and/or from data collected from various sources. The material properties and performance of interest are usually called output variables, and the parameters of material composition, processing, and/or testing conditions are normally called input variables. In addition, many molecular, atomic, and electronic parameters, thermodynamic properties, kinetic parameters, and crystalline and amorphous information are often employed as input and/or output variables.
ML is conducted on an initial dataset. The learning results are evaluated and interpreted automatically and/or by domain experts, that is, authorities in a particular area or topic; in materials informatics, the domain experts are materials experts and scientists. The adaptive learning process is an iterative and interactive loop. The learning results are used to design and guide the next experiments, calculations, and/or manufacturing. The new results validate the ML prediction and are simultaneously fed back into the dataset for the next cycle, so the dataset grows after each iteration. The next round of adaptive learning may give a different result because more data have been added, which provides modified guidance on experiments, calculations, and/or manufacturing. Each step in the adaptive design loop thus refines the inputs and outputs to some degree, and the iterations continue until the design goal for the material properties is reached. Lookman
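The adaptive learning loop described in this excerpt can be illustrated with a minimal Python sketch. It is not the book's implementation and does not reproduce Figure 1.1. The function names (`run_experiment`, `predict`, `adaptive_design`), the toy property curve, the k-nearest-neighbor surrogate model, and the exploration weight are all hypothetical choices made for illustration. The sketch only shows the cycle of learning from a small dataset, choosing the next experiment, feeding the result back, and stopping once the design goal is met.

```python
def run_experiment(x):
    """Hypothetical stand-in for a real experiment or calculation.

    x is a single input variable (e.g. a composition fraction); the return
    value is the output variable. The toy curve peaks at x = 0.7.
    """
    return 1.0 - (x - 0.7) ** 2


def predict(dataset, x, k=2):
    """Toy surrogate model (an assumption): mean output of the k nearest points."""
    nearest = sorted(dataset, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def distance_to_data(dataset, x):
    """Distance from x to the closest measured input, used as an exploration bonus."""
    return min(abs(px - x) for px, _ in dataset)


def adaptive_design(dataset, candidates, target, max_iterations=20, explore=0.5):
    """Iterate learn -> design -> measure -> feed back until the target is reached."""
    dataset = list(dataset)  # copy so the caller's initial dataset is untouched
    for _ in range(max_iterations):
        # Stop once the best measured output meets the design goal.
        if max(y for _, y in dataset) >= target:
            break
        measured = {x for x, _ in dataset}
        untested = [x for x in candidates if x not in measured]
        if not untested:
            break
        # Design step: pick the candidate with the best predicted output,
        # plus a bonus for being far from existing data.
        next_x = max(
            untested,
            key=lambda x: predict(dataset, x) + explore * distance_to_data(dataset, x),
        )
        # Feedback step: the new measurement enlarges the dataset.
        dataset.append((next_x, run_experiment(next_x)))
    return dataset


candidates = [i / 20 for i in range(21)]  # allowed values of the input variable
initial = [(x, run_experiment(x)) for x in (candidates[0], candidates[-1])]
target = 0.99
final = adaptive_design(initial, candidates, target)

best_x, best_y = max(final, key=lambda point: point[1])
print(f"measurements: {len(final)}, best input: {best_x}, best output: {best_y:.4f}")
```

In a real study, `run_experiment` would be replaced by an actual measurement or simulation, and the surrogate by a model suited to the data. The loop structure stays the same: the dataset grows by one point per iteration, and each new point can change the guidance for the next round.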