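Parallel Programming (并行程序設計): Publication Information
- ISBN: 9787302660965
- Barcode: 9787302660965; 978-7-302-66096-5
- Binding: standard offset paper
- Number of volumes: not available
- Weight: not available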
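Parallel Programming: Features
The book emphasizes hands-on training in parallel programming. When it introduces a parallel programming language or interface, it also provides sample code and example programs, so that readers can see clearly how to use these interfaces and go on to write parallel programs that meet their needs.
The book covers the mainstream parallel programming languages/interfaces, including the multithreaded programming interfaces OpenMP and Pthreads, the message-passing interface MPI, and heterogeneous programming interfaces such as CUDA, OpenCL, and OpenACC.
The book introduces the Athread programming interface for China's domestically developed Sunway (申威) processor architecture, helping readers learn how to write parallel programs on China's domestically developed supercomputing systems.
Accessible and practice-oriented, it teaches you step by step how to write parallel programs.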
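Parallel Programming: Summary
This book is a textbook written for courses such as "Parallel Programming" and "Parallel Computing." It covers the fundamentals of parallel computing; OpenMP and Pthreads multithreaded programming for shared-memory systems; MPI programming for message-passing systems; the Slurm job management system; CUDA/OpenCL/OpenACC/Athread programming for GPUs and other heterogeneous systems; common parallel design and performance optimization methods; and typical parallel application algorithms. It also adds material on programming China's domestically developed Sunway (申威) processors. The book covers the most commonly used programming languages/interfaces, design and performance optimization methods, and basic application algorithms in parallel programming. On the one hand, it reflects new features of mature programming languages/interfaces such as OpenMP and MPI, as well as newer programming interfaces such as those for GPU heterogeneous programming. On the other hand, in the part on typical parallel application algorithms, it tries to present the algorithms in a way that computing professionals can readily understand, taking the conjugate gradient method for the iterative solution of linear systems as a particular example. Every chapter includes programming-focused exercises that encourage readers to master the methods by writing programs. The book is suitable as a textbook for senior undergraduates and graduate students in computer science and information-related majors, and as a reference for researchers in high-performance computing and parallel computing.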
Parallel Programming: Table of Contents
Chapter 1 Overview of Parallel Programming
1.1 Overview of Parallelism
1.2 How to Measure Computing Speed
1.3 Fundamentals of Parallel Computing Systems
1.3.1 Flynn's Taxonomy
1.3.2 Shared-Memory Systems and Message-Passing Systems
1.3.3 Several Common Parallel Computing Systems
1.3.4 Interconnection Networks
1.3.5 Multilevel Memory Hierarchy
1.4 Classification of Parallel Programming Languages/Interfaces
1.5 Floating-Point Formats
1.6 Example Programs
1.6.1 Matrix Multiplication
1.6.2 Reduction and Scan
1.7 Summary
Exercises
Chapter 2 Parallel Programming for Shared-Memory Systems
2.1 Parallel Models in Shared-Memory Systems
2.1.1 Overview of Multithreaded Parallelism
2.1.2 The Concepts of Synchronization and Mutual Exclusion
2.2 OpenMP Programming
2.2.1 Overview
2.2.2 Basic OpenMP Directives
2.2.3 Worksharing Constructs and Their Combinations
2.2.4 Synchronization and Mutual Exclusion Between Threads
2.2.5 Common Clauses
2.2.6 OpenMP Example Program: Computing Pi by the Series Method
2.2.7 The task Construct
2.3 Pthreads Programming
2.3.1 Introduction to Pthreads
2.3.2 Thread Creation and Termination
2.3.3 Thread Mutual Exclusion
2.3.4 Pthreads Example Program: Computing Pi by the Series Method
2.3.5 Thread Synchronization
2.3.6 Pthreads Example Program: Producer-Consumer
2.3.7 Thread Deadlock and Lock Granularity
2.4 New Programming Languages/Interfaces for Multicore Systems
2.4.1 Cilk and Cilk++
2.4.2 TBB
2.5 Summary
Exercises
Chapter 3 Parallel Programming for Message-Passing Systems
3.1 Introduction to MPI
3.1.1 What Is MPI?
3.1.2 The MPI Parallel Model
3.1.3 A Simple MPI Program
3.1.4 The Basic MPI Environment
3.1.5 Communicators, Process Groups, and Process Ranks
3.1.6 MPI Data Types
3.1.7 Introduction to MPI Communication
3.2 Point-to-Point Communication
3.2.1 Standard Communication Mode
3.2.2 Buffered Communication Mode
3.2.3 Synchronous Communication Mode
3.2.4 Ready Communication Mode
3.2.5 Summary of the Four Communication Modes
3.2.6 Combined Send and Receive
3.2.7 Nonblocking Communication
3.3 Collective Communication
3.3.1 Overview of Collective Communication
3.3.2 Broadcast: MPI_Bcast
3.3.3 Scatter: MPI_Scatter
3.3.4 Gather: MPI_Gather
3.3.5 All-Gather: MPI_Allgather
3.3.6 All-to-All Exchange: MPI_Alltoall
3.3.7 Reduction: MPI_Reduce
3.3.8 All-Reduce: MPI_Allreduce
3.3.9 Scan: MPI_Scan
3.3.10 Barrier: MPI_Barrier
3.4 An MPI Example Program
3.4.1 Computing Numerical Integrals
3.4.2 A Pi Computation Program Based on Numerical Integration
3.4.3 MPI Wall-Clock Time
3.5 Process Groups and Communicators
3.5.1 Group Management
3.5.2 Communicator Management
3.5.3 Inter-Communicators
3.6 MPI and Multithreading
3.6.1 How to Use Multithreading in MPI Programs
3.6.2 MPI+OpenMP Example Program
3.6.3 Analysis and Discussion
3.7 Process Topologies
3.7.1 Introduction to Process Topologies
3.7.2 Creating Process Topologies
3.7.3 Communication Functions Related to Process Topologies
3.8 PGAS Programming and Languages
3.9 Job Management Systems and Their Use
3.9.1 Introduction to Job Management Systems
3.9.2 Introduction to Slurm
3.9.3 Running Programs as Jobs in Slurm
3.9.4 Slurm Job Scripts
3.9.5 Running Programs in Other Ways in Slurm
3.9.6 Common Slurm Commands
3.10 Summary
Exercises
Chapter 4 Parallel Programming for Heterogeneous Systems
4.1 Overview of Heterogeneous System Programming
4.2 CUDA Programming for NVIDIA GPUs
4.2.1 CUDA Overview
4.2.2 Hello World Program: The Basic Form of a CUDA Program
4.2.3 Adding Two Integers: CPU-GPU Data Exchange
4.2.4 Vector Sum Program: CUDA Multithreading
4.2.5 CUDA Thread Organization
4.2.6 CUDA Memory Hierarchy and Variable Qualifiers
4.2.7 Function Qualifiers
4.2.8 CUDA Streams
4.2.9 Performance Optimization
4.2.10 CUDA Unified Memory
4.2.11 Using Multiple GPUs
4.3 OpenCL Programming
4.3.1 OpenCL Overview
4.3.2 Execution Flow of an OpenCL Program and Related APIs
4.3.3 OpenCL Example Program 1: Vector Sum
4.3.4 OpenCL Execution Model and Thread Organization
4.3.5 OpenCL Memory Hierarchy
4.3.6 OpenCL Example Program 2: Matrix Multiplication
4.4 Athread Programming for Sunway (申威) Processors
4.4.1 Introduction to the Sunway Processor and Its Programming
4.4.2 Hello World Program: The Basic Form of an Athread Program
4.4.3 The Local Storage Space Attribute of Athread Variables
4.4.4 The Athread Master-Slave Core Programming Interface
4.4.5 Athread Register Communication
4.4.6 An Athread Version of Cannon's Parallel Matrix Multiplication
4.5 OpenACC Programming
4.5.1 OpenACC Overview
4.5.2 OpenACC Syntax
4.5.3 OpenACC Loop Parallelism
4.5.4 OpenACC Programming on Sunway Processors
4.6 Summary
Exercises
Chapter 5 Performance Optimization of Parallel Programs
5.1 Amdahl's Law
5.2 Main Factors Affecting Performance
5.2.1 Parallel Overhead
5.2.2 Load Balancing
5.2.3 Parallel Granularity
5.2.4 Parallel Partitioning
5.2.5 Dependencies
5.2.6 Locality
5.3 Scalability of Parallel Programs and Performance Optimization Methods
5.3.1 What Is the Scalability of a Parallel Program?
5.3.2 An Important Principle for Ensuring Parallel Program Scalability: Independent Computation Blocks
5.3.3 The Impact of Data Partitioning on Performance and Scalability
5.3.4 Other Common Performance Optimization Methods
5.4 The PCAM Parallel Design Method
5.4.1 Partitioning
5.4.2 Communication
5.4.3 Agglomeration
5.4.4 Mapping
5.5 Summary
Exercises
Chapter 6 Typical Parallel Application Algorithms
6.1 Matrix Multiplication
6.1.1 Block-Based Parallel Matrix Multiplication
6.1.2 Improved Block Matrix Multiplication: Cannon's Algorithm
6.1.3 Dedicated Hardware for Matrix Multiplication: Systolic Arrays
6.2 Direct Solution of Linear Systems
6.2.1 Introduction to Linear Systems and Their Solution Methods
6.2.2 Back Substitution for Triangular Systems
6.2.3 Gaussian Elimination
6.2.4 The LU Decomposition Algorithm
6.2.5 Parallel LU Decomposition: Row-Interleaved Striped Partitioning and Block-Cyclic Distribution
6.3 Iterative Solution of Linear Systems
6.3.1 Iterative Solution Methods
6.3.2 The Conjugate Gradient Method
6.3.3 Iterative Solution Example: Solving Partial Differential Equations
6.3.4 Discussion of the Parallelism of Several Iterative Methods
6.3.5 Compressed Data Formats for Sparse Matrices
6.4 Quicksort
6.5 Fast Fourier Transform
6.5.1 Algorithm Background
6.5.2 Algorithm Principles
6.5.3 Converting the Recursive Algorithm to an Iterative Algorithm
6.5.4 Parallel Algorithm
6.6 Basic Linear Algebra Libraries and Software Packages
6.6.1 The Linear Algebra Library BLAS
6.6.2 The Linear Algebra Software Package LAPACK
6.7 Summary
Exercises
Appendix A English Abbreviations
References
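Parallel Programming: About the Author
Liu Yi (劉軼), Ph.D., is a professor and doctoral supervisor at the School of Computer Science and Engineering, Beihang University, where he currently serves as director of the Sino-German Joint Software Institute. His main research areas are computer architecture and high-performance computing, and computer networks. In recent years he has led or participated in several major and key research projects, working on parallel programming for multicore/manycore processors and on simulation and fault-tolerance techniques for high-performance computing systems. He has recently published dozens of papers in journals, transactions, and IEEE conferences, and holds more than 20 invention patents. He has received one first prize of the Beijing Science and Technology Progress Award and one first prize for teaching achievement. Courses he has taught in recent years include "Computer Networks" for undergraduates and "Parallel Programming" for graduate students at the School of Computer Science and Engineering, and "Parallel Computing" for graduate students at the School of Advanced Engineering.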