Introduction to HPC
1 Course Overview
1.1 What is High-Performance Computing
1.2 Why HPC is important in science, engineering, and industry
1.3 Examples of real-world HPC applications
1.4 Structure and goals of the course
2 Fundamentals of Computer Architecture
2.1 CPUs, cores, and clock speeds
2.2 Memory hierarchy
2.2.1 Registers
2.2.2 Cache
2.2.3 Main memory (RAM)
2.3 Storage systems
2.4 GPUs and accelerators
2.5 SIMD and vectorization concepts
3 Operating Systems and the Linux Environment
3.1 Why Linux dominates HPC
3.2 Basic Linux command line usage
3.3 Filesystems and directory structures
3.4 Environment modules
3.5 Software installation concepts
4 HPC Clusters and Infrastructure
4.1 What is an HPC cluster
4.2 Login nodes
4.3 Compute nodes
4.4 Head and management nodes
4.5 Interconnects
4.5.1 Ethernet
4.5.2 InfiniBand
4.6 Shared memory systems
4.7 Distributed memory systems
4.8 Parallel filesystems
4.8.1 NFS
4.8.2 Lustre
4.8.3 GPFS
5 Job Scheduling and Resource Management
5.1 Why job schedulers are needed
5.2 Batch systems overview
5.3 Introduction to SLURM
5.4 Writing job scripts
5.5 Submitting jobs
5.6 Monitoring jobs
5.7 Cancelling and modifying jobs
6 Parallel Computing Concepts
6.1 Why parallel computing is needed
6.2 Types of parallelism
6.2.1 Task parallelism
6.2.2 Data parallelism
6.3 Strong scaling
6.4 Weak scaling
6.5 Amdahl’s Law
6.6 Gustafson’s Law
6.7 Load balancing
7 Shared-Memory Parallel Programming
7.1 Introduction to OpenMP
7.2 Threads and thread management
7.3 Parallel regions
7.4 Work-sharing constructs
7.5 Synchronization mechanisms
7.6 Race conditions
7.7 Performance considerations
8 Distributed-Memory Parallel Programming
8.1 Introduction to MPI
8.2 MPI processes
8.3 Point-to-point communication
8.4 Collective communication
8.5 Communicators
8.6 Process topologies
8.7 Performance pitfalls
9 Hybrid Parallel Programming
9.1 Motivation for hybrid programming
9.2 Combining MPI and OpenMP
9.3 Node-level parallelism
9.4 Cluster-level parallelism
9.5 Common hybrid programming patterns
10 GPU and Accelerator Computing
10.1 GPU architecture basics
10.2 Memory hierarchy on GPUs
10.3 Introduction to CUDA
10.4 Introduction to OpenACC
10.5 Performance considerations for accelerators
11 Compilers and Build Systems
11.1 Common HPC compilers
11.1.1 GCC
11.1.2 Intel oneAPI
11.1.3 LLVM
11.2 Compiler optimization flags
11.3 Debug builds
11.4 Optimized builds
11.5 Introduction to Make
11.6 Introduction to CMake
12 Performance Analysis and Optimization
12.1 Measuring performance
12.2 Benchmarking applications
12.3 Profiling tools
12.4 Memory optimization
12.5 Cache optimization
12.6 Vectorization strategies
12.7 Improving parallel efficiency
13 Numerical Libraries and Software Stacks
13.1 Linear algebra libraries
13.1.1 BLAS
13.1.2 LAPACK
13.1.3 ScaLAPACK
13.2 Fast Fourier Transform libraries
13.3 Scientific software frameworks
13.4 Using precompiled software on clusters
14 Data Management and I/O
14.1 Parallel I/O concepts
14.2 File formats used in HPC
14.3 Checkpointing strategies
14.4 Restart mechanisms
14.5 Managing large datasets
15 Reproducibility and Software Environments
15.1 Software stacks
15.2 Containers in HPC
15.3 Introduction to Singularity and Apptainer
15.4 Best practices for reproducible workflows
16 Debugging and Testing Parallel Programs
16.1 Common bugs in parallel programs
16.2 Deadlocks
16.3 Debugging tools for HPC
16.4 Testing strategies for parallel codes
17 HPC in Practice
17.1 Typical HPC workflows
17.2 Developing code locally
17.3 Running applications on clusters
17.4 Case studies from science
17.5 Case studies from industry
18 Ethics, Sustainability, and Green Computing
18.1 Energy consumption of HPC systems
18.2 Efficient resource usage
18.3 Job sizing and fair-share usage
18.4 Responsible computing practices
19 Future Trends in HPC
19.1 Exascale computing
19.2 AI and machine learning in HPC
19.3 Heterogeneous architectures
19.4 Quantum computing and HPC integration
20 Final Project and Hands-On Exercises
20.1 Designing an HPC application
20.2 Running large-scale simulations
20.3 Performance analysis and optimization report
20.4 Documentation and best practices summary