
Final Project and Hands-On Exercises

Overview of the Final Project

The final part of this course is intentionally practical. Instead of just answering quiz questions, you will design, run, and analyze an HPC workload end‑to‑end, using the concepts you have learned.

This chapter describes how to plan, carry out, and document the final project, and outlines a set of progressive hands‑on exercises that lead up to it.

The technical “how” (e.g., how to submit a SLURM job, how to use MPI, how to profile) is covered in earlier chapters. Here the focus is on turning those pieces into a coherent project and practice workflow.

Goals of the Final Project

By the end of the project you should be able to:

The project is intentionally open‑ended: you can work with provided examples or bring a small problem from your own domain, as long as it fits within the course constraints.

Project Constraints and Scope

To keep the project manageable for beginners, we impose some limits. A typical project should:

You do not need to:

Depth and clarity matter more than size or complexity.

Types of Acceptable Projects

You can choose one of several broad project types. Discuss with your instructor which option fits your background and time.

1. Parallelization of an Existing Serial Code

Typical steps:

  1. Understand the algorithm and identify the most time‑consuming part.
  2. Select a parallelization strategy (e.g., loop parallelism, domain decomposition).
  3. Implement a minimal but correct parallel version.
  4. Run scaling experiments (vary thread count, processes, or nodes).
  5. Analyze speedup and bottlenecks.
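Step 4 can be as simple as running the same binary at a few thread counts and recording wall times. The sketch below assumes an OpenMP executable; `APP` is a placeholder, and `true` stands in for it so the loop runs anywhere:

```shell
#!/bin/bash
# Hypothetical sketch of step 4: time the same executable at several thread
# counts. APP is a placeholder; point it at your own OpenMP binary.
APP="${APP:-true}"        # 'true' is a no-op stand-in so the loop runs anywhere

: > timings.txt           # start with an empty results file
for t in 1 2 4 8; do
    export OMP_NUM_THREADS=$t
    start=$(date +%s.%N)
    $APP
    end=$(date +%s.%N)
    # awk handles the floating-point subtraction portably
    awk -v s="$start" -v e="$end" -v t="$t" \
        'BEGIN { printf "threads=%d seconds=%.3f\n", t, e - s }' | tee -a timings.txt
done
```

Keeping the raw times in a file from the start makes the later analysis and plotting steps much easier.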

2. Scaling Study of a Preexisting Parallel Application

Typical steps:

  1. Learn how to run the application via job scripts.
  2. Choose a test case and define what you will measure.
  3. Run the same case with different core counts or nodes.
  4. Analyze how performance changes and why (I/O, communication, load imbalance, etc.).
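For step 3, a small generator script avoids hand-editing one job script per core count. The sketch below assumes a SLURM cluster; the script names, partition defaults, and `./app` invocation are placeholders, and submission is skipped when `sbatch` is not available:

```shell
#!/bin/bash
# Hypothetical sketch: generate one job script per node count for a
# strong-scaling study. File names and the './app' command are placeholders.
for nodes in 1 2 4 8; do
    cat > "scale_${nodes}.sh" <<EOF
#!/bin/bash
#SBATCH --job-name=scale_${nodes}
#SBATCH --nodes=${nodes}
#SBATCH --ntasks-per-node=32
#SBATCH --time=00:30:00
#SBATCH --output=scale_${nodes}_%j.out

srun ./app --input case.dat
EOF
    # Submit only where a scheduler is actually available.
    if command -v sbatch >/dev/null 2>&1; then
        sbatch "scale_${nodes}.sh"
    else
        echo "generated scale_${nodes}.sh (sbatch not available here)"
    fi
done
```

Generating the scripts also leaves a record of exactly which configurations were run, which feeds directly into the reproducibility deliverable.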

3. End‑to‑End Workflow Project

Typical steps:

  1. Write scripts for each stage (e.g., data generation, simulation, analysis).
  2. Integrate them with the batch system (job dependencies, arrays).
  3. Log configuration, environment modules, and parameters.
  4. Measure throughput and resource utilization for the whole workflow.
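Step 2 usually means chaining stages with SLURM job dependencies. A minimal sketch, assuming three hypothetical stage scripts (`generate.sh`, `simulate.sh`, `analyze.sh`) and falling back to fake job IDs on machines without a scheduler:

```shell
#!/bin/bash
# Hypothetical sketch of step 2: chain three workflow stages so each starts
# only after the previous one succeeds. Stage script names are placeholders.
submit () {
    if command -v sbatch >/dev/null 2>&1; then
        sbatch --parsable "$@"      # --parsable prints just the job ID
    else
        echo $RANDOM                # fake ID so the chain logic runs anywhere
    fi
}

gen_id=$(submit generate.sh)
sim_id=$(submit --dependency=afterok:"$gen_id" simulate.sh)
ana_id=$(submit --dependency=afterok:"$sim_id" analyze.sh)
echo "chain: $gen_id -> $sim_id -> $ana_id"
```

The `afterok` dependency means a failed stage stops the chain instead of wasting allocation on doomed downstream jobs.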

4. Mini Application from Your Domain

Typical steps:

  1. Define a clear, limited problem (not your entire research project).
  2. Identify where parallelism is natural (e.g., independent tasks, data chunks).
  3. Implement a prototype and demonstrate at least one scaling experiment.
  4. Relate the results back to the characteristics of your problem.

Core Deliverables

Every project must produce four main artifacts:

  1. Code and job scripts
    • Source code files
    • Job submission scripts for the batch scheduler
    • Any auxiliary scripts (data generation, plotting, postprocessing)
  2. Configuration and environment description
    • Which modules or software stacks you used
    • Compiler and key compilation flags
    • Hardware and job parameters (nodes, tasks, threads, memory, GPUs)
  3. Performance and scaling results
    • Raw measurements (execution times, iterations per second, throughput, etc.)
    • At least one set of strong or weak scaling results
    • A short interpretation of the results
  4. Written report
    • See the next section for a suggested structure

You may also be asked for:

Suggested Report Structure

Keep the report concise but complete; 5–10 pages is typically sufficient if it is well organized.

1. Introduction

2. Methods and Implementation

Focus on what is specific to your implementation, not on re‑explaining basic parallel computing theory.

3. Experimental Setup

4. Results

Present numerical measurements rather than only qualitative statements.

Examples of what to include:

Use tables and basic plots (e.g., runtime vs. cores, speedup vs. cores) to make trends clear.
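Speedup and parallel efficiency tables can be produced directly from the raw timings. A sketch, assuming a made-up `cores,seconds` CSV format (the numbers below are illustrative, not real measurements):

```shell
#!/bin/bash
# Hypothetical sketch: turn raw timings into a speedup/efficiency table.
# The 'cores,seconds' CSV layout and the values are illustrative only.
cat > timings.csv <<EOF
1,100.0
2,52.0
4,27.5
8,15.0
EOF

# speedup = T(1)/T(p); efficiency = speedup / p
awk -F, 'NR==1 { base=$2 }
         { printf "cores=%-3d time=%6.1f speedup=%5.2f efficiency=%4.0f%%\n",
                  $1, $2, base/$2, 100*base/($2*$1) }' timings.csv > scaling.txt
cat scaling.txt
```

The same file can be fed straight into your plotting tool for the runtime-vs-cores and speedup-vs-cores figures.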

5. Discussion

Interpret the results in light of what you learned in the course:

You do not need extremely detailed performance modeling, but you should connect observations to plausible causes.

6. Limitations and Future Work

7. Reproducibility Notes

Evaluation Criteria

While exact grading rubrics vary, projects are commonly evaluated on:

  1. Correctness and robustness
    • The code runs and produces reasonable, consistent results
    • The parallel version is logically correct (no obvious race conditions, deadlocks, or incorrect outputs for tested cases)
  2. Appropriate use of HPC resources
    • Jobs request realistic resources (no massive over‑allocation for tiny tasks)
    • The chosen form of parallelism is sensible for the problem
    • Basic job management practices are followed (batch submission rather than overloading interactive or login nodes)
  3. Performance investigation
    • You collected meaningful data (timings, scaling) rather than single anecdotal runs
    • You attempted at least basic performance improvements
    • You can explain performance trends in a reasoned way
  4. Quality of documentation
    • Clear, organized report
    • Enough detail to understand and reproduce the work
    • Transparent about limitations and problems encountered
  5. Professional practices
    • Clean structure of code and scripts
    • Version control usage when possible
    • Attention to reproducibility and environment management

Creativity and ambition are valued, but a smaller, well‑executed project is preferable to a grand plan that never fully works.

Recommended Workflow for the Project

To prevent last‑minute surprises, follow a staged approach.

Stage 1: Define the Problem and Plan

Stage 2: Get a Correct Baseline

Focus on correctness before performance.

Stage 3: Introduce Parallelism or Scaling

Make sure to keep the serial or single‑process version intact for comparison.

Stage 4: Systematic Experiments

Plan a small but structured experiment matrix, for example:

For each configuration:
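As a sketch, a small matrix might combine a few core counts with a couple of problem sizes and several repeats per configuration (all values below are illustrative):

```shell
#!/bin/bash
# Hypothetical sketch: enumerate a small experiment matrix with repeats so
# run-to-run variation can be assessed. All values are illustrative.
: > matrix.txt
for cores in 1 4 16; do
    for size in small large; do
        for rep in 1 2 3; do
            echo "cores=$cores size=$size rep=$rep" >> matrix.txt
            # here you would submit or run the actual configuration
        done
    done
done
wc -l < matrix.txt
```

Enumerating the matrix up front keeps the experiment count honest (here 3 × 2 × 3 = 18 runs) and gives you a checklist to tick off as jobs complete.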

Stage 5: Analyze and Visualize

Use profiling or logging information as needed to support your explanations.

Stage 6: Finalize Report and Packaging

Write and proofread the report, checking that all figures are readable and labeled, and that another reader could follow your workflow.

Progressive Hands‑On Exercises

Before or alongside the final project, you are encouraged (or may be required) to complete smaller hands‑on exercises. These are designed to practice individual skills you will need for the project.

Below is a suggested progression; details such as exact commands or short code snippets are covered in earlier chapters.

Exercise 1: Basic Job Submission

Goal: Become comfortable with the batch system and job lifecycle.

Exercise 2: Thread‑Level Parallelism

Goal: Practice thread control, environment variables, and basic performance measurement.

Exercise 3: Process‑Level Parallelism

Goal: Understand multi‑process execution and mapping onto cluster resources.

Exercise 4: Simple Scaling Study

Goal: Bridge between “it runs” and “how well does it scale,” in preparation for the project.

Exercise 5: Workflow and Reproducibility

Goal: Practice constructing small but repeatable HPC workflows.

Collaboration and Academic Integrity

Collaboration policies vary by course; follow your instructor’s rules. Common guidelines include:

Transparency is crucial: clearly stating what is your own work, what is adapted, and what is used unchanged is part of professional HPC practice.

Practical Tips for a Successful Project

The final project is your opportunity to integrate everything you have learned about HPC into a concrete, working example. Treat it as a miniature version of real‑world HPC work: design, implement, run, measure, understand, and clearly communicate your results.
