Kahibaro

Work-sharing constructs

What “work sharing” means in OpenMP

In shared-memory programming with OpenMP, a work-sharing construct tells the runtime how to divide the work of a region among the threads in a team. Unlike a parallel region (which creates threads), work-sharing constructs:

  * do not create new threads; they divide work among the threads of the existing team,
  * must be encountered by all threads of the team or by none of them,
  * end with an implicit barrier, unless it is removed with nowait.

The main OpenMP work-sharing constructs are:

  * for (C/C++) and do (Fortran), which distribute loop iterations,
  * sections, which distributes distinct code blocks,
  * single, which restricts a block to one thread,
  * workshare (Fortran only), which distributes array operations.

Tasks are not a classic work-sharing construct, but they fill a similar role and are discussed at the end of this chapter.

This chapter focuses on how these constructs divide work and the typical usage patterns, not on general OpenMP setup or threading basics.

The `for` / `do` construct (loop work sharing)

Loop work sharing is the most common pattern in OpenMP. Its purpose is to split loop iterations among threads.

Basic C/C++ form:

#pragma omp parallel
{
    #pragma omp for
    for (int i = 0; i < N; ++i) {
        // each iteration i is executed by exactly one thread
    }
}

Basic Fortran form:

!$omp parallel
!$omp do
do i = 1, N
    ! each iteration i is executed by exactly one thread
end do
!$omp end do
!$omp end parallel

Key properties:

  * Each iteration is executed by exactly one thread; no iteration is executed twice.
  * The loop must be in canonical form: an integer (or pointer) loop variable with a trip count that can be computed on entry to the loop.
  * The loop variable is automatically private to each thread.
  * There is an implicit barrier at the end of the construct unless nowait is specified.

Scheduling policies

for / do supports several schedule options that control how iterations are split:

  * static: iterations are divided into chunks and assigned to threads in a fixed round-robin order, decided at loop entry.
  * dynamic: threads grab the next available chunk as they finish their current one.
  * guided: like dynamic, but chunk sizes start large and shrink over time.
  * runtime: the policy is chosen at run time from the environment.
  * auto: the compiler and runtime choose.

Only the work-sharing behavior is covered here; detailed performance implications belong to the performance chapter.

Static scheduling

#pragma omp for schedule(static)
for (int i = 0; i < N; ++i) { ... }

Useful when:

  * all iterations take roughly the same amount of time,
  * you want minimal scheduling overhead and a reproducible assignment of iterations to threads.

Dynamic scheduling

#pragma omp for schedule(dynamic, chunk)
for (int i = 0; i < N; ++i) { ... }

Useful when:

  * iteration costs vary or are hard to predict,
  * load balance matters more than the extra scheduling overhead.

Guided scheduling

#pragma omp for schedule(guided, chunk)
for (int i = 0; i < N; ++i) { ... }

Useful when:

  * iteration costs vary, but you want less scheduling overhead than dynamic,
  * assigning large chunks early and smaller chunks late is enough to balance the load.

Runtime and auto

#pragma omp for schedule(runtime)
for (int i = 0; i < N; ++i) { ... }

schedule(runtime) takes the policy from the environment at run time, which is useful for tuning without recompiling; schedule(auto) delegates the choice entirely to the compiler and runtime heuristics.

Using `nowait` to omit the barrier

By default, for / do has an implicit barrier at the end: all threads wait until every thread finishes its iterations.

You can skip this barrier with nowait:

#pragma omp parallel
{
    #pragma omp for nowait
    for (int i = 0; i < N; ++i) {
        // work on part 1
    }
    // No barrier here; threads may continue immediately
    #pragma omp for
    for (int i = 0; i < N; ++i) {
        // work on part 2
    }
}

You should only use nowait when:

  * the code that follows does not depend on all iterations of the loop having completed, or
  * any later dependence is protected by explicit synchronization (such as a barrier).

Combined `parallel for` / `parallel do`

A shorthand combines creating the parallel region and distributing loop iterations:

#pragma omp parallel for schedule(static)
for (int i = 0; i < N; ++i) { ... }

Fortran:

!$omp parallel do schedule(static)
do i = 1, N
    ...
end do
!$omp end parallel do

This behaves (roughly) like:

#pragma omp parallel
{
    #pragma omp for schedule(static)
    for (int i = 0; i < N; ++i) { ... }
}

Use combined forms for simple loop-only parallel regions where you do not need extra code inside the parallel region outside the loop.

The `sections` construct

sections is a work-sharing construct for dividing different code blocks (not loop iterations) among threads.

C/C++:

#pragma omp parallel
{
    #pragma omp sections
    {
        #pragma omp section
        {
            // Work A
        }
        #pragma omp section
        {
            // Work B
        }
        #pragma omp section
        {
            // Work C
        }
    }
}

Fortran:

!$omp parallel
!$omp sections
!$omp section
    ! Work A
!$omp section
    ! Work B
!$omp section
    ! Work C
!$omp end sections
!$omp end parallel

Key properties:

  * Each section is executed by exactly one thread; different sections may run concurrently on different threads.
  * The number of sections limits the available parallelism; threads beyond that number have nothing to do.
  * There is an implicit barrier at the end of the sections region unless nowait is specified.

Typical uses:

  * a small, fixed number of independent jobs, such as reading two files in parallel,
  * overlapping unrelated phases, such as I/O and computation.

`sections` with `nowait`

#pragma omp parallel
{
    #pragma omp sections nowait
    {
        #pragma omp section
        { /* Work A */ }
        #pragma omp section
        { /* Work B */ }
    }
    // No barrier here; threads may proceed before all sections finish
}

Use nowait only when later code does not require the completion of all sections.

The `single` construct

single is a work-sharing construct that specifies code that should be executed by exactly one thread in a team, while the other threads skip it.

C/C++:

#pragma omp parallel
{
    // ... some parallel work ...
    #pragma omp single
    {
        // Only one thread executes this block
        // e.g., input, initialization, or logging
    }
    // implicit barrier by default
}

Fortran:

!$omp parallel
    ! ... some parallel work ...
    !$omp single
        ! Only one thread executes this block
    !$omp end single
!$omp end parallel

Key properties:

  * Exactly one thread executes the block; which thread is unspecified (it need not be the master thread).
  * The other threads skip the block and wait at the implicit barrier at the end, unless nowait is specified.
  * The copyprivate clause can broadcast values computed inside the block to the other threads.

Typical uses:

  * I/O, logging, or reading input that must happen exactly once,
  * initializing shared data inside a parallel region,
  * generating tasks for the rest of the team.

`single` with `nowait`

You can add nowait to avoid the barrier at the end:

#pragma omp parallel
{
    #pragma omp single nowait
    {
        // One thread executes this; others do not wait
    }
    // No barrier here
}

Use this only if other threads do not depend on the result of the single block immediately after it.

The `task` construct as a flexible work-sharing tool

While tasks are often treated separately in OpenMP documentation, they function as a more dynamic work-sharing construct, where work units (tasks) are created at runtime and scheduled onto threads.

Basic C/C++ form:

#pragma omp parallel
{
    #pragma omp single
    {
        for (int i = 0; i < N; ++i) {
            #pragma omp task
            {
                // work for element i
            }
        }
    }
}

Key work-sharing aspects:

  * One thread creates the tasks, but any thread in the team may execute them.
  * The number of tasks is independent of the number of threads, which suits irregular, recursive, or dynamically discovered work.
  * All tasks are guaranteed to have completed at barriers (including the implicit barrier at the end of the parallel region) and at taskwait.

Task-specific features (dependencies, taskloops, etc.) are covered elsewhere; here, the main point is that tasks provide dynamic work distribution, beyond simple loop or section division.

Fortran `workshare` (language-specific construct)

In Fortran, workshare is a construct that can automatically distribute certain array operations and constructs over threads. It is primarily relevant to Fortran array syntax and similar features.

Example pattern (conceptual):

!$omp parallel
!$omp workshare
    A = B + C   ! array operation; iterations over elements are shared
!$omp end workshare
!$omp end parallel

Key idea:

  * The runtime divides the element-wise work of array assignments (and constructs such as WHERE and FORALL) among the threads, much like an implicit parallel loop over the array elements.

Detailed behavior and best practices around workshare are typically Fortran-specific and depend on compiler support.

Choosing between work-sharing constructs

Some common decision guidelines, focusing purely on how work is shared:

  * Many similar loop iterations: use for / do, with a schedule that matches the cost profile of the iterations.
  * A small, fixed number of different jobs: use sections.
  * Work that must happen exactly once inside a parallel region: use single.
  * Irregular, recursive, or dynamically discovered work: use tasks.
  * Fortran array syntax: consider workshare.

Interactions with data scoping and synchronization

Work-sharing constructs interact strongly with:

  * data-scoping clauses such as private, firstprivate, lastprivate, shared, and reduction,
  * synchronization, chiefly the implicit barrier at the end of each construct and the nowait clause that removes it.

Only the work-sharing aspect is emphasized here; the detailed rules and best practices for data scoping and synchronization are discussed in other chapters. When using work-sharing constructs, always ensure:

  1. Each piece of work has the necessary private or shared variables correctly specified.
  2. You understand whether the implicit barrier should remain or be removed (nowait), based on dependencies between threads.
