Tutorial: Terms



Continuation

The code that follows a cilk_spawn'd function call within the spawning function.  While the cilk_spawn'd function runs, the continuation may be stolen by another worker and executed in parallel with it.
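As a rough illustration of where a continuation begins, here is a minimal sketch built around a Fibonacci function. The name fib is illustrative, and the two #define lines are stand-ins assumed only so that the sketch builds with a standard C++ compiler; with an Intel Cilk Plus compiler the real keywords would be used instead.

```cpp
#include <cassert>

// Stand-ins (assumption): let the sketch compile without Cilk Plus support.
#define cilk_spawn
#define cilk_sync

// Illustrative function, not taken from the tutorial text.
long fib(int n) {
    assert(n >= 0);
    if (n < 2) return n;

    long x = cilk_spawn fib(n - 1);  // the cilk_spawn'd function call

    // Everything from here to the cilk_sync is the continuation.
    // An idle worker may steal it and run it while fib(n - 1) executes.
    long y = fib(n - 2);

    cilk_sync;                       // wait until the spawned call has returned
    return x + y;                    // x is only safe to read after the sync
}
```

Calling fib(10) yields 55 whether or not the continuation is ever stolen.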


Metadata

Annotations written to the binary file that specify "interesting" locations.  Each metadata entry includes the address of an instruction, the type of the metadata as a string, and a DWARF expression that specifies a piece of data.


Monoid

In mathematics, a monoid comprises a set of values (type), an associative operation on that set, and an identity value for that operation. Some examples of monoids:

Monoid            Set       Operation       Identity
(integer, +, 0)   integers  addition        0
(real, *, 1)      reals     multiplication  1
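
The two monoids in the table can be sketched in code as follows. This is a minimal sketch under stated assumptions: the names IntAdd, RealMul, fold, and fold_split are hypothetical and are not part of any Intel Cilk Plus interface. The point it tries to show is that associativity lets a sequence be folded in independent pieces and the partial results combined afterwards, with the identity serving as the starting value of each piece.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// (integer, +, 0): hypothetical helper type for this sketch.
struct IntAdd {
    typedef long value_type;
    static value_type identity() { return 0; }
    static value_type combine(value_type a, value_type b) { return a + b; }
};

// (real, *, 1): hypothetical helper type for this sketch.
struct RealMul {
    typedef double value_type;
    static value_type identity() { return 1.0; }
    static value_type combine(value_type a, value_type b) { return a * b; }
};

// Folds v[lo..hi) from left to right, starting from the identity value.
template <typename M>
typename M::value_type fold(const std::vector<typename M::value_type>& v,
                            std::size_t lo, std::size_t hi) {
    assert(lo <= hi && hi <= v.size());
    typename M::value_type acc = M::identity();
    for (std::size_t i = lo; i < hi; ++i) acc = M::combine(acc, v[i]);
    return acc;
}

// Folds the two halves separately, then combines the partial results.
// Associativity is what makes this regrouping give the same answer.
// (Floating-point multiplication is only approximately associative.)
template <typename M>
typename M::value_type fold_split(const std::vector<typename M::value_type>& v) {
    std::size_t mid = v.size() / 2;
    return M::combine(fold<M>(v, 0, mid), fold<M>(v, mid, v.size()));
}
```

Folding an empty range returns the identity, which is why an empty half does not disturb the combined result.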

Race Condition

A race condition occurs when two parallel strands access the same memory location and at least one of them writes to the memory location.  The result is undefined behavior.
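
One common way to avoid the situation described above is to give each parallel strand its own memory location and to combine the results only after a cilk_sync. The following is a minimal sketch of that idea; the function names are hypothetical, and the two #define lines are stand-ins assumed only so that the sketch builds with a standard C++ compiler.

```cpp
#include <cassert>

// Stand-ins (assumption): let the sketch compile without Cilk Plus support.
#define cilk_spawn
#define cilk_sync

// Adds the number of positive values in a[0..n) to *count.
static void add_positive_count(const int* a, int n, int* count) {
    for (int i = 0; i < n; ++i) {
        if (a[i] > 0) ++*count;  // read-modify-write of *count
    }
}

int count_positive(const int* a, int n) {
    assert(n >= 0);
    int half = n / 2;

    // Racy variant, shown only as a comment: handing the same counter to
    // both strands means two parallel strands write one memory location.
    //     cilk_spawn add_positive_count(a, half, &shared);
    //     add_positive_count(a + half, n - half, &shared);

    int left = 0;
    int right = 0;
    cilk_spawn add_positive_count(a, half, &left);    // spawned strand writes left
    add_positive_count(a + half, n - half, &right);   // continuation writes right
    cilk_sync;                                        // both strands are done
    return left + right;                              // one strand reads both
}
```

Both strands read the array, but neither writes it, so those shared reads do not constitute a race.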

Serial Semantics

An Intel Cilk Plus application has serial semantics. That is, the result of a race-free, deterministic parallel run is the same as if the program had executed serially. This makes it easier to reason about the parallel application. In addition, developers can use familiar tools to debug the application.


Serialization

Compiling an Intel Cilk Plus application with all of the task parallelism keywords removed by macros, so that:

  • cilk_spawn'd functions become ordinary function calls.
  • cilk_sync statements become empty statements.
  • cilk_for loops become ordinary for loops.

This can be done by including cilk/cilk_stub.h, which contains macros that replace the Intel Cilk Plus keywords with their serial equivalents.
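
The effect of the three replacements listed above can be sketched with hand-written macros. These definitions are an assumption made for illustration and stand in for the header; its exact contents are not reproduced here. The function names sum, largest, and summarize are likewise hypothetical.

```cpp
#include <cassert>

// Hand-written stand-ins (assumption) for the kind of macros the header provides.
#define cilk_spawn        /* a spawned call becomes an ordinary call   */
#define cilk_sync         /* a sync becomes an empty statement         */
#define cilk_for for      /* a cilk_for becomes an ordinary for loop   */

static long sum(const int* a, int n) {
    long s = 0;
    for (int i = 0; i < n; ++i) s += a[i];
    return s;
}

static int largest(const int* a, int n) {
    int m = a[0];
    for (int i = 1; i < n; ++i) {
        if (a[i] > m) m = a[i];
    }
    return m;
}

// Written with the Cilk Plus keywords, compiled here as its serialization.
void summarize(int* squares, int n, long* total, int* max_value) {
    assert(n > 0);
    cilk_for (int i = 0; i < n; ++i) {    // expands to: for (int i = 0; ...)
        squares[i] = i * i;
    }
    long t = cilk_spawn sum(squares, n);  // expands to: long t = sum(squares, n);
    int m = largest(squares, n);          // only reads squares, as does sum
    cilk_sync;                            // expands to the empty statement ;
    *total = t;
    *max_value = m;
}
```

Because the application has serial semantics, a race-free parallel build of the same source would be expected to produce the same values as this serialization.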


SIMD

An acronym for "Single Instruction, Multiple Data." Modern CPUs include vector units that can perform the same operation on multiple data values simultaneously, providing data parallelism.  SIMD instructions are particularly applicable to operations on arrays of data.


Strand

A sequence of instructions that starts or ends on a statement that changes the parallelism.  A strand is delimited by one of the following Cilk Plus keywords: cilk_spawn, cilk_sync (including the implied cilk_sync at the end of a function), or cilk_for.


Vectorization

Modern CPUs include vector units that can execute an operation on multiple units of data simultaneously.  Vectorization is the generation of code that uses SIMD instructions instead of a series of instructions that each operate on a single data value.
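
The difference between the two code shapes can be pictured with a conceptual sketch. It uses no SIMD instructions itself: the inner loop over lanes only models what a single vector instruction would do in one step. The lane count of 4 and the function names are assumptions chosen for illustration; real vector widths depend on the CPU and the element type.

```cpp
#include <cassert>
#include <cstddef>

// One element per iteration: a series of operations on single data values.
void add_scalar(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

const std::size_t kLanes = 4;  // hypothetical vector width

// Same result, arranged the way vectorized code is arranged:
// whole groups of kLanes elements first, then a scalar remainder loop.
void add_in_lanes(const float* a, const float* b, float* out, std::size_t n) {
    assert(n == 0 || (a && b && out));
    std::size_t i = 0;
    for (; i + kLanes <= n; i += kLanes) {
        // A vector unit would perform these kLanes additions simultaneously.
        for (std::size_t lane = 0; lane < kLanes; ++lane) {
            out[i + lane] = a[i + lane] + b[i + lane];
        }
    }
    for (; i < n; ++i) out[i] = a[i] + b[i];  // leftover elements
}
```

For an array of six elements, the grouped loop handles the first four and the remainder loop handles the last two.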


Work Stealing

The cilk_spawn keyword causes the compiler to generate code that adds an entry to the back of a queue of spawned tasks maintained for each processor. The entry is added before the call to the spawned function is made. When every processor has work to do, a spawn costs roughly as much as an ordinary function call. When a spawned function returns, the back entry is popped off the queue.

When a processor has no work to do, it will steal a task from the front of some other processor's queue. The stolen task is actually the continuation of the function that spawned a function call. With sufficient parallelism, steals are rare and the result is speedup proportional to the number of processor cores (ignoring memory effects).
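
The queue discipline described above can be modeled in a few lines. This is a single-threaded sketch only: the WorkerQueue type and its member names are hypothetical, entries are plain strings standing in for continuations, and the synchronization a real runtime needs between the owner and a thief is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical model of one processor's queue of spawned tasks.
struct WorkerQueue {
    std::deque<std::string> tasks;

    // On cilk_spawn: the entry goes on the back before the call is made.
    void push_on_spawn(const std::string& continuation) {
        tasks.push_back(continuation);
    }

    // When the spawned function returns: the back entry is popped.
    std::string pop_on_return() {
        assert(!tasks.empty());
        std::string entry = tasks.back();
        tasks.pop_back();
        return entry;
    }

    // A processor with no work takes from the front of another's queue.
    std::string steal() {
        assert(!tasks.empty());
        std::string entry = tasks.front();
        tasks.pop_front();
        return entry;
    }

    std::size_t size() const { return tasks.size(); }
};
```

With three entries pushed in order, a thief receives the oldest one while the owner continues to work on the newest, so the two ends of the queue are used by different parties.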


Worker

A concurrent agent that executes the instructions in one strand, possibly at the same time that another worker executes instructions in a parallel strand. Workers are managed by the Intel Cilk Plus runtime system's work-stealing scheduler. A worker is implemented as an operating system thread.
