Basic Spin Manual

Promela Manual (extracted from Spin Manual at http://spinroot.com/spin/Man/Manual.html)

Promela is a verification modeling language. It provides a vehicle for making abstractions of protocols (or distributed systems in general) that suppress details unrelated to process interaction. The intended use of Spin is to verify fractions of process behavior that, for one reason or another, are considered suspect. The relevant behavior is modeled in Promela and verified. A complete verification is therefore typically performed in a series of steps, with the construction of increasingly detailed Promela models. Each model can be verified with Spin under different types of assumptions about the environment (e.g., message loss, message duplication, etc.). Once the correctness of a model has been established with Spin, that fact can be used in the construction and verification of all subsequent models.

Promela programs consist of processes, message channels, and variables. Processes are global objects. Message channels and variables can be declared either globally or locally within a process. Processes specify behavior; channels and global variables define the environment in which the processes run.

Executability

In Promela there is no difference between conditions and statements: even isolated boolean conditions can be used as statements. The execution of every statement is conditional on its executability. Statements are either executable or blocked. Executability is the basic means of synchronization. A process can wait for an event to happen by waiting for a statement to become executable. For instance, instead of writing a busy wait loop:

        while (a != b)
                skip    /* wait for a==b */

one can achieve the same effect in Promela with the statement

        (a == b)

A condition can only be executed (passed) when it holds. If the condition does not hold, execution blocks until it does.
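
As a minimal sketch of this kind of synchronization, the model below lets one process wait on a condition that another process eventually makes true. The names ready, consumer, and producer are hypothetical, and the proctype, init, and run constructs it relies on are explained in the sections on process types and process instantiation.

```promela
bool ready = false;

proctype consumer()
{
        (ready == true);        /* blocks here until ready becomes true */
        printf("consumer proceeds\n")
}

proctype producer()
{
        ready = true            /* an assignment is always executable */
}

init
{
        run consumer();
        run producer()
}
```

Whatever order the two processes are scheduled in, the consumer cannot pass its first statement before the producer has executed its assignment.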

Variables are used to store either global information about the system as a whole, or information local to one specific process, depending on where the declaration for the variable is placed. The declarations

        bool flag;
        int state;
        byte msg;

define variables that can store integer values in three different ranges. The scope of a variable is global if it is declared outside all process declarations, and local if it is declared within a process declaration.

Data Types

The table below summarizes the basic data types, sizes, and the corresponding value ranges (on a DEC VAX computer).

Table 1 - Data Types

        Typename      C-equivalent   Macro in limits.h          Typical Range
        bit or bool   bit-field      -                          0..1
        byte          uchar          CHAR_BIT (width in bits)   0..255
        short         short          SHRT_MIN..SHRT_MAX         -2^15 .. 2^15 - 1
        int           int            INT_MIN..INT_MAX           -2^31 .. 2^31 - 1

The names bit and bool are synonyms for a single bit of information. A byte is an unsigned quantity that can store a value between 0 and 255. shorts and ints are signed quantities that differ only in the range of values they can hold.

An mtype variable can be assigned symbolic values that are declared in an mtype = { ... } statement, to be discussed below.

Array Variables

Variables can be declared as arrays. For instance,

        byte state[N]

declares an array of N bytes that can be accessed in statements such as

        state[0] = state[3] + 5 * state[3*2/n]

where n is a constant or a variable declared elsewhere. The index to an array can be any expression that determines a unique integer value. The effect of an index value outside the range 0..N-1 is undefined; most likely it will cause a runtime error. (Multi-dimensional arrays can be defined indirectly with the help of the typedef construct.)
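
The typedef construct is not described further in this text, so the following is only a sketch of how a two-dimensional array might be modeled with it. The names ROWS, COLS, row, col, and grid are hypothetical.

```promela
/* hypothetical dimensions */
#define ROWS    3
#define COLS    4

typedef row {
        byte col[COLS]
};

row grid[ROWS];         /* grid[i].col[j] acts like a ROWS x COLS array */

init
{
        grid[1].col[2] = 5;
        grid[0].col[0] = grid[1].col[2] + 1;
        printf("grid[0].col[0] = %d\n", grid[0].col[0])
}
```

Each element of grid is a structure holding one row, so the first index selects the row and the second selects the column within it.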

So far we have seen examples of variable declarations and of two types of statements: boolean conditions and assignments. Declarations and assignments are always executable. Conditions are only executable when they hold.

Process Types

The state of a variable or of a message channel can only be changed or inspected by processes. The behavior of a process is defined in a proctype declaration. The following, for instance, declares a process with one local variable state.

        proctype A()
        {       byte state;
                state = 3
        }

The process type is named A. The body of the declaration is enclosed in curly braces. The declaration body consists of a list of zero or more declarations of local variables and/or statements. The declaration above contains one local variable declaration and a single statement: an assignment of the value 3 to variable state.

The semicolon is a statement separator (not a statement terminator, hence there is no semicolon after the last statement). Promela accepts two different statement separators: an arrow `->' and the semicolon `;'. The two statement separators are equivalent. The arrow is sometimes used as an informal way to indicate a causal relation between two statements. Consider the following example.

        byte state = 2;

        proctype A()
        {       (state == 1) -> state = 3
        }

        proctype B()
        {       state = state - 1
        }

In this example we declared two types of processes, A and B. Variable state is now a global, initialized to the value two. Process type A contains two statements, separated by an arrow. In the example, process declaration B contains a single statement that decrements the value of the state variable by one. Since the assignment is always executable, processes of type B can always complete without delay. Processes of type A, however, are delayed at the condition until the variable state contains the proper value.

Process Instantiation

A proctype definition only declares process behavior; it does not execute it. Initially, in the Promela model, just one process will be executed: a process of type init, which must be declared explicitly in every Promela specification. The smallest possible Promela specification, therefore, is:

        init { skip }

where skip is a dummy, null statement. More interestingly, however, the initial process can initialize global variables, and instantiate processes. An init declaration for the above system, for instance, could look as follows.

        init
        {       run A(); run B()
        }

run is used as a unary operator that takes the name of a process type (e.g. A). It is executable only if a process of the type specified can be instantiated. It is unexecutable if this cannot be done, for instance if too many processes are already running.

The run statement can pass parameter values of all basic data types to the new process. The declarations are then written, for instance, as follows:

        proctype A(byte state; short foo)
        {
                (state == 1) -> state = foo
        }

        init
        {
                run A(1, 3)
        }

Data arrays or process types can not be passed as parameters. As we will see below, there is just one other data type that can be used as a parameter: a message channel.

Run statements can be used in any process to spawn new processes, not just in the initial process. An executing process disappears again when it terminates (i.e., reaches the end of the body of its process type declaration), but not before all processes that it started have terminated.

With the run statement we can create any number of copies of the process types A and B. If, however, more than one concurrent process is allowed to both read and write the value of a global variable, a well-known set of problems can result; for example, see [2]. Consider, for instance, the following system of two processes, sharing access to the global variable state.

        byte state = 1;

        proctype A()
        {       byte tmp;
                (state==1) -> tmp = state; tmp = tmp+1; state = tmp
        }

        proctype B()
        {       byte tmp;
                (state==1) -> tmp = state; tmp = tmp-1; state = tmp
        }

        init
        {       run A(); run B()
        }

If one of the two processes completes before its competitor has started, the other process will block forever on the initial condition. If both pass the condition simultaneously, both will complete, but the resulting value of state is unpredictable. It can be any of the values 0, 1, or 2.

Many solutions to this problem have been considered, ranging from an abolishment of global variables to the provision of special machine instructions that can guarantee an indivisible test and set sequence on a shared variable. The example below was one of the first solutions published. It is due to the Dutch mathematician Dekker. It grants two processes mutually exclusive access to an arbitrary critical section in their code, by manipulating three additional global variables. The first four lines in the Promela specification below are C-style macro definitions. The first two macros define true to be a constant value equal to 1 and false to be a constant 0. Similarly, Aturn and Bturn are defined as constants.

        #define true   1
        #define false  0
        #define Aturn  false
        #define Bturn  true

        bool x, y, t;

        proctype A()
        {       x = true;
                t = Bturn;
                (y == false || t == Aturn);
                /* critical section */
                x = false
        }

        proctype B()
        {       y = true;
                t = Aturn;
                (x == false || t == Bturn);
                /* critical section */
                y = false
        }

        init
        {       run A(); run B()
        }

The algorithm can be executed repeatedly and is independent of the relative speeds of the two processes.

Atomic Sequences

In Promela there is also another way to avoid the test and set problem: atomic sequences. By prefixing a sequence of statements enclosed in curly braces with the keyword atomic the user can indicate that the sequence is to be executed as one indivisible unit, non-interleaved with any other processes. In the version of Spin described here, it is a run-time error if any statement other than the first blocks inside an atomic sequence; more recent versions of Spin instead let the sequence lose its atomicity at that point. This is how we can use atomic sequences to protect the concurrent access to the global variable state in the earlier example.

        byte state = 1;

        proctype A()
        {       atomic {
                  (state==1) -> state = state+1
                }
        }

        proctype B()
        {       atomic {
                  (state==1) -> state = state-1
                }
        }

        init
        {       run A(); run B()
        }

In this case the final value of state is either zero or two, depending on which process executes. The other process will be blocked forever.

Atomic sequences can be an important tool in reducing the complexity of verification models. Note that an atomic sequence restricts the amount of interleaving that is allowed in a distributed system. Otherwise intractable models can be made tractable by, for instance, enclosing all manipulations of local variables in atomic sequences. The reduction in complexity can be dramatic.

Message Passing

Message channels are used to model the transfer of data from one process to another. They are declared either locally or globally, for instance as follows:

        chan qname = [16] of { short }

This declares a channel that can store up to 16 messages of type short. Channel names can be passed from one process to another via channels or as parameters in process instantiations. If the messages to be passed by the channel have more than one field, the declaration may look as follows:

        chan qname = [16] of { byte, int, chan, byte }

This time the channel stores up to sixteen messages, each consisting of two 8-bit values, one 32-bit value, and a channel name.

The statement

        qname!expr

sends the value of expression expr to the channel that we just created, that is: it appends the value to the tail of the channel.

        qname?msg

receives the message: it retrieves it from the head of the channel and stores it in a variable msg. The channels pass messages in first-in-first-out order. In the above cases only a single value is passed through the channel. If more than one value is to be transferred per message, they are specified in a comma-separated list:

        qname!expr1,expr2,expr3

        qname?var1,var2,var3

It is an error to send or receive either more or fewer parameters per message than was declared for the message channel used.

By convention, the first message field is often used to specify the message type (i.e. a constant). An alternative, and equivalent, notation for the send and receive operations is therefore to specify the message type, followed by a list of message fields enclosed in parentheses. In general:

        qname!expr1(expr2,expr3)

        qname?var1(var2,var3)

The send operation is executable only when the channel addressed is not full. The receive operation, similarly, is only executable when the channel is not empty. Optionally, some of the arguments of the receive operation can be constants:

        qname?cons1,var2,cons2

In this case, a further condition on the executability of the receive operation is that the values of all message fields that are specified as constants match the values of the corresponding fields in the message at the head of the channel. Again, nothing bad will happen if a statement happens to be non-executable. The process trying to execute it will be delayed until the statement, or, more likely, an alternative statement, becomes executable.
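
A small sketch of receiving with a constant in the message-type field is given below. The constants DATA and ACK, the channel link, and both process names are hypothetical and chosen only for illustration.

```promela
/* hypothetical message-type constants */
#define DATA    2
#define ACK     1

chan link = [2] of { byte, byte };

proctype sender()
{
        link!DATA(7);           /* same as link!DATA,7 */
        link!ACK(0)
}

proctype receiver()
{
        byte val;
        link?DATA(val);         /* executable only if the head message has type DATA */
        printf("data: %d\n", val);
        link?ACK(val)           /* executable only if the head message has type ACK */
}

init
{
        run sender();
        run receiver()
}
```

Because the channel is first-in-first-out, the DATA message reaches the head of the channel first, and the receiver stores 7 in val before it waits for the ACK message.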

Here is an example that uses some of the mechanisms introduced so far.

        proctype A(chan q1)
        {       chan q2;
                q1?q2;
                q2!123
        }

        proctype B(chan qforb)
        {       int x;
                qforb?x;
                printf("x = %d\n", x)
        }

        init {
                chan qname = [1] of { chan };
                chan qforb = [1] of { int };
                run A(qname);
                run B(qforb);
                qname!qforb
        }

The value printed will be ?????.

A predefined function len(qname) returns the number of messages currently stored in channel qname. Note that if len is used as a statement, rather than on the right hand side of an assignment, it will be unexecutable if the channel is empty: it returns a zero result, which by definition means that the statement is temporarily unexecutable. Composite conditions such as

        (qname?var == 0)       /* syntax error */

        (a > b && qname!123)   /* syntax error */

are invalid in Promela (note that these conditions can not be evaluated without side-effects). For a receive statement there is an alternative, using square brackets around the clause behind the question mark.

        qname?[ack,var]

is evaluated as a condition. It returns 1 if the corresponding receive statement

        qname?ack,var

is executable, i.e., if there is indeed a message ack at the head of the channel. It returns 0 otherwise. In neither case has the evaluation of a statement such as

        qname?[ack,var]

any side-effects: the receive is evaluated, not executed.

Note carefully that in non-atomic sequences of two statements such as

        (len(qname) < MAX) -> qname!msgtype

        qname?[msgtype] -> qname?msgtype

the second statement is not necessarily executable after the first one has been executed. There may be race conditions if access to the channels is shared between several processes. In the first case another process can send a message to channel qname just after this process determined that the channel was not full. In the second case, the other process can steal away the message just after our process determined its presence.
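
One way to close this window, sketched here under the assumption that the atomic sequences described earlier are acceptable in the model, is to make the test and the channel operation one indivisible step. The values chosen for MAX and msgtype and the process names are hypothetical.

```promela
/* hypothetical capacity and message constant */
#define MAX     4
#define msgtype 1

chan qname = [MAX] of { byte };

proctype producer()
{
        /* test for room and send, with no interleaving in between */
        atomic { (len(qname) < MAX) -> qname!msgtype }
}

proctype consumer()
{
        /* test for the message and receive it, with no interleaving in between */
        atomic { qname?[msgtype] -> qname?msgtype }
}

init
{
        run producer();
        run producer();
        run consumer()
}
```

Only the first statement of each atomic sequence can block here; once the guard has been passed, no other process can change the channel before the second statement executes.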

Rendez-Vous Communication (not covered in lecture)

So far we have talked about asynchronous communication between processes via message channels, declared in statements such as

        chan qname = [N] of { byte }

where N is a positive constant that defines the buffer size. A logical extension is to allow for the declaration

        chan port = [0] of { byte }

to define a rendezvous port that can pass single byte messages. The channel size is zero, that is, the channel port can pass, but can not store messages. Message interactions via such rendezvous ports are by definition synchronous. Consider the following example.

        #define msgtype 33

        chan name = [0] of { byte, byte };

        proctype A()
        {       name!msgtype(124);
                name!msgtype(121)
        }

        proctype B()
        {       byte state;
                name?msgtype(state)
        }

        init
        {       atomic { run A(); run B() }
        }

Channel name is a global rendezvous port. The two processes will synchronously execute their first statement: a handshake on message msgtype and a transfer of the value 124 to local variable state. The second statement in process A will be unexecutable, because there is no matching receive operation in process B.

If the channel name is defined with a non-zero buffer capacity, the behavior is different. If the buffer size is at least 2, the process of type A can complete its execution, before its peer even starts. If the buffer size is 1, the sequence of events is as follows. The process of type A can complete its first send action, but it blocks on the second, because the channel is now filled to capacity. The process of type B can then retrieve the first message and complete. At this point A becomes executable again and completes, leaving its last message as a residual in the channel.

Rendez-vous communication is binary: only two processes, a sender and a receiver, can be synchronized in a rendezvous handshake. We will see an example of a way to exploit this to build a semaphore below. But first, let us introduce a few more control flow structures that may be useful.

Control Flow

Between the lines, we have already introduced three ways of defining control flow: concatenation of statements within a process, parallel execution of processes, and atomic sequences. There are three other control flow constructs in Promela to be discussed. They are case selection, repetition, and unconditional jumps.

Case Selection

The simplest construct is the selection structure. Using the relative values of two variables a and b to choose between two options, for instance, we can write:

        if
        :: (a != b) -> option1
        :: (a == b) -> option2
        fi

The selection structure contains two execution sequences, each preceded by a double colon. Only one sequence from the list will be executed. A sequence can be selected only if its first statement is executable. The first statement is therefore called a guard.

In the above example the guards are mutually exclusive, but they need not be. If more than one guard is executable, one of the corresponding sequences is selected nondeterministically. If all guards are unexecutable the process will block until at least one of them can be selected. There is no restriction on the type of statements that can be used as a guard. The following example, for instance, uses input statements.

        #define a 1
        #define b 2

        chan ch = [1] of { byte };

        proctype A()
        {       ch!a
        }

        proctype B()
        {       ch!b
        }

        proctype C()
        {       if
                :: ch?a
                :: ch?b
                fi
        }

        init
        {       atomic { run A(); run B(); run C() }
        }

The example defines three processes and one channel. The first option in the selection structure of the process of type C is executable if the channel contains a message a, where a is a constant with value 1, defined in a macro definition at the start of the program. The second option is executable if it contains a message b, where, similarly, b is a constant. Which message will be available depends on the unknown relative speeds of the processes.

A process of the following type will either increment or decrement the value of variable count once.

        byte count;

        proctype counter()
        {
                if
                :: count = count + 1
                :: count = count - 1
                fi
        }

Repetition

A logical extension of the selection structure is the repetition structure. We can modify the above program as follows, to obtain a cyclic program that randomly changes the value of the variable up or down.

        byte count;

        proctype counter()
        {
                do
                :: count = count + 1
                :: count = count - 1
                :: (count == 0) -> break
                od
        }

Only one option can be selected for execution at a time. After the option completes, the execution of the structure is repeated. The normal way to terminate the repetition structure is with a break statement. In the example, the loop can be broken when the count reaches zero. Note, however, that it need not terminate since the other two options always remain executable. To force termination when the counter reaches zero, we could modify the program as follows.

        proctype counter()
        {
                do
                :: (count != 0) ->
                        if
                        :: count = count + 1
                        :: count = count - 1
                        fi
                :: (count == 0) -> break
                od
        }

Unconditional Jumps

Another way to break the loop is with an unconditional jump: the infamous goto statement. This is illustrated in the following implementation of Euclid's algorithm for finding the greatest common divisor of two non-zero, positive numbers:

        proctype Euclid(int x, y)
        {
                do
                :: (x >  y) -> x = x - y
                :: (x <  y) -> y = y - x
                :: (x == y) -> goto done
                od;
        done:
                skip
        }

The goto in this example jumps to a label named done. A label can only appear before a statement. Above we want to jump to the end of the program. In this case a dummy statement skip is useful: it is a placeholder that is always executable and has no effect. The goto is also always executable.
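
To observe the result in a simulation run, the process can be started from init. The sketch below is a variant of the process above in which the skip placeholder is replaced by a printf; the arguments 36 and 24 are hypothetical sample inputs.

```promela
proctype Euclid(int x, y)
{
        do
        :: (x >  y) -> x = x - y
        :: (x <  y) -> y = y - x
        :: (x == y) -> goto done
        od;
done:
        printf("gcd = %d\n", x)         /* label placed before a real statement */
}

init
{
        run Euclid(36, 24)
}
```

With these inputs the loop reduces x to 12, then y to 12, and the process prints gcd = 12.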

The following example specifies a filter that receives messages from a channel in and divides them over two channels large and small depending on the values attached. The constant N is defined to be 128 and size is defined to be 16 in the two macro definitions.

        #define N    128
        #define size  16

        chan in    = [size] of { short };
        chan large = [size] of { short };
        chan small = [size] of { short };

        proctype split()
        {       short cargo;
                do
                :: in?cargo ->
                        if
                        :: (cargo >= N) ->
                                large!cargo
                        :: (cargo <  N) ->
                                small!cargo
                        fi
                od
        }

        init
        {       run split()
        }

A process type that merges the two streams back into one, most likely in a different order, and writes it back into the channel in could be specified as follows.

        proctype merge()
        {       short cargo;
                do
                ::      if
                        :: large?cargo
                        :: small?cargo
                        fi;
                        in!cargo
                od
        }

If we now modify the init process as follows, the split and merge processes could busily perform their duties forever.

        init
        {       in!345; in!12; in!6777;
                in!32;  in!0;
                run split();
                run merge()
        }

As a final example, consider the following implementation of a Dijkstra semaphore, using binary rendezvous communication.

        #define p      0
        #define v      1

        chan sema = [0] of { bit };

        proctype dijkstra()
        {       byte count = 1;
                do
                :: (count == 1) ->
                        sema!p; count = 0
                :: (count == 0) ->
                        sema?v; count = 1
                od
        }

        proctype user()
        {       do
                :: sema?p;
                   /*     critical section */
                   sema!v;
                   /* non-critical section */
                od
        }

        init
        {       run dijkstra();
                run user();
                run user();
                run user()
        }

The semaphore guarantees that only one of the user processes can enter its critical section at a time. It does not necessarily prevent the monopolization of the access to the critical section by one of the processes.

Modeling Procedures and Recursion

Procedures can be modeled as processes, even recursive ones. The return value can be passed back to the calling process via a global variable, or via a message. The following program illustrates this.

        proctype fact(int n; chan p)
        {       chan child = [1] of { int };
                int result;
                if
                :: (n <= 1) -> p!1
                :: (n >= 2) ->
                        run fact(n-1, child);
                        child?result;
                        p!n*result
                fi
        }

        init
        {       chan child = [1] of { int };
                int result;
                run fact(7, child);
                child?result;
                printf("result: %d\n", result)
        }

The process fact(n, p) recursively calculates the factorial of n, communicating the result to its parent process via a message on channel p.

Timeouts

We have already discussed two types of statement with a predefined meaning in Promela: skip and break. Another predefined statement is timeout. The timeout models a special condition that allows a process to abort the waiting for a condition that may never become true, e.g. an input from an empty channel. The timeout keyword is a modeling feature in Promela that provides an escape from a hang state. The timeout condition becomes true only when no other statement within the distributed system is executable. Note that we deliberately abstract from absolute timing considerations, which is crucial in verification work, and we do not specify how the timeout should be implemented. A simple example is the following process that will send a reset message to a channel named guard whenever the system comes to a standstill.

        proctype watchdog()
        {
                do
                :: timeout -> guard!reset
                od
        }
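
The watchdog above leaves the declarations of guard and reset to the surrounding model. As a self-contained sketch of the same idea, the process below uses timeout as an escape from a receive on a channel that nobody writes to. The channel name inbox and the process name receiver are hypothetical.

```promela
chan inbox = [1] of { byte };

proctype receiver()
{       byte x;
        if
        :: inbox?x -> printf("got %d\n", x)
        :: timeout -> printf("gave up waiting\n")       /* taken only when nothing else can move */
        fi
}

init
{       run receiver()
}
```

Since no process ever sends to inbox, the first option never becomes executable; once the rest of the system has come to a standstill, the timeout option is the only way forward.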

Assertions

Another important language construct in Promela that needs little explanation is the assert statement. Statements of the form

        assert(any_boolean_condition)

are always executable. If the boolean condition specified holds, the statement has no effect. If, however, the condition does not necessarily hold, the statement will produce an error report during verifications with Spin.
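
As an illustration, an assertion can be used to state the mutual exclusion property of the semaphore model shown earlier. The sketch below repeats that model with two user processes and adds a counter, incrit, that records how many processes are inside the critical section; the counter is a hypothetical addition made only for this check.

```promela
#define p       0
#define v       1

chan sema = [0] of { bit };

/* hypothetical counter: number of processes inside the critical section */
byte incrit = 0;

proctype dijkstra()
{       byte count = 1;
        do
        :: (count == 1) -> sema!p; count = 0
        :: (count == 0) -> sema?v; count = 1
        od
}

proctype user()
{       do
        :: sema?p;
           incrit = incrit + 1;
           assert(incrit == 1);         /* at most one process may be here */
           incrit = incrit - 1;
           sema!v
        od
}

init
{       run dijkstra();
        run user();
        run user()
}
```

If the semaphore ever admitted a second user process before the first had left, the counter would reach 2 and the assertion would be reported as violated.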