Skip to main content

Flex Protocols

Flex protocols are formal descriptions of the behavior of software components that communicate with each other in a distributed system via synchronous message passing.

Protocols come in two flavors: global and local. Global protocols describe sequences of message exchanges between two or more components. Local protocols describe sequences of actions taken by a single component.

Global ProtocolLocal Protocol
# components involvedtwo or moreone
perspectivethird-personfirst-person
analogous toentire sequence diagramone sequence diagram lifeline

A system composed of a collection of local protocols can be analyzed with a model checker that detects potential deadlock or proves that deadlock between conforming components is impossible (given a perfectly functioning network).

Online Purchase Example Scenario

The details of Flex protocols are discussed in the subsequent sections, but first we model an example interaction using Flex protocols to gain some high-level understanding. Our example involves customer, store, and warehouse components. During this interaction, the customer tries to purchase something by sending an Order to the store. The store checks with the warehouse to see if the item is in stock, and the warehouse responds appropriately. The store then concludes the interaction by sending a Confirmation or Denial back to the customer.

We capture the complete interaction by writing a Flex global protocol, provided below:

module examples.onlinePurchase

struct Order {}
struct InStockRequest {}
struct InStockResponse { isInStock: bit; }
struct Confirmation {}
struct Denial {}

component Customer;
component Store;
component Warehouse;

global protocol OnlinePurchaseProtocol {
  in Store { var r: InStockResponse; }

  exch any Order from Customer to Store;
  exch any InStockRequest from Store to Warehouse;
  exch any InStockResponse into r from Warehouse to Store;
  choice in Store
  | r.isInStock =>
      exch any Confirmation from Store to Customer;
  | !r.isInStock =>
      exch any Denial from Store to Customer;
  end
}

For testing, implementation, or integration tasks, it is often more helpful to have the local protocols which focus on the behavior of each component individually. Since we have a global protocol, we can automatically generate the local protocols via an operation called projection. We can also write local protocols manually.

Here are the local protocols that describe the three components:

local protocol OnlinePurchaseProtocol__Customer in Customer {
  send any Order to Store;
  listen
  | recv _: Confirmation from Store =>
  | recv _: Denial from Store =>
  end
}

local protocol OnlinePurchaseProtocol__Store in Store {
  var r: InStockResponse;

  recv any Order from Customer;
  send any InStockRequest to Warehouse;
  recv r from Warehouse;
  branch
  | r.isInStock =>
      send any Confirmation to Customer;
  | !r.isInStock =>
      send any Denial to Customer;
  end
}

local protocol OnlinePurchaseProtocol__Warehouse in Warehouse {
  recv any InStockRequest from Store;
  send any InStockResponse to Store;
}

We can use the model checker to prove that a system composed of these three local protocols cannot deadlock by defining a system:

system OnlinePurchaseProtocolSystem {
  OnlinePurchaseProtocol__Customer;
  OnlinePurchaseProtocol__Store;
  OnlinePurchaseProtocol__Warehouse;
}

Components

A component declaration declares the existence of a software component, which is any entity that can send or receive messages to and from other components. Examples of things that could be components are Unix processes, Java threads, programs that communicate using a messaging system like ZeroMQ or Kafka, or a physical computer with electrical input/output wires.

Syntactically, components are top-level Flex declarations. They are referred to by name in the bodies of global and local protocols. The form is this:

component NAME;

where,

  • NAME – a Flex identifier that is the name of the component

Connections

A connection represents some mechanism via which a pair of components can exchange messages. Examples include raw TCP, ZeroMQ, and Kafka.

Connections can be either named or unnamed. A named connection is declared as a top-level Flex declaration with the following syntax:

connection NAME from SENDER to RECEIVER;

where,

  • NAME – a Flex identifier used to refer to the connection
  • SENDER – the component that sends messages
  • RECEIVER – the component that receives messages

All connections are unidirectional, meaning one of the components always sends and the other always receives. Bidirectional communication must be modeled using a pair of connections.

An unnamed connection does not need a declaration and is referred to by its sender-receiver component pair. A single unnamed connection implicitly exists for all ordered pairs of components.

Local Protocols

A local protocol describes sequences of actions that a single software component should or will perform. It is analogous to a C++ function or the arrows to and from a single lifeline in a sequence diagram. A local protocol has the following form:

local protocol NAME in COMPONENT {
  LOCAL_STATEMENTS
}

where:

  • NAME - the name of the local protocol
  • COMPONENT - the name of the component that executes this protocol
  • LOCAL_STATEMENTS - zero or more local protocol statements

The following sections describe the local protocol statements that can appear in the protocol body.

Variable Initialization and Mutation

A mutable or immutable variable can be initialized using a var or let statement respectively. Mutable variables can then later be mutated using set statements:

var NAME: TYPE = VALUE;
let NAME: TYPE = VALUE;
set NAME = VALUE;

where:

  • NAME – a Flex identifier that becomes the name of the new variable
  • TYPE – a Flex type indicating the type of the variable to initialize
  • VALUE – a Flex expression whose value is used to initialize the variable

send Statements and Connection Clauses

A send statement represents the action of sending a message to a different component on a connection:

send SEND_WHAT CONNECTION_CLAUSE;

where:

  • SEND_WHAT – indicates what message is allowed to be sent
  • CONNECTION_CLAUSE – specifies the connection on which the message should be exchanged

The SEND_WHAT part can be one of four forms (where brackets indicate optional syntax elements):

EXP
any TYPE
any IDENT: TYPE where PREDICATE
let IDENT: TYPE [where PREDICATE]

In order, these mean:

  1. The sent message must be exactly the value that EXP evaluates to.
  2. The sent message can be any value of TYPE.
  3. The sent message can be any value of TYPE that satisfies the boolean-valued expression PREDICATE. IDENT can be used in PREDICATE to refer to the sent message.
  4. The sent message can be any value of TYPE that satisfies the boolean-valued expression PREDICATE. IDENT can be used in PREDICATE and in successive protocol statements in the enclosing scope. Omitting where PREDICATE is equivalent to including where true.

The connection clause specifies the connection on which the message is sent. It is used in other statement forms as well but is introduced and described in detail here. The syntax is thus (where brackets indicate optional syntax parts):

[from SENDER] [to RECEIVER] [on CONNECTION]

where:

  • SENDER – the name of the sending component
  • RECEIVER – the name of the receiving component
  • CONNECTION – the name of the connection over which to exchange the message

Even though the connection clause has three optional parts, at least one of the parts must be present to be valid.

If the on part of the connection clause is present, then the clause specifies a named connection, otherwise it specifies an unnamed connection.

Connection clauses are allowed, but not required, to redundantly specify the sender and receiver. In a send statement, the sender is always the component named after the in keyword in the protocol header, meaning the from SENDER part can be omitted:

local protocol example in Comp1 {           // Protocol runs in Comp1
  send any Foo from Comp1 to Comp2;         // Ok
  send any Foo to Comp2;                    // Ok: Sender is Comp1
  send any Foo from Comp3 to Comp2;         // Error: Comp3 is not Comp1
}

recv Statements

A receive statement represents the action of listening for a message from a different component:

recv RECV_WHAT CONNECTION_CLAUSE;

where:

  • RECV_WHAT – specifies which values should be received. This can be an underscore if the received message does not need to be referenced later.
  • CONNECTION_CLAUSE – specifies the connection on which the message should be exchanged

The RECV_WHAT part can be one of these forms (where brackets indicate optional syntax elements):

VARIABLE [assuming ASSUMPTION]
_ [: TYPE]
any IDENT: TYPE assuming ASSUMPTION
let IDENT: TYPE [assuming ASSUMPTION]

In order these mean:

  1. VARIABLE is an existing variable that should be set to the received value. The received value is expected to satisfy boolean-valued expression ASSUMPTION.
  2. Any message (of type TYPE) can be received and its value is immediately discarded.
  3. Any message of type TYPE satisfying ASSUMPTION can be received. IDENT is an identifier that is in scope within ASSUMPTION, but is not in scope anywhere else.
  4. Any message of type TYPE satisfying ASSUMPTION can be received, and the value is bound to IDENT to be used in further statements. Omitting assuming ASSUMPTION is equivalent to including assuming true.

Receive statements are blocking statements, which means the component should not advance beyond the receive statement until the sender performs the send.

branch Blocks

There are two distinct syntactic constructs that specify a branching of control flow: branch and listen. The branch block is discussed here and listen is discussed in the next section. The syntax of branch is as follows:

branch
| GUARD1 =>
    CONT1
...
| GUARDn =>
    CONTn
end

where:

  • GUARDi – a Flex boolean expression that must evaluate to true if the execution is to continue with CONTi
  • CONTi – Zero or more protocol statements that could be executed if GUARDi is true

A branch describes behavior where the component makes a decision on which branch is taken based on evaluating the guard expressions. If a guard evaluates to true, then its branch is active and the component is allowed to continue with that branch. If a guard evaluates to false, then its branch is inactive and should not be chosen.

Note that if two branches are simultaneously active, then the branch block does not specify which branch should be chosen. In this scenario, an implementation can choose the branch at random or based on some additional criteria that is not included in the protocol specification. A common idiom when writing "quick-and-dirty" specifications is to use the keyword true for all branches which means that implementations have complete freedom over the choice of branch. A branch block in which it is possible for two guards to be simultaneously active is called a nondeterministic branch.

For convenience, the last guard in a branch block is allowed to be the keyword else which is shorthand for the conjunction of the negations of all previous guard expressions.

listen Blocks

A listen block describes behavior where the component uses an incoming message to decide which branch to take. The syntax is as follows:

listen
| RECV_STMT1 =>
    CONT1
...
| RECV_STMTn =>
    CONTn
end

where:

  • RECV_STMTi – a recv statement (without the terminal semicolon)
  • CONTi – zero or more protocol statements to execute if RECV_STMTi is executed

The guards of a listen block are actually recv statements rather than expressions. The branches whose guard can be executed without error by consuming the next incoming message are called active. The component is allowed to choose the branch from among its active guards.

A listen block in which two guards could possibly be active simultaneously is called nondeterministic. While nondeterministic listen blocks are allowed, they are discouraged. Instead, try using a nondeterministic branch block nested inside a deterministic listen block.

Using a listen instead of a branch in receiving components is often necessary for specifying deadlock-free systems as demonstrated by the example below:

// A system that deadlocks if the sender and receiver choose incompatible branches
local protocol sender in S {      local protocol receiver in R {
  branch                            branch
  | true => send any Foo to R;      | true => recv _: Foo from S;
  | true => send any Bar to R;      | true => recv _: Bar from S;
  end                               end
}                                 }

// A system that never deadlocks because the receiver must choose the branch
// compatible with the sender's choice
local protocol sender in S {      local protocol receiver in R {
  branch                            listen
  | true => send any Foo to R;      | recv _: Foo from S =>
  | true => send any Bar to R;      | recv _: Bar from S =>
  end                               end
}                                 }

loop Blocks and break Statements

A loop block groups a sequence of statements that should be repeated until a break statement is executed. The syntax is as follows (where brackets indicate optional syntax elements):

loop [LABEL] [invariant INVARIANT] {
  STATEMENTS
}

where:

  • LABEL – a Flex identifier that labels the loop block
  • INVARIANT – the loop invariant, discussed below; equivalent to true if omitted
  • STATEMENTS – one or more local protocol statements

The form of a break statement is simply break LABEL;. A break statement indicates that control should continue with the statement immediately after the loop labeled LABEL. The break statement must appear somewhere inside the loop labeled LABEL. The LABEL may also be omitted from the break statement, in which case the break statement refers to the inner-most loop.

Loops can also have an invariant, which is a boolean-valued expression that must be true when the loop is first entered and every time the loop repeats. The loop invariant is used by the SMT-based model checker to retain necessary information about variables after an arbitrary number of loop iterations.

do Statements

A do statement indicates that the system should now exhibit the behavior of a different local protocol (a sub-protocol) defined elsewhere. Once the sub-protocol is finished, the system should continue with what is specified after the do statement. It is essentially a function call. There are two forms:

do PROTOCOL;
do tail PROTOCOL;

where:

  • PROTOCOL – the name of the local protocol that should be performed next

do statements cannot call a protocol attached to a different component.

A do tail statement is a do statement that never returns back to the caller. A do tail statement can only appear in tail position, which is any position in the body of a protocol that is guaranteed to be the last statement executed. These additional restrictions allows the model checker to take advantage of tail-call optimization. do tail statements are also allowed to be recursive or mutually recursive while normal do statements are not.

Global Protocols

A global protocol describes sequences of message exchanges that should or will occur between components in a distributed software system. A global protocol is like a formal textual representation of one entire sequence diagram. The syntax is as follows:

global protocol NAME {
  GLOBAL_STATEMENTS
}

where:

  • NAME - the name of the local protocol
  • GLOBAL_STATEMENTS - zero or more global protocol statements

The following sections describe the global protocol statements that can appear in the protocol body.

exch Statements

An exchange statement describes a synchronous message exchange between two components. The syntax is as follows (where brackets indicate optional syntax elements):

exch SEND_WHAT [into RECV_WHAT] CONNECTION_CLAUSE;

where:

  • SEND_WHAT – indicates what values are allowed to be sent
  • RECV_WHAT – indicates what values the receiver is expecting to receive.
  • CONNECTION_CLAUSE – specifies the connection on which the message should be exchanged; see send Statements and Connection Clauses for syntax

The SEND_WHAT and RECV_WHAT are syntactically the same as their counterparts in the local send and recv statements. Normally, the RECV_WHAT is omitted and inferred from the SEND_WHAT, however there are two reasons for including it:

  • If the receiver stores the incomming message in a var- or let-bound variable which must be specified.
  • If the receiver's requirement (assuming clause) for an incomming message is different from the property guaranteed by the sender (where clause), and the user wishes for this difference to be reflected in the local projections of this protocol.

choice Blocks

A choice block indicates that control flow could continue down one of multiple branches. A single component is responsible for choosing the branch.

choice in COMPONENT
| GUARD1 =>
    CONT1
...
| GUARDn =>
    CONTn
end

where:

  • COMPONENT – the component that chooses the branch to continue down
  • GUARDi – a Flex boolean expression that must evaluate to true if the execution is to continue with CONTi
  • CONTi – Zero or more statements that could be executed if GUARDi is true

A choice block is like the global version of a branch block. Like its local counterpart, a choice can be nondeterministic and have an else guard. The main difference is that the choosing component of a choice block must be explicitly provided (after the in keyword); the choosing component of a branch is always the component of the enclosing local protocol. The guard expressions of a choice block are evaluated in the choosing component.

Synchronized Choice

Synchronized choice describes a situation where two or more components all make a choice of branch on their own, but they are all guaranteed to choose the same branch because they previously synchronized some of their data through message passing:

choice in C1, ..., Cm
where X1 = in C1{E11}, ..., in Cm{E1m}
      ...
      Xn = in C1{En1}, ..., in Cm{Enm}
BRANCHES
end

where:

  • C1..Cm – names of the components that make the internal choice
  • X1..Xn – a set of synchronized variables that each component C1..Cm can compute and whose values are guaranteed to be equal across C1..Cm
  • E11..Enm – Flex expressions used to calculate the synchronized variables in each component. Specifically, Eij is the expression to compute Xi in component Cj.
  • BRANCHES – Same as the branches in the normal choice block. The only variables that are allowed to be referenced in the guards are X1..Xn.

The where clause defines a list of synchronized variables X1..Xn. By defining a synchronized variable X, the user is asserting that all of the components C1..Cm are able to perform some computation to arrive at the same value (which becomes the value of X). It is a fatal error if during model checking or runtime the components arrive at different values of X.

Unlike normal choice, synchronized choice blocks cannot be nondeterministic as this would always result in a system with potential deadlock.

loop Blocks and break Statements

The global loop blocks and break statements have the same syntax and similar semantics to their local counterparts.

parallel Blocks

A parallel block specifies that the events of two or more sequences of statements should be interleaved. The effect is the same as if multiple "execution threads" were running, but it really correlates to the more general idea of concurrency.

parallel {
  STATEMENTS1
} with {
  ...
} with {
  STATEMENTSn
}

where:

  • STATEMENTSi – zero or more global statements comprising a "thread"

in Blocks

Local protocol statements can be used as global statements by placing them inside an in block.

in COMPONENT {
  LOCAL_STATEMENTS
}

where:

  • COMPONENT – is the component in which the local statements are executed
  • LOCAL_STATEMENTS – zero or more local protocol statements

do Statements

The global do statement has the same syntax as local do statements except that it can only call global protocols instead of local protocols.

system Declarations

A system declaration groups together local protocols to pass to the model checker.

system NAME { LOCAL_PROTOCOLS }

where,

  • NAME – a Flex identifier that is the name of the system
  • LOCAL_PROTOCOLS – zero or more names of local protocols belonging to this system; A semicolon should follow each name

For every component, there should be at most one attached local protocol in any particular system. This local protocol (if present) is considered by the model checker to be the "top-level" protocol that runs on the component (i.e., the protocol that runs when the component is "plugged in").

Statement Annotations

A statement annotation is attached to an exch, send, or recv statement.

Annotations are placed at the end of the statement immediately before the semicolon. Multiple annotations are separated by whitespace. For example:

exch any Order from Customer to Store
    @topic("shopping")
    @recv_within_milliseconds(1000);

Generally, annotations should only be used to express hardcoded information about transport mechanisms that varies from message to message. Examples include topic names and URLs. Annotations are not meant for specifying the "content" or "payload" of messages.

All Flex tools except the "test generation" and "compile" tools will pretty much completely ignore the meaning of these annotations.

State Annotations

A state annotation attaches metadata to the state of a protocol between the execution of two protocol statements. Here's an example:

local protocol sendOneOrThreeMessages {
    send any Msg to OtherComponent;

    @name("before sending")
    @end_state

    send any Msg to OtherComponent;
    send any Msg to OtherComponent;
}

The model checker recognizes two state annotations that modify its behavior:

  • The @name annotation simply provides a human readable name to the state which is then displayed in deadlocking traces or control flow graphs.

  • The @end_state annotation marks the state as an end state, which means the model checker does not consider it a deadlock if a protocol gets "stuck" in that state. The very end of a protocol body is always implicitly marked as an end state; all other end states must be explicitly marked with the annotation. It is almost always better to use branching and looping than to use this annotation, but it can be useful for debugging.