State Abstraction
Abstraction is a key concept in programming that can drastically simplify both the implementation and maintenance of code. It's particularly well suited to SPARK and its modular analysis. This section explains what state abstraction is and how you use it in SPARK. We explain how it impacts GNATprove's analysis both in terms of information flow and proof of program properties.
State abstraction allows us to:
express dependencies that wouldn't otherwise be expressible because some data that's read or written isn't visible at the point where a subprogram is declared — examples are dependencies on data, for which we use the
Global
contract, and on flow, for which we use theDepends
contract.reduce the number of variables that need to be considered in flow analysis and proof, a reduction which may be critical in order to scale the analysis to programs with thousands of global variables.
What's an Abstraction?
Abstraction is an important part of programming language design. It provides two views of the same object: an abstract one and a refined one. The abstract one — usually called specification — describes what the object does in a coarse way. A subprogram's specification usually describes how it should be called (e.g., parameter information such as how many and of what types) as well as what it does (e.g., returns a result or modifies one or more of its parameters).
Contract-based programming, as supported in Ada, allows contracts to be added to a subprogram's specification. You use contracts to describe the subprogram's behavior in a more fine-grained manner, but all the details of how the subprogram actually works are left to its refined view, its implementation.
Take a look at the example code shown below.
We've written a specification of the subprogram Increase
to say that it's
called with a single argument, a variable of type Integer
whose
initial value is less than 100. Our contract says that the only effect of
the subprogram is to increase the value of its argument.
Why is Abstraction Useful?
A good abstraction of a subprogram's implementation is one whose specification precisely and completely summarizes what its callers can rely on. In other words, a caller of that subprogram shouldn't rely on any behavior of its implementation if that behavior isn't documented in its specification.
For example, callers of the subprogram Increase
can assume that it
always strictly increases the value of its argument. In the code snippet
shown below, this means the loop must terminate.
Callers can also assume that the implementation of Increase
won't cause
any runtime errors when called in the loop. On the other hand, nothing in
the specification guarantees that the assertion show above is correct: it
may fail if Increase
's implementation is changed.
If you follow this basic principle, abstraction can bring you significant benefits. It simplifies both your program's implementation and verification. It also makes maintenance and code reuse much easier since changes to the implementation of an object shouldn't affect the code using this object. Your goal in using it is that it should be enough to understand the specification of an object in order to use that object, since understanding the specification is usually much simpler than understanding the implementation.
GNATprove relies on the abstraction defined by subprogram contracts and
therefore doesn't prove the assertion after the loop in Client
above.
Abstraction of a Package's State
Subprograms aren't the only objects that benefit from abstraction. The
state of a package — the set of persistent variables defined in it —
can also be hidden from external users. You achieve this form of
abstraction — called state abstraction — by defining variables in
the body or private part of a package so they can only be accessed through
subprogram calls. For example, our Stack
package shown below provides
an abstraction for a Stack
object which can only be modified using the
Pop
and Push
procedures.
package Stack is
procedure Pop (E : out Element);
procedure Push (E : in Element);
end Stack;
package body Stack is
Content : Element_Array (1 .. Max);
Top : Natural;
...
end Stack;
The fact that we implemented it using an array is irrelevant to the caller. We could change that without impacting our callers' code.
Declaring a State Abstraction
Hidden state influences a program's behavior, so SPARK allows that state to
be declared. You can use the Abstract_State
aspect, an abstraction
that names a state, to do this, but you aren't required to use it even for
a package with hidden state. You can use several state abstractions to
declare the hidden state of a single package or you can use it for a
package with no hidden state at all. However, since SPARK doesn't allow
aliasing, different state abstractions must always refer to disjoint sets
of variables. A state abstraction isn't a variable: it doesn't have a type
and can't be used inside expressions, either those in bodies or contracts.
As an example of the use of this aspect, we can optionally define a state
abstraction for the entire hidden state of the Stack
package like this:
package Stack with
Abstract_State => The_Stack
is
...
Alternatively, we can define a state abstraction for each hidden variable:
package Stack with
Abstract_State => (Top_State, Content_State)
is
...
Remember: a state abstraction isn't a variable (it has no type) and can't be used inside expressions. For example:
pragma Assert (Stack.Top_State = ...);
-- compilation error: Top_State is not a variable
Refining an Abstract State
Once you've declared an abstract state in a package, you must refine it
into its constituents using a Refined_State
aspect. You must place
the Refined_State
aspect on the package body even if the package
wouldn't otherwise have required a body. For each state abstraction you've
declared for the package, you list the set of variables represented by that
state abstraction in its refined state.
If you specify an abstract state for a package, it must be complete,
meaning you must have listed every hidden variable as part of some state
abstraction. For example, we must add a Refined_State
aspect on our
Stack
package's body linking the state abstraction (The_Stack
) to
the entire hidden state of the package, which consists of both Content
and Top
.
Representing Private Variables
You can refine state abstractions in the package body, where all the
variables are visible. When only the package's specification is available,
you need a way to specify which state abstraction each private variable
belongs to. You do this by adding the Part_Of
aspect to the
variable's declaration.
Part_Of
annotations are mandatory: if you gave a package an abstract
state annotation, you must link all the hidden variables defined in its
private part to a state abstraction. For example:
Since we chose to define Content
and Top
in Stack
's private
part instead of its body, we had to add a Part_Of
aspect to both of
their declarations, associating them with the state abstraction
The_Stack
, even though it's the only state abstraction. However, we
still need to list them in the Refined_State
aspect in Stack
's
body.
package body Stack with
Refined_State => (The_Stack => (Content, Top))
Additional State
Nested Packages
So far, we've only discussed hidden variables. But variables aren't the
only component of a package's state. If a package P
contains a nested
package, the nested package's state is also part of P
's state. If the
nested package is hidden, its state is part of P
's hidden state and
must be listed in P
's state refinement.
We see this in the example below, where the package Hidden_Nested
's
hidden state is part of P
's hidden state.
Any visible state of Hidden_Nested
would also have been part of P
's
hidden state. However, if P
contains a visible nested package, that
nested package's state isn't part of P
's hidden state. Instead, you
should declare that package's hidden state in a separate state abstraction
on its own declaration, like we did above for Visible_Nested
.
Constants that Depend on Variables
Some constants are also possible components of a state abstraction. These are constants whose value depends either on a variable or a subprogram parameter. They're handled as variables during flow analysis because they participate in the flow of information between variables throughout the program. Therefore, GNATprove considers these constants to be part of a package's state just like it does for variables.
If you've specified a state abstraction for a package, you must list such hidden constants declared in that package in the state abstraction refinement. However, constants that don't depend on variables don't participate in the flow of information and must not appear in a state refinement.
Let's look at this example.
Here, Max
— the maximum number of elements that can be stored in
the stack — is initialized from a variable in an external package.
Because of this, we must include Max
as part of the state abstraction
The_Stack
.
Subprogram Contracts
Global and Depends
Hidden variables can only be accessed through subprogram calls, so you
document how state abstractions are modified during the program's execution
via the contracts of those subprograms. You use Global
and
Depends
contracts to specify which of the state abstractions are
used by a subprogram and how values flow through the different variables.
The Global
and Depends
contracts that you write when
referring to state abstractions are often less precise than contracts
referring to visible variables since the possibly different dependencies of
the hidden variables contained within a state abstraction are collapsed
into a single dependency.
Let's add Global
and Depends
contracts to the Pop
procedure in our stack.
In this example, the Pop
procedure only modifies the value of the
hidden variable Top
, while Content
is unchanged. By using distinct
state abstractions for the two variables, we're able to preserve this
semantic in the contract.
Let's contrast this example with a different representation of
Global
and Depends
contracts, this time using a single
abstract state.
Here, Top_State
and Content_State
are merged into a single state
abstraction, The_Stack
. By doing so, we've hidden the fact that
Content
isn't modified (though we're still showing that Top
may be
modified). This loss in precision is reasonable here, since it's the whole
point of the abstraction. However, you must be careful not to aggregate
unrelated hidden state because this risks their annotations becoming
meaningless.
Even though imprecise contracts that consider state abstractions as a whole
are perfectly reasonable for users of a package, you should write
Global
and Depends
contracts that are as precise as possible
within the package body. To allow this, SPARK introduces the notion of
refined contracts, which are precise contracts specified on the bodies of
subprograms where state refinements are visible. These contracts are the
same as normal Global
and Depends
contracts except they refer
directly to the hidden state of the package.
When a subprogram is called inside the package body, you should write
refined contracts instead of the general ones so that the verification can
be as precise as possible. However, refined Global
and
Depends
are optional: if you don't specify them, GNATprove will
compute them to check the package's implementation.
For our Stack
example, we could add refined contracts as shown below.
Preconditions and Postconditions
We mostly express functional properties of subprograms using preconditions and postconditions. These are standard Boolean expressions, so they can't directly refer to state abstractions. To work around this restriction, we can define functions to query the value of hidden variables. We then use these functions in place of the state abstraction in the contract of other subprograms.
For example, we can query the state of the stack with functions
Is_Empty
and Is_Full
and call these in the contracts of procedures
Pop
and Push
:
Just like we saw for Global
and Depends
contracts, you may
often find it useful to have a more precise view of functional contracts in
the context where the hidden variables are visible. You do this using
expression functions in the same way we did for the functions Is_Empty
and Is_Full
above. As expression function, bodies act as contracts for
GNATprove, so they automatically give a more precise version of the
contracts when their implementation is visible.
You may often need a more constraining contract to verify the package's
implementation but want to be less strict outside the abstraction. You do
this using the Refined_Post
aspect. This aspect, when placed on a
subprogram's body, provides stronger guarantees to internal callers of a
subprogram. If you provide one, the refined postcondition must imply the
subprogram's postcondition. This is checked by GNATprove, which reports a
failing postcondition if the refined postcondition is too weak, even if
it's actually implied by the subprogram's body. SPARK doesn't peform a
similar verification for normal preconditions.
For example, we can refine the postconditions in the bodies of Pop
and
Push
to be more detailed than what we wrote for them in their
specification.
Initialization of Local Variables
As part of flow analysis, GNATprove checks for the proper initialization of variables. Therefore, flow analysis needs to know which variables are initialized during the package's elaboration.
Todo
What is the target audience of this book? If its people who have no Ada experience then they likely have no idea what elaboration is. One option is to add the prerequisite knowledge for this book in the intro, and link to the Introduction to Ada (which also does not discuss elaboration).
You can use the Initializes
aspect to specify the set of visible
variables and state abstractions that are initialized during the
elaboration of a package. An Initializes
aspect can't refer to a
variable that isn't defined in the unit since, in SPARK, a package can only
initialize variables declared immediately within the package.
Initializes
aspects are optional. If you don't supply any, they'll
be derived by GNATprove.
For our Stack
example, we could add an Initializes
aspect.
Flow analysis also checks for dependencies between variables, so it must be
aware of how information flows through the code that performs the
initialization of states. We discussed one use of the Initializes
aspect above. But you also can use it to provide flow information. If the
initial value of a variable or state abstraction is dependent on the value
of another visible variable or state abstraction from another package, you
must list this dependency in the Initializes
contract. You specify
the list of entities on which a variable's initial value depends using an
arrow following that variable's name.
Let's look at this example:
Here we indicated that V2
's initial value depends on the value of
Q.External_Variable
by including that dependency in the
Initializes
aspect of P
. We didn't list any dependency for
V1
because its initial value doesn't depend on any external
variable. We could also have stated that lack of dependency explicitly by
writing V1 => null
.
GNATprove computes dependencies of initial values if you don't supply an
Initializes
aspect. However, if you do provide an
Initializes
aspect for a package, it must be complete: you must list
every initialized state of the package, along with all its external
dependencies.
Code Examples / Pitfalls
This section contains some code examples to illustrate potential pitfalls.
Example #1
Package Communication
defines a hidden local package, Ring_Buffer
,
whose capacity is initialized from an external configuration during
elaboration.
This example isn't correct. Capacity
is declared in the private part
of Communication
. Therefore, we should have linked it to State
by
using the Part_Of
aspect in its declaration.
Example #2
Let's add Part_Of
to the state of hidden local package Ring_Buffer
,
but this time we hide variable Capacity
inside the private part of
Ring_Buffer
.
This program is correct and GNATprove is able to verify it.
Example #3
Package Counting
defines two counters: Black_Counter
and
Red_Counter
. It provides separate initialization procedures for each,
both called from the main procedure.
This program doesn't read any uninitialized data, but GNATprove fails to
verify that. This is because we provided a state abstraction for package
Counting
, so flow analysis computes the effects of subprograms in terms
of this state abstraction and thus considers State
to be an in-out
global consisting of both Black_Counter
and
Red_Counter
. So it issues the message requiring that State
be
initialized after elaboration as well as the warning that no procedure in
package Counting
can initialize its state.
Example #4
Let's remove the abstract state on package Counting
.
This example is correct. Because we didn't provide a state abstraction, GNATprove reasons in terms of variables, instead of states, and proves data initialization without any problem.
Example #5
Let's restore the abstract state to package Counting
, but this time
provide a procedure Reset_All
that calls the initialization procedures
Reset_Black_Counter
and Reset_Red_Counter
.
This example is correct. Flow analysis computes refined versions of
Global
contracts for internal calls and uses these to verify that
Reset_All
indeed properly initializes State
. The
Refined_Global
and Global
annotations are not mandatory and
can be computed by GNATprove.
Example #6
Let's consider yet another version of our abstract stack unit.
This example isn't correct. There's a compilation error in Push
's
postcondition: The_Stack
is a state abstraction, not a variable, and
therefore can't be used in an expression.
Example #7
In this version of our abstract stack unit, a copy of the stack is returned
by function Get_Stack
, which we call in the postcondition of Push
to specify that the stack shouldn't be modified if it's full. We also
assert that after we push an element on the stack, either the stack is
unchanged (if it was already full) or its top element is equal to the
element just pushed.
This program is correct, but GNATprove can't prove the assertion in
Use_Stack
. Indeed, even if Get_Stack
is an expression function, its
body isn't visible outside of Stack
's body, where it's defined.
Example #8
Let's move the definition of Get_Stack
and other expression functions
inside the private part of the spec of Stack
.
This example is correct. GNATprove can verify the assertion in
Use_Stack
because it has visibility to Get_Stack
's body.
Example #9
Package Data
defines three variables, Data_1
, Data_2
and
Data_3
, that are initialized at elaboration (in Data
's package
body) from an external interface that reads the file system.
This example isn't correct. The dependency between Data_1
's, Data_2
's, and
Data_3
's initial values and File_System
must be listed in
Data
's Initializes
aspect.
Example #10
Let's remove the Initializes
contract on package Data
.
This example is correct. Since Data
has no Initializes
aspect,
GNATprove computes the set of variables initialized during its elaboration
as well as their dependencies.