<meta charset="utf-8" emacsmode="-*- markdown -*-">

**Improving Counterexample Quality from Failed Program Verification**

Li Huang
Bertrand Meyer
Manuel Oriol

Abstract

In software verification, a successful automated program proof is the ultimate triumph. The road to such success is however, paved with many failed proof attempts. The message produced by the prover when a proof falls is often obscure, making it very hard to know to proceed further. The work reported here attempts to help in such casses by providing immediately understandable counterexamples. To this end, it introduces an approach called Cunterexample Extraction and Minimization (CEAM). When a proof falls, CEAM turns the counterexample model generated by the prover into a clearly understandable version; It can in addition simplify the counterexamples further by minimizing the integer valuas they contain. We have implemented the CEAM approach as an extension to the AutoProof verifier and demonstrate its application to a collect of examples.

Index Terms -- Program Verification, Counterexample, Autoproof, Boogle, SMT

Introduction

Deducative program verification performs a rigouros analysis of the correctness of programs with respect to their functional behaviour, usually specified formally by contracts (such as pre- and postconditions, can class and loop invariants). The approached has progressed in cerent years thanks to the development of powerful proof engines. In practice, however, verifying industrial applications remains difficult. One of the obstacles is the lack of intuitive feedback to understand the reasons why a verification attempt failed. Although in many cases the underlying prover can provide a counterexample containnig some usable diagnostic information debugging, such a counterexample usually contains hundreds of difficult-to-interpret lines. Another obstacle to usability is that integer values generated by the prover for the counterexamples are often very large, and hence do not provide programmers with an easy intuitive understanding of what is wrong.

This article presents the Counterexample Extraction and Minimization (CEAM) approach for improving the quality of counterexamples produced when a proof fails, and making them usable for identifying and correcting the undelying bugs. We have implemented CEAM as an extension of the AutoProof environment [1], [23], a static verification platform for contract-equipped Eiffel [15] programs based on Hoare-style proofs. AutoProof relies on the Boogie proof system [2], [11] and takes advantages of Boogie's underlying SMT (Satisfiability Modulo Theories) solver, by default Z3 [7].

When a proof fails, CeAM exploits the counterexample models (hereinafter referred to simply as models) generated by the SMT solver and generates simple counterexamples in a format more intuitive to programmers. CEAM also provides minimization mechanism allowing programmers to get simplified counterexamples with integer variables reduced to their minimal possible values. The current version of CEAM supports primitive types (integer, boolean), user-defined types (classes) as well as some commonly used container types (such as arrays and sequences).

Section II illustrates an example of using the CEAM approach. Section III introduces the technologies used in our verification framework. Section IV describes the details of the implementation of the CEAM. We evaluate the applicability of the CEAM through a series of examples in Section V. After a review of related work in Section VI, Section VII concludes the paper with our ongoing work.

II. An example session

Before explaring the principles and technologies of the CEAM approach, we look at this practila use on a representative example (Fig. 1). The intent of the max function in class MAX is to return into Result the maximum element of an integer array a of size a.count. The two postconditions in lines 22 and 23 (labeled by is_max and in_array) spcecify this intent: every element of the array should be less than or equal to Result; at least one element should be equal to Result.

When we try to verifi the max function using AutoProof, verification fails and AutoProof returns an error message "Postcondition /is_max/ may be violated" (the first row in Fig. 2). Such a generic message tells us that the prover cannot establish the postcondition, but does not enable us to find out why. In this case, programmers can look at the model generated by the Z3 solver to understand the cause of the failure. Deciphering the model is a cumbersome task: the model spans hundreds of lines and is expressed in a cryptic formalism.

In contrast, AutoProof extended with CEAM automatically generates a much simpler counterexample from the model. As displayed in the second row of Fig. 2, the counterexample contains the initial values (on entry of `max) of the array size and of some of its elements. Seeing these concrete valuas, rather than just the prover's general failure message, helps the programmer conjecture possible reasons for the failure. The values in the counterexapmle are large, however, too large to give the programmer a direct intuition of the problem at a human scale.

    class MAX feature
    max (a: ARRAY[INTEGER]): INTEGER
        require a.count > 0
        local i:INTEGER
        do
            from
                Result := a[1]; i := 2
            invariant
                    2 <= i and i <= a.count+1
                    ∀ j: 1 |..| (i-1) | a.sequence [j] <=Result
                    ∃ j: i |..| (i-1) | a.sequence [j] = Result
            until
                i >= a.count
            loop
                if a [1] > Result then
                    Result := a[i]
                end
                i := i + 1
            variant a.count -i
        end
        ensure
            is_max: ∀ j:1|..|a.count |a.sequence[j] <= Result
            in_array: ∃ j:1|..|a.count| a.sequence[j] = Result
        end
    end

Fig 1. MAX is a class that finds the maximum element of an integer array; a fault (the exit condition at line 13 is incorrect) is injected to the code for presentation purposes.

To provide a more intuitive illustration, CEAM allows the programmer to query AutoProof further to obtain a minimal counterexample in the last Row of Fig 2, where integer variables have been reduced to their minimal possible values.

INSERTAR FIG2 AQUI

Fig. 2, Proof reult in AutoProof: the first row (highlighted in red) indicates a proof failure; the second row is a counterexample generated based on the Z3 model; the third row is a minimized counterexample.

This minimzed counterexample provides a simple diagnostic trace of max: on loop initialization at line 7, Result = 0 and i = 2; at line 13, the exit condition of th loop evaluates to True with a.count =2 and i=2, which forces the loop to terminate. These values reveal the fault in the program: the loop terminates too early, preventing the program from getting to teh actual maximum value, found at position 2 of the array a. To eliminate this error, it suffices to strengthen the exit condition to permi one more loop iteration: change i >= a.count to i > a.count.

III Technology stack

This section introduces technologies used in the present work, including language and prover.

Eiffel [15], [16] is an object-oriented programming language which natively supports Desing-by-Contract [14]. An Eiffel program consists of a set of /classes/. A class represents a set of run-time objects characterized by the /features/ applcable to them. Fig. 3 shows a simple class representing bank accounts. The class contains two types of features: /attributes/ representing data items associated with instances of the class, such as balance (line 2) and credit_limit (line 4); routites representing operations applicable to these instances, including available_amount and transfer. Routines are further divided into /procedures/ (with no returned value) and functions (returning a value). Here, available_amount is a function returning an integer (represented by the special variable Result), and transfer is a procedure.

class ACCOUNT feature
balance: INTEGER
credit_limit: INTEGER
available_amount: INTEGER
    do
        Result := balance - credit_limit
    end
transfer (amount: INTEGER; other: ACCOUNT)
    require
        amount >= 0
        amount <= avaliable_amount
    do
        balance := balance - amount
        other.balance := other.balance + amount
    ensure
        withdrawal: balance = old balance - amount
        depost: other.balance = old other.balance + amount
    end
end

Fig. 3. A class ipmlementing the behaviour of bank accounts

Programmers can specify the properties of Eiffel classes by equipping them with contracts of the following types:

A precondition (require) must be satisfied at the the time of any call to the routine; the precondition of transfer (lines 13 - 15), for example, requires the value of amount to be non-negative an no greater than available_amount
A postcondition (ensure) must be guranteed on routines exit; for instance, a postcondition of transfer at line 20 states that, at the end of the execution of transfer the value of balance must have been decresed by amount.
A loop invariant (invariant) characterizes the semantics of a loop in the form property satisfied after initialization and preserved by every iteration, as illustrated by the invariant of max (lines 9 -11 in Fig 1)

Section

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.

\begin{equation} \iint xy^2,dx,dy = \frac{1}{6}x^2y^3 \end{equation}

Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.

\begin{equation} u=\frac{-y}{x^2+y^2},,\quad v=\frac{x}{x^2+y^2},,\quad\text{and}\quad w=0,. \end{equation}

Subsection

Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

\begin{equation} \begin{bmatrix} 1 & x & 0 \ 0 & 1 & -1 \end{bmatrix}\begin{bmatrix} 1 \ y \ 1 \end{bmatrix} =\begin{bmatrix} 1+xy \ y-1 \end{bmatrix}. \end{equation}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

improving-counterexample-quality-from-failed-program-verification.html

improving-counterexample-quality-from-failed-program-verification.html

Abstract

Introduction

II. An example session

III Technology stack

Section

Subsection

Files

improving-counterexample-quality-from-failed-program-verification.html

Latest commit

History

improving-counterexample-quality-from-failed-program-verification.html

File metadata and controls

Abstract

Introduction

II. An example session

III Technology stack

Section

Subsection