Skip to main content
PRL Project

CRASH - Clean-slate Resilient Adaptive Secure Hosts

Correct-by-construction Attack-tolerant Systems:

a project in the DARPA Clean-slate Resilient Adaptive Secure Hosts (CRASH) Program


Project goals

Correct by construction systems

We build correct-by-construction concurrent systems and maintain them. This means that system code and formal proofs are developed together and modified together. The proofs show that the code meets design specifications. Design specifications ensure that correctly built systems are attack tolerant (see Stockholm paper (PDF)), as we explain below. The general idea is that the system adapts to attack by changing process code, perhaps also limiting services temporarily. We have built system components this way in collaboration with colleagues in distributed systems and operating systems. Components of concurrent systems interact in complex ways that can be understood only by careful reasoning; even one page of protocol code can be highly error prone. Our expertise in building these systems and expressing their logical properties in a language that makes sense to systems builders who are not experts in formal methods is relevant to the goals of the DARPA Clean-slate Resilient Adaptive Secure Hosts (CRASH) program (see Kickoff talk (PDF) and Poster Slides (PDF)).

We use an interactive proof assistant whose base logical language is Computational Type Theory (CTT). We invoke a program extractor to synthesize provably correct code from the assistant's formal constructive proofs and a computation system to execute the code, both concretely and symbolically. The code and specifications are stored in a formal database which we call the Library. The proof assistant, program extractor, computation system, and Library are components of a Logical Programming Environment (see LPE) in which we create, demonstrate, and maintain systems. Maintenance involves re-proving properties and invariants when code or specifications are changed. Our LPE provides unique, fine-grained, concurrent replay mechanisms that help automate the maintenance task. The LPE is itself a distributed system that executes concurrent processes - key to allocating more computing power during computationally intensive proofs and to experimenting with the systems we build. In our work on CRASH we will export our computation system by writing it in other programming languages such as Java,F # , Erlang, Python, etc.

Logic of Events

We use a formal event logic to capture system behavior over time at a high level of abstraction, using event classes. We developed this intuitive logic based on our long running research interactions with system builders and colleagues who are leading researchers in systems. Event logic is integrated into the base logical language of our Logical Programming Environment.

Logic of Space

We organize systems into uniquely named components built from con- current processes. These components determine the logical structure of system space just as events determine the logical structure of time. Logical structures we impose on system space determine safe ways of sharing memory. Moreover, we have discovered methods for assembling components that preserve formal properties established for its processes. The key is guaranteeing interference free composition. Demonstration systems built over the course of CRASH, will most likely have a structure we can define in our logic of space. For example we can naturally model a tagged architecture.

Adaptive attack tolerance

We have explored in depth a technical approach to attack- tolerance that relies on the formal synthesis of process code variants all of which satisfy the required logical properties. This diversity of code at a high level of abstraction can be multiplied into numerous diverse specific instances of code running in a system. If the host system detects that a process has been compromised, it can replace the compromised modules by a provably equivalent variant or one known to satisfy a modified specification. This capability to create a "moving target" is clearly relevant to the CRASH program goals, and our experience in providing attack-tolerance for distributed systems is relevant in operating systems, especially those built over multiple core hardware, for virtual machine monitors, and for microkernels, where it is plausible to add diversity.

Correct theory/system modification

Another capability we bring to all of our formal work is the ability to correctly and semi-automatically modify entire formal theories stored in the LPE library. System components are embodied in theories, and we think of theories as systems at the logical level. Thus our theory modification mechanism, a tool for maintaining correct systems, is a vital component of our LPE. Our ability to rapidly modify large formal theories is also relevant to experiments with design alternatives. We have a great deal of experience semi-automatically modifying formal theories. The cost savings of our methods could be considerable.

There are already over seven thousand definitions and theorems, and nearly 17K inferences, and many realizers in the database for the logic of events, and they provide a considerable resource for this DARPA CRASH effort. That growing database will support the timely synthesis of new correct by construction process code, and allow us to track the dependence of code properties on formal assumptions about the system. Ultimately we will be able to correctly modify an entire system semi-automatically.

Goals

We will provide formal support to design, build and maintain correct-by-construction attack-tolerant systems, synthesizing correct process code from proofs of specifications. We will formulate conditions under which attack-tolerance properties are guaranteed and provide the means to determine consequences of attacks that alter these conditions. We will work with a design team(s) to explore additional, and perhaps highly novel, ways to use formal methods and proof technology to advance their goals and to explore software immunity.

Role of formal methods in systems research

Intellectual scope

The intellectual scope and technical power of formal methods have steadily increased over the past forty years. Computer science has bestowed high honors, e.g. the Turing Award, etc. on several of its leaders such as McCarthy, Hoare, Scott, Milner, Pnueli, and Clarke to name those who have directly influenced our work. Moreover, one of the great intellectual accomplishment of computer science is the demonstration that computers can assist us in complex reasoning tasks beyond the reach of the unaided human mind, illustrated by Georges Gonthier's definitive formal proof of the Four Color Theorem with the Coq proof assistant. We are confident that among the other accomplishments of formal methods will be the creation of a software immune system for computers, one that can survive cyber attacks and learn from them.

Humans are distinguished not only by their brain power, but also by the "intelligence of their immune systems" [44]. We believe that just as computer-assisted reasoning has allowed humans to solve a new class of intellectual problems, computer assisted reasoning will help the computers acquire a smart immune system. We increasingly see formal methods used to bootstrap formal tools and systems-for example, interactive proof assistants and their meta-languages such as ML (which Milner named because it was the Meta-Language of LCF [24]). The extension of this autocatalytic process to clean slate systems is natural. A powerful proof technology has already arisen along with a methodology for applying it to important tasks such as reliable compiler construction, verified distributed protocols, fully automated verification of authentication protocols, automatic optimization of protocol stacks, smart card domain separation, and verified OS kernels. It is clear to many formal methods researchers that our methods are essential to achieving the goals of the CRASH project, but not sufficient. CRASH requires excellent system designers, ideally those who also appreciate the many ways that precise formal analysis can be applied to achieving their goals. [In mathematics, the application of formal methods to the Four Color Problem was possible because Appel and Haken understood the value of computers in mathematics research]

An inspiring example of cooperation between operating systems and formal methods research is verification of the seL4 microkernel [30]. The authors say :"We have created a methodology for rapid kernel design and implementation that is the fusion of traditional operating systems and formal methods techniques." We would say the same thing about our previous work in distributed systems [38, 8, 7]. We stress that we can develop the code and the proofs together, that we can semi-automatically modify vast amounts of formal material, which is critical to tracking design changes. The cost to seL4 of some changes was on the order of a programmer year (py).

Important principles

There are a few principles that guide successful formal methods in systems research. First, it is critical to formalize at high levels of relevant abstraction--e.g. event classes not state machines, a Haskell OS model not C code. This is required because appropriate abstract concepts are both easier for engineers to understand and much easier to formalize, and they tend to have broad application. On the other hand, abstractions should be grounded in an implementation path. Steps toward implementation are taken by refinement, moving closer to executable code along a feasible path-e.g., the path from the pi-calculus to code is usually too long, while from event classes or a functional model it is feasible.

Declarative specifications and properties complement operational specifications and properties and provide an independent way of thinking about a task. Assertions can be organized into theories and connected to mathematical theories. Declarative abstractions provide an independent source of explanation and documentation of the code; there are many powerful software tools now available to use formal assertions (type checkers, model checkers, SMT solvers, SAT solvers, fully automatic provers, decision procedures, and so forth).

Powerful heuristic methods should be built on top of the formal machinery, such as tactics allowing the programmer/prover to explore plausible methods for finding proofs; tactics are safe because they produce formal proofs when they succeed. Moreover, in the safe proof assistants, say the LCF-based proof assistants such as HOL, Coq, Nuprl and MetaPRL, formal proofs are reduced to primitive proofs which can be independently checked by short programs (proof checkers) which can be written in several programming languages and executed on different platforms to provide a statistical complement to the semantic justifications for the primitive proofs. (By the way, nearly every major interactive proof assistant in use today is based on this LCF safe-tactic model due to Robin Milner.)

Correct by construction processes

This project is based on our experience in building collections of adaptive protocols for distributed systems that are correct by construction and which we have been able to modify and maintain using a rich Logical Programming Environment whose features we discussed above. We express system designs and properties using a (constructive) Logic of Events [8, 6]; safety and liveness properties for processes are naturally expressed in this very rich logic, which is integrated into the base logic of our LPE. This base logic is Computational Type Theory [3], one of the most expressive used by any proof assistant. Tools in the LPE synthesize executable code from constructive proofs that the specifications are realizable [8]-i.e., from a proof that a specification can be implemented we automatically extract an implementation of it.

We have used this proofs-as-processes method to build fault-tolerant protocols, provably secure protocols, adaptive protocols [38], automatically verified authentication protocols, and a semantics for access control logics and methods for their applications in operating systems. This basic methodology has led us to experiment with processes that we define as attack- tolerant in the sense that they can respond to perceived threats from the environment such as denial of service attacks, message order attacks, Byzantine attacks, and other threats [15]. These attack-tolerant processes will respond by adapting on-the-fly to alternative versions that are also known to be provably correct and provide the same functionality-or, if the environment does not permit that, provide a well-defined safe mode. Many of our methods and results apply to certain aspects of operating systems, especially our work on logical diversity, adaptive systems, access control, and authentication.

This system development capability, whether distributed or single host, relies on a formal process model that provides a computational semantics for assertions in our Logic of Events. The concepts of event structures [45, 2] and event classes [6] are defined over execution histories in a formal model of communicating processes. This semantics is expressed in such a way that proofs of assertions can contain distributed realizers. These realizers are abstract processes which can be compiled into appropriate programming languages such as C, Java, ML, Erlang, F # , etc. A key step in making this methodology practical is the use of programmable event classes to specify computing tasks at a high level of abstraction that can be translated automatically to process code in various standard programming languages. Recently we have extended our methodology to a still more abstract notion of process, a General Process Model [9] which we compile to code executable in the computation system of the LPE using a verified compiler -another service provided by the LPE. Thus our LPE is a prototype system development environment.

A practical result of this methodology is that compiled process code is automatically generated from constructive proofs that high-level event-based specifications are achievable. The seL4 designers used a Haskell model as an intermediate representation of the microkernel; our LPE permits the generation of such a model that is correct by construction.

Synthetic diversity

Attack Tolerance

We assume that our systems will be attacked. Even seL4 can be attacked on its assumptions, those offer several points of vulnerability, and there will always be assumptions about the environment that can be attacked. We will protect components by formally generating a large number of logically equivalent variants, stored in an attack response library. Each variant uses distinctly different code which a system under attack can install on-the-fly to replace compromised components. Each variant provably satisfies an appropriate specification-they are correct by construction. We have used this synthetic diversity to make protocols adaptive, and we can use them to make microkernels like seL4 adaptive as well.

We express threatening features of the environment formally and discriminate among the different types. We can do this in our new General Process Model because it can cleanly model the threat environment as well as components of modern operating systems. This capability allows us to study in depth how diversity works to defend systems. We can implement attack scenarios as part of our experimental platform.

Code Diversity

We introduce diversity at all levels of the formal code development (synthesis) process starting at a very high level of abstraction. Our latest experience is with distributed protocols, where we have had good results. The methods will carry over to processes in system components based on our experience with the Nexus system and NAL and with devices such as in line reference monitors and trusted platform modules. We mention first examples of distributed consensus protocols, our most recent work.

We can also inject code diversity into the fully automatic verification of authentication protocols in Protocol Composition Logic (PCL) [18] implemented in our system. These synthetic diversity techniques generate a large set of components, each of which is associated with a "genotype" that describes the parameters (such as choice of algorithm, choice of data structure, choice of implementation language) used to generate components of its "species".

References

[1] Martin Abadi. Variations in access control logic. Deonitic Logic in Computer Science, pages 96-109, 2000.

[2] Uri Abraham. Models for Concurrency, volume 11 of Algebra, Logic and Applications Series. Gordon and Breach, 1999.

[3] Stuart Allen, Mark Bickford, Robert Constable, Richard Eaton, Christoph Kreitz, Lori Lorigo, and Evan Moran. Innovations in computational type theory using Nuprl. Journal of Applied Logic, 4(4):428-469, 2006.

[4] H. Attiya and J. Welch. Distributed Computing: Fundamentals, Simulations, and Advanced Topics. Wiley Interscience, New York, 2nd edition, 2004.

[5] Moritz Y. Becker, Cdric Fournet, and Andrew D. Gordon. Secpal: Design and semantics of a decentralized authorization language. Technical report, In Proceedings of the 20th IEEE Computer Security Foundations Symposium (CSF, 2006.

[6] Mark Bickford. Component specification using event classes. In CBSE '09: Proceedings of the 12th International Symposium on Component-Based Software Engineering, Lecture Notes in Computer Science, volume 5582, pages 140-155, 2009.

[7] Mark Bickford. Automated proof of authentication protocols in a logic of events. In VERIFY'10: Proceedings of the 6th International Verification Workshop, to appear.

[8] Mark Bickford and Robert Constable. Formal foundations of computer security. In Formal Logical Methods for System Security and Correctness, volume 14, pages 29-52, 2008.

[9] Mark Bickford, Robert Constable, and David Guaspari. Generating event logics with higher-order processes as realizers. Technical Report CS Technical Report, Cornell University, 2010.

[10] Ken Birman, Robert Constable, Mark Hayden, Jason Hickey, Christoph Kreitz, Robbert van Renesse, Ohad Rodeh, and Werner Vogels. The Horus and Ensemble projects: Accomplishments and limitations. In DARPA Information Survivability Conference and Exposition (DISCEX 2000), pages 149-161, Hilton Head, SC, 2000. IEEE Computer Society Press.

[11] Kenneth P. Birman. A review of experience with reliable multicast. Software Practice and Experience, 29(9):741-774, 1999.

[12] Kenneth P. Birman and Thomas A. Joseph. Exploiting virtual synchrony in distributed systems. Proc 11th Symposium on Operating Systems Principles (SOSP), pages 123-138, November 1987.

[13] Tushar Chandra, Robert Griesemer, and Joshua Redstone. Paxos made live: an engi- neering perspective. In In Proc. of PODC, pages 398-407. ACM Press, 2007.

[14] W. Rance Cleaveland, editor. 5th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, volume 1579 of Lecture Notes in Computer Science. Springer, 1999.

[15] Robert Constable, Mark Bickford, and Robbert van Renesse. Correct-by-construction attack-tolerant systems. Technical Report CS Technical Report, Cornell University, 2010.

[16] Robert L. Constable. Effectively nonblocking consensus procedures can execute forever - a constructive version of flp. Technical Report Tech Report 11512, Cornell University, 2008.

[17] Robert L. Constable, Stuart F. Allen, H. M. Bromley, W. R. Cleaveland, J. F. Cre- mer, R. W. Harper, Douglas J. Howe, T. B. Knoblock, N. P. Mendler, P. Panangaden, James T. Sasaki, and Scott F. Smith. Implementing Mathematics with the Nuprl Proof Development System. Prentice-Hall, NJ, 1986.

[18] Anupam Datta, Ante Derek, John C. Mitchell, and Arnab Roy. Protocol composition logic (pcl). Electron. Notes Theor. Comput. Sci., 172:311-358, 2007.

[19] Roberto De Prisco, Butler Lampson, and Nancy Lynch. Revisiting the paxos algorithm. Theor. Comput. Sci., 243(1-2):35-91, 2000.

[20] Dhammika Elkaduwe, Gerwin Klein, and Kevin Elphinstone. Verified protection model of the sel4 microkernel. In VSTTE '08: Proceedings of the 2nd international conference on Verified Software: Theories, Tools, Experiments, pages 99-114, Berlin, Heidelberg, 2008. Springer-Verlag.

[21] Amy P. Felty and Douglas J. Howe. Hybrid interactive theorem proving using Nuprl and HOL. In William McCune, editor, Proceedings of the 14th International Conference on Automated Deduction, volume 1249 of Lecture Notes in Artificial Intelligence, pages 351-365. Springer, July 13-17 1997.

[22] Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson. Impossibility of dis- tributed consensus with one faulty process. J. ACM, 32(2):374-382, 1985.

[23] Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. The google file system. SIGOPS Oper. Syst. Rev., 37(5):29-43, 2003.

[24] Michael Gordon, Robin Milner, and Christopher Wadsworth. Edinburgh LCF: a mechanized logic of computation, volume 78 of Lecture Notes in Computer Science. Springer-Verlag, NY, 1979.

[25] Yuri Gurevich and Itay Neeman. Dkal: Distributed-knowledge authorization language. In CSF '08: Proceedings of the 2008 21st IEEE Computer Security Foundations Symposium, pages 149-162, Washington, DC, USA, 2008. IEEE Computer Society.

[26] Jason J. Hickey, Nancy Lynch, and Robbert van Renesse. Specifications and proofs for Ensemble layers. In Cleaveland [14], pages 119-133.

[27] Chi Ho, Robbert van Renesse, Mark Bickford, and Danny Dolev. Nysiad: practical protocol transformation to tolerate byzantine failures. In NSDI'08: Proceedings of the 5th USENIX Symposium on Networked Systems Design and Implementation, pages 175-188, Berkeley, CA, USA, 2008. USENIX Association.

[28] Douglas J. Howe. Importing mathematics from HOL into Nuprl. In J. von Wright, J. Grundy, and J. Harrison, editors, Theorem Proving in Higher Order Logics, volume 1125, of Lecture Notes in Computer Science, pages 267-282. Springer-Verlag, Berlin, 1996.

[29] Douglas J. Howe. Semantic foundations for embedding HOL in Nuprl. In Martin Wirsing and Maurice Nivat, editors, Algebraic Methodology and Software Technology, volume 1101 of Lecture Notes in Computer Science, pages 85-101. Springer-Verlag, Berlin, 1996.

[30] Gerwin Klein, Kevin Elphinstone, Gernot Heiser, June Andronick, David Cock, Philip Derrin, Dhammika Elkaduwe, Kai Engelhardt, Rafal Kolanski, Michael Norrish, Thomas Sewell, Harvey Tuch, and Simon Winwood. sel4: formal verification of an os kernel. In SOSP '09: Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles, pages 207-220, New York, NY, USA, 2009. ACM.

[31] Christoph Kreitz. Automated fast-track reconfiguration of group communication sys- tems. In Cleaveland [14], pages 104-118.

[32] Christoph Kreitz. Building reliable, high-performance networks with the nuprl proof development system. JFP, 14(1):21-68, 2004.

[33] Christoph Kreitz, Mark Hayden, and Jason J. Hickey. A proof environment for the development of group communications systems. In Fifteen International Conference on Automated Deduction, number 1421 in Lecture Notes in Artificial Intelligence, pages 317-332. Springer, 1998.

[34] Leslie Lamport. Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 21(7):558-65, 1978.

[35] Leslie Lamport. The part-time parliament. ACM Trans. Comput. Syst., 16(2):133-169, 1998.

[36] Xavier Leroy. Formal certification of a compiler back-end or: programming a compiler with a proof assistant. SIGPLAN Not., 41(1):42-54, 2006.

[37] Jochen Liedtke. Toward real microkernels. COMMUNICATIONS OF THE ACM, 39(9), 1996.

[38] Xiaoming Liu, Christoph Kreitz, Robbert van Renesse, Jason J. Hickey, Mark Hayden, Kenneth Birman, and Robert Constable. Building reliable, high-performance communication systems from components. In David Kotz and John Wilkes, editors, 17th ACM Symposium on Operating Systems Principles (SOSP'99), volume 33(5) of Operating Systems Review, pages 80-92. ACM Press, December 1999.

[39] Nancy Lynch. Distributed Algorithms. Morgan Kaufmann Publishers, San Mateo, CA, 1996.

[40] Robin Milner. The Space and Motion of Communicating Agents. Cambridge University Press, New York, NY, USA, 2009.

[41] Fred B. Schneider, Kevin Walsh, and Emin Gun Sirer. Nexus authorization logic (nal): Design rationale and applications. Department of Computer Science Tech Report, Cornell University, Ithaca, NY, 2009.

[42] Jonathan S. Shapiro, Jonathan M. Smith, and David J. Farber. Eros: a fast capability system. In In Symposium on Operating Systems Principles, pages 170-185, 1999.

[43] H. Tuch. Formal Memory Modles for Verifying C Systems Code. PhD thesis, UNSW, 2008.

[44] Frank T. Vertosick, Jr. The Genius Within: Discovering the Intelligence of Every Living Thing. Harcourt, 2002.

[45] G. Winskel. An introduction to event structures. In J. W. de Bakker et al., editors, Linear Time, Branching Time and Partial Order in Logics and Models for Concurrency, number 345 in Lecture Notes in Computer Science, pages 364-397. Springer, 1989.