Presented annually to the author of an outstanding doctoral dissertation in the area of Programming Languages. The award includes a prize of $1,000. The awardee can choose to receive the award at ICFP, SPLASH, POPL, or PLDI. At the discretion of the Selection Committee, multiple awards and/or honorable mentions may be presented for a given year.
The Reynolds Dissertation Award recognizes the contributions to computer science that John C. Reynolds made during his life. It is a renaming of the SIGPLAN Outstanding Doctoral Dissertation Award, intended to encourage the clarity and rigor that Reynolds embodied and, at the same time, to provide a reminder of Reynolds’s legacy and the difference a person can make in the field of programming language research.
All questions about the John C. Reynolds Doctoral Dissertation Award should be directed to the SIGPLAN Awards co-Chairs.
Nominations can be submitted at any time using the Web form at https://awards.sigplan.org/nominate/reynolds/. Nominations submitted on or before January 15th will be considered for the award that year. The nominated dissertation must have been submitted for award of the doctoral degree in the year prior to the nomination deadline, and be available in English (to facilitate evaluation by the selection committee).
Recipients are selected by a committee constituted as follows:
The current committee is:
If any member of the committee has a conflict of interest with a given nominee they shall declare that to the committee; once so declared, conflicts of interest shall not automatically prevent a committee member from taking part in the selection process. However, if a member of the committee, or the chair of the committee, feels that the association of a committee member with a nominee would interfere with impartial consideration of the nominees, that conflicted member shall be absented from the relevant parts of the discussion. If a committee member has conflicts of interest with more than one nominee, the Chair of the Committee may ask the constituency that appointed the committee member to select a replacement member. The SIGPLAN EC Chair will adjudicate as necessary.
Machine Learning for Programming Language Processing
Advisor: Eran Yahav
Uri Alon’s dissertation has made significant contributions to the emerging area of “language models for code,” an exciting topic that lies at the intersection of Machine Learning and Programming Languages research.
A central question of the dissertation is how to effectively leverage the structure available in programming languages to provide novel neural solutions to programming-related tasks. The key contribution of this work is to leverage paths in the AST representation of the program to design new machine-learning models and to apply such models to solve a number of tasks of practical importance, including code completion, code captioning, and method-name prediction, and to do this across a number of different programming languages.
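To make the idea of AST paths concrete, the following is a minimal sketch, not the dissertation's implementation. It uses Python's standard `ast` module to list, for every pair of identifier or constant leaves, the sequence of node types that connects them through their lowest common ancestor. The function names `collect_leaves` and `ast_paths` are hypothetical, chosen only for this illustration.

```python
import ast
from itertools import combinations


def collect_leaves(node, trail=()):
    """Yield (token, trail) for each identifier or constant leaf.

    `trail` is the tuple of AST nodes from the root down to the leaf.
    """
    trail = trail + (node,)
    if isinstance(node, ast.Name):
        yield node.id, trail
    elif isinstance(node, ast.Constant):
        yield repr(node.value), trail
    else:
        for child in ast.iter_child_nodes(node):
            yield from collect_leaves(child, trail)


def ast_paths(source):
    """Return (token_a, [node type names], token_b) for each pair of leaves."""
    leaves = list(collect_leaves(ast.parse(source)))
    result = []
    for (tok_a, trail_a), (tok_b, trail_b) in combinations(leaves, 2):
        # Count the shared prefix; its last node is the lowest common ancestor.
        shared = 0
        while (shared < min(len(trail_a), len(trail_b))
               and trail_a[shared] is trail_b[shared]):
            shared += 1
        up = [type(n).__name__ for n in reversed(trail_a[shared:])]
        top = type(trail_a[shared - 1]).__name__
        down = [type(n).__name__ for n in trail_b[shared:]]
        result.append((tok_a, up + [top] + down, tok_b))
    return result


for path in ast_paths("x = y + 1"):
    print(path)
```

A learned model would consume such (token, path, token) triples as input features; the sketch stops at extracting them.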
Overall, Uri Alon’s dissertation combines insights from the PL and ML communities in an elegant way to solve many coding-related problems of practical interest. The thesis stands out in terms of its breadth of contributions, interdisciplinary nature, and high-quality writing.
- Selection committee: Işil Dillig, Tom Reps, Milind Kulkarni, Stephanie Weirich, Hidehiko Masuhara
Novel Polynomial Approximation Methods for Generating Correctly Rounded Elementary Functions
Advisor: Santosh Nagarakatte
Jay Lim’s dissertation makes foundational advances on techniques for generating correctly rounded math libraries for numerous elementary functions (e.g., sin, cos, log) for multiple representations (float, posits) and rounding modes. The resulting libraries are significantly faster than the state-of-the-art while producing correct results for all inputs. This is a remarkable achievement that stands out in terms of its direct and immediate impact.
In more detail, this dissertation proposes a collection of novel techniques for generating polynomial approximations that produce correctly rounded results of an elementary function. The dissertation shows how this problem can be structured as a linear-programming problem in a way that accounts for numerical errors that arise from polynomial approximation and range reduction. The dissertation also shows how to generate a single polynomial approximation that produces the correctly rounded results for multiple rounding modes and multiple precision configurations.
Overall, the dissertation is very well-written and combines elegant theoretical insights with exemplary experimental results to solve an open problem of practical interest.
- Selection committee: Işil Dillig, Tom Reps, Milind Kulkarni, Stephanie Weirich, Hidehiko Masuhara
Understanding and Evolving the Rust Programming Language
Advisor: Derek Dreyer
Ralf Jung’s dissertation aims to provide an understanding and justification for the practical memory and concurrency safety that the type system of Rust is meant to enforce. The main new challenge is posed by libraries that use unsafe and thus aren’t checked by the type system, but are assumed to be “safe”. In particular, the usual progress-and-preservation style of proof of type soundness does not work in this setting because it relies on a closed-world assumption. To address the problem, the thesis contributes a semantic characterization of the properties to enforce, which is non-trivial due to the richness of Rust’s typing abstractions. In collaboration with colleagues, Ralf Jung developed a rich program logic, called Iris, based on ideas of higher-order concurrent separation logic, which is used to give a semantic interpretation of Rust’s types. Iris has since been used in other contexts, e.g., to reason formally about other kinds of low-level systems code (such as C and assembly), and by systems and security researchers.
The primary contribution of the dissertation is the development (and validation) of a semantic model of Rust’s type system using Iris, all developed within Coq, that makes precise the informal notions that the Rust designers had in mind. That precision, in turn, makes it possible to set up criteria that unchecked libraries must satisfy in order to ensure that checked code that is linked against them will respect the memory and data-race safety properties that the type system was intended to enforce. The effectiveness is demonstrated by verifying a number of the critical unchecked library abstractions. Like Iris, the ideas underlying the verification regime proposed here transfer to reasoning about any practical system that is a mix of languages; e.g., reasoning about Java programs involves reasoning about the run-time system, which is implemented in C.
Ralf Jung not only formalized everything in a proof assistant, so that the proofs are machine-checked and reusable by others, but also put together practical tools that people can use today. The thesis will continue to influence PL researchers via Iris and will continue to influence practitioners via Rust.
Scalable Automated Reasoning for Programs and Deep Learning
Advisor: Martin Vechev and Markus Püschel
The central research question addressed by Singh’s dissertation is how to enable scalable and precise automated reasoning based on the abstract interpretation framework in the domains of numerical software and deep learning models.
With regard to numerical software, the thesis contributes a new method for online decomposition of program variables, which speeds up the Polyhedra domain and also generalizes to other abstract domains by providing a theory that gives a general construction for obtaining decomposed transformers from existing non-decomposed ones. This method is effective in removing redundant computations at each analysis step. In addition, the thesis proposes a reinforcement learning method to learn policies for selectively losing precision at different analysis steps without affecting the overall precision, because the results of some precise steps may be discarded later and hence do not affect the end result. The methods are implemented in the ELINA library, which includes a number of important numerical domains (e.g., Octagon, Polyhedra, Zone), and is shown to outperform the state-of-the-art Polyhedra library by orders of magnitude. ELINA is currently the state-of-the-art library for numerical static analysis and is used in both industry and academia.
To address the analysis of deep learning models, the dissertation provides a set of abstract domains, specifically designed for deep learning models. The observation motivating this design decision is that standard numerical abstract domains are not suited for analyzing neural networks due to the nature of transformations of these networks and their non-linearity. A key contribution is the DeepPoly domain and its transformers for handling the usual activation functions used in deep networks. All domains and transformers are carefully implemented in a neural network verifier, called ERAN, which is currently the state-of-the-art system for neural network certification and is also used in both academia and industry.
Overall, the thesis includes everything one would like to see in an outstanding doctoral dissertation: clean mathematical concepts with corresponding efficient algorithms which solve hard and important problems, ones that have resisted a solution for decades, a complete implementation of all ideas in state-of-the-art libraries used in academia and industry, all while opening new directions of research that have been picked up by the community as demonstrated by the large set of follow-up works.
Advisor: Rupak Majumdar
Soundness is at the core of most PL verification techniques and random testing is a commonly used technique for analyzing software. Developing a theory of soundness for random testing is thus very important, but very few results existed before this thesis. Randomized techniques are seldom used in (sound) program analyses; hence, addressing the problem requires the development of new ways of approaching it. Filip’s thesis is among the first to apply deep techniques from randomized algorithms and combinatorics to understanding and explaining the effectiveness of random testing. In turn, the theory helps with the design of new random testing approaches. The thesis addresses a hard problem, bringing in novel theory and proving hard theorems. When we see a phenomenon that we cannot immediately explain (in this case that random testing is so effective), we should try to build a scientific explanation. For some problems, including random testing, it is unclear that one can actually formulate a precise theory, because the “real world” is extremely messy. The fact that Filip is able to formulate the problem precisely and prove nontrivial theorems about it is surprising and opens the door to a new field.
Advisor: David Walker
Ryan Beckett’s thesis addresses the highly relevant topic of ensuring the correctness of computer network configurations. Computer networks connect key components of the world’s critical infrastructure, and their misconfiguration may have severe consequences for our society. While the problems Ryan considers are from the networking community, the methods that he uses to solve them are drawn from the programming languages and formal methods communities, including declarative languages, automata, logic, compilers, bisimulation, static analysis, and abstraction. Using these methods, Ryan’s thesis describes new principles, algorithms, and tools for both verification and synthesis of network control plane algorithms.
On the verification side, Ryan’s key insight is that one can characterize the fixed points to which routing algorithms converge. By carefully representing the constraints that define such fixed points as SMT formulas, he developed the world’s most general and efficient control plane verification engine. To further speed up verification, he defined clever new algorithms capable of computing small, abstract networks that are behaviorally equivalent to much larger ones. On the synthesis side, Ryan defines a new programming language called Propane for specifying network control plane behavior. The work demonstrates that it takes just 50 lines of code in the right high-level programming language (as opposed to 1000s of lines of configuration, per device, for hundreds of separate devices) to specify core network requirements that are compiled to industry-standard devices. Ryan also defines new analyses that guarantee correctness in the presence of device failures.
Ryan’s results have been published in top programming languages and networking conferences. While all aspects of the work could easily have been submitted to programming languages or formal methods venues, submitting his work to networking conferences has certainly maximized his impact in his domain of study. Most notably, his work on the Propane language won the best paper award at SIGCOMM in 2016. With his thesis, Ryan Beckett has demonstrated his capability to conduct truly interdisciplinary research of the highest scientific quality: The results were possible only with a deep knowledge across the programming languages, formal methods and networking domains. Moreover, the thesis is an excellent witness of the profound impact that programming language and formal reasoning methods can have on other research areas.
Advisor: Benjamin C. Pierce and Aaron Roth
The thesis explores and generalizes the COUPLING proof technique, for establishing properties of randomized algorithms. A correspondence between two different probabilistic programs (or two runs of the same program) requires the specification of the correlation between corresponding pairs of random draws and then extending this coupling on samples to a coupling on the resulting output distributions, which can then be used to establish the desired property on the programs. As Probabilistic Relational Hoare Logic has just the right structure to be able to formally encode these coupling arguments, the thesis analyzes the structure of these arguments through this formal lens, justifying the attractiveness of the coupling approach in terms of compositionality. It then considers an enriched logic and its connection to approximate couplings, which in turn are directly connected to differential privacy. Working in this logic, it gives novel proofs of some key constructions from differential privacy, including the exponential and sparse vector mechanisms. The proof for sparse vector is the first ever to be carried out in a machine-checkable form.
Taken together, these results constitute a significant advance in our ability to mechanize key properties of important randomized algorithms such as those found in the differential privacy literature.
Advisor: Santosh Nagarakatte
This thesis proposes abstractions and formal tools to develop correct LLVM peephole optimizations. A domain-specific language (DSL), Alive, enables the specification and verification of peephole optimizations. An Alive transformation is shown to be correct automatically by encoding the transformation and correctness criteria as constraints in first-order logic, which are automatically checked for validity using an SMT solver. Alive then generates C++ code for an LLVM pass. Peephole optimizations in LLVM are executed repeatedly until no optimization is applicable, and one optimization could undo the effect of another, resulting in non-terminating compilation. A novel algorithm based on directed-acyclic-graph (DAG) composition determines whether such non-termination bugs can occur with a suite of peephole optimizations. The Alive toolkit can generate concrete input to demonstrate non-termination and can automatically generate weakest preconditions. It is actively used by the LLVM community, has detected numerous bugs in existing passes, and is preventing bugs from being added to the compiler.
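As a rough illustration of what it means to check a peephole optimization for all inputs, the sketch below replaces the SMT validity query with exhaustive enumeration over 8-bit values. This is a stand-in technique, not how Alive works internally, and it ignores LLVM's undefined-behavior semantics, which a real verifier must model. The names `WIDTH`, `MASK`, and `find_counterexample` are hypothetical.

```python
from itertools import product

WIDTH = 8                  # assumed bit width for this illustration
MASK = (1 << WIDTH) - 1    # keeps results within WIDTH bits


def find_counterexample(source, target, arity):
    """Return inputs on which `target` differs from `source`, else None.

    Exhaustive enumeration over all WIDTH-bit inputs stands in for an
    SMT validity check on the corresponding bit-vector formula.
    """
    for args in product(range(1 << WIDTH), repeat=arity):
        if source(*args) & MASK != target(*args) & MASK:
            return args
    return None


# x * 2  ==>  x << 1   (holds under wrapping arithmetic)
ok_shift = find_counterexample(lambda x: x * 2, lambda x: x << 1, 1)

# (x & y) | (x ^ y)  ==>  x | y   (holds for all inputs)
ok_or = find_counterexample(lambda x, y: (x & y) | (x ^ y),
                            lambda x, y: x | y, 2)

# (x + 1) >u x  ==>  true   (wrong: the addition can wrap around)
bad = find_counterexample(lambda x: int(((x + 1) & MASK) > x),
                          lambda x: 1, 1)

print(ok_shift, ok_or, bad)
```

Enumeration is only feasible for small widths and few inputs, which is one reason to hand the same question to a solver instead.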
Advisor: Mike Gordon and Magnus Myreen
This thesis establishes end-to-end verification with a comprehensive chain of connections all the way from the semantics of a theorem prover expressed in set theory down to x86 machine code running it. It also makes striking use of self-application for both the compiler and the theorem prover. The “CakeML” compiler is compiled with itself. But more than that: it is formally proved correct, and the core of the theorem prover used to prove its correctness is also compiled using CakeML and formally verified using itself. Not only is this a compelling demonstration of the possibilities for formally correct software, and the promise of the CakeML system as an enabling technology for it, but it also gives perhaps the first really convincing correctness proof for the core of a higher-order logic interactive theorem prover. It is possible that this combination of theorem prover and formally verified path to machine code will become one of the primary platforms for developing high-assurance software.
Advisor: Azadeh Farzan
This thesis proposes a new solution for the problem of concurrent program verification, introducing the use of explicitly parallel models and logics to represent and reason about concurrent programs. An effective way of finding a sweet spot in the cost-precision spectrum is provided, weaving together the two steps of constraint generation and constraint resolution, offering a new way to think about proofs of concurrent programs. This paradigm shift has been missing in the space of “automated” program verification of infinite-state programs, since despite the absolute elegance of Owicki-Gries and Rely-Guarantee proof techniques, the completeness of these techniques heavily relies on the concept of auxiliary proof state. In this thesis, “inductive data flow graphs” (iDFG) offer the same completeness and elegance as the Owicki-Gries method minus the need for the auxiliary state in generating provably “compact” proof arguments. The elegance of iDFGs is generalized into a proof method, “proof spaces”, for concurrent programs with “unboundedly” many threads.
Advisor: Mooly Sagiv
Automated verification of imperative data structures such as lists is challenging because of the need to define complex loop invariants that have a sensible interpretation in an underlying program logic. This thesis presents a number of foundational results that greatly simplify the proof obligations that must be provided by the programmer for the verification of such programs. Through the introduction and application of concepts such as deterministic transitive closure and property-directed reachability, the thesis demonstrates the feasibility of using a decidable logic (EPR) as an effective basis for answering reachability queries on an expressive class of imperative list-manipulating programs. The thesis also extends these foundational ideas to define modular principles for reasoning about imperative data structures across procedure boundaries. These contributions ultimately lead to a system that can effectively infer loop invariants from an expressive template family using existing SAT solver and shape analysis technology. Collectively, these results lead to a thesis that makes very important foundational and practical contributions to our understanding of the potential of automated program verification and its application to real-world programs.
Advisor: Stephanie Weirich
This work represents a major step toward the holy grail of “general-purpose dependently typed programming” – i.e., the design of programming languages that allow programs to be written using the full spectrum of standard features and idioms while supporting machine-checked correctness proofs for these programs, expressed in the very same language. Such languages, combining the logical power of full-spectrum dependent languages such as Coq and Agda with the convenience and expressiveness of mainstream functional languages and supporting “lightweight verification” targeting just the most important properties of the most critical parts of the code, have been imagined for decades, but realizing this dream has proved technically challenging. The language and its accompanying metatheory introduce two important innovations. The first, and more technical, of these is the design of a core language combining a call-by-value evaluation order, a pragmatically motivated treatment of computational irrelevance (to support compilation to efficient machine code), and a novel treatment of propositional equality. The second is a new approach to surface-language design, where two terms (including proof terms) are considered to be equivalent if one can be rewritten to the other by applying a set of “known equalities” arising from previous definitions, which is quite convenient and intuitive for programmers. This beautiful thesis will be a cornerstone of a new generation of language designs supporting significantly more robust and reliable software development.
Advisor: Peter Sewell
Mark Batty’s dissertation makes significant contributions to the understanding of memory models for C and C++. The ISO C++ committee proposed a design for C and C++ concurrency that was not up to the task of capturing a realistic relaxed-memory concurrency model. Batty’s work uncovered a number of subtle and serious flaws in the design, and produced an improved design in completely rigorous and machine-checked mathematics. Using software tools to explore the consequences of the design, derived directly from the mathematics, it showed that it has the desired behavior on many examples, and developed mechanized proofs that the design meets some of the original goals, showing that for programs in various subsets of the language one can reason in simpler models. The standards committee have adopted this work in their C11, C++11, and C++14 standards. The members of the award committee were impressed with the quality of the work, the impact it has had on the standardization process for C++, and the clarity of the presentation.
Advisor: Mitchell Wand
Aaron Turon’s dissertation makes several major contributions to the design, implementation, and verification of scalable concurrent programs. First, the dissertation presents “reagents”, a high-level language of combinators for designing—and composing—lock-free data structures. Second, the dissertation shows how lock-free data structures can be used to scalably implement Fournet and Gonthier’s join calculus, in a newly re-engineered C# library that significantly outperforms prior lock-based implementations. Third, the dissertation develops powerful theoretical foundations—based on logical relations and separation logic—for verifying the correctness of scalable concurrent algorithms via contextual refinement. The members of the award committee were impressed with both the breadth and depth of the work, as well as the elegance of the exposition.
Advisor: Ranjit Jhala
Patrick Rondon’s dissertation makes several significant contributions to the field of automatic program verification. It takes a type system – a highly scalable yet not quite precise method of dealing with programs – and refines it using Satisfiability Modulo Theories (SMT) techniques to compensate for the precision loss. There are implementations for both OCaml and C. The achieved degree of effectiveness and automation is astonishing: programs that are beyond the existing verification tools can be handled fully automatically within seconds. It demonstrates that formal verification can yield significant reliability guarantees for mainstream software engineering, at a reasonable cost. In addition, the thesis contains a comprehensive formalization with very detailed, readable proofs. The members of the award committee were impressed by the quality of the work and the clarity of the presentation.
Advisor: Todd Millstein
This dissertation addresses the problem of obtaining reliable results from concurrent programs. As a first step, the dissertation presents LiteRace, which uses sampling to dynamically detect race conditions. As a second step, the dissertation presents DRFx, which is a memory model that enforces sequential consistency, where hardware and software share responsibility for detecting violations of sequential consistency. Finally, the dissertation presents the design of an optimizing compiler that preserves sequential consistency. The dissertation thus demonstrates how a revised distribution of responsibilities among programmers, programming languages, and hardware can help detect and avoid concurrency violations. The committee was impressed with the dissertation’s broad vision for both the problems of concurrency and the possible solutions.
- John Boyland (U. Wisconsin Milwaukee)
- Chen Ding (U. Rochester)
- Matthew Flatt (U. Utah)
- David Gregg (Trinity U.)
- Norman Ramsey (Tufts U.)
- Jeremy Siek (U. Colorado)
- Adam Welc (Oracle)
Advisor: Vikram Adve
This dissertation makes several significant contributions to the field of parallel and concurrent programming. The main technical contribution is a type and effect system that enables reasoning about non-interference at a fine granularity. A second contribution is support for non-deterministic code sections that are explicitly marked as such. A third contribution is support for object-oriented frameworks, where user extensions are guaranteed to adhere to the framework’s effect restrictions. These contributions are backed by formal models, soundness proofs, and the Deterministic Parallel Java implementation. Evaluation shows that highly satisfactory speedups can be achieved on interesting code bases, sometimes beating the performance of hand-crafted implementations. The members of the award committee were impressed by the quality of the work and the clarity of the presentation.
Selection committee: Ras Bodik, Matthew Dwyer, Matthew Flatt, Matthew Fluet, Kevin Hammond, Nathaniel Nystrom, Kostis Sagonas, Peter Sewell, Peter Thiemann
Advisor: Thomas Reps
This dissertation develops improvements to interprocedural program analysis through context-bounded analysis and through Lal’s extended weighted push down systems, which generalize weighted push down systems to handle local variables. The dissertation describes both algorithms and experiments, and it shows, for example, a 30-fold speedup over existing algorithms for analyzing concurrent programs. The members of the award committee were impressed by the unusual scope and depth of the dissertation and its excellent presentation.
Advisor: Saman Amarasinghe
This dissertation describes the StreamIt synchronous dataflow language, for which Thies led the definition. The language supports several novel constructs, notably teleport messaging. Thies’s dissertation includes a technique for processing compressed video data, and it also describes dynamic analysis techniques to convert legacy C applications to streaming applications. The members of the award committee were impressed with the novelty, interdisciplinary nature, and breadth of the work, the care given to evaluation, and the quality of the presentation.
Advisor: Kathryn McKinley
This dissertation makes several significant contributions to the problems of tracking down and tolerating software errors in deployed systems. It proposes a variety of techniques, ranging from a breakthrough, probabilistic method of compactly representing calling contexts, to novel techniques for tracking null pointers, to garbage collector modifications that let programs tolerate memory leaks. The evaluation committee was impressed by Michael’s fresh perspective on these problems and the thorough experimental evaluation by which he backs up his claims. His research has already had broad adoption and impact, and we believe that his techniques will be brought to bear on a wide range of future applications.
Advisor: Alan Mycroft and Matthew Parkinson
This dissertation introduces a novel logic for reasoning about concurrent shared-memory programs. This logic subsumes both rely/guarantee reasoning and separation logic in an elegant and natural manner. The dissertation establishes the semantic properties of the logic and demonstrates its applicability on a range of highly complex concurrent algorithms and data structures. The evaluation committee found the clarity of Viktor’s presentation and the technical depth of his results particularly compelling, and we believe that this work creates a foundation for new tools and automated techniques for reasoning about concurrent programs.
Advisor: Rajeev Alur
The thesis explores a formalism called nested trees that can represent complex branching behavior (loops and recursion) and support modular statement of context-sensitive correctness conditions. It further makes a specific technical contribution by offering the first algorithm for reachability in nested trees that is sub-cubic in performance. The committee believes this work has great potential for long-term utility.
Advisor: Rajiv Gupta
Dynamic slicing is a technique for determining which variables and data structures affected values causing a fault (bug) at a particular location in a particular run of a program, thus allowing a programmer to work backwards to determine the ultimate cause of a fault. Previously this approach was too expensive to use in practice. Zhang has improved the performance by orders of magnitude, making it practical. The committee believes this work will have considerable impact and value in practice.
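The basic idea of working backwards through one run can be sketched as follows. This is a deliberately naive version that tracks only data dependencies over a recorded trace, and it does not reflect the optimizations developed in the dissertation. The trace format and the name `dynamic_slice` are assumptions made for this illustration.

```python
def dynamic_slice(trace, variable):
    """Return the set of line numbers that affected `variable`'s final value.

    `trace` is a list of (line, defined_var, used_vars) tuples in execution
    order, one entry per executed assignment. Only data dependencies are
    followed; control dependencies are ignored in this sketch.
    """
    relevant = {variable}   # variables whose defining statement is still sought
    lines = set()
    for line, defined, used in reversed(trace):
        if defined in relevant:
            lines.add(line)
            relevant.discard(defined)
            relevant.update(used)
    return lines


# Hypothetical trace of one run of a five-line program.
trace = [
    (1, "a", []),       # a = 1
    (2, "b", []),       # b = 2
    (3, "c", ["a"]),    # c = a + 1
    (4, "b", ["b"]),    # b = b * 2
    (5, "d", ["c"]),    # d = c + 5
]

print(sorted(dynamic_slice(trace, "d")))
```

Lines 2 and 4 never influence `d` in this run, so a programmer chasing a wrong value of `d` can disregard them.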
Advisor: George Necula
Advisor: Wilson Hsieh
Advisor: Scott Nettles
Advisor: Rajiv Gupta and Mary Lou Soffa