In 2013 I moved from to industry after over 25 years in academia, when Facebook acquired a verification startup, Monoidics, that I was involved with. In this talk I’ll recount the clash of cultures I encountered, where traditionally calm and cool formal reasoning techniques came in contact with a heated software development methodology based on rapid modification of large codebases (thousands of modifications per day on 10s MLOC). I will tell how we found that static formal reasoning could thrive, if certain technical approaches (based on compositionality), how the industrial experience caused me to question some of the assumptions I learned in academic static analysis, and how I’ve come out the other side with new science spurred by that experience (most recently, incorrectness logic). Overall, I hope to convey that having science and engineering playing off one another in a tight feedback loop is possible, even advantageous, when practicing static analysis in industry at present.
Most Java static analysis frameworks provide an intermediate presentation (IR) of Java Bytecode to facilitate the development of static analyses. While such IRs are often based on three-address code, the transformation itself is a great opportunity to apply optimizations to the transformed code, such as constant propagation.
In this paper, we propose TACAI, a refinable IR that is based on abstract interpretation results of a method's bytecode. Exchanging the underlying abstract interpretation domains enables the creation of various IRs of different precision levels. Our evaluation shows that TACAI can be efficiently computed and provides slightly more precise receiver-type information than Soot's Shimple representation. Furthermore, we show how exchanging the underlying abstract domains directly impacts the generated IR.
Bug finders are mainstream tools used during software development that significantly improve the productivity of software engineers and lower maintenance costs. These tools search for software anomalies by scrutinising the program's code using static program analysis techniques, i.e., without executing the code. However, current bug finders do not explain why bugs were found, primarily due to coarse-grain abstractions that abstract away large portions of the operational semantics of programming languages. To further improve the utility of bug finders, it is paramount to explain reported bugs to the end-users.
In this work, we devise a new technique that produces a program trace for a reported bug giving insight into the root cause for the reported bug. For the generation of the program trace, we use an abstracted flow-based semantics for programs to overcome the undecidability of the problem. We simplify the semantic problem by mapping an input program with a reported bug to a Constant Copy Machine (CCM) for the trace construction. Using CCM the semantics of the program can be weakened, and thus bug provenance can be solved in polynomial time, producing a shortest trace in the process which gives the shortest explanation. The technique is reified in the bug tracing tool Digger and is evaluated on several open-source Java programs.