Program Analysis Resources
source link: https://gist.github.com/MattPD/00573ee14bf85ccac6bed3c0678ddbef
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Instantly share code, notes, and snippets.
Program Analysis Resources
(draft; work in progress)
See also:
- Compilers
- Program analysis:
- Dynamic analysis - instrumentation, translation, sanitizers
- Static analysis - static analysis (static checkers and compilers) and verification
General
Symbolic Execution
- Awesome Symbolic Execution - https://github.com/ksluckow/awesome-symbolic-execution
- Symbolic execution history timeline: https://github.com/enzet/symbolic-execution
- Symbolic Execution: Intuition and Implementation - http://www.usrsb.in/symbolic-execution-intuition-and-implementation.html
- A bibliography of papers related to symbolic execution - https://github.com/saswatanand/symexbib
- A Survey of Symbolic Execution Techniques
- ACM Computing Surveys 51(3) 2018
- Roberto Baldoni, Emilio Coppa, Daniele Cono D'Elia, Camil Demetrescu, Irene Finocchi
- https://github.com/season-lab/survey-symbolic-execution
- https://arxiv.org/abs/1610.00502
All You Ever Wanted to Know About Dynamic Taint Analysis and Forward Symbolic Execution (but might have been afraid to ask)
Systematic Comparison of Symbolic Execution Systems: Intermediate Representation and its Generation
Scaling Symbolic Evaluation for Automated Verification of Systems Code with Serval
Symbolic Execution: Software
- Maat: an open-source Dynamic Symbolic Execution and Binary Analysis framework
- Provides various functionalities such as symbolic execution, taint analysis, constraint solving, binary loading, environment simulation, and leverages Ghidra's sleigh library for assembly lifting
- https://github.com/trailofbits/maat
- https://maat.re
- https://blog.trailofbits.com/2022/02/23/maat-symbolic-execution-made-easy/
Lectures & Courses
CMPUT 416: Foundations of Program Analysis
DECA: Designing Code Analyses for Large-scale Software Systems
Foundations of Programming Languages
- Static Program Analysis
- Anders Møller and Michael I. Schwartzbach
- https://cs.au.dk/~amoeller/spa/
PLISS 2019 - Anders Møller
25 Years of Program Analysis
Software Analysis
Software Analysis and Testing
Program Analysis and Reliability - Nick Sumner, CMPT 886, Spring 2015, SFU
Program analysis for reverse engineers: from T to ⊥
CS 252r: Advanced Topics in Programming Languages
- A Gentle Introduction to Program Analysis
- Programming Languages Mentoring Workshop 2014
- Isıl Dillig (University of Texas)
- https://www.cis.upenn.edu/~alur/CIS673/isil-plmw.pdf
UFMG DCC888: Static Program Analysis
Software
- awesome-llvm: A curated list of awesome LLVM related docs, tools, and other resources
- LLVM Weekly - http://llvmweekly.org/
- https://eli.thegreenplace.net/tag/llvm-clang
Building a Checker in 24 hours
Code transformation and analysis using Clang and LLVM
Compiler-assisted Performance Analysis
- Dg: LLVM Static Slicer
- Dependence graph for programs. A set of generic program analyses and a static slicer for LLVM bitcode
- https://github.com/mchalupa/dg
FPSCEV: Floating-Point Scalar Evolution in LLVM
- LLVM Dataflow Info Printer Pass
- https://github.com/regehr/llvm-dataflow-info
- Tell us what some of LLVM's dataflow analyses think about the code being compiled.
- Testing Dataflow Analyses for Precision and Soundness
- Testing Static Analyses for Precision and Soundness
- CGO (Code Generation and Optimization) 2020
- Jubi Taneja, Zhengyang Liu, John Regehr
- http://www.cs.utah.edu/~regehr/cgo20.pdf
- LLVM Debugging Tips and Tricks
Phasar - A LLVM-based static code analysis framework
- SVF: Interprocedural Static Value-Flow Analysis in LLVM
- Pointer Analysis and Program Dependence Analysis for C and C++ Programs
- http://svf-tools.github.io/SVF/
- https://github.com/unsw-corg/SVF
- SVF: Interprocedural Static Value-Flow Analysis in LLVM
- Compiler Construction (CC '16)
- Yulei Sui and Jingling Xue
- https://yuleisui.github.io/publications/cc16.pdf
- 2016 EuroLLVM Developers' Meeting: Y. Sui "SVF: Static Value-Flow Analysis in LLVM"
- https://www.youtube.com/watch?v=nD-i-enA8rc
LLVM - Symbolic Execution
- haybale: Symbolic execution of LLVM IR with an engine written in Rust
- operates on LLVM IR, which allows it to analyze programs written in C/C++, Rust, Swift, or any other language which compiles to LLVM IR
- https://github.com/PLSysSec/haybale
KLEE Symbolic Virtual Machine
SymCC: efficient compiler-based symbolic execution
LLVM - Verification
Crux: a tool for improving the assurance of software using symbolic testing, currently supporting C, C++, and Rust
LLBMC: The Low-Level Bounded Model Checker
- llrêve: Automatic regression verification for LLVM programs
- Automatically check programs for equivalence
- https://github.com/mattulbrich/llreve
- https://formal.iti.kit.edu/projects/improve/reve/
- Relational Program Reasoning Using Compiler IR – Combining Static Verification and Dynamic Analysis
- Journal of Automated Reasoning 60(3) 2017
- Moritz Kiefer, Vladimir Klebanov, Mattias Ulbrich
- http://dx.doi.org/10.1007/s10817-017-9433-5
- SAW: Software Analysis Workbench
- https://saw.galois.com/
- Constructing Semantic Models of Programs with the Software Analysis Workbench
- Verified Software: Theories, Tools and Experiments (VSTTE) 2016
- Robert Dockins, Adam Foltzer, Joe Hendrix, Brian Huffman, Dylan McNamee, Aaron Tomb
- https://doi.org/10.1007/978-3-319-48869-1_5
SeaHorn Verification Framework: A fully automated analysis framework for LLVM-based languages
- SMACK Software Verifier and Verification Toolchain
- http://smackers.github.io
- https://github.com/smackers/smack
- SMACK: Decoupling Source Language Details from Verifier Implementations
- Computer Aided Verification (CAV) 2014
- Zvonimir Rakamaric, Michael Emmi
- https://soarlab.org/publications/2014_CAV_SMACK/
- SMACK Software Verification Toolchain
- International Conference on Software Engineering (ICSE) Companion 2016
- Montgomery Carter, Shaobo He, Jonathan Whitaker, Zvonimir Rakamaric, Michael Emmi
- https://soarlab.org/publications/2016_ICSE_SMACK/
- Leveraging Compiler Intermediate Representation for Multi- and Cross-Language Verification
- Verification, Model Checking, and Abstract Interpretation (VMCAI) 2020
- Jack Garzella, Marek S. Baranowski, Shaobo He, Zvonimir Rakamaric
- https://soarlab.org/publications/2020_vmcai_gbhr/
Clang
Clang Static Analyzer
- Developing the Clang Static Analyzer
- 2019 LLVM Developers’ Meeting; Artem Dergachev
- https://www.youtube.com/watch?v=g0Mqx1niUi0
- Faster, Stronger C++ Analysis with the Clang Static Analyzer
- 2018 LLVM Developers’ Meeting; George Karpenkov, Artem Dergachev
- https://www.youtube.com/watch?v=4n3l-ZcDJNY
Introduction
- Building Program Reasoning Tools using LLVM and Z3
- POPL 2020 Tutorial
- Introduction to LLVM and Z3; Static Dataflow Analysis, Dynamic Symbolic Execution, Assertion Verification
- http://rightingcode.org/tutorials/popl20/
- Getting Started With LLVM: Basics
- 2019 LLVM Developers’ Meeting; Jessica Paquette, Florian Hahn
- https://www.youtube.com/watch?v=3QQuhL-dSys
- How to contribute to LLVM
- 2022; Nikita Popov
- https://developers.redhat.com/articles/2022/12/20/how-contribute-llvm
- How to Contribute to LLVM
- 2019 LLVM Developers’ Meeting; Chris Bieneman, Kit Barton
- https://www.youtube.com/watch?v=C5Y977rLqpw
- Intro to the Architecture of LLVM
- 2020; Nick Desaulniers
- https://www.youtube.com/watch?v=bUTXhcf_aNc
Introduction to LLVM
- Intrinsics, Metadata, and Attributes: The story continues!
- 2016 LLVM Developers’ Meeting
- Hal Finkel, Argonne National Laboratory
- https://www.youtube.com/watch?v=jII0AcgU_5c
LLVM Seminar - Northeastern+MIT
Security Research and Development with LLVM - Andrew Reiter
- clang-llvm-tutorial
- Clang and LLVM Tutorial Examples (AST Interpreter, Function Pointer Analysis, Value Range Analysis, Data-Flow Analysis, Andersen Pointer Analysis, LLVM Backend)
- https://github.com/lijiansong/clang-llvm-tutorial/
- Custom Alias Analysis in LLVM
Introduction to LLVM: Building simple program analysis tools and instrumentation
- LLVMPlayground: Small sample programs that use LLVM and Clang APIs.
- Nick Sumner's Examples
- slides: https://www.cs.sfu.ca/~wsumner/teaching/886/llvm.pdf
- llvm-demo: A simple example of how LLVM can be used to gather static or dynamic facts about a program.
- callgraph-profiler-template: A template for an introductory project on dynamic analysis using LLVM
- clang-plugins-demo: A simple example of defining custom plugins for Clang and the Clang Static Analyzer
- llvm-dataflow-analysis: (very simple) static intraprocedural dataflow analyses
- overflower-template: Template for a project to teach basic static dataflow analysis using LLVM
- path-profiler-template: A template for a path profiling project using LLVM
Quarkslab blog
- Writing a basic clang static analysis check
Introduction: LLVM IR
Instrumentation
Creating an LLVM Sanitizer from Hopes and Dreams
Instrew: Leveraging LLVM for High Performance Dynamic Binary Instrumentation
- Loom: LLVM instrumentation library
PolyTracker: An LLVM-based instrumentation tool for universal taint analysis.
QBDI (QuarkslaB Dynamic binary Instrumentation): A Dynamic Binary Instrumentation framework based on LLVM
sbt-instrumentation: Configurable instrumentation of LLVM bitcode
Lifting
Lifting: Disassembly, Decompilation, Recompilation, Reverse Engineering
- ANVILL Decompiler Toolchain
- ANVILL forges beautiful LLVM bitcode out of raw machine code
- https://github.com/lifting-bits/anvill
decomp: Compositional Decompilation using LLVM IR
llvm-mctoll
McSema: Framework for lifting x86, amd64, and aarch64 program binaries to LLVM bitcode
- Rellic: produces goto-free C output from LLVM bitcode
- Rellic is an implementation of the pattern-independent structuring algorithm to produce a goto-free C output from LLVM bitcode.
- https://github.com/lifting-bits/rellic
- Rellume — Lifts x86-64 to LLVM IR
- https://github.com/aengelke/rellume
- Rellume is a lifter for x86-64 machine code to LLVM IR with focus on the performance of the lifted code. The generated LLVM IR can be compiled and executed again, for example using LLVM's JIT compiler, ideally having the same (or even better) performance as the original code.
- Remill: Library for lifting of x86, amd64, and aarch64 machine code to LLVM bitcode
- reopt: A tool for analyzing x86-64 binaries.
- https://github.com/GaloisInc/reopt
- Reopt is a tool under development for decompiling and recompiling code. It works by mapping binaries into LLVM byte code, using the LLVM optimization passes to optimize the LLVM, and then combining the newly generated into the binary to generate a new executable.
- reopt-vcg: an in-progress Lean4 prototype LLVM/x86 equivalence checker for programs optimized by reopt.
RetDec: a retargetable machine-code decompiler based on LLVM
- revng: a static binary translator
- revng is a static binary translator. Given a input ELF binary for one of the supported architectures (currently MIPS, ARM and x86-64) it will analyze it and emit an equivalent LLVM IR. To do so, revng employs the QEMU intermediate representation (a series of TCG instructions) and then translates them to LLVM IR.
- https://rev.ng/
- https://github.com/revng/revng
Passes
Building, Testing and Debugging a Simple out-of-tree LLVM Pass
- llvm-pass-tutorial: A step-by-step tutorial for building an LLVM sample pass
llvm-tutor: Basic LLVM passes for learning the API
- Writing an LLVM Optimization
- 2020; Jonathan Smith
- A brief overview of LLVM, its pass and IR frameworks and API, and a tutorial on how to write legacy and modern optimization passes.
- https://www.youtube.com/watch?v=MagR2KY8MQI
- https://old.reddit.com/r/cpp/comments/hopfg3/writing_an_llvm_optimization_my_quarantine/
- float-compare-pass: Fork of LLVM for demonstrating optimization pass development
Legacy Pass Manager
New Pass Manager
- How Compilers Work: Introduction to LLVM Passes
- C++ on Sea 2022
- Andrzej Warzynski
- https://www.youtube.com/watch?v=mfTYfzKqi5s
- LLVM’s New Pass Manager
New PM: taming a custom pipeline of Falcon JIT
The LLVM Pass Manager, Part 2
Passes in LLVM, Part 1
- Porting Burst to the New LLVM Pass Manager
Writing LLVM Pass in 2018
Writing Pass Instrumentations for the New PassManager
Readings
- A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World
- Communications of the ACM, Vol. 53 No. 2, 2010
- Al Bessey, Ken Block, Ben Chelf, Andy Chou, Bryan Fulton, Seth Hallem, Charles Henri-Gros, Asya Kamsky, Scott McPeak, Dawson Engler
- https://cacm.acm.org/magazines/2010/2/69354-a-few-billion-lines-of-code-later/fulltext
- Analysing the Program Analyser
- Continuous Reasoning: Scaling the Impact of Formal Methods
- Logic in Computer Science (LISC) 2018; Peter O'Hearn
- https://research.fb.com/publications/continuous-reasoning-scaling-the-impact-of-formal-methods/
- From Start-ups to Scale-ups: Opportunities and Open Problems for Static and Dynamic Program Analysis
- IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM) 2018
- Mark Harman, Peter O'Hearn
- https://research.fb.com/publications/from-start-ups-to-scale-ups-opportunities-and-open-problems-for-static-and-dynamic-program-analysis/
- How to Prevent the next Heartbleed
- 2014; David A. Wheeler
- https://dwheeler.com/essays/heartbleed.html
- Lessons from Building Static Analysis Tools at Google (2018)
Pointer analysis: haven't we solved this problem yet?
- Righting Software
- IEEE Software, May 2004
- https://www.microsoft.com/en-us/research/publication/righting-software/
- Source Code Analysis: A Road Map
- Future of Software Engineering (FOSE) 2007
- David Binkley
- https://dl.acm.org/citation.cfm?id=1254713
- Static versus dynamic analysis---an illusory distinction?
Background
- Programming Languages: Application and Interpretation
- Type Safety in Three Easy Lemmas
- On the Relationship Between Static Analysis and Type Theory
- Soundness and completeness: with precision
- What is soundness (in static analysis)? - http://www.pl-enthusiast.net/2017/10/23/what-is-soundness-in-static-analysis/
Brown CS: CSCI 1730: Programming Languages
OPLSS (Oregon Programming Languages Summer School)
Programming Language Implementation Summer School (PLISS)
- Approachable PL Papers for Undergrads
- SSA book - http://ssabook.gforge.inria.fr/latest/
Intermediate Representations in Imperative Compilers: A Survey
Modern Intermediate Representations (IR)
Testing Intermediate Representations for Binary Analysis
- CS 7194 - Great works in Programming Languages - https://www.cs.cornell.edu/courses/cs7194/2019sp/
Type Systems - Neel Krishnaswami
- http://jschuster.org/blog/2016/11/29/getting-started-in-programming-languages/
- Programming Languages: Application and Interpretation by Shriram Krishnamurthi
- SIGPLAN Awards - http://www.sigplan.org/Awards/
- Great Works in Programming Languages
- Collected by Benjamin C. Pierce
- https://www.cis.upenn.edu/~bcpierce/courses/670Fall04/GreatWorksInPL.shtml
- 10PL: 10 papers that all PhD students in programming languages ought to know, for some value of 10
- Northeastern University Programming Research Lab
- https://github.com/nuprl/10PL
- Best of PLDI 2004
- Classic Papers in Programming Languages and Logic
- Karl Crary
- https://www.cs.cmu.edu/~crary/819-f09/
What Type Soundness Theorem Do You Really Want to Prove?
Type Theory Behind Glasgow Haskell Compiler Internals
- Principles of Programming Languages
- Lectures - Matthias Felleisen
- https://felleisen.org/matthias/4400-s20/lectures.html
- learn-programming-languages
- Resources for the working programmer to learn more about the fundamentals and theory of programming languages.
- Jean Yang
- https://github.com/jeanqasaur/learn-programming-languages
Background: Notation
- A practitioner’s guide to reading programming languages papers
- Basic Mechanics of Operational Semantics
- PLMW @ ICFP 2020; David Van Horn
- https://www.youtube.com/watch?v=exhwykjH_z4
Crash Course on Notation in Programming Language Theory
- How do I read type systems notation
It's Time for a New Old Language
What are we thinking when we present a type theory?
Background: Semantics
Programming Language Semantics: It’s Easy As 1,2,3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK