df_resources.md
source link: https://gist.github.com/johnhw/f8565f989bdcd1a3bc8e16443ab24db6
A complete list of books, articles, blog posts, videos and neat pages that support Data Fundamentals (H), organised by Unit.
Formatting
If the resource is available online (legally) I have included a link to it. Each entry has symbols following it.
- ⨕⨕⨕ indicates difficulty/depth, from ⨕ (easy to pick up intro, no background required) through ⨕⨕⨕⨕⨕ (graduate level textbook, maths heavy, expect equations)
- indicates a particularly recommended resource; is a very strongly recommended resource and you should look at it.
General
- SciPy lecture notes introduces all of the scientific Python infrastructure and overlaps substantially with Data Fundamentals (H) ⨕⨕⨕
- Learning AI if you suck at math ⨕⨕
Mathematical notation
- Mathematical Notation: A Guide for Engineers and Scientists by Edward R. Scheinerman covers all of the mathematical notation (and more) that we will use in a very concise form. ⨕
- Deep learning notation covers much of the same terminology and symbols. ⨕⨕
- Math As Code: A cheatsheet for Mathematical Notation. A really nice explanation of mathematical notation in terms of simple code (in JavaScript, but easily applicable) ⨕⨕
Excerpt from Math as Code:
The big Greek Σ (Sigma) is for Summation. In other words: summing up some numbers.

$$\sum_{i=1}^{100} i$$

Here, $i=1$ says to start at 1 and end at the number above the Sigma, 100. These are the lower and upper bounds, respectively. The i to the right of the "E" tells us what we are summing. In code:

    var sum = 0
    for (var i = 1; i <= 100; i++) {
      sum += i
    }

The result of sum is 5050.
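Since the course itself uses Python, the same summation can be sketched in Python. This is an illustrative translation, not part of the Math as Code excerpt; the function name `sum_to` is made up for this example.

```python
def sum_to(n):
    """Sum the integers 1..n with an explicit loop, mirroring the JavaScript above."""
    total = 0
    for i in range(1, n + 1):  # range() excludes its upper bound, hence n + 1
        total += i
    return total


loop_result = sum_to(100)

# The built-in sum() performs the same reduction over the range directly.
builtin_result = sum(range(1, 101))

# Closed form for this particular sum: n(n + 1) / 2.
closed_form = 100 * (100 + 1) // 2

print(loop_result, builtin_result, closed_form)  # 5050 5050 5050
```

All three agree with the result quoted in the excerpt. The loop form maps most directly onto the Σ notation: the lower bound, the upper bound and the summed term each appear explicitly.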
Python
If you don't know any Python, you will need to learn some.
You will need to know:
- basic syntax: expressions and function calls
- printing
- lists
- dictionaries
- basic iteration (for, while)
- functions, parameters
- (maybe) list comprehensions
You will not need to know:
- classes
- exceptions
- file handling
- or anything more advanced
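As a rough self-check, the short sketch below touches each item on the "need to know" list. The values are arbitrary and the function name `mean` is only for illustration; if everything here reads comfortably, your Python is probably sufficient for the course.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    total = 0
    for v in values:              # basic iteration with for
        total += v
    return total / len(values)    # an expression using a function call


scores = [3, 5, 8, 13]            # a list
scores.append(21)                 # calling a method on the list
print("mean:", mean(scores))      # printing; 50 / 5 gives 10.0

# A dictionary mapping unit numbers to unit titles
unit_titles = {1: "Vectorized computation I", 3: "Visualisation"}
for number, title in unit_titles.items():
    print(number, title)

countdown = 3
while countdown > 0:              # basic iteration with while
    print(countdown)
    countdown -= 1

# (maybe) a list comprehension: squares of the odd scores only
squares = [x * x for x in scores if x % 2 == 1]
print(squares)                    # [9, 25, 169, 441]
```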
References
- Python cheat sheet A quick reference card.
- learnxinyminutes Python 3 A very concise reference
- Python for data science cheat sheet A quick reference card with a data science focus.
- "Think Python!" by Allen Downey A full textbook on Python. Easy to read.
- Try the online tutorials at LearnPython
Jupyter
We'll be using Jupyter for everything in DF(H). While it's not hard to learn, there are some guides:
Cheat sheets and API references
Quick references for getting stuck and coding things up. This covers NumPy and Matplotlib, the two key software libraries we use in DF(H).
Unit 1: Vectorized computation I
Unit 2: Vectorized computation II
Articles on floating point
Advanced NumPy
Unit 3: Visualisation
Aesthetics
Uncertainty
- [The Hacker's Guide to uncertainty visualisation](https://erikbern.com/2018/10/08/the-hackers-guide-to-uncertainty-estimates.html) ⨕
- Understanding the Box plot Very thorough discussion of what Box plots are and how they should be used. ⨕⨕
Matplotlib
Visualisation
Example visualisations
- Randal Olson's blog has many, many examples of good visualization, mainly using Python for graph preparation. ⨕
Books
- Layered Grammar of Graphics (long, but detailed) ⨕⨕⨕
- The Grammar of Graphics, Leland Wilkinson, Second ed. ⨕⨕⨕⨕
- How to Lie with Statistics, Darrell Huff (short, easy to read, worth reading) ⨕
- Information Visualization: Perception for Design, Colin Ware: a serious book on advanced visualisations. ⨕⨕⨕
- The "Tufte" books
  - The Visual Display of Quantitative Information by Edward Tufte ⨕⨕⨕
  - Visual Explanations: Images and Quantities, Evidence and Narrative by Edward Tufte ⨕⨕⨕
  - Envisioning Information by Edward Tufte ⨕⨕⨕
Unit 4: Computational Linear Algebra I
Primers
High-dimensional spaces
This can be mind-bending. Some further reading and viewing:
Videos
Texts
Books
- Introduction to Applied Linear Algebra, Stephen Boyd and Lieven Vandenberghe (freely available) ⨕⨕⨕
- Coding the Matrix, Philip N. Klein An excellent and thorough introduction to linear algebra through Python programming ⨕⨕⨕
- Linear Algebra Done Right, Sheldon Axler a more pure mathematics perspective ⨕⨕⨕
Unit 5: Computational Linear Algebra II
Eigenvectors
Beyond the course
The SVD
Books
- The Matrix Cookbook, Kaare Brandt Petersen and Michael Syskind Pedersen. If you need to do a tricky calculation with matrices, this book will probably tell you how to do it. ⨕⨕⨕⨕⨕
- Introduction to Linear Algebra Gilbert Strang The standard textbook on linear algebra⨕⨕⨕⨕
- A First Course in Numerical Methods Uri M. Ascher and Chen Greif⨕⨕⨕⨕
Unit 6: Numerical Optimization I
- On the Origin of Circuits covers genetic algorithms ⨕
- Khan Academy: Multivariable calculus, particularly "Thinking about multivariable functions", "Derivatives of multivariable functions" and "Applications of multivariable derivatives"
- Why have Sex? Information Acquisition and Evolution ⨕⨕⨕⨕
Books
- When least is best: How Mathematicians Discovered Many Clever Ways to Make Things as Small (or as Large) as Possible by Paul J. Nahin An interesting and mathematically thorough description of the history of optimisation from a mathematical standpoint.⨕⨕⨕⨕
- The Blind Watchmaker Richard Dawkins An excellent popular science book on how evolution (genetic algorithms in the wild) can work, including some early computer simulations.
Unit 7: Numerical Optimization II
Gradient descent
Automatic differentiation
Pareto optimality
Unit 8: Probability & Stochastics I
Probability
Bayesian thinking and Bayes' rule
Beyond the course
These provide a formal basis for probability theory, if you feel more comfortable having a rigorous mathematical basis. These go way beyond the course.
Books
- Probability and statistics cookbook ⨕⨕⨕ Like the Matrix Cookbook, this provides a dense, quick reference to many problems in statistics and probability.
- Think Bayes, Allen B. Downey light, Python focused⨕⨕
- All of Statistics: A Concise Course in Statistical Inference Larry Wasserman *Outstanding; the best of these books, but somewhat maths heavy.*⨕⨕⨕⨕⨕
- Chapters 2 and 3 of Information Theory, Inference, and Learning Algorithms by David MacKay ⨕⨕⨕⨕
- **A First Course in Probability** by Sheldon Ross (standard textbook on probability) ⨕⨕⨕
Beyond the course
- Probability theory: the logic of science by E. T. Jaynes an excellent but controversial and very technical book⨕⨕⨕⨕⨕
- Information Theory, Inference, and Learning Algorithms, David MacKay Also excellent and covers many interesting relations between probability, information and learning ⨕⨕⨕⨕⨕
- Introduction to statistical learning (outstanding introduction to statistical learning, including a book, video and course notes) ⨕⨕⨕⨕
Unit 9: Probability & Stochastics II
Books
- Bayesian methods for Hackers a full "book" on Bayesian methods and inference ⨕⨕⨕⨕
Unit 10: Digital Signals and Time Series
- Sampling, Quantization and Encoding (short introduction to sampling and quantization) ⨕⨕
- DSP for the Braindead (not actually for the braindead, in fact much more advanced than we cover here!) ⨕⨕⨕
Books
- The Scientist and Engineer's Guide to Signal Processing http://dspguide.com/ (free, online book) ⨕⨕⨕
- Digital Signal Processing: A Computer Science Perspective, Jonathan Y. Stein A great introduction for CS students, but fantastically expensive.