Awesome Jupyter Notebooks
Related links:
🔗 note/Awesome Jupyter Notebooks
🔗 app/Jupyter Notebook Apps
Important contribution instructions: If you add new content, please ensure that any notebook you link to points to the rendered version on nbviewer rather than the raw file. Paste the notebook URL into the nbviewer box and copy the resulting URL of the rendered version. This lets visitors view the new content immediately.
Note that Matt Davis has conveniently written a set of bookmarklets and extensions to make it a one-click affair to load a Notebook URL into your browser of choice, directly opening into nbviewer.
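For contributors who prefer to build the rendered link programmatically, the following is a minimal sketch of that conversion. It rests on assumptions not stated on this page: that nbviewer is served at `https://nbviewer.org`, and that it accepts the path prefixes `/github/`, `/gist/`, `/url/` (for http) and `/urls/` (for https). The helper name `to_nbviewer` is hypothetical, so check the result in a browser before adding the link.

```python
from urllib.parse import urlparse

# Assumption: the public nbviewer instance lives at this address.
NBVIEWER = "https://nbviewer.org"


def to_nbviewer(url: str) -> str:
    """Map a notebook URL to its assumed nbviewer address (hypothetical helper)."""
    parts = urlparse(url)
    host = parts.netloc.lower()
    path = parts.path.lstrip("/")
    if host == "github.com":
        # github.com/<user>/<repo>/blob/<ref>/<file> keeps its path under /github/
        return f"{NBVIEWER}/github/{path}"
    if host == "gist.github.com":
        # gist.github.com/<user>/<id> keeps its path under /gist/
        return f"{NBVIEWER}/gist/{path}"
    # Any other host: /urls/ for https sources, /url/ for plain http
    prefix = "urls" if parts.scheme == "https" else "url"
    return f"{NBVIEWER}/{prefix}/{host}/{path}"


if __name__ == "__main__":
    print(to_nbviewer("https://github.com/user/repo/blob/main/demo.ipynb"))
```

The user, repository and file names in the usage line are placeholders. The sketch only rewrites the address and does not verify that the notebook exists or renders.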
Table of Contents
- Entire books or other large collections of notebooks on a topic
- Scientific computing and data analysis with the SciPy Stack
  - General topics in scientific computing
  - Social data
  - Psychology and Neuroscience
  - Machine Learning, Statistics and Probability
  - Physics, Chemistry and Biology
  - Economics and Finance
  - Earth science and geo-spatial data
  - Data visualization and plotting
  - Mathematics
  - Signal, Sound and Image Processing
  - Natural Language Processing
  - Pandas for data analysis
- General Python Programming
- Miscellaneous topics about doing various things with the Notebook itself
- Reproducible academic publications
- Other publications using the Notebook
- Data-driven journalism
- Whimsical notebooks
- Videos of IPython being used in the wild
- Accessing an IBM quantum computer via notebooks
- Software Architecture
Entire books or other large collections of notebooks on a topic
Introductory Tutorials
- First things first, how to run code in the notebook. There is also a general collection of notebooks from IPython. Another useful one from this collection is an explanation of our rich display system.
- A great matplotlib tutorial, part of the fantastic Lectures on Scientific Computing with Python by J.R. Johansson.
- The code of the IPython mini-book by C. Rossant, introducing IPython, NumPy, SciPy, Pandas and matplotlib for interactive computing and data visualization.
- Python Tutorial by Rajath Kumar M P
- Growing Neat Software Architecture from Jupyter Notebooks, a primer by Guillaume Chevalier on how to build clean software using notebooks.
Programming and Computer Science
- Automata and Computability using Jupyter, an entire course based on a forthcoming book published by Taylor and Francis: "Automata and Computability: Programmer's Perspective", by Ganesh Gopalakrishnan, Professor, School of Computing, University of Utah, Salt Lake City. [in English, has YouTube videos]
- Introduction to Programming (using Python), an entire introductory Python course written by Eric Matthes. This post explains the educational context in an Alaskan high school where Eric is a teacher.
- Numeric Computing is Fun, a series of notebooks created to help educate aspiring computer programmers and data scientists of all ages with no previous programming experience.
- Python for Developers, a complete book on Python programming by Ricardo Duarte. The book is also available in Portuguese (website translated into English).
- CS1001.py - Extended Introduction to Computer Science. Recitations from Tel-Aviv University introductory course to computer science, assembled as IPython notebooks by Yoav Ram.
- Exploratory Computing with Python, a set of 15 Notebooks that cover exploratory computing, data analysis, and visualization. No prior programming knowledge required. Each Notebook includes a number of exercises (with answers) that should take less than 4 hours to complete. Developed by Mark Bakker for undergraduate engineering students at the Delft University of Technology.
- Understanding evolutionary strategies and covariance matrix adaptation, from the Advanced Evolutionary Computation: Theory and Practice course by Luis Martí.
- Code Katas in Python, a collection of algorithmic and data structure exercises covering search and sorting algorithms, stacks, queues, linked lists, graphs, backtracking and greedy problems.
- Jupyter notebook activities for Part IA of the computing course (Michaelmas Term) in the Engineering Tripos at University of Cambridge, by Garth Wells.
- Introduction to Python for Computational Science and Engineering (Hans Fangohr): Textbook for beginners, broken into one Jupyter Notebook per chapter. Can be executed and interacted with online using Binder.
Statistics, Machine Learning, and Data Science
- Predicting PewDiePie's daily subscribers using Linear Regression, a notebook which explains the implementation of Linear Regression from scratch, by Tanu Nanda Prabhu, author and editor at Towards Data Science.
- Top Python Libraries Used In Data Science, a notebook which explains the important libraries used in data science, by Tanu Nanda Prabhu, author and editor at Towards Data Science.
- Web scraping using Python with BeautifulSoup and Requests libraries, a notebook which explains scraping data from the internet from scratch, by Tanu Nanda Prabhu, author and editor at Towards Data Science.
- Exploratory data analysis in Python, a notebook which explains the steps to perform exploratory data analysis in Python from scratch, by Tanu Nanda Prabhu, author and editor at Towards Data Science.
- An introductory notebook on uncertainty quantification and sensitivity analysis developed for the [Workshop On Uncertainty Quantification And Sensitivity Analysis For Cardiovascular Modeling](https://sathercenter.berkeley.edu/peder-sather-grant/2016-grantees/) by Leif Rune Hellevik, Vinzenz Eck and Jacob T. Sturdy.
- Python Data Science Handbook Supplemental Materials, a collection of notebooks by Jake VanderPlas to accompany the book.
- Data Cleaning using Python with Pandas Library, a Data Science notebook which explains data cleaning using Python with the Pandas library at a beginner level, by Tanu Nanda Prabhu.
- Social Network Analysis: From Graph Theory to Applications with Python. A tutorial on network creation, analysis, information flow and influence maximization with Networkx by Dima Goldenberg.
- "ISP": Introduction to Statistics with Python, a collection of notebooks accompanying the book of the same name, by Thomas Haslwanter.
- Notebooks for the exercises in Andrew Ng's online ML course, Spark and TensorFlow, as well as extra material on other tools from the scipy stack, by John Wittenauer.
- AM207: Monte Carlo Methods, Stochastic Optimization: a complete course by Verena Kaynig-Fittkau and Pavlos Protopapas from Harvard, with all lecture materials and homework sets as notebooks.
- An introduction to Bayesian inference, this is just chapter 1 in an ongoing book titled Probabilistic Programming and Bayesian Methods for Hackers Using Python and PyMC, by Cameron Davidson-Pilon.
- Doing Bayesian Data Analysis: Python/PyMC3 code for a selection of models and figures from the book 'Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan', Second Edition, by John Kruschke (2015).
- Learn Data Science, an entire self-directed course by Nitin Borwankar.
- IPython Cookbook by Cyrille Rossant, a comprehensive guide to Python for Data Science. The code of the 100 recipes is available on the GitHub repository.
- An introduction to machine learning with Python and scikit-learn (repo and overview) by Hannes Schulz and Andreas Mueller.
- A progressive collection of notebooks from the Machine Learning course by the University of Turin (with exercises).
- Clustering and Regression, part of the UC Berkeley 2014 Introduction to Data Science course taught by Michael Franklin.
- Neural Networks, part of a collection on machine learning by Aaron Masino.
- An introduction to Pandas, part of an 11-lesson tutorial on Pandas, by Hernán Rojas.
- Data Science and Big Data with Python by Steve Phelps.
- Big Data for beginners, a collection of notebooks on Hadoop, MapReduce, Spark.
- The Statsmodels Project has two excellent collections of examples: in their official documentation and extra ones in their wiki. Too many there to directly duplicate here, but they provide great learning materials on statistical modeling with Python.
- Machine Learning with the Shogun Toolbox. This is a complete service that includes a ready-to-run IPython instance with a collection of notebooks illustrating the use of the Shogun Toolbox. Just log in and start running the examples.
- Python for Data Analysis, an introductory collection from the CU Boulder Research Computing Group.
- The Kaggle bulldozers competition example, one of a set of tutorials on exploratory data analysis with the copper toolkit by Daniel Rodríguez.
- Understanding model reliability, part of a complete course on statistics and data analysis for psychologists by Michael Waskom.
- Graphical Representations of Linear Models, an illustration of the Seaborn statistical visualization library, that also includes Visualizing distributions of data and Representing variability in timeseries plots. By Michael Waskom.
- Desperately Seeking Silver, one of the homework sets for Harvard's CS 109 Data Science course.
- The classic 'An Introduction to Statistical Learning with Applications in R' by James, Witten, Hastie, Tibshirani (2013), has not one but two collections of notebooks to accompany the book with Python (instead of the book's default R examples). One by Jordi Warmenhoven and one by Matt Caudill.
- Python Notebooks for StatLearning Exercises, Python implementations of the R labs for the StatLearning: Statistical Learning online course from Stanford University taught by Profs Trevor Hastie and Rob Tibshirani.
- Applied Predictive Modeling with Python, Python implementations of the examples (originally written in R) from a famous introductory book, Applied Predictive Modeling, by Max Kuhn and Kjell Johnson.
- A collection of four courses in foundations of data science, algorithms and databases from multiple faculty at Columbia University's Lede Program.
- SciPy and OpenCV as an interactive computing environment for computer vision by Thiago Santos, a tutorial presented at SIBGRAPI 2014.
- Kalman and Bayesian Filters in Python, by Roger Labbe.
- Adaboost for digit classification, by Shashwat Shukla. A complete implementation of Adaboost in Python, with code for digit recognition.
- An example machine learning notebook, by Randal S. Olson, part of a collection in Data Analysis and Machine Learning.
- Pandas .head() to .tail(), an in-depth tutorial on Pandas by Tom Augspurger.
- Apache SINGA tutorial. A Python tutorial for deep learning with SINGA.
- Data Science Notebooks, a frequently updated collection of notebooks on statistical inference, data analysis, visualization and machine learning, by Donne Martin.
- ETL with Python, a tutorial for ETL (Extract, Transform and Load) using the Python petl package, loading to MySQL and working with CSV files, by Dima Goldenberg.
- the-elements-of-statistical-learning, a collection of notebooks that implement the algorithms and reproduce the graphics found in the book "The Elements of Statistical Learning" by Trevor Hastie, Robert Tibshirani and Jerome Friedman, along with a summary of the textbook.
- Classification on raw time series in TensorFlow with a LSTM RNN, by Guillaume Chevalier.
- Signal forecasting with a Sequence-to-Sequence (seq2seq) Recurrent Neural Network (RNN) model in TensorFlow, by Guillaume Chevalier.
- A Coding Kata using Notebooks in Google Colab: Achieve Clean Machine Learning From Dirty Code.
- Interactive Machine Learning Experiments, a collection of notebooks that use convolutional neural networks (CNNs), recurrent neural networks (RNNs) and multilayer perceptrons (MLPs) to solve basic machine learning tasks such as object detection and classification, sequence-to-sequence prediction, etc.
Mathematics, Physics, Chemistry, Biology
- A single-atom laser model. This is one of a complete set of lectures on quantum mechanics and quantum optics using QuTiP by J.R. Johansson.
- 2-d rigid-body transformations. This is part of Scientific Computing in Biomechanics and Motor Control, a complete collection of notebooks by Marcos Duarte.
- Astrophysical simulations and analysis with yt: a collection of example notebooks on using various codes that yt interfaces with: Enzo, Gadget, RAMSES, PKDGrav and Gasoline. Note: the yt site currently throws an SSL warning, they seem to have an outdated or self-signed certificate.
- Working with Reactions, part of a set of tutorials on cheminformatics and machine learning with the rdkit project, by Greg Landrum.
- CFD Python: 12 steps to Navier-Stokes. A complete set of lectures on Computational Fluid Dynamics, from 1-d linear waves to full 2-d Navier-Stokes, by Lorena Barba.
- Pytherm - Applied Thermodynamics. Lectures on applied thermodynamics using Python and the SciPy ecosystem, by ATOMS.
- AeroPython: Aerodynamics-Hydrodynamics with Python, a complete course taught at George Washington University by Lorena Barba.
- Practical Numerical Methods with Python, a collection of learning modules (each consisting of several IPython Notebooks) for a course in numerical differential equations taught at George Washington University by Lorena Barba. Also offered as a "massive, open online course" (MOOC) on the GW SEAS Open edX platform.
- Get Data Off the Ground with Python by Lorena Barba: Learn to interact with Python and handle data with Python; assumes no coding experience and creates a foundation in programming applied to technical contexts. With an accompanying online course.
- Take Off with Stats in Python by Lorena Barba: Hands-on data analysis using a computational approach and real-life applications. With an accompanying online course.
- Tour the dynamics of change and motion by Lorena Barba: Tour of the dynamics of change and motion using computational thinking with Python. With an accompanying online course.
- pyuvvis: tools for explorative spectroscopy, a spectroscopy library built for integration with IPython notebooks, matplotlib and pandas.
- HyperPython: a practical introduction to the solution of hyperbolic conservation laws, a course by David Ketcheson.
- An Introduction to Applied Bioinformatics: Interactive lessons in bioinformatics, by Greg Caporaso.
- Colour science computations with colour, a Python package implementing a comprehensive number of colour theory transformations and algorithms supported by a dedicated collection of IPython Notebooks. More colour science related IPython Notebooks are available on colour-science.org.
- The notebooks from the book Bioinformatics with Python Cookbook, covering several fields like Next-Generation Sequencing, Population Genetics, Phylogenetics, Genomics, Proteomics and Geo-referenced information.
- Learning Population Genetics in an RNA world is an interactive notebook that explains basic population genetics tools and techniques by building an in silico evolutionary model of RNA molecules.
- An open RNA-Seq data analysis pipeline tutorial with an example of reprocessing data from a recent Zika virus study. This notebook fully reproduces the research published in this paper. The notebook uses mostly python but includes some bash and R as well and is relevant for researchers in bioinformatics and public health.
- Lung Cancer Post-Translational Modification and Gene Expression Regulation. This Python notebook uses the Jupyter-widget Clustergrammer-Widget to visualize hierarchical clustering of gene expression and post-translational modification data from 37 lung cancer cell lines as an interactive heatmap. The notebook is part of the research project from this paper.
- Materials Science in Python using pymatgen. A series of python notebooks using the pymatgen package and materials project API for materials science.
- Applied Elasticity: A collection of notebooks used to determine solutions to some classical 2D elasticity problems. These were mostly live coded during class hours by Jeevanjyoti Chakraborty as part of the course "Applied Elasticity" in the Mechanical Engineering Department of the Indian Institute of Technology Kharagpur.
Earth Science and Geo-Spatial data
- EarthPy, a collection of IPython notebooks with a focus on Earth Sciences, from whale tracks to the flow of the Amazon.
- Python for Geosciences, a tutorial series aimed at the Earth Sciences community, by Nikolay Koldunov.
- Find graffiti close to NY subway entrances, one of a rich collection of notebooks on large-scale data analysis, by Roy Hyunjin Han.
- Logistic models of well switching in Bangladesh, part of the "Will it Python" blog series (repo) on Machine Learning and data analysis in Python. By Carl Vogel.
- Estimated likelihood of observing a large earthquake on a continental low‐angle normal fault and implications for low‐angle normal fault activity, an executable version of a paper by Richard Styron and Eric Hetland published in Geophysical Research Letters, on earthquake probabilities.
- python4oceanographers, a blog demonstrating analyses in physical oceanography from resource-demanding numerical computations with functions in compiled languages to specialized tidal analysis to visualization of various geo data using fancy things like interactive maps.
- Machinalis has a public repo with material support for geospatial-data processing related blog posts. It includes notebooks about Object Based Image Analysis and irrigation circles detection.
- seismo-live is a collection of live Jupyter notebooks for seismology. It includes a fairly large number of notebooks on how to solve the acoustic and elastic wave equation with various different numerical methods. Additionally it contains notebooks with an extensive introduction to data handling and signal processing in seismology, and notebooks tackling ambient seismic noise, rotational and glacial seismology, and more.
- Geo-Python is an introduction to programming in Python for Bachelors and Masters students in geo-fields (geology, geophysics, geography) taught by members of the Department of Geosciences and Geography at University of Helsinki, Finland. Course lessons and exercises are based on Jupyter notebooks and open for use by any interested person.
Linguistics and Text Mining
- Workshop on text analysis by Neal Caren.
- Detecting Algorithmically Generated Domains, part of the Data Hacking collection on security-oriented data analysis with IPython & friends.
- Mining the Social Web (3rd Edition). A complete collection of notebooks accompanying Matthew Russell and Mikhail Klassen's book by O'Reilly.
Engineering Education
- Introduction to Chemical Engineering Analysis by Jeff Kantor. A collection of IPython notebooks illustrating topics in introductory chemical engineering analysis, including stoichiometry, generation-consumption analysis, mass and energy balances.
- Sensors and Actuators by Andres Marrugo. A collection of Jupyter notebooks in the form of lecture notes and engineering calculations for the course IMTR 1713 Sensors and Actuators taught at the Universidad Tecnológica de Bolívar.
Scientific computing and data analysis with the SciPy Stack
General topics in scientific computing
- Algorithms in IPython notebooks, by Sebastian Raschka
- Comparing the performance of Python compilers - Cython vs. Numba vs. Parakeet, by Sebastian Raschka
- A Crash Course in Python for Scientists, by Sandia's Rick Muller.
- A gentle introduction to scientific programming in Python, biased towards biologists, by Mickey Atwal, Cold Spring Harbor Laboratory.
- Python for Data Science, a self-contained mini-course with exercises, by [Joe McCarthy](https://interrelativity.com/joe).
- First few lectures of the UW/Coursera course on Data Analysis. (Repo) by Chris Fonnesbeck.
- CythonGSL: a Cython interface for the GNU Scientific Library (GSL) (Project repo), by Thomas Wiecki.
- Introduction to numerical computing with numpy by Steve Phelps.
- Using Numba to speed up numerical codes. And another Numba example: self-organizing maps.
- Numpy performance tricks, and blog post, by Cyrille Rossant.
- IPython Parallel Push/Execute/Pull Demo by Justin Riley.
- Understanding the design of the R "formula" objects by Matthew Brett.
- Comparing different approaches to evolutionary simulations. Also available here for better visualization. The notebook was converted to an HTML presentation using an old nbconvert with the first development implementation of the `reveal` converter. By Yoav Ram.
- The Traveling Salesperson Problem by Peter Norvig.
- A git tutorial targeted at scientists by Fernando Perez.
- Running MATLAB in an IPython Notebook, using pymatbridge.
- Interactive Curve-Fitting. The `lmfit` package provides a widget-based interface to the curve-fitting algorithms in SciPy.
- A visual guide to the Python Spark API for distributed computing by Jeff Thompson
- A tutorial on Map-Reduce programming with Apache Spark and Python by Steve Phelps.
- CodeCombat gridmancer solver by Arn-O. This notebook explains how to improve a recursive tree search with a heuristic function and find the minimum solution to the gridmancer.
Social data
- Survival Analysis, an illustration of the lifelines library, by Cam Davidson Pilon.
- A reconstruction of Nate Silver's 538 model for the 2012 US Presidential Election, by Skipper Seabold (complete repo).
- Data about the Sandy Hook massacre in Newtown, Connecticut, which accompanies a more detailed blog post on the subject. Here are the notebook and accompanying data. By Brian Keegan.
- More on gun violence analysis with Wikipedia data.
- An analysis of the Gaza-Israel 2012 crisis.
- Ranking NFL Teams. The full repo also includes an explanatory slideshow. By Sean Taylor.
- Automated processing of news media and generation of associated imagery.
- An analysis of national school standardized test data in Colombia using Pandas (in Spanish). By Javier Moreno.
- Getting started with GDELT, by David Masad. GDELT is a dataset containing more than 200 million geolocated events with global coverage from 1979 to the present. Another GDELT example from David nicely integrates mapping visualizations.
- Titanic passengers, coal mining disasters, and vessel speed changes, by Christopher Fonnesbeck
- A geographic analysis of Indonesian conflicts in 2012 with GDELT, by herrfz.
- Bioinformatic Approaches to the Computation of Poetic Meter, by A. Sean Pue, C. Titus Brown and Tracy Teal.
- Analyzing the Vélib dataset from Paris, by Cyrille Rossant (Vélib is Paris' bicycle-sharing program).
- Using Python to see how the Times writes about men and women, by Neal Caren.
- Exploring graph properties of the Twitter stream with twython and NetworkX, by F. Perez (complete gist repo with utilities here.)
- Kaggle Competition: Titanic Machine Learning from Disaster. By Andrew Conti.
- How clean are San Francisco's restaurants?, a data science tutorial that accompanies a blog post from Zipfian Academy.
- NYT gender wage gap and US crime by state.
- Predicting usage of the subway system in NYC, a final project for the Udacity Intro to Data Science Course, by Asim Ihsan.
- An exploratory statistical analysis of the 2014 World Cup Final, by Ricardo Tavares. Part of a notebook collection on football (aka soccer) analysis.
- San Francisco's Drug Geography, a GIS analysis of public crime data in SF, by Lance Martin.
- Geographic Data Science is an entire course by Dani Arribas-Bel to learn to access, munge, and analyse spatial data on social phenomena.
- Analysis and visualization of a public OKCupid profile dataset using Python and Pandas, by Alessandro Giusti. Includes many colorful data visualizations.
Psychology and Neuroscience
- Cue Combination with Neural Populations by Will Adler. Intuition and simulation for the theory (Ma et al., 2006) that through probabilistic population codes, neurons can perform optimal cue combination with simple linear operations. Demonstrates that variance in cortical activity, rather than impairing sensory systems, is an adaptive mechanism to encode uncertainty in sensory measurements.
- Modeling psychophysical data with non-linear functions by Ariel Rokem.
- Visualizing mathematical models of brain cell connections. The effect of convolution of different receptive field functions and natural images is examined.
- Python for Vision Research. A three-day crash course for vision researchers in programming with Python, building experiments with PsychoPy and psychopy_ext, learning fMRI multi-voxel pattern analysis with PyMVPA, and understanding image processing in Python.
- Loading and visualizing fMRI data, part of the Functional connectivity with NiLearn course by Gaël Varoquaux.
Machine Learning, Statistics and Probability
- A tutorial introduction to machine learning with sklearn, an IPython-based slide deck by Andreas Mueller.
- Introduction to Machine Learning in Python with scikit-learn by Cyrille Rossant, a free recipe from the IPython Cookbook, a comprehensive guide to Python for Data Science.
- An introduction to Predictive Modeling in Python, by Olivier Grisel.
- Face Recognition on a subset of the Labeled Faces in the Wild dataset, by Olivier Grisel.
- An Introduction to Bayesian Methods for Multilevel Modeling, by Chris Fonnesbeck.
- Introduction to Bayesian Networks by Kui Tang
- Bayesian data analysis with PyMC3 by Thomas Wiecki.
- A collection of examples for solving pattern classification problems, by Sebastian Raschka.
- Introduction to Linear Regression using Python by Kevin Markham
- Machine learning in Python, a series based on Andrew Ng's Coursera class on machine learning. Part of a larger collection of data science notebooks by John Wittenauer.
- Probability, Paradox, and the Reasonable Person Principle, by Peter Norvig.
- How Likely Would You Give A Five-Star Review on Yelp? -- Getting Your Hands Dirty with scikit-learn, by Xun Tang. Complimentary slides.
- Geodemographic Segmentation Model, by Filipa Rodrigues
Physics, Chemistry and Biology
- Writing A Genome Assembler with blasr and (I)Python, by Jason Chin.
- Multibody dynamics and control with Python and the notebook file by Jason K. Moore.
- Manipulation and display of chemical structures, by Greg Landrum, using rdkit.
- The sound of Hydrogen, visualizing and listening to the quantum-mechanical spectrum of Hydrogen. By Matthias Bussonnier.
- Particle physics at the Large Hadron Collider (LHC): using ROOT in an LHCb masterclass: Notebook 1 and Notebook 2, by Alexander Mazurov and Andrey Ustyuzhanin at CERN.
- A Reaction-Diffusion Equation Solver in Python with Numpy, a demonstration of how IPython notebooks can be used to discuss both the theory and implementation of numerical algorithms on one page, by Georg Walther.
- Comparing different approaches to evolutionary simulations. Also available here for better visualization. The notebook was converted to an HTML presentation using an old nbconvert with the first development implementation of the `reveal` converter. By Yoav Ram.
Economics and Finance
- Replication of the highly-contentious analysis of economic growth by Reinhart and Rogoff, by Vincent Arel-Bundock, full repo here. This is based on the widely-publicized critique of the original analysis done by Herndon, Ash, and Pollin.
- fecon235 for Financial Economics series of notebooks which examines time-series data for economics and finance. Easy API to freely access data from the Federal Reserve, SEC, CFTC, stock and futures exchanges. Thus research from older notebooks can be replicated, and updated using the most current data. For example, this notebook forecasts likely Fed policy for setting the Fed Funds rate, but market sentiment across major asset classes is observable from the CFTC Commitment of Traders Report. Major economics indicators are renormalized: for example, various measures of inflation, optionally with the forward-looking break-even rates derived from U.S. Treasury bonds. Other notebooks examine international markets: especially, gold and foreign exchange.
- Fixed Income: A Structured Bond - Interactive scenarios. Sequential repayment of a bond using interactive widgets and Python in Jupyter, by Mats Gustavsson.
Earth science and geo-spatial data
- Exploring seafloor habitats: geographic analysis using IPython Notebook with GRASS & R. This embeds a slideshow and a Web Spinning Globe (Cesium) in the notebook. By Massimo Di Stefano.
- Geo-Spatial Data with IPython. Tutorial by Kelsey Jordahl from SciPy2013.
Data visualization and plotting
- Plotting pitfalls: common problems when plotting large datasets, and how to avoid them. By James A. Bednar.
- US Census data and NYC Taxi data visualized using datashader.
- A Notebook with an interactive Hans Rosling Gapminder bubble chart from Plotly.
- Data and visualization integration via web based resources. Using NetCDF, Matplotlib, IPython Parallel and ffmpeg to generate video animation from time series of gridded data. By Massimo Di Stefano.
- [21 Interactive, D3 Plots from matplotlib, ggplot for Python, prettyplotlib, Stack Overflow, and seaborn](https://nbviewer.ipython.org/gist/msund/7ac1203ded66fe8134cc).
- Visualizing complex-valued functions with Matplotlib and Mayavi, by Emilia Petrisor.
- bqplot is a d3-based interactive visualization library built entirely on top of the `ipywidgets` infrastructure. Check out the Pythonic recreation of Hans Rosling's Wealth of Nations.
- A D3 Viewer for Matplotlib Visualizations which, unlike the above, does not depend on a Plot.ly account.
- Bokeh is an interactive web visualization library for Python (and other languages). It provides d3-like novel graphics, over large datasets, all without requiring any knowledge of Javascript. It also has a Matplotlib compatibility layer.
- HoloViews lets you construct visualizations very concisely in the notebook.
- Winner of the 2014 E. Tufte Slope Graphs contest, by Pascal Schetelat. The original contest info on Tufte's site.
- matta, d3.js-based visualizations in the IPython Notebook, by Eduardo Graells-Garrido.
- Clustergrammer Interactive Heatmap and DataFrame Viewer. This Python notebook shows a simple example of how to visualize a matrix file and Pandas DataFrame as an interactive heatmap (built using D3.js) using the Jupyter Widget Clustergrammer (see paper).
- The Jupyter Widget Ecosystem - SciPy 2019 Tutorial on ipywidgets - a collection of 40 notebooks.
Mathematics
- Linear algebra with Cython. A tutorial that styles the notebook differently to show that you can produce high-quality typography online with the Notebook. By Carl Vogel.
- Exploring how smooth-looking functions can have very surprising derivatives even at low orders, combining SymPy and matplotlib. By Javier Moreno.
- A Collection of Applied Mathematics and Machine Learning Tutorials (in Turkish) and its English translation, by Burak Bayramli.
- Function minimization with iminuit, an introductory companion to their hard core tutorial. By the iminuit project.
- The Discrete Cosine Transform, a brief explanation and illustration of the math behind the DCT and its role in the JPEG image format, by Jim Mahoney.
- Chebfun in Python, a demo of PyChebfun, by Olivier Verdier. PyChebfun is a pure-python implementation of the celebrated Chebfun package by Battles and Trefethen.
- The Matrix Exponential, an introduction to the matrix exponential, its applications, and a list of available software in Python and MATLAB. By Sam Relton.
- Fractals, complex numbers, and your imagination, by Caleb Fangmeier.
- A SymPy tutorial, by Andrey Grozin.
- Introduction to Mathematics with Python, a collection of notebooks aimed at mathematicians with little or no Python knowledge. Notebooks can be selected to serve as resources for a workshop. By Vince Knight.
Signal and Sound Processing
- Simulation of Delta Sigma modulators in Python with deltasigma, a Python port of Richard Schreier's excellent MATLAB Delta Sigma Toolbox, by Giuseppe Venturini. Several demonstrative notebooks are linked from the package README.
- PyOracle: Automatic analysis of musical structure, by Greg Surges.
- A Gallery of SciPy's Window Functions for quick visual inspection and comparison, by Jaidev Deshpande.
- Poisson Image Editing - Seamless Cloning by Dhruv Ilesh Shah is a notebook that achieves seamless image cloning by employing the Poisson solver in its iterative form.
- Blind Source Separation - Cocktail Party Problem by Dhruv Ilesh Shah & Shashwat Shukla is a notebook that performs blind source separation on audio signals in an attempt to approach the Cocktail Party Problem. The problem is tackled with two different methods, FOBI and fastICA.
Natural Language Processing
- Python Programming for the Humanities by Folgert Karsdorp & Maarten van Gompel.
- News Categorization using Multinomial Naive Bayes by Andres Soto Villaverde.
- Using random cross-validation for news categorization by Andres Soto Villaverde.
- Named Entity Recognition in French biomedical text by Andrés Soto Villaverde
- Named Entity Recognition in French biomedical text (Part 2) by Andrés Soto Villaverde
Pandas for data analysis
Note that in the 'collections' section above there are also pandas-related links, such as the one for an 11-lesson tutorial.
- Python Pandas DataFrame Basics, this notebook explains the basic concepts of a pandas data frame from scratch for beginners with examples, by Tanu Nanda Prabhu.
- A 10-minute whirlwind tour of pandas, this is the notebook accompanying a video presentation by Wes McKinney, author of Pandas and the Python for Data Analysis book.
- Manipulating the data with Pandas using Python, this notebook explains various operations and methods of the Pandas library from scratch with the help of an example, by Tanu Nanda Prabhu.
- Time-series analysis with Pandas.
- Financial data analysis with Pandas.
- Clustering of smartphone sensor data for human activity detection using pandas and scipy, part of Coursera data analysis course, done in Python (repo).
- Log analysis with Pandas, part of a group presented at PyConCa 2012 by Taavi Burns.
- Analyzing and visualizing sun spot data with Pandas, by Josh Hemann. An enlightening discussion of how naive plotting choices subtly influence our interpretation of data.
- Advanced analysis of Apache logs, by Nikolay Koldunov.
- Statistical Data Analysis in Python, by Christopher Fonnesbeck, SciPy 2013. Companion videos 1, 2, 3, 4
General Python Programming
- How to get started coding in Python?, this notebook explains how to become a good Python programmer, by Tanu Nanda Prabhu, author and editor at Towards Data Science.
- Python Strings from Scratch !!!, this notebook explains Python strings from basic to advanced level, by Tanu Nanda Prabhu.
- Python Tuples from Scratch !!!, this notebook explains Python tuples from basic to advanced level, by Tanu Nanda Prabhu.
- Python Dictionary from Scratch !!!, this notebook explains Python dictionaries from basic to advanced level, by Tanu Nanda Prabhu.
- Python Lists from Scratch !!!, this notebook explains Python lists from basic to advanced level with the help of an example, by Tanu Nanda Prabhu.
- Learning to code with Python, part of an introduction to Python from the Waterloo Python users group.
- Introduction to Python for Data Scientists by Steve Phelps (part of a larger collection on Data Science and Big Data).
- Python Descriptors Demystified, an in-depth discussion of the descriptor protocol in Python, by Chris Beaumont.
- A collection of not so obvious Python stuff you should know!, by Sebastian Raschka.
- Key differences between Python 2.7.x and Python 3.x, by Sebastian Raschka.
- A beginner's guide to Python's namespaces, scope resolution, and the LEGB rule, by Sebastian Raschka.
- Sorting CSV files using the Python csv module, by Sebastian Raschka.
- Python 3 OOP series by Leonardo Giordani: Part 1: Objects and types, Part 2: Classes and members, Part 3: Delegation - composition and inheritance, Part 4: Polymorphism, Part 5: Metaclasses, Part 6: Abstract Base Classes
- How to Aggregate Subscriber's Interest using the 3 methods: (1) Python Dictionary, (2) Apache PySpark - GroupBy Transformation, and (3) Apache PySpark - ReduceBy Transformation by Abbas Taher.
Notebooks in languages other than Python
These are notebooks that use one of the IPython kernels for other languages:
Julia
The IPython protocols to communicate between kernels and clients are language agnostic, and other programming language communities have started to build support for this protocol in their language. The Julia team has created IJulia, and these are some Julia notebooks:
- Fractals 3 ways, by Jeff Bezanson.
- The Design Impact of Multiple Dispatch, a detailed explanation of Julia's multiple dispatch design, by Stefan Karpinski.
- A tutorial on making interactive graphs with Plotly and Julia.
- Numerical tours in Julia
- Functional Geometry by Shashi Gowda
- JuliaOpt notebooks, a collection of optimization-related notebooks.
- Coursework using IJulia notebooks:
  - Métodos Numéricos Avanzados (2015-2), Luis Benet and David P. Sanders
  - Métodos Monte Carlo, David Sanders
  - Linear Partial Differential Equations: Analysis and Numerics, Steven G. Johnson
  - Julia tutorial for Computational Molecular Biology, Younhun Kim and Matthew Reyna
- Other collections of IJulia notebooks:
  - Jiahao Chen
  - Christoph Ortner
- Crossing Language Barriers with Julia, Scipy, and IPython, presented at EuroSciPy '14 by Steven G. Johnson.
Haskell
There exists a Haskell kernel for IPython in the IHaskell project.
- IHaskell Demo Notebook
- Homophone reduction, a solution to a cute problem involving treating English letters as generators of a large group.
- Gradient descent typeclass, a look at how arbitrary gradient descent algorithms can be represented with a typeclass.
OCaml
iocaml is an OCaml kernel for IPython
Ruby
As with Julia, there is also a Ruby kernel for IPython.
The interactive plotting library Nyaplot has some case studies using IRuby:
Perl
- An example showcasing full use of the display protocol with the IPerl kernel.
F#
C#
- Xamarin Workbooks: create a rich C# workbook for Android, iOS, Mac, WPF, or Console, and get instant live results as you learn these APIs.
Javascript
- Two IJavascript notebooks that demonstrate how to use D3 to do computations and send an SVG back, and how to play with a virtual DOM.
Miscellaneous topics about doing various things with the Notebook itself
- Blogging With IPython in Blogger, also available in blog post form, full repo here. By Fernando Perez.
- Blogging With IPython in Octopress, by Jake van der Plas and available as a blog post. Other notebooks by Jake contain many more great examples of doing interesting work with the scientific Python stack.
- Blogging With IPython in Nikola, also available in blog post form by Damián Avila.
- Custom CSS control of the notebook, this is part of a blog repo by Matthias Bussonnier.
- IPython display hookery: tools to help display visual output from various sources, a gist by @deeplook.
- Importing IPython Notebooks as Modules by Min RK.
- Getting Started With Jupyter Notebooks for Teaching and Learning by Tony Hirst at OpenLearn
- Towards Data Science blogs:
Reproducible academic publications
This section contains academic papers that have been published in the peer-reviewed literature or pre-print sites such as the ArXiv that include one or more notebooks that enable (even if only partially) readers to reproduce the results of the publication. If you include a publication here, please link to the journal article as well as providing the nbviewer notebook link (and any other relevant resources associated with the paper).
- Revealing ferroelectric switching character using deep recurrent neural networks. Github page where code is located. Jupyter Paper. Raw Data
- Discovery of Gravitational Waves by the LIGO collaboration. That page, from the LIGO Open Science Center, contains multiple notebooks for various datasets corresponding to different events; this binder lets you run the code right away. More details on the GW150914 event as well as the original main Physical Review Letters paper, "Observation of Gravitational Waves from a Binary Black Hole Merger".
- Characterizing Strain Variation in Engineered E. coli Using a Multi-Omics-Based Workflow, by Brunk et al.
- Predicting Coronal Mass Ejections Using Machine Learning Methods by Monica Bobra and Stathis Ilonidis (Astrophysical Journal, 2016). An IPython notebook, which reproduces all the results, has been permanently deposited in the Stanford Digital Repository.
- The Paper of the Future by Alyssa Goodman et al. (Authorea Preprint, 2017). This article explains and shows with demonstrations how scholarly "papers" can morph into long-lasting rich records of scientific discourse, enriched with deep data and code linkages, interactive figures, audio, video, and commenting. It includes an interactive d3.js visualization and has an astronomical data figure with an IPython Notebook "behind" it.
- Reply to 'Influence of cosmic ray variability on the monsoon rainfall and temperature': a false-positive in the field of solar-terrestrial research by Benjamin Laken, 2015. Reviewed article will appear in JASTP. The IPython notebook reproduces the full analysis and figures exactly as they appear in the article, and is available on Github: link via figshare.
- An open RNA-Seq data analysis pipeline tutorial with an example of reprocessing data from a recent Zika virus study, by Zichen Wang and Avi Ma'ayan. (F1000Research 2016, 5:1574). An IPython notebook was used to perform the proposed RNA-Seq pipeline using public gene expression data of human cells after Zika virus infection. The computational pipeline is also version-controlled and Dockerized, available here.
- The probability of improvement in Fisher's geometric model: a probabilistic approach, by Yoav Ram and Lilach Hadany. (Theoretical Population Biology, 2014). An IPython notebook, allowing figure reproduction, was deposited as a supplementary file.
- Stress-induced mutagenesis and complex adaptation, by Yoav Ram and Lilach Hadany (Proceedings B, 2014). An IPython notebook, allowing figure reproduction, was deposited as a supplementary file.
- Automatic segmentation of odor maps in the mouse olfactory bulb using regularized non-negative matrix factorization, by J. Soelter et al. (Neuroimage 2014, Open Access). The notebook reproduces most figures from the paper and provides a deeper look at the data. The full code repository is also available.
- Multi-tiered genomic analysis of head and neck cancer ties TP53 mutation to 3p loss, by A. Gross et al. (Nature Genetics 2014). The full collection of notebooks to replicate the results.
- Dog and human inflammatory bowel disease rely on overlapping yet distinct dysbiosis networks, by Vázquez-Baeza et al. (Nature Microbiology 2016). The full collection of notebooks to replicate the results.
- powerlaw: a Python package for analysis of heavy-tailed distributions, by J. Alstott et al. Notebook of examples in manuscript, ArXiv link and project repository.
- Collaborative cloud-enabled tools allow rapid, reproducible biological insights, by B. Ragan-Kelley et al. The main notebook, the full collection of related notebooks and the companion site with the Amazon AMI information for reproducing the full paper.
- A Reference-Free Algorithm for Computational Normalization of Shotgun Sequencing Data, by C.T. Brown et al. Full notebook, ArXiv link and project repository.
- The kinematics of the Local Group in a cosmological context, by J.E. Forero-Romero et al. The full notebook and also all the data in a GitHub repo.
- Warming Ocean Threatens Sea Life, an article in Scientific American backed by a notebook for its main plot. By Roberto de Almeida from MarinExplore.
- Extrapolating Weak Selection in Evolutionary Games, by Wu, García, Hauert and Traulsen. PLOS Comp Bio paper and Figshare link.
- Using neural networks to estimate redshift distributions. An application to CFHTLenS, by Christopher Bonnett. Paper (submitted to MNRAS).
- Mechanisms for stable, robust, and adaptive development of orientation maps in the primary visual cortex, by Jean-Luc R. Stevens, Judith S. Law, Jan Antolik, and James A. Bednar. Journal of Neuroscience, 33:15747-15766, 2013. Notebook1 (https://ioam.github.io/topographica/_static/gcal_notebook.html), Notebook2.
- Accelerated Randomized Benchmarking, by Christopher Granade, Christopher Ferrie and D. G. Cory. New Journal of Physics 17 013042 (2015), arXiv, GitHub repo.
- Dynamics and associations of microbial community types across the human body, by Tao Ding & Patrick D. Schloss. Notebook replicating results.
- Variations in submarine channel sinuosity as a function of latitude and slope, by Sylvester, Z., Pirmez, C., Cantelli, A., & Jobe, Z. R.
- Frontoparietal representations of task context support the flexible control of goal directed cognition, by M.L. Waskom, D. Kumaran, A.M. Gordon, J. Rissman, & A.D. Wagner. GitHub repository, main notebook.
- pyparty: Intuitive Particle Processing in Python, by Adam Hughes. Notebook to generate the published figures. Also, check out the pyparty tutorial notebooks.
- Indication of family-specific DNA methylation patterns in developing oysters, by Claire E. Olson and Steven B. Roberts. doi: https://dx.doi.org/10.1101/012831. Notebook to generate results in the paper.
- Parallel Prefix Polymorphism Permits Parallelization, Presentation & Proof, by Jiahao Chen and Alan Edelman, HPTCDL'14. Website and notebook.
- Transcriptome Sequencing Reveals Potential Mechanism of Cryptic 3’ Splice Site Selection in SF3B1-mutated Cancers by Christopher DeBoever et al. There are several notebooks to replicate results and make figures.
- A Workflow for Characterizing Nanoparticle Monolayers for Biosensors: Machine Learning on Real and Artificial SEM Images, by Adam Hughes, Zhaowen Liu, Maryam Raftari, and Mark E. Reeves. Notebooks are linked in Table 1 in the text.
- AtomPy: An Open Atomic Data Curation Environment for Astrophysical Applications, by C. Mendoza, J. Boswell, D. Ajoku, M. Bautista.
- Visualizing 4-Dimensional Asteroids, in Scientific American (by Jake VanderPlas)
- Challenges and opportunities in understanding microbial communities with metagenome assembly, accompanied by IPython Notebook tutorial, by Adina Howe and Patrick Chain.
- Structure of a shear-line polar low (2016) by Sergeev, D. E., Renfrew, I. A., Spengler, T. and Dorling, S. R. Q.J.R. Meteorol. Soc. doi:10.1002/qj.2911. Accompanied by Notebooks to generate the published figures.
- Detecting High-Order Epistasis in Nonlinear Genotype-Phenotype Maps by Zachary R. Sailer and Michael J. Harms, published in Genetics, March 2017. All figures can be reproduced by the set of notebooks in this GitHub repo.
- Summary Analysis of the 2017 GitHub Open Source Survey by Stuart Geiger. Preprint in SocArXiv, June 2017. doi:10.17605/OSF.IO/ENRQ5. Paper is derived from a notebook converted to LaTeX with nbconvert. Notebook and materials at: OSF, GitHub, nbviewer
- The weirdest SDSS galaxies: results from an outlier detection algorithm, by D. Baron and D. Poznanski. Notebooks to replicate.
- Clustergrammer, a web-based heatmap visualization and analysis tool for high-dimensional biological data, by Nicolas Fernandez et al. Notebooks: Fig. 3, Fig. 4, Fig. 5
- Sociology: An investigation of Social Class Inequalities in General Cognitive Ability in Two British Birth Cohorts. Preprint in SocArXiv, December 2017. doi: 10.17605/OSF.IO/SZXDM. Notebook and materials at: OSF, GitHub, nbviewer.
- An on-chip architecture for self-homodyned nonclassical light, quant-ph arXiv preprint, Nov 2016, by Fischer et al. A supporting notebook for all calculations is included in the arXiv submission.
- A nested sampling code for targeted searches for continuous gravitational waves from pulsars, gr-qc arXiv preprint, May 2017, by Pitkin et al. Complete repo with supporting notebooks and sources on GitHub.
- HyperTools: A Python toolbox for visualizing and manipulating high-dimensional data, stat.OT arXiv preprint by Heusser et al. A repo with companion notebooks is available, which links to the library itself, HyperTools.
- Nonsinusoidal beta oscillations reflect cortical pathophysiology in Parkinson's disease, in Journal of Neuroscience by Cole et al. A repo with companion notebooks with all necessary data is available to reproduce all figures.
- Cycle-by-cycle analysis of neural oscillations, in bioRxiv by Cole & Voytek. A repo with companion notebooks with all necessary data is available to reproduce all figures. This repo also links to the related useful library, neurodsp, which contains notebooks of tutorials.
- pyKNEEr: An image analysis workflow for open and reproducible research on femoral knee cartilage, a preprint by S. Bonaretti et al. Jupyter notebooks are used as a graphical user interface for medical image processing and analysis. The paper is interactive, with links to data, software, and documentation throughout the text. Every figure caption contains links to fully reproduce graphs.
Data-driven journalism
- The Need for Openness in Data Journalism, by Brian Keegan.
- St. Louis County Segregation Analysis, analysis for the article The Ferguson Area Is Even More Segregated Than You Probably Guessed by Jeremy Singer-Vine.
- Size of thesis and dissertations in Quebec, by Jean-Hugues Roy (in French).
Whimsical notebooks
- XKCD-styled plots created with Matplotlib. Here is the blog post version with discussion. By Jake van der Plas.
- Van Gogh's Starry Night with ipythonblocks, part of Matt Davis' ipythonblocks. This is a teaching tool for use with the IPython notebook that provides visual elements to understand programming concepts.
- Conway's Game of Life. An interesting use of the convolution operation to calculate the next state of the game board, instead of the obvious approach of finding each cell's neighbors and filtering the board.
- pynguins. Using Jupyter Notebook, Python, and NumPy to solve the board game "Penguins on Ice".
- "People plots", stick figures generated with matplotlib.
- Reveal converter mini-tutorial, also available in blog post form. Do you want to make a static HTML/CSS slideshow straight from the IPython notebook? OK, now you can do it with the reveal converter (nbconvert). Demo by Damián Avila.
- Personal IPython Weight Notebook. Plot your weight loss with prognosis and motivation features.
- Streaming Double Pendulum Simulation in IPython NB.
- Porque Charles Xavier debe cambiar a Cerebro por Python, a study in data and gender in the Marvel comics universe, by Mai Giménez and Angela Rivera.
- Functional Geometry: a deconstruction of the MC Escher woodcut Square Limit, an IJulia notebook by Shashi Gowda.
- Solving physical puzzles with a Jupyter Notebook.
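
The convolution idea mentioned in the Conway's Game of Life entry above can be sketched in a few lines. This is only an illustrative sketch, not the linked notebook's code: it uses plain Python lists rather than NumPy, and the names `KERNEL`, `convolve_wrap` and `life_step`, as well as the wrap-around (toroidal) board edges, are assumptions made for the example.

```python
# 3x3 kernel: each of the eight neighbours contributes 1, the cell itself 0.
# The kernel is symmetric, so convolution and correlation coincide here.
KERNEL = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
]


def convolve_wrap(board, kernel):
    """Convolve `board` with a 3x3 `kernel`, wrapping around at the edges."""
    rows, cols = len(board), len(board[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    weight = kernel[dr + 1][dc + 1]
                    total += weight * board[(r + dr) % rows][(c + dc) % cols]
            out[r][c] = total
    return out


def life_step(board):
    """Return the next generation; the convolution yields neighbour counts."""
    counts = convolve_wrap(board, KERNEL)
    return [
        # A cell is alive next turn with exactly 3 neighbours,
        # or if it is alive now and has exactly 2 neighbours.
        [1 if (n == 3 or (cell == 1 and n == 2)) else 0
         for cell, n in zip(row, count_row)]
        for row, count_row in zip(board, counts)
    ]


# A vertical "blinker" flips to horizontal and back again.
blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
next_board = life_step(blinker)
```

With NumPy and SciPy available, the hand-written loop would typically be replaced by a library 2D convolution; the rule applied to the neighbour counts stays the same.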
Videos of IPython being used in the wild
Of course, the first thing you might try is searching for videos about IPython (1900 or so by last count on YouTube), but there are also demonstrations of other applications that use the power of IPython without mentioning it in their descriptions. Below are a few such:
- Video on how to learn Python featuring IPython as the platform of choice for learning!
- This video shows IPython being used in the scikit-learn project
- He does not show IPython in use but his IPython sticker is clear for the entire video: Planning and Tending the Garden: The Future of Early Childhood Python Education
- Wes McKinney's speech on Python and data analysis features IPython, as does his book Python for Data Analysis.
- This video shows Plotly and IPython in use at a Montreal Python meetup.
Accessing and programming an IBM quantum computer via notebooks
- Github notebook example (scroll down) illustrating how to use Qiskit and access the IBMQ quantum computers.