Me, Myself, and I

Hi, I'm Matthias. You may not hear it while reading this, but I have a relatively strong French Akeussent.

What do you like to be known for?

I mostly like to be known for writing tools that get out of your way. That is a bit of a bummer, because as a scientist I'm mostly judged on my publications, while I prefer to enable others to do their work more efficiently and to give them ideas.

One of the pieces of software I'm most proud of is IPython. I did not create it (Fernando did), but I think the current version I maintain with Thomas and Min is one of the best tools for Python programmers and Python scientists since sliced bread.

IPython is also the canonical kernel implementation for Jupyter and the Jupyter notebook. I'm really happy to have written a large part of the notebook codebase, and I'm quite happy with a lot of small details in the notebook UI that just get out of your way and do the right thing. I know that most people don't notice, so I'm pretty happy when people like Wes McKinney make comments during their tutorials like "I don't know who implemented that, but thank you".

What's your background?

My story is a little unusual. You probably don't know much about the French education system, and even if you did, I went through a pretty complicated path. I went to what is called "Classes Préparatoires", where I learned Physics, Chemistry, Engineering, Math, and a bit of Computing (PCSI, PC* if you want to know). I really enjoyed being challenged and trying to solve problems I knew I did not know the answers to. (When you know you know the answer, it's kinda demotivating.)

I got accepted to ENS-Cachan, where I learned a lot about experimental physics and, to a lesser extent, chemistry. I got really interested in particle physics and quantum mechanics, and ended up doing an internship at SLAC, alone in front of a computer for 4 months. It was enough to convince me that I definitely did not want to write software in a corner, to be potentially published a few years later. I prefer to work on something I can mostly control (or at least comprehend) and that is more at human scale. So I switched to Fluid Mechanics (in which I got my Master's degree).

In the meantime I took the equivalent of a sabbatical year and passed the "Agrégation of Physics". According to Wikipedia:

In France, the agrégation (French pronunciation: [aɡʁeɡasjɔ̃]) is the most prestigious and selective civil service competitive examination for the public education system. The laureates are known as agrégés, and the candidates agrégatifs.

More practically, at least in my case, it was a year where I had a chance not only to learn in depth about a wide range of subjects in Physics, Electronics and Chemistry, but also to gain knowledge about their history and how they are taught.

I ended up becoming a laureate of the Agrégation, if that has any meaning. Sitting an exam where the subject is three words, "measure in physics", and having 4 hours to come up with a 1h lecture and its experiments was probably one of the best and worst experiences of my life at the same time.

After that I ended up starting a PhD in biophysics. Basically, messing around with my own microscope, lasers, and cell components to measure the mechanical properties of stuff (Actin gels mechanics, you can read it here if you like). Having a relatively controlled experiment, where I could influence almost anything from the raw material to the end product and the processes used all along the pipeline, was great. It was around then that I realized that a large missing piece in science is people who know how, and care, to build good tools. I ended up graduating after 3 years, but that's not the most interesting part of my PhD.

Early on (end of 2010), I started to use and contribute to IPython. Having a non-negligible background in messing around with computers, I realized that MATLAB was not the best choice to manage and maintain my custom tooling for 3 years, so I decided to refactor things in Python, which was not yet my language of choice but was close to what I thought would be a great fit.

I got involved with IPython and started to contribute regular improvements and fixes as I encountered them in my day-to-day workflow. While the initial time spent writing good, modular, tested code (and contributing fixes upstream) looked like a large chunk of time, it ended up being beneficial in the long run. In the last third of my PhD I was able to run analyses or test hypotheses in a matter of hours instead of days, while still performing experiments.

My involvement with IPython allowed me to become a core developer, travel to UC Berkeley, and attend a couple of conferences (like SciPy) that I would not have attended otherwise. This made me aware of how the tools I was developing mostly for myself had an impact on thousands of people. I understood better that if I spend an hour making 60 scientists do their job 1 minute faster, then my global impact on Science is positive.

Also right now my background is beige for the top half of the wall.

More Casual CV:

Awards and Fellowships

  • Moore Data Fellow – Moore Data-Driven Discovery
  • AXA Research Fund Doctoral Fellowship 2010-2014

Publications and Presentations

Peer-reviewed articles

  • Active mechanics in living oocytes reveal molecular-scale force kinetics – Biophysical Journal – 2016 – Ahmed, Wylie; Fodor, Etienne; Almonacid, Maria; Bussonnier, Matthias; Verlhac, Marie-Helene; Gov, Nir; Visco, Paolo; van Wijland, Frederic; Betz, Timo;
  • Nonequilibrium dissipation in living oocytes – EPL (Europhysics Letters) – 2016 – Fodor, Étienne; Ahmed, Wylie W; Almonacid, Maria; Bussonnier, Matthias; Gov, Nir S; Verlhac, M-H; Betz, Timo; Visco, Paolo; van Wijland, Frédéric;
  • Cell-sized liposome doublets reveal active tension build-up driven by acto-myosin dynamics – Soft Matter – 2016 – Caorsi, V; Lemière, J; Campillo, C; Bussonnier, M; Manzi, J; Betz, T; Plastino, J; Carvalho, K; Sykes, C;
  • Jupyter Notebooks—a publishing format for reproducible computational workflows – Positioning and Power in Academic Publishing: Players, Agents and Agendas – 2016 – Kluyver, Thomas; Ragan-Kelley, Benjamin; Pérez, Fernando; Granger, Brian; Bussonnier, Matthias; Frederic, Jonathan; Kelley, Kyle; Hamrick, Jessica; Grout, Jason; Corlay, Sylvain;
  • Quantification of collagen contraction in three-dimensional cell culture – Methods in cell biology – 2015 – Kopanska, Katarzyna S; Bussonnier, Matthias; Geraldo, Sara; Simon, Anthony; Vignjevic, Danijela; Betz, Timo;
  • Active diffusion positions the nucleus in mouse oocytes – Nature cell biology – 2015 – Almonacid, Maria; Ahmed, Wylie W; Bussonnier, Matthias; Mailly, Philippe; Betz, Timo; Voituriez, Raphaël; Gov, Nir S; Verlhac, Marie-Hélène;
  • Active mechanics reveal molecular-scale force kinetics in living oocytes – arXiv preprint arXiv:1510.08299 – 2015 – Ahmed, Wylie W; Fodor, Etienne; Almonacid, Maria; Bussonnier, Matthias; Verlhac, Marie-Helene; Gov, Nir S; Visco, Paolo; van Wijland, Frederic; Betz, Timo;
  • Mechanical detection of a long-range actin network emanating from a biomimetic cortex – Biophysical journal – 2014 – Bussonnier, Matthias; Carvalho, Kevin; Lemière, Joël; Joanny, Jean-François; Sykes, Cécile; Betz, Timo;
  • Actin gel mechanics – University Paris VII archives – 2014 – Bussonnier, Matthias;
  • The Jupyter/IPython architecture: a unified view of computational research, from interactive exploration to communication and publication. – AGU – 2014 – Ragan-Kelley, M.; Perez, F.; Granger, B.; Kluyver, T.; Ivanov, P.; Frederic, J.; Bussonnier, M.;
  • Nuclear positioning powered by F-actin flows in oocytes – Molecular Biology of the Cell – 2013 – Almonacid, M; Bussonnier, M; Betz, T; Luksza, M; Queguiner, I; Voituriez, R; Gov, N; Verlhac, MH;
  • IPython: components for interactive and parallel computing across disciplines – AGU – 2013 – Perez, F.; Bussonnier, M.; Frederic, J. D.; Froehle, B. M.; Granger, B. E.; Ivanov, P.; Kluyver, T.; Patterson, E.; Ragan-Kelley, B.; Sailer, Z.;

Talks

  • Jupyter and the IPython kernel, hidden and awaiting implementation features – PyData Meetup Paris, FR – Nov 2017 – M. Bussonnier
  • Jupyter, Kernels and Protocols – JupyterCon – 2017 – M. Bussonnier, P. Ivanov
  • Building Bridges: Stopping Python 2 support in libraries without damages – PyBay – 2017 – M. Bussonnier – slides (source) – video
  • Ending Py2/Py3 compatibility in a user friendly manner – PyCon – 2017 – M. Bussonnier, M. Pacer, T. Kluyver – slides – video
  • Documentation and Continuous Integration in Python with Sphinx and Travis CI – The Hacker Within, Berkeley – 2017 – Nelle Varoquaux, Chris Holdgraf, Matthias Bussonnier
  • Continuous integration, Documentation and Travis CI – Docathon Conference, Berkeley – 2017 – Matthias Bussonnier – slides
  • Git and GitHub – The Hacker Within, Berkeley – 2017 – Ciera Martinez and Matthias Bussonnier
  • Keynote: Jupyter: an insider story – JupyterDay Boston – Invited keynote about the history of Jupyter and the challenges of managing a growing open source project
  • Project Jupyter Overview – PyBay – 2016 – By Jamie Whitacre, and Matthias Bussonnier
  • Xonsh: Put some Python in your shell – PyBay – 2016 – Matthias Bussonnier and the Xonsh Core team – video – slides
  • Python Metaprogramming and Conversion to Python 3 – 2016 – The Hacker Within, Berkeley – Ryan Pavlovsky and Matthias Bussonnier
  • Jupyter, from data gathering to publications – PLoS: Lunch and Learn, talks and Podcast – 2016 – Matthias Bussonnier
  • Talk Python To Me: Episode #44: Project Jupyter and IPython – Podcast – 2016 – Min RK, Michael Kennedy, Matthias Bussonnier
  • Jupyter: A tool for Open Science – UC Merced – 2016 – Invited presentation for the UC Merced Applied Math Department
  • Jupyter, State of Real–Time collaboration – SciPy – 2015 – Matthias Bussonnier and Kester Tong – video
  • Jupyter: A tool for datascience at scale – LBL Labtech – 2015 – Presentation about the current and future state of Jupyter at the Lawrence Berkeley National Laboratory LabTech conference.
  • IPython: protocol, kernels and new features – EuroSciPy – 2014 – By Thomas Kluyver and Matthias Bussonnier
  • Jupyter/IPython notebook, a tool for data science – NSLS–II workshop at Brookhaven National Lab. – 2013 – Matthias Bussonnier

Workshop conference: organizer / lecturer / instructor / helper

  • Jupyter Tutorial – PyCon – 2017 – Michael Bright, Matthias Bussonnier
  • Jupyter Advanced Topics Tutorial – SciPy – 2015 – Matthias Bussonnier, Jonathan Frederic and Thomas Kluyver – video
  • Python Bootcamp Fall 2016 – Berkeley – 2016
  • Software Carpentry Workshops – Multiple locations – Multiple years
  • IPython Advanced tutorial – EuroSciPy – 2013 – Min Ragan-Kelley, Matthias Bussonnier
  • Docathon – Berkeley – 2017 – website
  • The Hacker Within – Berkeley – once a week during academic semester – 2015-2017

Open Source, Scientific computing, and Data Science.

I joined the IPython project as a core developer during the early days of my PhD in 2011. I helped it evolve from an interactive command-line interface into the full-featured web-based notebook interface that is now known as the broader Project Jupyter, which is used by millions of users and supports a wide range of discoveries. I am happy to have played a key role in many of the current designs. One of my key insights was to develop nbconvert and host it as nbviewer, a service for viewing Jupyter notebooks online that grew to more than 100k visits a week and is now integrated into well-known platforms like GitHub, FigShare, Authorea, and a growing number of publishing platforms.

Developing tools, in particular as open source, makes it hard to collect direct metrics. As users can install the tools from freely available and redistributable source code, there is no download count. As these tools are also widely used in sensitive environments, they do not use beacons to signal their usage, nor do they require a license key, so usage is hard to track. It's also relatively rare to see software cited in a paper. Tools like depsy.org try to aggregate software impact into one metric, and put IPython in the 100th percentile. We can also use various proxy measures to see that the tools I developed have an impact in all disciplines, in education, and in industry.

Jupyter Notebooks on GitHub

There are approximately 800,000 Jupyter notebooks publicly available on GitHub. I collaborated with GitHub in 2015 to provide rendering of notebooks, which are now first-class citizens available to their 12 million users.

User Base

The user base of the Jupyter Notebook and IPython is estimated at 5 million users, with 160,000 and 420,000 downloads respectively for the month of February 2017.

Jupyter known deployments

The Jupyter notebook can be deployed at scale in a variety of contexts, ranging from small teams in research laboratories to several hundred users in industry and education. While I do not have the exact number of deployments, there are a number of well-known public ones:

  • UC Berkeley is serving a campus-wide curriculum in Data Science with about 800 students and is scheduled to scale to the entire incoming freshman class every year.
  • Cal Poly San Luis Obispo and Bryn Mawr College are teaching a couple of classes using Jupyter/IPython.
  • At Lawrence Berkeley National Lab, the National Energy Research Scientific Computing Center (NERSC) uses Jupyter as a gateway to supercomputing resources, available to the over 5,000 researchers who use the facilities.
  • Part of the data analysis platform at CERN provides a deployment for CERN scientists
  • The Wikimedia Foundation has a public deployment of Jupyter notebooks that can be accessed with any Wikipedia/Wikimedia user account and has direct access to the Wikimedia datasets.
  • Anybody with a Microsoft account can access https://notebooks.azure.com which provides a Jupyter environment
  • IBM Data Scientist Workbench is integrated with Jupyter
  • Multiple companies from the Fortune 100 deploy Jupyter at the center of their data science and analytics departments

Nbviewer

The Jupyter project now hosts the NbViewer service I developed (nbviewer.jupyter.org) and provides it as an online service to render notebooks for easy sharing. As of February 2017 it serves about 800k page views a month to 300k unique users. Nbviewer is the basis of many further improvements to the Jupyter platform, and of many of the Jupyter uses described in the next section.

Remarkable uses of Jupyter/IPython

The following are a couple of examples showing how the tools I develop are influencing publishing and the publication workflow.

The discovery of gravitational waves at the beginning of 2016 by the LIGO interferometer was stunning. A surprise of this discovery was that it was accompanied by a set of Jupyter notebooks to reproduce the analysis pipeline. Their notebooks can be viewed on nbviewer, or run on Binder.

Probabilistic Programming & Bayesian Methods for Hackers is a book that was initially published on the online nbconvert platform, was a big success, and is now available as a hardcover book.

Regex golf is a Jupyter notebook by Peter Norvig that appeared early on on the nbviewer platform I developed. After several years of effort it led to the development of O'Reilly Oriole, where the original regex golf example can be edited and executed live in the browser alongside a video explanation by Peter Norvig himself.

Nature cover highlight and accompanying live demo. In November 2014 Nature News published "Interactive notebooks: Sharing the code", featuring interviews and a live interactive notebook hosted on nature.com that showed the potential of executable academic publications. This demo exceeded the Nature editors' traffic estimates and served more than 20,000 notebooks in the first month after publication.