JupyterCon - Display Protocol
This is an early preview of what I am going to talk about at JupyterCon.
Leveraging the Jupyter and IPython display protocol¶
This is a small essay to show how one can make better use of the display protocol. Everything you will see in this blog post has been available for a couple of years, but no one has really built on top of it.
What I'm going to show below is that one is not limited to these – you can alter the representation of any existing object without modifying its source – and that this can be used to alter the view of containers, with lists as the example, to make things easy to read.
Modifying objects reprs¶
This section is just a reminder of how one can define representations for objects whose source code is under your control. When defining a class, the code author needs to define a number of methods, each of which should return the (data, metadata) pair for a given mimetype of the object. If no metadata is necessary, it can be omitted. For some common representations, short method names are available. These methods can be recognized because they all follow the pattern _repr_*_(self): an underscore, followed by repr, followed by an underscore, then the star, and finally a single underscore. The star * needs to be replaced by a lowercase identifier, often a short human-readable description of the format (e.g. png, html, pretty, ...). Note that unlike Python's __repr__ (pronounced "dunder rep-er"), which starts and ends with two underscores, the "rich reprs" or "repr-stars" start and end with a single underscore.
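As a minimal sketch of the (data, metadata) form described above, the hypothetical class below returns a tuple from _repr_html_. The class name Banner and the "isolated" metadata key are assumptions chosen for illustration; which metadata keys are actually honoured depends on the frontend.

```python
class Banner:
    """Hypothetical example class, not part of IPython."""

    def __init__(self, text):
        self.text = text

    def __repr__(self):
        # Plain-text fallback.
        return "Banner({!r})".format(self.text)

    def _repr_html_(self):
        # Returning a 2-tuple means (data, metadata) instead of data alone.
        data = "<h3>{}</h3>".format(self.text)
        metadata = {"isolated": True}  # assumed key, for illustration only
        return data, metadata


banner = Banner("hello")
data, metadata = banner._repr_html_()
print(data)
print(metadata)
```

Returning only the string, as the examples in the rest of this section do, is the common case; the tuple form is there for when the frontend needs extra hints.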
Here is the class definition of a simple object that implements three of the rich representation methods:
- "text/html" via the _repr_html_ method
- "text/latex" via the _repr_latex_ method
- "text/markdown" via the _repr_markdown_ method
None of these methods returns a tuple, so IPython will infer that there is no associated metadata.
The "text/plain" mimetype representation is provided by Python's classic __repr__(self).
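class MultiMime:

    def __repr__(self):
        # Plain-text fallback, used for the "text/plain" mimetype.
        return "this is the repr"

    def _repr_html_(self):
        return "This <b>is</b> html"

    def _repr_markdown_(self):
        return "This **is** markdown"

    def _repr_latex_(self):
        # Raw string so that "\o" is not treated as an escape sequence.
        return r"$ Latex \otimes mimetype $"

MultiMime()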
All the mimetype representations will be sent to the frontend (in many cases the notebook web interface), and the richest one will be picked and displayed to the user. All representations are stored in the notebook document (on disk), so one can be chosen when the document is later reopened – even with no kernel attached – or converted to another format.
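To make the selection step more concrete, here is a simplified sketch of the idea: collect every available representation into a dictionary keyed by mimetype, then pick the richest one a given frontend supports. This is not IPython's or the notebook's actual implementation; the helper names collect_reprs and pick_richest, the Demo class, and the preference order are all assumptions made for illustration.

```python
# Map a few mimetypes to the method that produces them (illustrative subset).
REPR_METHODS = {
    "text/html": "_repr_html_",
    "text/markdown": "_repr_markdown_",
    "text/latex": "_repr_latex_",
}

# Assumed preference order, richest first; real frontends define their own.
PREFERENCE = ["text/html", "text/markdown", "text/latex", "text/plain"]


def collect_reprs(obj):
    """Return a {mimetype: data} dict for every representation obj offers."""
    bundle = {"text/plain": repr(obj)}
    for mimetype, method_name in REPR_METHODS.items():
        method = getattr(obj, method_name, None)
        if method is not None:
            bundle[mimetype] = method()
    return bundle


def pick_richest(bundle, supported=PREFERENCE):
    """Return the first (mimetype, data) pair found in preference order."""
    for mimetype in supported:
        if mimetype in bundle:
            return mimetype, bundle[mimetype]
    raise ValueError("no supported representation")


class Demo:
    def __repr__(self):
        return "plain demo"

    def _repr_markdown_(self):
        return "**markdown** demo"


bundle = collect_reprs(Demo())
print(sorted(bundle))
print(pick_richest(bundle))
# A text-only frontend would fall back to text/plain.
print(pick_richest(bundle, supported=["text/plain"]))
```

Because the whole bundle is kept, a different frontend (or a converter run later) can make a different choice from the same stored output.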
External formatters and containers¶
As stated in the introduction, you do not need to have control over an object's source code to change its representation. Still, having that control is often more convenient. As an example, we will build a container for image thumbnails and see how the code written for this custom container can be applied to generic Python containers like lists.
As a visual example we'll use Orly Parody book covers, in particular low-resolution versions of some of them, to limit the amount of data we'll be working with.
cd thumb
Let's see some of the images present in this folder:
names = !ls *.png
# Show the first 20 names and how many are left after those.
names[:20], f"{len(names) - 20} more"
In the above I've used an IPython-specific syntax (!ls) to conveniently extract all the files with a png extension (*.png) in the current working directory, and assign the result to the names variable.
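Outside IPython the !ls shortcut is not available. A rough plain-Python equivalent, assuming you only need the file names in sorted order, can be written with the standard glob module; the helper name list_pngs is hypothetical.

```python
import glob
import os


def list_pngs(directory="."):
    """Return the sorted names of the .png files found in directory."""
    pattern = os.path.join(directory, "*.png")
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


names = list_pngs()
print(names[:20])
```

This returns a plain list of strings, which is all the rest of the example relies on.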
That's cute but, for images, not really useful. We know we can display images in the Jupyter notebook when using the IPython kernel; for that we can use the Image class from the IPython.display submodule. We can construct such an object simply by passing the filename. Image already provides a rich representation:
from IPython.display import Image
im = Image(names[0])
im
The raw data from the image file is available via the .data attribute:
im.data[:20]
What if we map Image over each element of a list?
from random import choices
mylist = list(map(Image, set(choices(names, k=10))))
mylist
Well, unfortunately a list object only knows how to represent itself using text and the text representation of its elements. We'll have to build a thumbnail gallery ourselves.
First let's (re)build an HTML representation to display a single image:
import base64
from IPython.display import HTML

def tag_from_data(data, size='100%'):
    # Base64-encode the raw bytes and drop the newlines that encodebytes inserts.
    encoded = ''.join(base64.encodebytes(data).decode().split('\n'))
    return (
        '''<img
            style="display:inline;
                   width:{1};
                   max-width:400px;
                   padding:10px;
                   margin-top:14px"
            src="data:image/png;base64,{0}"
        />
        ''').format(encoded, size)
We encode the data from bytes to base64 (newline separated) and strip the newlines. We format that into an HTML template – with some inline style – and set the source (src) to be this base64-encoded string. We can check that this displays correctly by wrapping the whole thing in an HTML object, which provides a convenient _repr_html_.
HTML(tag_from_data(im.data))
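The encode-then-strip step above can also be written with base64.b64encode, which produces the same text without inserting newlines in the first place. The short check below uses made-up bytes rather than a real image, and the helper name to_base64_text is hypothetical.

```python
import base64


def to_base64_text(data):
    """Encode bytes as a single-line base64 string."""
    return base64.b64encode(data).decode("ascii")


# Made-up payload, long enough that encodebytes wraps it over several lines.
payload = bytes(range(256)) * 2

# Same approach as tag_from_data: encode with newlines, then remove them.
stripped = "".join(base64.encodebytes(payload).decode().split("\n"))
print(to_base64_text(payload) == stripped)
```

Either form works inside a data: URI; b64encode simply skips the intermediate newline handling.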
Now we can create our own class, which takes a list of images, constructs an HTML representation for each of them, then joins them together. We define a _repr_html_ that wraps the whole thing in a paragraph tag and adds a comma between each image:
class VignetteList:

    def __init__(self, *images, size=None):
        self.images = images
        # Forwarded to tag_from_data as the CSS width of each <img> tag.
        self.size = size

    def _repr_html_(self):
        return '<p>' + ','.join(tag_from_data(im.data, self.size) for im in self.images) + '</p>'

    def _repr_latex_(self):
        # Raw string so that the backslash-space reaches LaTeX unchanged.
        return r'$ O^{rly}_{books} (%s\ images)$ ' % (len(self.images))
We also define a LaTeX representation, which we will not use here, and look at our newly created object using the previously defined list:
VignetteList(*mylist, size='200px')