GSoC 2019: Support for Jupyter notebooks has evolved in Cantor

Hello everyone! It's been almost a month since my last post, and a lot has changed since then.

First, what I called the "minimal plan" is already done! Cantor can now load Jupyter notebooks and save the currently opened document in Jupyter format.

Below you can see how one of the Jupyter notebooks I'm using for testing (I mentioned them in a previous post) looks in Jupyter and in Cantor.

As you can see, the content is represented almost identically, apart from some minor differences in the rendering of the Markdown.

For comparison, I also prepared some previews of the same notebook fragments opened in Jupyter and in Cantor.
This is a fragment from the "Understanding evolutionary strategies and covariance matrix adaptation" notebook.

As the next example, here is a screenshot of the "A Reaction-Diffusion Equation Solver in Python with Numpy" notebook.

As the final example, here is a screenshot of the "Rigid-body transformations in a plane (2D)" notebook.

To be more concrete about what Cantor currently supports, here is the list of objects that can be imported:

- Markdown cells
  - with mathematical expressions
  - with attachments
- Code cells
  - with text results (including error messages) and image results
- Raw NBConvert cells

Cantor can handle almost all content specified by the Jupyter notebook format. The exceptions are some metadata about the notebook as a whole and about its cells, information about the "kernel" used (support for this will be added soon), and results of other types (for example LaTeX or HTML outputs), which are harder to implement because good and complete documentation for them is lacking.
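To make the supported and unsupported content a bit more tangible, here is a minimal Python sketch that walks a notebook's JSON and counts the cell types, output types and output MIME types it contains. This is not Cantor's code (Cantor is written in C++); the function name `summarize_notebook` and the sample notebook are made up for illustration. The field names (`cells`, `cell_type`, `attachments`, `outputs`, `output_type`, `data`) are those of the Jupyter notebook format, version 4.

```python
import json
from collections import Counter

# Hypothetical sample, shaped like an nbformat 4 notebook.
SAMPLE_NOTEBOOK = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": "Some text with a formula $a^2 + b^2 = c^2$",
            # Attachments map a file name to a MIME bundle.
            "attachments": {"image.png": {"image/png": ""}},
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 1,
            "source": "print('hello')",
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": "hello\n"},
                {
                    "output_type": "display_data",
                    "metadata": {},
                    "data": {"image/png": "", "text/plain": "<Figure>"},
                },
                {
                    "output_type": "execute_result",
                    "metadata": {},
                    "execution_count": 1,
                    "data": {"text/html": "<b>bold</b>"},
                },
            ],
        },
        {"cell_type": "raw", "metadata": {}, "source": "raw text"},
    ],
}


def summarize_notebook(nb: dict) -> dict:
    """Count cell types, output types and output MIME types in a notebook."""
    cell_types = Counter()
    output_types = Counter()
    mime_types = Counter()
    cells_with_attachments = 0

    for cell in nb.get("cells", []):
        cell_types[cell.get("cell_type", "unknown")] += 1
        if cell.get("attachments"):
            cells_with_attachments += 1
        # Only code cells carry outputs; other cells simply have none.
        for output in cell.get("outputs", []):
            output_types[output.get("output_type", "unknown")] += 1
            # Rich outputs keep their payload in a MIME bundle under "data".
            for mime in output.get("data", {}):
                mime_types[mime] += 1

    return {
        "cell_types": dict(cell_types),
        "output_types": dict(output_types),
        "mime_types": dict(mime_types),
        "cells_with_attachments": cells_with_attachments,
    }


if __name__ == "__main__":
    print(json.dumps(summarize_notebook(SAMPLE_NOTEBOOK), indent=2))
```

In this sample, the `text/html` entry is the kind of result described above as not yet handled, so a summary like this is a quick way to see which test notebooks exercise the remaining gaps.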

When saving a project in Jupyter's format, Cantor handles almost all of its native entry types: markdown entries, text entries, code entries and image entries. For the remaining one, Cantor's "page break entry", it still has to be worked out how to map it to Jupyter's structures.

Despite the good progress, there is still a lot of room for improvement. Besides the technical issues that arise when importing another format and mapping its structure to the native structures of your application, which I guess is natural for any application, there is currently also a problem with the performance of the renderer used for mathematical expressions in Cantor. Opening large documents with a lot of formulas (either in Cantor's native format or as Jupyter notebooks) takes a considerable amount of time because of the poor renderer implementation in Cantor. This heavily affects the user experience, and I plan to start working on it soon.

So there is still some work to be done before Cantor supports what I call the "maximum plan". By this I mean the ability to guarantee that conversion between the two formats, when opening or saving projects, happens without any substantial loss of the information that is relevant and critical for using the project file.

To achieve this, I now want to invest more in testing with more notebooks and closing the remaining gaps, and also in writing automated tests for Cantor that cover this new functionality. The latter are also important to prevent regressions introduced while fixing bugs in the coming weeks. This is the plan for the next week.
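As a rough illustration of what such a test could check, here is a small Python sketch of a round-trip comparison: it treats two notebooks as equivalent if their cells have the same types and the same source text, in the same order. This is only an assumption about what "no substantial loss" might mean in practice, not Cantor's actual test system, and the helper names (`normalize_source`, `essential_content`, `same_essential_content`) are hypothetical. One detail it does rely on from the notebook format is that a cell's `source` may be stored either as a single string or as a list of strings.

```python
def normalize_source(source):
    """Return the cell source as one string (the format allows str or list of str)."""
    if isinstance(source, list):
        return "".join(source)
    return source


def essential_content(nb: dict) -> list:
    """Reduce a notebook to the (cell type, source text) pairs of its cells."""
    return [
        (cell.get("cell_type"), normalize_source(cell.get("source", "")))
        for cell in nb.get("cells", [])
    ]


def same_essential_content(original: dict, round_tripped: dict) -> bool:
    """True if both notebooks contain the same cells in the same order."""
    return essential_content(original) == essential_content(round_tripped)


# Hypothetical example: the same notebook before and after a save/load cycle,
# where the second copy stores sources as lists and carries extra metadata.
before = {
    "cells": [
        {"cell_type": "markdown", "source": "# Title\nSome text"},
        {"cell_type": "code", "source": "x = 1\nprint(x)", "outputs": []},
    ]
}
after = {
    "metadata": {"some_tool": "extra info"},
    "cells": [
        {"cell_type": "markdown", "source": ["# Title\n", "Some text"]},
        {"cell_type": "code", "source": ["x = 1\n", "print(x)"], "outputs": []},
    ],
}

if __name__ == "__main__":
    print(same_essential_content(before, after))
```

A real test would of course compare more than this (results, attachments, and so on), but even a check this small catches dropped or reordered cells.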

In the next post I plan to show a working test system and how Cantor passes its tests.

GSoC 2019

Saturday, June 22, 2019
