Grafoscopio: Iceberg metaphor for writing and data visualization

Offray Luna

This post describes a project I have been working on in Smalltalk as an alternative way to approach open/garage/citizen science & research by building a tool for "deep" / emergent data narratives and visualization, or what I call the "iceberg metaphor". It starts with some community context about the blog post itself and then introduces the project and my advances and difficulties so far.

This post started as a mail to the Moose project mailing list, which got no answers. So I'm trying another approach: writing longer details on my blog, so I can point interested readers here, and combining that with specific questions on the mailing list. It seems that striking the proper balance between broad context and concrete questions makes the difference between four kinds of response patterns: a) quick responses, b) "more details please", c) a longer mail that takes longer to answer and d) a longer mail that gets no answer at all. Let's see how this new combination works.

Last June or July I started a project with an alternative approach to writing and visualization. It was my bet to build a tool to express some long-standing ideas about how to write in a non-linear fashion and beyond the white page metaphor of common word processors, focusing mainly on words and structure instead of typography, margins and all the unnecessary stuff you start to care about while you're trying to explore/express your ideas in "black on white". I would also like to have interactive writing, where I can explore ideas by computation, simulation and visualization.

Previously I had tried IPython and Leo for that, and I wrote some ideas about how to mix them, but Leo and IPython were not malleable enough without a deep understanding of their inner workings and/or without combining several technologies (Python, Qt, XML, JSON, JavaScript, HTML, ZeroMQ, server programming, client programming, etc.) that come with different paradigms and ways of thinking: markup, serialization and scripting languages, imperative and object-oriented programming, etc.

So I thought of another way to combine the programmable tree-like document of Leo and the interactivity of IPython without such a cognitive burden: choosing a more uniform platform. My medium to explore those ideas was Moose, a platform for software and data visualization built on top of Pharo Smalltalk, and its related ecosystem, for example the lightweight serialization language STON for document storage/representation and Citezen for the integration with Zotero bibliographies. I started to make prototypes and I even got a small fund, thanks to the people from the newly started HiTec Lab [3], to write an academic draft article about how these new kinds of metaphors for writing can be used to bootstrap open/citizen/garage science & research.

I'm making progress on my interactive notebook for open/garage/citizen research & science in Moose, called Grafoscopio (SmallHub repository, Fossil repository). These are baby steps towards an alternative approach to what Andrei, Jan and Doru, from the Moose community, are trying for documentation [1][2]: one that begins with a document tree inside the Pharo image and "projects" the files onto the file system (Grafoscopio produces markdown by traversing the tagged tree, then LaTeX and HTML via pandoc, and PDF via pdftex / luatex). In this way you can have a complete interactive document (a few pages or book size) inside the image, stored/shared/versioned in STON with arbitrary levels of depth, and only think about files when you are exporting to PDF or intermediate formats. (I also think that this solution is easier to run on Windows than Pillar Book Skeleton.)
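The export step described above (walk the tagged tree, emit markdown) can be sketched in a few lines. This is only an illustration in Python, not Grafoscopio's Smalltalk code: the `Node` class, the `invisible` and `code` tag names and the use of `~~~` fences are assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical document node: header, body, tags and children."""
    header: str
    body: str = ""
    tags: set = field(default_factory=set)
    children: list = field(default_factory=list)


def markdown_lines(node, level=1):
    """Depth-first traversal that returns the markdown lines for a subtree."""
    if "invisible" in node.tags:  # assumed tag: keep the subtree out of the export
        return []
    lines = ["#" * level + " " + node.header, ""]
    if node.body:
        if "code" in node.tags:  # assumed tag: export the body as a fenced block
            lines += ["~~~", node.body, "~~~", ""]
        else:
            lines += [node.body, ""]
    for child in node.children:
        lines += markdown_lines(child, level + 1)
    return lines


def to_markdown(root):
    """Join the exported lines into a single markdown string."""
    return "\n".join(markdown_lines(root))


document = Node("Grafoscopio", "Intro text.", children=[
    Node("Draft notes", "Not ready yet.", tags={"invisible"}),
    Node("Example", "3 + 4", tags={"code"}),
])
print(to_markdown(document))
```

The "Draft notes" node stays in the tree but never reaches the exported markdown, which is the point of keeping the full structure in the image and exporting only part of it.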

Markdown (pandoc's variant) expresses only the "surface" of the writing, while the deeper structure is stored in the image and in STON. I call this the "iceberg metaphor" for writing and data visualization: instead of the WYSIWYG (What You See Is What You Get) of common word processors you have WYSIOTSOWYH (What You See Is Only The Surface Of What You Have) ;-), which is what happens most of the time and what the common word processing and file-oriented metaphors cannot capture/express. New metaphors for writing are needed if we want to bootstrap more open/inclusive ways to do science and research, coming not only from academia and scientists but also from non-canonical places like garages, maker/hacker spaces and from citizens in general.

Tags are the way to introduce modal behaviour/interface in Grafoscopio. So if I tag a node, its look and feel changes accordingly. You can see this in the following short video of how the interface looks/behaves today (for better playback, watch the video in Chrome/Chromium; Firefox shows a pixelated video):

As you can see, I can tag a node as code (código) and the node view and behaviour get updated after revisiting the node, providing auto-completion and syntax highlighting. There are also tags for "transmedia" content that arrange the layout to show the original content and the transmediated one. I have thought about tags for "data", which would contain data sources, queries, tables of values and visualizations. There could even be tags for showing a mind map, so you could add nodes inside a visualization instead of operating on the tree directly (in a similar fashion to XMind).
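One way to picture this tag-driven modal behaviour is a lookup from tags to view settings, done each time a node is visited. The following Python sketch is only an illustration: the `VIEWS` registry, its settings and the `view_for` function are hypothetical names, not part of Grafoscopio.

```python
# Hypothetical registry: tag name -> settings of the view it enables.
VIEWS = {
    "código": {"editor": "code", "highlighting": True, "completion": True},
    "transmedia": {"editor": "split", "highlighting": False, "completion": False},
}

# Plain text editing is assumed for untagged nodes.
DEFAULT_VIEW = {"editor": "text", "highlighting": False, "completion": False}


def view_for(tags):
    """Return the view settings of the first known tag, or the default view."""
    for tag in tags:
        if tag in VIEWS:
            return VIEWS[tag]
    return DEFAULT_VIEW


print(view_for(["código"])["editor"])
print(view_for([])["editor"])
```

Because the lookup runs when a node is visited, a newly added tag only takes effect on the next visit, which matches the behaviour shown in the video.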

How to share the look and feel for tags is still a pending issue. It could work as it does now, by extending the source code of Grafoscopio to add new tags, but it could also happen through Grafoscopio "documents": a document could define a new tag, its GUI and its behaviour, and populate the class browser. Of course this could lead to security concerns, but I imagine a set of repositories working in a similar way to the Arch Linux repositories, with a proper separation between trusted users and community-curated repositories for non-verified tags. This scenario would only arise if someday Grafoscopio has a big enough community; for the moment it is working fine as is (I think the community will be pretty modest and small).

These ideas about tags have been announced since the first academic draft paper on Grafoscopio, and were the evolution of a "vocabulary of special words" I developed while writing my PhD thesis in Leo (I am still writing it), but they were implemented only recently, at the Transmedia Hackathon we organized with Adriana (we call it "Transmediatón") on January 16 and 17 this year. That implementation solved how to express/store several interactions in a single tree [4].

I'm trying to evolve Grafoscopio organically in this way, by using it myself in the research and consultancy I'm doing, and by sharing and asking in the Moose community about this process. After all this, I have some ideas to explore next, which is where this post started, such as:

Instantly updating the node view/behaviour without the need to revisit the node.
Getting the code inside a code node executed, so documents become interactive (like in IPython).
My idea for the future is to make the node view at the right side of the tree a complete playground instead of a workspace, with its emergent lateral panels when an object is selected. For more information about playgrounds in Moose you can look at this introductory blog post or the several videos at http://gt.moosetechnology.org/.
At the very end of the video you can see the trace of an error that is raised when I select an empty part of the tree. This is the most annoying behaviour and can cause loss of information if you have written a node and not saved the tree just before clicking on an empty tree space. Solving this bug is urgent. I also need to solve issues with automatic backups and Fossil integration. This is a priority. Information should not be lost... never!
At this moment all external transformations from markdown to PDF and LaTeX are managed by pandoc using pdftex and xetex, as I said, but I would like a better integration, without calling it from the shell, and to start testing a more minimalist/programmable LaTeX engine via luatex.
There is still no web site for Grafoscopio and I would like to create one for the next event related to it (maybe the next hackathon on transmedia or data journalism) to deliver a more solid and friendly release. Ideally the website for Grafoscopio would be made with Grafoscopio itself, but time will tell.
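For the pandoc item in the list above, the current "call it from the shell" approach amounts to building a command line like the one below. This is a hedged sketch, not how Grafoscopio does it: the function name and file names are made up, and the `--pdf-engine` option assumes a recent pandoc release (older releases used a differently named option for choosing the LaTeX engine).

```python
def pandoc_command(source, target, pdf_engine=None):
    """Build the argument list for converting a markdown file with pandoc."""
    command = ["pandoc", "--from", "markdown", source, "--output", target]
    if pdf_engine is not None:
        # Assumption: a pandoc version that accepts --pdf-engine (e.g. lualatex).
        command.append("--pdf-engine=" + pdf_engine)
    return command


# Hypothetical file names, for illustration only.
print(pandoc_command("document.md", "document.pdf", pdf_engine="lualatex"))
```

The resulting list can be handed to `subprocess.run(command, check=True)` on a machine where pandoc and the chosen LaTeX engine are installed.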

It has been a pleasure to work with Pharo and Moose for this project. It is a very empowering platform to express ideas about how interactive documentation could be done, among other things, and the community is welcoming and awesome! My thanks also go to the folks of HackBo, especially the ones attending the Indie Web Science workshops. It has been a place to get inspiration, tinker with these ideas and share with friends.

[3]	HiTec is not the acronym for High Technology, despite what it evokes. It is more about Hipermedia Tecnologies... kind of a bittersweet coincidence.

[4]	Maybe there was some kind of similar idea in Leo nodes, but I never saw it working. I know for sure that Leo can show the information in nodes in several ways by the use of "@-directives", but I used them mostly for syntax highlighting and not much for redefining the interface or behaviour inside the tree (adding/enabling plugins by editing files was kind of "too geeky" for my taste).
