In this article, I present plotly-portal-viewer, a new open-source software tool for exploring and gaining insights from 3D data visualizations. It combines machine learning-powered face tracking with visual perception science to provide a convincing illusion of real depth using nothing more than a typical mobile device or a laptop with a webcam.
“Doomed from the start.” That’s how Leland Wilkinson describes the task of visualizing and understanding inter-relationships among multiple variables with traditional tools (The Grammar of Graphics, 2nd Edition. 2005. Springer. New York). “Inter-relationships” here refers to the situation in which the relationship between two variables is not constant but instead varies with another factor. Take Type 2 Diabetes, for example. There is a documented relationship between a dietary pattern high in processed meats and refined carbohydrate foods and the risk for diabetes, but it doesn’t apply equally to everyone. You probably know someone who eats this way but seems immune to the consequences. Part of the reason is that another factor is at play: genetics.
When researchers at the Harvard School of Public Health divided people up based on genetic risk factors for diabetes, they found that those genetic factors modified the impact of diet on risk for diabetes. For people with the least genetic risk, diet appeared to have no impact on diabetes incidence, but as the genetic risk score rose, so too did the apparent impact of diet. This straightforward case of inter-relationship is often called an “interaction effect,” and the team of Dr. Frank Hu was certainly aware of the challenges of communicating it with traditional visualizations, as they opted for this 3D bar chart in their research paper. However, this also highlights why Wilkinson considered the endeavor doomed: while the concept cannot be communicated without 3D, using a 3D chart in a 2D medium comes with significant limitations.
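To make the idea of an interaction effect concrete, here is a minimal sketch of a linear model with a product term. The function names and every coefficient are hypothetical values chosen purely for illustration; they are not taken from the Harvard study.

```python
def predicted_risk(diet: float, genetic_score: float,
                   b0: float = 1.0, b_diet: float = 0.0,
                   b_gene: float = 0.02, b_interaction: float = 0.05) -> float:
    """Toy risk model with a diet x genetics product term.

    All coefficients are made-up illustrative values, not study estimates.
    """
    return (b0
            + b_diet * diet
            + b_gene * genetic_score
            + b_interaction * diet * genetic_score)


def diet_effect(genetic_score: float,
                b_diet: float = 0.0, b_interaction: float = 0.05) -> float:
    """Change in predicted risk per unit of diet score at a given genetic score."""
    return b_diet + b_interaction * genetic_score


# The effect of diet is not a single number: it depends on the genetic score.
for score in (0, 5, 10):
    print(f"genetic score {score}: diet effect = {diet_effect(score):.2f}")
```

In this toy model the diet effect is zero at a genetic score of zero and grows with the score, which is the qualitative pattern described above. A single 2D line of risk against diet cannot show that, because the slope of the line is itself a function of a third variable.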
Why Not 3D?
Three-dimensional data visualization is inherently limited because, at least for the time being, we still use 2D media like screens and paper to view and communicate those visualizations, so we’re not really consuming 3D information but a 2D projection of it. Consider the above bar chart. Overall, it is clear that the tallest bars tend to be towards the right and back of the chart. That is, risk for diabetes increases with the interaction of worse dietary pattern and higher genetic risk. So far, so good, but now try picking out the details. Look at one of the bars in the center, “Q2” or “Q3” for diet and “10-11” for genetics, and try to figure out what odds ratio it corresponds to. Because of perspective distortion, it is very difficult to determine the exact height of any bar that is not adjacent to the axis guides in a 3D plot.
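The ambiguity can be shown with a simple pinhole projection, in which on-screen size is proportional to true size divided by distance from the viewer. This is a simplified sketch: `projected_height` and its `focal_length` parameter are illustrative names, and real chart renderers use a full camera matrix rather than this one-line formula.

```python
def projected_height(true_height: float, depth: float,
                     focal_length: float = 1.0) -> float:
    """On-screen height of a vertical bar under a pinhole projection."""
    if depth <= 0:
        raise ValueError("depth must be positive (in front of the viewer)")
    return focal_length * true_height / depth


# A short bar near the viewer and a taller bar further back
# can occupy exactly the same height on the screen.
front_bar = projected_height(true_height=2.0, depth=4.0)
back_bar = projected_height(true_height=3.0, depth=6.0)
print(front_bar, back_bar)
```

Without an independent cue for depth, the two bars are indistinguishable in the flat image, even though one is 50% taller than the other.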
The Best of Both Dimensions
The reason a picture of a 3D plot is more difficult to understand than the world around us is a lack of depth perception. We intuitively build an understanding of the depth, or 3D structure, of our surroundings through stereopsis and motion parallax. Stereopsis is the depth cue from binocular vision: two nearby viewpoints (i.e. each of two eyes) see slightly different versions of the surroundings, and the positions of objects closer to the viewer vary more between the two perspectives than do those of objects which are further away, as illustrated below.
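The depth dependence of stereopsis can also be checked with a little trigonometry. The sketch below computes the angle between the two eyes' lines of sight to a point straight ahead and compares that angle for points at different distances. The function names are illustrative, and the 0.063 m eye separation is an assumed typical value rather than a figure from this article.

```python
import math


def vergence_angle_deg(distance_m: float,
                       eye_separation_m: float = 0.063) -> float:
    """Angle between the two eyes' lines of sight to a point straight ahead."""
    return math.degrees(2 * math.atan(eye_separation_m / (2 * distance_m)))


def disparity_deg(near_m: float, far_m: float,
                  eye_separation_m: float = 0.063) -> float:
    """Difference in vergence angle between a nearer and a farther point."""
    return (vergence_angle_deg(near_m, eye_separation_m)
            - vergence_angle_deg(far_m, eye_separation_m))


# The same 0.5 m gap in depth produces a much larger difference
# between the two eyes' views up close than it does far away.
print(disparity_deg(0.5, 1.0))
print(disparity_deg(5.0, 5.5))
```

Motion parallax follows the same geometry, with the distance the head travels taking the place of the eye separation.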
Motion parallax relies on the same perspective difference information as stereopsis, but is based on how perspective changes as we move our heads in relation to our surroundings, including small subconscious movements. The two effects combine to combat optical illusions and provide a 3D understanding of our environments. Consumer virtual reality technology is now available that simulates both of these effects with high fidelity to create convincing illusions of 3D worlds. These systems deliver two different images to your two eyes through a head-mounted display to mimic stereopsis, and adjust that perspective with accurate positional tracking in space to mimic motion parallax. With this tech, 3D visualizations can be delivered in 3D, and the previously discussed limitations disappear. In my view, which is conflicted due to a financial interest in a company in this field, virtual reality is undeniably the best solution for understanding high dimensional data. Unfortunately, not everyone has access to the hardware at present, so I decided to explore more accessible options for improving 3D comprehension.
The goal of portal view is to partially bridge the gap from 2D projections to virtual reality by simulating motion parallax. To accomplish this, I run a computer vision face tracking model on the camera feed from a PC webcam or mobile device front-facing camera. This provides information about the position of a viewer’s face relative to the screen, and I use that to transform the perspective of the 3D plot render to match.
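The mapping from face position to plot perspective can be sketched as follows. This is a simplified illustration and not the plugin's actual code: `FaceRect`, `face_offset`, `camera_eye`, the orbit radius, and the angle limits are all hypothetical, and the sign flips assume an unmirrored camera feed. The returned dictionary has the same `x`/`y`/`z` shape as the `eye` entry of a Plotly 3D scene camera.

```python
import math
from dataclasses import dataclass


@dataclass
class FaceRect:
    """Bounding box of a detected face, in camera-frame pixels."""
    x: float
    y: float
    width: float
    height: float


def _clamp(value: float) -> float:
    return max(-1.0, min(1.0, value))


def face_offset(rect: FaceRect, frame_width: float, frame_height: float):
    """Viewer offset from the screen center, each axis in [-1, 1].

    Assumes an unmirrored feed: a viewer moving to their right appears
    further left in the image, and image y grows downward, so both
    axes are flipped to get (right, up) from the viewer's side.
    """
    center_x = rect.x + rect.width / 2
    center_y = rect.y + rect.height / 2
    nx = (center_x - frame_width / 2) / (frame_width / 2)
    ny = (center_y - frame_height / 2) / (frame_height / 2)
    return (_clamp(-nx), _clamp(-ny))


def camera_eye(offset, radius: float = 2.0,
               max_yaw_deg: float = 30.0, max_pitch_deg: float = 20.0):
    """Orbit the camera around the plot center to follow the viewer."""
    yaw = math.radians(max_yaw_deg) * _clamp(offset[0])
    pitch = math.radians(max_pitch_deg) * _clamp(offset[1])
    return {
        "x": radius * math.cos(pitch) * math.sin(yaw),
        "y": -radius * math.cos(pitch) * math.cos(yaw),
        "z": radius * math.sin(pitch),
    }


# A face detected on the left side of a 640x480 frame means the viewer
# has leaned to their right, so the camera swings toward +x.
leaning = FaceRect(x=60, y=190, width=100, height=100)
print(camera_eye(face_offset(leaning, 640, 480)))
```

Keeping the camera on a fixed-radius orbit is one possible design choice: the plot stays the same apparent size while the viewing angle changes. A fuller treatment would also estimate the viewer's distance, for example from the size of the face rectangle, and would smooth the tracked position to suppress jitter.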
The result is that, if you want to see what is behind a cluster of data points, you can simply lean to the side and look behind it, just as if your screen were a window into a truly 3-dimensional plot.
A certain proprietary face tracking tech has been gaining popularity lately, but I’m using the tracking.js library instead for two reasons. The first is access: it works on virtually all devices, not just the latest and most expensive ones. The second is privacy: all of the processing is performed locally, so you don’t have to hand over your facial data to a tech giant just to try it out.
Speaking of trying it out, let’s get to the demo. In The Grammar of Graphics, Wilkinson provided an example of what he called “configurality”, a general case of multivariate relationships that cannot be portrayed in 2D, using 1990 population data from the UN Databank. I’ve recreated the visualization with the most recent data from 2012.
The chart below portrays the relationships between birth rates and death rates, the determinants of population growth or decline, and total health expenditures by country. From the initial perspective, you can see what appears to be a straightforward relationship: birth rate decreases as health expenditures increase, with diminishing returns for expenditures over $2,000 per person and birth rates under 10 per 1,000. However, the relationship becomes much more interesting once you consider the configurality of those factors with death rate. Click the “Toggle Portal View” button and enable your camera to explore further.
Now the complexity of the data emerges, forming a sort of spiral or, as described by Wilkinson, a ram’s horn, as rising healthcare expenditures correspond to rising death rates among countries with low birth rates. The persistence of this pattern from the 1990 data Wilkinson used through to the 2012 data used here suggests that the interrelationship is more likely a real effect than an anomaly. However, this particular visualization isn’t suitable for a comparative policy analysis. For that purpose, factors like life expectancy and infant mortality might better normalize differences between populations.
If you’re a fellow visualization practitioner, you can add the Portal View feature to your own work. I have released this feature as a plugin you can add to any 3D plots created with Plot.ly. Get the code and documentation on GitHub, and leave comments below with links to your creations.