Multidimensional Data Visualization with THREE.js
Carl-Martin Keerberg, Oliver Vinkel
About this project
Our initial goal was to create a three-dimensional bar plot that takes structured numerical data from a .csv file and presents it as a 3D scene rather than a table of numbers. The scene comes with camera controls, so the output can be viewed from varying angles instead of as a still image. We implemented this system using the THREE.js WebGL library.
The project ended up being simpler than anticipated, so two additional plot types were added later on: the violin plot and the scatter plot. Details about each are found below.
List of features and properties
- Receives input from local .csv files
- Normalizes dataframes to fit data in a comparable scale
- Plots user defined factors as a bar plot, violin plot or scatter plot
- Enables exploring visualized data in 3D space using Orbit Controls
- Displays datapoints when the mouse hovers over the plot (bar, scatter)
- Three-dimensional grid for the bar plot to aid visual comparison of different inputs; the scatter plot has axes for the same purpose
- Factor hot-switching
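The normalization step from the list above could look roughly like the following. This is a minimal sketch, not the project's actual code: min-max scaling to the range [0, 1] is an assumption, and `normalizeColumn` is a hypothetical name.

```javascript
// Hypothetical helper: rescale one numeric column to the range [0, 1].
// Min-max scaling is an assumption; the project may normalize differently.
function normalizeColumn(values) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const span = max - min;
  // A constant column has no spread, so map every value to 0 to avoid dividing by zero.
  if (span === 0) return values.map(() => 0);
  return values.map((v) => (v - min) / span);
}

// Two made-up factor columns on very different scales.
const smallScale = [10, 20, 30];
const largeScale = [1000, 3000, 2000];
console.log(normalizeColumn(smallScale)); // [0, 0.5, 1]
console.log(normalizeColumn(largeScale)); // [0, 1, 0.5]
```

After scaling, factors measured in different units share one range, which is what makes them comparable on the same plot.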
Instructions
The project can be opened at
https://carlmartinkeerberg.com/DataViz/
You can pan, zoom and rotate the view around the plot using a touchpad or, preferably, a mouse with its wheel, together with the CTRL key.
By default, the project loads our DEMO dataset, which contains fixed (and fictional) values representing weight, height, age and IQ. If you wish to test with a dataset of your own, see below for how to set up the .csv file properly.
The dataset has to be laid out according to the following structure:
ID | factor A | factor B | factor C |
---|---|---|---|
Observation 1 | 10 | 20 | 10 |
Observation 2 | 30 | 50 | 5 |
Any common separator should work, but a semicolon is the safest option. The number of factors or observations is not limited. All values except the IDs of observations need to be numerical. The first column has to be dedicated to IDs.
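The rules above can be sketched as a small parser. This is only an illustration under stated assumptions: the function names `detectSeparator` and `parseDataset` are hypothetical, and guessing the separator by counting candidates in the header row is an assumption rather than the project's documented behaviour.

```javascript
// Hypothetical: pick the candidate separator that occurs most often in the header row.
function detectSeparator(headerLine) {
  const candidates = [";", ",", "\t", "|"];
  let best = ";";
  let bestCount = 0;
  for (const sep of candidates) {
    const count = headerLine.split(sep).length - 1;
    if (count > bestCount) {
      best = sep;
      bestCount = count;
    }
  }
  return best;
}

// Hypothetical: parse a header row, an ID column and numeric factor columns.
function parseDataset(text) {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  const sep = detectSeparator(lines[0]);
  const header = lines[0].split(sep).map((cell) => cell.trim());
  const factors = header.slice(1); // the first column holds the observation IDs
  const rows = lines.slice(1).map((line) => {
    const cells = line.split(sep).map((cell) => cell.trim());
    // Treat empty cells as invalid; Number("") would otherwise silently become 0.
    const values = cells.slice(1).map((cell) => (cell === "" ? NaN : Number(cell)));
    if (values.length !== factors.length || values.some(Number.isNaN)) {
      throw new Error(`Non-numeric or missing value in row "${cells[0]}"`);
    }
    return { id: cells[0], values };
  });
  return { factors, rows };
}

const sample =
  "ID;factor A;factor B;factor C\nObservation 1;10;20;10\nObservation 2;30;50;5";
const dataset = parseDataset(sample);
console.log(dataset.factors); // ["factor A", "factor B", "factor C"]
console.log(dataset.rows[1]); // { id: "Observation 2", values: [30, 50, 5] }
```

Rejecting non-numeric cells early mirrors the requirement that every value except the ID is numerical.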
The 'Upload' button prompts the user to select a local .csv file. The 'Demo' button will load the sample dataset. Plotting is triggered by uploading a dataset, selecting a demo dataset or updating the selected factors.
Different plots
We opted to visualize data in three ways. The first, and the most feature-packed, is the bar plot, which is great for getting a quick overview of how the input values fluctuate. In the bar plot, each bar represents one row of the .csv file. The panel on the right lets you choose which factor each axis represents; up to three factors can be chosen at a time. In the example below, changes along the Y axis represent fluctuations in weight, the X axis represents volume and the Z axis represents density.
If you wish to focus on just one factor, you can apply that factor to all axes and judge variations based on the size of the bars.
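One possible reading of this mapping is sketched below. It assumes that the factors chosen for X, Y and Z scale a bar's width, height and depth, and that the values are already normalized to [0, 1]. The helper `barDimensions` and its parameters are hypothetical and not taken from the project's source.

```javascript
// Hypothetical mapping from one normalized row to box dimensions.
// Assumption: the factors chosen for X, Y and Z scale width, height and depth.
function barDimensions(row, axes, maxSize = 1, minSize = 0.05) {
  // Clamp to a small minimum so a zero value still produces a visible bar.
  const scale = (factor) => Math.max(minSize, row[factor] * maxSize);
  return { width: scale(axes.x), height: scale(axes.y), depth: scale(axes.z) };
}

const row = { weight: 0.5, volume: 1, density: 0.25 }; // already normalized
const dims = barDimensions(row, { x: "volume", y: "weight", z: "density" });
console.log(dims); // { width: 1, height: 0.5, depth: 0.25 }

// Applying one factor to all axes turns the bar into a cube whose size tracks that factor.
const cube = barDimensions(row, { x: "weight", y: "weight", z: "weight" });
console.log(cube); // { width: 0.5, height: 0.5, depth: 0.5 }

// With THREE.js loaded, the result could feed a box mesh, for example:
// new THREE.Mesh(new THREE.BoxGeometry(dims.width, dims.height, dims.depth), material);
```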
There is also the violin plot, which is used when focusing on just one factor. In a bar plot you would compare how far two bars are from a grid line to estimate the difference in their values. A violin plot does the same in a more readable way by turning input values into discs of varying radii, layering them on top of one another and connecting the edges to create a kind of tower. Because it evaluates just one factor, it loses the bar plot's option of inspecting the other parameters of each input.
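The disc-stacking idea can be illustrated with a short sketch that computes the outline of such a tower. It assumes one disc per input value with a radius proportional to that value. `violinProfile`, `maxRadius` and `layerHeight` are hypothetical names, and the project may build its geometry differently.

```javascript
// Hypothetical profile builder for a stacked-disc tower.
// Assumption: each value becomes one disc whose radius is proportional to the value.
function violinProfile(values, maxRadius = 1, layerHeight = 0.2) {
  const max = Math.max(...values);
  return values.map((v, i) => ({
    radius: max === 0 ? 0 : (v / max) * maxRadius,
    y: i * layerHeight, // discs are layered bottom to top in input order
  }));
}

const profile = violinProfile([5, 10, 20]);
console.log(profile.map((p) => p.radius)); // [0.25, 0.5, 1]

// With THREE.js loaded, revolving this outline around the Y axis connects the disc edges:
// const outline = profile.map((p) => new THREE.Vector2(p.radius, p.y));
// const tower = new THREE.Mesh(new THREE.LatheGeometry(outline, 32), material);
```

`THREE.LatheGeometry` is one way to connect the edges; stacking individual cylinder segments would be an equally plausible alternative.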
Finally, there's the scatter plot, which in practice is similar to the bar plot but is preferable for finding clusters in your dataset, namely rows that have similar values. The scatter plot displays the rows of input data as bubbles in space. In a typical use case you'd define all three factors and observe the spacing between the bubbles and their distances from the origin (0, 0, 0).
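What "spacing" and "distance from the origin" mean numerically can be shown with a small sketch. The helpers `toPoint` and `distance` are hypothetical, and the sample rows are made-up normalized values, not taken from the demo dataset.

```javascript
// Hypothetical helper: place a normalized row in 3D using the chosen factors.
function toPoint(row, axes) {
  return { x: row[axes.x], y: row[axes.y], z: row[axes.z] };
}

const ORIGIN = { x: 0, y: 0, z: 0 };

// Euclidean distance between two points; defaults to the distance from the origin.
function distance(a, b = ORIGIN) {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

const axes = { x: "weight", y: "height", z: "age" };
const p1 = toPoint({ weight: 0.2, height: 0.4, age: 0.4 }, axes);
const p2 = toPoint({ weight: 0.25, height: 0.4, age: 0.45 }, axes);
const p3 = toPoint({ weight: 0.9, height: 0.1, age: 0.8 }, axes);

// p1 and p2 have similar values, so they sit close together; p3 lies far from both.
console.log(distance(p1, p2) < distance(p1, p3)); // true
console.log(distance(p1).toFixed(2)); // "0.60", the distance of p1 from the origin
```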
Process
If I had to name any complications, it would simply be the somewhat low-level nature of the tools used. THREE.js was viable for this task, but sometimes it felt like its classes could use more built-in features. One instance where I needed more functionality was when creating the axes for our plots. Initially there weren't supposed to be grids, just three axes of different lengths, but changing the lengths of the axes was not possible with the built-in class. In fact, barely anything was customizable (I couldn't even change the width of the lines :/ ). The long axes were also affected by frustum culling, which didn't want to turn off no matter what, so they kept flickering in and out of view. I opted to build my own lines, which ended up becoming the grid, which seemed more useful for our purposes.
There were some other issues that would've required unconventional approaches, but overall the project became what was planned, and a bit more.
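Building the lines by hand might look something like the sketch below. It only generates the endpoints for a grid on one plane; `gridSegments` is a hypothetical name, and the commented THREE.js calls show one plausible way to render the result rather than the project's actual code.

```javascript
// Hypothetical generator for grid-line endpoints on the XZ plane (y = 0).
// Returns an array of [x, y, z] points; each consecutive pair forms one line segment.
function gridSegments(size, divisions) {
  const points = [];
  const step = size / divisions;
  for (let i = 0; i <= divisions; i++) {
    const offset = i * step;
    points.push([0, 0, offset], [size, 0, offset]); // line parallel to the X axis
    points.push([offset, 0, 0], [offset, 0, size]); // line parallel to the Z axis
  }
  return points;
}

const points = gridSegments(10, 5);
console.log(points.length); // 24 points, i.e. 12 segments

// With THREE.js loaded, the points could be rendered as independent segments:
// const vectors = points.map(([x, y, z]) => new THREE.Vector3(x, y, z));
// const geometry = new THREE.BufferGeometry().setFromPoints(vectors);
// const grid = new THREE.LineSegments(geometry, new THREE.LineBasicMaterial({ color: 0x888888 }));
// grid.frustumCulled = false; // Object3D flag that skips the culling test for this object
```

Generating the endpoints separately keeps the line lengths under the caller's control, which is the flexibility the built-in axes class was missing.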