Exploring Data Visualization in a Virtual, Three-Dimensional Space
ABSTRACT
Data visualization tools have long been in high demand across industries ranging from medicine to the humanities. Regardless of the field, our world is built on data. This raw data is often unstructured and overwhelming in its sheer volume of numbers, letters, and entities, and can be meaningless without context and analysis. Data visualization facilitates this analysis by helping to extract the meaningful relationships, trends, and patterns that exist within the data. Though effective two-dimensional data visualization tools exist today, the options for those who need to work with hyper-dimensional information are lacking, because projecting hyper-dimensional data onto a two-dimensional screen is inconvenient and ineffective. Complete visualization of this class of data requires a three-dimensional, interactive environment. In this project, we explored using virtual reality, through the combined systems of Unreal Engine and the HTC Vive, to help satisfy this demand. We built a navigable, fully immersive “gallery” of data with natural interactive capabilities, and through user studies and an exploration of this virtual environment, we reached a number of conclusions about the future of data visualization and the viability of a virtual-reality-enabled data visualization tool.
Kimberly Wijaya, Jordan Hank,
Brendon Go
CS294W: Writing Intensive Research Project on the
Internet of Things, Virtual Assistants, and Virtual
Reality
INTRODUCTION
MOTIVATION
There is no doubt that much of computer science is fundamentally linked to data. But effective analysis and manipulation of this data is contingent on well-designed data visualization tools. Despite the ubiquitous demand for such systems, creating these tools is nontrivial. There are a number of design and user-experience issues, even in two-dimensional visualization software, that are difficult to address. Perhaps one of the largest limitations of these tools today is that visualizing three-dimensional data is challenging to do in two dimensions. Three-dimensional tools are significantly rarer than their two-dimensional counterparts, and those that do exist are limited in the scope of their interactivity. This is largely because these tools are, by necessity, confined to two-dimensional spaces: the screen of your device. That environment simply does not allow for a truly three-dimensional experience. With the advent of virtual reality headsets and systems, however, this has changed.
KEY IDEA
Virtual reality is one of the fastest-growing fields of computer science today. Breakthroughs in portability and usability have made it more and more accessible to the general public at a much lower cost. Virtual reality can be defined as a three-dimensional, computer-generated environment that is immersive in that users are able to explore, interact with, and manipulate it [i]. In essence, it is an environment that a user may interact with in a seemingly real or physical way. As a result, it serves as the perfect canvas for a truly interactive and immersive experience when manipulating hyper-dimensional data, which would no longer need to be confined to a two-dimensional space. Since the ability to manipulate hyper-dimensional data is desired by a number of industries, including non-technical ones, it is imperative that the barrier to entry for a virtual-reality, data-driven application – historically quite high – be lowered. Users from any field should be able to simply supply their data, formatted in a specific way, in order to use the tool.
HIGH LEVEL DESCRIPTION
Our project solves the aforementioned problem in three ways:
ACCESSIBILITY
One of the issues with both the development and use of virtual reality is its high barrier to entry. Often, data and code have to be written in a particular, platform-specific way in order to work. This has limited its use to those working in technical fields. But the need for hyper-dimensional data visualization tools exists across industries – everything from the sciences to the humanities. To change this, we decided on a simple but standardized way for users to supply our project with data. They do not need to know anything about software engineering or programming – they can simply copy and paste their data from where it was originally stored (most often, Excel spreadsheets). More concrete details of this formatting will be discussed later in this paper.
RENDERING AND PARSING INTO A THREE DIMENSIONAL ENVIRONMENT
Our three-dimensional platform is created using the HTC Vive in coordination with Unreal Engine 4. The Vive is a virtual reality headset that, when paired with its motion-tracked handheld controllers and room sensors, allows users to navigate virtual worlds naturally while vividly manipulating and interacting with objects with increasing precision. As a result, the device works perfectly for our purposes.
INTERACTION WITH THIS REPRESENTATION OF DATA
At a high level, our program can be viewed as a gallery of data. A room is spawned when the user starts up the program. This room depends on the information gathered by our parser from the input file provided by the user. Data is rendered in “cubbies” in the wall and can be represented as scatter plots, bar graphs, and so on. Users are able to walk around the room and get a close-up view of any representation of the data they would like. Additionally, they can use the HTC Vive controllers to manipulate the data: to translate or rotate it. This allows users not only to see their data up close, but also to see how their data relates both to itself in a different representation and to the other sources of data they may have provided.
SUMMARY OF CONTRIBUTIONS
There is still much work to be done when it comes to a polished, effective data visualization tool in virtual reality, but we believe that our project is a solid foot in the door. The objective of this project was to serve as a proof of concept that a non-technical person can give us the information they already have and have us convert it into a format that is compatible with a virtual reality platform.
The first step in this process was delineating a concrete but simple format for data that requires little to no technical knowledge. This format will be discussed in detail later in this paper, but it consists of well-placed whitespace and a column-focused approach to data formatting. The data is saved in a spreadsheet, which is placed in a folder specified by our instructions. This ease of use helps to break the barrier that currently exists for users to take advantage of meaningful applications in virtual reality. In fact, it allows users to copy and paste the data they already have and feed it into our system – a paradigm similar to what already exists for the two-dimensional systems these users most likely already use. Thus, not only is the barrier to entry for this program objectively low, the transition from the current paradigm to this one is simple as well.
Our parser takes advantage of this standardized formatting through the use of regular expressions and splitting. The information is then translated into a format that the virtual reality engine is able to understand. We developed a generalizable algorithm that takes raw data and the desired type of representation, and renders it in that form in a virtual space.

This rendering, in itself, required a number of models and some arithmetic to ensure that the arrangement of data is meaningful and reflective of what the user wants. In addition, we had to implement a natural way for users to manipulate and interact with this data: the ability to rotate and translate the points. We made sure that these interactions could be done in a natural way, and we will discuss these gestures in more detail later in this paper.
To put it simply, our contribution to the field is the ability for both non-technical and technical users to follow the paradigm they already use in order to project their hyper-dimensional data into a three-dimensional, virtual space, where they can interact with and manipulate their data in natural and meaningful ways.
RELATED WORK
Because virtual reality is a relatively new field, little work has been done so far on data visualization in virtual spaces. Here, we touch upon a few applications and exploratory projects that have focused on this topic. Despite the short length of this list, there is a burgeoning interest in a breakthrough in this area, and we fully expect more and more people to explore this field.
IMMERSIVE AND COLLABORATIVE DATA VISUALIZATION USING VR PLATFORMS [ii]
A group of researchers at the California Institute of Technology, in coordination with the Jet Propulsion Laboratory in Pasadena, CA and the University Federico II in Italy, explored how virtual reality platforms can help with the visualization of information in the era of “Big Data.” This era is particularly important because there is dramatic growth not only in the volume of data, but also in its complexity. Multidimensional data is no longer rare: data sets are now often a combination of numerical measurements, time series, images, and so on, and vectors can have tens of thousands of dimensions in order to describe certain phenomena. The two-dimensional tools in place today are simply not advanced enough to handle these types of data.
Their main argument is that “big data science” is no longer simply about the data – but instead
about the discovery and understanding of meaningful patterns hidden in the data. Thus,
visualization serves as the means through which we can merge the quantitative content of the
data and human intuition in order to uncover these patterns.
Accordingly, they have found that VR leads to better discovery in domains whose primary
dimensions are spatial. In addition, it has been shown to “increase situation awareness and vividness, interactivity and media richness, as well as the proprioceptive or kinesthetic properties of remote experiences…and enhance collaborative teleoperation scenarios” [iii].
Their working product is called iViz, a multiplatform application that can run either standalone or linked to a web browser. Users are able to explore the data alone or share a view with another person who navigates through the data. The user interface allows users to select and shuffle which data parameters are mapped to which graphical “axis” – these axes include XYZ positions, colors, shapes, etc.
In the creation of this application, they reached two conclusions. First, effective data visualization remains a key challenge in the era of big data. Second, VR as a technology gives us a significant, cost-effective route to solving this problem: its rapid development is paid for by the entertainment industry, and it offers scientists visual data exploration with the portability of their laptops and the same capabilities – at dramatically lower cost – as the existing multi-million-dollar cave-type installations.
MUSE: MULTIDIMENSIONAL USER-ORIENTED SYNTHETIC ENVIRONMENT [iv]
MuSe was developed by Creve Maples in coordination with the University of California, Berkeley and Sandia Labs. It is, in essence, a multi-sensory, three-dimensional learning engine. Though many tend to connect this technology with virtual reality, Dr. Maples prefers his own term – anthropo-cybersynchronicity – “the coming together of human and machine” [v].
The room in which MuSe is used contains a big-screen television, a video projector, and a monitor connected to a sophisticated workstation. While the previous work we mentioned focused on portability, MuSe focuses on sheer comprehensiveness. Its breakthrough lies in its use of knowledge about how people take in information – including pattern recognition, trend analysis, and anomaly detection. By taking advantage of this knowledge, it helps users unconsciously sort through the very large amount of sensory input available and pull out information patterns. The shell of this program can be wrapped around data, models, or simulations; it serves more as a framework than as a standalone application.
Simply stated, MuSe boasts a highly interactive environment that is able to map information to
various human senses, allowing users to experience and experiment with their data in natural
and groundbreaking ways.
SCIENCESIM [vi]
ScienceSim is, in essence, a “virtual land-grant” program offered by Intel. It uses OpenSim, a very
popular virtual-world platform used by people around the world, because of its ease of access
and established framework. Intel owns acres of virtual land on its servers and allows users to
simulate meetings with researchers from around the world. Researchers are able to “rent” the
virtual acres of land for six months at a time for no cost. The possible uses of this space are
endless.
For example, Aaron Duffy, a biology doctoral student, used 16 acres of this land to explore population genetics. He used the acres to mathematically model the reproduction of fern plants and the effect of these plants on the surrounding environment. The simulation served as an effective, simplified model of what may happen on Earth. Environmental parameters – altitude, humidity, etc. – could all be controlled, so any patterns and trends that arose were certain to have come from the variables being manipulated. The speed of research could also be accelerated – in this world, Duffy could make a day cycle pass in five minutes, allowing him to observe decades of genetic evolution in only a few days.
Not only does this platform allow users to see their data and its patterns, but it also helps in the
collection of the data itself.
CAVE AUTOMATIC VIRTUAL ENVIRONMENT (CAVE) [vii]
CAVE is a room-sized, immersive virtual reality environment that uses three-dimensional computer graphics and high-resolution projectors to create a complete sense of presence in a virtual space. Multiple users can be fully immersed in the same space at the same time. This setup allows users to analyze and interpret data – particularly spatially related data – and to intuitively navigate these environments, both individually and collaboratively.
The realness of the simulation is unparalleled. Standard CAVE systems are able to display
approximately one million pixels per wall, while the Mechdyne CAVE is able to deliver up to
sixteen million.
Rather than being geared towards video games or flight simulation, this system was motivated
by scientific visualization and is thus a particularly useful tool for this field. It was built for the
SIGGRAPH 92 Showcase, which allowed scientists to interactively present their research. Since
then, this system has evolved into, essentially, a virtual reality theater.
The main drawback of the CAVE is its very high cost of entry. An individual scientist working on research would most likely be unable to afford the opportunity to use this setup, which also requires extensive equipment and a particular arrangement of space. As a result, it is not portable: a scientist cannot simply put on a head-mounted display and view their data. They would need to go to a specific location, have their information in a specific format, and be prepared to pay a significant amount of money to access this multi-million-dollar installation.
THREE-DIMENSIONAL SPACE AND MEMORY [viii]
A study done at the University of California, Irvine by Craig Stark and Dane Clemenson, members of UCI’s Center for the Neurobiology of Learning and Memory, determined that three-dimensional environments help memory formation. It was already well understood before the study that exposing animals to a more stimulating environment – referred to as environmental enrichment – stimulates “neuroplasticity and improve[s] hippocampal function and performance on hippocampally mediated memory tasks” [ix].
The purpose of the study was to determine whether this remained true when the stimulating environment is a virtual one. To study this, Stark and Clemenson had three groups of participants undergo different “training” paradigms: one had no training, one trained on a two-dimensional game, Angry Birds, and another trained on a three-dimensional game, Super Mario 3D World. After two weeks of this training, those who underwent the three-dimensional training saw an increase in mnemonic discrimination ability. Increased individual performance in hippocampus-associated behaviors was correlated only with the three-dimensional training and not with the two-dimensional one, suggesting that individuals who explore three-dimensional virtual environments are able to retain memories better. This helps to motivate our project: it implies that by exploring and viewing their data in a three-dimensional environment, as in our case, users will be able to retain the information and patterns they detect with better accuracy.
DESCRIPTION
As stated in previous sections, our project can be divided into three main modules: the input
data formatting; the translation of this data into a customizable gallery of data; and the ability
for users to manipulate and interact with this data.
ACCESSIBILITY
From the very beginning of our project, there was much discussion about what our input format should look like. We knew it had to be standardized and generalizable enough to be applicable not only across input files and industries, but also within each file itself. We had originally envisioned an input format with a syntax similar to that of HTML, since it is perhaps one of the most easily adopted markup languages in practice today. After further research and discussions with researchers who would potentially like to use the program, however, we decided that even HTML is unnecessarily complicated for our purposes.
As a result, we decided to use whitespace as our delimiter rather than tags as in HTML. Whitespace divides the input text file simply and cleanly. We wanted our input format to allow users to simply copy and paste their data – which, after talking to several researchers, seems most often to be contained in spreadsheets such as Excel or CSV files – into any text editor. Whitespace and pre-determined headers allow our parser to differentiate between the type of data the user would like us to render, how they would like it visualized, and where in the gallery they would like it to appear. The number of columns in a section depends on the type of data they are working with.
To understand this better, let us consider a concrete example (Figure 1). This file is what a user would input in order for the program to create a map on the floor, on which bars of different colors appear at the specified coordinates.

The whitespace between this set of data and the one before it is what differentiates the two.

The keyword “Floor” informs the program that the following data will be projected onto the floor. The user could have written “Wall” instead to inform the program to project it onto one of the cubbies found on one of the four walls.

The next line states the type of visualization or representation the user would like for the data set. We currently support “scatter” for scatter plots, “bar” for bar graphs, and “globe” for a three-dimensional model of a globe with location-based bars. On this same line, the user may also specify whether they would like the graph of choice to be projected onto some sort of image – for example, bars on a map of the world. In this example, they would like to create a bar graph on top of an image of a map of the United States. This image must be placed in the same folder as the input file in order for it to be rendered in the virtual space.
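The paper does not spell out the column layout for the “globe” type, so as an illustration only, assume each data row carries a latitude and longitude. Placing a location-based bar on the globe then comes down to the standard spherical-to-Cartesian conversion (the radius and axis convention here are assumptions, not the project’s actual values):

```python
import math

def latlon_to_xyz(lat_deg, lon_deg, radius=100.0):
    """Map a latitude/longitude pair (in degrees) onto a sphere of the
    given radius, returning (x, y, z) in the globe's local space."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = radius * math.cos(lat) * math.cos(lon)
    y = radius * math.cos(lat) * math.sin(lon)
    z = radius * math.sin(lat)
    return (x, y, z)
```

A bar anchored at that point would then be oriented along the outward normal of the sphere, which for a sphere centered at the origin is simply the normalized position vector.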
Figure 1: Example of an input data file

A simple syntax and keyword cheat sheet follows.

DiVE Syntax and Keywords

    Line     Type                     Keywords
    0        Whitespace               \n
    1        Location                 Floor | Wall
    2        Graph Type               Scatter | Bar | Globe
    2        Floor Image (optional)   [a-zA-Z0-9_&-]*
    3…N-1    Data                     Bar, no image:
                                        x-coord y-coord z-coord size color
                                      Bar, image:
                                      Scatter:
                                        x-coord y-coord z-coord radius color
                                      Globe:
    N        Whitespace               \n
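To make the blank-line-delimited layout concrete, here is a minimal sketch of the splitting step. The sample values and the image name `usa_map` are hypothetical, and the real parser runs inside Unreal Engine 4; this only illustrates how runs of whitespace separate one graph’s section from the next:

```python
import re

# Hypothetical input in the whitespace-delimited format described above:
# one blank-line-separated section per graph, with a location line, a
# graph-type line (optionally naming a background image), then data rows.
SAMPLE = """Floor
Bar usa_map
10 20 0 5 red
30 40 0 8 blue

Wall
Scatter
1 2 3 0.5 green
"""

def split_sections(text):
    """Split the input file into sections on runs of blank lines."""
    return [s.splitlines() for s in re.split(r"\n\s*\n", text.strip())]

sections = split_sections(SAMPLE)
# sections[0] holds the Floor/Bar section; sections[1] the Wall/Scatter one.
```

Each resulting section can then be read line by line: location, graph type (plus optional image), and one whitespace-separated data row per remaining line.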
PARSING AND RENDERING FOR A THREE DIMENSIONAL ENVIRONMENT
The text file described in the previous section is then parsed within our program and translated
into a number of classes that exist in a hierarchy. At the highest level, we have a RoomManager
class which is the entry point of the program and is responsible for parsing the text file.
We decided to have a Room class that encapsulates one story of the gallery. A room contains
four walls and a floor. Each wall, in turn, has one to three cubbies – which are, in our case,
considered “floors” in the wall in order to simplify the movement from the cubby in the wall to
the floor of the gallery.
Thus, the walls of the rooms are each instantiated objects of type Wall, and the floor of the
gallery, as well as the cubbies in the walls, are Floor objects. The Floor objects each have data
points, which are each a Data object, as well as a FloorMesh, used only if the user specified an
image in the text file for that particular wall and floor pair.
These Data objects allow us to generalize our data representations, so that the same object can
encapsulate scatter or bar graphs. The ways in which we can interact with the data and how it
can be rendered are dependent on the type of data being represented.
Essentially, a RoomManager handles Rooms. Each Room can contain a single Floor and four
Walls, each with one to three Floors within them. Each Floor can contain any number of Data
objects, as well as a FloorMesh if an image is specified.
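The containment hierarchy above can be sketched as follows. The actual classes live inside Unreal Engine 4, so the field names and constructor shapes here are illustrative assumptions rather than the project’s real implementation:

```python
# A sketch of the containment hierarchy: RoomManager -> Room -> Wall/Floor
# -> Data, with cubbies in walls modeled as Floor objects.

class Data:
    def __init__(self, coords, size, color):
        self.coords = coords      # (x, y, z) position of the point or bar
        self.size = size
        self.color = color

class Floor:
    """A surface holding Data points; cubbies in walls are also Floors."""
    def __init__(self, graph_type, image=None):
        self.graph_type = graph_type   # "scatter", "bar", or "globe"
        self.mesh_image = image        # FloorMesh image, if one was given
        self.points = []               # Data objects

class Wall:
    def __init__(self, cubbies):
        self.cubbies = cubbies         # one to three Floor objects

class Room:
    """One story of the gallery: four Walls plus a single gallery Floor."""
    def __init__(self, floor, walls):
        assert len(walls) == 4
        self.floor = floor
        self.walls = walls

class RoomManager:
    """Entry point: parses the input file and spawns Rooms."""
    def __init__(self):
        self.rooms = []
```

Treating cubbies as Floor objects, as described above, is what lets the same code move data between a cubby and the gallery floor without special cases.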
INTERACTION WITH THE VIRTUAL ENVIRONMENT AND DATA
Once the gallery has been spawned, the user is able to move around the room and interact with the data they have chosen to render. To explore this virtual environment, the user uses the HTC Vive, a virtual reality headset developed by HTC and Valve Corporation. The head-mounted display allows users to look around the gallery and truly see the data before them, as the first-person camera in the virtual world follows the direction and orientation of the headset. The two controllers are responsible for the actual manipulation of the environment and appear as realistic, three-dimensional models in the virtual world, making them easy to track and comfortable to use.
The left controller’s trackpad controls movement. Applying any pressure along the x-axis of the trackpad causes the camera to move east or west, while applying pressure along its y-axis causes the camera to move north or south. The left trigger, when pressed, fires a line-tracing algorithm that returns the first actor that is hit. If the object hit is a Floor object, or if its root component is a Floor object, then we translate all of the data encapsulated in that container one cm in the positive x direction. This translation stops when the user releases the left trigger.
The right controller’s trackpad controls rotation. Applying pressure to the eastern hemisphere of the trackpad causes the camera to rotate clockwise, while applying pressure to the western hemisphere causes it to rotate counterclockwise. The combined use of this trackpad and the left trackpad is what allows users to seamlessly explore the virtual environment. The right trigger, when pressed, also fires a line-tracing algorithm that returns the first actor hit, and the same process as for translation begins, but for rotation. Pressing the left controller’s side button also fires a line-tracing algorithm; if the data hit is found within a cubby, it is swapped with what is currently present on the floor of the gallery.
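The bindings above can be summarized as a per-frame dispatch. The real logic is implemented against the Unreal Engine 4 / SteamVR input API, so the function and parameter names here are assumptions used purely to restate the mapping:

```python
# Illustrative per-frame dispatch of the controller bindings described above.

STEP_CM = 1.0  # translation applied while the left trigger is held

def handle_frame(left_pad, right_pad, left_trigger, right_trigger, line_trace):
    """left_pad/right_pad are (x, y) trackpad pressures in [-1, 1];
    line_trace() returns the first Floor hit by the controller's ray,
    or None. Returns a dict of actions for this frame."""
    actions = {}
    # Left trackpad: east/west on x, north/south on y (character-relative).
    if left_pad != (0.0, 0.0):
        actions["move"] = (left_pad[0], left_pad[1])
    # Right trackpad: east hemisphere rotates the camera clockwise,
    # west hemisphere counterclockwise.
    if right_pad[0] > 0:
        actions["rotate_camera"] = "clockwise"
    elif right_pad[0] < 0:
        actions["rotate_camera"] = "counterclockwise"
    # Triggers: line-trace to a Floor, then translate or rotate its data.
    if left_trigger:
        target = line_trace()
        if target is not None:
            actions["translate_data"] = (target, STEP_CM)  # +x direction
    if right_trigger:
        target = line_trace()
        if target is not None:
            actions["rotate_data"] = target
    return actions
```

The important design point is that both triggers share the same line-tracing step and differ only in the action applied to the Floor they hit.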
TAKEAWAYS
DATA VISUALIZATION IS IMPORTANT
In the end, much of our world is composed of data. But raw data, regardless of its volume, tends to be meaningless without analysis. One way to analyze data is through data visualization tools. Data visualization turns numbers and letters into meaningful representations in order to facilitate the recognition of patterns, trends, and outliers. There are a number of tools that perform these tasks, but that number diminishes quickly as the dimensionality of the data being represented increases.
Thus, it is important that more focus be put into building tools sophisticated enough to handle our increasingly complex data. Many researchers we have talked to, particularly those involved in medical research – as in the case of the Early Life Stress and Pediatric Anxiety Program (ELSPAP) group in the Stanford School of Medicine – have thousands of records, each with thousands of fields, and the tools that exist to visualize these records are few and far between. To understand how age, sex, and early traumatic or stressful experiences may affect the development of a child, these researchers need to be able to see visually how these variables relate to one another. But this hyper-dimensional data cannot be envisioned effectively in two dimensions. With a program like ours, however, these researchers would be able to see not only how their data points relate to one another, but also how the variables of a single data point relate to one another.
VIRTUAL REALITY ISN’T A BE-ALL END-ALL SOLUTION
Virtual reality has allowed us to create and interact with environments in ways we could not before. Yes, three-dimensional data visualization tools existed prior to the introduction of VR. But those tools projected three-dimensional entities onto a two-dimensional space, and so true interactivity and a true three-dimensional experience simply could not exist on those platforms. Though virtual reality solves this problem, it brings many problems of its own.
MOVEMENT
Using controllers for movement, rather than actually walking around the room, is a double-edged sword. On the positive side, it allows anyone to use the program, regardless of how much physical space they have. On the negative side, the movement can be jerky and unnatural, both in how it is initiated and in how it is translated into the world.

The HTC Vive controller’s trackpads are very sensitive, and the slightest pressure – much like on a laptop trackpad – will cause movement. Accidental touches can be dizzying and inconvenient. Additionally, the movement does not communicate with the head-mounted display. The x- and y-axes of the trackpad handle the east-west and north-south movements of the camera, respectively, but with respect to the center of mass of your “character” in the virtual world, not the orientation of the headset itself. Thus, if you were to turn your head to the right and then use the trackpad, you would not move in the direction you expect: you might be moving west with your head turned to the right, rather than northwest of your original location.
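The mismatch comes down to a missing coordinate transform: the trackpad vector is applied in the character’s frame rather than the headset’s. As a sketch (not our implementation; the function name and yaw convention are assumptions), a headset-relative scheme would rotate the input vector by the head’s yaw before applying it:

```python
import math

def world_move(pad_x, pad_y, head_yaw_deg, headset_relative=False):
    """Turn trackpad input into a world-space (east, north) movement vector.
    head_yaw_deg is the headset's yaw, measured clockwise from north.
    With headset_relative=False (the current behavior), yaw is ignored;
    with True, the input is first rotated into the headset's frame."""
    if not headset_relative:
        return (pad_x, pad_y)
    yaw = math.radians(head_yaw_deg)
    # Clockwise 2-D rotation of the input vector by the head's yaw, so
    # pushing "forward" always moves in the direction the user is facing.
    east = pad_x * math.cos(yaw) + pad_y * math.sin(yaw)
    north = -pad_x * math.sin(yaw) + pad_y * math.cos(yaw)
    return (east, north)
```

With the head turned 90° to the right, pushing “north” on the pad would then move the user east, matching their gaze, instead of still moving them north.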
Additionally, virtual reality sickness is a very real phenomenon. When testing our project, a considerable number of people began to feel nauseated from the movement, rotation, and delay. This delay became more and more noticeable as we filled the gallery with increasing amounts of complex data.
FULL IMMERSION
We wanted this program to make users feel as if they were physically surrounded by their data, able to manipulate it with natural movements based on what they are already used to. We had originally intended for the gestures for rotation, translation, and zooming to be very similar to those used on smartphones – swiping, pinching, and so on. This turned out to be very cumbersome and inefficient to implement in Unreal Engine with the HTC Vive. It became clear that the calculations necessary to implement these gestures would not be worth the user experience. Using the buttons and triggers on the controller instead would make transitions and actions more seamless, at the expense of the more natural movements.
Due to this use of buttons and triggers, however, the experience feels less like a physical, realistic
gallery in which you can touch and mold your data, and instead more like a game world that
contains your data. This isn’t necessarily a negative result – there is just as much interactivity and
manipulation of data as we had originally intended. In a way, pressing triggers and buttons
rather than having to pull, pinch, and wave the controllers while keeping them at a certain
orientation so that they still point at the object of interest is much less cumbersome and makes
better use of space. Thus, it becomes a question of what is more important: the ease and
convenience of use or the natural and realistic nature of it.
SOCIABILITY
Perhaps one aspect of the project that is missing – because the tools for it simply do not exist right now – is the social component. Multiple users are not able to interact seamlessly in this environment. The customizable gallery and its interactivity are only truly available to the person wearing the headset and holding the controllers. Yes, others may see it on the monitor, but the full experience is given only to the person who has access to the headset. We were unable to hook up multiple HTC Vive units to the same computer, or to different computers displaying the same gallery, such that users would feel as if they were exploring the gallery with another person.
This is perhaps one of the larger drawbacks of the system: the technology simply hasn’t caught up with this aspiration. Some companies are working on the problem: Microsoft, for example, is working on a multi-person virtual reality platform [x]; Pantomime is a startup enabling people to share virtual worlds using headsets and smartphones [xi]; and Facebook is working on a “social VR” to connect two or more real people in a virtual world [xii]. We believe that the ability to explore galleries with other users is crucial should an industry-grade version of this program ever come to fruition. Being able to interact with and manipulate your data with the help and observation of other team members is essential to understanding and detecting trends. Often, what one user cannot see in the data will be plainly obvious to another. It is observation and discussion of what is displayed in the gallery that will make the three-dimensional, interactive environment all the more worthwhile.
NOT ALL DATA IS MADE THE SAME
While we were still planning and pitching our project idea, we were confronted with many questions. Perhaps the most popular was: why bother? What can this project do that existing tools cannot? This is a very valid question, and it is what led us to understand that not all data is made the same. Some types of data are enhanced by our program, and others are not.
Two-dimensional data is captured and visualized perfectly well in existing tools. Our program will
not enhance this experience. Adding three dimensional pie charts or line graphs would be
gimmicky and unnecessary in virtual reality. Being able to interact with a pie chart or with a line
graph in a three dimensional space will most likely fail to reveal patterns and trends that are not
already discernible on a two dimensional screen or poster.
Hyper-dimensional data, however, is enhanced by this program. A two-dimensional plane cannot accurately capture all the information offered by these hyper-dimensional data points. Data is more than just coordinates in space: colors, shapes, and the relations between points all carry meaning, and this simply cannot be conveyed on a piece of paper. The relative distances between points along all three axes are just as important as the absolute locations of those points on the graph. Seeing the data points in three dimensions allows us to see their spread and find potentially significant clusters and outliers.
The ability to walk around and into your data is also particularly helpful, and is most realistically accomplished in a program like ours. A user can “become” a data point and see how all the other data exists with respect to that point. We can see the spread of data from the view of a particular outlier, or see how the spread exists around the mean of the distribution. The ability to view data from any direction, in any orientation, allows us to see trends and patterns that would be indiscernible otherwise. Similarly, the ability to rotate, translate, and manipulate data allows us to see how it flows, how the data points move with respect to one another and what significance that may hold. In essence, for these types of hyper-dimensional data, our program allows users to not only see their data, but experience it in a way that was not possible before.
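The core idea here, reducing an n-dimensional record to three spatial axes plus extra visual channels such as color and size, can be sketched in a few lines of Python. This is a minimal illustration only, not our Unreal Engine implementation, and the field names are hypothetical:

```python
# Map a hyper-dimensional record onto the visual channels of a 3D gallery:
# three spatial axes, plus color and size for two further dimensions.
# Field names below are hypothetical examples, not a real study's schema.

def to_visual_point(record, x_key, y_key, z_key, color_key, size_key):
    """Reduce one n-dimensional record to a renderable five-channel point."""
    return {
        "position": (record[x_key], record[y_key], record[z_key]),
        "color": record[color_key],  # e.g. mapped to a hue ramp at render time
        "size": record[size_key],    # e.g. mapped to mesh scale at render time
    }

# A toy patient-style record: five of its many dimensions chosen for display.
record = {"age": 12, "anxiety_t": 61, "social_t": 48, "grade": 7, "visits": 3}
point = to_visual_point(record, "age", "anxiety_t", "social_t", "grade", "visits")
print(point["position"])  # (12, 61, 48)
```

Which dimensions become position and which become color or size is itself an analytic choice; our prototype leaves that choice to the user.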
USER EXPERIENCE IS PARAMOUNT
It took us a long time to reach the conclusions of the previous section. We knew, intuitively, that viewing higher-dimensional data in a higher-dimensional environment would be more effective than viewing it in a limited space, but we could not pinpoint exactly how. The answer was revealed to us through user studies.
We have had the pleasure of working with a variety of researchers in discovering how they
would use our program should it accomplish all of the goals we had originally set out. A
particularly revealing user study that we conducted came from interviews with members of the
Early Life and Stress Pediatric Anxiety Program (ELSPAP)xiii lab in the Stanford School of Medicine.
This program seeks to explore and treat patients for disorders such as separation anxiety, panic
disorder, agoraphobia, post-traumatic stress disorder, and obsessive compulsive disorder. The
research division of this program collates data from thousands of patients to study how early
stress and traumatic experiences affect not only the occurrence of these disorders in patients,
but also their potential treatments. The number of variables in these types of studies is staggering: from demographic information (age, grade, sex, location, etc.) to assessments (t-values and raw scores for anxiety, hyperactivity, social skills, locus of control, etc.). As a result, researchers have been restricted to keeping Excel sheets for these records, which makes it daunting to comb through the data for patterns and trends. In addition, it can be difficult to see how one
subject may compare to another due to the linear ordering and column-separated format of
records. Through our prototype, researchers can now see visually how data points and subjects compare to one another along whichever axes they choose. Many of those we asked believed that this program would enhance their ability to understand and learn from the data collected in their studies, given its sheer volume and complexity.
In addition, they commented on the fact that this program would be particularly helpful when
talking to investors and sponsors of studies. Often, these people are not experts in the fields of
the projects and studies that they are potentially sponsoring. Thus, Excel sheets of thousands of records with thousands of fields are meaningless to them. Having a “gallery” of
data – where this data can be everything from information on a single subject to a collation of
every single subject examined in a particular study – would change this. Investors would be able to see not only the sheer volume of data collected, and extrapolate for themselves how much can be learned from it, but also the trends in the data. They would thus be more inclined to invest in future studies, or to appreciate and understand the progress being made in a study already underway.
FUTURE WORK
Though we believe that our project is a solid proof of concept for the arguments and goals
presented in this paper, there are a number of things that we would have liked to accomplish in
addition to those we have presented, should we have had more time.
At the moment, our project spawns a four-walled room with space for 13 dataset representations. But plenty of studies would require more space than this allotment.
One way to solve this problem would be to allow multiple stories in this gallery, where each story
could represent a particular aspect of the study. This solves two issues: one, any amount of data
can be rendered and displayed in this arrangement and two, each story could cater to a
specific topic. The latter addresses the fact that in the current arrangement, users have two choices: they can fill a room with data from separate studies, which could be confusing, or they can spawn multiple rooms, but must then remove themselves from the environment, reset the data, and restart the program to view each one. Having multiple stories in a gallery, each catering to whatever dataset and aspect of the study the users would like, with easy navigation between stories, would solve all of these issues.
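The placement arithmetic such a multi-story gallery might use is simple. A minimal sketch, assuming the current capacity of 13 display slots per room (the function name and layout scheme are hypothetical, not part of our implementation):

```python
SLOTS_PER_FLOOR = 13  # matches the current single-room capacity

def gallery_placement(dataset_index, slots_per_floor=SLOTS_PER_FLOOR):
    """Assign a dataset to a (floor, slot) pair so that any number of
    datasets can be laid out across stacked gallery stories."""
    floor, slot = divmod(dataset_index, slots_per_floor)
    return floor, slot

# The 14th dataset (index 13) becomes the first slot on the second floor.
print(gallery_placement(13))  # (1, 0)
```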
In terms of aesthetics, we would have liked to find and create better models and meshes for more pleasing displays of data. Currently, the shapes and colors of the bars and scatter plots display additional dimensions as specified by the users, but it would be interesting to see whether additional textures could enhance the experience.
For example, in a scatter plot about movies, each point could be textured with the movie’s poster in order to see whether the use of colors in the poster helped boost box office profits (one of the data point’s x, y, or z coordinates) on opening night.
One of the aspects of the project that motivated us towards choosing it as our project for the
quarter was the sheer number of ways we could interact with the data in the gallery. To this end,
there are many other interactions that we could have implemented in addition to those that we
already have. Zooming into certain points in the data could help to reveal patterns that are only
visible when focusing on a particular cluster in the distribution. Adding a time element to our
data could be especially interesting: for example, with the click of a button, users could be able
to see a bar graph showing election results for democrats and republicans over the last century
morph and change with time over the fifty states of America. Another interaction that we think
would be worthwhile is the ability to touch any particular data point in a graph, and have that
motion pull up a display of additional information. Going back to the movie example, touching
one point that represents a particular movie could bring up a textual display of the movie’s
Rotten Tomatoesxiv page so that the user can see how the movie was reviewed by critics and
gauge the relationship between that and its position in space.
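At its core, the touch-to-inspect interaction described above is a nearest-neighbor query against the controller’s position. A minimal sketch in Python, where the reach threshold and scene units are hypothetical rather than values from our implementation:

```python
import math

def pick_point(controller_pos, points, max_reach=0.1):
    """Return the index of the data point nearest the controller,
    or None if nothing lies within max_reach (in scene units)."""
    best_index, best_dist = None, max_reach
    for i, p in enumerate(points):
        dist = math.dist(controller_pos, p)  # Euclidean distance in 3D
        if dist <= best_dist:
            best_index, best_dist = i, dist
    return best_index

# Three toy data points; the controller hovers near the second one.
points = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (1.05, 1.0, 1.0)]
print(pick_point((1.0, 1.02, 1.0), points))  # 1
```

The picked index would then be used to look up and display the point’s additional information, such as the Rotten Tomatoes page in the movie example.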
Additionally, we chose to focus on this project’s potential application in the medical field
through our interviews and discussions with the members of the ELSPAP study in the Stanford
School of Medicine. We chose to work with this particular field because it was the very first,
intuitive example we could think of when it came to non-technology related, hyper-dimensional
data that could possibly benefit from our program. We would love to be able to explore other
industries: for example, talking to those involved in the current presidential election and seeing
whether this program could help to predict who will win not only the democratic nomination but
the overall presidential election itself.
As mentioned in a previous section, we would have also loved to incorporate a social aspect to
the exploration of this gallery of data. Being able to see your data while discussing what you see
with another person – who may see something entirely different not only due to their own
experiences, biases, and knowledge, but also due to the fact that they could be viewing the
exact same data in the exact same setting but in a totally different orientation or perspective –
would enhance the effectiveness and usefulness of this project immensely. Much work has to be
done in adding social aspects to virtual reality programs in general, but should an effective
means of simulating a social setting ever come to fruition, this project would benefit greatly from
it.
CONCLUSIONS
From being complete strangers to the field of virtual reality to where we are now, we can gladly say that by the end of this quarter we have learned as much from the field as we hope to have contributed to it. Not only have we grown to appreciate the true usefulness of data visualization, but we now understand what it is about these tools that enhances the analysis of data. Taking advantage of these insights, we built a realistic, three-dimensional virtual environment that allows users to interact with and manipulate their data.
Because of the time constraints of this quarter, there are many additional components that we
were unable to implement in our project. But we hope that this prototype not only conveys the
fundamental goals we had set as we were planning this project, but also motivates others to
build upon our product to help to revolutionize data visualization through the use of virtual
reality.
ACKNOWLEDGMENTS
We would like to thank a number of people for their help in the creation, development, and
completion of both this project and this paper. First and foremost, we would like to thank
Professor Monica Lam for her genuine interest and contagious enthusiasm on the various topics
we had pitched for this quarter. We were originally conflicted on whether or not virtual reality
was a field we wanted to or even could dip our feet into, and her unwavering support and
advice helped us immensely. We are also thankful for our consistently helpful teaching assistants,
Michael Fischer and Giovanni Campagna. Michael’s support and willingness during office hours to scour the documentation with us and Giovanni’s skepticism towards the usefulness of a virtual,
to scour the documentation with us and Giovanni’s skepticism towards the usefulness of a virtual,
three-dimensional data environment helped to both motivate our project and execute it
effectively. This paper’s completion was undoubtedly helped by Mary McDevitt, our advisor from
the Technical Communication Program. Additionally, we would not have been able to even
consider creating this project without the help of the Computer Science department at Stanford
University, as well as HTC and the Valve Corporation, who gave us access to their fantastic
virtual reality headset, through which our project was possible. Finally, we would like to thank the
ELSPAP team for consulting with us on this project and helping us to understand how our project
could be most helpful.
REFERENCES
i "What Is Virtual Reality? - Virtual Reality." Virtual Reality Society. Virtual Reality Society, 24 Dec. 2015.
Web. 31 May 2016.
ii Donalek, C., Djorgovski, S., Davidoff, S., Cioc, A., Wang, A., Longo, G., Norris, J.S., Zhang, J., Lawler,
E., Yeh, S., 2014. Immersive and Collaborative Data Visualization Using Virtual Reality Platforms.
arXiv preprint arXiv:1410.7670.
iii Donalek, C., Djorgovski, S., Davidoff, S., Cioc, A., Wang, A., Longo, G., Norris, J.S., Zhang, J., Lawler,
E., Yeh, S., 2014. Immersive and Collaborative Data Visualization Using Virtual Reality Platforms.
arXiv preprint arXiv:1410.7670.
iv Maples, Creve, and Craig Peterson. "MuSE (Multidimensional, User-Oriented Synthetic Environment): A Functionality-Based, Human-Computer Interface." The International Journal of Virtual Reality 1.1 (1995): 2-9. Web. 31 May 2016.
v Maples, Creve, and Craig Peterson. "MuSE (Multidimensional, User-Oriented Synthetic Environment): A Functionality-Based, Human-Computer Interface." The International Journal of Virtual Reality 1.1 (1995): 2-9. Web. 31 May 2016.
vi "ScienceSim: A Virtual Environment for Collaborative Visualization and Experimentation." Intel®
Software. Intel, 9 Sept. 2011. Web. 31 May 2016.
vii Kenyon, Robert. "The Cave Automatic Virtual Environment: Characteristics and
Applications." Human-Computer Interaction and Virtual Environments 3320 (1995): 149-68. Web.
31 May 2016.
viii Clemenson, Gregory D., and Craig E.L. Stark. "Virtual Environmental Enrichment through Video
Games Improves Hippocampal-Associated Memory." The Journal of Neuroscience 35.49 (2015):
16116-16125. JNeurosci. Web. 31 May 2016.
ix Clemenson, Gregory D., and Craig E.L. Stark. "Virtual Environmental Enrichment through Video Games Improves Hippocampal-Associated Memory." The Journal of Neuroscience 35.49 (2015): 16116-16125. JNeurosci. Web. 31 May 2016.
x Knight, Will. "Microsoft Researchers Are Working on Multi-Person Virtual Reality." MIT Technology
Review. MIT Technology Review, 12 Oct. 2015. Web. 31 May 2016.
xi Takahashi, Dean. "Pantomime Lets VR Headset and Smartphone Users Share a
Virtual space." VentureBeat. VentureBeat, 10 Feb. 2016. Web. 31 May 2016.
xii Wagner, Kurt. "Facebook Shows Us What It Means to Be 'Social' in Virtual Reality (Video)."Recode.
Recode, 13 Apr. 2016. Web. 31 May 2016.
xiii http://med.stanford.edu/elspap/elspap.html
xiv https://www.rottentomatoes.com/