For my final project in Adv. Computer Graphics I decided to pursue my current interest in augmented reality. Through our discussions in class I became interested in experimenting with participatory augmented reality and being able to “share” a space. After proposing the idea I started building the infrastructure needed to experiment with multiplayer Augmented Reality experiences.
Early on, I recognized that both ARKit and ARCore are great at persistence and user experience, so I decided to combine the markerless approach ARKit offers with QR code detection (a more traditional computer vision application) using OpenCV, to try to calculate the “offset” from each client’s camera to the origin of its world coordinate space. This offset would be turned into a transformation matrix that I could then share between clients and use to transform shapes so they appear at (roughly) the same spot in the real world. OK, enough with the technical stuff; this is how I imagined it would work:
I decided to use Apple’s Swift language for simplicity and built the project specifically for ARKit. Once I got a stable version working, I recorded a video of the engine with a simple cube-placing experiment (thanks to Roi for helping out).
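To make the “offset” idea a bit more concrete, here is a minimal sketch of the math, not the actual engine code. It assumes each client can express the pose of the same QR code in its own world coordinates as a rigid transform (rotation plus translation). The `RigidTransform` type and the `worldAToWorldB` function are names I made up for this illustration; in a real ARKit app the poses would more likely be 4x4 matrices.

```swift
// A rigid transform: rotate by `r` (row-major 3x3), then translate by `t`.
struct RigidTransform {
    var r: [[Double]]
    var t: [Double]

    // Applies the transform to a 3D point.
    func apply(_ p: [Double]) -> [Double] {
        var out = t
        for i in 0..<3 {
            for k in 0..<3 { out[i] += r[i][k] * p[k] }
        }
        return out
    }

    // Inverse of a rigid transform: rotation R^T, translation -R^T * t.
    func inverted() -> RigidTransform {
        var rt = r
        for i in 0..<3 {
            for j in 0..<3 { rt[i][j] = r[j][i] }
        }
        var ti = [0.0, 0.0, 0.0]
        for i in 0..<3 {
            for k in 0..<3 { ti[i] -= rt[i][k] * t[k] }
        }
        return RigidTransform(r: rt, t: ti)
    }

    // Composition: the result applies `other` first, then `self`.
    func composed(with other: RigidTransform) -> RigidTransform {
        var rr = [[Double]](repeating: [0.0, 0.0, 0.0], count: 3)
        for i in 0..<3 {
            for j in 0..<3 {
                for k in 0..<3 { rr[i][j] += r[i][k] * other.r[k][j] }
            }
        }
        return RigidTransform(r: rr, t: apply(other.t))
    }
}

// Maps coordinates in client A's world into client B's world, given where
// each client saw the same QR marker in its own world (hypothetical helper).
func worldAToWorldB(markerInA: RigidTransform, markerInB: RigidTransform) -> RigidTransform {
    // World A -> marker-local -> world B.
    return markerInB.composed(with: markerInA.inverted())
}

// Made-up example poses: client A sees the marker 1 m along x, unrotated;
// client B sees it 2 m along z, rotated 90 degrees about the y axis.
let identity: [[Double]] = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
let yaw90: [[Double]] = [[0, 0, 1], [0, 1, 0], [-1, 0, 0]]
let markerInA = RigidTransform(r: identity, t: [1, 0, 0])
let markerInB = RigidTransform(r: yaw90, t: [0, 0, 2])

let aToB = worldAToWorldB(markerInA: markerInA, markerInB: markerInB)
let cubeInB = aToB.apply([1, 0, -1])  // a cube client A placed at (1, 0, -1)
```

Client A would send the cube’s position, client B would run it through `aToB`, and both should then see the cube at roughly the same physical spot. How close “roughly” is depends entirely on how accurate the marker detection is.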
Thinking about an AR Google Drive
After the class discussion on participatory Augmented Reality I got really interested in how we perceive our digital assets through 2D abstractions. I have been using Google Drive for quite some time, and it always struck me how intuitive Google was able to make it feel, even though storing images, audio and video files in a folder structure is unintuitive by its nature: these elements were once tangible. When I suggested this in class, it sparked a discussion in which Ken mentioned that he is able to find an item in his home even though it could be one of 20,000 or more items, whereas using a service like Google Drive means you are bound to lose files because of how insignificant the action of storing them in some folder is. It’s nothing compared to putting something on a specific shelf in your home.
Since one of the main uses of storing files in the cloud is sharing, once I was able to build the multiplayer system I started thinking about how it could be used for storing files in Augmented Reality and inviting people to explore the files with you.
I was able to prototype the idea of “uploading” files to a specific location in space. Here is an image of that:
I would like to combine the multiplayer engine with a prototype for organizing digital assets in augmented reality.
I would like to add sharing capabilities where people could enter the space and be exposed to the files you shared with them.
For this experiment I chose to focus on spatializing information in Augmented Reality. The idea of spatializing information is not new in any sense and dates back to perhaps the invention of signage (or perhaps even earlier examples could be argued). That said, it seems that advances in the accessibility of Augmented Reality, predominantly the release of Apple’s ARKit and Google’s ARCore, call for different approaches and models when spatializing information, or to be precise, when drawing digital information in physical space. Given our shared interest in the subject, I collaborated with Anastasis Germanidis to produce a speculative experiment using Twitter data in Augmented Reality.
Spatializing information in AR feels much more like cave glyphs than street signs: they are graphic, associative and story-driven.
During the past couple of years I have been experimenting with VR quite a lot. While creating fantasy-driven, narrative and documentary VR experiences, ‘mimicking life’ has always struck me as an impossible goal. The paradigm shift that AR suggests is that, at the core of the experience, you are the focus (iPod, iPhone, iLife). As content is ‘interacting’ with your environment in a place of your choosing, our ‘spidey-sense’ for telling fiction from non-fiction goes numb and we buy fully into more hybrid experiences. A good analogy is that ‘realistic’ VR experiences feel like mockumentary films, while AR feels more like documentary (to me!).
Twitter in Augmented Reality
First off, in order to connect the real world to the digital world we need a bridge that allows us to capture some (very little, but still) of the thought process that we go through between seeing things and thinking about ideas (yes, this ties perfectly into Peirce’s theory of signs and semiotics in general). To do that, we started by looking into another one of Apple’s new innovations, CoreML. At its essence, CoreML is an optimized engine for running pretrained machine learning models on iDevices. Apple also released quite a few pretrained models themselves, so given our desire to classify objects from the real world we decided to use the Inception v3 model, which is trained to detect objects in images and classify them into 1,000 categories.
*We also found this example to be super useful when starting an ARKit/CoreML project.
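As a rough illustration of what happens after the model runs, here is a small sketch of turning a list of classification results into a single search term. This is not our actual app code and it deliberately avoids the CoreML API so it runs anywhere: the `Prediction` type, the `bestLabel` function and the 0.3 confidence threshold are all assumptions made up for this example. It also assumes that labels can list synonyms separated by commas, as ImageNet-style labels often do.

```swift
// One classifier result: a label and how confident the model is in it.
// (Hypothetical type standing in for whatever the model returns.)
struct Prediction {
    let label: String
    let confidence: Double
}

// Picks the most confident label, or nil when nothing clears the threshold.
// The default threshold is an arbitrary value chosen for illustration.
func bestLabel(from predictions: [Prediction], minimumConfidence: Double = 0.3) -> String? {
    guard let top = predictions.max(by: { $0.confidence < $1.confidence }) else {
        return nil  // no predictions at all
    }
    guard top.confidence >= minimumConfidence else {
        return nil  // the model is not sure enough about anything
    }
    // Keep only the first synonym, e.g. "prison, prison house" -> "prison".
    if let first = top.label.split(separator: ",").first {
        return String(first)
    }
    return top.label
}

// Made-up classifier output for a single camera frame.
let framePredictions = [
    Prediction(label: "chain-link fence", confidence: 0.2),
    Prediction(label: "prison, prison house", confidence: 0.55)
]
let searchTerm = bestLabel(from: framePredictions)
```

Returning `nil` for low-confidence frames is one way to avoid searching for tweets on every shaky guess, at the cost of sometimes showing nothing at all.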
The art of association
Even though the machine learning model worked better than either of us had anticipated, it works nothing like our brain does (sometimes I really am happy I took media studies in film school). Continuing on that, since our brain is such a phenomenal ‘associative computing engine’, we are able to bridge the gap with our own context of the scenario even when the machine learning classification is wrong. Which raises the question: what is ‘wrong’?
From index to tweet
Once we got the machine learning apparatus running it was time to get some data based on it. We hooked up to the Twitter API using a Swift library and started parsing tweets. By adding some filters to the parsing process we were able to get to a decent point where the tweets were closely related to the classification category.
Once we had all the rather technical parts in place, we started sketching a design that would deliver the message of this experiment.
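To give a sense of what such filtering could look like, here is a minimal sketch. It is not the set of filters we actually used: the `Tweet` type, the `relevantTweets` function and the two rules (drop retweets, require the category as a whole word) are assumptions chosen for illustration, and the sketch only handles single-word categories.

```swift
// A tweet reduced to the two fields this sketch needs (hypothetical type).
struct Tweet {
    let author: String
    let text: String
}

// Keeps tweets that mention the category as a whole word and are not retweets.
func relevantTweets(_ tweets: [Tweet], for category: String) -> [Tweet] {
    let keyword = category.lowercased()
    return tweets.filter { tweet in
        // Assumed rule 1: skip retweets, which tend to be duplicates.
        if tweet.text.hasPrefix("RT ") { return false }
        // Assumed rule 2: the category must appear as its own word,
        // so "prison" does not match "prisoner".
        let words = tweet.text.lowercased().split(whereSeparator: { !$0.isLetter })
        return words.contains { String($0) == keyword }
    }
}

// Made-up sample data for the "prison" category.
let sampleTweets = [
    Tweet(author: "a", text: "You are the prisoner, the prison and the prison keeper."),
    Tweet(author: "b", text: "RT prison documentary tonight"),
    Tweet(author: "c", text: "Coffee first, everything else later")
]
let prisonTweets = relevantTweets(sampleTweets, for: "prison")
```

Whole-word matching is stricter than a plain substring search, which trades a few missed tweets for fewer unrelated ones.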
We wanted it to feel natural, but also disruptive.
We added the profile picture of each tweet’s author inside a sphere, placed roughly next to the tweet, and used Twitter’s color palette to color the text and the username.
Where does the magic happen?
Personally, I found small moments of magic when it almost felt like machine learning and augmented reality extended my perceptual senses and brought the emotional impact of objects up to the conscious surface. Wow, that was not very descriptive, right? Perhaps an example would help. When I was looking at a fence in the subway station, the classification algorithm predicted I was looking at a prison. Since we had hidden the model’s classification from the user, all I saw was the following tweet it pulled:
“You are the prisoner, the prison and the prison keeper. Only you hold the key to your freedom” – Ricky Mathieson
Another example of this magic occurred when I was looking at the coffee machine (a thing I spend quite a lot of time doing every day):
“Still life with coffee pot” – Man Ray (1962)
This refers to the following painting by Man Ray:
Enough with the talking
To illustrate how this works out, we made a video of us using this experiment throughout one morning.