Some words on accuracy


I consult and advise on a variety of topics from visual effects and animation, to A/M/VR, to IoT and cloud processing, and one thing comes up again and again – “The solution needs to be as accurate as it can be.” Whenever I come across this, I’m reminded of something that happened a long time ago, in my pre-digital life, and of the question it raised: how accurate is “as accurate as it can be”?

In 1982 after I had graduated high school, I went to guitar repair school in Springhill, TN – then a sleepy, rural town outside of Nashville. I learned a few things there: how to get into Nashville bars at 17, rosewood poisoning makes you really sick, and that accuracy is a function of, well … function.

The instructor gave the students little custom made rulers with all kinds of nifty guitar repair measurements on them for fret spacing, fingerboard radii, etc. They were custom made by a machinist friend of his, and he had a little story that went with them. When he asked his friend how much the rulers would cost, his friend asked him, “How accurate do they need to be?” He said, “As accurate as you can make them?” His friend replied, “Then each one would cost $10,000,” to which my instructor then asked, “How accurate is $20 worth of accurate?” His friend said, “Probably accurate enough.”

That was 1982, and I’m not sure if $20 worth of accuracy is as accurate anymore. But the lesson taught us was that when repairing something, one must first ask about the needs of the repair. Is it a crappy guitar where replacing the frets would cost more than the price of an entire, better instrument? Or is it a priceless old Martin that needs to have the bridge reattached with meticulous care? Each requires the best work you can do, but the best work you can do for each will be different.

IoT and AR rely heavily on GPS to get location data. GPS is (to most people) surprisingly inaccurate – only to within 10 meters, and it can also be very noisy. But in many cases it’s accurate enough – especially after filtering. We can, for instance, track the fleet of trucks through the city, or hold up our phones and see that the bar with the best happy hour is “over there” through the heads up display beer finder app. Now, if we need that app to replace the bar’s sign with a 3D list of drink specials which tracks flawlessly to it, we would require a different, more accurate location tracking technique.
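As a rough illustration of “accurate enough after filtering”, here is a minimal sketch that smooths noisy position fixes with an exponential moving average. The 10 meter noise figure comes from the text above; the filter choice, the `alpha` value, the stationary receiver, and the function names are all assumptions made for the example, not a description of any particular GPS stack.

```python
import random


def smooth_positions(fixes, alpha=0.2):
    """Exponentially smooth (x, y) position fixes given in meters.

    alpha near 0 trusts the running estimate; alpha near 1 trusts each new fix.
    """
    smoothed = []
    estimate = None
    for x, y in fixes:
        if estimate is None:
            estimate = (x, y)  # first fix: nothing to blend with yet
        else:
            ex, ey = estimate
            estimate = (ex + alpha * (x - ex), ey + alpha * (y - ey))
        smoothed.append(estimate)
    return smoothed


def mean_error(points, truth):
    """Average straight-line distance from each point to the true position."""
    tx, ty = truth
    return sum(((x - tx) ** 2 + (y - ty) ** 2) ** 0.5 for x, y in points) / len(points)


random.seed(1)
truth = (100.0, 200.0)  # hypothetical stationary receiver, in local meters
fixes = [(truth[0] + random.gauss(0, 10), truth[1] + random.gauss(0, 10))
         for _ in range(500)]

raw_error = mean_error(fixes, truth)
# Skip the first 50 outputs so the filter has settled before scoring it.
smoothed_error = mean_error(smooth_positions(fixes)[50:], truth)
print(f"raw: {raw_error:.1f} m, smoothed: {smoothed_error:.1f} m")
```

The trade-off is lag: a smaller `alpha` gives a steadier dot on the map but a slower response when the truck actually turns, which is exactly the kind of “how accurate does it need to be?” decision the application has to make.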

It’s more than just a matter of choosing the right tools for the job, however; it’s also scaling the approach of solutions to match the needs of the tasks at hand. “How accurate do they need to be?” $20 worth of accurate? 30 feet worth of accurate? Animated GIF accurate? Substitute your reality with another accurate? Escape the “uncanny valley” accurate? Or maybe 1 second of accurate? 3 minutes of accurate? The Nyquist theorem tells us that to accurately digitize an analog sound, we need to sample it at over twice the highest frequency of the signal we want to represent. Before launching into a solution try asking yourself: “What is the metaphorical ‘Nyquist frequency’ of the problem you’re trying to solve?”, and build the solution accordingly. Of course the interesting question becomes: “What is the Nyquist frequency of repairing a Martin guitar?”
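To make the sampling rule concrete, here is a small sketch of what happens on either side of it. The 3 Hz tone and the 4 Hz and 8 Hz sample rates are arbitrary numbers picked for the example, and the function names are hypothetical.

```python
import math


def apparent_frequency(signal_hz, sample_rate_hz):
    """Frequency a pure tone appears to have once sampled (frequency folding)."""
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


def sample_sine(signal_hz, sample_rate_hz, count):
    """Take `count` samples of a unit sine wave at the given sample rate."""
    return [math.sin(2 * math.pi * signal_hz * n / sample_rate_hz)
            for n in range(count)]


# A 3 Hz tone sampled at 8 Hz (more than twice 3 Hz) keeps its identity.
well_sampled = apparent_frequency(3, 8)
# The same tone sampled at 4 Hz (less than twice 3 Hz) folds down to 1 Hz.
under_sampled = apparent_frequency(3, 4)

# The under-sampled 3 Hz samples are identical to those of an inverted 1 Hz sine.
three_hz = sample_sine(3, 4, 8)
one_hz_inverted = [-s for s in sample_sine(1, 4, 8)]
indistinguishable = all(math.isclose(a, b, abs_tol=1e-9)
                        for a, b in zip(three_hz, one_hz_inverted))
```

Once the samples are taken, nothing downstream can tell the 3 Hz tone from the 1 Hz one, which is why the required rate has to be decided before the solution is built rather than after.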

Film grain, persistence of vision, sensor tubes, and experienced resolution.

At various times in my career I’ve had to think about image resolution, film grain, how to pull a clean image from a noisy video, and orbiting gamma-ray observatory sensor tubes. The question is: how much digital resolution is needed to represent the analog resolution of motion picture film? Or maybe it’s more of a statement: How to problematize the question of – “how much digital resolution is needed to represent the analog resolution of motion picture film?”, or at least have fun thinking about it.

When I worked at the visualization lab at UCR, in the dark ages, I remember a physicist talking about a project he was working on for an orbiting gamma-ray observatory that was going to map background radiation from outer space. They wanted to do this at a very high resolution. The problem was that the gamma particle sensor tubes they used could only resolve to a couple degrees at a time – basically 1/180th of the sky in a circle. A 180 sample map is not very high resolution. But he had a trick. The orbiting platform would be incredibly stable and predictable, so they could take a bunch of overlapping coarse passes with such accuracy that his software would be able to synthesize a much higher resolution of the sky – to fractions of a degree. This is how things like synthetic aperture radar work.

A little while later a mathematician came into the lab and wanted a still from the X Files credit sequence “The Truth is Out There”, for a presentation slide, but we couldn’t get a clean screen grab from the VHS tape. It was weird – the text and the background were stable, and at playback speed the image seemed clean enough – it was only when we’d freeze a frame that the text melted into a fuzzy blob of static. So then I thought about it – basically every frame is like a gamma particle collection tube, the TV screen isn’t moving and neither are the text or the background. What if I were to “blend” a number of these frames together? Would I get a much clearer result? The answer was yes. And not just because of an NTSC fields vs frames thing – the more frames I combined the clearer the text became. What that blending method was I’ll leave up to the reader – and if you figure it out let me know because I’ve forgotten. The point was that accumulating noisy images over time emphasized their similarities – the text, and deemphasized the differences – the smearing static. I’ve also used a similar technique to accumulate/synthesize a high resolution still from multiple low resolution renders each with slight camera shifts.
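The original blending method is lost, so the following is only one plausible stand-in: a plain per-pixel average of frames, sketched under the assumption that the picture is static and the noise is random from frame to frame. The synthetic striped “image”, the noise level, and the function names are all made up for the example.

```python
import random


def average_frames(frames):
    """Per-pixel mean of equally sized frames (each frame is a flat list of floats)."""
    count = len(frames)
    return [sum(pixels) / count for pixels in zip(*frames)]


def rms_error(frame, reference):
    """Root-mean-square difference between a frame and the clean reference."""
    squared = sum((a - b) ** 2 for a, b in zip(frame, reference))
    return (squared / len(reference)) ** 0.5


random.seed(7)
# Stand-in for the stable text and background: alternating bright and dark bars.
clean = [1.0 if (i // 8) % 2 == 0 else 0.0 for i in range(256)]


def noisy_copy():
    """One 'frozen frame': the clean image plus fresh random static."""
    return [p + random.gauss(0, 0.5) for p in clean]


err_single = rms_error(noisy_copy(), clean)
err_stack = rms_error(average_frames([noisy_copy() for _ in range(16)]), clean)
print(f"one frame: {err_single:.3f}, 16 frames averaged: {err_stack:.3f}")
```

With independent noise, averaging N frames shrinks the static by roughly the square root of N while the unchanging content stays put, which matches the observation that more frames gave clearer text.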

Okay so back to film grain and the resolution needed to represent a movie film frame with a digital frame. Here’s the fun part to mull over. A digital image is a regular grid of pixels in rows and columns. A film image is an irregular matrix of physical grains of stuff holding onto dyes of different colors – they aren’t uniform in size and certainly not uniform in location – and this all changes randomly with every frame. Your brain does a great job of blending all of this together over time, just like blending together the video frames. How much digital resolution is needed to represent movie film? I remember when a 2K image was going to be more than enough. Now we’re pretty sure we need 4K. I wonder what kinds of phantom resolutions happen in our minds from the accumulation of unpredictable tiny grains of color? What kinds of resolution apertures can we synthesize out of all that noise? How do we experience different presentations of visual resolution?

Here’s a link with some neat diagrams illustrating synthetic aperture radar:


An Old Map and Declarative 3D

This is an early “work in progress” visualization of an 18th century map drawn by Giambattista Nolli, using @A-Frame declarative HTML mark-up extensions for VR/3D in WebGL – with procedurally generated geometry and baked lighting in Houdini. Lots more to do and learn. Eventually it will be part of an AR promo piece but I couldn’t resist.


(better navigation in cardboard, varied building heights, and better global illumination)


A Tale of Two Frame Rates

I had the good fortune of attending a local SIGGRAPH chapter talk by Bruna Berford of Penrose Studios, regarding production methodology and how they approached animation in their beautiful and emotionally compelling VR experience “Allumette”. She presented a good view into the difficulties, challenges, and rewards of adapting to working in this new medium. And let me say that to my thinking, they are actually embracing the new technology fully as a narrative medium.

But that’s not what this post is about. This post is about a misconception that people in V/M/AR have around the concept of frame rate. Specifically the holy grail of a 90+ fps redraw rate. This is held up as a metric which must be achieved for a non-nauseating viewing experience and is usually stated as an across the board dictum. Allumette, however, threw a very nice wrench into that, one which points to something that I’ve tried to articulate in the past. There are two different frame rates going on here. And that difference is apparent in Penrose’s use of stop motion frame rates for its animation.

The first “frame rate” is the one that’s usually meant, and I think of that as being the perceptual, or maybe even proprioceptive, frame rate. This is the frame rate that corresponds to how well the environment is tracking your body’s movements. For instance when you turn your head, or when you walk around in a room scale experience. This is the one that tells your lizard brain whether or not you are being lied to. But a lot of people, including seasoned veterans, stop here, assuming that the matter is settled. I think there’s a second frame rate at work.

The second is what I would call the exterior frame rate. This is the frame rate of the displayed content. And in Allumette this was definitely not at 90+ fps. In fact it was at a consciously much “slower” and less constant frame rate because it was being animated on hard, held keys with no interpolation. This was to emphasize the poses of the animation. The result was an elegant reference to traditional stop motion animation, with all of the artistic start/stop and a wonderfully surreal sense of time. And the overall experience in VR was not so much watching a stop motion animation, but rather existing in space with one. It was pretty cool.

The “content” was running at what I might guess averaged to ~12 fps, but the display of it, and therefore more importantly my perception of the experience, was at the magic 90+ fps. This is an important distinction – especially when it comes to content creation. Would 360 video at a lower playback rate, say 18 fps, give us that old Super8 home movie feel as long as the video sphere it’s projected onto was moving seamlessly? Could a game engine environment be optimized to hold frames of animation at 30 fps, allowing temporally redundant data to limit draw calls or GPU memory writes?
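A minimal sketch of the two rates running side by side, assuming a 90 fps redraw loop and content held at a steady 12 fps (the real piece was described as less constant than that). The function names and the half-degree-per-redraw head turn are hypothetical.

```python
def held_pose_index(redraw_index, display_fps, content_fps):
    """Pose shown on a given display redraw when poses are held, not interpolated."""
    return (redraw_index * content_fps) // display_fps


def one_second_of_redraws(display_fps=90, content_fps=12, head_yaw_per_redraw=0.5):
    """Return (head_yaw_degrees, pose_index) for every redraw in one second."""
    frames = []
    for redraw in range(display_fps):
        # The view follows the head on every single redraw.
        head_yaw = redraw * head_yaw_per_redraw
        # The animation only advances when its own, slower clock ticks over.
        pose = held_pose_index(redraw, display_fps, content_fps)
        frames.append((head_yaw, pose))
    return frames


frames = one_second_of_redraws()
distinct_views = len({yaw for yaw, _ in frames})
distinct_poses = len({pose for _, pose in frames})
print(distinct_views, "head-tracked views,", distinct_poses, "animation poses")
```

Integer arithmetic is used for the pose index so that a pose boundary never lands on the wrong redraw because of floating point rounding. Each pose ends up held for seven or eight redraws, while the viewpoint is still refreshed on every one.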

Who will make the merchandise display cases for VR shopping?

Many are convinced that simultaneous, shared, social experiences in VR and other 3D immersive modalities are a foregone conclusion. Regardless of how deluded we might be in this, one thing becomes clear – in order for this to scale, we will need to have a consistent way of describing all of the stuff – much like how molecules are a consistent way of describing the real world. Luckily the virtual world is many orders of magnitude simpler than the actual physical world, and instead of the uncountable trillions of sub particle level interactions of matter, the virtual world needs only a truly astounding level of trackable events through a potentially manageable number of protocols and standards.

The problem is that even at many orders of magnitude simpler, the task of how to consistently describe “anything” so that we can share it, sell it, buy it, travel to it, hold it, toss it back and forth, etc. is still really, amazingly complicated. Much more complicated than say – the choice of game engine du jour, OBJ or FBX or Collada, or whether or not you have a cool physics engine. But what really are the basics of virtual matter that need description so that they can be manipulated in the ways we expect? I was thinking about this and came up with a functional, if prosaic example to get me into a more pragmatic frame of mind than say – blasting space zombie outlaws.

Let’s assume we have simultaneous social immersive 3D experiences delivered over a common framework. And let’s say that within that space there are millions of stores. And in many of these stores is the virtual equivalent of a merchandise display case. And let’s say your company makes display cases for virtual environments. There are a lot of assumptions here for sure, and the “display case” here is really just a conceptual placeholder for whatever the virtual world might offer up as a kind of “durable good”. But let’s put all that aside for the moment and assume that your business is making virtual display cabinets.

In the real world, display cabinets have certain features that make them more suited for some purposes than others, and yours are very good and specialized. In your case they are jewelry cases that have buttons on top that let a shopper rotate the shelves around forward and back. You make high end cabinets that are very durable and come in standard sizes that fit in with other leading retail fixture manufacturers’ products. The doors operate smoothly allowing ample access for the sales associate to quickly retrieve even the most tiny items the customer might want. When a business orders cabinets from you, they pay for them, you ship them out, they are installed and exist physically in place. No one can really duplicate them beyond manufacturing a knock off product.

In a virtual world retail businesses will want display cabinets, and just like in the real world they won’t normally want to design and manufacture them themselves. They will expect to buy them and for them to just simply work. Customers will be able to easily peruse their options and make their choices. They may want to try things on, see how they match the color of their eyes before they buy them. Your display cases will have to use the same “trying on” mechanisms that the rest of the display cases in the store do, because the store will want to support the latest most accurate shopping reality capture avatar system available. Your display case needs to be installable within the store’s inventory control scheme, but also installable within the store’s local Cartesian coordinate frame. It needs to be addressable within their asset management system so that stock changes and merchandising decisions can be pushed to the cabinets from central databases. Your cabinets will need to be backward compatible with this store’s stock and inventory system which is several versions out of date because “they like their system fine the way it is”, and they are a big customer so you need to keep their business.

And so let’s say now you’ve managed to make a future proof, universally accessible and addressable, fully inter-functional display cabinet, backward compatible with old virtual mercantile standards, with compliant e-commerce security features, but you still have another issue. How do you make sure that the store isn’t making copies of your display cabinets and using them across all their wholly owned subsidiaries? Or selling them overseas to offset a flat Christmas sales season? Or that they aren’t being stolen by a nefarious shopper and resold on the lucrative display cabinet black market?

This is where it’s all about standards. All about the protocols that set out the expected behaviors and configurations that define and prescribe how all of these magical virtual interactions happen. It’s the subatomic glue that connects all the disparate experiences into coherent, navigable places, and continues to do so after the cowboys and star fighters have all gone home.

Inside out projection

I recently took part in a Dance/Hack at Kinetech Arts, here in San Francisco. Kinetech is a group of technologists and dancers around whom a whole host of people (me included) orbit and take part when we can. The Dance/Hack is a yearly event where teams of dancers/technologists/musicians/visual artists get together to create some expression of motion technology. This year it was also linked to similar events in London and Amsterdam. In addition to special programs and performances, Kinetech has an open studio every Tuesday.

One Tuesday I brought my Kodak PixPro 360 camera, which takes a dome image 360 degrees around and a little over 210 degrees horizon to horizon. It records the anamorphic projection onto a square. Dancer/Choreographer, Megan Meyer, was very interested in what it did. She was interested in how the spherical nature of the image related to depictions found on ancient pottery. So we decided to try and put something together for the Dance/Hack.


Greek pottery

The concept started out grand, of course – epic narrative, historical reference, comments on contemporary society, etc. The time constraints and the resource limitations meant scaling back the vision a bit.

I’ve been thinking about alternate projections since I was a kid and learned that Greenland wasn’t nearly as big as I’d been led to believe, that straws don’t actually bend in glasses of water, and that perspective is sort of skewing a cone into a cylinder. I am intrigued by the visual and narrative possibilities of full-dome projection, but the barrier to entry is very high – too high for the DIY spirit of the Dance/Hack. We needed a simpler, cheaper way of reconstructing the images onto some viewing surface that many people could view.

We started to think small. The idea started from Megan’s concepts of ancient pottery which, though now priceless museum antiquities, were created to be utilitarian, domestic items. Could we make the experience happen on a domestic scale?

Kodak sp360 web site

The Kodak 360 images remind me of chrome-ball photos taken on set for visual effects filming. The angles of the projection are different, but visually the results are very similar. What if I invert that process? If I could project the image off of a reflective sphere, and onto a larger, translucent ball, the resulting image would rectify itself into a relatively undistorted final result. It worked in my head while we were drinking coffee and drawing on scratch paper – so what could possibly go wrong? (within tolerances)

Here’s the theory:

The Kodak sp360 takes images through a very wide fisheye lens – with an over 200 degree field of view. The idea was to invert the distortion of flattening the dome onto the picture plane by projecting the picture plane back off of a dome.


Of course this worked great in my head and on paper. And in a perfect world with enough time and resources, it probably would have worked well in practice. But part of the fun of a hackathon is fielding the unexpected and working within the real world constraints of the day.



Here’s the reality:

To begin with we didn’t have a budget to buy a very round and reflective chrome ball like the kind used in VFX production, but Megan did find some very shiny silver Christmas ornaments. We didn’t have a source for spheres made of a highly transmissive translucent material, but Ikea has a good deal on round paper lanterns. Well – okay – we’re building a lamp, and the aesthetic direction is based around a wood and paper lantern. That’s great because the rig to hold the projector, the ornament, and the lantern was going to be built from wooden slats and dowels.

The key behind getting this all to work was establishing the correct alignment of projector beam to Christmas ball, and then to the paper lantern. This seemed easy because it should just be a matter of stacking one directly on top of the other and then changing the distances between them to adjust for which section of the image hit where on the sphere. But I hadn’t counted on the built in convenience functionality of the little projector. Unlike the projectors I remember from the previous millennium, modern projectors have all kinds of corrections so that you don’t have to angle the projector or worry about “keystoning” the image.


This meant that all of that proper alignment was in practice shifted wildly off of the “axis” of projection. And, in fact, the image hitting the sphere wouldn’t even be a square. Most of the day was spent with a Swiss Army knife saw, a cordless drill, and masking tape, trying to shim and bend everything to hold the ornament in the right place. Luckily dowel rods are flexible.

Here’s the result:



It was a little clunky but had its own charm – like a cross between some wabi sabi rusticness and the tree from A Charlie Brown Christmas. But it worked, and definitely points to some interesting possibilities with bigger and better hardware.

What I find most intriguing about this projection is that it inverts the notion of looking out and looking in. The camera “sees” out, but the projection allows us to look into that view. I really like this look into an impossible slice of perspective – a view into the infinite.



Spherical Stereo Camera for Immersive Rendering

At SIGGRAPH this year in LA, I was really impressed by a presentation that Mach Kobayashi (currently at Google) gave at the PIXAR Renderman User’s Group. It involved how to render 360 panorama stills for viewing in VR. He pointed out that the naive approach of placing two panorama cameras side by side would break as you looked away from the plane the cameras were in because the views would cross. The stereo version of a broken clock being right twice a day.


He had a great solution involving ray-tracing the pictures from the tangent vectors of a cylinder, where the cylinder width is the inter-pupillary distance. And it works great:
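As a rough sketch of the cylinder idea (horizontal only), each panorama column gets a viewing direction plus an eye position pushed sideways by half the inter-pupillary distance, so every ray grazes a circle whose width is the IPD. The axis convention (y up, azimuth 0 looking along +z), the 64 mm default, and the function name are assumptions for the example, not details from the talk.

```python
import math


def stereo_ray(azimuth_rad, eye, ipd=0.064):
    """Ray for one panorama column; eye is -1 for the left eye, +1 for the right."""
    # Viewing direction for this column, in the horizontal plane.
    direction = (math.sin(azimuth_rad), 0.0, math.cos(azimuth_rad))
    # Unit vector pointing to the viewer's right, perpendicular to the direction.
    right = (math.cos(azimuth_rad), 0.0, -math.sin(azimuth_rad))
    half_ipd = ipd / 2.0
    origin = tuple(eye * half_ipd * component for component in right)
    return origin, direction


left_origin, left_dir = stereo_ray(math.radians(30), -1)
right_origin, right_dir = stereo_ray(math.radians(30), +1)

# Both eyes look the same way; only the starting points differ, by one IPD.
eye_separation = math.dist(left_origin, right_origin)
# The origin is perpendicular to the direction, so the ray is tangent to the circle.
tangency = sum(o * d for o, d in zip(left_origin, left_dir))
```

Because the eye pair rotates with the viewing direction, the two views never cross the way two fixed side-by-side panorama cameras do.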

But I was left wondering “what happens if you look up or down?” The answer must involve a sphere. So I gave myself a little project: ray-trace from a sphere to get the stereo result from every angle. And it almost works – like 98% works.

The basic idea is to extend a basic spherical panorama camera by adding an offset to the ray origin position without adjusting the viewing direction. The image is rendered, one pixel at a time by sweeping the “camera” horizontally in “u” and vertically in “v”. Like traveling the Earth by stepping a little to the east and snapping a one pixel picture of the sky all the way around, and then taking a step North, taking a single pixel picture and repeating. Luckily the computer is faster and the 3d scene database remains still for the shutter.
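One possible reading of that sweep, written as a sketch rather than a record of the actual shader: each pixel maps to a longitude and latitude, the viewing direction comes from those two angles, and the ray origin is shifted sideways while the direction is left untouched. The lat-long image layout, the horizontal-only offset, the axis convention, and all names here are assumptions.

```python
import math


def spherical_stereo_ray(col, row, width, height, eye, ipd=0.064):
    """Ray for one pixel of a lat-long stereo panorama; eye is -1 (left) or +1 (right)."""
    u = (col + 0.5) / width    # 0..1 sweeping horizontally around the sphere
    v = (row + 0.5) / height   # 0..1 sweeping vertically, top to bottom
    longitude = (u - 0.5) * 2.0 * math.pi
    latitude = (0.5 - v) * math.pi

    # Viewing direction of a plain spherical panorama camera (y is up).
    direction = (math.cos(latitude) * math.sin(longitude),
                 math.sin(latitude),
                 math.cos(latitude) * math.cos(longitude))

    # Offset the origin to the side for this eye; the direction is not adjusted.
    right = (math.cos(longitude), 0.0, -math.sin(longitude))
    origin = tuple(eye * (ipd / 2.0) * component for component in right)
    return origin, direction


# Walk a tiny 8 x 4 image the way the text describes: step in u, then step in v.
rays = [spherical_stereo_ray(col, row, 8, 4, eye=+1)
        for row in range(4) for col in range(8)]
```

In this version the sideways offset stays the same size all the way up to the poles, where neighboring longitudes nearly coincide, so it is at least plausible that the trouble reported further on when looking far up or down comes from this region.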


Camera assembly scans around a sphere in u and v

I have some more work to do – like trying to integrate animating live “hero” elements in with the static stereo background to see if it still holds up to the eye, but this will do for now. Here’s a link to a test that’s made for viewing in Cardboard: The icky pinching at the top of the sky is from an attractive but non spherical sky and cloud texture.

Side by side
right eye from spherical projection stereo camera

Experimenting with particles, volumes, refractive and reflective surfaces is next. It’s promising, but tricky – the amount of filtering and pixel sampling is going to take some dialing in.

But there’s a problem. And it’s the same problem that map makers have when they try to flatten the globe into a single plane – you end up getting a mismatch of sampling distances and densities toward the poles. In my renders this results in a little “S” shaped warping as cameras are pointing too far up or down. The results are still pretty cool, and maybe good enough for many applications, but far from perfect. The other issue is that it means that just as many samples circle the sphere at the poles as at the equator, and whether you want to look at it as too much information in some places or not enough in the others – the fact is that it’s not a very efficient use of rendering resources.
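The sampling mismatch can be put into numbers with a short sketch, assuming a lat-long image where every row holds the same number of pixels. The 1024 row height and the function names are arbitrary choices for the example.

```python
import math


def row_latitude(row, height):
    """Latitude in radians at the center of an image row (top row is near +pi/2)."""
    return (0.5 - (row + 0.5) / height) * math.pi


def relative_circumference(row, height):
    """Length of the row's circle of latitude relative to the equator."""
    return math.cos(row_latitude(row, height))


height = 1024
# Every row gets the same pixel count, but the circle it covers shrinks toward
# the poles, so samples per unit of arc length grow as 1 / cos(latitude).
equator_density = 1.0 / relative_circumference(height // 2, height)
pole_density = 1.0 / relative_circumference(0, height)

# Share of the samples an evenly spread layout would need: the mean of cos(latitude).
useful_fraction = sum(relative_circumference(r, height) for r in range(height)) / height
print(f"top row is oversampled about {pole_density:.0f}x; "
      f"an even spread needs about {useful_fraction:.0%} of the samples")
```

The mean of cos(latitude) over the sphere works out to 2/π, so roughly a third of the rendered samples are redundant relative to an even distribution, which is the inefficiency that more uniform sample layouts try to recover.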

So I’m working on another approach that uses geodesic spheres to derive sample points, and rotating the sphere points without rotating the normals. Oh there will be problems with that too I’m sure. But this is when I get to fall back on my MFA in painting and let all my higher math colleagues solve the problems I make for them.




What contests are we winning?

Talking with people in the AR/VR world there’s a constant, silly question buzzing in the air like a gnat: “Who’s winning – VR or AR?” It is an interesting question, not for what the question is asking but what asking it implies in the first place. Is this all a contest? With a winner and a loser? Have we become so obsessed with the “gamified marketplace of ideas” that we can’t actually be motivated without some implicit or explicit conflict or large plush prize? But what even is the conflict? What is there to lose in this contest of AR v VR?

The contest implies they are the same project, suggesting that their finish lines are the same finish line. They are drawing upon a lot of the same technology for sure, but so are mobile phones, connected thermostats, smart TVs and watches. The basic antagonism seems to revolve around posturing – for the best head mounted display, or the most pure vision of what is meant by “immersive” or “reality”, or who is the reigning champ of the “ultimate experience”. And this would all be as ludicrous a sideshow as it sounds, except for the number and stature of people involved on both “sides” who act like it’s a serious debate. In fact it was an actual debate at this year’s Augmented World Expo, and only a marginally tongue in cheek one.

And I get how important it is for one hardware maker to be able to capture market share, get funding or get acquired. Or for a game publisher to drum up marketing collateral to prep for a release. What’s a little bothersome is how easily this marketing spun copy is eaten up by people who should know better and then regurgitated as a real, pressing issue, when the real pressing issue is that people need to move past the towel snapping and make more complete things that are actually worth doing.

Manufacturers of HMDs want to demonstrate that each has the better display resolution, the better optics, better hardware integration – this makes perfect sense. It’s like competing computer chip makers claiming theirs is best because of clock speed, number of cores or instruction sets – it’s reasonable. The differences in comfort, tradeoffs between configurability and convenience, and comparable aesthetics are like the PC v Mac debate – okay, I get that. Arguing whether VR or AR will “win”, or is “better”, is like someone arguing that a realtime, embedded OS is inherently better than an interactive one like Windows, or that a freight train is better than a cargo ship.

If we take the crassly entrepreneurial measure of money – then AR has already “won”. It has market share, it’s profitable in products now, it generates revenue. But really, it’s a silly debate – we’ve been augmenting and virtualizing reality for years: the transistor radio, books, air freshener, hell even the rearview mirror. Timothy Leary is laughing at us all right now because he “won” the contest 50 years ago – and without a computer. So what should you do when someone asks you “who will win, AR or VR?” – I think I know what Dr Leary would do.

Is there a VR equivalent of the whip pan?

There are many ways that a virtual reality cinematic experience differs from a traditional one. Much of the discussion is around whether or not certain techniques translate from one world to the other, and what can be done psychologically, physiologically, or mechanically without inducing pain or nausea in the viewer. But there’s less discussion about the narrative structuring elements of traditional filming and editing techniques and what their immersive counterparts might be. There are numerous camera and editorial decisions that we have “learned” to understand in watching movies – the so called cinematic language. Things like the over-the-shoulder point of view shot, the two angle dialog shot, or the sequence that moves from a long establishing shot to a medium shot to a closeup shot to introduce a place and/or character. But I’m going to choose the whip pan as the best example to illustrate some basic differences between traditional and VR cinema.

But why the whip pan? Well because it’s a very useful and ubiquitous film transition that functions on many levels, yet it is almost entirely physically useless in a VR experience because that much camera motion is disorienting and causes motion sickness.

So is there a VR analog of the whip-pan? And what is a whip pan? Simply put – it’s where the camera spins very quickly from one point of view to reveal another one. This winds up putting the ending view in juxtaposition to the starting view. A common example might be a tense POV shot where the camera whips around to reveal the threat – a monster, the enemy, or antagonist of some kind. Also, because one byproduct of moving a camera very fast is an exaggerated motion blur that obscures what’s being filmed, a whip pan can be used to hide a transition to a radically different camera angle or even a different setting or time. The whip pan wrenches our perception and attention from one understanding of the story to another.

To illustrate the difficulty in finding a VR analog, let’s look at a whip pan example applied to both traditional projected cinema and immersive cinema.

Traditional: We are watching a horror movie, the setting is a forest at night, the shot is a POV shot and we are walking forward slowly, scared, seeing only what’s lit directly by our flashlight. We hear a noise from behind – a breaking twig – the camera whips around to exactly where the monster is, and then drops down to just miss an enormous claw aimed right at where our heads would have been.

Immersive: We are experiencing the horror movie in a first person role, moving through the forest at a rate that’s not so disorienting as to be uncomfortable or distracting, maybe our attention is following the flashlight beam. There’s a sudden noise behind us, a breaking twig, we turn in our seats to maybe see the monster in time to maybe duck underneath its giant claw. Maybe our reaction times are too slow and we miss the action of the monster entirely but are thrown into the next segment of the narrative anyway. Maybe we’re left confused and lose interest.

So what is it that makes the traditional whip pan so effective, what did it facilitate, and what does it lend to the overall emotional effect of the shot? The whip pan is the reaction of the audience; we have been conditioned as an audience to interpret this motion as our own action, something which on some level we feel we have created and are in control of. Our experience is that we are reacting to the sound by actively turning toward it and facing it, when in fact that’s very far from the truth. In reality the “camera” presciently reacts to a sound created at precisely the right moment, at exactly the right speed to get to the perfect point at the next perfect moment to both reveal the horrifying threat and simultaneously and miraculously escape it. This is not an exercise of free will, this is a controlling cinematic device with a very specific narrative purpose and outcome: disorient, respond, reveal, escape.

There are also practical components of a whip pan that are often just as important. People in VFX know that you can hide a multitude of sins in the motion blur. Even in non-effects films, editors can hide a number of unlikely or incongruous transitions by dissolving through a fast camera move. Very often the camera angle or placement from the A part of the pan is not desirable for the B part. Or maybe you want to pan from being tall and looking down to being short and looking up. This can be very effective at making a revealed threat seem more ominous, or it might simply be an accommodation for practical size differences. The point is that the confusion and the disorientation of the pan itself is used to hide incongruities and inconsistencies that occur in getting from your origin to your destination. Because we know to accept this motion blur as transition from one thing to another it can also be used to transition from one place or time to another – we begin the pan and everything is green, we end it and everything is covered in snow – we know that we went from summer to winter and probably understand that we had a whirlwind autumn.

The whip pan goes beyond a simple camera move and becomes a complex mechanism of narrative cinema. How do we give a VR audience the same emotional experience? Should we? It may be that the whip pan is too idiomatic a part of traditional film to have an easy, direct counterpart – like one of those German words that has to be translated into an entire sentence in English. Maybe it’s a matter of breaking entirely from a first person POV experience to an angle where we see the entire resolution of the action? Maybe trying to do that at all is missing the point entirely. Maybe VR is not about controlling the view, but rather controlling the environment. Maybe narrative control is an outdated authoritarian construct. Maybe it all just requires our experience of VR to mature, and – like those people who stopped running out of early movies from oncoming trains – we’ll stop throwing up and learn how to read VR’s abrupt new camera moves.

Transparency in AR advertising

It can be difficult to sift through the hype around Augmented Reality technology, and that’s not made any easier by misleading promotional videos that don’t fairly represent the final user experience. Granted, trying to represent what an AR experience will be to someone who has never had one can be difficult, but the over promising of marketing can actually hurt an industry still trying to establish its legitimacy.

So I was happy to see that Microsoft, in its latest promotional material for HoloLens showing medical uses of the glasses, actually had images that depicted the transparency of the final effect. In the Windows Central article “New HoloLens video demos usage in medicine, is more honest about field of view”, we see that, in addition to a more honest field of view, the image overlays are shown as being transparent. I’m glad, because up until now a lot of AR marketing material has had the overlay images composited over the backgrounds as though they had opacity. Users expecting to see solid objects hovering in space would be justified in feeling bait-and-switched when all of the objects look ghostlike and ephemeral.

I hope that as AR hardware gets closer to consumer release, the accuracy of the marketing materials improves. The additive light technology is amazing and will be used for incredible things, but it won’t make the objects opaque – they won’t have that level of visceral tangibility. If users’ expectations are too high, their disappointment might match, and aside from high ROI research, technical, and industrial uses, the technology risks being seen as another fad.