The biggest misconception about vision? That you see reality
Human vision isn’t a direct recording of the world. That’s why virtual and augmented reality still struggle to replicate it.
You think you’re seeing reality. You’re not.
In a famous experiment, participants are asked to watch a video and count basketball passes between players. Focused on that task, many participants never notice a person in a gorilla suit walking directly through the scene.
The experiment reveals something surprising: Human vision is not a continuous recording of the world. What we experience as reality is actually the brain’s best interpretation of it.
Most of the time, we never notice this process. But technologies such as virtual and augmented reality (AR and VR) are revealing just how much work the visual system performs behind the scenes. Even the most advanced headsets can create environments that look remarkably lifelike yet can still feel slightly off or even downright uncanny.
This gap is not just a technical limitation. It reflects something deeper about how human vision works.
Researchers including Michele Rucci, Duje Tadin, and Barry Silverstein are studying this process from three connected angles: how the eyes actively sample the world, how the brain filters and constructs meaning from that input, and how technology tries to reproduce the experience in AR and VR systems.
Their work reveals that what we experience as reality emerges from a complex interaction between our eyes, our brain, and the machines we build to imitate both.
Misconception #1: Your eyes record reality.

“The eye and a camera are similar in one narrow sense: Both acquire visual information. But the similarity quickly breaks down,” says Rucci, a professor.
While a camera is designed to record an image, the visual system is designed to gather the information we need to move through the world and respond to it.
In fact, Rucci says, if the eye functioned like a camera, it would be a surprisingly bad one. The optic nerve, which connects the eyes and the brain, contains only about one million fibers that transmit visual information. If those fibers were treated as camera pixels, this number would be comparable to a one-megapixel image, far less than the pixel counts of modern smartphone cameras, which have up to hundreds of megapixels.
Sharp detail is limited to the fovea, which is a small region near the center of the retina. Outside that region, visual information is much lower in resolution but is more sensitive to motion, which helps us detect movement in our surroundings even when we are not directly looking at it. Despite this, most people experience the world as if it is sharp and detailed everywhere at once.
This sense of completeness stems from the fact that human vision is not a passive recording process. Even when we think we are staring steadily at something, our eyes are never still.
“Our eyes are constantly moving, placing the fovea on whatever part of the scene is most relevant,” says Rucci, whose research examines active perception: how the visual system gathers information through eye, head, and body movements. Some of this work has focused on the tiny eye movements that occur during fixation, including a slow wandering motion called ocular drift and occasional small jumps called microsaccades. “We are normally unaware of these movements, but they are important.”
Our experience of a detailed visual world, then, is less like taking a photograph and more like moving a flashlight quickly through a dark room. Only a small portion of the scene is illuminated in detail at any given moment, but because the beam shifts so rapidly and continuously, the brain combines those glimpses into a coherent rendering of the scene.
Even more surprising? If our eyes are prevented from moving, our vision begins to break down. In experiments where images are artificially stabilized on the retina, the image fades from perception.
“In a camera, motion tends to blur an image,” Rucci says. “In the eye, however, small movements help structure the visual input, transforming spatial patterns in the world into temporal signals that retinal neurons can encode.”
Eye movements are, therefore, a fundamental part of how seeing works. Rather than delivering a fixed image to the brain, the eyes continually convert the world into a changing stream of visual information that the brain can interpret.
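Rucci’s point about motion turning spatial patterns into temporal signals can be illustrated with a toy model. The sketch below is not taken from his research: the striped pattern, the drift speed, and the function names are all assumptions chosen for illustration. It tracks the brightness reaching a single fixed receptor while a striped image either stays put (a stabilized image) or slides slowly across it (drift).

```python
import math

def receptor_signal(drift_speed, spatial_freq=2.0, duration=1.0, steps=100):
    """Brightness seen over time by one fixed receptor.

    The image is an assumed sinusoidal stripe pattern:
    brightness(x) = 0.5 + 0.5 * sin(2 * pi * spatial_freq * x),
    with x in degrees of visual angle and spatial_freq in cycles per degree.
    drift_speed is how fast the image slides across the receptor (deg/s).
    """
    samples = []
    for i in range(steps):
        t = duration * i / steps
        x = drift_speed * t  # how far the image has shifted by time t
        samples.append(0.5 + 0.5 * math.sin(2 * math.pi * spatial_freq * x))
    return samples

def modulation(samples):
    """Size of the change over time: 0.0 means the input never varies."""
    return max(samples) - min(samples)

stabilized = receptor_signal(drift_speed=0.0)  # image locked to the retina
drifting = receptor_signal(drift_speed=0.5)    # hypothetical slow drift

print("stabilized modulation:", modulation(stabilized))
print("drifting modulation:", modulation(drifting))
```

In this toy setup, the stabilized image delivers a constant value to the receptor, so there is nothing changing over time to signal. With drift, the same spatial pattern arrives as a rise and fall in brightness, which is the kind of temporal signal the quote describes.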
Misconception #2: You see everything in front of you.

If the eyes are a sampling system, the brain is where the construction happens.
A common misconception is that the brain processes everything the eyes deliver. In actuality, “the human brain operates under strict bandwidth constraints,” says Tadin, a professor of brain and cognitive sciences. “We can typically only hold five to seven chunks of information at a time in our working memory.”
Because our cognitive systems can handle only a limited amount of information at once, the visual system has evolved to be highly selective about what it passes along. The amount of information that reaches the eyes is enormous. At every stage of processing, the brain filters, prioritizes, and discards incoming information. But what the brain loses in detail, it gains in meaning—things like labels and object recognition.
For instance, imagine searching for your keys on a cluttered desk. Your eyes take in the pens, papers, charging cables, dust, shadows, and dozens of other details. Yet you don’t consciously process these details equally. If your keys have a blue keychain, blue objects become more noticeable. If you usually leave your keys on the right side of the desk, you are drawn there first.
Rather than preserving every detail of the visual scene, your brain builds a version of reality that helps you accomplish the task at hand.
Attention also plays a major role. Some information stands out because it is visually distinct. Other information stands out because it matches what a person is looking for, expecting, or thinking about. What we notice depends not only on what is present but also on what our brain has decided is relevant, based on both the task at hand and prior experience.

“The brain’s goal is not to create an accurate picture of the world around us,” Tadin says. “The goal is to create something that’s useful for our survival, that gives us information when we need it, and that doesn’t waste brain resources on things we don’t.”
This helps explain why people can miss things that seem impossible to overlook, such as the person in the gorilla suit in the “invisible gorilla” experiment. Our eyes receive the information, but our attention is focused elsewhere.
Optical illusions reveal a different aspect of the same broader process.
“If there are gaps in perception, the brain makes the best guess at filling in the relevant information,” Tadin says. “What we perceive is a combination of what enters the eye and the brain’s educated guess.”
In other words, perception is not a perfect readout of reality; it is an informed interpretation. Rather than guessing randomly, the brain fills in missing information by drawing on context and our vast prior visual experience. It emphasizes what seems important and even allocates its resources unevenly.
The result is a remarkably efficient system, one that builds a useful model of reality rather than an exhaustive one.
Misconception #3: Better AR and VR just means better graphics.

By the time visual information reaches our awareness, it has already been shaped by constantly moving eyes and a brain that sifts, prioritizes, and fills in the gaps. Reproducing that experience turns out to be much more difficult than simply displaying an image.
The goal of AR and VR is often assumed to be creating a perfect digital replica of the world. In practice, however, AR and VR technologies are trying to do something that is arguably much more complicated: match the way humans perceive the world.
As Barry Silverstein’s work shows, even small mismatches in how visual information is delivered can break the illusion.
“The world is vast, and you can’t really predict where a human is going to see,” says Silverstein ’84, a professor of practice and director of the Center for Extended Reality. “You need to provide, in general, more information than you’d like. But that becomes a problem because the eye moves and the head moves. Creating an optical system that moves along with the eye and the head, at this point, is beyond what the technology can do.”
Like the brain, AR and VR systems must decide what information to show and what to leave out. Too much information creates computational overload. Too little information makes the experience feel incorrect. Eye tracking, informed by eye-movement research such as Rucci’s, is a key piece of the puzzle. By identifying where a person is looking, future systems will devote their highest resolution and computing power only to the most relevant parts of a scene while simplifying the rest. But because our eyes never stop moving, those adjustments need to happen almost instantaneously for the technology to work convincingly.
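The idea of spending detail only where a person is looking can be sketched in a few lines. This is a minimal illustration, not how any particular headset works: the full-detail radius, the minimum detail level, the falloff rule, and the function names are all hypothetical, and positions are treated as flat coordinates in degrees of visual angle, which is only a rough approximation.

```python
import math

FOVEA_RADIUS_DEG = 2.5  # assumed radius around the gaze point drawn at full detail
MIN_SCALE = 0.1         # assumed lowest detail level for the far periphery

def eccentricity_deg(gaze, point):
    """Distance between the gaze point and a scene point, both (x, y) in degrees."""
    return math.hypot(point[0] - gaze[0], point[1] - gaze[1])

def resolution_scale(gaze, point):
    """Fraction of full rendering detail to spend on a scene point.

    Full detail inside the assumed foveal radius, then a falloff that is
    inversely proportional to distance from gaze, never below MIN_SCALE.
    """
    ecc = eccentricity_deg(gaze, point)
    if ecc <= FOVEA_RADIUS_DEG:
        return 1.0
    return max(MIN_SCALE, FOVEA_RADIUS_DEG / ecc)

target = (10.0, 0.0)  # a scene point 10 degrees to the right of center
print(resolution_scale((0.0, 0.0), target))   # gaze at center
print(resolution_scale((10.0, 0.0), target))  # gaze has jumped to the target
```

The second call shows why timing matters: the moment the eye lands on a new spot, that spot needs full detail, so the gaze estimate and the rendering have to keep up with each eye movement.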
Eye tracking is only one challenge. Another major challenge is accurately reproducing depth and focus. In human vision, where the eyes point and where the lenses focus are tightly linked. In virtual environments, those systems don’t always align. The eyes may converge on a virtual object that appears to be close while the lenses remain focused at a fixed distance. The result is a subtle mismatch between what the visual system expects and what it receives. That mismatch is part of why virtual environments can feel subtly unnatural, even when they look visually convincing. It can also lead to eye strain, headaches, and motion sickness.
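The size of that mismatch can be estimated with basic geometry. The sketch below assumes a distance of 63 millimeters between the eyes and a headset whose optics are focused at 2 meters; both numbers, and the function names, are illustrative assumptions rather than figures from Silverstein’s work. Focus demand is expressed in diopters, which is 1 divided by the distance in meters.

```python
import math

def vergence_angle_deg(distance_m, ipd_m=0.063):
    """Angle between the two eyes' lines of sight when both aim at a point.

    ipd_m is the distance between the eyes (an assumed typical value).
    """
    return math.degrees(2 * math.atan(ipd_m / (2 * distance_m)))

def focus_demand_diopters(distance_m):
    """Focusing power needed for a given distance: diopters = 1 / meters."""
    return 1.0 / distance_m

def focus_mismatch_diopters(virtual_distance_m, display_focal_distance_m):
    """Gap between where the eyes converge and where the lenses must focus."""
    return abs(focus_demand_diopters(virtual_distance_m)
               - focus_demand_diopters(display_focal_distance_m))

DISPLAY_FOCAL_DISTANCE_M = 2.0  # hypothetical fixed focus of the headset optics

for virtual_distance in (2.0, 1.0, 0.5):
    print(virtual_distance,
          round(vergence_angle_deg(virtual_distance), 2),
          focus_mismatch_diopters(virtual_distance, DISPLAY_FOCAL_DISTANCE_M))
```

Under these assumptions, a virtual object placed at the display’s focal distance produces no mismatch, while one that appears half a meter away asks the eyes to converge for a near object as the lenses stay focused 1.5 diopters farther out.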
“It’s hard to deliver the world the way that you see it, through any particular technology,” Silverstein says. “And doing that on the head is really complicated because you can only have so much technology comfortably worn by a person.”
Ultimately, human perception itself is not something technology can easily emulate.
As Silverstein puts it: “Perception is who we are. And it’s different for everyone.”
Misconception #4: Seeing is simple.

Tadin often asks students in his introductory class on perception: “Which is easier, math or vision?” Most students choose vision.
“Vision seems so effortless that it is easy to underestimate its complexity,” Tadin says. “But vision feels effortless only because so much of our brain’s computing power is dedicated to it. It is a really difficult process to replicate with computers or even AI. In contrast, calculators and computers can easily do challenging math problems.”
Building convincing AR and VR systems, therefore, isn’t simply about adding more pixels, brighter displays, or faster processors. The challenge is matching an extremely complex process that we barely notice happening. Seeing feels effortless, but it is anything but simple. Every moment, the eyes and brain work together to sample, filter, and interpret the world around us.
What feels like reality is, in many ways, an experience the brain is constantly constructing.