Summary: Researchers have developed MovieNet, an AI model inspired by the human brain, to understand and analyze moving images with unprecedented accuracy. Mimicking how neurons process visual sequences, MovieNet can identify subtle changes in dynamic scenes while using significantly less data and energy than traditional AI.
In testing, MovieNet outperformed current AI models and even human observers in recognizing behavioral patterns, such as tadpole swimming under different conditions. Its eco-friendly design and potential to revolutionize fields like medicine and drug screening highlight the transformative power of this breakthrough.
Key Facts:
- Brain-Like Processing: MovieNet mimics neurons to process video sequences with high precision, distinguishing dynamic scenes better than traditional AI models.
- High Efficiency: MovieNet achieves superior accuracy while using less energy and data, making it more sustainable and scalable for various applications.
- Medical Potential: The AI could aid in early detection of diseases like Parkinson’s by identifying subtle changes in movement, as well as enhancing drug screening methods.
Source: Scripps Research Institute
Imagine an artificial intelligence (AI) model that can watch and understand moving images with the subtlety of a human brain.
Now, scientists at Scripps Research have made this a reality by creating MovieNet: an innovative AI that processes videos much as our brains interpret real-life scenes unfolding over time.
This brain-inspired AI model, detailed in a study published in the Proceedings of the National Academy of Sciences on November 19, 2024, can perceive moving scenes by simulating how neurons—or brain cells—make real-time sense of the world.
Conventional AI excels at recognizing still images, but MovieNet introduces a method for machine-learning models to recognize complex, changing scenes—a breakthrough that could transform fields from medical diagnostics to autonomous driving, where discerning subtle changes over time is crucial.
In the team’s tests, MovieNet was also more accurate than conventional AI models while using less data and processing time, which makes it more environmentally sustainable.
“The brain doesn’t just see still frames; it creates an ongoing visual narrative,” says senior author Hollis Cline, PhD, the director of the Dorris Neuroscience Center and the Hahn Professor of Neuroscience at Scripps Research.
“Static image recognition has come a long way, but the brain’s capacity to process flowing scenes—like watching a movie—requires a much more sophisticated form of pattern recognition. By studying how neurons capture these sequences, we’ve been able to apply similar principles to AI.”
To create MovieNet, Cline and first author Masaki Hiramoto, a staff scientist at Scripps Research, examined how the brain processes real-world scenes as short sequences, similar to movie clips. Specifically, the researchers studied how tadpole neurons responded to visual stimuli.
“Tadpoles have a very good visual system, plus we know that they can detect and respond to moving stimuli efficiently,” explains Hiramoto.
He and Cline identified neurons that respond to movie-like features—such as shifts in brightness and image rotation—and can recognize objects as they move and change. Located in the brain’s visual processing region known as the optic tectum, these neurons assemble parts of a moving image into a coherent sequence.
Think of this process as similar to a lenticular puzzle: each piece alone may not make sense, but together they form a complete image in motion.
Different neurons process various “puzzle pieces” of a real-life moving image, which the brain then integrates into a continuous scene.
The researchers also found that the tadpoles’ optic tectum neurons distinguished subtle changes in visual stimuli over time, capturing information in dynamic clips roughly 100 to 600 milliseconds long rather than in still frames.
These neurons are highly sensitive to patterns of light and shadow, and each neuron’s response to a specific part of the visual field helps construct a detailed map of a scene to form a “movie clip.”
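To give a sense of scale for the 100 to 600 millisecond window described above, the short Python sketch below converts those durations into frame counts and slices a frame sequence into overlapping clips. It is only an illustration of the time scale, not the researchers’ method: the 30 frames-per-second rate, the stride, and the helper names `clip_length_in_frames` and `sliding_clips` are all assumptions made for this example.

```python
import math

def clip_length_in_frames(duration_ms, fps):
    """Number of whole frames needed to cover duration_ms at the given frame rate."""
    return max(1, math.ceil(duration_ms * fps / 1000))

def sliding_clips(frames, clip_len, stride):
    """Split a frame sequence into overlapping clips of clip_len frames each."""
    return [frames[i:i + clip_len]
            for i in range(0, len(frames) - clip_len + 1, stride)]

FPS = 30  # assumed frame rate; the article does not state one

shortest = clip_length_in_frames(100, FPS)  # frames in a 100 ms clip
longest = clip_length_in_frames(600, FPS)   # frames in a 600 ms clip

frames = list(range(60))  # stand-in for 2 seconds of video (frame indices only)
clips = sliding_clips(frames, longest, stride=shortest)

print(f"100 ms = {shortest} frames, 600 ms = {longest} frames at {FPS} fps")
print(f"{len(clips)} overlapping clips from {len(frames)} frames")
```

At this assumed frame rate a single “movie clip” spans only a handful of frames, which is why a model working on such clips handles far shorter sequences than a full video.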
Cline and Hiramoto trained MovieNet to emulate this brain-like processing and encode video clips as a series of small, recognizable visual cues. This permitted the AI model to distinguish subtle differences among dynamic scenes.
To test MovieNet, the researchers showed it video clips of tadpoles swimming under different conditions.
Not only did MovieNet achieve 82.3 percent accuracy in distinguishing normal from abnormal swimming behaviors, but it also exceeded the abilities of trained human observers by about 18 percent. It even outperformed existing AI models such as Google’s GoogLeNet, which achieved just 72 percent accuracy despite its extensive training and processing resources.
“This is where we saw real potential,” points out Cline.
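The accuracy figures reported above can be compared in two ways, as an absolute gap in percentage points or as a relative improvement. The sketch below works through both for MovieNet and GoogLeNet, using only the two numbers given in the article; the function name `accuracy_gap` is made up for this example. The “about 18 percent” edge over human observers is not reproduced here, because the article does not give the human baseline.

```python
def accuracy_gap(a_percent, b_percent):
    """Return (gap in percentage points, gap as a percent of b_percent)."""
    points = a_percent - b_percent
    relative = 100 * points / b_percent
    return round(points, 1), round(relative, 1)

MOVIENET_ACC = 82.3   # percent, as reported in the article
GOOGLENET_ACC = 72.0  # percent, as reported in the article

points, relative = accuracy_gap(MOVIENET_ACC, GOOGLENET_ACC)
print(f"{points} percentage points, {relative}% relative improvement")
```

The distinction matters when reading claims like “X percent better”: the same pair of results gives a different number depending on which convention is used.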
The team determined that MovieNet was not only better than current AI models at understanding changing scenes, but that it also used less data and processing time.
MovieNet’s ability to simplify data without sacrificing accuracy also sets it apart from conventional AI. By breaking down visual information into essential sequences, MovieNet effectively compresses data like a zipped file that retains critical details.
Beyond its high accuracy, MovieNet is an eco-friendly AI model. Conventional AI processing demands immense energy, leaving a heavy environmental footprint. MovieNet’s reduced data requirements offer a greener alternative that conserves energy while performing at a high standard.
“By mimicking the brain, we’ve managed to make our AI far less demanding, paving the way for models that aren’t just powerful but sustainable,” says Cline. “This efficiency also opens the door to scaling up AI in fields where conventional methods are costly.”
In addition, MovieNet has the potential to reshape medicine. As the technology advances, it could become a valuable tool for identifying subtle changes in early-stage conditions, such as detecting irregular heart rhythms or spotting the first signs of neurodegenerative diseases like Parkinson’s.
For example, small motor changes related to Parkinson’s that are often hard for human eyes to discern could be flagged by the AI early on, providing clinicians valuable time to intervene.
Furthermore, MovieNet’s ability to perceive changes in tadpole swimming patterns when tadpoles were exposed to chemicals could lead to more precise drug screening techniques, as scientists could study dynamic cellular responses rather than relying on static snapshots.
“Current methods miss critical changes because they can only analyze images captured at intervals,” remarks Hiramoto.
“Observing cells over time means that MovieNet can track the subtlest changes during drug testing.”
Looking ahead, Cline and Hiramoto plan to continue refining MovieNet’s ability to adapt to different environments, enhancing its versatility and potential applications.
“Taking inspiration from biology will continue to be a fertile area for advancing AI,” says Cline. “By designing models that think like living organisms, we can achieve levels of efficiency that simply aren’t possible with conventional approaches.”
Funding: The study, “Identification of movie encoding neurons enables movie recognition AI,” was supported by funding from the National Institutes of Health (R01EY011261, R01EY027437 and R01EY031597), the Hahn Family Foundation and the Harold L. Dorris Neurosciences Center Endowment Fund.
About this AI research news
Author: Press Office
Contact: Press Office – Scripps Research Institute
Original Research: Open access.
“Identification of movie encoding neurons enables movie recognition AI” by Hollis Cline et al. PNAS
Abstract
Identification of movie encoding neurons enables movie recognition AI
Natural visual scenes are dominated by spatiotemporal image dynamics, but how the visual system integrates “movie” information over time is unclear.
We characterized optic tectal neuronal receptive fields using sparse noise stimuli and reverse correlation analysis.
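Reverse correlation with sparse noise is commonly implemented as a spike-triggered average: the stimulus frames that preceded each spike are averaged to estimate which part of the visual field drives the neuron. The following is a minimal Python sketch of that general idea, not the study’s analysis code. The toy neuron, the 16-pixel stimulus, the fixed two-frame delay, and the function name `spike_triggered_average` are all assumptions made for illustration.

```python
import random

def spike_triggered_average(stimulus, spikes, lag):
    """Average the stimulus frames shown `lag` frames before each spike.

    stimulus: list of frames, each a flat list of pixel values
    spikes:   list of spike counts, one per frame
    lag:      how many frames back to look from each spike
    """
    n_pixels = len(stimulus[0])
    total = [0.0] * n_pixels
    n_spikes = 0
    for t in range(lag, len(stimulus)):
        count = spikes[t]
        if count == 0:
            continue
        frame = stimulus[t - lag]
        for i in range(n_pixels):
            total[i] += count * frame[i]
        n_spikes += count
    if n_spikes == 0:
        return total  # no spikes: return all zeros
    return [value / n_spikes for value in total]

# Toy data (hypothetical): one randomly chosen pixel is lit in each frame.
random.seed(0)
N_PIXELS, N_FRAMES, TRUE_LAG, TARGET = 16, 2000, 2, 5
stimulus = []
for _ in range(N_FRAMES):
    frame = [0] * N_PIXELS
    frame[random.randrange(N_PIXELS)] = 1
    stimulus.append(frame)

# Toy neuron: fires once, TRUE_LAG frames after pixel TARGET is lit.
spikes = [0] * N_FRAMES
for t in range(TRUE_LAG, N_FRAMES):
    if stimulus[t - TRUE_LAG][TARGET] == 1:
        spikes[t] = 1

sta = spike_triggered_average(stimulus, spikes, TRUE_LAG)
best_pixel = max(range(N_PIXELS), key=lambda i: sta[i])
print("Estimated receptive-field pixel:", best_pixel)
```

Repeating the average over a range of lags gives a spatiotemporal receptive field, which is the kind of estimate needed to ask how a neuron responds to image sequences rather than to single frames.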
Neurons recognized movies of ~200-600 ms durations with defined start and stop stimuli. Movie durations from start to stop responses were tuned by sensory experience through a hierarchical algorithm.
Neurons encoded families of image sequences following trigonometric functions. Spike sequence and information flow suggest that repetitive circuit motifs underlie movie detection.
Principles of frog topographic retinotectal plasticity and cortical simple cells are employed in machine learning networks for static image recognition, suggesting that discoveries of principles of movie encoding in the brain, such as how image sequences and duration are encoded, may benefit movie recognition technology.
We built and trained a machine learning network that mimicked neural principles of visual system movie encoders.
The network, named MovieNet, outperformed current machine learning image recognition networks in classifying natural movie scenes, while reducing data size and steps to complete the classification task.
This study reveals how movie sequences and time are encoded in the brain and demonstrates that brain-based movie processing principles enable efficient machine learning.