Web Desk | October 1, 2023 | 3 min read
Creepy AI Extracts Audio from Muted Videos and Images

A team led by Kevin Fu, a professor of electrical and computer engineering and computer science at Northeastern University, has developed a machine learning tool called “Side Eye” that can extract audio information from a still image. With it, researchers can discern the gender of a person speaking in the room where a photograph was taken, transcribe spoken words, and identify the location. The tool also works on muted videos.

Side Eye works by harnessing image stabilization technology commonly found in smartphone cameras. Smartphone cameras use a lens suspension system with springs submerged in liquid to keep photos clear and focused even when the photographer’s hand is unsteady.

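These cameras use sensors and an electromagnet to counteract movement: when the phone shifts, the lens is pushed in the opposite direction, which stabilizes the image. When someone speaks close to the camera while a photo is being taken, the sound generates subtle vibrations in the springs, slightly altering the path of the light. Side Eye recovers audio information from these vibrations by exploiting the rolling shutter, the readout method used by most smartphone cameras, in which the sensor captures an image one row of pixels at a time rather than all at once. Because each row is recorded at a slightly different moment, the tiny distortions from row to row preserve a trace of how the lens was vibrating, at frequencies far higher than the camera's frame rate alone could capture.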
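To make the rolling shutter idea concrete, here is a minimal Python sketch. It is a toy simulation, not the Side Eye team's method, which relies on machine learning. It assumes a hypothetical camera that reads 240 rows per frame at 30 frames per second with no pause between frames, and it assumes each row's distortion is directly proportional to a single vibrating tone. The function names `simulate_row_shifts` and `dominant_frequency` are invented for this illustration.

```python
import math


def simulate_row_shifts(tone_hz, fps=30, rows=240, frames=4):
    """Simulate the sideways shift seen in each pixel row, in readout order.

    Assumption: rows are read back to back with no gap between frames,
    and the shift of a row equals the lens displacement at its readout time.
    """
    row_rate = fps * rows          # row readouts per second
    total_rows = rows * frames
    shifts = [
        math.sin(2 * math.pi * tone_hz * i / row_rate)
        for i in range(total_rows)
    ]
    return shifts, row_rate


def dominant_frequency(samples, sample_rate):
    """Find the strongest frequency with a naive discrete Fourier transform."""
    n = len(samples)
    best_bin, best_power = 0, 0.0
    for k in range(1, n // 2):
        step = 2 * math.pi * k / n
        re = sum(s * math.cos(step * i) for i, s in enumerate(samples))
        im = sum(s * math.sin(step * i) for i, s in enumerate(samples))
        power = re * re + im * im
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate / n


# Treat the per-row shifts as an audio-like signal and look for the tone.
shifts, row_rate = simulate_row_shifts(tone_hz=300.0)
print(row_rate)                                # row readouts per second
print(dominant_frequency(shifts, row_rate))    # should land on the 300 Hz tone
```

Sampled once per frame, a 30 fps camera could only capture vibrations up to 15 Hz. Sampled once per row, the same hypothetical camera takes 7,200 readings per second, which is enough in this idealized setup to pick out a 300 Hz tone. Real footage would be far noisier, which is presumably part of why the researchers turn to machine learning.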

Side Eye is still at a rudimentary stage and needs a significant amount of training data to improve, but it already raises concerns about misuse. In the wrong hands, a more advanced version of this technology could become a significant cybersecurity threat. There are also positive applications, such as helping law enforcement agencies investigate crimes by providing valuable digital evidence.

