Researchers at MIT, Microsoft, and Adobe have developed an algorithm that can reconstruct an audio signal by analyzing minute vibrations of objects depicted in video. In one set of experiments, using a high-speed camera, they were able to recover intelligible speech from the vibrations of a potato-chip bag photographed from 15 feet away through soundproof glass.
In other experiments, however, they used an ordinary digital camera. Because of a quirk in the design of most cameras' sensors — the rolling shutter, which reads out the rows of each frame sequentially rather than all at once — the researchers were able to infer information about high-frequency vibrations even from video recorded at a standard 60 frames per second. While this audio reconstruction wasn't as faithful as it was with the high-speed camera, it may still be good enough to identify the gender of a speaker in a room; the number of speakers; and even, given accurate enough information about the acoustic properties of speakers' voices, their identities.
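To see why a rolling shutter helps, consider that each sensor row captures the scene at a slightly different instant, so a 60 fps video with R rows effectively samples motion at roughly 60 × R Hz along one spatial axis. The sketch below illustrates only that timing idea, not the researchers' actual reconstruction pipeline; the function name and the drastic simplification of collapsing each row to its mean brightness are illustrative assumptions.

```python
import numpy as np

def rolling_shutter_samples(video, fps):
    """Treat each sensor row as its own sampling instant.

    With a rolling shutter, rows within a frame are exposed one after
    another, so per-row measurements form a time series sampled much
    faster than the nominal frame rate.

    video: grayscale array of shape (n_frames, n_rows, n_cols).
    Returns (timestamps, signal): one value per row-instant.
    """
    n_frames, n_rows, _ = video.shape
    # Collapse each row to a single brightness value; tiny vibrations
    # would show up as fluctuations in these per-row means.
    signal = video.mean(axis=2).reshape(-1)  # length n_frames * n_rows
    # Simplifying assumption: rows are read out evenly across each
    # frame interval, with no idle time between frames.
    dt = 1.0 / (fps * n_rows)
    timestamps = np.arange(signal.size) * dt
    return timestamps, signal

# Toy usage: 30 frames of a synthetic 8-row, 16-column video at 60 fps.
video = np.random.rand(30, 8, 16)
t, s = rolling_shutter_samples(video, fps=60)
# Effective sampling rate here is 60 * 8 = 480 Hz, versus 60 Hz
# if each frame contributed only a single sample.
```

In a real camera with over a thousand rows, the same arithmetic pushes the effective sampling rate well into the audible range, which is what makes speech recovery from ordinary 60 fps video plausible at all.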
Anyone who hasn’t heard of the “Great Seal Bug Story” of Cold War fame should definitely check it out.
Putting the apathy of mainstream consumers aside, there’s a much deeper problem with the whole unbundling strategy. It only works when the behavior being split off is distinct enough to stand on its own as a separate application.
“Officers have tanks now. They have drones. They have automatic rifles, and planes, and helicopters, and they go through military-style boot camp training. It’s a constant complaint from what remains...”