December 27, 2016

Blur Won’t Deter: Machine Learning IDs Pixelated Faces

Forget pixelation and blurs—hackers can descramble your security.

Securing sensitive information just got a whole lot more onerous. A team of researchers from the University of Texas at Austin and Cornell Tech has succeeded in descrambling pixelated and blurred images using mainstream machine learning techniques. This means that standard masking techniques used to hide faces or sensitive information such as house numbers or license plates in digital images may soon lose their effectiveness. The same holds true for obscuring sensitive chunks of text in redacted digital and online documents.

“We argue that humans may no longer be the ‘gold standard’ for extracting information from visual data,” the researchers say. “Trained machine learning models now outperform humans on tasks such as object recognition and determining the geographic location of an image.”

Seeing Better Than Humans

The team found they could train neural networks to decipher pixelated or blurred images by feeding them sample image sets for analysis. These networks then used their knowledge of the originals to crack cloaked images. No custom programming or new image-decoding technology was needed.
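The idea of training on obscured samples can be illustrated with a toy sketch. This is not the researchers' code: it swaps their neural networks for a simple nearest-centroid classifier, and the tiny synthetic 8x8 "images", the function names, and the noise level are all assumptions made for illustration. Each hypothetical identity is a coarse bright/dark pattern, which is exactly the kind of structure that survives pixelation.

```python
import random

SIZE, BLOCK = 8, 4  # toy 8x8 images, pixelated into 4x4-pixel blocks


def make_image(label, rng):
    """Build a noisy toy image; each quadrant is bright or dark per a bit of label."""
    img = []
    for r in range(SIZE):
        row = []
        for c in range(SIZE):
            quadrant = (r // BLOCK) * 2 + (c // BLOCK)
            base = 0.8 if (label >> quadrant) & 1 else 0.2
            row.append(base + rng.uniform(-0.05, 0.05))  # per-pixel detail
        img.append(row)
    return img


def pixelate(img, block=BLOCK):
    """Pixelate by averaging each block; return the block means as features."""
    feats = []
    for br in range(0, len(img), block):
        for bc in range(0, len(img[0]), block):
            cells = [img[r][c]
                     for r in range(br, br + block)
                     for c in range(bc, bc + block)]
            feats.append(sum(cells) / len(cells))
    return feats


def train(samples):
    """Average the pixelated features per label (nearest-centroid 'training')."""
    sums, counts = {}, {}
    for feats, label in samples:
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def predict(model, feats):
    """Return the label whose centroid is closest to the pixelated features."""
    return min(model, key=lambda lab: sum((a - b) ** 2
                                          for a, b in zip(model[lab], feats)))


def run_demo(num_classes=8, per_class=5, seed=0):
    """Train and test on pixelated images only; return the accuracy."""
    rng = random.Random(seed)
    train_set = [(pixelate(make_image(k, rng)), k)
                 for k in range(num_classes) for _ in range(per_class)]
    model = train(train_set)
    test_set = [(pixelate(make_image(k, rng)), k)
                for k in range(num_classes) for _ in range(per_class)]
    hits = sum(predict(model, feats) == k for feats, k in test_set)
    return hits / len(test_set)


if __name__ == "__main__":
    print("accuracy on pixelated images:", run_demo())
    print("random-guess rate:", 1 / 8)
```

In this toy setup the classifier never sees an unobscured image, yet it identifies every pixelated test sample, because the block averages still differ from one identity to the next. Real faces are far messier, which is why the researchers needed neural networks and large training sets.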

The researchers also discovered that the more words, faces, or objects the network is exposed to, the more proficient it gets at descrambling images. In some cases, they achieved recognition success rates above 80 percent and even 90 percent, compared with a random-guess rate of just 0.19 percent.
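To put those percentages in perspective, here is a small arithmetic sketch. It assumes the random-guess rate comes from picking uniformly among equally likely candidates, so the rate is 1 divided by the number of candidates. The article does not state how many candidates were involved, so the count below is only what that assumption implies, and the function names are hypothetical.

```python
def implied_candidates(guess_rate):
    """Number of equally likely candidates implied by a uniform guess rate."""
    return round(1.0 / guess_rate)


def improvement_factor(success_rate, guess_rate):
    """How many times better than random guessing a success rate is."""
    return success_rate / guess_rate


if __name__ == "__main__":
    guess_rate = 0.0019  # 0.19 percent, as reported above
    print("implied candidates:", implied_candidates(guess_rate))
    print("80% vs. guessing:", improvement_factor(0.80, guess_rate))
    print("90% vs. guessing:", improvement_factor(0.90, guess_rate))
```

Under that assumption, a 0.19 percent guess rate corresponds to a pool of roughly 500 candidates, and an 80 percent success rate is more than 400 times better than chance.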

With advances in computer optics and artificial intelligence, machines can now see things humans can’t. The availability of tools to defeat these digital cloaking techniques is exploding. And because these research results were achieved with what are essentially off-the-shelf technologies, cybercriminals with rudimentary technical knowledge can easily mount attacks by decoding shrouded information. There are even online tutorials that teach the uninitiated the machine learning methods used in the research.

As the volume of user-generated images and videos on social media, video-streaming, and photo-sharing sites grows, the ability to descramble this content becomes a critical concern. These visuals often leak sensitive information not only about users but also about bystanders accidentally captured in the frame. The leaked information can also include other identifiers such as physical objects, typed or handwritten text, and the contents of computer screens.

Turbocharge Security Strategies

Privacy and security experts need to be aware that advances in machine learning and recognition tools dramatically increase security challenges. Full encryption is not a viable solution because it blocks all forms of image recognition, which makes the image unusable. Covering sensitive regions with solid black boxes, by contrast, provides total coverage and leaves no image traces behind to decipher. Replacing faces or other content with random images before blurring or pixelating can also be an effective defense.
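The difference between pixelating and blacking out can be seen in a minimal sketch. The two 4x4 "faces" and the function names are made up for illustration: pixelation replaces each tile with its average, so the output still depends on the original, while a black box produces the same output whatever it covers.

```python
def pixelate(img, block):
    """Replace every block x block tile with the mean of its pixels."""
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    for br in range(0, rows, block):
        for bc in range(0, cols, block):
            cells = [(r, c)
                     for r in range(br, min(br + block, rows))
                     for c in range(bc, min(bc + block, cols))]
            mean = sum(img[r][c] for r, c in cells) / len(cells)
            for r, c in cells:
                out[r][c] = mean
    return out


def black_box(img):
    """Overwrite every pixel with black, discarding the original content."""
    return [[0.0 for _ in row] for row in img]


# Two hypothetical 4x4 grayscale "faces" with different coarse structure.
face_a = [[0.9, 0.9, 0.1, 0.1],
          [0.9, 0.9, 0.1, 0.1],
          [0.1, 0.1, 0.9, 0.9],
          [0.1, 0.1, 0.9, 0.9]]
face_b = [[0.1, 0.1, 0.9, 0.9],
          [0.1, 0.1, 0.9, 0.9],
          [0.9, 0.9, 0.1, 0.1],
          [0.9, 0.9, 0.1, 0.1]]

if __name__ == "__main__":
    # Pixelated versions still differ, so a model can tell them apart.
    print("pixelated outputs differ:", pixelate(face_a, 2) != pixelate(face_b, 2))
    # Blacked-out versions are identical, so nothing is left to learn from.
    print("black-box outputs identical:", black_box(face_a) == black_box(face_b))
```

Any masking method whose output varies with the hidden content gives a trained model something to work with. A solid box does not, at the cost of removing the region from the image entirely.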

Bottom line: Security professionals need to understand the extent to which cloaked information can be reconstructed. As the study notes, accelerating advances in machine learning shift the security power equation in favor of adversaries.
