United States. Security cameras, smartphones, and speakers are just a few of the devices that will soon run more AI software to speed up image- and speech-processing tasks.
A compression technique known as quantization is paving the way by making deep learning models smaller, which reduces computing and energy costs. But it turns out that smaller models also make it easier for malicious attackers to trick an AI system into misbehaving, a growing concern as more complex decision-making is handed off to machines.
In a new study, researchers at MIT and IBM show how vulnerable compressed AI models are to adversarial attacks, and offer a solution: add a mathematical constraint during the quantization process to reduce the odds that an AI will fall prey to a slightly modified image and misclassify what it sees.
When a deep learning model is reduced from the standard 32 bits to a lower bit length, it is more likely to misclassify altered images because of an error amplification effect: the manipulated image becomes more distorted with each additional layer of processing. By the end, the model is more likely to mistake a bird for a cat, for example, or a frog for a deer.
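A toy sketch can make the effect concrete. It is not the researchers' code: the `quantize` and `quantized_gap` helpers, the [0, 1] value range, and the two sample pixel values are all assumptions chosen for illustration. It rounds a clean value and a slightly perturbed value to a fixed number of levels and measures how far apart they end up.

```python
def quantize(x, bits, lo=0.0, hi=1.0):
    """Snap x to the nearest of 2**bits evenly spaced levels in [lo, hi].

    Illustrative uniform quantizer (an assumption, not the study's scheme).
    Requires bits >= 1.
    """
    levels = 2 ** bits - 1            # number of steps between lo and hi
    step = (hi - lo) / levels
    x = min(max(x, lo), hi)           # clip to the representable range
    return lo + round((x - lo) / step) * step


def quantized_gap(clean, perturbed, bits):
    """Distance between two inputs after both have been quantized."""
    return abs(quantize(perturbed, bits) - quantize(clean, bits))


if __name__ == "__main__":
    clean, perturbed = 0.16, 0.17     # hypothetical pixel values, 0.01 apart
    for bits in (8, 4, 2):
        print(bits, quantized_gap(clean, perturbed, bits))
```

In this example the two values stay less than 0.01 apart at 8 bits, but at 2 bits they land in different buckets and end up a third of the full range apart. A real network repeats this kind of rounding at every layer, which is where the compounding comes from.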
Models quantized to 8 bits or fewer are more susceptible to adversarial attacks, the researchers show, with accuracy dropping from 30-40 percent to less than 10 percent as bit width decreases. But controlling the Lipschitz constraint during quantization restores some resilience. When the researchers added the constraint, they saw small performance gains under attack, with the smaller models in some cases outperforming the 32-bit model.
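The article does not spell out the form of the constraint, so the following is only a sketch under stated assumptions. For a linear layer, the Lipschitz constant is the largest singular value of its weight matrix, and one regularizer sometimes used to hold that value near 1 penalizes the distance between W^T W and the identity. The function names below are hypothetical, and the power iteration assumes a nonzero matrix and a start vector that is not orthogonal to the top singular direction.

```python
import math


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def orthogonality_penalty(w):
    """Squared Frobenius norm of W^T W - I; zero when W has orthonormal columns."""
    gram = matmul(transpose(w), w)
    n = len(gram)
    return sum((gram[i][j] - (1.0 if i == j else 0.0)) ** 2
               for i in range(n) for j in range(n))


def spectral_norm(w, iters=200):
    """Largest singular value of W, via power iteration on W^T W."""
    gram = matmul(transpose(w), w)
    v = [1.0] * len(gram)
    for _ in range(iters):
        v = [sum(g * x for g, x in zip(row, v)) for row in gram]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    gv = [sum(g * x for g, x in zip(row, v)) for row in gram]
    return math.sqrt(sum(x * y for x, y in zip(v, gv)))   # sqrt of Rayleigh quotient


def worst_case_gap(weights, eps):
    """Upper bound on how far an input gap of size eps can grow through linear layers."""
    gap = eps
    for w in weights:
        gap *= spectral_norm(w)
    return gap


if __name__ == "__main__":
    stretch = [[2.0, 0.0], [0.0, 0.5]]                     # Lipschitz constant 2
    c, s = math.cos(0.3), math.sin(0.3)
    rotation = [[c, -s], [s, c]]                           # Lipschitz constant 1
    for name, w in (("stretch", stretch), ("rotation", rotation)):
        print(name, orthogonality_penalty(w), worst_case_gap([w] * 3, 0.01))
```

Three stacked layers that each double the gap can turn a perturbation of 0.01 into as much as 0.08, while three layers with a Lipschitz constant of 1 leave the bound at 0.01. A training loss would add the penalty, scaled by a small coefficient, to the usual classification loss so that weights are nudged toward the second case.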
"Our technique limits error amplification and can even make compressed deep learning models more robust than full-precision models," says Song Han, an assistant professor in MIT's Department of Electrical Engineering and Computer Science and a member of MIT's Microsystems Technology Laboratories. "With proper quantization, we can limit the error."
The team plans to further improve the technique by training it on larger datasets and applying it to a wider range of models. "Deep learning models need to be fast and secure as they move into a world of internet-connected devices," says study co-author Chuang Gan, a researcher at the MIT-IBM Watson AI Lab. "Our defensive quantization technique helps on both fronts."
In making AI models smaller so that they run faster and use less energy, Han is using AI itself to push the boundaries of model compression technology.
In related recent work, Han and his colleagues show how reinforcement learning can be used to automatically find the smallest bit length for each layer in a quantized model, based on how quickly the device running the model can process images. This flexible bit-width approach reduces latency and power usage by up to 200 percent compared to a fixed 8-bit model, Han says.
Source: MIT.


