Machine learning security is an emerging concern for companies, as recent research by teams from Google Brain, OpenAI, the US Army Research Laboratory, and top universities has shown how machine learning models can be manipulated to return results of the attacker's choosing. One area of significant findings has been image recognition models.
Image recognition is one of the stalwarts of machine learning and deep learning systems, allowing for superhuman performance on classification tasks and enabling proofs of concept in autonomous vehicles. Recent, highly successful research exploiting image recognition models, specifically convolutional neural networks, is especially troubling for autonomous vehicles, as attackers could theoretically take control of vehicles, or at least cause them to lose control. Advancements by Geoffrey Hinton and his team address a few of the key problems plaguing convolutional neural networks, or CNNs (more on those below); however, definitive research has not yet been performed to check whether they also fix the security problems.
I'll outline several security issues that exist in current algorithmic deployments and then walk through some steps to take to provide assurance over algorithmic integrity.
CNNs consist of many layers of artificial neurons, with each layer looking for a different aspect of the image in question. For instance, when considering a photograph of an individual's face, one layer of the network might look at the left eyelid, another at the right, another at the eyelashes, another at the nose, and so on, until the whole image has been accounted for. Because features of a face are examined separately rather than holistically, the model does not take spatial orientation into account. For example, if the left eye were moved to where the mouth is, and the mouth moved to where the left eye should be, a CNN would have a high likelihood of considering the original and rearranged images identical (see this blog post on Hackernoon for an example). This lack of "robustness" has serious implications for the use of image recognition systems in the real world.
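The spatial-orientation problem can be sketched in a few lines of NumPy. The "face," "eye," and "mouth" below are made-up toy arrays, not real image data; the point is that after convolving with feature detectors and applying global max pooling (an aggressive form of the pooling CNNs use), the original and rearranged images produce identical feature responses.

```python
import numpy as np

def convolve2d(image, kernel):
    """Valid-mode 2D cross-correlation, the core CNN operation."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# Hypothetical 8x8 "face": an "eye" patch in the top half, a "mouth" below.
eye = np.array([[1., 1.], [1., 1.]])
mouth = np.array([[1., 0.], [0., 1.]])

face = np.zeros((8, 8))
face[1:3, 3:5] = eye
face[5:7, 3:5] = mouth

# Scrambled version: the eye and the mouth swap places.
scrambled = np.zeros((8, 8))
scrambled[1:3, 3:5] = mouth
scrambled[5:7, 3:5] = eye

# Two filters, one tuned to each feature; global max pooling per map
# keeps only "was this feature present anywhere?" and discards location.
filters = [eye, mouth]
pooled_face = [convolve2d(face, f).max() for f in filters]
pooled_scrambled = [convolve2d(scrambled, f).max() for f in filters]

print(pooled_face == pooled_scrambled)  # the pooled features are identical
```

Real CNNs pool locally rather than globally, so the effect is less extreme, but the same information loss accumulates layer by layer.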
While nearly all machine learning models are open to manipulation, I want to highlight CNNs here because 1) examples of manipulation are easier to understand, and 2) CNN manipulation can have some of the most egregious potential outcomes.
Recent research has shown that any machine learning classifier (simply a model that chooses between outcomes: yes/no, true/false, Andrew Clark/Jeff Bezos) can be tricked into returning any result the attacker wants. This type of attack is known as an adversarial attack, and several specific attack techniques, such as the Fast Gradient Sign Method (FGSM), are fast, efficient, and easy for machine learning experts to use. These techniques have been implemented and experimented with by ML researchers and students, and the code is publicly available to anyone on GitHub. Using this available code, hackers can mount non-targeted attacks, adding noise to an image to make the classifier return an incorrect result, or targeted attacks that force a specific classification, such as classifying a panda as a gibbon (as shown in the paper Explaining and Harnessing Adversarial Examples).
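The core FGSM idea fits in a few lines and needs no deep learning framework: nudge every input feature a small step in the direction that increases the model's loss. The sketch below attacks a hypothetical logistic-regression classifier with made-up weights and input (not the panda/gibbon model from the paper), but the mechanism is the same one used against CNNs.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x, w, b):
    """Probability that input x belongs to class 1."""
    return sigmoid(np.dot(w, x) + b)

def fgsm_perturb(x, w, b, y_true, eps):
    """Fast Gradient Sign Method: move each feature by eps in the
    sign of the loss gradient, increasing the loss for the true label."""
    p = predict(x, w, b)
    grad_x = (p - y_true) * w  # gradient of the logistic loss w.r.t. x
    return x + eps * np.sign(grad_x)

rng = np.random.default_rng(0)
w = rng.normal(size=100)   # hypothetical trained weights
b = 0.0
x = 0.02 * np.sign(w)      # a benign input the model labels class 1

p_clean = predict(x, w, b)
x_adv = fgsm_perturb(x, w, b, y_true=1.0, eps=0.05)
p_adv = predict(x_adv, w, b)

# Each feature moves by at most 0.05, yet the predicted class flips.
print(f"clean: {p_clean:.2f}, adversarial: {p_adv:.2f}")
```

The perturbation budget `eps` bounds how much any single feature changes, which is why adversarial images can look unchanged to a human while the classifier's answer flips.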
When considering additional examples such as check-scanning software and autonomous vehicles, both of which rely on CNNs, security risk and mitigation of this type of machine learning manipulation become critical. As with all areas of information security, it is much easier to attack a system than to properly defend it. However, the lack of widespread adoption of machine learning in enterprises and the currently limited use of autonomous vehicles mean that there is still time to prepare enterprise systems for these sorts of attacks.
Finally, neural networks satisfy a property called the universal approximation theorem: in principle, they can approximate any well-behaved function, including a more attack-resistant decision boundary. So while all classification models are subject to attack, neural networks are best positioned to become robust to attacks through training.
One particularly disturbing aspect of these attacks is that no training data or model details are needed to implement them successfully: research has shown that even models the attacker has never seen before, hosted online, are susceptible to compromise. Models embedded in IoT and other devices are subject to attack as well.
Before examining the more complex aspects of algorithms in the enterprise, a basic machine learning assurance framework needs to be in place. My recent article in the ISACA Journal provides a rudimentary guide for developing a machine learning assurance framework based on the CRISP-DM model. When baseline procedures are in place, a more targeted, deep-dive examination can commence. However, in-depth reviews will require machine learning and deep learning subject matter experts, usually with advanced STEM degrees.
Another helpful resource is CleverHans, a library for training and testing machine learning models against adversarial attacks that also provides a framework for creating more robust models. Additional training and evaluation of high-risk, high-impact machine learning models with adversarial frameworks is a necessity for enterprise deployments to reduce the risk of attack.
The more complex a model is (i.e., the more inputs it has and the fewer linear relationships between its variables), the more resistant it will be to attack than a simpler model with fewer features and linear relationships. However, the more complex the model, the more of a "black box" it becomes, creating a hard trade-off between human interpretability and security.
As with advances in any area of technology, new risks and opportunities for exploitation are created. Machine learning adversarial attacks are especially a cause for concern given the potential for widespread use of high-risk classification in fields such as medicine and transportation.
Although these types of attacks have been proven by researchers, they have not yet gained widespread mainstream awareness, and the particular systems they target have not been widely deployed. Enterprises should use this time wisely to create risk mitigation strategies before major attacks are perpetrated.