Cyber-Physical Systems (CPS) and Internet of Things (IoT) devices such as sensors, wearable devices, robots, drones, and autonomous vehicles support battlefield services ranging from Intelligence, Surveillance and Reconnaissance to Command and Control. However, the extensive use of information and communication technologies in such systems makes them vulnerable to cyber-attacks in the battlefield. These IoT devices are most often designed without considering security. Unprotected IoT devices can be used as “stepping stones” by attackers to launch more sophisticated attacks such as advanced persistent threats (APTs). An APT is a cyber-attack in which a malicious adversary gains access to a network and remains undetected for a long period of time. A later stage of an APT is the “lateral movement” stage, where attackers use benign computer features to move stealthily, step by step, deeper into the network. For instance, it has been reported that Samsung’s smart fridge could be used to steal a user’s Gmail login. One can imagine several additional steps, such as sending phishing emails to a friend or coworker, followed by privilege escalation. These challenges, together with the high risk and consequences of IoT attacks in the battlefield, drive the need to accelerate basic research on IoT security.
We are investigating proactive defense of IoT networks, including cyber deception, cyber resilience, and cyber agility, the last of which is also called Moving Target Defense (MTD). We consider the most intelligent adversaries, those able to launch sophisticated attacks (e.g., APTs). We also look into the scientific foundation of cyber security. Theoretical constructs and mathematical abstractions provide a rigorous scientific basis for cyber security because they allow for reasoning quantitatively about cyber-attacks. In particular, game theory provides a rich mathematical tool to analyze conflicts within strategic interactions and thereby gain a deeper understanding of cyber security issues. By definition, game theory is “the study of mathematical models of conflict and cooperation between intelligent rational decision-makers”. The level of sophistication of recent cyber-attacks justifies our assumption of attackers’ rationality and thus the need for an intelligent defense mechanism based on game theory.
Advanced Persistent Threat
The cyber kill chain in Figure 1 shows the stages of an APT (red) as well as the defender’s best response at each stage (blue). At the reconnaissance phase, the attacker scans the system to identify potential vulnerabilities, understand the network topology, and find critical targets. This is followed by the exploitation of a vulnerability to gain command and control of a node. From that node, the attacker proceeds to a privilege escalation to gain elevated access that will enable lateral movement to reach a critical target. A proactive defense mechanism includes all schemes the defender can implement to protect the network before a cyber-attack is launched or early in the reconnaissance phase. Intrusion prevention systems (IPS), including firewalls and anti-virus, are designed to protect networks against cyber-attack attempts. However, these cyber systems tend to have inadequate prediction performance and misidentify malicious network traffic (e.g., malware, botnet) as benign; such misclassified packets are called false negatives. In addition to IPS, research shows that statistical learning techniques can accurately forecast or predict the timing and frequency of cyber-attacks, based on network and organizational observations (e.g., domain name system traffic, network security policy). When proactive defense fails, the defender tries to detect the intruder, deny or disrupt malicious action, or at least contain the attack. In the worst case of a successful attack, the defender should be prepared to quickly recover. The use of IoT devices in the battlefield increases the attack surface that our adversary can exploit. A game-theoretic approach is suitable for all stages of an APT, from proactive cyber defense, to fighting through an attack in progress, to surviving and recovering from a successful attack. Our prior work uses a stochastic game approach to contain a CPS attack in the lateral movement phase.
Proactive Cyber Defense
There are several challenges associated with IoT security compared to securing traditional information technology (IT) systems. First, IoT devices are rapidly mass produced to be low-cost commodity items without security protection in their original design. Even if a device initially has some security features, many IoT manufacturers do not provide any security updates, and thus IoT devices can become insecure as hackers discover new vulnerabilities. Second, IoT devices are highly dynamic, mobile, and heterogeneous, and they lack common standards. Additionally, they have limited battery capacity, memory, and processing power and cannot integrate standard encryption algorithms and security protocols. Third, it is imperative to understand the natural world, the physical process(es) under IoT control, and how these real-world processes can be compromised before recommending any relevant security countermeasure. When faced with these challenges to IoT security, a proactive approach is better suited to the defense of IoT assets. A proactive IoT defense allows us to plan in advance, analyze all cyber threats, and gain a precise understanding of potential vulnerabilities before a cyber-attack is launched. Cyber deception, cyber agility (also referred to as Moving Target Defense (MTD) in the literature), and cyber resilience are the main components of a proactive cyber defense. Those components can be used separately or in conjunction to protect IoT.
Cyber deception is any attempt to disguise a network and impair the attacker’s decision with false information to protect critical nodes. Deception can delay a cyber-attack by increasing uncertainty. Deception also forces the attacker to perform more trial and error in the reconnaissance phase, which increases the probability of intruder detection. The use of honeypots is a basic form of cyber deception used to create the appearance of important targets to the attacker. Honeypots also help to identify attackers and provide a means to learn about their behaviors in a safe environment. The attacker’s strategies learned via the use of honeypots aid in securing critical components. A honeynet is a decoy network that contains one or more honeypots. Valuable deception techniques must confuse the attacker while being transparent to the defender and legitimate users. Advanced deception techniques can dynamically hide or create fake vulnerabilities, data, protocols, communication links, software and applications. However, given enough time, an attacker may be able to discover the defender’s deception strategy. Therefore, a sophisticated cyber deception technique is most often combined with cyber agility.
Cyber agility is the dynamic reconfiguration of network parameters, components, topology, and protocols to oppose an attacker’s ability to collect information about the system. A static configuration gives attackers enough time to learn about the system and identify potential vulnerabilities or exploits in the reconnaissance phase. An agility strategy randomly changes the network pattern faster than an attacker can learn it. The Army Research Laboratory’s Cyber Security Collaborative Research Alliance is currently investigating game-theoretic approaches to cyber agility.
Cyber resilience refers to the network’s capability to continuously maintain mission-essential functions after a cyber-attack. Resiliency must be an important consideration in IoT design for a number of reasons. First, the military uses commercial off-the-shelf (COTS) IoT devices available to the general public. Second, IoT devices interconnect with commercial networks not owned by the military. Third, most IoT devices are designed without concern for security and thus contain many vulnerabilities that can be exploited as weak links to gain access to more important targets. Fourth, it is beyond the capability of a developer or a network administrator to predict all natural failures and malicious attacks because of the increased interconnection, interdependency, and complexity of IoT networks. Those facts dictate our pessimistic view that some attacks may be successful regardless of efforts to maintain best practices in the areas of deception and agility. We should proactively design IoT networks while considering remediation against the worst-case scenario, that of a successful attack. Resilient mechanisms sometimes involve system replication, to add redundancy and avoid a single point of failure. Furthermore, the replicas can be diversified to counter the attacker’s ability to exploit the same vulnerability in all replicas.
Game Theory for Advanced Persistent Threat
A game in normal form is given by a set of players, the set of strategies available to each player, and a payoff function that assigns a payoff to each player for any combination of strategies chosen by the players. Game theory is suited for proactive cyber defense because of its predictive power. The solution to a cyber security game is its Nash equilibrium (or one of its refinements). At a Nash equilibrium profile, no player can increase his payoff by a unilateral deviation. Equivalently, each player is playing his best response to the other players’ strategies. Therefore, the defender can use the Nash equilibrium profile to predict the attacker’s best action. The predictive power of game theory, combined with cyber deception, cyber agility, and cyber resilience, can form the basis of a robust framework for proactive cyber defense.
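The equilibrium condition can be illustrated with a minimal sketch that enumerates the pure-strategy Nash equilibria of a two-player normal-form game. The action labels and payoff numbers below are hypothetical and chosen only for illustration; real cyber security games often have equilibria only in mixed strategies, which this sketch does not compute.

```python
from itertools import product

def pure_nash_equilibria(defender_payoff, attacker_payoff):
    """Return all (defender_action, attacker_action) index pairs from which
    neither player can gain by deviating unilaterally."""
    n_def = len(defender_payoff)
    n_att = len(defender_payoff[0])
    equilibria = []
    for d, a in product(range(n_def), range(n_att)):
        # Defender cannot improve by switching rows while the attacker keeps column a.
        defender_ok = all(
            defender_payoff[d][a] >= defender_payoff[alt][a] for alt in range(n_def)
        )
        # Attacker cannot improve by switching columns while the defender keeps row d.
        attacker_ok = all(
            attacker_payoff[d][a] >= attacker_payoff[d][alt] for alt in range(n_att)
        )
        if defender_ok and attacker_ok:
            equilibria.append((d, a))
    return equilibria

# Hypothetical payoffs (illustrative values only).
# Rows: defender plays 0 = static configuration, 1 = deploy honeypot.
# Columns: attacker plays 0 = aggressive scan, 1 = stealthy scan.
defender_payoff = [[-4, -3],
                   [1, -1]]
attacker_payoff = [[4, 3],
                   [-2, 1]]

print(pure_nash_equilibria(defender_payoff, attacker_payoff))
```

With these made-up numbers the only pure equilibrium is the profile in which the defender deploys the honeypot and the attacker scans stealthily, which is the kind of prediction about the attacker's best action that the paragraph above describes.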
Each player in a game attempts to maximize his payoff based on his information and his belief about other players’ information. If the set of players, strategies, and payoff functions is common knowledge, we have a game of complete information. Otherwise, we have an incomplete information game. Therefore, cyber deception and agility, which interfere with the attacker’s ability to gain accurate information, produce a game of incomplete information with the potential to diminish the attacker’s payoff. However, one must also carefully consider skillful attackers able to deceive the defender. A skillful attacker can behave as if the defender’s deception is effective in order to misguide the defender into revealing his mode of operation. A useful game model must consider several other possibilities that relate to incomplete information. A skillful attacker may develop unknown exploits (e.g., zero-day vulnerabilities), hide his true intent (i.e., target, payoff), or operate undetected for a long time (this is the intent of an APT).
Recently, there has been increased interest in the literature in applying game theory to cyber deception, agility, resilience, intrusion detection, lateral movement, and APT. Cuong et al. provide a detailed survey of these game-theoretic applications to cybersecurity. However, those works are restricted to a single stage of the kill chain and do not consider specific constraints of military IoT. We are currently investigating end-to-end defense mechanisms that can deal with a cyber-attack at multiple stages. The goal is to design mission-aware IoT with an autonomous cyber response capability.
We present a high-level description of our current approach to building an autonomous response engine for IoT security with deception capability, learning for detection, and containment of lateral movement. Figure 3 shows the diagram of the engine. From the configuration files of hosts (e.g., computers, operating systems, applications, firewalls, servers, routers), the engine can compute the topology of the IoT network and generate the attack-graph. Two nodes V1 and V2 are connected in the attack-graph if there is a port, a protocol, and a vulnerable application on V2 that can be exploited to compromise V2 from V1. The engine incorporates a scanning tool capable of discovering new vulnerabilities from public vulnerability databases such as the National Vulnerability Database (NVD). Once a new vulnerability is detected, the attack-graph is updated by adding new edges to the graph. We use the Common Vulnerability Scoring System (CVSS) to compute a relevant assessment of how the attacker can access a vulnerability, how complex it is to exploit the vulnerability, and the number of times one must authenticate (if any) in order to exploit the vulnerability. If a new patch is released from the NVD, then the system will automatically apply the patch and update the attack-graph by removing all the edges corresponding to the patched vulnerability.
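The edge rule and the patch update described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the engine's actual implementation: the `Exposure` record, the `reachable` tuple format, the function names, and the vulnerability identifiers are all hypothetical placeholders.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Exposure:
    """A vulnerable application listening on a host (hypothetical record format)."""
    host: str
    port: int
    protocol: str
    vuln_id: str  # placeholder identifier, not a real NVD entry

def build_attack_graph(reachable, exposures):
    """Build the attack-graph as a dict mapping (src, dst) to a set of vulnerability ids.

    reachable: set of (src, dst, port, protocol) tuples permitted by the network
    configuration. An edge src -> dst exists only if dst exposes a vulnerable
    application on a port and protocol that src is allowed to reach.
    """
    graph = {}
    for exp in exposures:
        for src, dst, port, protocol in reachable:
            if dst == exp.host and port == exp.port and protocol == exp.protocol:
                graph.setdefault((src, dst), set()).add(exp.vuln_id)
    return graph

def apply_patch(graph, vuln_id):
    """Return a new graph without the patched vulnerability; edges left empty are dropped."""
    patched = {}
    for edge, vulns in graph.items():
        remaining = vulns - {vuln_id}
        if remaining:
            patched[edge] = remaining
    return patched

# Hypothetical three-node network.
reachable = {("V1", "V2", 22, "tcp"), ("V2", "V3", 502, "tcp"), ("V1", "V3", 80, "tcp")}
exposures = [Exposure("V2", 22, "tcp", "VULN-A"), Exposure("V3", 502, "tcp", "VULN-B")]

graph = build_attack_graph(reachable, exposures)

# A newly discovered vulnerability adds the edge V1 -> V3.
updated = build_attack_graph(reachable, exposures + [Exposure("V3", 80, "tcp", "VULN-C")])

# Patching VULN-A removes the edge V1 -> V2.
patched = apply_patch(updated, "VULN-A")
```

Rebuilding the graph from the current list of exposures keeps the sketch short; an engine that handles frequent updates would more plausibly add and remove edges incrementally.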
Before an attack is detected, a dynamic cyber deception mechanism is implemented to mislead the attacker and minimize the attacker’s impact on the IoT network. An adversarial machine learning approach that is robust to intelligent manipulation is implemented to detect the following characteristics of the attacker: payoff, motivation, skill, and potential zero-day vulnerabilities not in the NVD. The learning algorithm has to converge quickly to be compatible with fast changes in network topologies.
When an attacker is detected, a two-player stochastic game representing the interactions between the attacker and the defender is initiated. In this game, the states represent the nodes of the attack-graph and transitions correspond to the edge-vulnerabilities that the attacker can exploit to move laterally. The solution of the game will give the attacker’s optimal policy with deception.
Given the attacker’s optimum policy, the defender’s best response is calculated with accurate information. The best response at any state of the game will allow the system to quickly recover to a secure state. The system uses the optimum policy to automatically disconnect or self-reconfigure vulnerable services and thus slow down the progression of the attack at any node of the system. Finally, continuous learning and scanning of vulnerabilities allow the system to adapt to new threats.
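One way to picture the best-response step is to fix the attacker's policy and run value iteration on the resulting single-agent decision problem for the defender. The sketch below is a simplified stand-in and not the authors' game model. It assumes that the defender may disable at most one outgoing edge per node at a fixed cost, that a blocked attacker stays in place and retries, and that the defender's cost is discounted by `gamma`. The node names, losses, costs, and attacker probabilities are hypothetical.

```python
def defender_best_response(attacker_policy, loss, block_cost, gamma=0.9, sweeps=300):
    """Value iteration for the defender against a fixed attacker policy.

    attacker_policy: dict node -> {next_node: probability} (the attacker's lateral moves).
    loss: dict node -> defender loss incurred when the attacker enters that node.
    Returns (block, value): block[node] is the destination whose edge the defender
    disables at that node (None means no edge is disabled); value[node] is the
    defender's expected discounted cost when the attacker is at that node.
    """
    nodes = set(attacker_policy)
    for moves in attacker_policy.values():
        nodes.update(moves)
    value = {node: 0.0 for node in nodes}  # nodes with no outgoing moves keep value 0
    block = {}
    for _ in range(sweeps):
        for node, moves in attacker_policy.items():
            best_cost, best_block = None, None
            for candidate in [None, *moves]:
                cost = 0.0 if candidate is None else block_cost
                for nxt, prob in moves.items():
                    if nxt == candidate:
                        # Blocked move: the attacker stays at the current node.
                        cost += prob * gamma * value[node]
                    else:
                        cost += prob * (loss[nxt] + gamma * value[nxt])
                if best_cost is None or cost < best_cost:
                    best_cost, best_block = cost, candidate
            value[node], block[node] = best_cost, best_block
    return block, value

# Hypothetical attack-graph and attacker policy.
attacker_policy = {
    "entry": {"sensor": 0.5, "gateway": 0.5},
    "sensor": {"gateway": 1.0},
    "gateway": {"critical": 1.0},
}
loss = {"sensor": 1.0, "gateway": 2.0, "critical": 10.0}

block, value = defender_best_response(attacker_policy, loss, block_cost=0.5)
```

In this toy instance the defender disables the edge toward the gateway at both upstream nodes and the edge toward the critical node at the gateway, because each block is cheap relative to the loss it prevents. Raising `block_cost` eventually makes leaving some edges open the cheaper choice.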
Challenges and Future Work
Modeling IoT presents several challenges that will be addressed in our future work. First, IoT devices may be autonomous and may not have global knowledge of the network. Also, directives sent from a central command to IoT devices may be delayed or lost. Therefore, a distributed security mechanism is more appropriate for IoT than the traditional attacker-versus-defender model.
Moreover, an IoT network may be subject to several simultaneous attacks from different points of the network, and at different stages of the kill chain. Those attackers may be acting independently or in collusion. The case of colluding attackers is particularly challenging.
Monitoring is another key consideration. There are scenarios where a player cannot observe the other player’s actions directly but can only observe an imperfect, noisy signal correlated with those actions. For instance, the defender may not know exactly which edge-vulnerability the attacker exploited last, or may only be able to infer the attacker’s new position in the attack-graph.
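Inferring the attacker's position from a noisy signal can be sketched as one step of a Bayes filter over the nodes of the attack-graph. This is an illustrative assumption about how such inference might be done, not a description of a deployed monitor. The movement model, the signal names, and all probabilities below are hypothetical.

```python
def update_belief(prior, transition, likelihood, signal):
    """One predict-and-correct step of a Bayes filter over the attacker's position.

    prior: dict node -> probability that the attacker is currently at that node.
    transition: dict node -> {next_node: probability} (assumed attacker movement model).
    likelihood: dict node -> {signal: probability of observing that signal when
                the attacker is at that node}.
    """
    # Predict: push the prior through the attacker's movement model.
    predicted = {node: 0.0 for node in prior}
    for src, p_src in prior.items():
        for dst, p_move in transition[src].items():
            predicted[dst] += p_src * p_move
    # Correct: weight each node by how likely the observed signal is at that node.
    weighted = {
        node: predicted[node] * likelihood[node].get(signal, 0.0) for node in predicted
    }
    total = sum(weighted.values())
    if total == 0.0:
        raise ValueError("observed signal has zero probability under the model")
    return {node: w / total for node, w in weighted.items()}

# Hypothetical example: the attacker is known to be at V1 before moving.
prior = {"V1": 1.0, "V2": 0.0, "V3": 0.0}
transition = {
    "V1": {"V1": 0.2, "V2": 0.5, "V3": 0.3},
    "V2": {"V2": 1.0},
    "V3": {"V3": 1.0},
}
likelihood = {
    "V1": {"alert_V2": 0.1, "quiet": 0.9},
    "V2": {"alert_V2": 0.8, "quiet": 0.2},
    "V3": {"alert_V2": 0.2, "quiet": 0.8},
}

posterior = update_belief(prior, transition, likelihood, "alert_V2")
```

After the noisy alert, most of the probability mass shifts to V2, but V1 and V3 keep non-zero probability, so the defender acts on a belief over positions and not on a known state.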
Furthermore, IoT devices may have a short time to process a large amount of information in a complex environment with finite memory and limited computational power. This limits the rationality of IoT nodes, which results in incorrect decisions that deviate from rational equilibrium behavior. Prior work has used evolutionary game theory and prospect theory to account for limited rationality.
Machine learning entails improvement of a computer’s performance on a given task with experience. Machine learning algorithms and approaches are also important to our proposed framework for proactive cyber defense. For example, Lee et al. deploy honeypots and, using 60 different classifiers (supervised learning algorithms), accurately identify social spammers on Twitter and MySpace. Furthermore, evolution-based algorithms that combine machine learning and genetic algorithms can advance cyber agility by periodically changing the system’s configuration and attack surface. In addition, a key aspect of proactive cyber defense and cyber resilience is cyber-risk quantification, a process that involves predicting the number of successful cyber-attacks. Moreover, each of these components of proactive cyber defense requires robust intrusion detection systems (IDS) that are behavior-based or anomaly-based, so that they can detect zero-day cyber-attacks, rather than the classical signature-based detection models on which many IDS rely exclusively. For example, Alazab et al. demonstrated that obfuscated malware can be effectively detected using support vector machines, a type of supervised learning algorithm. However, there is a need to fully understand the limitations and vulnerabilities of machine learning algorithms. The potential manipulation of those algorithms by an intelligent adversary introduces new threats that need to be investigated. In fact, many IoT devices rely on algorithms based on artificial intelligence and machine learning to operate. Future battlefields will have IoT devices (e.g., robots, drones) from opposing armies. Those IoT devices may have other IoT entities as adversaries. An easy way to win a battle will be to manipulate the algorithms of the opposing IoT devices. The new and fertile field of adversarial machine learning, at the intersection of game theory and machine learning, is promising.