Hybrid network emulation (HNE) comprises discrete-event simulated links/networks and virtual machines (VMs)/containers that send and receive traffic through such links or networks (e.g., Figure 1). It allows testing network applications, rather than their models, on simulated target networks, particularly the mobile wireless networks commonly used in lower echelon tactical intranets. In some HNE approaches, applications can run on top of their native operating systems (OSs) without any code modification, so the same executable binary can be used in both HNE and real networks.
HNE addresses both feasibility and scalability concerns of testing applications over target networks. With respect to feasibility, as testing requires only the models of network elements, the availability of network element hardware (e.g., expensive tactical radios or next generation waveforms) is not an issue, and simulation enables testing over various network topologies and configurations, with terrain, mobility, and electronic warfare (EW) conditions that would be prohibitively expensive, perhaps even dangerous, to create in field tests. When both an actual implementation and a simulation model are available, HNE allows the use of either or both in the same experiment. When some physical devices are available, they can be plugged into the HNE networks that can be run in real time (typically at Layer 3). HNE also allows, in principle, the mixing and matching of link and device models developed for multiple simulators to be used in the same experiment.
With respect to scalability, theoretically the scale of the target network is constrained only by the capabilities of discrete event simulators and hardware resource availability. However, major discrete event network simulators with significant model libraries and active user communities at the time of writing use conservative scheduling and provide limited support for parallel execution (essentially, only for highly medium-independent network partitions connected over wired links). In this commonly used approach, an indivisible spectrum-sharing wireless network is modeled in a single-threaded simulator process, and may execute slower than real time beyond a certain combination of network size, model complexity, and traffic load. If not addressed, this becomes an issue for the VMs/containers, where emulation by default advances VM OS time at the real-time rate; the ensuing mismatch can easily invalidate the experimental results, as protocols and applications are faced with communication links/networks that are much slower than intended. In this paper, we describe our HNE implementation in a Cybersecurity Virtual Assured Testbed (CyberVAN), designed and developed by Perspecta Labs (PL) since 2008 with U.S. Army and OSD funding, and currently used for validating the U.S. Army CCDC Army Research Laboratory’s (ARL) Cyber Security Collaborative Research Alliance (CRA) research. CyberVAN has also been used internally at Perspecta Labs to support several recent and current DARPA and C5ISR programs.
CyberVAN enables creation of high-fidelity enterprise and tactical network scenarios by constructing a mix of physical machines/devices, virtual machines, physical networks and simulated networks, with active support for ns-3, QualNet, and EMANE, which automatically makes available all the models developed for these simulators. CyberVAN provides scenario creation, deployment, and run-time control from GUIs or command line. CyberVAN includes special features for supporting large-scale, high-fidelity network experimentation, in particular addressing the time advancement problem in HNE, and provides utilities that facilitate the experiment process, including mobility generation, visualization and data collection.
Cyber and EW effects in tactical networks present a unique space that is very difficult to model with real networks. The HNE’s mix-and-match approach allows CRA researchers to rapidly and cheaply create realistic experiments, where the software under attack, the attacks themselves, and the defensive mechanisms can all be developed and tested with the QoS and security assumptions correct for the tactical universe.
We demonstrate the use of HNE in CyberVAN on a platoon-level tactical internet scenario, developed for the CRA, with a situational awareness application (a) evaluated for basic performance (message delivery ratio, latency), (b) attacked in the cyber domain, including the network control plane (unicast routing) and information plane (location falsification), and (c) attacked in the EW domain with a UAV-mounted jammer. Given the open nature of CRA research, we have used only freely available models, OSs, libraries, and applications in the scenario, but we expect readers familiar with the sensitive tactical technologies used for similar purposes to be able to readily translate the scenario to the tactical reality.
The rest of the paper is structured as follows. We briefly describe how hybrid network emulation is implemented in CyberVAN. Next, we present the tactical scenario used in all the experiments. Then, we estimate the distortion introduced by HNE into the application performance metrics of importance to the scenario. Lastly, we discuss two cyber attacks and an EW attack.
Hybrid Network Emulation in CyberVAN
CyberVAN’s HNE consists of three seamlessly integrated enabling technologies: software-in-the-loop network simulation, transparent packet forwarding, and host virtualization. Software-in-the-loop (SITL) network simulation allows real network traffic to be forwarded through simulated links, paths or networks; transparent packet forwarding ferries IPv4/IPv6 packets generated in virtual machines and physical devices to and from the network simulator; and host virtualization enables running real applications, libraries, and OSs in virtual machines/containers. As shown in Figure 1, the current implementation uses: (a) custom-built SITL modules (co-simulators) for each supported simulator type, (b) an Open vSwitch (OVS)-based transparent forwarding fabric, with VxLAN Layer 2 tunneling, and (c) QEMU/KVM-based host virtualization.
An IP packet from the sender application on VM A destined to the receiver application on VM B passes through the network device driver and device emulation on VM A, emerges on the virtual interface on the compute server hosting VM A, is encapsulated into a VLAN-tagged frame by the OVS logic, and is sent to a simulation server in a VxLAN tunnel, via a jumbo frame-capable switch (software or hardware). At the simulation server, the packet is extracted from the VLAN-tagged frame, and the VLAN identifier is used to determine the simulated node and network interface on which the packet is injected into the simulated IP stack. Two modes of injection are supported: pre-routing and post-routing. In the former, the packet is subject to the simulated Layer 3 forwarding logic installed in the particular IP stack. In the latter, the simulated Layer 3 forwarding logic is bypassed and the packet is injected straight into the IP interface. Regardless of the injection mode, the simulation logic then determines whether, when and where the packet may emerge from simulation to be delivered to VM B. If the packet emerges, it is again encapsulated into a VLAN-tagged frame and sent towards the compute server hosting VM B. On the compute server, it is de-encapsulated by the OVS logic and injected into the virtual interface corresponding to VM B. It then emerges inside VM B, and is received by the receiver application. The underlying physical network connecting the compute and simulation servers is currently GbE-based.
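The VLAN-based dispatch step at the simulation server can be pictured with a small lookup table. The following is only a sketch under stated assumptions: the names `InjectionPoint`, `VLAN_MAP`, and `resolve_injection_point`, as well as the VLAN numbers and node names, are hypothetical and are not taken from CyberVAN.

```python
from dataclasses import dataclass
from enum import Enum


class InjectionMode(Enum):
    PRE_ROUTING = "pre-routing"    # packet goes through the simulated Layer 3 logic
    POST_ROUTING = "post-routing"  # packet bypasses it and enters the IP interface


@dataclass(frozen=True)
class InjectionPoint:
    node: str            # simulated node name (hypothetical)
    interface: int       # index of the simulated network interface
    mode: InjectionMode  # configured per network interface


# Hypothetical binding of VLAN identifiers to simulated interfaces.
VLAN_MAP = {
    101: InjectionPoint("node-A", 0, InjectionMode.PRE_ROUTING),
    102: InjectionPoint("node-B", 0, InjectionMode.POST_ROUTING),
}


def resolve_injection_point(vlan_id, vlan_map=VLAN_MAP):
    """Return where a frame tagged with `vlan_id` should be injected."""
    try:
        return vlan_map[vlan_id]
    except KeyError:
        raise ValueError(f"no simulated interface bound to VLAN {vlan_id}") from None
```

In this sketch the VLAN identifier alone selects both the simulated interface and its injection mode, which mirrors the per-interface configurability described in the text.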
The post-routing injection mode is useful when Layer 3 decision-making (e.g., forwarding, access control, deep packet inspection) is implemented in real software running inside a VM. While real software can always run in VMs, post-routing injection is not always the best choice as it comes at a price. With pre-routing injection, a packet that traverses multiple simulated links between the sender and the receiver has to be ferried once between the compute server of the sender and the simulator and once between the simulator and the compute server of the receiver. With post-routing injection on every link in the same example, however, the packet will have to be ferried between the simulator and a compute server as many times as there are hops on the path between the sender and receiver nodes. In large scenarios, this can dramatically increase the network load on the transparent forwarding fabric. Pre-/post-routing injection is configurable per network interface.
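The cost difference between the two modes can be made concrete with a simple count. The sketch below reflects one interpretation of the paragraph above: a "pass" is one trip into the simulator and back out to a compute server, and each pass costs two one-way crossings of the forwarding fabric. The function name and the return format are illustrative assumptions.

```python
def fabric_load(hops, post_routing):
    """Estimate per-packet forwarding-fabric load on a path of `hops` simulated links.

    Returns (simulator_passes, one_way_crossings). Assumes that, in post-routing
    mode, every link on the path uses post-routing injection.
    """
    if hops < 1:
        raise ValueError("a path needs at least one hop")
    # Pre-routing: the packet enters the simulator once and leaves once,
    # no matter how many simulated links it traverses inside.
    # Post-routing: each hop ends in a VM, so the packet re-enters the
    # simulator for every link on the path.
    passes = hops if post_routing else 1
    return passes, 2 * passes


for hops in (1, 3, 6):
    print(hops, fabric_load(hops, post_routing=False), fabric_load(hops, post_routing=True))
```

Under these assumptions the fabric load grows linearly with path length in post-routing mode and stays constant in pre-routing mode.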
The HNE packet forwarding story would not be complete without the Address Resolution Protocol (ARP, for IPv4), Neighbor Discovery (IPv6) and ICMP/ICMP6, which involve interactions of VM-based and simulation-based protocol endpoints. CyberVAN supports seamless interaction of real and simulated implementations of the above protocols, for example, handling of ARP requests from VMs in simulation and ICMP-based traceroute through simulated nodes with pre-routing injection. CyberVAN’s HNE also includes support for end device emulation features unrelated to packet forwarding, including National Marine Electronics Association (NMEA)-compliant GPS receiver data, and simulated battery state/consumption.
The simulated/emulated time advancement problem, inherent in HNE, is addressed by driving the emulated clock hardware with the dynamic rate of advancement of the simulator clock, sampled at small real-time intervals. This is a practical solution for most experimental work at or above Layer 3, with the software under test (OSs, libraries, applications) running in VMs/containers. Figure 2 illustrates the concept as currently implemented: the horizontal axis represents the progression of real time, the vertical axis represents the progression of simulation time and VM time.
Simulation time for a scenario is sampled at the simulation server at regular real-time intervals (its advancement is shown in blue); the sampled values are multicast to all QEMU emulators running the VMs in the scenario. Based on the real time passed (as measured by each emulator independently) and simulation time passed since last update, each emulator computes the dilation factor (DF) used to determine the rate of advancement of VM time for the next sampling interval. If the DF is too high or too low based on the actual data, the error is corrected for the next interval. Note that the simulator has been configured not to run faster than real time in this example. The rate of clock advancement is enforced via the QEMU-emulated HPET chip, and the VM OSs are configured to use HPET as the only clock source. This approach works for all major OSs, including Linux, Windows, Android, OSX, BSDs, and Cisco IOSv.
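The per-emulator update can be sketched as a small control loop. This is an illustration only, not the CyberVAN implementation: the text does not give the exact DF formula, so the sketch assumes the rate is expressed in simulated seconds per real second and that the accumulated VM clock error is spread over the next sampling interval. The class and method names are hypothetical.

```python
class VmClock:
    """Toy model of one emulator tracking the simulator clock."""

    def __init__(self, interval):
        self.interval = interval  # nominal real-time sampling interval, seconds
        self.vm_time = 0.0        # current VM time, seconds
        self.last_sim = 0.0       # simulation time in the previous sample
        self.rate = 1.0           # VM seconds advanced per real second

    def on_sample(self, sim_time, real_elapsed):
        """Process one multicast sample and choose the rate for the next interval."""
        # The VM clock advanced at the previously chosen rate during the
        # real time measured locally since the last sample.
        self.vm_time += self.rate * real_elapsed
        # Observed simulator speed over the interval that just ended.
        observed = (sim_time - self.last_sim) / real_elapsed
        # Positive error: VM time lags simulation time; negative: it ran ahead.
        error = sim_time - self.vm_time
        # Predict the same simulator speed and absorb the error in one interval.
        # The rate is clamped at zero so that VM time never runs backward.
        self.rate = max(0.0, observed + error / self.interval)
        self.last_sim = sim_time
        return self.rate


clock = VmClock(interval=0.1)
# Simulator running at half of real-time speed.
for sim_time in (0.05, 0.10, 0.15, 0.20):
    print(sim_time, clock.on_sample(sim_time, real_elapsed=0.1))
```

In this toy run the VM clock first stalls to absorb the time it ran ahead, then settles at half of the real-time rate.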
An Army-Relevant, Open Scenario: Platoon-Level Situational Awareness
In support of Cybersecurity CRA, we have recently developed a platoon-level lower tactical internet scenario that utilizes the civilian version of the Android Tactical Assault Kit (ATAK), an extensible situational awareness (SA) application initially developed by AFRL; the military version of ATAK is in active use in the U.S. Army. Figure 3 shows a screenshot of the main ATAK SA panel with locations of the platoon members on the terrain, and a peer-to-peer messaging panel. Note the GPS data in the lower-right corner of the SA panel. The core ATAK functionality, present in all versions, includes: (a) maintaining blue force SA, (b) posting incident/intelligence reports for the team and higher level commanders, and (c) supporting peer-to-peer and group chat.
In support of blue force tracking, the ATAK application is typically configured to send Position Location Information (PLI) reports (either periodically, or on significant movement) to other team members, via UDP and IPv4 multicast. For the remainder of this paper, we will be concerned with this PLI traffic, and have configured ATAK to send the reports every 3 seconds. Our notional mission involves search and recovery of a small object lost in the vicinity of the Puu Wanawana crater in Kauai, Hawaii. This area (Figure 4) is interesting because it involves complicated terrain, with implications for radio propagation, and high-resolution terrain data are openly available for it from USGS.
For this scenario, we use ns-3-simulated 802.11n radios with omnidirectional antennas at the 2.412 GHz frequency, in ad hoc mode, with broadcast/multicast rate fixed at 1 Mbps. The transmission power, antenna gain and sensitivity parameters have been adjusted to enable multi-hop topologies on the scenario terrain, within a distance of a few hundred meters. We use a terrain-aware propagation loss model that combines two existing models, and a modified reference point group mobility model, extended to allow (a) explicit group membership, (b) using an actual node as a reference point, and (c) terrain-following 3D mobility. In the scenario, 24 nodes move in groups of two across the terrain at low human speeds, with pause times of up to a few minutes. The scenario duration is 20 minutes.
Each node runs ATAK on a QEMU-emulated Android/x86 7.1 tablet. Layer 3 forwarding is implemented with ns-3-simulated OLSRv1 and SMF; pre-routing injection is used.
Two critical metrics for network performance are latency and packet delivery ratio (PDR). To evaluate the differences between the purely simulated and hybrid emulated networks, we analyzed both cases and compared the values of these metrics across 10 runs to ensure that the results are repeatable and not due to random chance.
Both the ATAK application and the ns-3 simulation set the ID field of the IP packets, so we can identify the unique send/receive pairs for all nodes. The ATAK application emits PLI information every 3 seconds, and the SMF protocol floods these PLI updates across the entire network. Because traffic is being flooded, each emitted PLI is received in duplicate at each node, with the number of copies proportional to the number of that node’s 1-hop neighbors.
Since the nodes are in a mobile ad hoc network, we measured the one-way latency for each pair of nodes across the entire network. Because packets are received in duplicate, we compute the difference between the emission time and the first reception time as the one-way transit time. Because the network is entirely simulated, there is no need to synchronize the endpoints: the flow of time is strictly enforced by the network simulator.
To establish a baseline for comparison, we ran a purely simulated version of the scenario described in section III. To replicate the traffic pattern of the actual application, we used the OnOff traffic generator to generate UDP packets of the right size, destined for the same multicast group and port, every 3 s, with small, normally distributed jitter.
In Figure 5 we show 10 cumulative distribution functions (CDFs) for the one-way latency. The median transit time was 0.0029 seconds across all 10 runs. The variation in the CDFs between runs is very small compared to the transit time across the network.
For comparison, we also performed 10 runs where traffic was generated from the ATAK applications and injected directly into the simulated network.
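The first-reception rule described above can be sketched as follows. The record layouts (tuples of source address, IP ID, receiver, and timestamp) are assumptions made for illustration, and the sketch ignores wraparound of the 16-bit IP ID field, which a longer run would need to handle.

```python
def one_way_latencies(sends, receptions):
    """Compute one-way transit times using the first copy received at each node.

    sends:      iterable of (src, ip_id, t_tx)
    receptions: iterable of (src, ip_id, receiver, t_rx), duplicates allowed
    Returns {(src, ip_id, receiver): latency_in_seconds}.
    """
    tx_time = {(src, ip_id): t_tx for src, ip_id, t_tx in sends}

    first_rx = {}
    for src, ip_id, receiver, t_rx in receptions:
        if (src, ip_id) not in tx_time:
            continue  # reception without a matching send record
        key = (src, ip_id, receiver)
        # Keep only the earliest reception of each flooded packet.
        if key not in first_rx or t_rx < first_rx[key]:
            first_rx[key] = t_rx

    return {key: t_rx - tx_time[key[:2]] for key, t_rx in first_rx.items()}
```

Because both timestamps come from the simulator clock, the subtraction needs no clock offset correction.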
The behavior of the network was consistent across all 10 runs as there is very little variation between the CDFs shown in Figure 6.
We plot the averaged CDF across all 10 runs for both cases to verify that the distributions are the same. Figure 7 demonstrates that there is hardly any impact on the in-simulation network latency when traffic is generated from the applications running on VMs in hybrid emulation.
Packet Delivery Ratio (PDR)
The packet delivery ratio is computed as the ratio of received packets to sent packets. Using the IP address and IP ID fields of the packet header, we identify the unique send and receive pairs. As with the latency measurement, we count only a single reception of each packet to avoid overcounting due to duplication. Figure 8 shows the CDF of the PDR for the simulation baseline. Because of the reliable flooding of the SMF protocol, the PDR is very high (0.9974).
In the case of hybrid emulation (Figure 9), there is again very little impact on the delivery rate. The CDFs for both cases are very tightly clustered, demonstrating that the experimental results are very repeatable. The usage of hybrid emulation does not introduce any variability to the distributions.
Looking again at the averaged CDFs side by side (Figure 10), we can see that the CDFs do not differ in any meaningful way.
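A minimal sketch of this computation is given below. The input record layouts and the choice to report one ratio per (sender, receiver) pair are assumptions made for illustration; the text does not specify how the per-pair values are aggregated.

```python
from collections import defaultdict


def packet_delivery_ratios(sends, receptions, nodes):
    """Compute the PDR per (sender, receiver) pair, counting each packet once.

    sends:      iterable of (src, ip_id)
    receptions: iterable of (src, ip_id, receiver), duplicates allowed
    nodes:      iterable of all node identifiers in the scenario
    """
    sent = defaultdict(set)
    for src, ip_id in sends:
        sent[src].add(ip_id)

    delivered = defaultdict(set)
    for src, ip_id, receiver in receptions:
        # A set absorbs duplicate copies created by flooding.
        if src in sent and ip_id in sent[src]:
            delivered[(src, receiver)].add(ip_id)

    nodes = list(nodes)
    return {
        (src, rcv): len(delivered.get((src, rcv), ())) / len(ids)
        for src, ids in sent.items()
        for rcv in nodes
        if rcv != src
    }
```

Pairs for which nothing was delivered appear with a ratio of 0.0 rather than being omitted, so they still contribute to a CDF built from the result.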
Transparent Forwarding Transit Time
Because the forwarding fabric used to move packets between the VMs and the simulator is a high speed wired Ethernet network, packet loss between the VMs and simulator is extremely rare (unmeasurable). Therefore the only potential impact might be on the transit time of a packet.
We have already measured the transit time through simulation (in section IV-A) and determined it to be approximately 0.003 s. This transit time reflects the network model’s delay profile based on channel conditions and mobility patterns. To measure the complete path, we need to account for the transit time between the VMs and the network simulator in both directions. In Figures 11 and 12 we show the CDFs of the transit times to and from the network simulator. In both cases the median transit time is approximately 0.0001 seconds, more than an order of magnitude smaller than the delays introduced by the simulated network conditions. While there is some variation between runs (due to existing network loads on the shared network forwarding fabric), even the worst-case delay is 0.0006 seconds, which is well below the delay introduced by the network simulation.
In this section, we describe two cyber attacks on the search and recovery mission, and their implementations in CyberVAN.
Black hole attack on Layer 3 unicast forwarding
Layer 3 unicast forwarding with Optimized Link State Routing Protocol (OLSR) can be implemented with either pre-routing or post-routing packet injection, as there exist both an Android-based implementation (Naval Research Laboratory OLSRv2/NHDP) and an OLSRv1 simulation model (in ns-3). For this example, we chose to implement it in the VMs, with post-routing injection. The VMs are configured with static ARP entries for all the nodes in the platoon, so packets can be injected without ARP requests. Also, for ease of presentation, we consider a single stationary snapshot of the dynamic topology created by the mobility model. The OLSRv2/NHDP-created topology before the black hole attack is shown in Figure 13.
The black hole attack consists of two distinct parts. The first part (“attraction”) is executed in the OLSRv2/NHDP control plane, by falsely claiming non-existing one-hop neighbors in the HELLO and TC messages. As the OLSRv2/NHDP protocols are built on implicit trust, actual neighbors and non-neighbors accept such claims at face value. Our attack is relatively stealthy as it only claims up to a configurable number of false neighbors at and beyond two hops in the actual topology. It is also adaptive as the set of falsely claimed one-hop neighbors is re-evaluated periodically to match topology changes.
The effect of the attraction part of the black hole attack executed on node 22 is shown in Figure 14. The rest of the network, after the short time that it takes to propagate fake link information, considers node 22 as having 10 extra links, shown in red. As a result, OLSRv2 routes, based on the shortest path computation, will force a considerable amount of traffic (e.g., from node 11 to node 23) to go through node 22. At this point, the attacker at node 22 can drop all (black hole) or some (grey hole) traffic that has been forced through it; this is the second (“data plane”) part of the black hole attack. The attack has been implemented as a direct modification of the NRL OLSRv2/NHDP source code, but could equally well be implemented in a packet mangling process external to the OLSRv2/NHDP daemon. It has been used in the evaluation of defensive technologies for link state routing protocols developed on .
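Why falsely advertised links attract traffic can be seen on a toy graph. The sketch below uses a hypothetical six-node topology that is not the one in Figures 13 and 14, and it assumes a plain hop-count metric; it only illustrates how a shortest-path computation reacts to one extra advertised link.

```python
from collections import deque


def shortest_path(links, src, dst):
    """Breadth-first shortest path (by hop count) over undirected links."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None  # destination unreachable


# Hypothetical topology: node 22 really neighbors only node 12.
real_links = [(11, 12), (12, 13), (13, 14), (14, 23), (12, 22)]
# Link that node 22 falsely advertises to a node several hops away.
fake_links = [(22, 23)]

print(shortest_path(real_links, 11, 23))               # [11, 12, 13, 14, 23]
print(shortest_path(real_links + fake_links, 11, 23))  # [11, 12, 22, 23]
```

Once the fake link is believed, the path from node 11 to node 23 appears one hop shorter through node 22, so traffic is steered to the attacker even though the advertised link cannot carry it.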
In the location-falsifying attack, the goal is to poison the network by flooding false PLI information. Under nominal conditions, each PLI message is flooded to the rest of the network (as depicted in Figure 15). This will occur even if a node is isolated from the rest of the network, as long as there is a one-hop neighbor within range that can forward packets on behalf of the sender (e.g., A13 forwards for A18). In the most naive case, the attacker can simply pick a victim and blindly modify the packet body of the PLI update message with a random location. In the absence of message signatures, these spoofed updates will cause the victim’s position marker to jump around the map as the clients receive both the real and fake PLI information. This approach, however, is easily detectable both in the client (seeing the victim’s position indicator jump around) and in the network (seeing duplicate packets with differing information).
A stealthier attacker may use the topology of the formed network to their advantage and carefully choose the modified position to be within a reasonable distance of the network. The attacker can determine when a node is isolated by examining the routes and positions of that node. In the case of A18, the attacker would note that A18’s position is physically far from the rest of the network and that the attacker has no routes to other nodes via A18. If the attacker floods false information into the network at this point, the rest of the network will only get the falsified packets. In Figure 16, node A13 is in such a position and delivers altered PLI updates on behalf of A18.
In the CyberVAN testbed, the ATAK application gets its GPS information from the network simulator (via the Android OS location service) and then uses these GPS coordinates to form its PLI message. We tested this attack by modifying the SMF client within the VM to alter the PLI packet bodies when the network conditions are appropriate for a stealthy attack.
Electronic Warfare Experiment
In this experiment, we demonstrate a simple but effective jammer deployed by a passing adversarial UAV on its reconnaissance mission. The UAV’s flight path, at a constant altitude of 500m, is shown in Figure 17. The jammer has a period of seconds and a duty cycle of 0.75.
The jammer is implemented with the microwave oven model available in the ns-3 Spectrum module. The jammer injects interference in free space, on all frequencies between 2.4 and 2.499 GHz, and kills a number of multicast PLI reports that have no redundancy in the 802.11 MAC layer. The jammer was operational for the entire 20 minutes of the scenario duration.
While the illustrated jammer is deliberately chosen to be as simple as possible, a sophisticated jammer modeling framework is under development for ns-3.
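The on/off behavior of a periodic jammer with a given duty cycle can be sketched as a simple predicate. This is a generic illustration and not the ns-3 microwave oven model: the function name is hypothetical, the period is left as a parameter, and the sketch assumes that the active phase starts at the beginning of each period.

```python
def jammer_is_on(t, period, duty_cycle=0.75):
    """Return True if a periodic jammer is transmitting at time `t` (seconds).

    Assumes the jammer is active during the first `duty_cycle` fraction of
    every period and silent for the remainder.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return (t % period) < duty_cycle * period
```

With a duty cycle of 0.75, a PLI report sent at a uniformly random phase overlaps the active phase roughly three times out of four, which is consistent with the text's observation that multicast reports without MAC-layer redundancy are easily lost.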
The effect of the jammer’s operation is shown in Figure 18 and Figure 19.
In this paper we presented CyberVAN, a hybrid network emulation testbed. CyberVAN uses network simulation and VMs running unmodified software to model a network at varying levels of fidelity. We described several aspects of the testbed that allow a user to tune the fidelity of the model for use cases ranging from network performance modeling to assessment of cyber threats in representative network topologies.
We have demonstrated that the CyberVAN HNE introduces minimal distortion to network performance measurements, for experiments with emulated components at Layer 3 and above. We have also demonstrated CyberVAN’s ability to model tactical networks, including replicating effects of terrain, mobility and EW. We showed the evaluation of cyber effects in these difficult-to-reproduce settings. The usage of real applications and OSs as part of the model enables testing of cyber effects that would be difficult to test in a purely simulated setting. CyberVAN enables repeatable testing in a virtual environment that is easy to manipulate and instrument. This can reduce the cost of testing and reasoning about cyber and EW effects in the networks of interest.
While the CyberVAN TimeSync solution enables theoretically unlimited scenario scalability with limited resources, it is impractical beyond a certain degree of slowdown, which may be possible when modeling large enterprise/tactical networks at full fidelity. We are currently investigating multiple HNE solutions to increase scenario scalability in addition to TimeSync, including (a) multi-threaded discrete event simulation with optimistic scheduling and reversible events, (b) distributed simulation, (c) modeling wired links with CyberVAN switching infrastructure (as opposed to network simulators), and combinations of the above.
While running real OSs and applications allows testing against a vast number of possible real cyber attacks, the current CyberVAN environment may be unsuitable for exploits that rely on some specific hardware features, exact instruction timings and precise configurations of real devices. When required, such needs can be addressed with the CyberVAN hardware-in-the-loop (HITL) capability, with the caveat that a scenario with real hardware requires execution in real time.
Hybrid network emulation has been our chosen modeling approach for over a decade because of its ability to balance fidelity and scale, and take advantage of existing models. We encourage the readers to apply the approach to their problems. CyberVAN software is Government Off-The-Shelf (GOTS) and available upon request from Perspecta Labs.