The Need for Analyst-in-the-Loop Cyber Detection
The nature of cyber attacks has changed significantly in recent years, as evidenced by the growing number of destructive Advanced Persistent Threats (APTs). Leading government agencies and corporations have experienced increasingly sophisticated attacks whose characteristics differ from those of more traditional cyber threats. APTs are typically well-sponsored, organized cyber campaigns with very specific and targeted objectives. One of the key objectives of an APT is to achieve a persistent foothold in a system for a long period of time by using zero-day exploits, careful propagation, and a small footprint. All of these measures aim to increase the likelihood of remaining undetected (stealth), especially when the defender relies on automated detection tools. Detecting APTs therefore requires extensive human involvement and effort in forensic analysis.
Throughout the detection workflow, the analyst should be able to use various decision support tools. These tools assist the analyst by filtering, mining, summarizing, and visualizing data to speed up the analysis. In some cases, where there is sufficient confidence in the automated diagnosis, tools can even determine the threat and initiate the appropriate response. However, in the current linear detection workflow (Figure 1), the analyst has no influence on the way evidence is collected or processed, and is eventually required to make decisions based on an externally imposed information flow. This lack of control and transparency can hinder the analyst’s ability to detect threats quickly and accurately. In mission-critical settings such as threat detection, the analyst needs to trust the supporting tools and also have access to the reasoning behind the recommendations or alerts they generate in order to correctly decide whether to accept or reject a recommendation.
Detecting APTs while maintaining a high level of mission performance depends on establishing trust in the decision support tool and compliance with the recommendations it generates. To that end, the analyst should be able to interact with the underlying detection mechanisms throughout the detection process (as illustrated in Figure 2) and not only at the very end. Furthermore, through such interactions the analyst can infuse contextual information that improves detection accuracy and speed. The analyst can also continuously tune the detection processes in response to emerging threats and provide instruction on how the detection processes should adapt to changes in the attack surface and in attackers’ capabilities.
Recent studies on Human-Data Interaction (HDI) propose a human-centric approach to understanding and developing interactions with data, dynamic data flows, algorithms, automated reasoning mechanisms, and visualizations. The core characteristics of HDI interactions can be adapted to the cyber intrusion detection domain by situating the analyst as an influential component in every part of the detection process. Accordingly, in our synergistic analyst-in-the-loop framework we highlight three high-level aspects of the analyst’s interaction with detection. The first aspect is legibility, encapsulating the notion that both data collection mechanisms and analytics algorithms should be transparent and comprehensible to the analyst. The second aspect is agency, concerned with the idea that the analyst should have the capacity to control and influence data collection and management processes. Lastly, negotiability addresses the ability of the analyst to influence data processing and analytics so that data can be processed using different methods, for different purposes, and in different contexts.
Figure 1 (left): Linear detection process — Figure 2 (right): Synergistic detection process
Varying levels of analyst involvement in detection
The attention of the human analyst is a valuable and scarce resource. As such, human attention and cognitive capabilities should be allocated where they are most beneficial, in support of the most critical tasks. Other tasks that do not benefit significantly from human analytical capabilities, or that play a less critical role, can be partly or completely automated. Parasuraman and colleagues propose a model for types and levels of human interaction with automation. Among other uses, this model can assign different levels of automation to the four stages of information processing (information acquisition, information analysis, decision selection, and action implementation). Cyber intrusion detection relies upon information processing: evidence collection is equivalent to information acquisition, the operation of the detection engine is equivalent to information analysis, and the analyst decision and response stages correspond to decision selection and action implementation. Therefore, in each of the detection stages, analyst involvement can range from high to low. High analyst involvement corresponds to a low automation level, where the analyst must take all decisions and actions, while low analyst involvement corresponds to high automation, where detection processes operate autonomously.
With respect to the analyst’s decision of whether or not there is an intrusion, we consider four levels of collaboration. At the lowest end of the collaboration scale, the analyst operates unaided and detects threats in the raw packet-level data. Automation can increase by adding detection mechanisms that provide recommendations (i.e., alerts) to the analyst. The actual level of automation then depends on how the analyst’s role in responding to these alerts is defined. At the second level, the analyst acknowledges correct detections and detects additional threats that were missed by automated detection. Alternatively, when automation is more extensive and trustworthy, the analyst’s role can be limited to detecting missed threats and canceling false alerts. At the highest level, all aspects of detection are automated and the analyst can direct full attention towards selecting the best response to detected threats.
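The four collaboration levels can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not part of the framework itself: the names `CollaborationLevel` and `final_detections`, and the representation of alerts as sets of event identifiers, are hypothetical. It shows how the set of events finally treated as intrusions could depend on the analyst's role at each level.

```python
from enum import IntEnum


class CollaborationLevel(IntEnum):
    """Hypothetical labels for the four collaboration levels in the text."""
    UNAIDED = 0                 # analyst works on raw packet-level data, no alerts
    ACKNOWLEDGE_AND_SEARCH = 1  # alerts count only once acknowledged by the analyst
    CANCEL_AND_SEARCH = 2       # alerts count unless the analyst cancels them
    FULLY_AUTOMATED = 3         # alerts count as-is; analyst focuses on response


# Correspondence stated in the text between detection stages and the
# four information-processing stages of the automation model.
DETECTION_TO_PROCESSING_STAGE = {
    "evidence collection": "information acquisition",
    "detection engine operation": "information analysis",
    "analyst decision": "decision selection",
    "response": "action implementation",
}


def final_detections(level, alerts=(), acknowledged=(), cancelled=(),
                     analyst_found=()):
    """Return the set of event ids treated as intrusions at a given level.

    alerts:        ids flagged by automated detection
    acknowledged:  alert ids the analyst confirmed (used at level 1)
    cancelled:     alert ids the analyst rejected as false (used at level 2)
    analyst_found: ids the analyst detected independently of any alert
    """
    alerts, acknowledged = set(alerts), set(acknowledged)
    cancelled, analyst_found = set(cancelled), set(analyst_found)

    if level == CollaborationLevel.UNAIDED:
        return analyst_found
    if level == CollaborationLevel.ACKNOWLEDGE_AND_SEARCH:
        return (alerts & acknowledged) | analyst_found
    if level == CollaborationLevel.CANCEL_AND_SEARCH:
        return (alerts - cancelled) | analyst_found
    if level == CollaborationLevel.FULLY_AUTOMATED:
        return alerts
    raise ValueError(f"unknown collaboration level: {level!r}")
```

The difference between the two middle levels is the default: at the second level an alert has no effect until the analyst acknowledges it, whereas at the third level an alert stands unless the analyst cancels it. This is one plausible reading of the text, chosen here because it makes the shift of workload toward automation explicit.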
Adapting detection to the threat’s life cycle
Threats may be characterized by how well they can be detected through automated methods. Detectability tends to be highly correlated with how well the threat is understood. From the analyst’s perspective, threats go through a progression of understanding, which is also captured by the life cycle of a threat.
Initially the threat is unknown to the analyst; such an unknown threat often targets an unknown vulnerability and is referred to as a zero-day exploit. Encountering such a threat requires extensive forensic examination and study of normal and abnormal network behaviors in order to isolate the threat and gain a preliminary understanding of its existence. There can be significant human involvement in this stage, from determining (i.e., labeling) the activity as malicious, to identifying the evidence that indicates the presence of the threat, to identifying its impact. As more examples of the threat are observed, more information is revealed to the analyst. Eventually, human understanding improves to the point where accurate detection mechanisms can be automated and programmed into intrusion detection systems. This automation, in parallel with correction (e.g., patching) of the vulnerability, shifts the load away from the analyst. Supporting the analyst through automation is critical in this part of the threat life cycle because, following disclosure, the volume of cyber attacks that utilize the specific threat can increase by up to 5 orders of magnitude.
At that point, detection of the now well-understood threat can be safely relegated primarily to automated mechanisms. The analyst remains responsible for making the final decision based on the automated detection outputs. However, the majority of the processing needed to ascertain the probability of the threat is done without human involvement.
The dynamics of the threat and detection life cycle highlight the need to allow different levels of automation in the detection processes. This ability is tightly coupled with the analyst’s ability to interact with the detection processes, understand how they operate, and influence their operation. Ultimately, the needs and role of human analysts change constantly, and detection processes should therefore be flexible enough to support the analyst across constantly changing levels of understanding and awareness of cyber threats.
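As a rough illustration of the life-cycle progression described above, the following sketch maps a threat's phase of understanding to a suggested degree of analyst involvement. Everything in it is an assumption made for illustration: the phase names, the use of a labeled-example count and an "automated signature exists" flag as proxies for understanding, and the intermediate "medium" involvement level do not come from the text, which only describes involvement as ranging from high to low.

```python
from enum import Enum


class ThreatPhase(Enum):
    """Hypothetical phases of analyst understanding of a threat."""
    UNKNOWN = "unknown"                  # e.g., a zero-day exploit
    EMERGING = "emerging"                # examples observed, no automated detection yet
    WELL_UNDERSTOOD = "well_understood"  # accurate automated detection exists


def classify_phase(labeled_examples, has_automated_signature):
    """Assign a phase from two illustrative proxies for understanding."""
    if has_automated_signature:
        return ThreatPhase.WELL_UNDERSTOOD
    if labeled_examples > 0:
        return ThreatPhase.EMERGING
    return ThreatPhase.UNKNOWN


def analyst_involvement(phase):
    """Suggested analyst involvement for a phase (assumed mapping)."""
    return {
        ThreatPhase.UNKNOWN: "high",         # forensic study, labeling, impact analysis
        ThreatPhase.EMERGING: "medium",      # analyst refines emerging detection rules
        ThreatPhase.WELL_UNDERSTOOD: "low",  # analyst makes the final decision only
    }[phase]
```

A call such as `analyst_involvement(classify_phase(0, False))` returns `"high"`, matching the zero-day case in which the analyst carries most of the load. In a real system the proxies would likely be replaced by richer measures of detector accuracy and analyst confidence.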