Application Specific Abstractions: A Research


Posted: March 8, 2016 | By: Lok Yan

II. Background

The relationship between abstractions, security, and performance was introduced in the previous section, as was the opportunity to utilize the extra transistors available in modern IC designs. We elaborate on those initial observations in this section.

A. Abstractions

Abstraction is a technique that 1) hides the complexities of a system behind a simplified model called an abstraction interface; and 2) partitions developers into low-level developers, who implement the abstraction layer, and high-level developers, who use the abstraction layer to build higher-level systems. This simplification and division of labor reduce the costs of development and are a source of modern computing's success.


A well-known computing abstraction is shown in Figure 1a. Here, the operating system abstracts away the complexities of the lower-level hardware and presents higher-level processes and applications with a System Call Interface. A more complex abstraction stack, the Open Systems Interconnection (OSI) model, is depicted in Figure 1b. The OSI model shows multiple abstraction layers stacked on top of each other as seen from a communications perspective. This demonstrates that abstraction is used both vertically (e.g., the presentation layer further abstracts the session layer) and horizontally (e.g., there is more than one way to hide complexities).

Moreover, abstractions are not only used in software, but also in hardware, as depicted in Figure 2. At the bottom of the figure are the fundamental building blocks of modern computers: transistors, resistors, capacitors, etc. The details of these analog devices, such as their power and delay characteristics, are abstracted away by the manufacturers, who present digital logic gates in the form of standard cells that are specific to a manufacturing process. Chip designers can then use the cells to build progressively larger structures, such as blocks and modules, that are reused and combined to create sophisticated integrated circuits (ICs). These individual ICs are in turn combined into Systems-on-Chip (SoCs), which serve much of modern personal computing, dominated by phones, tablets, and Internet-of-Things devices.

The rest of the stack is similar to that of Figure 1a, except for the two additional abstraction layers above the operating system in this notional Android stack. In Android, applications or Apps can use either the Android App Framework or native libraries to interact with the rest of the system. As an extra abstraction layer, the App Framework greatly simplifies common tasks such as interacting with the user and inter-process communication.

The framework and the Android Runtime layer that emulates the Dalvik Bytecode of Android Apps, being general abstraction layers, do incur additional overhead. For this reason, performance intensive applications such as games and multimedia Apps use native libraries to bypass the additional abstraction layers and interact with the system directly. This flexibility in how Apps interact with the rest of the system is the beginning of application specific abstractions.

Though the abstraction hierarchy in Figure 2 is fairly deep, it still does not represent all abstractions. In particular, programming language abstractions are not shown. Programming languages are examples of abstractions (Dennis uses the term hierarchies) that are implemented through translation, whereas the layers in the Android stack implement abstraction through extension and interpretation [1].


In translation, the language specification is the interface specification, and the abstraction layer is the compiler. For example, Android Apps are written mainly in Java. The Java source is then compiled down to Dalvik Bytecode (the App’s executable code).

Interpretation can be seen as a dynamic counterpart to translation. The compiler statically translates the behavior described in the higher-level language into something that can be executed natively, but the interpreter must decode (interpret) the requests made by higher-level users at runtime and execute code on their behalf. The operating system and the Android Runtime are examples of interpretation.

Extensions are abstractions implemented in the same language as that of their intended users, which in practice means shared libraries. In this case, the functions, methods, classes, etc. exported by a shared library constitute the abstraction interface, and the library itself is the implementation.

1) Abstraction Interface Specifications: There are three entities in an abstraction: the interface, the low-level developer, and the high-level developer. We therefore separate the sources of security vulnerabilities into those due to ill-defined specifications, those due to poor or incorrect low-level implementations, and those due to poor high-level implementations. Since vulnerabilities arising from implementation errors can occur with or without abstraction, we focus only on vulnerabilities due to ill-specified interfaces. We further limit our discussion to problems stemming from the nature of abstractions and how they are used; it does not include cases where the interface specification was simply defined incorrectly. Thus, outright mistakes in the specification, implementation, and usage of abstractions are out of scope.

The abstraction interface specification is an agreement that binds the low-level and high-level developers. On one extreme, the rules within the specification are interpreted and defined by one party. In other words, the low-level developers define the interface on their own, and the high-level user simply uses the interface as is. These are generic abstractions, since there is a desire for the specification to be generic enough to cover many possible users. Generic abstractions exhibit a one-to-many relationship, where a single low-level developer defines the interface for many users. We use the progressively wider blocks in Figure 2 to illustrate this concept.

On the other extreme, the rules are defined jointly and are application specific. These run the risk of being overly constrained, though. Pictorially, if all abstraction layers are one-to-one, then all blocks are of the same width.

2) Security: From a security perspective, application specific interfaces are desirable because the user's requirements can be incorporated directly into the interface specification. Unfortunately, this is only practical for highly sensitive applications that can absorb the high development costs, because it forgoes the one-to-many multiplier of generic abstractions. One of the main weaknesses of generic abstractions is that, by defining the specification without input from the user, some very important information can be lost, leading to a semantic gap problem. Buffer overflows are a good example of this.

A buffer overflow occurs when data is written beyond the confines of the intended buffer. While these can be attributed to programmer error, we argue that those that cross an abstraction boundary (e.g., resulting from a call to a library function such as memcpy or strcpy) are the products of overly generic specifications. In these cases, the definition of the "intended buffer" is simply lost through abstraction. To demonstrate this, we compare array copying as defined in C/C++ and Java.

In C++, arrays are copied using either memcpy, if the array elements are of primitive types, or std::copy. Both are generic abstractions through extension. memcpy declares both its src and dst pointers as void*, meaning all type information is lost, and std::copy is a template function whose instantiations need only support a set of common operations. Furthermore, both require the user to ensure that the destination buffer is large enough to hold all of the copied elements. If the user makes a mistake and tries to copy too much, a buffer overflow vulnerability is created.

Java, on the other hand, automatically creates a new class definition for every new array type [2]. That is, when the user defines an int array (int[]), the Java virtual machine automatically creates an int-array class. Element stores and copies through these classes are automatically bounds checked, so an out-of-range copy raises an exception instead of silently overflowing. This is an example of an application specific abstraction. One could argue that the C++ template function std::copy serves the same purpose if its iterators performed bounds checking by default, but this is only true if the user supplies iterators that do so; the built-in iterators omit bounds checking for performance reasons.

The same problem exists in abstractions through interpretation. Like memcpy, the read system call relies on the user to ensure that the buffer is large enough, and it removes the type information as well. The loss of type information can exacerbate security issues, since interpretation uses different high- and low-level languages. For example, when a Java program uses JNI (Java Native Interface) to call native code, all protections afforded by the Java Virtual Machine and the language are lost. This means that writing to primitive Java arrays at the native level no longer generates an ArrayIndexOutOfBoundsException. The implementer of the JNI function could perform bounds checking, but this is itself an example of application specialization, and such special treatment cannot be sustained across the multiple abstraction layers exhibited in modern platforms.

A compiler is the abstraction layer in translation. Because compilers rely heavily on formally proven methods and algorithms, we believe the main sources of security issues are under-specified or unspecified behaviors, which result in incomplete proofs, and errors in the implementation, the latter of which are out of scope for our discussion. Undefined behaviors are interesting in that the specification itself forces the low-level developer to make a design decision that defines the behavior.

A recent study showed that undefined behaviors led to unstable code (a superset of insecure code) in about 40% of all Debian Wheezy packages written in C/C++ [3]. The authors found that, for optimization purposes, modern compilers are removing code such as null-pointer checks, thereby violating the programmer's intent and introducing security vulnerabilities. The authors also note that for certain compilers the unstable behavior is only introduced at higher optimization levels. Since the optimization level is a configuration option, this is another indication that application specific abstractions, instantiated through configuration, are desirable.

In this section we have provided background and examples of security- and reliability-related issues caused by the use of generic abstractions. We have also noted that modern computing stacks retain many layers of abstraction despite their overheads. The next subsection briefly describes how the community and industry have been able to hide this overhead through more sophisticated and powerful processors, and how this dynamic is changing.

3) Overabundance of Transistors: Moore's law has been a mainstay of the semiconductor industry. Over the past few decades, chip manufacturers have continually made advances that effectively doubled the number of transistors in an integrated circuit every two years. While processors of the 1970s contained thousands of transistors per chip, today's state-of-the-art processors contain billions. The increase in transistor counts was mirrored by a similar increase in operating frequencies and performance until about a decade ago. These increases not only allowed for higher performance chips, but also for greater sophistication in terms of predictive execution logic, longer pipelines, and specialized instruction sets (horizontal abstractions), well beyond the humble beginnings of the Intel 4004 general purpose processor of 1971. The performance increases also allowed for far more sophisticated software built upon a multitude of abstraction layers.

The end of frequency scaling is attributable to the end of Dennard scaling. In brief, the switching frequency of a transistor can increase as long as its threshold voltage decreases in line with decreases in feature size; this condition is known as Dennard scaling. (A proper treatment of the topic can be found in Chapter 2 of Shacham's dissertation [4].) The end of frequency scaling gave way to multi-core scaling, where more cores are added to processors to attain performance gains through parallelism and to use the available transistors.
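The scaling argument can be summarized with the standard dynamic-power relation (a sketch of the textbook derivation, not from [4] verbatim):

```latex
% Dynamic switching power, with activity factor \alpha,
% capacitance C, supply voltage V, and frequency f:
P_{\text{dyn}} \approx \alpha\, C V^{2} f

% Dennard scaling: shrink linear dimensions by 1/\kappa, with
% C \to C/\kappa,\; V \to V/\kappa,\; f \to \kappa f.
% Per-transistor power then becomes
P' = \alpha \cdot \frac{C}{\kappa} \cdot \frac{V^{2}}{\kappa^{2}} \cdot \kappa f
   = \frac{P_{\text{dyn}}}{\kappa^{2}},
% while transistor density grows by \kappa^{2},
% so power density stays constant.

% Post-Dennard: V is pinned near the threshold voltage, so
P' = \alpha \cdot \frac{C}{\kappa} \cdot V^{2} \cdot \kappa f
   = P_{\text{dyn}},
% and power density grows as \kappa^{2}; hence dark silicon.
```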

However, these gains are unsustainable. The end of Dennard scaling also means that as the number of transistors goes up, so does the power necessary to drive the overall chip. Increasing the number of cores thus requires higher capacity power sources and cooling solutions to dissipate the generated heat. The industry hit a “power wall” [4] that was exacerbated by the dramatic rise of mobile platforms, where both power and cooling are limited. Given the limited power available in mobile platforms, and power envelopes that are shrinking further to obtain better battery life, increasing the number of transistors means there will come a time when not all transistors can be utilized and not all cores can be processing at once. This is known as the “dark silicon” problem, and it already exists today [5], [6].

Dark silicon is less a problem than a design constraint, though. There is no rule that states all transistors must be fully utilized; while the transistors cannot all be activated concurrently, they can be used effectively in sequence. Dark silicon does, however, limit the ability to attain better performance through simple and general abstractions such as multi-core. As in the case of security, application specific abstractions and application specific integrated circuits seem to provide a way forward, and the extra transistors can be used to serve that purpose. In the next section, we discuss a research agenda that builds on and follows recent research.
