Traditional forward disassembly starts at a base address and decodes instructions towards higher memory locations, whereas reverse disassembly must decode instructions that lie at memory locations lower than the base. Reverse disassembly on x86 is never an easy task: because instructions are variable-length, there is no knowledge of where the previous instruction begins or what its length is.
This makes reverse disassembly a clumsy and inaccurate process. But we still somehow need to know what the previous instructions are.
We can do this because we know that every function has to start with a prelude: we first search backwards from the return EIP until we find a valid prelude, then disassemble forward while creating a CFG of the complete function. After constructing the CFG, we locate the node in which the CALL lies, mark the register that makes up the call operand, and search backwards from this node for the instruction that writes into that register.
In case the CALL contains multiple registers (such as base and index registers), we repeat this procedure for each individual register. To search backwards in the CFG, we simply reverse the edges of the graph and follow the paths. Within each node, we traverse its instructions in reverse. The above process assumes that function text is organized as a sequence of instructions laid out consecutively in memory, which is the case with most Unix kernels. This technique is shown in Figure 6.
Consider a fictitious example of a single node in the CFG to illustrate our analysis of a register-indirect call. Searching backwards within this node for the instruction that wrote into eax lands us at position 3. After determining this, if we analyze the register contents at that instruction, and add 4 to the value of ebx, this gives us the hook address. Note that this example only illustrates how the search is conducted within a node; if the desired instruction cannot be found in the same node, we traverse backwards to the nodes leading up to the current one in the CFG, and repeat the procedure shown here.
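The backward search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the instruction tuples, the node contents in NODE, and the names reverse_edges, find_writers and hook_address are hypothetical, and NODE is a stand-in consistent with the prose (a write to eax at position 3 through [ebx+4]), not the original listing.

```python
from collections import deque

# Hypothetical node: (position, mnemonic, destination register, source operand).
NODE = [
    (1, "push", None, "ebp"),
    (2, "mov", "ecx", "[ebp+8]"),
    (3, "mov", "eax", "[ebx+4]"),
    (4, "push", None, "ecx"),
    (5, "call", None, "eax"),
]

def reverse_edges(succs):
    """Turn a successor map into a predecessor map (the reversed CFG)."""
    preds = {node: [] for node in succs}
    for node, targets in succs.items():
        for target in targets:
            preds.setdefault(target, []).append(node)
    return preds

def find_writers(blocks, succs, call_node, call_index, reg):
    """Return (node, position, source) for the nearest write to reg on each backward path."""
    preds = reverse_edges(succs)
    found = []
    seen = {call_node}
    # In the node holding the call, only instructions before the call are scanned.
    queue = deque([(call_node, blocks[call_node][:call_index])])
    while queue:
        node, insns = queue.popleft()
        for pos, _mnemonic, dest, src in reversed(insns):
            if dest == reg:
                found.append((node, pos, src))
                break
        else:
            # No write in this node: continue along the reversed edges.
            for pred in preds.get(node, []):
                if pred not in seen:
                    seen.add(pred)
                    queue.append((pred, blocks[pred]))
    return found

def hook_address(registers, base, offset):
    """Combine the register context captured at the hook instruction with the offset."""
    return registers[base] + offset
```

With a two-node graph where a hypothetical node "A" precedes the node above, find_writers locates the eax write inside the call's own node and falls back to "A" for registers written there. Instructions that follow the call inside a loop body are not revisited in this sketch.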
The addresses of these hooks can be either static or dynamic, depending on where the hooks are located. We henceforth refer to the function that is called using a pointer hook as a hook function. A rootkit hook function is a function that lies within a rootkit, while a kernel hook function is one that lies within the kernel or in non-rootkit LKMs.
In his book on rootkits [27], Joseph Kong describes hooking as a programming technique that employs handler functions to modify control flow. Typically, a hook function will call the original function at some point in order to preserve the intended behavior. A hook function calling another hook function, diverting control away from the core kernel, is exactly the kind of behavior we wish to detect. Since the hook functions in both cases are called via indirect calls, and the arguments passed to these functions are largely identical (we elaborate more in Chapter 7), we should be able to detect such a scenario and report the presence of a rootkit.
We insert probes on the final list of callee functions identified in the learning phase, and when these probes are hit, we scan for the possibility of the above attack vector. We also make the assumption that the kernel text is unmodified by the rootkit; this is not a limiting factor of our system since text modifications are easily detectable by existing systems [31] [42] [30], and so are a dangerous avenue for any rootkit trying to avoid detection. Also, maintaining text integrity is an orthogonal problem which we do not tackle here.
Our method works as follows. During the learning phase, if the hook function or callee function is within the core kernel text, then we insert a probe on this function.
When this probe is hit, we disassemble the call, obtain the address of the callee, and insert a probe on it. After inserting probes this way, our system now enters a passive state, where the actions are triggered only when these probes are hit.
Hence if a particular hook function is never called, then our system imposes no overhead on it. These initial probe handlers (also called Type I probe handlers) first determine whether the function was called through a pointer hook.
To do this, we repeat the same procedure as in the learning phase, i. In case of a direct call, we return. Note that we still need to repeat this step, since the same function can be called both directly and indirectly from different parts of the kernel. Our next step is to check for the presence of a rootkit.
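To make the direct-versus-indirect distinction concrete, here is a minimal sketch that inspects the bytes immediately before a return address. It is not the udis86-based method used in this work: it is a byte-pattern heuristic that assumes 32-bit x86 code without instruction prefixes, and the function names are hypothetical. It relies on two facts about the encoding: a direct near call is opcode E8 followed by a 4-byte displacement, and an indirect near call is opcode FF with the ModR/M reg field equal to 2.

```python
def indirect_call_length(code, start):
    """Return the encoded length of an FF /2 call at code[start], or None if it is not one."""
    if start < 0 or start + 1 >= len(code) or code[start] != 0xFF:
        return None
    modrm = code[start + 1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if reg != 2:
        return None
    length = 2  # opcode + ModR/M
    if mod != 3 and rm == 4:
        # A SIB byte follows the ModR/M byte.
        if start + 2 >= len(code):
            return None
        length += 1
        if mod == 0 and (code[start + 2] & 7) == 5:
            length += 4  # SIB with no base register carries a 32-bit displacement
    if mod == 1:
        length += 1      # 8-bit displacement
    elif mod == 2:
        length += 4      # 32-bit displacement
    elif mod == 0 and rm == 5:
        length += 4      # absolute 32-bit address
    return length

def classify_call(code, ret_offset):
    """Classify the call ending at ret_offset as 'direct', 'indirect', or None."""
    if ret_offset >= 5 and code[ret_offset - 5] == 0xE8:
        return "direct"
    for length in range(2, 8):
        if indirect_call_length(code, ret_offset - length) == length:
            return "indirect"
    return None
```

Because the displacement bytes of one instruction can look like the opcode of another, this kind of backward matching can misclassify a call, which is one reason reverse disassembly is described above as inaccurate.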
We have two cases to handle:
1. When a rootkit is installed while autoscopy is functioning: In this case, the rootkit would modify h to point to its own hook function F' and, since the original functionality must be unhindered, F' still calls F.
Autoscopy inserts probes on hooks to determine if control flow has been diverted to a new rootkit (a), or if control flow is returning from an already-present rootkit (b). The dotted lines indicate how the probe handlers detect the indirect calls. The numbering shows the order of control flow. In case (a), our probe resides in the original callee. In case (b), our probes land within the rootkit itself.
2. When a rootkit is already present on the system: In this case, the function that we put a probe on is the rootkit function itself. Since the probe is at the beginning of the function, we first disassemble F' in memory.
Then we insert probes (also called Type II probes) on all functions that F' calls, both directly and indirectly.
When these probes are hit, we repeat the same process i. The context supplied to direct function-call probes indicates that the probe handler should continue putting probes on functions underneath it, while the context for indirect-call probes indicates that the handler should check its arguments against those of F', while also continuing to insert probes if the function is not within the core kernel.
Figure 6. All Type I and Type II probes are also cached so we do not keep inserting and removing probes across these modules, which might diminish performance. To insert probes on all calls within a function, we first construct a CFG of the function. Briefly, a CFG consists of basic-blocks that are delimited only by jump or return instructions. Each node in the CFG represents a contiguous sequence of instructions that has a single jump or return instruction at the end.
The second case is necessary because in case of a conditional jump, the program can either take the jump, or ignore it and continue with the next successive instruction. For both these phases, the probe handlers that autoscopy installs need to be as fast and efficient as possible. Any latency incurred at this level might get exaggerated if the underlying functions are frequently called within the kernel.
For the learning phase, though latency is not particularly an issue, since there are thousands of probes inserted by our system, we need to make sure the OS does not become significantly slow. For the detection phase, latency becomes a critical issue since it is the determining factor in measuring the performance and overhead imposed by autoscopy.
We also need an implementation of a disassembler within the kernel. We use the excellent udis86 disassembler [11], which was modified and compiled to work within an LKM. Also, since we perform reverse disassembly (when disassembling the call instruction within the caller function), udis86 was modified to support this mode of operation. The query interfaces of udis86 and other library accessor functions were used to query the current instruction and the operands that were disassembled, and appropriate actions were taken based on the result.
Moreover, since our probe insertion algorithm is unaware of the location of these probes, some of them may (and usually do) land in highly sensitive areas of the kernel such as top halves, interrupt timers, core scheduling routines, etc.
Such areas of the kernel usually cannot tolerate even a moderate delay in execution time, and the kernel is prone to crashes when instrumented carelessly in these parts. Hence, in order to reliably insert probes all across the kernel, our probe handlers in this phase are highly optimized and efficient. In order to avoid expensive computation within the handlers, we only collect the context of each probe, and delegate the actual workload to a user-space process.
Collecting probe context (we explain what we mean by context shortly) is extremely fast, and thus allows us to insert a large number of probes without drastically affecting system performance.
While performance is not a major concern yet, after inserting all our probes, we run LTP to exercise all parts of the kernel, which takes a few hours. Hence we need to ensure that overall system response is still manageable. The Collector then reads this incoming stream of addresses and inserts probes on all of them.
[Figure 7. Protocol between the Analyzer and the Collector:
Analyzer: Scan kernel memory and obtain list of potential hooks.
Analyzer to Collector: Send the list of potential hooks.
Collector: Insert probes and collect the return EIP context.
Collector to Analyzer: Send all hit probes along with context.
Analyzer: Calculate the addresses of all hook instructions for the probes.
Analyzer to Collector: Send the list of hook instructions.
Collector: Insert probes and collect processor registers context.
Analyzer: Calculate the hook addresses and build table of hooks.]
The Collector is implemented as an LKM and exports a driver interface that is used by both the modules for protocol interaction and data transmission.
We now go through all the steps from Figure 7. The Analyzer scans kernel memory through the kmem interface and uses the algorithm from section 6. The Analyzer then sends this list to the Collector. The Collector inserts probes on all the callee functions.
The context of a probe at this point is the return instruction pointer (or return EIP) that identifies the caller function. After the LTP tests complete, the Collector returns the list of all probes that were hit. Before proceeding further, we first define a hook instruction as the instruction that contains the hook address in the form of processor registers and optionally an offset.
To verify whether a hook is indeed involved, the Analyzer uses the kmem interface to access the kernel function, disassembles the call, and if it is indirect, disassembles the caller function, builds a control flow graph (CFG), analyzes the operands of the call instruction, and finally identifies the set of hook instruction(s) that contain the hook address.
Since we still need to obtain the processor registers when the hook instruction was executed, the Analyzer feeds its list of hook instructions along with the caller-callee information back to the Collector. The Collector again inserts probes on all hook instructions and callee functions. The context of a probe now is the processor registers instead of the return EIP. After a second round of LTP tests, the Collector returns the set of all hit probes along with their context to the Analyzer.
Finally, the Analyzer uses the context information to build a table of hook addresses, resolve all addresses into symbols, and also calculate function argument information, explained in section 7.
Since the learning phase has the capability to determine the prototype of the callee function, we leverage this potential to calculate the number of arguments that a callee function expects. For functions within an LKM, since there is no possible way for us to determine this information, we use a heuristic and set the number of arguments as four for such functions. Argument similarity is defined as the number of arguments that are equivalent between two function calls.
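As a small illustration of the argument-similarity definition above, the following sketch counts equal arguments between two calls. The positional comparison, the helper names and the constant name are assumptions made for this example; only the default of four arguments for LKM functions comes from the text.

```python
# Heuristic from the text: assume four arguments for functions inside an LKM.
DEFAULT_LKM_ARG_COUNT = 4

def read_arguments(stack_words, count=DEFAULT_LKM_ARG_COUNT):
    """Take the first `count` words of a captured stack frame as the argument list."""
    return list(stack_words[:count])

def argument_similarity(args_a, args_b):
    """Number of positions at which the two calls received equal argument values."""
    return sum(1 for a, b in zip(args_a, args_b) if a == b)
```

Comparing position by position stops at the shorter argument list, so a callee with fewer known arguments limits the maximum achievable score.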
Whenever a callee probe handler is called, both scenarios from section 6. If either scenario reports a hit, then a rootkit is detected. Since we also make the assumption that the kernel text is unmodified, we make some performance enhancements this way. To reiterate, Type I handlers are installed for probes detected during the learning phase; all other handlers make up Type II. Stack Verifier: The Stack Verifier is responsible for ensuring that the current stack is untampered and also determining the origin and type of the function call.
Reports an anomaly if it detects either an unusual stack or a control flow anomaly, such as the presence of two indirect calls within a single control flow that diverts flow from the core kernel to within an LKM. It is called from both Type I and Type II handlers, and is responsible for recognizing both direct and indirect function calls within the current function, and inserting Type II probes, if so desired by the caller. It consults the Probe Registrar to manage the hierarchy of probes, and the State Propagator to propagate state information across Type II handlers.
Maintaining state information for probes is necessary since Type II handlers are inserted for both direct and indirect function calls within LKMs; the former type of handlers need to continue inserting probes recursively on their own functions, while the latter need to verify their parameter list with the stack above.
Acquires and validates the state information handed down by either Type I handlers or other Type II handlers. Also checks the current control flow and raises an alert in case of an anomaly. Probe Registrar: The Probe Registrar is responsible for the insertion and deletion of probes, and also for maintaining a record of active probes within the system. This record is necessary for when autoscopy is uninstalled, and the system needs to be cleaned of all probes inserted for the purposes of assisting detection.
The registrar also caches all probes, instead of continuously inserting and removing them, which would have drastically affected system performance. We note that state information is maintained as a part of the probe structure within the kprobes framework. This is an optimization and can also be considered as a tiny hack that allows us to quickly get the context information associated with a given probe.
Anomaly Reporter: The Anomaly Reporter is responsible for reporting and logging any detected anomaly, along with the present state of the stack and any argument values associated with the function where the anomaly was detected.
As mentioned before, we use udis86 as our disassembler of choice. We maintain a queue of outstanding addresses that is initially empty. The construction of control flow graphs is undertaken only for functions, hence the starting address of a function is taken as input. Given this address, the CFG construction proceeds by disassembling each succeeding instruction one by one and adding it to the current node in the graph.
If the disassembled instruction is a jump or a return, then a new node is created. If the instruction is an unconditional jump, then we take the jump if not already taken before and follow the disassembly from the jump target. Else if the jump target has already been disassembled before, we keep removing addresses from the head of the queue until we find one that has not been traversed.
We quit if the queue becomes empty. If the instruction is a conditional jump, we add the jump target to the tail of the queue, and resume from the succeeding instruction. Else if the instruction is a return, we again keep removing addresses from the queue, until we can proceed with the disassembly. The algorithm ends when the queue has become empty. Thus, our implementation of autoscopy is both modular and optimized for efficiency.
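The queue-driven CFG construction described above can be sketched as follows. This is a simplified illustration, not the udis86-based implementation: the "program" is a hypothetical table mapping each address to (kind, length, target), where kind is "op", "jmp", "jcc" or "ret", and the sketch does not split an existing node when a later jump lands in its middle.

```python
from collections import deque

def build_cfg(program, entry):
    """Build nodes (start address -> instruction addresses) and edges for one function."""
    nodes, edges = {}, set()
    visited = set()   # addresses already disassembled
    queue = deque()   # outstanding addresses, initially empty

    def next_untraversed():
        # Remove addresses from the head of the queue until one is untraversed.
        while queue:
            candidate = queue.popleft()
            if candidate not in visited:
                return candidate
        return None   # queue is empty: construction is finished

    addr = entry
    while addr is not None:
        start = addr
        node = nodes.setdefault(start, [])
        while True:
            kind, length, target = program[addr]
            visited.add(addr)
            node.append(addr)
            fallthrough = addr + length
            if kind == "jmp":
                edges.add((start, target))
                addr = target if target not in visited else next_untraversed()
                break
            if kind == "jcc":
                # Queue the jump target and resume from the succeeding instruction.
                edges.add((start, target))
                edges.add((start, fallthrough))
                queue.append(target)
                addr = fallthrough if fallthrough not in visited else next_untraversed()
                break
            if kind == "ret":
                addr = next_untraversed()
                break
            if fallthrough in visited:
                # Ran into code that was already disassembled.
                edges.add((start, fallthrough))
                addr = next_untraversed()
                break
            addr = fallthrough
    return nodes, edges

# Hypothetical six-instruction function used to exercise the sketch.
PROGRAM = {
    0: ("op", 1, None),
    1: ("jcc", 2, 6),
    3: ("op", 1, None),
    4: ("jmp", 2, 7),
    6: ("op", 1, None),
    7: ("ret", 1, None),
}
```

Calling build_cfg(PROGRAM, 0) yields four nodes starting at addresses 0, 3, 6 and 7, with the conditional jump at address 1 producing both a taken edge and a fall-through edge.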
While there is the additional overhead of kprobes, as we show in the coming chapter, the overall performance overhead is still within acceptable limits. We first give coverage details for all hooks identified through our learning process, and also illustrate our findings by giving a partial table of descriptions for all resolved hooks (hooks determined during our learning phase). Ideally, our full table serves as a catalogue for kernel hooks that could be used by other detection agencies aiming to protect the kernel.
Also, our learning and detection phases need not be limited by our existing implementations. Our underlying technique is general enough to be accommodated by a VMM, if so desired, isolation being the only advantage that can be reaped from such an implementation.
To accommodate embedded platforms that have little native support for installing external systems such as CoPilot (which uses a PCI interface), we advocate the use of architecture-specific functionalities that prevent the execution of code from a memory page that has been specifically tagged as non-executable. A TPM is a passive hardware module that provides secure storage and cryptographic computation outside the system processor.
We propose extending the appropriate PCR registers with the contents of the text, and verifying the register contents either periodically, or when a probe is hit. This would help detect modifications in static code or data. Finally, if the MMU itself can be modified, then our proposal for a better policy interface at the MMU, via interaction with a customized FPGA [17], efficiently guarantees text integrity, along with other complex policy enforcements.
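The PCR idea mentioned above can be illustrated with a software simulation. This sketch does not talk to a real TPM: it assumes the TPM 1.2 style of extension, in which the new register value is the SHA-1 hash of the old value concatenated with a 20-byte measurement, and the function names and sample byte strings are hypothetical.

```python
import hashlib

PCR_SIZE = hashlib.sha1().digest_size  # 20 bytes for SHA-1

def measure_text(text_bytes):
    """Measurement of a region of kernel text: its SHA-1 digest."""
    return hashlib.sha1(text_bytes).digest()

def extend(pcr, measurement):
    """Simulated extend: new PCR = SHA-1(old PCR || measurement)."""
    return hashlib.sha1(pcr + measurement).digest()

def text_unmodified(expected_pcr, initial_pcr, text_bytes):
    """Recompute the register value and compare it with the expected one."""
    return extend(initial_pcr, measure_text(text_bytes)) == expected_pcr
```

A verifier that recorded the register value for pristine text would then see a mismatch as soon as a single byte of that text changes, which is the property the periodic or probe-triggered check relies on.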
However, a large portion of these hooks were duplicates (they referred to locations with the same value). The next step, as described in section 6. This phase collected Finally, filtering these locations based on whether they were actually called using indirect calls led us to identify hooks of these about odd were system calls. Table 8. For eight other rootkits that were impractical to compile and install into 2.
We note that a majority of publicly available rootkits are released to just demonstrate a proof-of-concept, and accordingly our analysis targets rootkit techniques and not specific implementations. The rootkits we tested against are shown in Table 8.
As expected, we were able to successfully detect and report all the rootkits, both when they were pre-installed before our system began operation and when they were installed after our system was present. We point out that our approach is agnostic to the particular content or feature set of a rootkit. Rather, we detect the behavior a rootkit must undertake to invade a system.
In some sense, our detection can be considered rootkit-agnostic: our approach is analogous to efforts that focus on vulnerabilities rather than exploit content. Our initial aim is to expose the hooks used by these rootkits so that autoscopy can detect the presence of both these instances of malware. Adore-ng is used to hide files, processes and network sockets from an IDS. It works by hijacking hooks at the virtual filesystem (VFS) layer.
We distinguish between autoscopy and manual source-code analysis. The filesystems that adore-ng considers are the root filesystem and the proc filesystem. We now show that our system exposes these hooks that are used by adore.
In this case, there are only two: the readdir hooks of the proc and ext3 filesystems. The actual structures used are proc_root_operations and ext3_dir_operations, both of type file_operations.
Autoscopy was able to successfully discover both hooks. Autoscopy was able to discover these hooks and detect the rootkit. Enyelkm is a kernel rootkit that modifies the kernel functions system_call and sysenter_entry to redirect control to its own handlers, while also maintaining a partial copy of the system-call table so as to not modify the original, since it could be integrity-checked by IDSes. Enyelkm redirects the system calls sys_kill, sys_getdents64 and sys_read to its own functions, which can then give root access to arbitrary processes, hide modules, files, directories and processes, or even hide chunks within a single file.
Our system exposes all three hooks modified by enyelkm and Table 8. Since this particular rootkit does not modify the original hooks, simple integrity checkers on the sys_call_table structure would fail to detect this rootkit. However, since autoscopy uses control flow anomaly detection, we were able to successfully detect the presence of enyelkm within our system.
With these probes in place, we ran a series of experiments to determine the overhead of running our system versus an uninstrumented version of the kernel. We first conducted three non-synthetic benchmarks: compiling the Apache 2. Both these benchmarks offer a wide range of micro-benchmarking utilities that are useful for stressing particular components of the system and for streamlining results.
In the tables that follow, when reporting the performance overhead, a minus sign before the value indicates that our autoscoped system performed better than a native system. In these cases, we assume that the particular scenario did not heavily exercise our autoscoped hook locations. For these results, we also assume that our measurement values were within the statistical error.
A plus sign before the value indicates that our system performed worse than the native and the percentage value quantifies this overhead.
[Table columns: Benchmark Name, Native (s), Autoscoped (s), Overhead]
We rebooted the machine between the tests on native and autoscopy. Each SPEC experiment was run three times; we report the median of those runs. In the last column, a plus sign indicates that our system performed worse than the native, and a minus sign means autoscopy imposed no measurable overhead.
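For readers who want to reproduce the sign convention used in the tables, here is a minimal sketch. The text does not state the exact formula, so the relative-difference definition below is an assumption, as are the function names and the sample timings; taking the median of the runs follows the description above.

```python
from statistics import median

def overhead_percent(native_runs, autoscoped_runs):
    """Relative overhead in percent, computed from the median time of each set of runs."""
    native = median(native_runs)
    autoscoped = median(autoscoped_runs)
    return (autoscoped - native) / native * 100.0

def format_overhead(value):
    """Render with an explicit sign: plus means slower than native, minus means faster."""
    return f"{value:+.2f}%"
```

Under this definition a positive value means the autoscoped run took longer than the native run, matching the plus-sign convention described above.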
As our results indicate, autoscopy performed on par with the native system on many benchmarks. Reported values are medians obtained from multiple runs. For the bandwidth measurements, lmbench repeats each test for varying amounts of data that is transferred, from about bytes to MB. We report the values in Table 8.
From the results, we again notice that the performance of autoscopy is almost head to head with the native system. Our first non-synthetic test was to compile Apache under our system and compare this with an Apache compile using a pristine kernel. We repeated the experiment thrice under each environment, and have reported the median values in Table 8. Our results show that we suffer only 1. Finally, we compiled the Linux 2.
This is not a likely scenario since our discovery system is comprehensive enough to cover all of the kernel hooks, as we rely on LTP to hit every area of the kernel. With our design from Section 6. We first define a hook type for a hook as the pair (structure, offset), where structure refers to the type of structure (C types as identified in debugging symbols) under which the hook lies (for global hooks under no structure, the type is global), and offset is the offset of the hook within the structure.
Now, a normal LKM installs a handler (say h1) for a hook of type t1 and another LKM installs a handler h2 for a hook of a different type t2. When h1 is called by the kernel, it in turn calls (via a direct call) some other kernel function which eventually calls h2. Since h1 and h2 are valid LKM functions, this is a valid control flow, but the presence of the two indirect calls to h1 and h2 would be detected by autoscopy and flagged as a rootkit (assuming the argument lists are similar).
To accommodate this scenario, we enhance our implementation with a type checker. Notice that the underlying hooks for h1 and h2 are of different types t1 and t2. Thus, hook functions of the same type should never call each other, but can call hook functions of a different type.
We first build hook types for all the hooks we detect. We feed this along with the list of hooks to the detection system. Our detection system then associates each probe with the corresponding type of hook, and whenever the system identifies a rootkit, it also verifies whether the types of hooks used within both indirect calls are identical or not.
If the types are identical, or one of them is missing, then the rootkit is reported; otherwise the detection constitutes a false positive and is logged as such without raising any other critical alarms. We reiterate that since autoscopy operates at the same privilege level as the attacker, by definition, we cannot prevent the attacker from overwriting kernel text or any self-modification of code.
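The hook-type check described above can be sketched as a small decision function. The names hook_type and classify_detection and the numeric offsets used below are hypothetical; the rule itself (identical or missing types mean a rootkit report, differing types mean a logged false positive) follows the text.

```python
def hook_type(structure, offset):
    """A hook type is the pair (structure, offset); hooks under no structure are 'global'."""
    return (structure if structure is not None else "global", offset)

def classify_detection(outer_type, inner_type):
    """Decide how to treat a flow that contains two indirect calls into LKM code."""
    if outer_type is None or inner_type is None:
        return "rootkit"          # a missing type is treated conservatively
    if outer_type == inner_type:
        return "rootkit"          # same hook type calling itself
    return "false_positive"       # different hook types: log only
```

For example, two handlers registered under the same structure and offset are reported, while a handler of one type reaching a handler of another type is only logged.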
We emphasize that, by themselves, these other approaches cannot detect even moderately sophisticated rootkits that employ some form of dynamic hook modification or control-flow redirection.
Rather, we see these tools as building blocks to help systems such as autoscopy that are directed towards more devious rootkits than the ones that just modify static text or data.
The following are some scenarios of how a rootkit might attempt to subvert autoscopy: 1. The rootkit constructs CALL instructions manually, i. By analyzing the operand of each call instruction that we disassemble, and verifying that the call operand corresponds to the actual function that was called, we can detect such an attack.
We consider both static and dynamic analysis techniques, and evaluate them against autoscopy. Petroni and Hicks present an imitation of CFI called state-based CFI or SBCFI [34] that validates the data structures of the kernel and the kernel state itself as a whole instead of tracking the individual execution branches as they occur. This has the advantage that the introspecting process can be external to the system (in a VMM), but opens up a window during which an ephemeral or non-persistent rootkit might be deployed.
Their learning process is done statically by analyzing the kernel code for global variables and tracking their usage across the kernel, and validating them at runtime. The static nature of their system and the inability to adapt their learning process to the rapidly evolving Linux kernel are some shortcomings of this approach. However, a static white-list of valid memory regions and kernel symbols is necessary in order to distinguish a malicious LKM from a benign one.
Building such comprehensive lists is of course a huge problem for such large and constantly evolving systems. The Memory Protected Zone consists of the kernel hook tables and other static regions of the kernel that need to be protected from rootkits, while the File Protected Zone consists of system binaries and libraries that need to be protected against modifications.
Given the specifications of these zones, Paladin uses a VMM to monitor write accesses across the system for validity. Any time an invalid access into any of the above zones is detected, a process-tracker component then identifies the entire process subtree for the process that elicited the write and kills all processes within this tree.
This serves as an excellent prevention mechanism too, except that the specifications of MPZ and FPZ are again static and a comprehensive survey of them is infeasible. Also, just monitoring for writes to specified locations and preventing them does not prevent rootkits from hijacking function hooks within data structures since these locations are meant to be overwritten.
Their basic idea is quite similar to ours: a rootkit wishes to hide itself, in this case from certain user-level applications such as ls, ps, netstat or explicit monitoring agencies such as Tripwire [10] or AIDE [1].
To do this, a rootkit would have to hook locations that lie in the execution paths of these programs. Hence tracking the kernel control flow for these applications would reveal those hooks which, as we know, are just indirect function calls.
Cataloguing the hooks would serve an IDS that leverages these findings to protect such pointers. HookMap has the distinct advantage of narrowing its playing field to present a concise description of just the necessary hooks that need to be protected within the kernel.
Moreover, hooks that do not lie in the execution paths of any of these programs would still go undetected. Our aim is broader and we cover most of the kernel hooks in the data region that have a reasonable chance of getting manipulated by any rootkit. Any differences here would indicate the presence of a rootkit, and thus prevent it from executing on the guest system.
A disadvantage of this kind of authentication scheme is that it needs to be manually performed every time a module is inserted into the kernel, and in-depth analysis is necessary to ensure that the LKM does not invalidate the kernel.
Most require either a static specification of vulnerable code and data (and hence are fast), or use VM technology to identify and shield the vulnerable portions from malicious code (comparatively slow). Autoscopy is at the crossroads of both these approaches and presents a manageable interface, which requires no VM complications and hence is not susceptible to VM overhead, while at the same time being extensive enough to capture and track the anomalous nature of misbehaving malcode.
Looking back to the period following the dawn of Unix, operating systems have been designed to be opaque, focusing more on providing a clean interface for kernel-user communication, rather than opening up the OS internals for any kind of meaningful analysis. In this thesis, we attempt to provide just such an answer by using OS tracing frameworks to protect the OS itself; hence the name autoscopy. Building a security layer within the OS to monitor itself also means we can leverage this intimacy to gather as much kernel context as needed without the hassle of trying to bridge the VM semantic gap [20].
Finally, through this thesis, we hope that we can fuel more thoughts and ideas about using tracing technologies for better policy enforcement and privilege management at the OS level. One possible implementation we are considering is using a processor emulator such as QEMU [16] that also does dynamic instruction translation. We could also make autoscopy more judicious and adaptive in its handling of kernel probes — for example, if the system performance drops below a certain level due to autoscopy, we could temporarily disable certain probes along critical paths in order to boost performance — or insert more probes if the system performance is exceeding anticipated thresholds.
We can achieve rigid security by building hardware primitives that complement our system by providing better memory guarantees for both the kernel code and user level applications. We have done some preliminary work in this direction in our proposal for a Better Mousetrap [17]. We then leverage our findings by building a low-overhead, rigorous detection system for rootkits intercepting kernel control flow.
Section A. Section A.
SWAtt: Software-based attestation for embedded devices.
Attacking and Defending Networked Embedded Devices.
Security and Privacy,
Locasto, Ashwin Ramaswamy, and Sean W.
Cantrill, Michael W. Shapiro, and Adam H. Dynamic Instrumentation of Production Systems.
Chen and B. When virtual is better than real [operating system relocation to virtual machines]. Hot Topics in Operating Systems, Proceedings of the Eighth Workshop on, May
Hofmeyr, Anil Somayaji, and Thomas A. A Sense of Self for Unix Processes.
A virtual machine introspection based architecture for intrusion detection. In Proc. Network and Distributed Systems Security Symposium,
PhD thesis, Georgia Institute of Technology,
Hofmeyr, Stephanie Forrest, and Anil Somayaji. Intrusion detection using Sequences of System Calls.
Designing BSD Rootkits. No Starch Press,
Michael and St.
Loscocco, Perry W. Wilson, J. Aaron Pendergrass, and C. Durward McDonell. Linux kernel Integrity Measurement using Contextual Inspection.
Probing the Guts of Kprobes.
Rootkit Revealer.
Nick L. Petroni and Michael Hicks.
Detecting kernel rootkits. In RAID,
System Virginity Verifier.
Introducing Stealth Malware Taxonomy. In SyScan,
Pioneer: verifying code integrity and enforcing untampered code execution on legacy systems.
To do this, the program block to be executed is first compiled for the target system and then called. The compilation takes place as just-in-time (JIT) compilation. JRE: The Java Runtime Environment is a package of class libraries, a class loader, and the JVM, which together let users run Java programs. JDK: The Java Development Kit, as its name suggests, is meant for developers and comes with all the packages needed to write and run Java programs and applets.
To check that everything is done correctly, open the command prompt and type: java -version.