Function Extraction Malware Detection Techniques

Student Name
Institutional Affiliation

FUNCTION EXTRACTION 2

Introduction
Malware is software designed with the deliberate intent of infiltrating or
damaging a computer system without the owner’s knowledge. Malware can appear as code,
active content (for example, in a browser), scripts, and other software. The numerous attacks made
by this malicious software pose a significant security threat to computer users. Hence, malware
detection is of prime importance if the attempt to secure a computer system is to be successful
(Zolkipli & Jantan, 2010). Malware attacks over the internet have increased significantly over
the last few years. This can be attributed to the financial gains that the makers of the malicious
software hope to obtain from infecting users’ PCs. Even more disconcerting, the numbers point to a
failure of the traditional methods of malware detection. Newer methods and mechanisms have to
be explored since the old ones do not suffice. Research has been done to identify improved
methods for detecting malware. One method that has recently become popular is Function
Extraction.
Malware and other potentially harmful software have a significant impact on the security,
reliability and privacy of a user’s computer system. Hackers hope for financial gain when they
infect vulnerable machines. Hence, rather than just vandalizing the client machine, the attackers
steal confidential or personal information. Moreover, spyware and other malware cause a
significant dip in performance as well as instability in infected computer systems. Thus, an
anti-malware program is usually installed to ensure endpoint security is in place. The
anti-malware engine is responsible for scanning for, detecting, and removing malicious
software before it infects a computer. Scanning involves monitoring critical components of a
computer system such as the registry, hard disk and main memory for any changes that might
point to an infection. Once the engine has identified a candidate for further examination, it
tries to determine whether malware is present. Candidate identification may result from a scan
or from an explicit request by the user. Most anti-malware programs refer to a frequently
updated list of known malware programs, called a blacklist, that contains signatures or
identifiable patterns of known malware. If a candidate matches a signature on the list, the file
or program is removed completely or placed in quarantine to await further action by the user.
Signature-based antimalware approach
Traditionally, anti-malware programs applied a signature-based detection method to
identify the presence of a malware infection or instance. If at least one byte pattern in a
candidate program matched an entry in the database of signatures of known malicious programs
in the blacklist, the candidate software was flagged as being malicious. The basic premise that
underlies this approach is that malware can be described through patterns or signatures (Dalla
Preda, Christodorescu, Jha & Debray, 2008). Signature-based detection is by far the most
commonly applied technique in anti-malware systems. However, the technique has certain
distinct disadvantages. For one, the approach is susceptible to evasion. Since the signature
byte patterns of previously discovered malware are commonly known, a hacker can apply simple
obfuscation techniques like code re-ordering and inserting no-ops. In this way, the byte pattern
of the malware is altered so that it no longer matches the signature, thus evading detection.
Similarly, the method is ineffective against zero-day attacks because signatures are
constructed from known malware. For this reason, signature-based scanners are unable to detect
unknown malware or even mere variants of known malware. Thus, without valid signatures, they
prove ineffective at the detection of polymorphic malware (Tang, Xiao & Lu, 2011). Moreover,
since a signature-based anti-malware program produces a separate signature for each malware
variant, the blacklist grows rapidly as new variants appear.
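The evasion problem described above can be illustrated with a minimal Python sketch. The signature name and the byte patterns are hypothetical and chosen only for illustration; real anti-malware engines use far richer signature formats than a plain substring search.

```python
from typing import List

# Hypothetical blacklist: signature name -> byte pattern (illustrative bytes only).
BLACKLIST = {
    "demo-pattern": bytes.fromhex("b801000000cd80"),
}


def scan(candidate: bytes) -> List[str]:
    """Return the names of all blacklist signatures found in the candidate."""
    return [name for name, pattern in BLACKLIST.items() if pattern in candidate]


# The original sample contains the signature bytes contiguously.
original = bytes.fromhex("b801000000cd80")

# The same bytes with a single no-op byte (0x90) inserted in the middle.
obfuscated = bytes.fromhex("b80100000090cd80")

print(scan(original))    # the signature matches
print(scan(obfuscated))  # the inserted no-op breaks the byte pattern
```

In this sketch one inserted byte is enough to defeat the match, which is why each variant ends up needing its own signature.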
Due to the limitations of signature-based blacklisting, a new approach to malware
detection, whitelisting, was advanced. Whitelisting is a modern and particularly useful technique
that can be applied to actively manage the software programs that can be installed on a computer.
In this approach, permission to install and run is granted only to a select number of
pre-approved programs. An attempt to install or run any software product that is not explicitly
on the whitelist is blocked. While this approach offers a better and more promising
alternative to blacklisting, it comes with its own set of disadvantages. First, whitelisting makes
for a very annoying user experience, as pop-ups constantly alert the user about rights and
permissions to run or install programs. Second, the approach creates a very rigid environment
with strict enforcement of rules on what programs can and cannot be installed. In effect,
whitelisting severely limits a user’s ability to download or use new software. Third, whitelisted
applications are not necessarily safe. Browsers, for example, are notorious for running active
content. Thus, if malware injects itself into active content running in a whitelisted browser, then
it will not be detected. This poses a significant security risk while using this approach.
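A whitelist check of the kind described above might be sketched as follows. This is a minimal illustration under stated assumptions: programs are identified by a SHA-256 digest of their contents, and the placeholder byte strings stand in for real binaries.

```python
import hashlib


def fingerprint(binary: bytes) -> str:
    """Identify a program by the SHA-256 digest of its contents."""
    return hashlib.sha256(binary).hexdigest()


# Hypothetical pre-approved program (a placeholder byte string, not a real binary).
APPROVED_BROWSER = b"approved browser binary"
WHITELIST = {fingerprint(APPROVED_BROWSER)}


def may_run(binary: bytes) -> bool:
    """Allow execution only if the program is explicitly on the whitelist."""
    return fingerprint(binary) in WHITELIST


print(may_run(APPROVED_BROWSER))          # True
print(may_run(b"newly downloaded tool"))  # False
```

The sketch also shows the third weakness: the check is applied to the browser binary only, so active content that the approved browser later loads is never passed to may_run.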
Behavior-based malware detection
In order to address the challenges of signature-based approaches to malware detection,
most modern anti-malware programs apply a behavior-based technique. Behavior-based
approaches monitor the behavior of a program to determine whether it is malicious. One key
advantage of this approach is that anti-malware programs that implement the behavior-based
method judge a program by what it does, observed from outside, rather than by what its code
looks like. If a program exhibits pre-defined malicious behavior, then it can be
identified as malware and removed. The reason many anti-malware and anti-virus programs are
adding behavior-based detection is that malware creators began using encrypted or
polymorphic code segments, which are exceedingly difficult to detect based on signatures. Under
such circumstances, watching for particular patterns of behavior makes it easier to identify the
malware. A program’s behavior is typically monitored through the stream of system calls that it
issues to the operating system. Since behavior-based methods monitor what a program does, they
are not susceptible to the shortcomings of signature-based detection techniques. For this reason,
behavior-based methods are often preferred over the alternatives. Several types of
behavior-based detection exist. The most common is anomaly detection.
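Matching a stream of system calls against a pre-defined malicious behavior can be sketched as below. The call names and the pattern are hypothetical, and a real monitor would work on actual operating-system call records rather than plain strings.

```python
from typing import Iterable, Sequence


def exhibits_behavior(trace: Iterable[str], pattern: Sequence[str]) -> bool:
    """Return True if the calls in pattern occur in the trace in the same order.

    Other calls may be interleaved between the steps of the pattern.
    """
    remaining = iter(trace)
    # `step in remaining` consumes the iterator up to the first match,
    # so each later step must be found after the previous one.
    return all(step in remaining for step in pattern)


# Hypothetical pre-defined malicious behavior: read keystrokes, then send data out.
KEYLOGGER_PATTERN = ["read_keyboard", "open_socket", "send"]

trace = ["open_file", "read_keyboard", "write_file", "open_socket", "send", "close"]
print(exhibits_behavior(trace, KEYLOGGER_PATTERN))  # True
print(exhibits_behavior(["open_file", "send", "read_keyboard"], KEYLOGGER_PATTERN))  # False
```

Because the check looks at what the program does, encrypting or re-ordering the code on disk does not change the result as long as the same calls are issued in the same order.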
Anomaly detection
In anomaly detection, a profile of typical program behavior is constructed, and any
deviations from this profile are flagged as anomalous and thus as possible malware activity.
Suppose a program never writes to a particular sensitive directory during its normal execution.
If, during monitoring, the anti-malware program notices the program writing to that sensitive
directory, the detection system flags the behavior as anomalous. The advantage of such
detection is that the malware’s execution can be interrupted before any serious harm is
done. However, anomaly detection has its shortcomings as well. The first is that it is susceptible
to false positives. Owing to the very complicated nature of some computer software, it becomes
very hard to construct a normal behavior model for the programs. Such inadequacies in the
making of the models can lead to false positives. Similarly, anomaly detection is susceptible to
mimicry attacks. An attacker who is aware of the typical execution model of a particular
software program running on the computer system can mask their malicious code to follow a
similar model. In this way, malicious software can be disguised so that it appears to follow a
standard behavior model.
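The directory example above can be sketched in a few lines of Python. This is a minimal illustration, assuming that behavior is recorded as (operation, directory) pairs and that the profile is simply the set of pairs seen during normal runs; the paths are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Event = Tuple[str, str]  # (operation, directory)


def build_profile(normal_runs: Iterable[Iterable[Event]]) -> Set[Event]:
    """Collect every event observed during known-good executions."""
    return {event for run in normal_runs for event in run}


def find_anomalies(profile: Set[Event], observed: Iterable[Event]) -> List[Event]:
    """Return the observed events that never appeared in the profile."""
    return [event for event in observed if event not in profile]


# Hypothetical training data: two normal runs of the same program.
normal_runs = [
    [("read", "/home/user/docs"), ("write", "/tmp")],
    [("read", "/home/user/docs"), ("read", "/etc")],
]
profile = build_profile(normal_runs)

# During monitoring, the program writes to a directory it normally only reads.
observed = [("read", "/home/user/docs"), ("write", "/etc")]
print(find_anomalies(profile, observed))  # [('write', '/etc')]
```

Both weaknesses are visible here. A legitimate but rare action that was missing from the training runs would be flagged as a false positive, and an attacker who restricts the malicious code to events already in the profile would not be flagged at all.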
Function Extraction
Owing to the way malware creators keep coming up with new ways to outsmart
prevention techniques, and to the distinct disadvantages of current anti-malware programs,
research has been ongoing to identify new methods of preventing malware infections. A recent
and emerging trend is the use of Function Extraction (FX). Function Extraction presents a novel
approach to software analysis. One area of prime application is the detection of malware before
it runs and infects a target machine. A common and well-regarded observation in the anti-malware
industry is that software with unknown behavior has unknown security. Therefore, it becomes
necessary to know the software’s behavior, but in a way that is unaffected by the limitations of
other types of behavior-based techniques like anomaly detection. To this end, researchers have
been able to make use of Function Extraction to identify the computed behavior of a software
program. Computed behavior is what a program does under any and all
circumstances, that is, the as-built functionality and specification of a program. The underlying
basis that makes all this possible is that computer programs are mathematical artifacts and can
thus be subjected to mathematical analysis. Through Function Extraction, malware detection can
be done with mathematical precision, which makes it a promising way of detecting and
eliminating malicious software.
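The idea of computed behavior can be illustrated with a toy Python sketch. It is not the FX algorithm itself: it handles only straight-line assignments of linear expressions, and the representation (a dictionary of coefficients over the initial values) is an assumption made for this example.

```python
from typing import Dict, List, Tuple

# A linear expression, e.g. {"x": 1, "y": -1} means x - y.
Linear = Dict[str, int]


def substitute(expr: Linear, state: Dict[str, Linear]) -> Linear:
    """Rewrite expr (over current variables) in terms of the initial values."""
    result: Linear = {}
    for var, coeff in expr.items():
        for init_var, init_coeff in state[var].items():
            result[init_var] = result.get(init_var, 0) + coeff * init_coeff
    # Drop terms that cancelled out.
    return {v: c for v, c in result.items() if c != 0}


def compute_behavior(program: List[Tuple[str, Linear]], variables: List[str]) -> Dict[str, Linear]:
    """Compose the assignments into the net effect on every variable."""
    state = {v: {v: 1} for v in variables}  # each variable starts as itself
    for target, expr in program:
        state[target] = substitute(expr, state)
    return state


program = [
    ("x", {"x": 1, "y": 1}),   # x := x + y
    ("y", {"x": 1, "y": -1}),  # y := x - y
    ("x", {"x": 1, "y": -1}),  # x := x - y
]
behavior = compute_behavior(program, ["x", "y"])
print(behavior)  # {'x': {'y': 1}, 'y': {'x': 1}}
```

Read line by line, the three assignments look like arithmetic; the computed behavior shows that, for every input, their net effect is to swap x and y. That is the sense in which behavior is derived for all circumstances rather than observed on particular runs.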
Practical Usage – Hyperion Cybersecurity Software
The most notable Function Extraction application in this newly emerging field is the
Hyperion Cybersecurity software. The Hyperion Cybersecurity System is based on Function
Extraction (FX) algorithms that were first developed by International Business Machines (IBM).
Computer scientists at IBM were the first to theorize about the possibility of mathematical
foundations for deriving program behavior (ORNL, 2016). Following the initial research
at IBM, the research area lay dormant for over two decades before scientists at Carnegie
Mellon took up the research and applied it in the CERT FX project to compute the behavior of
compiled binaries (Linger, 2016). In 2010, the project was further refined at the
high-performance computing facility at Oak Ridge National Laboratory (ORNL). While at ORNL, the
Function Extraction team was joined by experts from various government agencies. The first step
the newly constituted FX team took was software modernization of the research project,
beginning with the code base offered as sample code. Then, Automated Vulnerability Detection
was integrated into the project under the supervision of experts from the Department of Energy
(ORNL). Finally, an Ultrascale Verification of Security Properties project was carried out as part
of Internal Research and Development by the Function Extraction team (Linger, 2016).
Following these and other improvements, the project was marked ready for commercialization
by the FX team.
The Hyperion Cybersecurity program has some features which make it especially suited
for malware detection. The key properties of Hyperion include treating programs as rules
for mathematical functions for behavior-based detection of malware (Linger, 2016). The
security suite operates on semantics and not syntax, which makes it possible to detect
malicious software without running the program. Where programs make use of polymorphic
or encrypted code segments, Hyperion analyzes binaries to approach ground truth on the piece of
software code (Linger, 2016). Unlike other behavior-driven methods like anomaly detection, the
security software does not rely on heuristics. The mathematical precision offered by Function
Extraction makes it possible to detect malware and remove it without relying on surface patterns
in the code. Moreover, the behavior computation model it applies ensures that even if a hacker
breaks up the malicious code into segments and scatters them throughout a program, the
malicious code is still detected by the anti-malware program. The behavior computation model
stores the abstracted behavior as Behavior Specification Units (BSUs) that specify malicious
actions such as changing registry values, writing to system files and keylogging. In this way, a
handy look-up table that is hard for malicious software to compromise or exploit is made
available to the anti-malware program.
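The look-up idea can be sketched as follows. The source does not describe the internal format of a Behavior Specification Unit, so everything here is an assumption made for illustration: computed behavior is represented as a set of named net effects, and each catalog entry lists the effects that together make up one malicious behavior.

```python
from typing import Dict, FrozenSet, List

# Hypothetical catalog: behavior name -> net effects that constitute it.
BEHAVIOR_CATALOG: Dict[str, FrozenSet[str]] = {
    "keylogging": frozenset({"capture_keystrokes", "send_to_network"}),
    "registry_tampering": frozenset({"modify_registry_value"}),
}


def match_behaviors(computed_effects: FrozenSet[str]) -> List[str]:
    """Return catalog entries whose effects all appear in the computed behavior."""
    return sorted(
        name
        for name, effects in BEHAVIOR_CATALOG.items()
        if effects <= computed_effects
    )


# Hypothetical net effects computed for a program under analysis.
computed = frozenset(
    {"read_config", "capture_keystrokes", "draw_window", "send_to_network"}
)
print(match_behaviors(computed))  # ['keylogging']
```

Since the match is made on net effects rather than on where the instructions sit, scattering the responsible code across the program does not change the outcome in this sketch.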

How Hyperion Cybersecurity Software works
Hyperion Cybersecurity Software works principally by applying two mathematical
theorems: the structure theorem and the correctness theorem. The structure theorem is used to
transform unstructured spaghetti code into a more structured form. As applied, the structure
theorem defines a transformation that changes the complex logic found in a software
program into a function-equivalent structured form expressed as a sequence of If-Then-Else and
Do-While control structures (Linger, 2016). The importance of this theorem lies in identifying
malicious code that has been strewn all over a software program to hide it. The second step in the
conversion cycle applies the correctness theorem to ensure the code is automatically transformed
into a standardized form. The correctness theorem defines the mathematical transformations that
change the procedural logic expressed in a sequence of If-Then-Else
and Do-While control structures into behaviorally equivalent functional forms. A simplified
view of this process takes the software being analyzed, prepares it for transformation into a
structured form, performs a three-sequence computation on the code followed by an If-Then-Else
computation, then reduces the program and does a one-step computation. The model that is
derived from this calculation is stored in a Behavior Specification Unit that serves as a semantic
signature for malicious behavior, a functional specification for malware components, and
a behavior structure that humans can understand.
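The step from procedural logic to a functional form can be illustrated with a small Python sketch. It is not Hyperion's implementation: the helper names are hypothetical, program state is a plain dictionary, and the equivalence is checked on a sample of inputs rather than proved.

```python
from typing import Callable, Dict

State = Dict[str, int]
Behavior = Callable[[State], State]


def assign(var: str, expr: Callable[[State], int]) -> Behavior:
    """Behavior of an assignment: a new state with one variable updated."""
    return lambda state: {**state, var: expr(state)}


def sequence(*parts: Behavior) -> Behavior:
    """Behavior of a sequence: the composition of the behaviors of its parts."""
    def run(state: State) -> State:
        for part in parts:
            state = part(state)
        return state
    return run


def if_then_else(test: Callable[[State], bool],
                 then_part: Behavior, else_part: Behavior) -> Behavior:
    """Behavior of If-Then-Else: a case split on the test."""
    return lambda state: then_part(state) if test(state) else else_part(state)


# Procedural logic: t := x - y; if t < 0 then m := y else m := x
program = sequence(
    assign("t", lambda s: s["x"] - s["y"]),
    if_then_else(
        lambda s: s["t"] < 0,
        assign("m", lambda s: s["y"]),
        assign("m", lambda s: s["x"]),
    ),
)


def functional_form(state: State) -> State:
    """Candidate net behavior: t := x - y and m := max(x, y), all at once."""
    return {**state, "t": state["x"] - state["y"], "m": max(state["x"], state["y"])}


# Compare the two forms on a small sample of inputs (a check, not a proof).
samples = [{"x": x, "y": y, "t": 0, "m": 0} for x in range(-3, 4) for y in range(-3, 4)]
equivalent = all(program(s) == functional_form(s) for s in samples)
print(equivalent)  # True
```

The control structures disappear in the functional form, which states only the net effect on each variable. A form of that kind is what can be stored and compared as a semantic signature.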
Conclusion
Signature-based and behavior-based detection approaches each have their pros and cons.
From the properties and features of Hyperion, software behavior computation emerges as a key
way of analyzing computer programs, and specifically of detecting malicious software.
Function Extraction offers a handy alternative to the existing malware detection
technologies and approaches presently in use. While the technology has had very few test
cases and has not yet reached maturity, research data points to Hyperion having very useful
applications in malware detection and removal. The mathematical precision of the detection
method, as well as the ability to analyze binaries, makes the new method hard to exploit. In all, it
offers the most promising malware detection technology yet.

References

Dalla Preda, M., Christodorescu, M., Jha, S., & Debray, S. (2008). A semantics-based approach
to malware detection. ACM Transactions on Programming Languages and Systems,
30(5), 1-54.
Linger, R. (2016). The Hyperion System: Computing Software Behavior with Function
Extraction Technology (1st ed.). CSIIR Group. Retrieved from
https://buildsecurityin.us-cert.gov/sites/default/files/The%20Hyperion%20System%20Computing%20Software%20Behavior%20with%20Function%20Extraction%20Technology-Rick%20Linger.pdf
ORNL. (2016). Hyperion cyber security tech receives commercialization award | ORNL.
Ornl.gov. Retrieved 15 February 2016, from
https://www.ornl.gov/news/hyperion-cyber-security-tech-receives-commercialization-award
Tang, Y., Xiao, B., & Lu, X. (2011). Signature tree generation for polymorphic worms.
IEEE Transactions on Computers, 60(4), 565-579.
Zolkipli, M. F., & Jantan, A. (2010, September). Malware behavior analysis: Learning and
understanding current malware threats. In 2010 Second International Conference on Network
Applications, Protocols and Services (NETAPPS) (pp. 218-221). IEEE.
