Computer Security Exam
Ch03-BufferOverflow.pptx
Buffer Overflow Attacks
2009-01-28
Operating Systems: Basic Concepts
CS 166
What is an Exploit?
An exploit is any input (i.e., a piece of software, an argument string, or sequence of commands) that takes advantage of a bug, glitch or vulnerability in order to cause an attack
An attack is an unintended or unanticipated behavior that occurs on computer software, hardware, or something electronic and that brings an advantage to the attacker
10/13/10
Buffer Overflow
An exploit is not necessarily a program. It can be a program that sends bad input to a vulnerable piece of software, but it can also be just the bad input itself. Any bad input (or even valid input that the developer failed to anticipate) can cause the vulnerable application to behave improperly.
Buffer Overflow Attack
One of the most common OS bugs is a buffer overflow
The developer fails to include code that checks whether an input string fits into its buffer array
An input to the running process exceeds the length of the buffer
The input string overwrites a portion of the memory of the process
Causes the application to behave improperly and unexpectedly
Effect of a buffer overflow
The process can operate on malicious data or execute malicious code passed in by the attacker
If the process is executed as root, the malicious code will be executing with root privileges
Because of the nature of the address space, locally declared buffers are allocated on the stack
Since the stack grows downward, writing past the end of a buffer corrupts the content of the rest of the stack. If enough is known about the program, an attacker can overwrite saved register values and the return address.
Address Space
Every program needs to access memory in order to run
For simplicity's sake, it would be nice to allow each process (i.e., each executing program) to act as if it owns all of memory
The address space model is used to accomplish this
Each process can allocate space anywhere it wants in memory
Most kernels manage each process’ allocation of memory through the virtual memory model
How the memory is managed is irrelevant to the process
This is also consistent with the process model proposed earlier, where each process feels like it “owns” the machine. The size of the address space is machine dependent. Until the Intel 386 came around, most address spaces were 16-bit. For most of the past 15 years, we have been using 32-bit machines, though an increasing number of processors with 64-bit modes are making their way into people’s computers.
Virtual Memory
Mapping virtual addresses to real addresses
[Figure: mapping between the memory the program sees and actual memory, which also holds another program and is backed by the hard drive]
Unix Address Space
Text: machine code of the program, compiled from the source code
Data: static program variables initialized in the source code prior to execution
BSS (block started by symbol): static variables that are uninitialized
Heap : data dynamically generated during the execution of a process
Stack: structure that grows downwards and keeps track of the activated method calls, their arguments and local variables
[Figure: Unix address space layout, from low addresses (0x0000 0000) to high addresses (0xFFFF FFFF): Text, Data, BSS, Heap, Stack]
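The segments listed above can be observed from inside a running C program. The following is a minimal sketch, assuming a typical Unix-like system. The names `segment_addresses`, `collect_segment_addresses`, and `print_segment_addresses` are hypothetical, and the actual addresses (and even their relative order) vary by platform and with address space layout randomization.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int initialized_global = 42;   /* Data segment: initialized static variable */
int uninitialized_global;      /* BSS segment: uninitialized static variable */

/* Hypothetical helper type: one sample address per segment. */
struct segment_addresses {
    uintptr_t text, data, bss, heap, stack;
};

/* Record one address from each segment. Returns 0 on success, -1 if malloc fails. */
int collect_segment_addresses(struct segment_addresses *out)
{
    int local = 0;                           /* Stack: local variable */
    int *dynamic = malloc(sizeof *dynamic);  /* Heap: dynamic allocation */

    assert(out != NULL);
    if (dynamic == NULL)
        return -1;

    out->text  = (uintptr_t)&collect_segment_addresses;  /* machine code */
    out->data  = (uintptr_t)&initialized_global;
    out->bss   = (uintptr_t)&uninitialized_global;
    out->heap  = (uintptr_t)dynamic;
    out->stack = (uintptr_t)&local;

    free(dynamic);
    return 0;
}

/* Print the sample addresses, one segment per line. */
void print_segment_addresses(const struct segment_addresses *a)
{
    printf("text  %p\n", (void *)a->text);
    printf("data  %p\n", (void *)a->data);
    printf("bss   %p\n", (void *)a->bss);
    printf("heap  %p\n", (void *)a->heap);
    printf("stack %p\n", (void *)a->stack);
}
```

Calling `collect_segment_addresses` and then `print_segment_addresses` from `main` prints one address per segment; on many systems the stack address is the largest, but that is not guaranteed.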
Vulnerabilities and Attack Method
Vulnerability scenarios
The program has root privileges (setuid) and is launched from a shell
The program is part of a web application
Typical attack method
Find vulnerability
Reverse engineer the program
Build the exploit
Buffer Overflow Attack in a Nutshell
First described in
Aleph One. Smashing The Stack For Fun And Profit. e-zine www.Phrack.org #49, 1996
The attacker exploits an unchecked buffer to perform a buffer overflow attack
The ultimate goal for the attacker is to obtain a shell that allows the attacker to execute arbitrary commands with high privileges
Kinds of buffer overflow attacks:
Heap smashing
Stack smashing
Buffer Overflow
Retrieves domain registration info
e.g., domain brown.edu
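domain.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Intentionally vulnerable: no input length is checked. */
int main(int argc, char *argv[])
{
    /* get user_input */
    char var1[15];
    char command[20];
    strcpy(command, "whois ");
    strcat(command, argv[1]);   /* overflows command if argv[1] has more than 13 characters */
    strcpy(var1, argv[1]);      /* overflows var1 if argv[1] has more than 14 characters */
    printf(var1);
    system(command);
    return 0;
}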
[Figure: stack layout for domain.c between the top of memory (0xFFFFFFFF) and the bottom of memory (0x00000000), showing the stack fill direction and the buffers var1 (15 char) and command (20 char)]
strcpy() Vulnerability
argv[1] is the user input
strcpy(dest, src) does not check whether src fits in the dest buffer
strcat(d, s) concatenates strings
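One way to avoid the unchecked strcpy() and strcat() calls is to build the command with a bounded formatting call and reject input that does not fit. The following is a minimal sketch, not a fix taken from the slides; `build_whois_command` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "whois <domain>" into out, which holds out_size bytes.
 * Returns 0 on success, -1 if the result would not fit.
 * build_whois_command is a hypothetical name, not part of domain.c.
 */
int build_whois_command(char *out, size_t out_size, const char *domain)
{
    int written;

    assert(out != NULL && domain != NULL);
    /* snprintf never writes more than out_size bytes, including the '\0'. */
    written = snprintf(out, out_size, "whois %s", domain);
    if (written < 0 || (size_t)written >= out_size)
        return -1;   /* encoding error or truncated output: reject the input */
    return 0;
}
```

With `char command[20]`, the call `build_whois_command(command, sizeof command, argv[1])` accepts brown.edu and rejects any argument longer than 13 characters. This addresses only the overflow; passing user input to system() raises separate concerns.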
[Figure: the same stack during the overflow exploit. Labels: var1 (15 char), command (20 char), argv[1] (15 char), argv[1] (20 char), overflow, stack fill direction, top of memory (0xFFFFFFFF), bottom of memory (0x00000000)]
strcpy() vs. strncpy()
Function strcpy() copies the string in the second argument into the first argument
e.g., strcpy(dest, src)
If the source string is longer than the destination buffer, the overflow characters may occupy the memory space used by other variables
The null character is appended at the end automatically
Function strncpy() copies at most n characters of the string, where n is given as the third argument
e.g., for a buffer dest of size bytes: strncpy(dest, src, size - 1); dest[size - 1] = '\0'
If the source string is longer than n characters, the characters beyond the first n are not copied
If the source string has n or more characters, no null character is written, so you have to place it manually
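The strncpy() pattern described above can be wrapped in a small helper so that the terminator is never forgotten. This is a minimal sketch; `copy_bounded` is a hypothetical name, and silently truncating is only one possible policy (rejecting over-long input is another).

```c
#include <assert.h>
#include <string.h>

/*
 * Copy src into dest, which holds dest_size bytes, truncating if needed.
 * Returns 1 if src was truncated, 0 otherwise.
 * copy_bounded is a hypothetical helper name.
 */
int copy_bounded(char *dest, size_t dest_size, const char *src)
{
    assert(dest != NULL && src != NULL && dest_size > 0);
    /* Copy at most dest_size - 1 characters, leaving room for '\0'. */
    strncpy(dest, src, dest_size - 1);
    /* strncpy does not terminate dest when src is too long, so do it here. */
    dest[dest_size - 1] = '\0';
    return strlen(src) >= dest_size;
}
```

For the 15-byte var1 buffer, `copy_bounded(var1, sizeof var1, argv[1])` keeps at most 14 characters of the argument and reports whether anything was cut off.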
Return Address Smashing
The Unix fingerd daemon, which ran as root (it needed to access sensitive files), used to be vulnerable to buffer overflow
The attacker writes malicious code into the buffer and overwrites the return address to point to the malicious code
When the function returns, execution jumps to the malicious code, which runs with the full rights and privileges of root
void fingerd (…) {
char buf[80];
…
gets(buf);
…
}
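[Figure: stack before and after the attack. Labels: previous frames; current frame with f() arguments, return address, local variables, and buffer; program code with EIP at the next location. After the overflow, the attacker’s input (padding and malicious code) fills the buffer and overwrites the return address, so EIP ends up at the malicious code]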
The fragment of C code for fingerd above shows the problem
A local array buf[80] is declared and allocated on the stack, but the function gets does no bounds checking, which makes a buffer overflow possible.
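A bounded read avoids the problem described above. The following is a minimal sketch of reading a request with fgets, which takes the buffer size as an argument; `read_request` is a hypothetical name and is not taken from the actual fingerd source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read one line from in into buf, which holds size bytes.
 * fgets stops after size - 1 characters, so input longer than the
 * buffer cannot overwrite the rest of the stack frame.
 * Returns 0 on success, -1 on end of file or error.
 * read_request is a hypothetical name.
 */
int read_request(char *buf, size_t size, FILE *in)
{
    assert(buf != NULL && in != NULL && size > 0);
    if (fgets(buf, (int)size, in) == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline, if any */
    return 0;
}
```

With `char buf[80]`, the call `read_request(buf, sizeof buf, stdin)` stores at most 79 characters; the rest of an over-long line stays in the input stream instead of spilling onto the stack.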
Unix Shell Command Substitution
The Unix shell enables a command argument to be obtained from the standard output of another command
This feature is called command substitution
When parsing the command line, the shell replaces a command between back quotes with the output of that command
Example:
File name.txt contains string farasi
The following two commands are equivalent
finger `cat name.txt`
finger farasi
Shellcode Injection
An exploit that takes control of the attacked computer typically injects code to “spawn a shell”; this code is called “shellcode”
A shellcode is:
Code assembled in the CPU’s native instruction set (e.g., x86, x86-64, ARM, SPARC, or other RISC architectures)
Injected as a part of the buffer that is overflowed.
We inject the code directly into the buffer that we send for the attack
A buffer containing shellcode is a “payload”
Now comes the question of injecting our own code to be executed. We inject the code directly into the buffer that we send for the attack.
Buffer Overflow Mitigation
We know how a buffer overflow happens, but why does it happen?
This problem could not occur in Java; it is a C problem
In Java, objects are allocated dynamically on the heap (only primitive values such as ints are not)
Also cannot do pointer arithmetic in Java
In C, however, you can declare things directly on the stack
One solution is to make the buffer dynamically allocated
Another (OS) problem is that fingerd had to run as root
Just get rid of fingerd’s need for root access (solution eventually used)
The program needed access to a file that had sensitive information in it
A new world-readable file was created with the information required by fingerd
Why doesn’t gets do a bounds check, and why does the operating system allow writing beyond the array bounds?
In Java, you can’t just overwrite the stack because you don’t know where the stack is!
In Java, a program cannot access arbitrary memory directly, since the language lacks pointer arithmetic
Stack-based buffer overflow detection using a random canary
The canary is placed in the stack prior to the return address, so that any attempt to overwrite the return address also overwrites the canary.
[Figure: two stack configurations
Normal (safe) stack configuration: Buffer, Other local variables, Canary (random), Return address, Other data
Buffer overflow attack attempt: Buffer, Overflow data, Corrupt return address, Attack code; the canary is overwritten (marked x)]
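The canary check can be illustrated without touching a real stack. The following is a minimal sketch that models a frame as a byte array laid out as buffer, canary, return address. All names and sizes are hypothetical, the canary value is passed in by the caller rather than generated randomly, and real compilers implement this check in generated code rather than in a library like this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BUF_SIZE = 16, CANARY_SIZE = 4, RET_SIZE = 4,
       FRAME_SIZE = BUF_SIZE + CANARY_SIZE + RET_SIZE };

/* Simulated stack frame: [ buffer | canary | return address ]. */
struct sim_frame {
    unsigned char bytes[FRAME_SIZE];
};

/* Place the canary between the buffer and the simulated return address. */
void frame_init(struct sim_frame *f, const unsigned char canary[CANARY_SIZE],
                const unsigned char ret[RET_SIZE])
{
    assert(f != NULL);
    memset(f->bytes, 0, BUF_SIZE);
    memcpy(f->bytes + BUF_SIZE, canary, CANARY_SIZE);
    memcpy(f->bytes + BUF_SIZE + CANARY_SIZE, ret, RET_SIZE);
}

/*
 * Model an unchecked copy into the buffer: the write starts at the buffer
 * and may run past its end. It is clamped to FRAME_SIZE so that this
 * sketch itself never writes out of bounds.
 */
void frame_unchecked_write(struct sim_frame *f, const unsigned char *input,
                           size_t len)
{
    if (len > FRAME_SIZE)
        len = FRAME_SIZE;
    memcpy(f->bytes, input, len);
}

/* Check performed before "returning": 1 if the canary is intact, 0 if not. */
int frame_canary_intact(const struct sim_frame *f,
                        const unsigned char canary[CANARY_SIZE])
{
    return memcmp(f->bytes + BUF_SIZE, canary, CANARY_SIZE) == 0;
}
```

An input of BUF_SIZE bytes leaves the canary intact, while any input long enough to reach the simulated return address must first overwrite the canary, so `frame_canary_intact` reports the corruption.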
Ch03-OS.pptx
Operating Systems Concepts
10/13/10
Introduction
A Computer Model
An operating system has to deal with the fact that a computer is made up of a CPU, random access memory (RAM), input/output (I/O) devices, and long-term storage.
[Figure: a computer model showing the CPU, RAM (cells addressed 0, 1, 2, ..., 9, ...), the disk drive, and I/O]
OS Concepts
An operating system (OS) provides the interface between the users of a computer and that computer’s hardware.
An operating system manages the ways applications access the resources in a computer, including its disk drives, CPU, main memory, input devices, output devices, and network interfaces.
An operating system manages multiple users.
An operating system manages multiple programs.
Multitasking
Give each running program a “slice” of the CPU’s time.
The CPU is running so fast that to any user it appears that the computer is running all the programs simultaneously.
Public domain image from http://commons.wikimedia.org/wiki/File:Chapters_meeting_2009_Liam_juggling.JPG
The Kernel
The kernel is the core component of the operating system. It handles the management of low-level hardware resources, including memory, processors, and input/output (I/O) devices, such as a keyboard, mouse, or video display.
Most operating systems define the tasks associated with the kernel in terms of a layer metaphor, with the hardware components, such as the CPU, memory, and input/output devices being on the bottom, and users and applications being on the top.
[Figure: layers of a computer system, top to bottom: User Applications; Non-essential OS Applications; The OS Kernel; CPU, Memory, Input/Output. Side labels: Userland, Operating System, Hardware]
Input/Output
The input/output devices of a computer include things like its keyboard, mouse, video display, and network card, as well as other more optional devices, like a scanner, Wi-Fi interface, video camera, USB ports, etc.
Each such device is represented in an operating system using a device driver, which encapsulates the details of how interaction with that device should be done.
The application programmer interface (API), which the device drivers present to application programs, allows those programs to interact with those devices at a fairly high level, while the operating system does the “heavy lifting” of performing the low-level interactions that make such devices actually work.
System Calls
User applications don’t communicate directly with low-level hardware components, and instead delegate such tasks to the kernel via system calls.
System calls are usually contained in a collection of programs, that is, a library such as the C library (libc), and they provide an interface that allows applications to use a predefined series of APIs that define the functions for communicating with the kernel.
Examples of system calls include those for performing file I/O (open, close, read, write) and running application programs (exec).
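The file I/O system calls named above are normally reached through their libc wrappers. The following is a minimal sketch, assuming a POSIX system; `roundtrip_file` is a hypothetical name chosen for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write msg to the file at path, then read it back into out (out_size
 * bytes), using the open, write, read, and close system calls through
 * their libc wrappers. Returns the number of bytes read back, or -1 on
 * error. roundtrip_file is a hypothetical name.
 */
ssize_t roundtrip_file(const char *path, const char *msg,
                       char *out, size_t out_size)
{
    ssize_t n;
    size_t len = strlen(msg);
    int fd;

    assert(out != NULL && out_size > 0);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    n = write(fd, msg, len);          /* kernel copies the bytes to the file */
    close(fd);
    if (n < 0 || (size_t)n != len)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    n = read(fd, out, out_size - 1);  /* leave room for the terminator */
    close(fd);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return n;
}
```

Each call crosses from the application into the kernel, which performs the actual device access on the program's behalf.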
Processes
A process is an instance of a program that is currently executing.
The actual contents of all programs are initially stored in persistent storage, such as a hard drive.
In order to be executed, a program must be loaded into random-access memory (RAM) and uniquely identified as a process.
In this way, multiple copies of the same program can be run as different processes.
For example, we can have multiple copies of MS PowerPoint open at the same time.
Process IDs
Each process running on a given computer is identified by a unique nonnegative integer, called the process ID (PID).
Given the PID for a process, we can then associate its CPU time, memory usage, user ID (UID), program name, etc.
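On a POSIX system, a program can inspect its own PID with getpid and obtain a new process with its own PID through fork. The following is a minimal sketch under that assumption; `spawn_and_reap` is a hypothetical name.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child process, wait for it to exit, and return its PID
 * (or -1 on error). spawn_and_reap is a hypothetical name.
 */
pid_t spawn_and_reap(void)
{
    pid_t child = fork();   /* second process running the same program */

    if (child < 0)
        return -1;
    if (child == 0)
        _exit(0);           /* child: exit immediately */

    /* Parent: fork returned the child's PID, which differs from our own. */
    assert(child != getpid());
    if (waitpid(child, NULL, 0) != child)
        return -1;
    return child;
}
```

The parent and child run the same program yet are distinct processes, which is why the PID returned by `spawn_and_reap` never equals the caller's `getpid()`.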
File Systems
A filesystem is an abstraction of how the external, nonvolatile memory of the computer is organized.
Operating systems typically organize files hierarchically into folders, also called directories.
Each folder may contain files and/or subfolders.
Thus, a volume, or drive, consists of a collection of nested folders that form a tree.
The topmost folder is the root of this tree and is also called the root folder.
File System Example
File Permissions
File permissions are checked by the operating system to determine if a file is readable, writable, or executable by a user or group of users.
In Unix-like operating systems, a file permission matrix shows who is allowed to do what to the file.
Files have owner permissions, which show what the owner can do; group permissions, which show what some group ID can do; and world permissions, which give default access rights.
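The owner, group, and world permissions are stored as bits in a file's mode. The following is a minimal sketch, assuming the POSIX mode constants from sys/stat.h, that renders those bits as the nine-character string shown by ls -l; `permissions_to_string` is a hypothetical name.

```c
#include <assert.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Render the owner, group, and world permission bits of mode as the
 * nine-character string shown by ls -l, e.g. "rwxr-xr--".
 * out must hold at least 10 bytes.
 * permissions_to_string is a hypothetical name.
 */
void permissions_to_string(mode_t mode, char out[10])
{
    static const mode_t bits[9] = {
        S_IRUSR, S_IWUSR, S_IXUSR,   /* owner */
        S_IRGRP, S_IWGRP, S_IXGRP,   /* group */
        S_IROTH, S_IWOTH, S_IXOTH    /* world (others) */
    };
    static const char symbols[] = "rwxrwxrwx";
    int i;

    assert(out != NULL);
    memset(out, '-', 9);             /* start with no permissions granted */
    for (i = 0; i < 9; i++) {
        if (mode & bits[i])
            out[i] = symbols[i];
    }
    out[9] = '\0';
}
```

For a real file, the mode can be obtained with stat and passed in as `st.st_mode`; the helper ignores the file-type bits and looks only at the nine permission bits.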
Memory Management
The RAM memory of a computer is its address space.
It contains the code for the running program, its input data, and its working memory.
For any running process, it is organized into different segments, which keep the different parts of the address space separate.
As we will discuss, security concerns require that we never mix up these different segments.
Memory Organization
Text. This segment contains the actual (binary) machine code of the program.
Data. This segment contains static program variables that have been initialized in the program code.
BSS. This segment, which is named for an antiquated acronym for block started by symbol, contains static variables that are uninitialized.
Heap. This segment, which is also known as the dynamic segment, stores data generated during the execution of a process.
Stack. This segment houses a stack data structure that grows downwards and is used for keeping track of the call structure of subroutines (e.g., methods in Java and functions in C) and their arguments.
Memory Layout
Virtual Memory
There is generally not enough computer memory for the address spaces of all running processes.
Nevertheless, the OS gives each running process the illusion that it has access to its complete (contiguous) address space.
In reality, this view is virtual, in that the OS supports this view, but it is not really how the memory is organized.
Instead, memory is divided into pages, and the OS keeps track of which ones are in memory and which ones are stored out to disk.
Page Faults
1. Process requests virtual address not in memory, causing a page fault.
2. Paging supervisor pages out an old block of RAM memory.
3. Paging supervisor locates requested block on the disk and brings it into RAM memory.
[Figure: a process issues “read 0110101”; the paging supervisor answers “Page fault, let me fix that.” and swaps an old block in RAM memory for a new block from the external disk]
Virtual Machines
Virtual machine: A view that an OS presents that a process is running on a specific architecture and OS, when really it is something else. E.g., a Windows emulator on a Mac.
Benefits:
Hardware Efficiency
Portability
Security
Management
Public domain image from http://commons.wikimedia.org/wiki/File:VMM-Type2.JPG
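[Figures: memory layout, top to bottom: Stack, Dynamic, BSS, Data, Text; virtual memory: what the program sees versus actual memory, which also holds another program and is backed by the hard drive]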
Ch04-Malware.pptx
Malware: Malicious Software
10/21/2010
Malware
2009-02-02
CS 166 - Malware
Viruses, Worms, Trojans, Rootkits
Malware can be classified into several categories, depending on propagation and concealment
Propagation
Virus: human-assisted propagation (e.g., open email attachment)
Worm: automatic propagation without human assistance
Concealment
Rootkit: modifies operating system to hide its existence
Trojan: provides desirable functionality but hides malicious operation
Various types of payloads, ranging from annoyance to crime
Name derives from the wooden horse left by the Greeks at the gates of Troy during the siege of Troy
A Trojan horse program intentionally hides malicious activity while pretending to be something else
Usually described as innocuous-looking software, or software delivered through innocuous means, that allows an attacker to take control of systems
Trojan horse programs do not replicate themselves
Sometimes passed on using commonly passed executables, things like jokes forwarded by e-mail
Sometimes marketed/distributed as “remote administration tool”
Often combined with rootkits to disguise activity and remote access
Popularized to an extent by software like Cult of the Dead Cow’s Back Orifice, offered as a free download for running “remote administration” tasks or playing spooky jokes on friends
The line between user-launched worms and Trojans is highly blurred, with many user-launched worms behaving in a manner similar to Trojans.
Trojans are by definition malicious. The classic movie/television exploit of remotely opening disk drives is a definite symptom of being infected by a Trojan.
Trojans have lately begun using many of the same defense mechanisms used by viruses; there are known Trojans that use Windows Script Host (WSH) to run.
To detect infected computers, attackers often use so-called sweep lists, which are lists of IP addresses known to be online. One popular way of building them is to monitor IRC chat rooms and use the IP addresses of participants in these rooms.
Payload examples
perform amusing or annoying pranks
destroy/corrupt files and applications
monitor and transmit user activity (spyware, logger)
install backdoor (makes the infected computer a zombie)
email spam
launch denial-of-service attack
alter browser settings to display ads
dial out international or 900 numbers (dialer)
Insider Attacks
An insider attack is a security breach that is caused or facilitated by someone who is a part of the very organization that controls or builds the asset that should be protected.
In the case of malware, an insider attack refers to a security hole that is created in a software system by one of its programmers.
Backdoors
A backdoor, which is also sometimes called a trapdoor, is a hidden feature or command in a program that allows a user to perform actions he or she would not normally be allowed to do.
When used in a normal way, this program performs completely as expected and advertised.
But if the hidden feature is activated, the program does something unexpected, often in violation of security policies, such as performing a privilege escalation.
Benign example: Easter Eggs in DVDs and software
Logic Bombs
A logic bomb is a program that performs a malicious action as a result of a certain logic condition.
The classic example of a logic bomb is a programmer coding up the software for the payroll system who puts in code that makes the program crash should it ever process two consecutive payrolls without paying him.
Another classic example combines a logic bomb with a backdoor, where a programmer puts in a logic bomb that will crash the program on a certain date.
The Omega Engineering Logic Bomb
An example of a logic bomb that was actually triggered and caused damage is one that programmer Tim Lloyd was convicted of using on his former employer, Omega Engineering Corporation. On July 31, 1996, a logic bomb was triggered on the server for Omega Engineering’s manufacturing operations, which ultimately cost the company millions of dollars in damages and led to it laying off many of its employees.
The Omega Bomb Code
The logic behind the Omega Engineering time bomb included the following strings:
7/30/96                 Event that triggered the bomb
F:                      Focuses attention on volume F, which had critical files
F:\LOGIN\LOGIN 12345    Logs in a fictitious user, 12345 (the back door)
CD \PUBLIC              Moves to the public folder of programs
FIX.EXE /Y F:\*.*       Runs a program, called FIX, which actually deletes everything
PURGE F:\/ALL           Prevents recovery of the deleted files
Defenses against Insider Attacks
Avoid single points of failure.
Use code walk-throughs.
Use archiving and reporting tools.
Limit authority and permissions.
Physically secure critical systems.
Monitor employee behavior.
Control software installations.
Computer Viruses
A computer virus is computer code that can replicate itself by modifying other files or programs to insert code that is capable of further replication.
This self-replication property is what distinguishes computer viruses from other kinds of malware, such as logic bombs.
Another distinguishing property of a virus is that replication requires some type of user assistance, such as clicking on an email attachment or sharing a USB drive.
Biological Analogy
Computer viruses share some properties with biological viruses
[Figure: stages of a biological virus infection: attack, penetration, replication and assembly, release]
Early History
1972 sci-fi novel “When HARLIE Was One” features a program called VIRUS that reproduces itself
First academic use of term virus by PhD student Fred Cohen in 1984, who credits advisor Len Adleman with coining it
In 1982, high-school student Rich Skrenta wrote first virus released in the wild: Elk Cloner, a boot sector virus
(c)Brain, by Basit and Amjad Farooq Alvi in 1986, credited with being the first virus to infect PCs
Much of the macro classification carries over from viruses; worms based on macro capabilities of programs are programmed in much the same way as viruses, with minor differences
Primary classification has often been based on whether a worm relies on e-mail or on IRC, ICQ, or AIM.
Through much of the mid-90s, IRC was a popular target, and worms were often combined with Trojans to allow for remotely controlling systems
Examples include IRC.Worm.Ceyda and IRC.Worm.Whacked, the latter of which is also a Trojan
Along with the growth in instant messaging, popular IM clients have been targeted by worms
There are known worms targeting AIM (W32.AimVen.Worm), MSN (W32.Kelvir and variants), ICQ (W32.Bizex), Yahoo Messenger (W32.Hawawi) and pretty much every other popular IM network
P2P networks have been targeted of late, with W32.Hawawi and others spreading through Kazaa
E-mail, exploited indirectly by the Morris Worm, continues to be a popular propagation method, with worms like W97M.Melissa and W32.Navidad relying on MAPI to provide them with an easy way to e-mail themselves out.
Virus Phases
Dormant phase. During this phase, the virus just exists: it is lying low and avoiding detection.
Propagation phase. During this phase, the virus is replicating itself, infecting new files on new systems.
Triggering phase. In this phase, some logical condition causes the virus to move from a dormant or propagation phase to perform its intended action.
Action phase. In this phase, the virus performs the malicious action that it was designed to perform, called payload.
This action could include something seemingly innocent, like displaying a silly picture on a computer’s screen, or something quite malicious, such as deleting all essential files on the hard drive.
Infection Types
Overwriting
Destroys original code
Pre-pending
Keeps original code, possibly compressed
Infection of libraries
Allows virus to be memory resident
E.g., kernel32.dll
Macro viruses
Infects MS Office documents
Often installs in main document template
[Figure: infected file layouts, with labels: virus, compressed, original code]
Resident viruses continue running after executing the infected file
Modified system calls
Modified DLLs
Non-resident viruses
Resident viruses are more common than non-resident viruses, and essentially latch onto system calls, DLLs and the like, and stay resident, affecting every program run subsequent to them being introduced into memory.
Non-resident viruses are executed every time an infected file is executed
All Windows DLLs have an export table listing the functions provided and their addresses
A virus can hook onto a DLL
Fairly easy for viruses using DLLs to get memory resident
kernel32.dll is a collection of core Windows API calls (system calls) that is imported by most applications
Most viruses relying on patching DLLs usually attack kernel32.dll
For instance W32.Kriz will attack any PE executable, and also kernel32.dll to get a hook on system calls
Hooking system calls may be done by legitimate programs, such as Regmon (a registry monitoring utility)
Viruses hook onto DLLs either by changing their exported symbol table so as to call malicious code, or by adding malicious code to the DLL.
Degrees of Complication
Viruses have various degrees of complication in how they can insert themselves in computer code.
Concealment
Encrypted virus
Decryption engine + encrypted body
Randomly generate encryption key
Detection looks for decryption engine
Polymorphic virus
Encrypted virus with random variations of the decryption engine (e.g., padding code)
Detection using CPU emulator
Metamorphic virus
Different virus bodies
Approaches include code permutation and instruction replacement
Challenging to detect
Computer Worms
A computer worm is a malware program that spreads copies of itself without the need to inject itself in other programs, and usually without human interaction.
Thus, computer worms are technically not computer viruses (since they don’t infect other programs), but some people nevertheless confuse the terms, since both spread by self-replication.
In most cases, a computer worm will carry a malicious payload, such as deleting files or installing a backdoor.
Early History
First worms built in the labs of John Shoch and Jon Hupp at Xerox PARC in the early 80s
CHRISTMA EXEC, written in REXX, released in December 1987, and targeting IBM VM/CMS systems, was the first worm to use e-mail service
The first internet worm was the Morris Worm, written by Cornell student Robert Tappan Morris and released on November 2, 1988
Worm Development
Identify vulnerability still unpatched
Write code for
Exploit of vulnerability
Generation of target list
Random hosts on the internet
Hosts on LAN
Divide-and-conquer
Installation and execution of payload
Querying/reporting if a host is infected
Initial deployment on botnet
Worm template
Generate target list
For each host on target list
Check if infected
Check if vulnerable
Infect
Recur
Distributed graph search algorithm
Forward edges: infection
Back edges: already infected or not vulnerable
Worm Propagation
Worms propagate by finding and infecting vulnerable hosts.
They need a way to tell if a host is vulnerable.
They need a way to tell if a host is already infected.
[Figure: worm propagation starting from an initial infection]
Propagation: Theory
Classic epidemic model
N: total number of vulnerable hosts
I(t): number of infected hosts at time t
S(t): number of susceptible hosts at time t
I(t) + S(t) = N
b: infection rate
Differential equation for I(t):
dI/dt = bI(t) S(t)
More accurate models adjust propagation rate over time
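The differential equation above can be integrated numerically to see the characteristic S-shaped infection curve. The following is a minimal sketch using forward Euler steps; `simulate_epidemic` is a hypothetical name, and the step size dt and the parameter values a caller picks are assumptions, not figures from the cited paper.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Integrate dI/dt = b * I(t) * S(t), with S(t) = N - I(t), using forward
 * Euler steps of size dt. Stores I(0), I(dt), ..., I(steps * dt) in out,
 * which must hold steps + 1 values.
 * simulate_epidemic is a hypothetical name.
 */
void simulate_epidemic(double n, double i0, double b, double dt,
                       size_t steps, double *out)
{
    double infected = i0;
    size_t k;

    assert(out != NULL && n > 0.0 && i0 >= 0.0 && i0 <= n);
    out[0] = infected;
    for (k = 1; k <= steps; k++) {
        double susceptible = n - infected;            /* S(t) = N - I(t) */
        infected += dt * b * infected * susceptible;  /* Euler step */
        out[k] = infected;
    }
}
```

For example, with N = 1000, I(0) = 1, b = 0.001, and dt = 0.1, the number of infected hosts grows slowly at first, then nearly exponentially, and finally levels off as it approaches N because few susceptible hosts remain.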
Source:
Cliff C. Zou, Weibo Gong, Don Towsley, and Lixin Gao. The Monitoring and Early Detection of Internet Worms, IEEE/ACM Transactions on Networking, 2005.
Propagation: Practice
Cumulative total of unique IP addresses infected by the first outbreak of Code-RedI v2 on July 19-20, 2001
Source:
David Moore, Colleen Shannon, and Jeffery Brown. Code-Red: a case study on the spread and victims of an Internet worm, CAIDA, 2002
Trojan Horses
A Trojan horse (or Trojan) is a malware program that appears to perform some useful task, but which also does something with negative consequences (e.g., launches a keylogger).
Trojan horses can be installed as part of the payload of other malware but are often installed by a user or administrator, either deliberately or accidentally.
Current Trends
Trojans currently have largest infection potential
Often exploit browser vulnerabilities
Typically used to download other malware in multi-stage attacks
Source:
Symantec Internet Security Threat Report, April 2009
Rootkits
A rootkit modifies the operating system to hide its existence
E.g., modifies file system exploration utilities
Hard to detect using software that relies on the OS itself
RootkitRevealer
By Bryce Cogswell and Mark Russinovich (Sysinternals)
Two scans of file system
High-level scan using the Windows API
Raw scan using disk access methods
Discrepancy reveals presence of rootkit
Could be defeated by rootkit that intercepts and modifies results of raw scan operations
Malware Zombies
Malware can turn a computer into a zombie, which is a machine that is controlled externally to perform malicious attacks, usually as a part of a botnet.
[Figure: botnet. The botnet controller (attacker) sends attack commands to the botnet, which carries out attack actions against the victim]
Financial Impact
Malware often affects a large user population
Significant financial impact, though estimates vary widely, up to $100B per year (mi2g)
Examples
LoveBug (2000) caused $8.75B in damages and shut down the British parliament
In 2004, 8% of emails were infected by W32/MyDoom.A at its peak
In February 2006, the Russian Stock Exchange was taken down by a virus.
Economics of Malware
New malware threats have grown from 20K to 1.7M in the period 2002-2008
Most of the growth has been from 2006 to 2008
The number of new threats per year appears to be growing at an exponential rate.
Source:
Symantec Internet Security Threat Report, April 2009
Professional Malware
Growth in professional cybercrime and online fraud has led to demand for professionally developed malware
New malware is often a custom-designed variation of known exploits, so the malware designer can sell different “products” to his/her customers.
Like every product, professional malware is subject to the laws of supply and demand.
Recent studies put the price of a software keystroke logger at $23 and a botnet use at $225.
Image by User:SilverStar from http://commons.wikimedia.org/wiki/File:Supply-demand-equilibrium.svg
used by permission under the Creative Commons Attribution ShareAlike 3.0 License
Adware
Adware software payload
Adware engine infects a user’s computer
Computer user
Adware agent
Adware engine requests advertisements from adware agent
Advertisers
Advertisers contract with adware agent for content
Adware agent delivers ad content to user
Spyware
Spyware software payload
1. Spyware engine infects a user’s computer.
Computer user
Spyware data collection agent
2. Spyware process collects keystrokes, passwords, and screen captures.
3. Spyware process periodically sends collected data to spyware data collection agent.
Signatures: A Malware Countermeasure
A scan compares the analyzed object with a database of signatures
A signature is a virus fingerprint
E.g., a string with a sequence of instructions specific to each virus
Different from a digital signature
A file is considered infected if a signature is found inside its code
Fast pattern-matching techniques are used to search for signatures
All the signatures together form the malware database, which is usually proprietary
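A minimal sketch of signature scanning, assuming signatures are plain byte strings. The signature names and byte patterns below are made up for illustration and do not correspond to any real malware or vendor database.

```python
# Hypothetical signature database: name -> byte pattern (made up for illustration).
SIGNATURES = {
    "Demo.Virus.A": bytes.fromhex("deadbeef9090"),
    "Demo.Worm.B": b"\xeb\xfeINFECT",
}


def scan_bytes(data, signatures=SIGNATURES):
    """Return the sorted names of all signatures that occur anywhere in data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


clean = b"MZ" + b"\x00" * 64
infected = clean + bytes.fromhex("deadbeef9090") + b"\x00" * 16

print(scan_bytes(clean))
print(scan_bytes(infected))
```

This sketch searches for each signature separately. A production scanner would more likely use a multi-pattern matching algorithm (Aho-Corasick is one well-known option) so that the whole database is checked in a single pass over the file.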
Signatures Database
Common Malware Enumeration (CME)
Aims to provide unique, common identifiers for new virus threats
Hosted by MITRE
http://cme.mitre.org/data/list.html
Digital Immune System (DIS)
Automatically creates new signatures
While not completely standardized, virus naming follows a fairly common convention
Viruses often have multiple names in common usage, and the names reported often depend on the detection software used.
Commonly used suffixes include:
@m: Worms or viruses propagating by e-mail
@mm: Mass mailer worms or viruses
Dr: Dropper programs
Family: A virus which shares characteristics with other viruses in a family
Gen: Similar to family
Int: An intended virus, i.e., a virus that failed to work as designed
Worm: Sometimes used to indicate worms
White/Black Listing
Maintain database of cryptographic hashes for
Operating system files
Popular applications
Known infected files
Compute hash of each file
Look up the hash in the database
The integrity of the database must be protected
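The hash lookup can be sketched with Python's standard hashlib module. SHA-256 is an assumed choice of hash function, and the two databases below are hypothetical in-memory sets built from made-up file contents.

```python
import hashlib


def sha256_of(data):
    """Return the hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical databases; a real product would ship and protect these.
WHITELIST = {sha256_of(b"trusted system file")}
BLACKLIST = {sha256_of(b"known infected file")}


def classify(data):
    """Classify file contents as 'infected', 'trusted', or 'unknown'."""
    digest = sha256_of(data)
    if digest in BLACKLIST:
        return "infected"
    if digest in WHITELIST:
        return "trusted"
    return "unknown"


print(classify(b"trusted system file"))
print(classify(b"known infected file"))
print(classify(b"never seen before"))
```

Because a single changed byte yields a different hash, this approach only recognizes exact copies. It also shows why database integrity matters: an attacker who can add a hash to the white list makes an infected file look trusted.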
Heuristic Analysis
Useful to identify new and “zero day” malware
Code analysis
Based on the instructions, the antivirus can determine whether or not the program is malicious, e.g., the program contains instructions to delete system files
Execution emulation
Run code in isolated emulation environment
Monitor the actions that the target file takes
If the actions are harmful, mark the file as a virus
Heuristic methods can trigger false alarms
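A toy sketch of how a heuristic verdict might be computed from actions observed during emulation. The action names, weights, and threshold are all assumptions chosen for illustration, not values used by any real antivirus product.

```python
# Hypothetical weights for actions observed in the emulation environment.
SUSPICION_WEIGHTS = {
    "delete_system_file": 5,
    "modify_boot_sector": 5,
    "write_to_executable": 3,
    "open_network_port": 1,
}
THRESHOLD = 5  # assumed cut-off between "clean" and "suspicious"


def heuristic_verdict(observed_actions, threshold=THRESHOLD):
    """Sum the weights of observed actions and compare against the threshold."""
    score = sum(SUSPICION_WEIGHTS.get(action, 0) for action in observed_actions)
    verdict = "suspicious" if score >= threshold else "clean"
    return verdict, score


print(heuristic_verdict(["open_network_port"]))
print(heuristic_verdict(["delete_system_file"]))
# A legitimate software updater that patches two executables is also flagged:
print(heuristic_verdict(["write_to_executable", "write_to_executable"]))
```

The last call illustrates a false alarm: benign software that performs the same actions as malware crosses the same threshold.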
Shield vs. On-demand
Shield
Background process (service/daemon)
Scans each time a file is touched (open, copy, execute, etc.)
On-demand
Scan on explicit user request or according to regular schedule
On a suspicious file, directory, drive, etc.
Performance test of scan techniques
Comparative: check the number of already known viruses that are found and the time to perform the scan
Retrospective: test the proactive detection of the scanner for unknown viruses, to verify which vendor uses better heuristics
Antivirus products are ranked using both criteria:
http://www.av-comparatives.org/
Malicious Code
2008-02-04
35
Online vs. Offline Antivirus Software
Online
Free browser plug-in
Authentication through a third-party certificate (e.g., VeriSign)
No shielding
Software and signatures are updated at each scan
Poorly configurable
Scan needs internet connection
Report collected by the company that offers the service
Offline
Paid annual subscription
Installed on the OS
Software distributed securely by the vendor online or a retailer
System shielding
Scheduled software and signature updates
Easily configurable
Scan without internet connection
Report collected locally and may be sent to vendor
Quarantine
A suspicious file can be isolated in a folder called quarantine:
E.g., if the result of the heuristic analysis is positive and you are waiting for a signature database update
The suspicious file is not deleted but made harmless: the user can decide when to remove it, or restore it in the case of a false positive
Interacting with a file in quarantine is possible only through the antivirus program
The file in quarantine is harmless because it is encrypted
Usually the quarantine technique is proprietary and the details are kept secret
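Since real quarantine formats are proprietary, the following is only a sketch of the general idea: store the file in a transformed form so that it can no longer be executed, and reverse the transformation on restore. The single-byte XOR used here is an assumed stand-in for whatever encryption a vendor actually uses, and the function names and the .quar extension are hypothetical.

```python
import os
import tempfile

KEY = 0x5A  # assumed single-byte key; XOR is a stand-in, not real encryption


def transform(data, key=KEY):
    """XOR every byte with the key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


def quarantine(path, quarantine_dir):
    """Move a file into quarantine_dir in transformed form; return its new path."""
    os.makedirs(quarantine_dir, exist_ok=True)
    with open(path, "rb") as f:
        data = f.read()
    dest = os.path.join(quarantine_dir, os.path.basename(path) + ".quar")
    with open(dest, "wb") as f:
        f.write(transform(data))
    os.remove(path)  # the original is no longer reachable by the user
    return dest


def restore(quarantined_path, original_path):
    """Reverse the transformation, e.g. after a false positive."""
    with open(quarantined_path, "rb") as f:
        data = f.read()
    with open(original_path, "wb") as f:
        f.write(transform(data))
    os.remove(quarantined_path)


with tempfile.TemporaryDirectory() as tmp:
    suspect = os.path.join(tmp, "suspect.bin")
    with open(suspect, "wb") as f:
        f.write(b"MZ suspicious payload")
    stored = quarantine(suspect, os.path.join(tmp, "quarantine"))
    restore(stored, suspect)
```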
Static vs. Dynamic Analysis
Static Analysis
Checks the code without trying to execute it
Quick scan against the white list
Filtering: scan with different antivirus products and check whether they return the same result under different names
Weeding: discard the legitimate parts of the file as junk to better isolate the virus
Code analysis: check the binary code to determine whether it is an executable, e.g., PE
Disassembling: check whether the disassembled code shows something unusual
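The code-analysis check for a PE executable can be sketched without running the file. A PE file begins with the DOS "MZ" magic and stores, at offset 0x3C, the location of the "PE\0\0" signature. The function name looks_like_pe and the hand-built sample header are hypothetical.

```python
import struct


def looks_like_pe(data):
    """Check for the DOS MZ header and the PE signature it points to."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: 4-byte little-endian offset of the PE header, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


# Hand-built minimal header for illustration (not a runnable executable).
header = bytearray(0x80)
header[:2] = b"MZ"
struct.pack_into("<I", header, 0x3C, 0x40)
header[0x40:0x44] = b"PE\x00\x00"

print(looks_like_pe(bytes(header)))
print(looks_like_pe(b"#!/bin/sh\necho hello\n"))
```

This only identifies the file type; it says nothing about whether the executable is malicious.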
Dynamic Analysis
Checks the execution of the code inside a virtual sandbox
Monitor
File changes
Registry changes
Processes and threads
Network ports
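Monitoring file changes can be approximated by taking a snapshot of the sandbox file system before and after the sample runs and comparing the two. This is a simplified sketch; the function names are hypothetical, and real sandboxes typically hook system calls rather than diffing snapshots.

```python
import hashlib
import os
import tempfile


def snapshot(root):
    """Map each file path under root to the SHA-256 of its contents."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                state[path] = hashlib.sha256(f.read()).hexdigest()
    return state


def diff_snapshots(before, after):
    """Return (created, deleted, modified) path lists between two snapshots."""
    created = sorted(set(after) - set(before))
    deleted = sorted(set(before) - set(after))
    modified = sorted(p for p in before if p in after and before[p] != after[p])
    return created, deleted, modified


with tempfile.TemporaryDirectory() as sandbox:
    target = os.path.join(sandbox, "hosts")
    with open(target, "w") as f:
        f.write("127.0.0.1 localhost\n")
    before = snapshot(sandbox)
    # Stand-in for running the sample: it tampers with one file and drops another.
    with open(target, "a") as f:
        f.write("10.0.0.1 bank.example\n")
    with open(os.path.join(sandbox, "dropped.bin"), "wb") as f:
        f.write(b"\x00")
    print(diff_snapshots(before, snapshot(sandbox)))
```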