Threats in Active or Mobile Code
Active code or mobile code is a
general name for code that is pushed to the client for execution. Why should the
web server waste its precious cycles
and bandwidth doing simple work that the client's workstation can do? For
example, suppose you want your web site to have bears dancing across the top of
the page. To download the dancing bears, you could download a new image for
each movement the bears take: one bit forward, two bits forward, and so forth.
However, this approach uses far too much server time and bandwidth to compute
the positions and download new images. A more efficient use of (server)
resources is to download a program that runs on the client's machine and
implements the movement of the bears.
Since you have been studying
security and are aware of vulnerabilities, you probably are saying to yourself,
"You mean a site I don't control, which could easily be hacked by
teenagers, is going to push code to my machine that will execute without my
knowledge, permission, or oversight?" Welcome to the world of (potentially
malicious) mobile code. In fact, there are many different kinds of active code,
and in this section we look at the related potential vulnerabilities.
Cookies
Strictly speaking, cookies
are not active code. They are data files that can be stored and fetched by a
remote server. However, cookies can be used to cause unexpected data transfer
from a client to a server, so they have a role in a loss of confidentiality.
A cookie is a data object that can be held in memory (a per-session cookie) or stored on
disk for future access (a persistent cookie). Cookies can store anything about
a client that the browser can determine: keystrokes the user types, the machine
name, connection details (such as IP address), date and type, and so forth. On
command a browser will send to a server the cookies saved for it. Per-session
cookies are deleted when the browser is closed, but persistent cookies are
retained until a set expiration date, which can be years in the future.
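The difference between the two kinds of cookie is visible in the header a server sends. The following is a minimal Python sketch using the standard library's http.cookies module. The cookie names and values (session_id, prefs) are hypothetical, and the one-year lifetime is an arbitrary assumption chosen only for illustration.

```python
from http.cookies import SimpleCookie


def build_cookies():
    """Build one per-session and one persistent cookie (names are hypothetical)."""
    jar = SimpleCookie()

    # Per-session cookie: no Expires or Max-Age attribute, so the browser
    # holds it in memory and discards it when the browser is closed.
    jar["session_id"] = "abc123"

    # Persistent cookie: Max-Age asks the browser to keep it on disk
    # until the lifetime (here one year, in seconds) has elapsed.
    jar["prefs"] = "ship_to_home"
    jar["prefs"]["max-age"] = 365 * 24 * 60 * 60
    return jar


def header_lines(jar):
    """Return the Set-Cookie header values the server would send."""
    return [morsel.OutputString() for morsel in jar.values()]


if __name__ == "__main__":
    for line in header_lines(build_cookies()):
        print("Set-Cookie:", line)
```

The only thing that separates the two cookies is the lifetime attribute; the user is not asked about either one.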
Cookies provide context to a server. Using
cookies, certain web pages can greet you with "Welcome back, James
Bond" or reflect your preferences, as in "Shall I ship this order to
you at 135 Elm Street?" But as these two examples demonstrate, anyone
possessing someone's cookie becomes that person in some contexts. Thus, anyone
intercepting or retrieving a cookie can impersonate the cookie's owner.
What information about you
does a cookie contain? Even though it is your information, most of the time you
cannot tell what is in a cookie, because the cookie's contents are encrypted
under a key from the server.
So a cookie is something that
takes up space on your disk, holding information about you that you cannot see,
forwarded to servers you do not know whenever the server wants it, without
informing you. The philosophy behind cookies seems to be "Trust us, it's
good for you."
Scripts
Clients can invoke services
by executing scripts on servers. Typically, a web browser displays a page. As
the user interacts with the web site via the browser, the browser organizes
user inputs into parameters to a defined script; it then sends the script and
parameters to a server to be executed. But all communication is done through
HTTP. The server cannot distinguish between commands generated from a user at a
browser completing a web page and a user's handcrafting a set of orders. The
malicious user can monitor the communication between a browser and a server to
see how changing a web page entry affects what the browser sends and then how
the server reacts. With this knowledge, the malicious user can manipulate the
server's actions.
To see how easily this manipulation
is done, remember that programmers do not often anticipate malicious behavior;
instead, programmers assume that users will be benign and will use a program in
the way it was intended to be used. For this reason, programmers neglect to
filter script parameters to ensure that they are reasonable for the operation
and safe to execute. Some scripts allow arbitrary files to be included or
arbitrary commands to be executed. An attacker can see the files or commands in
a string and experiment with changing them.
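The filtering step that programmers often omit can be sketched as an allowlist check: the script names every parameter it accepts and the only form each value may take, and it rejects everything else. This Python sketch is one possible approach, not a complete defense. The parameter names (report, year) and their patterns are hypothetical.

```python
import re

# Hypothetical rules: every parameter the script accepts, paired with the
# only pattern its value may match. Anything not listed is rejected.
ALLOWED_PARAMS = {
    "report": re.compile(r"[a-z]{1,20}"),  # short lowercase name
    "year": re.compile(r"[0-9]{4}"),       # exactly four digits
}


def validate_params(params):
    """Return params unchanged if every entry passes; otherwise raise ValueError."""
    for name, value in params.items():
        pattern = ALLOWED_PARAMS.get(name)
        if pattern is None:
            raise ValueError("unexpected parameter: %r" % name)
        # fullmatch requires the whole value to fit the pattern, so a
        # safe-looking prefix followed by extra characters still fails.
        if pattern.fullmatch(value) is None:
            raise ValueError("unsafe value for %r: %r" % (name, value))
    return params
```

With these rules, validate_params({"report": "sales", "year": "2004"}) succeeds, whereas a handcrafted value such as "../../etc/passwd" raises ValueError before the script acts on it.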
A well-known attack against web servers is the escape-character attack. CGI
(Common Gateway Interface), a common interface for running scripts on web
servers, defines a machine-independent way to encode communicated data. The
coding convention uses %nn, where nn is a two-digit hexadecimal value, to
represent special ASCII characters. However, the decoded special characters may
be interpreted by CGI script interpreters. So, for example, %0A (end-of-line)
instructs the interpreter to accept the following characters as a new command.
The following command requests a copy of the server's password file:
CGI scripts can also initiate actions directly
on the server. For example, an attacker can observe a CGI script that includes
a string of this form:
<!--#action arg1=value arg2=value -->
and submit a subsequent command where the
string is replaced by
<!--#exec cmd="rm *" -->
to cause a command shell to
execute a command to remove all files in the shell's current directory.
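To see why the encoded end-of-line in the escape-character attack is dangerous, consider what percent-decoding does to it. The Python sketch below uses urllib.parse.unquote from the standard library. The query string is hypothetical, and rejecting control characters is shown only as one possible check, not as a complete defense.

```python
from urllib.parse import unquote


def decode_query(raw):
    """Percent-decode a raw query string, as a server does before using it."""
    return unquote(raw)


def has_control_chars(text):
    """True if the text contains an ASCII control character such as newline."""
    return any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in text)


if __name__ == "__main__":
    # Hypothetical input: %0A decodes to a newline, %20 to a space.
    raw = "name=bond%0Als%20-l"
    decoded = decode_query(raw)
    print(repr(decoded))
    if has_control_chars(decoded):
        print("rejected: control character in input")
```

After decoding, what looked like one harmless value has become two lines of text. An interpreter that treats each line as a command would run the second line.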
Microsoft uses Active Server Pages (ASP) as its
scripting capability. Such pages instruct the browser on how to display files,
maintain context, and interact with the server. These pages can also be viewed
at the browser end, so any programming weaknesses in the ASP code are available
for inspection and attack.
The server should never trust
anything received from a client, because the remote user can send the server a
string crafted by hand, instead of one generated by a benign procedure the
server sent the client. As with so many cases of remote access, these examples
demonstrate that if you allow someone else to run a program on your machine,
you can no longer be confident that your machine is secure.
Active Code
Displaying web pages started
simply with a few steps: generate text, insert images, and register mouse
clicks to fetch new pages. Soon, people wanted more elaborate action at their
web sites: toddlers dancing atop the page, a three-dimensional rotating cube,
images flashing on and off, colors changing, totals appearing. Some of these
tricks, especially those involving movement, take significant computing power;
they require a lot of time and communication to download from a server. But
typically, the client has a capable and underutilized processor, so the timing
issues are irrelevant.
To take advantage of the
processor's power, the server may download code to be executed on the client.
This executable code is called active code. The two main kinds of active code
are Java code and ActiveX controls.
Java Code
Sun Microsystems [GOS96] designed and promoted the Java technology
as a truly machine-independent programming language. A Java program consists of
Java bytecode executed on a Java
virtual machine (JVM). The bytecode programs are machine
independent, and only the JVM interpreter needs to be implemented on each class
of machine to achieve program portability. The JVM interpreter contains a
built-in security manager that enforces a security policy. A Java program runs
in a Java "sandbox," a
constrained resource domain from which the program cannot escape. The Java
programming language is strongly typed, meaning that the content of a data item
must be of the appropriate type for which it is to be used (for example, a text
string cannot be used as a numeric).
The original Java 1.1
specification was very solid, very restrictive, and hence very unpopular. In
it, a program could not write permanently to disk, nor could it invoke
arbitrary procedures that had not been included in the sandbox by the security
manager's policy. Thus, the sandbox was a collection of resources the user was
willing to sacrifice to the uncertainties of Java code. Although very strong,
the Java 1.1 definition proved unworkable. As a result, the original
restrictions on the sandbox were relaxed, to the detriment of security. Koved
et al. [KOV98] describe how the Java security model evolved.
The Java 1.2 specification
opened the sandbox to more resources, particularly to stored disk files and
executable procedures. (See, for example, [GON96, GON97].)
Although it is still difficult
to break its constraints, the Java sandbox contains many new toys, enabling
more interesting computation but opening the door to exploitation of more
serious vulnerabilities. (For more information, see [DEA96]
and review the work of the Princeton University Secure Internet Programming
group.)
Does this mean that the Java system's designers
made bad decisions? No. As we have seen many times before, a product's security
flaw is not necessarily a design flaw. Sometimes the designers choose to trade
some security for increased functionality or ease of use. In other cases, the
design is fine, but implementers fail to uphold the high security standards set
out by designers. The latter is certainly true for Java technology. Problems
have occurred with implementations of Java virtual machines for different
platforms and in different components. For example, a version of the Netscape
browser failed to implement type checking on all data types, as is required in
the Java specifications. A similar vulnerability affected Microsoft Internet
Explorer. Although these vulnerabilities have been patched, other problems
could occur with subsequent releases.
A hostile applet is downloadable Java code that can cause harm on the
client's system. Because an applet is not screened for safety when it is
downloaded and because it typically runs with the privileges of its invoking
user, a hostile applet can cause serious damage. Dean et al. [DEA96] list necessary conditions for secure execution of applets:
The system must control applets' access to sensitive system
resources, such as the file system, the processor, the network, the user's
display, and internal state variables.
The language must protect memory by preventing forged memory
pointers and array (buffer) overflows.
The system must prevent object reuse by clearing memory contents
for new objects; the system should perform garbage collection to reclaim memory
that is no longer in use.
The system must control interapplet communication as well as
applets' effects on the environment outside the Java system through system
calls.
ActiveX Controls
Microsoft's answer to Java
technology is the ActiveX series. Using ActiveX controls, objects of arbitrary
type can be downloaded to a client. If the client has a viewer or handler for
the object's type, that viewer is invoked to present the object. For example,
downloading a Microsoft Word .doc file would invoke Microsoft Word on a system
on which it is installed. Files for which the client has no handler cause other
code to be downloaded. Thus, in theory, an attacker could invent a type, called
.bomb, and cause any unsuspecting user who downloaded a web page with a .bomb
file also to download code that would execute .bombs.
To prevent arbitrary
downloads, Microsoft uses an authentication scheme under which downloaded code
is cryptographically signed and the signature is verified before execution. But
the authentication verifies only the source of the code, not its correctness or
safety. Code from Microsoft (or Netscape or any other manufacturer) is not
inherently safe, and code from an unknown source may be more or less safe than
that from a known source. Proof of origin shows where it came from, not how good
or safe it is. And some vulnerabilities allow ActiveX to bypass the
authentication.
Auto Exec by Type
Data files are processed by
programs. For some products, the file type is implied by the file extension,
such as .doc for a Word document,
.pdf (Portable Document
Format) for an Adobe Acrobat file, or .exe for an executable file. On many
systems, when a file arrives with one of these extensions, the operating system
automatically invokes the appropriate processor to handle it.
By itself, a Word document is
unintelligible as an executable file. To prevent someone from running a file
temp.doc by typing that name as a command, Microsoft embeds within a file what
type it really is. Double-clicking the file in a Windows Explorer window brings
up the appropriate program to handle that file.
But, as we noted in Chapter 3, this scheme presents an opportunity to
an attacker. A malicious agent might send you a file named innocuous.doc, which
you would expect to be a Word document. Because of the .doc extension, Word
would try to open it. Suppose that file is renamed "innocuous"
(without a .doc). If the embedded file type is .doc, then double-clicking
innocuous also brings the file up in Word. The file might contain malicious
macros or invoke the opening of another, more dangerous file.
Generally, we recognize that
executable files can be dangerous, text files are likely to be safe, and files
with some active content, such
as .doc files, fall in
between. If a file has no apparent file type and will be opened by its built-in
file handler, we are treading on dangerous ground. An attacker can disguise a
malicious active file under a nonobvious file type.
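The gap between a file's name and its embedded type can be made concrete by comparing the extension with the file's leading "magic" bytes. The Python sketch below is illustrative only and is not how Windows actually selects a handler. The function names and the extension table are hypothetical; the byte signatures are the well-known ones for OLE compound files (the container used by legacy Word .doc files), PDF files, and DOS/Windows executables.

```python
import os

OLE = "OLE compound file (e.g. legacy Word .doc)"
PDF = "PDF document"
EXE = "DOS/Windows executable"

# Leading bytes that identify each format regardless of the file's name.
SIGNATURES = {
    b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1": OLE,
    b"%PDF-": PDF,
    b"MZ": EXE,
}

# Hypothetical table: what content each extension leads a user to expect.
EXPECTED = {".doc": OLE, ".pdf": PDF, ".exe": EXE}


def sniff(header):
    """Return the format named by the leading bytes, or None if unrecognized."""
    for magic, kind in SIGNATURES.items():
        if header.startswith(magic):
            return kind
    return None


def looks_disguised(filename, header):
    """True if the bytes identify a known format that the name does not suggest."""
    ext = os.path.splitext(filename)[1].lower()
    actual = sniff(header)
    return actual is not None and EXPECTED.get(ext) != actual
```

Under these assumptions, a file named innocuous whose first bytes are the OLE signature is flagged, because its name gives no hint that a document handler with macro support will open it. The same bytes under the name innocuous.doc are not flagged.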
Bots
Bots, hackerese for robots, are pieces of malicious code under remote
control. These code objects are Trojan horses that are distributed to large numbers of victims' machines.
Because they may not interfere with or harm a user's computer (other than
consuming computing and network resources), they are often undetected.
Bots coordinate with each
other and with their master through ordinary network channels, such as Internet
Relay Chat (IRC) channels or peer-to-peer networking (which has been used for
sharing music over the Internet). Structured as a loosely coordinated web, a
network of bots, called a botnet, is
not subject to failure of any one bot or group of bots, and with multiple
channels for communication and coordination, they are highly resilient.
Botnets are used for distributed denial-of-service attacks, launching attacks
from many sites in parallel against a victim. They are also used for spam and
other bulk e-mail attacks, because an extremely large volume of e-mail sent from
any one point might be blocked by the sending service provider, whereas the same
volume spread across many bots is harder to block.