Traditional protection mechanisms use hardware: the virtual memory subsystem of a computer is used by an operating system to ensure that one process cannot "attack" the memory space of another process. But when two processes that don't quite trust each other need to communicate -- as when your browser sends events to the applet, and the applet draws in its browser window -- hardware protection is clumsy, and an object-oriented interface is much more efficient and expressive.
Computer scientists have long researched the use of programming-language type-checking as a protection mechanism that doesn't need hardware support. Java is the most widely used language that works this way, but what I'll say also applies to other languages such as Modula-3 and ML.
Perhaps applets are merely toys, but in the real world it's very common to build applications from components, where you don't have control over all the components. What can you say about the security of a system you build from untrusted components?
In a type-safe language, if a program type-checks then it can't go wrong. The technical term "go wrong" means to mistake an integer for a pointer, to dereference a pointer at the wrong type, to access a private variable from outside its class, and so on. If a Java class doesn't go wrong, then it can't trash the private variables of other classes to which it is linked. If you link against someone else's Java classes, they may have bugs, but you can still rely on your own classes' private variables. So you may not be able to guarantee the correctness of the system you build (if there's a bug in the arithmetic component, it might not compute the customer's taxes right), but you can hope to guarantee its safety (it won't trash the customer's disk).
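As a minimal sketch of what that guarantee looks like in Java (the class names here are invented for illustration):

```java
// A component with a private variable. Type safety -- enforced by the
// compiler and the bytecode verifier -- keeps linked classes out of it.
class Account {
    private long balance;              // only Account's own code can touch this

    public void deposit(long amount) {
        if (amount > 0) balance += amount;
    }
}

class UntrustedClient {
    void poke(Account a) {
        a.deposit(100);                // fine: goes through the public interface
        // a.balance = -1;             // rejected: balance has private access in Account
        // int addr = a;               // rejected: a reference is not an integer
    }
}
```

The commented-out lines are exactly the kind of thing the type-checker refuses to accept, no matter who wrote the client class.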
But you trust the JIT, right, because it comes from a major software vendor? Unfortunately, the JIT is a large, complicated program -- an optimizing compiler -- and it will inevitably have bugs. Some JITs contain a million lines of code! You can trust that the JIT vendor has not put malicious attacks inside the JIT, and has kept viruses out of it, but you can't realistically trust that there are no bugs at all.
For most software, it's all right if there are a few nonmalicious bugs. But bugs in the JIT can be exploited by a malicious attacker who sends you an applet (or provides a class that you want to use as a component of your software). A clever attacker can deliberately tweak his code to make the JIT compiler produce code that goes wrong, mistaking an integer of his choice for an array of ints. Then all bets are off: he can put whatever bits he wants in memory at whatever location he wants, and you have no security left at all.
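Concretely, the attacker is after something like the following (a hypothetical sketch; the dangerous lines are commented out because a correct type-checker rejects them):

```java
class TypeConfusion {
    void exploit() {
        long chosen = 0xdeadbeefL;      // an integer of the attacker's choice
        // int[] fake = (int[]) chosen; // rejected: incompatible types
        // fake[0] = 42;                // but if a buggy JIT compiles code that
        //                              // behaves like this, it writes
        //                              // attacker-chosen bits to an
        //                              // attacker-chosen address
    }
}
```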
The compiler front end translates the source program into a high-level intermediate representation (IR) annotated with type declarations. The type-checker runs on this IR, rejecting any unsound programs. Then the type declarations are discarded, and the optimizer does analyses and transformations on the IR to make the program faster; the code generator translates it into a machine-specific IR, the register allocator fills in some details, and out comes a machine-language program.
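In outline (every name here is invented for illustration, not any particular compiler's API), the conventional pipeline looks like this:

```java
// A sketch of the conventional pipeline. Types are checked once, on the
// high-level IR, and then discarded; everything downstream must be trusted.
class SourceProgram {}
class TypedIR {}        // high-level IR annotated with type declarations
class IR {}             // the same program with the types erased
class MachineIR {}      // machine-specific IR
class MachineProgram {} // final machine-language program

class ConventionalCompiler {
    MachineProgram compile(SourceProgram src) {
        TypedIR hir = frontEnd(src);
        typeCheck(hir);                      // reject unsound programs here...
        IR ir = eraseTypes(hir);             // ...then throw the types away
        IR optimized = optimize(ir);         // analyses and transformations
        MachineIR mir = generateCode(optimized);
        return allocateRegisters(mir);       // fill in the last details
    }

    // Stubs standing in for the real phases.
    TypedIR frontEnd(SourceProgram s)             { return new TypedIR(); }
    void typeCheck(TypedIR t)                     { /* throws if unsound */ }
    IR eraseTypes(TypedIR t)                      { return new IR(); }
    IR optimize(IR i)                             { return i; }
    MachineIR generateCode(IR i)                  { return new MachineIR(); }
    MachineProgram allocateRegisters(MachineIR m) { return new MachineProgram(); }
}
```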
If there's a bug in the optimizer, the code generator, or the register allocator, then the compiled program may crash, or a maliciously designed program may cause intentional harm.
The new technology uses typed intermediate languages -- at each level, the IR still has type declarations, and a type-checker can be run to verify the soundness of the program. Of course, if the compiler has no bugs, then it's unnecessary to run the type-checker at each level -- once the source program is type-checked at the first IR, each lower-level IR ought to type-check as well. But the untrusting user will want to type-check the machine-language program that comes out the end of the compiler. (See Figure 2.)
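A sketch of the typed pipeline, reusing the stub classes from the previous sketch (again, all names are invented for illustration):

```java
// The same pipeline built on typed intermediate languages. Each level keeps
// its type declarations, so the checker can be re-run after every phase --
// and the untrusting user can re-run it on the machine-language program
// that comes out the end.
class TypedMachineProgram {}  // machine code that still carries its types

class TypedCompiler {
    TypedMachineProgram compile(SourceProgram src) {
        TypedIR hir = frontEnd(src);
        check(hir);                                 // as before
        TypedIR optimized = optimize(hir);          // transformations preserve types
        check(optimized);                           // catches optimizer bugs
        TypedMachineProgram mp = lower(optimized);  // code gen + register allocation
        check(mp);                                  // the check the user can redo
        return mp;
    }

    void check(Object typedProgram) { /* throws if this level doesn't type-check */ }

    // Stubs standing in for the real phases.
    TypedIR frontEnd(SourceProgram s)    { return new TypedIR(); }
    TypedIR optimize(TypedIR t)          { return t; }
    TypedMachineProgram lower(TypedIR t) { return new TypedMachineProgram(); }
}
```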
The lower-level IRs can be a good deal more complicated than the source language (e.g., Java). Only in the last few years have computer scientists designed type systems capable of dealing with these lower levels of the compiler. In fact, the type-checking of a machine-language program looks very much like a mathematical proof.
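For a taste of why, here is a schematic typing rule for a load instruction (the notation is illustrative, not any particular system's). Checking a machine-language program means building a derivation from rules like this, one instruction at a time:

```latex
% Loading from an int array yields an int, provided the index register
% is known to hold an in-bounds integer.
\frac{\Gamma \vdash r_1 : \mathit{int\,array}
      \qquad \Gamma \vdash r_2 : \mathit{int}
      \qquad \Gamma \vdash r_2 < \mathrm{length}(r_1)}
     {\Gamma \vdash \mathtt{load}\; r_3 \leftarrow r_1[r_2]
      \;:\; \Gamma\{r_3 : \mathit{int}\}}
```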
This has led to the notion of proof-carrying code. The compiler produces a machine-language program and a proof that the program is safe to execute. Checking this proof is a simple -- but long and tedious -- process, ideally suited for computers. Your trusted vendor will give you a proof-checker, and your JVM can run it on the output of the JIT. (See Figure 3.)
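The consumer's side of the protocol is small (a hedged sketch; the interfaces here are invented for illustration):

```java
// Sketch of the consumer's side of proof-carrying code. Only the small
// proof checker has to be trusted; the big optimizing compiler that
// produced the code does not.
interface SafetyPolicy {}
interface ProofChecker {
    boolean verify(byte[] machineCode, byte[] proof, SafetyPolicy policy);
}

class PccConsumer {
    private final ProofChecker checker;   // small, trusted, from your vendor
    private final SafetyPolicy policy;    // e.g., "no out-of-bounds memory access"

    PccConsumer(ProofChecker checker, SafetyPolicy policy) {
        this.checker = checker;
        this.policy = policy;
    }

    void install(byte[] machineCode, byte[] proof) {
        // Checking the proof is mechanical; constructing it was the
        // compiler's job. If the proof and the code don't match the
        // policy, the code never runs.
        if (!checker.verify(machineCode, proof, policy)) {
            throw new SecurityException("proof does not establish safety");
        }
        execute(machineCode);             // safe to run, JIT bugs or no JIT bugs
    }

    private void execute(byte[] code) { /* hand off to the runtime */ }
}
```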
Proof-carrying code was invented by George Necula and Peter Lee for their Touchstone compiler, which handles a safe subset of C. The Secure Internet Programming project at Princeton University is conducting research on many aspects of computer security, including proof-carrying code.
A start-up company called Cedilla Systems is adapting the Touchstone technology to Java, and expects to release a high-assurance Java compiler in early 2000.
The book Securing Java by Gary McGraw and Ed Felten has comprehensive advice about interacting with untrusted Java classes. Ken Thompson's classic Turing Award lecture, Reflections on Trusting Trust, shows the limitations of working with untrusted code.