Virtual Honeypots: From Botnet Tracking to Intrusion Detection
by Niels Provos and Thorsten Holz
Addison-Wesley 2008.
ISBN 978-0-321-33632-3. Amazon.com $31.49 Bookpool.com $31.50
Reviewed by Richard Austin January 11, 2008
Softly, softly, catchee monkey ... with a honeypot
While the exact origin of that phrase is a bit nebulous, as a Google search will show, the idea of quietly and patiently pursuing a goal is no stranger to the security profession. One of the efforts that has demonstrated great success in searching out security exploits in the wild has been the Honeynet Project (www.honeynet.org).
There have been several previous books on the subject of honeypots, ranging from Lance Spitzner's "Honeypots: Tracking Hackers" to Roger Grimes' "Honeypots for Windows" to the second edition of "Know Your Enemy", so one might question why we need another one. A honeypot is a stalking horse or sacrificial victim whose sole purpose is to be compromised by an attacker so that the honeypot owner can study the methods, tools and techniques used in the compromise. This book is about virtual honeypots and includes both the idea of running a full-function (or high-interaction) honeypot on a virtual server and the idea of so-called "low-interaction" honeypots, which implement just the vulnerable portions of specific services. While the advantages of using virtual servers to host a honeypot are pretty obvious (we can host many honeypots on a single physical server and can easily restore the state of a compromised honeypot by replacing its virtual disks), the use of "virtual pieces" of systems (low-interaction honeypots) is shown to be a valuable technique for increasing the possible scale of a honeypot deployment.
I would recommend reading the book's chapters out of order. Begin with the first chapter, which introduces honeypot technology and the ideas of high- and low-interaction honeypots along with some review of the required networking background, and then skip to chapters 10 ("Case Studies") and 11 ("Tracking Botnets") to see how honeypots are actually used in practice. Chapter 10 provides a detailed walkthrough of how real honeypots were compromised and how the compromise was captured and studied. Chapter 11 provides a similar exercise for botnets. This grounding in how honeypots are used will help prevent the reader from becoming lost in the details of the other chapters.
The second chapter is devoted to high-interaction honeypots and covers their use on several common virtualization platforms (VMware, Microsoft Virtual Server and PC, User Mode Linux, etc.). There's good advice here on the thorny subject of safeguarding your honeypots from becoming a danger after they achieve their intended purpose of being compromised.
Chapter 3 introduces low-interaction honeypots, which do not provide a full installation of an operating system or application but rather only emulate vulnerable versions of specific services. It is noted that they are most useful in detecting exploit attempts that use known vulnerabilities, and that they serve as a sort of burglar alarm to let you know how often particular types of attacks are occurring.
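To make the "burglar alarm" idea concrete, here is a minimal sketch of my own (it is not taken from the book and is far simpler than the tools the book covers). It assumes a made-up FTP-style banner and keeps its log in memory; the names FAKE_BANNER, EVENTS, FakeServiceHandler and start_fake_service are all hypothetical. The listener offers a banner, records the first bytes a visitor sends, and hangs up without ever running a real service.

```python
import socketserver
import threading
from datetime import datetime, timezone

# Hypothetical banner imitating an FTP daemon; invented for this sketch.
FAKE_BANNER = b"220 ftp.example.test FTP server ready.\r\n"

# In-memory record of connection attempts (a real deployment would persist these).
EVENTS = []


class FakeServiceHandler(socketserver.BaseRequestHandler):
    """Send a banner, record whatever the client sends first, then hang up."""

    def handle(self):
        self.request.settimeout(2.0)
        self.request.sendall(FAKE_BANNER)
        try:
            first_bytes = self.request.recv(1024)
        except OSError:
            # Timeout or reset: the visitor sent nothing we could read.
            first_bytes = b""
        EVENTS.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "source_ip": self.client_address[0],
            "first_bytes": first_bytes,
        })


def start_fake_service(host="127.0.0.1", port=0):
    """Start the listener in a background thread; port 0 picks a free port."""
    server = socketserver.ThreadingTCPServer((host, port), FakeServiceHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling start_fake_service() returns the server object, whose server_address attribute gives the port actually in use. Counting the entries in EVENTS over time gives the "how often" figure the chapter describes.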
Chapters 4 and 5 continue the presentation of low-interaction honeypots by discussing honeyd in detail. Honeyd is an Open Source solution that allows emulation of huge numbers of vulnerable targets. This scale allows an organization to efficiently instrument significant portions of their network address space.
Chapter 6 ("Collecting Malware with Honeypots") covers the important topic of capturing viruses and worms using "Nepenthes," "Honeytrap," etc. Nepenthes is a low-interaction honeypot that emulates a vulnerable network service to provide an attractive target for malware. Since it is not a full implementation of the service, it can't really be exploited and thus provides a safe way to capture malware. Nepenthes' vulnerability modules implement "just enough" of the vulnerable service to "fool" the malware into thinking it has found a target. Nepenthes "executes" the malware payload to carry out the download of attacker tools, etc., and then halts the execution. The other tools offer somewhat different capabilities, but the overriding advantage of all the tools is their immense scalability. Since they are quite lightweight compared to, say, a full Windows or Linux installation, a single physical server can host many hundreds of apparently vulnerable targets.
Of course, one of the weaknesses of low-interaction honeypots is that they only emulate portions of vulnerable services and are really most effective with known vulnerabilities. Chapter 7 introduces "Hybrid Systems" that combine low- and high-interaction honeypots to extend their capabilities. For example, when a low-interaction honeypot detects an exploit attempt that it cannot emulate, it might transparently hand that attempt off to a high-interaction honeypot, which could capture the full process. This would allow significant coverage of the network address space with few resources while still allowing capture of new exploits as they are found. Unfortunately, these hybrid systems are not Open Source, but they do offer interesting insights on the future of honeypot technology.
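The hand-off decision can be sketched in a few lines. This is only an illustration of the idea under my own assumptions, not the design of any system described in the book: the class HybridDispatcher, its fields and the byte patterns are hypothetical, and a real hybrid system would have to proxy the live connection rather than just return a label.

```python
from dataclasses import dataclass, field


@dataclass
class HybridDispatcher:
    """Decide whether a request is emulated locally or handed off."""

    # Destination port -> name of the emulated service (hypothetical values).
    emulated_ports: dict
    # Byte patterns the low-interaction emulation knows how to answer.
    known_signatures: tuple
    # Record of (source_ip, dest_port) pairs passed to the high-interaction side.
    handed_off: list = field(default_factory=list)

    def route(self, source_ip, dest_port, payload):
        # Traffic to ports we do not pretend to serve is simply ignored.
        if dest_port not in self.emulated_ports:
            return "drop"
        # Known attack patterns stay on the cheap, low-interaction side.
        if any(sig in payload for sig in self.known_signatures):
            return "emulate"
        # Anything unfamiliar goes to a full system that can capture it.
        self.handed_off.append((source_ip, dest_port))
        return "handoff"
```

With a dispatcher built as HybridDispatcher({445: "smb"}, (b"KNOWN-EXPLOIT",)), a payload containing the known pattern is emulated, while an unrecognized payload on port 445 is handed off and logged. The point of the split is that only the rare, unfamiliar request consumes the expensive resource.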
Chapter 8 addresses the "other" side of exploitation, client-side exploits, by examining client-side honeypots. While a server-side honeypot can sit and patiently wait for an attacker to come "knocking at its door," a client-side honeypot must go looking for malicious content.
Chapter 9 covers the ways attackers can detect honeypots. Obviously, an attacker is typically wasting their time when interacting with a honeypot and, worse from their point of view, may reveal a new exploitation technique. With an active underground economy in selling and trading new exploits, this creates an economic incentive for attackers to be able to detect a honeypot. Detection techniques range from the relatively simple, such as noting that all the IP addresses for virtual honeypots have the same MAC address or that a given low-interaction honeypot is hosting what looks like both Linux and Windows at the same network address, to more complex techniques that detect the virtualization layer itself.
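The MAC address check is simple enough to sketch, and a defender can run the same test against their own deployment to see what an attacker on the local network would see. The function below is my own illustration, not code from the book; its name, the threshold of three addresses, and the sample data are assumptions. Note that a router or proxy-ARP device also answers for many IP addresses, so a match is a hint rather than proof.

```python
from collections import defaultdict


def macs_shared_by_many_ips(arp_entries, threshold=3):
    """Return MAC addresses that answer for at least `threshold` distinct IPs.

    arp_entries is an iterable of (ip, mac) string pairs, for example
    parsed from a local ARP table. The threshold is an arbitrary choice.
    """
    ips_by_mac = defaultdict(set)
    for ip, mac in arp_entries:
        # Normalize case so "AA:BB" and "aa:bb" count as the same address.
        ips_by_mac[mac.lower()].add(ip)
    return {
        mac: sorted(ips)
        for mac, ips in ips_by_mac.items()
        if len(ips) >= threshold
    }


# Made-up sample: three "hosts" share one MAC, a fourth has its own.
sample = [
    ("192.0.2.10", "00:00:5E:00:53:01"),
    ("192.0.2.11", "00:00:5e:00:53:01"),
    ("192.0.2.12", "00:00:5e:00:53:01"),
    ("192.0.2.20", "00:00:5e:00:53:02"),
]
suspicious = macs_shared_by_many_ips(sample)
```

Here suspicious contains only the first MAC address, mapped to the three IP addresses that share it.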
At this point, a review of the case studies in chapters 10 and 11 will reinforce the presentation and provide insights on how honeypots are actually used in practice.
The final chapter covers malware analysis using an automated tool called "CWSandbox." As we have come to know too well, malware authors are making significant strides in improving their productivity in producing malware, which has challenged the ability of the "good guys and gals" to reverse engineer it. CWSandbox is a tool that provides a safe execution environment for malware (a sandbox) and provides automated analysis of its activities. One can even submit a malware sample online at www.cwsandbox.org and receive the automated analysis.
In summary, this is an excellent overview of honeypots, how they are used in practice, and most significantly, how virtualization can be used to scale them to cover large portions of the network address space with fewer physical resources. While honeypots are not a technology every organization will employ, they are a valuable tool for the security professional to keep in mind.
And as a bit of humor, a comic from xkcd pictures what may happen when one spends far too much time looking at malware -- http://xkcd.com/350/
Before retiring, Richard Austin was the storage network security architect at a Fortune 25 company and currently earns his bread and cheese as an itinerant university instructor and security consultant. He welcomes your thoughts and comments at rda7838 at Kennesaw dot edu