If you’re a veteran system administrator, you might remember an era of extremely expensive hard disk storage, when any serious network would have a beefy central file server (probably accessed using the Network File System, NFS) that formed the lifeblood of its operations. It was a well-loved feature as early as Linux kernel 2.0 that you could actually boot your machine with a root filesystem on NFS and have no local disk at all. Hardware costs went down, similar machines could share large parts of their system binaries, and upgrades could be done without touching anything but the central server. Sysadmins loved this.
But that was then. Diskless booting these days seems a lot less common, even though the technology still exists. You hear about supercomputer clusters using it, but not the “typical” IT department. What happened?
Part of it, I’m sure, is that hard disks got faster and cheaper more quickly than consumer networking got faster. With local disks, it’s still difficult to roll out updates to a hundred or a thousand computers simultaneously, but many groups don’t start with a hundred or a thousand computers, and multicast system re-imaging software like Norton Ghost keeps the hassle from becoming unbearable enough to force a switch.
More important, though, is that after a few years of real innovation, the de facto standard in network booting has been stagnant for over a decade. Back in 1993, when the fastest Ethernet anyone could use transferred a little over a megabyte of data per second and IDE hard drives didn’t go much faster, network card manufacturers were already including boot ROMs on their expansion cards, each following its own proprietary protocol for loading and executing a bootstrap program. A first effort at standardization, Jamie Honan’s “Net Boot Image Proposal”, was informally published that year, and soon enough two open-source projects, Etherboot (1995) and Netboot (1996), were providing generic ROM images with pluggable driver support. (Full disclosure: I’m an Etherboot Project developer.) They took care of downloading and executing a boot file, but that file would have no way of going back to the network for more data unless it had a network card driver built in. These tools thus became rather popular for booting Linux, and largely useless for booting simpler system management utilities that couldn’t afford the maintenance cost of their own network stack and drivers.
Around this time, Intel was looking at diskless booting from a more commercial point of view: it made management easier, consolidated resources, and avoided leaving sysadmins at the mercy of users who broke their systems thinking themselves experts. Intel published a specification for the Preboot Execution Environment (PXE) as part of a larger initiative called Wired for Management. Network cards started replacing their proprietary boot ROMs with PXE, and things looked pretty good; the venerable SYSLINUX bootloader grew a PXELINUX variant for PXE-booting Linux, and a number of enterprise system management utilities became available in PXE-bootable form.
But, for whatever reason, the standard hasn’t been updated since 1999. It still operates in terms of the ancient x86 real mode, supports only UDP and a “slow, simple, and stupid” file transfer protocol called TFTP, and officially limits boot program size to 32 kB. For modern-day applications, this is less than ideal.
Luckily for us, the Etherboot Project still exists, and Etherboot’s successor gPXE has been picking up where Intel left off, and supports a number of more modern protocols. Between that, excellent support in recent Linux kernels for both accessing and serving SAN disks with high performance, and the flexibility gained by booting with an initial ramdisk, diskless booting is making a big comeback. Read on . . . .