But in recent weeks, route flapping has caused numerous outages over the global computer network that have disrupted electronic mail, prevented users from accessing World Wide Web sites and slowed the normally instantaneous transfer of information by several hours in some cases.
Route flaps are like shock waves that travel through the Internet's backbone. Each computer on the Internet has an electronic traffic map of the network that tells the computer the best way to send information to other computers that are on the Net.
During a route flap, the computers receive messages that tell them to change their maps wildly. For example, a computer in San Francisco might be told the fastest way to send data to Boston is to send it through an intermediate computer in Chicago. The next moment, it might be told to send through Washington, D.C. A moment later, it might be told to send the data through Chicago again.
All of those routing requests take time to be processed - time that is not available for transporting information across the network. Furthermore, the requests themselves fill up the network - squeezing out other data.
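The San Francisco example above can be sketched in a few lines of Python. This is only an illustration: the `count_flaps` function and the (destination, next hop) update format are assumptions made for this sketch, not the message format real routers exchange.

```python
def count_flaps(updates):
    """Apply (destination, next_hop) updates in order.

    Returns the final routing table and, per destination, how many
    times the chosen next hop was changed by an update.
    """
    table = {}   # destination -> current next hop
    flaps = {}   # destination -> number of next-hop changes
    for destination, next_hop in updates:
        previous = table.get(destination)
        if previous is not None and previous != next_hop:
            flaps[destination] = flaps.get(destination, 0) + 1
        table[destination] = next_hop
    return table, flaps


# The San Francisco computer is told Chicago, then Washington, then Chicago.
updates = [
    ("Boston", "Chicago"),
    ("Boston", "Washington"),
    ("Boston", "Chicago"),
]
table, flaps = count_flaps(updates)
print(table)  # {'Boston': 'Chicago'}
print(flaps)  # {'Boston': 2}
```

The table ends where it started, yet the computer had to process two changes along the way. That wasted work is the cost of a flap.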
''There have been a couple of instances recently'' of flaps causing service loss, says Alecia Cooper, a SprintLink Product Management group manager. Cooper says the outages have been ''growing pains'' that have been experienced by SprintLink and other Internet service providers.
Users of the Internet can experience outages and other problems in different ways. During a major outage, electronic mail can be delayed or stopped entirely. Pages or computers on the World Wide Web might not be displayed when a user clicks on their links. Alternatively, a user might experience no delay at all, because the portion of the Internet that he or she is using is unaffected.
''We had a very rough week'' in mid-November, says Benham Malcom, SprintLink's manager of engineering. ''What you need to remember is that the Internet is a collection of networks.'' But because they are all connected, says Malcom, a single outage on one network can affect ''users that are on different networks.''
For example, if one link between two key networks is having trouble, it can cause information to get squeezed through the remaining routes. That's what happened late last month, when one of Sprint's high-speed fiber optic links suddenly developed problems and had to be taken out of service, according to messages that Sprint distributed over a customer mailing list.
Old bug triggered
In another case, a series of SprintLink outages was the result of a bug in the software of a Cisco Systems router, which SprintLink uses on its Internet backbone. The bug had been in the software ''since Day 1,'' says Malcom, but it wasn't until a few weeks ago that the Internet had enough congestion to trigger the problem. It took nearly a week for Cisco to find the error in its code and give SprintLink a fix. SprintLink claims to carry 60 percent of the nation's commercial Internet traffic, and 80 percent of the commercial Internet traffic outside the United States.
The underlying technology that runs the Internet was designed in the 1960s and '70s to be able to withstand a nuclear attack. As a result, when there is a downed phone line or a broken computer, the network automatically determines the location of the fault and ''routes'' data around the break. It is those very systems that the Internet uses for routing that are causing the network's new generation of problems.
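Routing around a break can be pictured with a small Python sketch. The four-city map, the `shortest_path` function and the use of a plain breadth-first search are all assumptions chosen for illustration. Real Internet routers run distributed routing protocols rather than a single search over a complete map.

```python
from collections import deque


def shortest_path(links, source, target):
    """Return the fewest-hop path from source to target, or None.

    `links` is a list of (a, b) pairs, each a two-way connection.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    previous = {source: None}  # node -> node we reached it from
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        # sorted() only makes the tie-break between equal paths repeatable
        for peer in sorted(neighbors.get(node, ())):
            if peer not in previous:
                previous[peer] = node
                queue.append(peer)
    return None


links = [
    ("San Francisco", "Chicago"),
    ("Chicago", "Boston"),
    ("San Francisco", "Washington"),
    ("Washington", "Boston"),
]
before = shortest_path(links, "San Francisco", "Boston")
print(before)  # ['San Francisco', 'Chicago', 'Boston']

# The Chicago-Boston line goes down; data is routed around the break.
working = [link for link in links if link != ("Chicago", "Boston")]
after = shortest_path(working, "San Francisco", "Boston")
print(after)  # ['San Francisco', 'Washington', 'Boston']
```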
Today's Internet is built largely from high-speed, long-distance telephone lines and special-purpose computers called routers, which move information between those links and local area networks. The telephone lines are for the most part provided by AT&T, MCI and Sprint, while a majority of the routers on the Internet are manufactured by San Jose-based Cisco.
Most consumers access the Internet by using a modem to connect their home computers to computers operated by a company like America Online or Netcom Communications. Those computers, in turn, connect to the Internet through a router.
Route to route to route
''When all of the links are up and all of the sites (computers) are up, that is a stable condition,'' says John Curran, chief technical officer of BBN Planet, a SprintLink competitor. Problems happen, he says, whenever a router or a long-distance link somewhere on the Net stops working.
When the failure is detected by the neighboring routers, says Curran, each sends a routing message to its adjacent routers, so they can route around it. Those adjacent routers, in turn, send the message to their neighbors, which continue sending the message to hundreds, if not thousands, of other routers on the Internet.
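The neighbor-to-neighbor spread Curran describes can be approximated with a simple flooding model. The sketch below is a deliberate simplification: the `flood` function, the ring of six routers and the rule that every router forwards the news once to each of its neighbors are assumptions for illustration, not the behavior of any particular routing protocol.

```python
from collections import deque


def flood(neighbors, origin):
    """Spread one routing message outward from `origin`.

    Each router that learns the news forwards it once to every
    neighbor. Returns (other routers reached, messages sent).
    """
    seen = {origin}
    queue = deque([origin])
    messages = 0
    while queue:
        router = queue.popleft()
        for peer in neighbors[router]:
            messages += 1          # one message per neighbor
            if peer not in seen:   # peers that already know do not forward again
                seen.add(peer)
                queue.append(peer)
    return len(seen) - 1, messages


# Hypothetical ring of six routers, each linked to its two neighbors.
n = 6
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
reached, messages = flood(ring, origin=0)
print(reached, messages)  # 5 12
```

Even in this tiny ring, one failure report generates 12 messages. On a network with thousands of routers, the same pattern repeated for every change produces a heavy routing load.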
''One Internet service company can create a great deal of work throughout the network,'' says Curran, who likens the world-wide mesh of Internet routers to a network of tightly strung strings.
''Now, you have to go to this system where anytime someone plucks a string, it vibrates across all of the places that have active routing tables,'' says Curran. The problem with route flapping, he says, is that ''there are so many people involved that somewhere on the Internet there is always a string being plucked. You get a continual background roar of routing changes.''
Today's network has 25,000 to 30,000 individual routes, ''up from about 10,000 a couple of years ago,'' explains Morgan Littlewood, Cisco Systems' product line manager for Internet service providers. The routing tables have grown so large so fast that only the largest, fastest and most expensive routers can hold them.
The routers are also being pushed harder than before. ''Routers (in the Internet's core) that used to run about 40,000 packets per second less than a year ago are now running 110,000 packets per second,'' says Littlewood. Cisco's current generation of routers has a maximum capacity of roughly 300,000 packets per second. Each time Cisco's software is pushed to a faster speed, new bugs are uncovered.
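The figures Littlewood cites can be put side by side with a little arithmetic. The `load_summary` helper below is a name invented for this sketch; the three packet rates are the ones quoted above.

```python
def load_summary(previous_pps, current_pps, capacity_pps):
    """Compare packet rates, all given in packets per second."""
    growth = current_pps / previous_pps        # how much traffic has grown
    utilization = current_pps / capacity_pps   # share of capacity in use
    headroom = capacity_pps / current_pps      # growth left before the limit
    return growth, utilization, headroom


growth, utilization, headroom = load_summary(40_000, 110_000, 300_000)
print(f"{growth:.2f}x growth, {utilization:.0%} of capacity, {headroom:.2f}x headroom")
# 2.75x growth, 37% of capacity, 2.73x headroom
```

If traffic were to grow by another factor of 2.75, a purely hypothetical extrapolation, core routers would be running at about the 300,000 packets-per-second limit.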
A 'test bed'
Cisco now operates a ''test bed'' in one of its San Jose buildings, which Littlewood says can simulate some of the demanding conditions that take place inside the Internet's core. The test bed includes 40 high-speed routers and ''traffic generators,'' which can produce the equivalent load of millions of simultaneous users surfing the network. To test each release of software, the company turns on all of the equipment, ''then we do gross things, like flapping tens of thousands of routes at a time,'' he says, meaning they bombard the routers with thousands of route change instructions in a few seconds.
''We have been working very closely with Cisco to develop an Internet testing and inter-operability lab at Cisco, so they can test software before they give it to us,'' says SprintLink's Malcom. ''In recent weeks, we had a person on-site doing testing with Cisco. We test the software as best we can before we deploy it.''
Unfortunately, testing does not always catch every problem. The Internet is growing so fast, and so many different things are happening at the same time, that it is nearly impossible to adequately simulate the network within a single laboratory.
''The Internet poses a unique set of conditions in terms of routing (and) routing interactions,'' says BBN's Curran. The number of routes and the speed with which they are changing make the system ''very difficult to simulate.''