Table 3. Node architectures of the three services studied.

Node type                       Disks                 CPUs                Network             Memory
Online worker node              Two 40-Gbyte disks    Two 300-MHz Sparc   100-Mbps Ethernet   1 Gbyte
Online proxy cache node         Two 18-Gbyte disks    Two 550-MHz x86     100-Mbps Ethernet   1 Gbyte
Content metadata server node    One 30-Gbyte disk     Two 844-MHz x86     100-Mbps Ethernet   512 Mbytes
Content storage server node     Six 40-Gbyte disks    Two 550-MHz x86     100-Mbps Ethernet   256 Mbytes
ReadMostly service node         Two 100-Gbyte disks   Two 750-MHz x86
the first storage server or its site fails.
Finally, ReadMostly uses full replication to achieve high availability and improve performance. The service replicates each piece of data among nodes in the cluster and among geographically distributed clusters. It load balances incoming requests from the front end among redundant copies of data stored in the back end. This improves both back-end availability and performance by a factor proportional to the number of data copies stored. However, such a redundancy scheme would not be appropriate for a more write-intensive service: the service must propagate updates to all nodes storing a particular piece of data, which slows updates.
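The trade-off described above can be made concrete in a toy sketch (a hypothetical model, not code from any of the services studied): reads can be served by any one replica, so read capacity scales with the number of copies, while every write must touch all replicas.

```python
import random

class ReplicatedStore:
    """Toy model of ReadMostly-style full replication (illustrative only).

    Reads are load-balanced across any replica, so read capacity grows
    with the number of copies; writes must reach every replica before
    completing, which is why the scheme suits read-mostly workloads.
    """

    def __init__(self, num_replicas):
        self.replicas = [dict() for _ in range(num_replicas)]

    def read(self, key):
        # The front end picks any replica, spreading read load N ways.
        return random.choice(self.replicas).get(key)

    def write(self, key, value):
        # The update must propagate to every replica -- the write cost
        # that makes this scheme poor for write-intensive services.
        for replica in self.replicas:
            replica[key] = value

store = ReplicatedStore(num_replicas=3)
store.write("page:home", "<html>...</html>")
assert store.read("page:home") == "<html>...</html>"
```

Adding replicas raises the read throughput and availability factor but lengthens the write path, matching the text's observation.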
Networking. The wide-area networking structure among server sites varies greatly for the three services, whereas single-site networking follows a common pattern. In particular, a collocation facility’s network is generally connected to a service’s first-level switch, which is then connected by Gigabit Ethernet to one or more smaller second-level switches per machine room rack. These second-level switches are in turn connected by 100-Mbps Ethernet to individual nodes. Although sites commonly use redundant connections between first- and second-level switches, none of the sites use redundant Ethernet connections between nodes and second-level switches, and only one (ReadMostly) uses redundant connections between the collocation facility’s network and the first-level switches.
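The redundancy pattern above can be checked mechanically. The following sketch (names and topology are assumptions for illustration, not taken from the services studied) models links as parent-child pairs, with duplicate pairs representing redundant cables, and flags any target reachable by only a single link as a single point of failure:

```python
from collections import Counter

# Hypothetical single-site topology: collocation uplink -> first-level
# switch -> second-level rack switches -> nodes.
links = [
    ("colo", "switch1"),            # single uplink (only ReadMostly duplicated this)
    ("switch1", "rack-switch-a"),
    ("switch1", "rack-switch-a"),   # redundant first/second-level link
    ("switch1", "rack-switch-b"),
    ("switch1", "rack-switch-b"),   # redundant first/second-level link
    ("rack-switch-a", "node1"),     # single link to node
    ("rack-switch-a", "node2"),     # single link to node
]

link_counts = Counter(links)
single_points = sorted({child for (parent, child), n in link_counts.items() if n == 1})
print(single_points)  # targets cut off by one cable or port failure
```

Under this model the unduplicated uplink and the node connections show up as single points of failure, mirroring the text: redundancy is common between switch levels but rare at the node and uplink edges.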
Despite the recent emergence of industry-standard, low-latency, high-bandwidth system-area networking technologies such as the Virtual Interface Architecture (VIA) and InfiniBand, all three services use UDP over IP or TCP over IP within clusters. This is most likely due to cost: a 100BaseT NIC costs less than US$50 (and is often integrated into modern motherboards), whereas a PCI card for VIA costs about US$1,000.
Service Operational Issues
The need to develop, deploy, and upgrade services on “Internet time,” combined with the size and complexity of service infrastructures, places unique pressures on the traditional software life cycle. This is most apparent in testing, deployment, and operations. The amount of time available for testing has shrunk, the frequency and scale of deployment have expanded, and the importance of operational issues such as monitoring, problem diagnosis and repair, configuration and reconfiguration, and general system usability for human operators has grown. Online, Content, and ReadMostly address these challenges in various ways.
Testing
All three services test their software before releasing it to their production cluster. Development quality assurance (QA) teams test the software before deployment. These groups use traditional testing techniques such as unit and regression tests, and use both single nodes and small-scale test clusters that reflect the production cluster architecture.
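A minimal example of the unit/regression style mentioned above, using Python's standard unittest module. The function under test is hypothetical, standing in for any routing logic a service might guard with regression tests:

```python
import unittest

def shard_for(key, num_shards):
    """Map a request key to a back-end shard (illustrative only)."""
    return hash(key) % num_shards

class ShardingRegressionTest(unittest.TestCase):
    def test_stays_in_range(self):
        # Unit test: every key must map to a valid shard index.
        for key in ("user:1", "user:2", "img:42"):
            self.assertIn(shard_for(key, 4), range(4))

    def test_deterministic(self):
        # Regression guard: the same key must always hit the same shard.
        self.assertEqual(shard_for("user:1", 4), shard_for("user:1", 4))

if __name__ == "__main__":
    # exit=False keeps the interpreter alive after the run;
    # argv is pinned so external arguments are ignored.
    unittest.main(argv=["example"], exit=False)
```

The same tests can run against a single node or a small test cluster, which is how the services keep test environments aligned with production.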
Online has a three-step testing and deployment process for new software and features, applied after the software passes development QA’s unit tests:

1. The development group deploys the candidate software on its test cluster. Developers run the software, but the operations team configures it.
2. The operations group takes the stable version of the software, deploys it, and operates it on its own test cluster.
3. The operations group releases alpha and beta versions of the software to the production service. For major releases, the group releases the new version to a subset of users for up to two weeks before rolling it out to the remaining users.
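The subset-of-users release in step 3 can be sketched as deterministic cohort selection. This is an assumed mechanism for illustration (the names and the 5 percent threshold are not taken from the services studied), not how Online actually implemented it:

```python
import hashlib

def in_beta_cohort(user_id, percent):
    """Deterministically place `percent`% of users in the beta cohort.

    Hashing the user ID gives a stable, roughly uniform bucket in 0-99,
    so a given user always sees the same version during the rollout.
    """
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return int(digest, 16) % 100 < percent

def serve(user_id):
    # Hypothetical front-end dispatch: beta cohort gets the new version.
    return "v2-beta" if in_beta_cohort(user_id, percent=5) else "v1"
```

Widening `percent` over the two-week window rolls the release out to the rest of the users without reassigning anyone who already has it.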
SEPTEMBER • OCTOBER 2002
IEEE INTERNET COMPUTING