Microsoft® Windows Server™ 2003 White Paper
Designing and implementing a scalable architecture requires an understanding not only of the server features that support scaling but also of a few application, hardware, and server concepts.
Applications and Scaling
Applications fall along a continuum in how readily they can be scaled up or scaled out. Both scaling modes require parallelization, but they differ in the level at which the application is parallelized; they also differ in the granularity with which it is partitioned.
In order to scale up, an application does not need to be partitioned at the process level, but it does need to be parallelized at the thread level. For example, one non-obvious bottleneck is that a multi-threaded process must avoid writing to the same data records from different threads. Simultaneous writes are expensive because the processor caches must be kept synchronized. Similarly, if a file or record is locked, requests tend to queue up and are processed serially, in which case you lose the advantage of threading.
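The contrast can be sketched in a few lines of Python (an illustration, not part of the white paper): one version funnels every thread through a single lock on shared data, serializing the work, while the other partitions the data so each thread writes only its own slot.

```python
import threading

N_THREADS = 4
N_INCREMENTS = 50_000

# Contended version: every thread funnels through one lock to update a
# single shared record, so the updates are effectively serialized.
shared_total = 0
shared_lock = threading.Lock()

def contended_worker():
    global shared_total
    for _ in range(N_INCREMENTS):
        with shared_lock:
            shared_total += 1

# Partitioned version: each thread writes only its own slot, so no lock
# is needed until the cheap final merge.
partials = [0] * N_THREADS

def partitioned_worker(slot):
    count = 0
    for _ in range(N_INCREMENTS):
        count += 1
    partials[slot] = count

def run(target, args_list):
    threads = [threading.Thread(target=target, args=args) for args in args_list]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

run(contended_worker, [()] * N_THREADS)
run(partitioned_worker, [(i,) for i in range(N_THREADS)])

# Both arrive at the same answer; only the partitioned version can
# actually exploit additional processors.
assert shared_total == sum(partials) == N_THREADS * N_INCREMENTS
```

Both versions compute the same total; the difference is that the partitioned workers never wait on one another.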
In order to scale out, an application must be designed so that each element that runs on a server holds no state information. This allows the application to exploit added resources in a scalable manner. In other words, to scale out well, an application should be parallelized so that each part is independent of the others, with each part taking advantage of the appropriate resources.
For example, a scalable Web application is generally partitioned into a presentation layer, which runs on the client, where scaling is not an issue; a middle tier that implements the business logic, for which scaling may be a substantive issue; and a data layer, with well-known database scaling requirements and solutions. The middle tier should be built so that no component is tied to a particular server across round-trip requests from a client. This allows the server farm to grow or shrink as scaling dictates. Furthermore, because the throughput of an entire system is only equal to the throughput of its slowest part, partitioning an application into modular portions allows resources to be targeted at the particular bottleneck.
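A minimal Python sketch of this idea follows; the names and the in-process dictionary standing in for an external state store are hypothetical, not part of any Microsoft API. Because the middle-tier component reads and writes all client state through the shared store rather than its own fields, successive requests from one client can land on any server in the farm.

```python
# Hypothetical external session store; in a real farm this would be a
# database or state service living outside any single machine.
session_store = {}

class MiddleTierServer:
    """A stateless business-logic component: it keeps no per-client state
    between requests, so any instance in the farm can serve any request."""

    def __init__(self, name):
        self.name = name

    def handle(self, client_id, amount):
        # All client state is read from and written back to the shared
        # store, never held in instance fields across round trips.
        total = session_store.get(client_id, 0) + amount
        session_store[client_id] = total
        return total

# Successive round trips from one client land on alternating servers,
# yet the running total stays consistent.
farm = [MiddleTierServer("web1"), MiddleTierServer("web2")]
totals = [farm[i % len(farm)].handle("client-42", 10) for i in range(4)]
assert totals == [10, 20, 30, 40]
```

Adding a third server to `farm` requires no change to the component itself, which is exactly what lets the farm grow or shrink as scaling dictates.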
Load Balancing and Clustering
Let’s take a look at two Windows Server 2003 technologies and see how they relate to scaling an application.
With Windows Server 2003, Microsoft uses a two-part clustering strategy:
Network Load Balancing (NLB) provides load balancing support for IP-based applications and services that require high scalability and availability.
Server Cluster provides failover support for applications and services that require high availability, scalability, and reliability.
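The failover half of this strategy can be illustrated with a short sketch (hypothetical names, not the Server Cluster API): the service runs on one node at a time, and when that node fails, ownership moves to the next healthy node.

```python
class ClusterNode:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

def active_node(nodes):
    # The service runs on the first healthy node in the preference list;
    # when that node fails, ownership moves to the next healthy one.
    for node in nodes:
        if node.healthy:
            return node.name
    raise RuntimeError("no healthy node available")

nodes = [ClusterNode("node1"), ClusterNode("node2")]
assert active_node(nodes) == "node1"   # normal operation
nodes[0].healthy = False               # node1 fails
assert active_node(nodes) == "node2"   # service fails over to node2
```

Note the contrast with load balancing: here only one node serves at a time, so the goal is availability rather than added throughput.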
Network Load Balancing is a clustering technology that distributes incoming TCP requests across the servers in a cluster. For instance, if there are two servers in a cluster, NLB allocates TCP requests across those two servers. NLB is easy to set up and is included in all Windows Server 2003 products.
Implementing a Scalable Architecture