A Word on Scalability


Scalability is frequently used as a magic incantation to indicate that something is badly designed or broken. Often you hear in a discussion “but that doesn’t scale” as the magical word to end an argument. This is often an indication that developers are running into situations where the architecture of their system limits their ability to grow their service. If scalability is used in a positive sense it is in general to indicate a desired property as in “our platform needs good scalability”.

What is it that we really mean by scalability? A service is said to be scalable if when we increase the resources in a system, it results in increased performance in a manner proportional to resources added. Increasing performance in general means serving more units of work, but it can also be to handle larger units of work, such as when datasets grow.
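One way to make "in a manner proportional to resources added" concrete is a scaling-efficiency ratio: the observed growth in throughput divided by the growth in resources. The sketch below is an illustration, not a definition taken from this post; the function name and the sample measurements are assumptions.

```python
def scaling_efficiency(base_nodes, base_throughput, new_nodes, new_throughput):
    """Return throughput growth divided by resource growth.

    1.0 means performance grew in proportion to the resources added;
    values below 1.0 mean the added resources were only partly
    converted into performance.
    """
    if base_nodes <= 0 or new_nodes <= 0 or base_throughput <= 0:
        raise ValueError("node counts and base throughput must be positive")
    resource_growth = new_nodes / base_nodes
    throughput_growth = new_throughput / base_throughput
    return throughput_growth / resource_growth


# Hypothetical measurements in requests per second, before and after scale-out.
proportional = scaling_efficiency(4, 2000, 8, 4000)  # 2x nodes, 2x throughput
sublinear = scaling_efficiency(4, 2000, 8, 3000)     # 2x nodes, 1.5x throughput
```

Here `proportional` evaluates to 1.0 and `sublinear` to 0.75. The same ratio works when the unit of performance is dataset size handled rather than requests served.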

In distributed systems there are other reasons for adding resources to a system; for example to improve the reliability of the offered service. Introducing redundancy is an important first line of defense against failures. An always-on service is said to be scalable if adding resources to facilitate redundancy does not result in a loss of performance.

Why is scalability so hard? Because scalability cannot be an after-thought. It requires applications and platforms to be designed with scaling in mind, so that adding resources actually improves performance and introducing redundancy does not adversely affect it. Many algorithms that perform reasonably well under low load and with small datasets can explode in cost if request rates increase, the dataset grows, or the number of nodes in the distributed system increases.
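As a hypothetical illustration of a design that is cheap at small scale and explodes as nodes are added, consider a membership scheme in which every node sends a heartbeat to every other node each round. This scheme is an assumed example, not one named in the post, and `full_mesh_heartbeats` is an illustrative helper.

```python
def full_mesh_heartbeats(nodes):
    """Messages sent per round when every node pings every other node."""
    if nodes < 0:
        raise ValueError("node count cannot be negative")
    return nodes * (nodes - 1)


# Message count per round grows quadratically with cluster size.
for nodes in (5, 50, 500):
    print(nodes, full_mesh_heartbeats(nodes))
```

Going from 5 to 500 nodes raises the per-round message count from 20 to 249,500, so a 100-fold increase in nodes costs far more than 100 times the traffic.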

A second problem area is that growing a system through scale-out generally results in a system that has to come to terms with heterogeneity. Resources in the system increase in diversity as next generations of hardware come on line, as bigger or more powerful resources become more cost-effective or when some resources are placed further apart. Heterogeneity means that some nodes will be able to process faster or store more data than other nodes in a system and algorithms that rely on uniformity either break down under these conditions or underutilize the newer resources.
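The underutilization effect can be sketched with a small calculation. It assumes perfectly divisible work and nodes described by a single speed number, which real systems rarely offer; the function names and the cluster below are hypothetical.

```python
def finish_time_uniform(total_work, speeds):
    """Time to finish when every node gets an equal share of the work."""
    share = total_work / len(speeds)
    # The slowest node determines when the whole job is done.
    return max(share / speed for speed in speeds)


def finish_time_weighted(total_work, speeds):
    """Time to finish when each node's share is proportional to its speed."""
    # All nodes finish at the same moment, so no capacity sits idle.
    return total_work / sum(speeds)


# Hypothetical cluster: three older nodes (100 units/s) and one newer (300 units/s).
speeds = [100, 100, 100, 300]
uniform = finish_time_uniform(1200, speeds)    # 300 units each; old nodes need 3 s
weighted = finish_time_weighted(1200, speeds)  # 1200 units / 600 units/s = 2 s
```

With a uniform split the newer node finishes after 1 second and then idles for 2 seconds; weighting shares by capacity brings the total time from 3 seconds down to 2.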

Is achieving good scalability possible? Absolutely, but only if we architect and engineer our systems to take scalability into account. For the systems we build we must carefully inspect along which axis we expect the system to grow, where redundancy is required, and how one should handle heterogeneity in the system. We must also make sure that architects are aware of which tools they can use under which conditions, and what the common pitfalls are.

3 TrackBacks

Listed below are links to blogs that reference this entry: A Word on Scalability.

» RE: A Word on Scalability from Realtime ruminations

Werner Vogels, the Amazon.com CTO, recently posted a note on Scalability - a topic that is close to my heart and something I have spoken about before as Performance vs. Scalability, Performance vs. Scalability Part 2 and Performance vs. Scalability ...

» Scalability from CBDI Forum News Blog

Werner Vogels (Amazon CTO) has just posted some notes about Scalability, which he defines thus: A service is said to be scalable if when we increase the resources in a system, it results in increased performance in a manner proportional to resources added

» ReaderWriterLock versus Monitor from Distribute This

Which synchronization primitive is best to use? ReaderWriterLock or Monitor. Background For those...

7 Comments

sree said:

So I have to admit, "scalability" is often the end-of-debate argument at AOL, too - I'd say it's in fact a common derision point (as in, "Oh, that doesn't scale"). Arguably, we look at "scale" as a core company competency.

I'm curious, though, while scale indeed is tough to fix later and can cause all sorts of issues and bumps in the road, have you found cases where it is FATAL? Business ventures, companies, or products that completely died on the vine because of this?

Or is this a problem that, when required, you find people rise to the occasion to address?

I guess my point is that most people don't know what "scalability" really means until they've been bitten by it more than a few times - and usually they live to fix the issue (though it's so painful they attempt to pass on the wisdom).

Miles Barr said:

Interesting post. I've been tasked with building a distributed data repository and need to get up to speed on the key issues pretty quickly.

This book looks like a good summary of the current state of the art. Do you have any favourite books, or a reading list on the subject?


Cheers,
Miles

Emmanuel Cecchet said:

Hi Werner,

I think that your statement "A service is said to be scalable if when we increase the resources in a system, it results in increased performance in a manner proportional to resources added" is ambiguous.

I can have a perfectly scalable system but if no resource is maxed out before I add new resources, it is unlikely that I will see any performance improvement. I think that there is often a misunderstanding between the scalability and its relation to the workload of the system.

I would rather say that scalability should be a constant ratio between workload and throughput. If we increase the workload proportionally to the resources we add, then the throughput should increase in that same proportion.


Best regards,
Emmanuel

wqt said:

Question: In the "scalability" definition, "performance" is used. How is that defined - throughput, response time, both, or something else?

Q

Dan Ciruli said:

I'm a big believer in designing for scalability as well. I'm also a believer in using grid computing as one of the tools to provide scalability.

A grid is a good solution to the heterogeneity you mention as an obstacle. By their very nature, grid solutions are built to handle heterogeneity--that means that they will take advantage of the newer, faster hardware as much as possible, while still using the older hardware as effectively as possible. They're also designed to be very scalable--from 5 nodes to thousands.

Note that I'm not advocating using desktop cycle-scavenging to serve critical, public-facing applications. Rather, I see a grid deployed within a datacenter as a powerful tool for scalability.

G. Roper said:

An operational definition of scalability was provided by Michael D. Kersey on an old newsgroup thread:
http://groups.google.com/group/microsoft.public.inetserver.asp.components/msg/d9846b908f678f15?hl=en&
To quote from that newsgroup post:

“IMO a reasonable definition of scalability for a given platform P and application A is
S(A,P) = R(A,P) / C(A,P)
where
R = Maximum number of requests processed per second by application A on platform P,
C = Cost of hardware and software to develop and support application A on platform P.

I’ve assumed 100% availability for the purposes of this discussion. Availability could be added as an input to the definition if desired. This term displays the expected behavior shown by common usage of the term “scalability”:
1. As throughput R increases, scalability increases,
2. As cost C increases, scalability decreases,
3. Different platforms and different software may be compared using this definition,
4. You can use this definition to estimate costs of a proposed system, given an anticipated user load.
5. Both R and C can be estimated using known techniques.

So using this definition, scalability’s dimensions would be “requests processed per second per dollar”. Given the following known values for a single application Z:

running on platform X:
R(Z) = 1000 requests/second,
C(Z) = $40,000
S(Z) = 1000 requests/second / $40,000 = 0.025

running on not-so-fast but less expensive platform Y:
R(Z) = 500 requests/second,
C(Z) = $10,000
S(Z) = 500 requests/second / $10,000 = 0.05

While platform Y’s throughput (performance) is much less than that of platform X, Y is much more scalable than (in fact is twice as scalable as) platform X when running application Z.

This definition can also be used to estimate the utility of using various software methodologies. For example, heavy use of components or object technology may or may not change each factor in the definition: the degree to which each is changed determines whether the resultant system is more or less scalable.”
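The arithmetic in the quoted definition can be reproduced in a few lines. This is a minimal sketch of S = R / C using the figures from the quote; the function name is an assumption, and the cost guard is added only for illustration.

```python
def scalability(requests_per_second, cost_dollars):
    """S(A,P) = R(A,P) / C(A,P), in requests per second per dollar."""
    if cost_dollars <= 0:
        raise ValueError("cost must be positive")
    return requests_per_second / cost_dollars


# Figures from the quoted newsgroup post for application Z.
s_x = scalability(1000, 40_000)  # platform X: 0.025
s_y = scalability(500, 10_000)   # platform Y: 0.05
ratio = s_y / s_x                # how many times more scalable Y is than X
```

The ratio comes out at 2, matching the quoted conclusion that platform Y is twice as scalable as platform X under this definition despite its lower throughput.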

Nati Shalom said:

In his excellent book Pro-JavaEE, Steve Haines discusses the topic of scalability and performance. Here's Steve's definition of Scalability vs Performance:

"The terms “performance” and “scalability” are commonly used interchangeably, but the two are distinct: performance measures the speed with which a single request can be executed, while scalability measures the ability of a request to maintain its performance under increasing load. For example, the performance of a request may be reported as generating a valid response within three seconds, but the scalability of the request measures the request’s ability to maintain that three-second response time as the user load increases."

Quoting Werner Vogels from this post "Why is scalability so hard? Because scalability cannot be an after-thought."

It looks to me that we have all come to the same realization: scalability is something we need to plan for, and it's not just an optimization. It has to be part of the design at the early stages.

A few of the questions that remain open:
- How can we ensure true linear scalability in a stateful environment?
- How can we make it simple enough that people who are not familiar with all the theory behind scalability will be able to build scalable applications in a simple way?

I tried to put my thoughts together around that topic in a series of blog posts, the latest one being "The true meaning of linear scalability".

It would be interesting to see what others think of those ideas and whether we can come up with some sort of blueprint in that area, or at least best practices such as the one suggested here: Space Based Architecture.

About this Entry

This page contains a single entry by Werner Vogels published on March 30, 2006 5:21 PM.
