The workplace of the future
We already have an idea of how digitalization, and above all new technologies like machine learning, big-data analytics or IoT, will change companies' business models — and are already changing them on a wide scale. So now's the time to examine more closely how different facets of the workplace will look and the role humans will have.
In fact, the future is already here – but it's still not evenly distributed. Science fiction author William Gibson said that nearly 20 years ago. We can observe a gap between the haves and the have-nots: namely between those who are already using future technologies and those who are not. The consequences of this are particularly visible on the labor market many people still don't know which skills will be required in the future or how to learn them.
Against that background, it's natural for people – even young digital natives – to feel some growing uncertainty. According to a Gallup poll, 37% of millennials are afraid of losing their jobs in the next 20 years due to AI. At the same time there are many grounds for optimism. Studies by the German ZEW Center for European Economic Research, for example, have found that companies that invest in digitalization create significantly more jobs than companies that don't.
How many of the jobs that we know today will even exist in the future? Which human activities can be taken over by machines or ML-based systems? Which tasks will be left over for humans to do? And will there be completely new types of the jobs in the future that we can't even imagine today?
Future of work or work of the future?
All of these questions are legitimate. "But where there is danger, a rescuing element grows as well." German poet Friedrich Hölderlin knew that already in the 19th century. As for me, I'm a technology optimist: Using technology to drive customer-centric convenience, such as in the cashier-less Amazon Go stores, will create shifts in where jobs are created. Thinking about the work of tomorrow, it doesn't help to base the discussion on structures that exist today. After the refrigerator was invented in the 1930s, many people who worked in businesses that sold ice feared for their jobs. Indeed, refrigerators made this business superfluous for the most part; but in its place, many new jobs were created. For example, companies that produced refrigerators needed people to build them, and now that food could be preserved, whole new businesses were created which were targeted at that market. We should not let ourselves be guided in our thinking by the perception of work as we know it today. Instead, we should think about how the workplace could look like in the future. And to do that, we need to ask ourselves an entirely different question, namely: What is changing in the workplace, both from an organizational and qualitative standpoint?
Many of the tasks carried out by people in manufacturing, for example, have remained similar over time in terms of the workflows. Even the activities of doctors, lawyers or taxi drivers have hardly changed in the last decade, at least in terms of their underlying processes. Only parts of the processes are being performed by machines, or at least supported by them. Ultimately, the desired product or service is delivered in – hopefully – the desired quality. But in the age of digitalization, people do much more than fill the gaps between the machines. The work done by humans and machines is built around solving customer problems. It's no longer about producing a car, but about the service "mobility", about bringing people to a specific location. "I want to be in a central place in Berlin as quickly as possible" is the requirement that needs to be fulfilled. In the first step we might reach this goal by combining the fastest mobility services through a digital platform; in the next, it might be a task fulfilled by Virtual Reality. These new offerings are organized on platforms or networks, and less so in processes. And artificial intelligence makes it possible to break down tasks in such a way that everyone contributes what he or she can do best. People define problems and pre-structure them, and machines or algorithms develop solutions that people evaluate in the end.
Radiologists are now assisted by machine-learning-driven tools that allow them to evaluate digital content in ways that were not possible before. Many radiologists have even claimed that ML-driven advice has significantly improved their ability to interpret X-rays.
I would even go a step further because I believe it's possible to "rehumanize" work and make our unique abilities as human beings even more important. Until now, access to digital technologies was limited above all by a machine's abilities: The interfaces to our systems are no longer machine-driven; in the future humans will be the starting point. For example, anyone who wanted to teach a robot how to walk in the age of automation had to exactly calculate every single angle of slope from the upper to lower thigh, as well as the speed of movement and other parameters, and then formulate them as a command in a programming language. In the future, we'll be able to communicate and work with robots more intensively in our "language". So teaching a robot to walk will be much easier in the future. The robot can be controlled by anyone via voice command, and it could train itself by analyzing how humans do it via a motion scanner, applying the process, and perfecting it.
With the new technological possibilities and greater computing power, work in the future will be more focused on people and less on machines. Machine learning can make human labor more effective. Companies like C-SPAN show how: scores of people would have to scan video material for hours in order to create keywords, for example, according to a person's name. Today, automated face recognition can do this task in seconds, allowing employees to immediately begin working with the results.
Redefining the relationship between human and machine
The progress at the interface of human and machine is happening at a very fast pace with already a visible impact on how we work. In the future, technology can become a much more natural part of our workplace that can be activated by several input methods — speaking, seeing, touching or even smelling. Take voice-control technologies, a field that is currently undergoing a real disruption. This area distinguishes itself radically from what we knew until now as the "hands-free" work approach, which ran purely through simple voice commands. Modern voice-control systems can understand, interpret and answer conversations in a professional way, which makes a lot of work processes easier to perform. Examples are giving diagnoses to patients or legal advice. At the end of 2018, voice (input) will have already significantly changed the way we develop devices and apps. People will be able to connect technologies into their work primarily through voice. One can already get an inkling of what that looks like in detail.
At the US space agency NASA, for example, Amazon Alexa organizes the ordering of conference rooms. A room doesn't always have to be requested for every single meeting. Rather, anyone who needs a room asks Alexa and the rest happens automatically. Everyone knows the stress caused by telephone conferences: they never start on time because someone hasn't found the right dial-in number and it takes a while until you've typed in the 8-digit number plus a 6-digit conference code. A voice command creates a lot more productivity. The AWS Service Transcribe could start creating a transcript right away during the meeting and send it to all participants afterwards. Other companies, such as the Japanese firm Mitsui or the software provider bmc, use Alexa for Business to achieve a more efficient and better collaboration between their employees, among others.
The software provider fme also uses voice control to offer its customers innovative applications in the field of business intelligence, social business collaboration and enterprise-content-management technologies. The customers of fme mainly come from life sciences and industrial manufacturing. Employees can search different types of content using voice control, navigate easily through the content, and have the content displayed or read to them. Users can have Alexa explain individual tasks to them in OpenText Documentum, to give another example. This could be used to make the onboarding of new employees faster and cheaper – their managers would not have to perform the same information ritual again and again. A similar approach can be found at pharmaceutical company AstraZeneca, which uses Alexa in its manufacturing: Team members can ask questions about standard processes to find out what they need to do next.
Of course, responsibilities and organizations will change as a result of these developments. Resources for administrative tasks can be turned into activities that have a direct benefit for the customer. Regarding the character of work in the future, we will probably need more "architects," "developers," "creatives," "relationship experts," "platform specialists," and "analysts" and fewer people who need to perform tasks according to certain pre-determined steps, as well as fewer "administrators". By speaking more to humans' need to create and shape, work might ultimately become more fulfilling and enjoyable.
Expanding the digital world
This new understanding of the relationship between man and machine has another important effect: It will significantly expand the number of people who can participate in digital value creation: older people, people who at the moment don't have access to a computer or smartphone, people for whom using the smartphone in a specific situation is too complicated, and people in developing countries who can't read or write. A good example of the latter is rice farmers who work with the International Rice Research Institute, an organization based near Manila, the Philippines. The institute's mission is to fight poverty, hunger and malnutrition by easing the lives and work of rice farmers. Rice farmers can benefit from knowledge to which they wouldn't have access were they on their own. The institute has saved 70,000 DNA sequences of different types of rice, from which conclusions can be drawn about the best conditions for growing rice. Every village has a telephone, and by using it the farmers can access this knowledge: they select their dialect in a menu and describe which piece of land they tend. The service is based on machine learning. It generates recommendations on how much fertilizer is needed and when the best time is to plant the crops. So with the help of digital technologies, farmers can see how their work becomes more valuable: a given amount of effort produces a richer harvest of rice.
Until now we only have a tiny insight into the possibilities for the world of work. But they make clear that the quality of work for us humans will most probably increase, and that technology can allow us to perform many activities that we still cannot imagine today. Although there are twice as many robots per capita in German companies than in US firms, German industry still has trouble finding qualified employees rather than having to fight unemployment. In the future we humans will be able to carry out activities in a way that is closer to our creative human nature than is the case today. I believe that if we want to do justice to the technological possibilities, we should do it like Hölderlin and have faith in the rescue, but at the same time try to minimize the risks by understanding and shaping things.
I wrote to you over two years ago about what happens under the hood of Amazon ECS. Last year at re:Invent, we launched AWS Fargate, and today, I want to explore how Fargate fundamentally changes the landscape of container technology.
I spend a lot of time talking to our customers and leaders at Amazon about innovation. One of the things I've noticed is that ideas and technologies which dramatically change the way we do things are rarely new. They're often the combination of an existing concept with an approach, technology, or capability in a particular way that's never been successfully tried before.
The rapid embrace of containers in the past four years is the result of blending old technology (containers) with a new toolchain and workflow (i.e., Docker), and the cloud. In our industry, four years is a long time, but I think we've only just started exploring how this combination of code packaging, well-designed workflows, and the cloud can reshape the ability of developers to quickly build applications and innovate.
Containers solve a fundamental code portability problem and enable new infrastructure patterns on the cloud. Having a consistent, immutable unit of deployment to work with lets you abstract away all the complexities of configuring your servers and deployment pipelines every time you change your code or want to run your app in a different place. But containers also put another layer between your code and where it runs. They are an important, but incremental, step on the journey of being able to write code and have it run in the right place, with the right scale, with the right connections to other bits of code, and the right security and access controls.
Solving these higher order problems of deploying, scheduling, and connecting containers across environments gave us container management tools. Container orchestration has always seemed to me to be very not cloud native. Managing a large server cluster and optimizing the scheduling of containers all backed by a complex distributed state store is counter to the premise of the cloud. Customers choose the cloud to pay as they go, not have to guess capacity, get deep operational control without operational burden, build loosely coupled services with limited blast radii to prevent failures, and self-service for everything they need to run their code.
You should be able to write your code and have it run, without having to worry about configuring complex management tools, open source or not. This is the vision behind AWS Fargate. With Fargate, you don't need to stand up a control plane, choose the right instance type, or configure all the other components of your application stack like networking, scaling, service discovery, load balancing, security groups, permissions, or secrets management. You simply build your container image, define how and where you want it to run, and pay for the resources you need. Fargate has native integrations to Amazon VPC, Auto Scaling, Elastic Load Balancing, IAM roles, and Secrets Management. We've taken the time to make Fargate production ready with a 99.99% uptime SLA and compliance with PCI, SOC, ISO, and HIPAA.
With AWS Fargate, you can provision resources to run your containers at a much finer grain than with an EC2 instance. You can select exactly the CPU and memory your code needs and the amount you pay scales exactly with how many containers you run. You don't have to guess at capacity to handle spikey traffic, and you get the benefit of perfect scale, which lets you offload a ton of operational effort onto the cloud. MiB for MiB, this might mean that cloud native technologies like Fargate look more expensive than more traditional VM infrastructure on paper. But if you look at the full cost of running an app, we believe most applications will be cheaper with Fargate as you only pay what you need. Our customers running Fargate see big savings in the developer hours required to keep their apps running smoothly.
The entire ecosystem of container orchestration solutions arose out of necessity because there was no way to natively procure a container in the cloud. Whether you use Kubernetes, Mesos, Rancher, Nomad, ECS or any other system no longer matters anymore because with Fargate, there is nothing to orchestrate. The only thing that you have to manage is the construction of the applications themselves. AWS Fargate finally makes containers cloud native.
I think the next area of innovation we will see after moving away from thinking about underlying infrastructure is application and service management. How do you interconnect the different containers that run independent services, ensure visibility, manage traffic patterns, and security for multiple services at scale? How do independent services mutually discover one another? How do you define access to common data stores? How do you define and group services into applications? Cloud native is about having as much control as you want and I am very excited to see how the container ecosystem will evolve over the next few years to give you more control with less work. We look forward to working with the community to innovate forward on the cloud native journey on behalf of our customers.
AWS Fargate already seamlessly integrates with Amazon ECS. You just define your application as you do for Amazon ECS. You package your application into task definitions, specify the CPU and memory needed, define the networking and IAM policies that each container needs, and upload everything to Amazon ECS. After everything is setup, AWS Fargate launches and manages your containers for you.
AWS Fargate support for Amazon EKS, the Elastic Kubernetes Service, will be available later in 2018.
At AWS, we don't mark many anniversaries. But every year when March 14th comes around, it's a good reminder that Amazon S3 originally launched on Pi Day, March 14, 2006. The Amazon S3 team still celebrate with homemade pies!
March 26, 2008 doesn't have any delicious desserts associated with it, but that's the day when we launched Availability Zones for Amazon EC2. A concept that has changed infrastructure architecture is now at the core of both AWS and customer reliability and operations.
Powering the virtual instances and other resources that make up the AWS Cloud are real physical data centers with AWS servers in them. Each data center is highly reliable, and has redundant power, including UPS and generators. Even though the network design for each data center is massively redundant, interruptions can still occur.
Availability Zones draw a hard line around the scope and magnitude of those interruptions. No two zones are allowed to share low-level core dependencies, such as power supply or a core network. Different zones can't even be in the same building, although sometimes they are large enough that a single zone spans several buildings.
We launched with three autonomous Availability Zones in our US East (N. Virginia) Region. By using zones, and failover mechanisms such as Elastic IP addresses and Elastic Load Balancing, you can provision your infrastructure with redundancy in mind. When two instances are in different zones, and one suffers from a low-level interruption, the other instance should be unaffected.
How Availability Zones have changed over the years
Availability Zones were originally designed for physical redundancy, but over time they have become re-used for more and more purposes. Zones impact how we build, deploy, and operate software, as well as how we enforce security controls between our largest systems.
For example, many AWS services are now built so that as much functionality as possible can be autonomous within an Availability Zone. The calls used to launch and manage EC2 instances, fail over an RDS instance, or handle the health of instances behind a load balancer, all work within one zone.
This design has a double benefit. First, if an Availability Zone does lose power or connectivity, the remaining zones are unaffected. The second benefit is even more powerful: if there is an error in the software, the risk of that error affecting other zones is minimized.
We maximize this benefit when we deploy new versions of our software, or operational changes such as a configuration edit, as we often do so zone-by-zone, one zone in a Region at a time. Although we automate, and don't manage instances by hand, our developers and operators know not to build tools or procedures that could impact multiple Availability Zones. I'd wager that every new AWS engineer knows within their first week, if not their first day, that we never want to touch more than one zone at a time.
Availability Zones run deep in our AWS development and operations culture, at every level. AWS customers can think of zones in terms of redundancy, "Use two or more Availability Zones for reliability." At AWS, we think of zones in terms of isolation, "Stay within the Availability Zone, as much as possible."
Silo your traffic or not – you choose
When your architecture does stay within an Availability Zone as much as possible, there are more benefits. One is that the latency within a zone is incredibly fast. Today, packets between EC2 instances in the same zone take just tens of microseconds to reach other.
Another benefit is that redundant zonal architectures are easier to recover from complex issues and emergent behaviors. If all of the calls between the various layers of a service stay within one Availability Zone, then when issues occur they can quickly be remediated by removing the entire zone from service, without needing to identify the layer or component that was the trigger.
Many of you also use this kind of "silo" pattern in your own architecture, where Amazon Route 53 or Elastic Load Balancing can be used to choose an Availability Zone to handle a request, but can also be used to keep subsequent internal requests and dependencies within that same zone. This is only meaningful because of the strong boundaries and separation between zones at the AWS level.
Not too long after we launched Availability Zones, we also launched our second Region, EU (Ireland). Early in the design, we considered operating a seamless global network, with open connectivity between instances in each Region. Services such as S3 would have behaved as "one big S3," with keys and data accessible and mutable from either location.
The more we thought through this design, the more we realized that there would be risks of issues and errors spreading between Regions, potentially resulting in large-scale interruptions that would defeat our most important goals:
- To provide the highest levels of availability
- To allow Regions to act as standby sites for each other
- To provide geographic diversity and lower latencies to end users
Our experience with the benefits of Availability Zones meant that instead we doubled down on compartmentalization, and decided to isolate Regions from each other with our hardest boundaries. Since then, and still today, our services operate autonomously in each Region, full stacks of S3, DynamoDB, Amazon RDS, and everything else.
Many of you still want to be able to run workloads and access data globally. For our edge services such as Amazon CloudFront, Amazon Route 53, and AWS Lambda@Edge, we operate over 100 points of presence. Each is its own Availability Zone with its own compartmentalization.
As we develop and ship our services that span Regions, such as S3 cross-region object replication, Amazon DynamoDB global tables, and Amazon VPC inter-region peering, we take enormous care to ensure that the dependencies and calling patterns between Regions are asynchronous and ring-fenced with high-level safety mechanisms that prevent errors from spreading.
Doubling down on compartmentalization, again
With the phenomenal growth of AWS, it can be humbling how many customers are being served even by our smallest Availability Zones. For some time now, many of our services have been operating service stacks that are compartmentalized even within zones.
For example, AWS HyperPlane—the internal service that powers NAT gateways, Network Load Balancers, and AWS PrivateLink—is internally subdivided into cells that each handle a distinct set of customers. If there are any issues with a cell, the impact is limited not just to an Availability Zone, but to a subset of customers within that zone. Of course, all sorts of automation immediately kick in to mitigate any impact to even that subset.
Ten years after launching Availability Zones, we're excited that we're still relentless about reducing the impact of potential issues. We firmly believe it's one of the most important strategies for achieving our top goals of security and availability. We now have 54 Availability Zones, across 18 geographic Regions, and we've announced plans for 12 more. Beyond that geographic growth, we'll be extending the concept of compartmentalization that underlies Availability Zones deeper and deeper, to be more effective than ever.
In machine learning, more is usually more. For example, training on more data means more accurate models.
At AWS, we continue to strive to enable builders to build cutting-edge technologies faster in a secure, reliable, and scalable fashion. Machine learning is one such transformational technology that is top of mind not only for CIOs and CEOs, but also developers and data scientists. Last re:Invent, to make the problem of authoring, training, and hosting ML models easier, faster, and more reliable, we launched Amazon SageMaker. Now, thousands of customers are trying Amazon SageMaker and building ML models on top of their data lakes in AWS.
While building Amazon SageMaker and applying it for large-scale machine learning problems, we realized that scalability is one of the key aspects that we need to focus on. So, when designing Amazon SageMaker we took on a challenge: to build machine learning algorithms that can handle an infinite amount of data. What does that even mean though? Clearly, no customer has an infinite amount of data.
Nevertheless, for many customers, the amount of data that they have is indistinguishable from infinite. Bill Simmons, CTO of Dataxu, states, "We process 3 million ad requests a second - 100,000 features per request. That's 250 trillion ad requests per day. Not your run-of-the-mill data science problem!" For these customers and many more, the notion of "the data" does not exist. It's not static. Data always keeps being accrued. Their answer to the question "how much data do you have?" is "how much can you handle?"
To make things even more challenging, a system that can handle a single large training job is not nearly good enough if training jobs are slow or expensive. Machine learning models are usually trained tens or hundreds of times. During development, many different versions of the eventual training job are run. Then, to choose the best hyperparameters, many training jobs are run simultaneously with slightly different configurations. Finally, re-training is performed every x-many minutes/hours/days to keep the models updated with new data. In fraud or abuse prevention applications, models often need to react to new patterns in minutes or even seconds!
To that end, Amazon SageMaker offers algorithms that train on indistinguishable-from-infinite amounts of data both quickly and cheaply. This sounds like a pipe dream. Nevertheless, this is exactly what we set out to do. This post lifts the veil on some of the scientific, system design, and engineering decisions we made along the way.
To handle unbounded amounts of data, our algorithms adopt a streaming computational model. In the streaming model, the algorithm only passes over the dataset one time and assumes a fixed-memory footprint. This memory restriction precludes basic operations like storing the data in memory, random access to individual records, shuffling the data, reading through the data several times, etc.
Streaming algorithms are infinitely scalable in the sense that they can consume any amount of data. The cost of adding more data points is independent of the entire corpus size. In other words, processing the 10th gigabyte and 1000th gigabyte is conceptually the same. The memory footprint of the algorithms is fixed and it is therefore guaranteed not to run out of memory (and crash) as the data grows. The compute cost and training time depend linearly on the data size. Training on twice as much data costs twice as much and take twice as long.
Finally, traditional machine learning algorithms usually consume data from persistent data sources such as local disk, Amazon S3, or Amazon EBS. Streaming algorithms also natively consume ephemeral data sources such as Amazon Kinesis streams, pipes, database query results, and almost any other data source.
Another significant advantage of streaming algorithms is the notion of a state. The algorithm state contains all the variables, statistics, and data structures needed to perform updates, that is, all that is required to continue training. By formalizing this concept and facilitating it with software abstractions, we provide checkpointing capabilities and fault resiliency for all algorithms. Moreover, checkpointing enables multi-pass/multi-epoch training for persistent data, a pause/resume mechanism that is useful for cost effective HPO, and incremental training that updates the model only with new data rather running the entire training job from scratch.
Acceleration and distribution
When AWS customers run large-scale training tasks on Amazon SageMaker, they are interested in reducing the running time and cost of their job, irrespective of the number and kinds of machines used under the hood. Amazon SageMaker algorithms are therefore built to take advantage of many Amazon EC2 instance types, support both CPU and GPU computation, and distribute across many machines.
Cross-instance support relies heavily on containerization. Amazon SageMaker training supports powerful container management mechanisms that include spinning up large numbers of containers on different hardware with fast networking and access to the underlying hardware, such as GPUs. For example, a training job that takes ten hours to run on a single machine can be run on 10 machines and conclude in one hour. Furthermore, switching those machines to GPU-enabled ones could reduce the running time to minutes. This can all be done without touching a single line of code.
To seamlessly switch between CPU and GPU machines, we use Apache MXNet to interface with the underlying hardware. By designing algorithms that operate efficiently on different types of hardware, our algorithms gain record speeds and efficiency.
Distribution across machines is achieved via a parameter server that stores the state of all the machines participating in training. The parameter server is designed for maximal throughput by updating parameters asynchronously and offering only loose consistency properties of the parameters. While these are unacceptable in typical relational database designs, for machine learning, the tradeoff between accuracy and speed is worth it.
Post-training model tuning and rich states
Processing massively scalable datasets in a streaming manner poses a challenge for model tuning, also known as hyperparameter optimization (HPO). In HPO, many training jobs are run with different configurations or training parameters. The goal is to find the best configuration, usually the one corresponding to the best test accuracy. This is impossible in the streaming setting.
For ephemeral data sources, the data is no longer available for rerunning the training job (for persistent data sources, this is possible but inefficient). Amazon SageMaker algorithms solve this by training an expressive state object, out of which many different models can be created. That is, a large number of different training configurations can be explored after only a single training job.
Amazon SageMaker offers production-ready, infinitely scalable algorithms such as:
- Linear Learner
- Factorization Machines
- Neural Topic Modeling
- Principal Component Analysis (PCA)
- K-Means clustering
- DeepAR forecasting
They adhere to the design principles above and rely on Amazon SageMaker's robust training stack. They are operationalized by a thick, common SDK that allows us to test them thoroughly before deployment. We have invested heavily in the research and development of each algorithm, and every one of them advances the state of the art. Amazon SageMaker algorithms train larger models on more data than any other open-source solution out there. When a comparison is possible, Amazon SageMaker algorithms often run more than 10x faster than other ML solutions like Spark ML. Amazon SageMaker algorithms often cost cents on the dollar to train, in terms of compute costs, and produce more accurate models than the alternatives.
I think the time is here for using large-scale machine learning in large-scale production systems. Companies with truly massive and ever-growing datasets must not fear the overhead of operating large ML systems or developing the associated ML know-how. AWS is delighted to innovate on our customers' behalf and to be a thought leader, especially in exciting areas like machine learning. I hope and believe that Amazon SageMaker and its growing set of algorithms will change the way companies do machine learning.
At Amazon, we are heavily invested in machine learning (ML), and are developing new tools to help developers quickly and easily build, train, and deploy ML models. The power of ML is in its ability to unlock a new set of capabilities that create value for consumers and businesses. A great example of this is the way we are using ML to deal with one of the world's biggest and most tangled datasets: human speech.
Voice-driven conversation has always been the most natural way for us to communicate. Conversations are personal and they convey context, which helps us to understand each other. Conversations continue over time, and develop history, which in turn builds richer context. The challenge was that technology wasn't capable of processing real human conversation.
The interfaces to our digital system have been dictated by the capabilities of our computer systems—keyboards, mice, graphical interfaces, remotes, and touch screens. Touch made things easier; it let us tap on screens to get the app that we wanted. But what if touch isn't possible or practical? Even when it is, the proliferation of apps has created a sort of "app fatigue". This essentially forces us to hunt for the app that we need, and often results in us not using many of the apps that we already have. None of these approaches are particularly natural. As a result, they fail to deliver a truly seamless and customer-centric experience that integrates our digital systems into our analog lives.
Voice becomes a game changer
Using your voice is powerful because it's spontaneous, intuitive, and enables you to interact with technology in the most natural way possible. It may well be considered the universal user interface. When you use your voice, you don't need to adapt and learn a new user interface. Voice interfaces don't need to be application-centric, so you don't have to find an app to accomplish the task that you want. All of these benefits make voice a game changer for interacting with all kinds of digital systems.
Until 2-3 years ago we did not have the capabilities to process voice at scale and in real time. The availability of large scale voice training data, the advances made in software with processing engines such as Caffe, MXNet and Tensorflow, and the rise of massively parallel compute engines with low-latency memory access, such as the Amazon EC2 P3 instances have made voice processing at scale a reality.
Today, the power of voice is most commonly used in the home or in cars to do things like play music, shop, control smart home features, and get directions. A variety of digital assistants are playing a big role here. When we released Amazon Alexa, our intelligent, cloud-based voice service, we built its voice technology on the AWS Natural Language Processing platform powered by ML algorithms. Alexa is constantly learning, and she has tens of thousands of skills that extend beyond the consumer space. But by using the stickiness of voice, we think there are even more scenarios that can be unlocked at work.
Helping more people and organizations use voice
People interact with many different applications and systems at work. So why aren't voice interfaces being used to enable these scenarios? One impediment is the ability to manage voice-controlled interactions and devices at scale, and we are working to address this with Alexa for Business. Alexa for Business helps companies voice-enable their spaces, corporate applications, people, and customers.
To use voice in the workplace, you really need three things. The first is a management layer, which is where Alexa for Business plays. Second, you need a set of APIs to integrate with your IT apps and infrastructure, and third is having voice-enabled devices everywhere.
Voice interfaces are a paradigm shift, and we've worked to remove the heavy lifting associated with integrating Alexa voice capabilities into more devices. For example, Alexa Voice Service (AVS), a cloud-based service that provides APIs to interface with Alexa, enables products built using AVS to have access to Alexa capabilities and skills.
We're also making it easy to build skills for the things you want to do. This is where the Alexa Skills Kit and the Alexa Skills Store can help both companies and developers. Some organizations may want to control who has access to the skills that they build. In those cases, Alexa for Business allows people to create a private skill that can only be accessed by employees in your organization. In just a few months, our customers have built hundreds of private skills that help voice-enabled employees do everything from getting internal news briefings to asking what time their help desk closes.
Just like Alexa is making smart homes easier, the same is possible in the workplace. Alexa can control the environment, help you find directions, book a room, report an issue, or find transportation. One of the biggest applications of voice in the enterprise is conference rooms and we've built some special skills in this area to allow people to be more productive.
For example, many meetings fail to start on time. It's usually a struggle to find the dial-in information, punch in the numbers, and enter a passcode every time a meeting starts. With Alexa for Business, the administrator can configure the conference rooms and integrate calendars to the devices. When you walk into a meeting, all you have to say is "Alexa, start my meeting". Alexa for Business automatically knows what the meeting is from the integrated calendar, mines the dial-in information, dials into the conference provider, and starts the meeting. Furthermore, you can also configure Alexa for Business to automatically lower the projector screen, dim the lights, and more. People who work from home can also take advantage of these capabilities. By using Amazon Echo in their home office and asking Alexa to start the meeting, employees who have Alexa for Business in their workplace are automatically connected to the meeting on their calendar.
Voice interfaces will really hit their stride when we begin to see more voice-enabled applications. Today, Alexa can interact with many corporate applications including Salesforce, Concur, ServiceNow, and more. IT developers who want to take advantage of voice interfaces can enable their custom apps using the Alexa Skills Kit, and make their skills available just for their organization. There are a number of agencies and SIs that can help with this, and there are code repositories with code examples for AWS services.
We're seeing a lot of interesting use cases with Alexa for Business from a wide range of companies. Take WeWork, a provider of shared workspaces and services. WeWork has adopted Alexa, managed by Alexa for Business, in their everyday workflow. They have built private skills for Alexa that employees can use to reserve conference rooms, file help tickets for their community management team, and get important information on the status of meeting rooms. Alexa for Business makes it easy for WeWork to configure and deploy Alexa-enabled devices, and the Alexa skills that they need to improve their employees' productivity.
The next generation of corporate systems and applications will be built using conversational interfaces, and we're beginning to see this happen with customers using Alexa for Business in their workplace. Want to learn more? If you are attending Enterprise Connect in Orlando next week, I encourage you to attend the AWS keynote on March 13 given by Collin Davis. Collin's team has focused on helping customers use voice to manage everyday tasks. He'll have more to share about the advances we're seeing in this space, and what we're doing to help our customers be successful in a voice-enabled era.
When it comes to enabling voice capabilities at home and in the workplace, we're here to help you build.
This article titled "Daten müssen strategischer Teil des Geschäfts werden" appeared in German last week in the "IT und Datenproduktion" column of Wirtschaftwoche.
How companies can use ideas from mass production to create business with data
Strategically, IT doesn't matter. That was the provocative thesis of a much-talked-about article from 2003 in the Harvard Business Review by the US publicist Nicolas Carr. Back then, companies spent more than half of their entire investment for their IT, in a non-differentiating way. In a world in which tools are equally accessible for every company, they wouldn't offer any competitive advantage – so went the argument. The author recommended steering investments toward strategically relevant resources instead. In the years that followed, many companies outsourced their IT activities because they no longer regarded them as being part of the core business.
A new age
Nearly 15 years later, the situation has changed. In today's era of global digitalization there are many examples that show that IT does matter. Developments like cloud computing, the internet of things, artificial intelligence, and machine learning are proving that IT has (again) become a strategic business driver. This is transforming the way companies offer products and services to their customers today. Take the example of industrial manufacturing: in prototyping, drafts for technologically complex products are no longer physically produced; rather, their characteristics can be tested in a purely virtual fashion at every location across the globe by using simulations. The German startup SimScale makes use of this trend. The founders had noticed that in many companies, product designers worked in a very detached manner from the rest of production. The SimScale platform can be accessed through a normal web browser. In this way, designers are part of an ecosystem in which the functionalities of simulations, data and people come together, enabling them to develop better products faster.
Value-added services are also playing an increasingly important role for both companies and their customers. For example, Kärcher, the maker of cleaning technologies, manages its entire fleet through the cloud solution "Kärcher Fleet". This transmits data from the company's cleaning devices e.g. about the status of maintenance and loading, when the machines are used, and where the machines are located. The benefit for customers: Authorized users can view this data and therefore manage their inventories across different sites, making the maintenance processes much more efficient.
Kärcher benefits as well: By developing this service, the company gets exact insight into how the machines are actually used by its customers. By knowing this, Kärcher can generate new top-line revenue in the form of subscription models for its analysis portal.
More than mere support
These examples underline that the purpose of software today is not solely to support business processes, but that software solutions have broadly become an essential element in multiple business areas. This starts with integrated platforms that can manage all activities, from market research to production to logistics. Today, IT is the foundation of digital business models, and therefore has a value-added role in and of itself. That can be seen when sales people, for example, interact with their customers in online shops or via mobile apps. Marketers use big data and artificial intelligence to find out more about the future needs of their customers. Breuninger, a fashion department store chain steeped in tradition, has recognized this and relies on a self-developed e-commerce platform in the AWS Cloud. Breuninger uses modern templates for software development, such as Self-Contained Systems (SCS), so that it can increase the speed of software development with agile and autonomous teams and quickly test new features. Each team acts according to the principle: "You build it, you run it". Hence, the teams are themselves responsible for the productive operation of the software. The advantage of this approach is that when designing new applications, there is already a focus on the operating aspects.
Value creation through data
In a digital economy, data are at the core of value creation, whereas physical assets are losing their significance in business models. Until 1992, the most highly valued companies in the S&P 500 Index were those that made or distributed things (for example the pharmaceutical industry, trade). Today, developers of technology (for example medical technology, software) and platform operators (social media enablers, credit card companies) are at the top. Also, trade with data contributes more to global growth than trade with goods. Therefore, IT has never been more important for strategy than it is now – not only for us, but for every company in the digital age. Anyone who wants to further develop his business digitally can't do that today without at the same time thinking about which IT infrastructure, which software and which algorithms he needs in order to achieve his plans.
If data take center stage then companies must learn how to create added value out of it – namely by combining the data they own with external data sources and by using modern, automated analytics processes. This is done through software and IT services that are delivered through software APIs.
Companies that want to become successful and innovative digital players need to get better at building software solutions.We should ponder how we can organize the 'production' of data in such a way so that we ultimately come out with a competitive advantage. We need mechanisms that enable the mass production of data using software and hardware capabilities. These mechanisms need to be lean, seamless and effective. At the same time, we need to ensure that quality requirements can be met. Those are exactly the challenges that were solved for physical goods through the industrialization of manufacturing processes. A company that wants to industrialize 'software production' needs to find ideas on how to achieve the same kind of lean and qualitatively first-class mass production that has already occurred for industrial goods. And inevitably, the first place to look will be lean production approaches such as Kanban and Kaizen, or total quality management. In the 1980s, companies like Toyota revolutionized the production process by reengineering the entire organization and focusing the company on similar principles. Creating those conditions, both from an organizational and IT- standpoint, is one of the biggest challenges that companies face in the digital age.
Learn from lean
Can we transfer this success model to IT as well? The answer is yes. In the digital world, it is decisive to activate data-centric processes and continuously improve them. Thus, any obstacles that stand in the way of experimentation and the further development of new ideas should be removed as fast as possible. Every new IT project should be regarded as an idea that must go through a data factory – a fully equipped production site with common processes that can be easily maintained. The end-product is high-quality services or algorithms that support digital business models. Digital companies differentiate themselves through their ideas, data and customer relationships. Those that find a functioning digital business model the fastest will have a competitive edge. Above all, the barrier between software development and the operating business has to be overcome. The reason is that the success and speed and frequency of these experiments depend on the performance of IT development, and at the same time on the relevance of the solutions for business operations. Autoscout24 has gained an enormous amount of agility through its cloud solution. The company meanwhile has 15 autonomous interdisciplinary teams working constantly to test and explore new services. The main goal in all this is to have the possibility to quickly iterate experiments through the widest range of architectures, combine services with each other, and compare approaches.
In order to become as agile as Autoscout24, companies need a "machine" that produces ideas. Why not transfer the success formulas from industrial manufacturing and the principles of quality management to the creation of software?
German industrial companies in particular possess a manufacturing excellence that has been built up over many decades. Where applicable, they should do their best to transfer this knowledge to their IT, and in particular to their software development.
In many companies, internal IT knowhow has not developed fast enough in the last few years – quite contrary to the technological possibilities. Customers provide feedback online immediately after their purchase. Real-time analyses are possible through big data and software updates are generated daily through the cloud. Often, the IT organization and its associated processes couldn't keep up. As a consequence, specialist departments with the structures of yesterday are supposed to fulfill customer requirements of tomorrow. Bringing innovative products and services quickly to market is not possible with long-term IT sourcing cycles. It's no wonder that many of specialist departments try to circumvent their own IT department, for example by shifting activities to the cloud, which offers many powerful IT building blocks through easy-to-use APIs for which companies previously had to operate complicated software and infrastructure. Such a decentralized 'shadow IT' delivers no improvements. The end effect is that the complexity of the system increases, which is not efficient. This pattern should be broken. Development and Operations need to work hand in hand instead of working sequentially after each other, as in the old world. And ideally, this should be done in many projects running parallel. Under the heading of DevOps – the combination of "Development and Operations" – IT guru Gene Kim has described the core characteristics of this machinery.
Ensuring the flow
Kim argues that theorganization must be built around the customer benefit and that the flow of projects must be as smooth as possible. Hurdles that block the creation of client benefits should be identified and removed. At Amazon this starts by staffing projects with cross-functional and interdisciplinary teams as a rule. Furthermore, for the sake of agility the teams should not exceed a certain size. We have a rule that teams should be exactly the size that allows everyone to feel full after eating two (large!) pizzas. This approach reduces the number of necessary handovers, increases responsibility, and allows the team to provide customers with software faster.
The earlier client feedback flows back into the "production process", the better. In addition, companies must ensure that every piece of feedback is applied to future projects. To avoid getting lost in endless feedback loops, this should be organized in a lean way: Obtaining the feedback of internal and external stakeholders must by no means hamper the development process.
Learning to take risks
"Good intentions never work, you need good mechanisms to make anything happen," says Jeff Bezos. For that, you need a corporate culture that teaches employees to experiment constantly and deliver. With every new experiment, one should risk yet another small step forward behind the previous step. At the same time, from every team we need data based on predefined KPIs about the impact of the experiments. And we need to establish mechanisms that take effect immediately if we go too far or if something goes wrong, for example if the solution never reached the customer.
Anyone who has tried this knows it's not easy to start your own digital revolution in the company and keep the momentum going. P3 advises cellular operators and offers its customers access to data that provide information about the quality of cellular networks (for example signal strength, broken connection and the data throughput) – worldwide and independent of the network operator and cellular provider. This allows the customers to come up with measures in order to expand their networks or new offerings for a more efficient utilization of their capacity. By introducing DevOps tools, P3 can define an automated process that implements the required compute infrastructure in the AWS Cloud and deploys project-specific software packages with the push of a button. Moreover, the process definition can be revised by developers, the business or data scientists at any time, for example in order to develop new regions, add analytics software or implement new AWS services. Now P3 can focus fully on its core competence, namely developing its proprietary software. Data scientists can use their freed-up resources to analyze in real time data that are collected from around the world and put insights from the analysis at the disposal of their clients
The cloud offers IT limitless possibilities on the technical side, from which new opportunities have been born. But it's becoming ever clearer what is required in order to make use of these opportunities. Technologies change faster than people. And individuals faster than entire organizations. Tackling these challenges is a strategic necessity. Changing the organization is the next bottleneck on the way to becoming a digital champion.
Today, I'm happy to announce that the AWS EU (Paris) Region, our 18th technology infrastructure Region globally, is now generally available for use by customers worldwide. With this launch, AWS now provides 49 Availability Zones, with another 12 Availability Zones and four Regions in Bahrain, Hong Kong, Sweden, and a second AWS GovCloud (US) Region expected to come online by early 2019.
In France, you can find one of the most vibrant startup ecosystems in the world, a strong research community, excellent energy, telecom, and transportation infrastructure, a very strong agriculture and food industry, and some of the most influential luxury brands in the world. The cloud is an opportunity to stay competitive in each of these domains by giving companies freedom to innovate quickly. This is why tens of thousands of French customers already use AWS in Regions around the world. Starting today, developers, startups, and enterprises, as well as government, education, and non-profit organizations can leverage AWS to run applications and store data in France.
French companies are using AWS to innovate in a secure way across industries as diverse as energy, financial services, manufacturing, media, pharmaceuticals and health sciences, retail, and more. Companies of all sizes across France are also using AWS to innovate and grow, from startups like AlloResto, CaptainDash, Datadome, Drivy, Predicsis, Payplug, and Silkke to enterprises like Decathlon, Nexity, Soitec, TF1 as well as more than 80 percent of companies listed on the CAC 40, like Schneider Electric, Societe Generale, and Veolia.
We are also seeing a strong adoption of AWS within the public sector with organizations using AWS to transform the services they deliver to the citizens of France.Kartable, Les Restos du Coeur, OpenClassrooms, Radio France, SNCF, and many more are using AWS to lower costs and speed up their rate of experimentation so they can deliver reliable, secure, and innovative services to people across the country.
The opening of the AWS EU (Paris) Region adds to our continued investment in France. Over the last 11 years, AWS has expanded its physical presence in the country, opening an office in La Defense and launching Edge Network Locations in Paris and Marseille. Now, we're opening an infrastructure Region with three Availability Zones. We decided to locate the AWS data centers in the area of Paris, the capital and economic center of France because it is home to many of the world's largest companies, the majority of the French public sector, and some of Europe's most dynamic startups.
To give customers the best experience when connecting to the new Region, today we are also announcing the availability of AWS Direct Connect. Today, customers can connect to the AWS EU (Paris) Region via Telehouse Voltaire. In January 2018, customers will be able to connect via Equinix Paris in January and later in the year via Interxion Paris. Customers that have equipment within these facilities can use Direct Connect to optimize their connection to AWS.
In addition to physical investments, we have also continually invested in people in France. For many years, we have been growing teams of account managers, solutions architects, trainers, business development, and professional services specialists, as well as other job functions. These teams are helping customers and partners of all sizes, including systems integrators and ISVs, to move to the cloud.
We have also been investing in helping to grow the entire French IT community with training, education, and certification programs. To continue this trend, we recently announced plans for AWS to train, at no charge, more than 25,000 people in France, helping them to develop highly sought-after skills. These people will be granted access to AWS training resources in France via existing programs such as AWS Academy, AWS Educate, AWSome days. They also get access to webinars delivered in French by AWS Technical Trainers and AWS Certified Trainers. To learn more about these trainings or discover when the next event will take place, visit: https://aws.amazon.com/fr/events/
All around us, we see AWS technologies fostering a culture of experimentation. I have been humbled by how much our French customers have been able to achieve using AWS technology. Over the past few months we've had Engie and Radio France at the AWS Summit, as well as Decathlon, Smatis, Soitec and Veolia at the AWS Transformation Days in Lille, Lyon, Nantes, Paris, and Toulouse. Everyone was talking about how they are using AWS to transform and scale their organizations. I, for one, look forward to seeing many more innovative use cases enabled by the cloud at the next AWS Summit in France!
Our AWS EU (Paris) Region is open for business now. We are excited to offer a complete portfolio of services, from our foundational technologies, such as compute, storage, and networking, to our more advanced solutions and applications such as artificial intelligence, IoT, machine learning, and serverless computing. We look forward to continuing to broaden this portfolio to include more services into the future. For more information about the new AWS EU (Paris) Region, or to get started now, I would encourage you to visit: https://aws.amazon.com/fr/paris/.
Today, I am happy to announce the general availability of AWS China (Ningxia) Region, operated by Ningxia Western Cloud Data Technology Co. Ltd. (NWCD). This is our 17th Region globally, and the second in China. To comply with China's legal and regulatory requirements, AWS has formed a strategic technology collaboration with NWCD to operate and provide services from the AWS China (Ningxia) Region. Founded in 2015, NWCD is a licensed data center and cloud services provider, based in Ningxia, China.
Coupled with the AWS China (Beijing) Region operated by Sinnet, the AWS China (Ningxia) Region, operated by NWCD, serves as the foundation for new cloud initiatives in China, especially in Western China. Both Regions are helping to transform businesses, increase innovation, and enhance the regional economy.
Thousands of customers in China are already using AWS services operated by Sinnet, to innovate in diverse areas such as energy, education, manufacturing, home security, mobile and internet platforms, CRM solutions, and the dairy industry, among others. These customers include large Chinese enterprises such as Envision Energy, Xiaomi, Lenovo, OPPO, TCL, Hisense, Mango TV, and Mengniu; well-known, fast growing startups including iQiyi, VIPKID, musical.ly, Xiaohongshu, Meitu, and Kunlun; and multinationals such as Samsung, Adobe, ThermoFisher Scientific, Dassault Systemes, Mapbox, Glu, and Ayla Networks. With AWS, Chinese customers can leverage world-class technologies both within China and around the world.
As this breadth of customers shows, we believe that AWS can and will serve China's innovation agenda. We are excited to collaborate with NWCD in Ningxia and Sinnet in Beijing to offer a robust portfolio of services. Our offerings range from our foundational service stack for compute, storage, and networking to our more advanced solutions and applications.
Starting today, China-based developers, startups, and enterprises, as well as government, education, and non-profit organizations, can use AWS to run their applications and store their data in the new AWS China (Ningxia) Region, operated by NWCD. Customers already using the AWS China (Beijing) Region, operated by Sinnet, can select the AWS China (Ningxia) Region directly from the AWS Management Console. New customers can request an account at www.amazonaws.cnto begin using both AWS China Regions.
Applications based on machine learning (ML) can provide tremendous business value. However, many developers find them difficult to build and deploy. As there are few individuals with this expertise, an easier process presents a significant opportunity for companies who want to accelerate their ML usage.
Though the AWS Cloud gives you access to the storage and processing power required for ML, the process for building, training, and deploying ML models has unique challenges that often block successful use of this powerful new technology.
The challenges begin with collecting, cleaning, and formatting training data. After the dataset is created, you must scale the processing to handle the data, which can often be a blocker. After this, there is often a long process of training that includes tuning the knobs and levers, called hyperparameters, that control the different aspects of the training algorithm. Finally, figuring out how to move the model into a scalable production environment can often be slow and inefficient for those that do not do it routinely.
At Amazon Web Services, we've committed to helping you unlock the value of your data through ML, through a set of supporting tools and resources that improve the ML model development experience. From the Deep Learning AMI and the distributed Deep Learning AWS CloudFormation template, to Gluon in Apache MXNet, we've focused on improvements that remove the roadblocks to development.
We also recently announced the Amazon ML Solutions Lab, which is a program to help you accelerate your use of ML in products and processes. As the adoption of these technologies continues to grow, customers have demanded a managed service for ML, to make it easier to get started.
Today, we are announcing the general availability of Amazon SageMaker. This new managed service enables data scientists and developers to quickly and easily build, train, and deploy ML models without getting mired in the challenges that slow this process down today.
Amazon SageMaker provides the following features:
- Hosted Jupyter notebooks that require no setup, so that you can start processing your training dataset and developing your algorithms immediately.
- One-click, on-demand distributed training that sets up and tears down the cluster after training.
- Built-in, high-performance ML algorithms, re-engineered for greater, speed, accuracy, and data-throughput.
- Built-in model tuning (hyperparameter optimization) that can automatically adjust hundreds of different combinations of algorithm parameters.
- An elastic, secure, and scalable environment to host your models, with one-click deployment.
In the hosted notebook environment, Amazon SageMaker takes care of establishing secure network connections in your VPC and launching an ML instance. This development workspace also comes pre-loaded with the necessary Python libraries and CUDA drivers, attaches an Amazon EBS volume to automatically persist notebook files, and installs TensorFlow, Apache MXNet, and Keras deep learning frameworks. Amazon SageMaker also includes common examples to help you get started quickly.
For training, you simply indicate the type and quantity of ML instances you need and initiate training with a single click. Amazon SageMaker then sets up the distributed compute cluster, installs the software, performs the training, and tears down the cluster when complete. You only pay for the resources that you use and never have to worry about the underlying infrastructure.
Amazon SageMaker also reduces the amount of time spent tuning models using built-in hyperparameter optimization. This technology automatically adjusts hundreds of different combinations of parameters, to quickly arrive at the best solution for your ML problem. With high-performance algorithms, distributed computing, managed infrastructure, and hyperparameter optimization, Amazon SageMaker drastically decreases the training time and overall cost of building production systems.
When you are ready to deploy, Amazon SageMaker offers an elastic, secure, and scalable environment to host your ML models, with one-click deployment. After training, Amazon SageMaker provides the model artifacts for deployment to EC2 or anywhere else. You then specify the type and number of ML instances. Amazon SageMaker takes care of launching the instances, deploying the model, and setting up the HTTPS endpoint for your application to achieve low latency / high throughput prediction.
In production, Amazon SageMaker manages the compute infrastructure to perform health checks, apply security patches, and conduct other routine maintenance, all with built-in Amazon CloudWatch monitoring and logging.
Before Amazon SageMaker, you were faced with a tradeoff between the flexibility to use different frameworks and the ease of use of a single platform. At AWS, we believe in giving choices, so Amazon SageMaker removes that problem. You can now use the tools of your choice, with a single environment for training and hosting ML models.
Amazon SageMaker provides a set of built-in algorithms for traditional ML. For deep learning, Amazon SageMaker provides you with the ability to submit MXNet or TensorFlow scripts, and use the distributed training environment to generate a deep learning model. If you use Apache Spark, you can use Amazon SageMaker's library to leverage the advantages of Amazon SageMaker from a familiar environment. You can even bring your own algorithms and frameworks, in Docker containers, and use Amazon SageMaker to manage the training and hosting environments. Just like in Amazon RDS, where we support multiple engines like MySQL, PostgreSQL, and Aurora, we support multiple frameworks in Amazon SageMaker.
Finally, one of the best aspects of Amazon SageMaker is its modular architecture. You can use any combination of its building, training, and hosting capabilities to fit your workflow. For instance, you may use the build and training capabilities to prepare a production-ready ML model, and then deploy the model to a device on the edge, such as AWS DeepLens. Or, you may use only its hosting capabilities to simplify the deployment of models that you've already trained elsewhere. The flexibility of Amazon SageMaker's architecture enables you to easily incorporate its benefits into your existing ML workflows in whatever combination is best.
Amazon ElastiCache embodies much of what makes fast data a reality for customers looking to process high volume data at incredible rates, faster than traditional databases can manage. Developers love the performance, simplicity, and in-memory capabilities of Redis, making it among the most popular NoSQL key-value stores. Redis's microsecond latency has made it a de facto choice for caching. Its support for advanced data structures (for example, lists, sets, and sorted sets) also enables a variety of in-memory use cases such as leaderboards, in-memory analytics, messaging, and more.
Four years ago, as part of our AWS fast data journey, we introduced Amazon ElastiCache for Redis, a fully managed, in-memory data store that operates at microsecond latency. Since then, we have added support for Redis clusters, enabling customers to run faster and more scalable workloads. ElastiCache for Redis cluster configuration supports up to 15 shards and enables customers to run Redis workloads with up to 6.1 TB of in-memory capacity in a single cluster. While Redis cluster configuration enabled larger deployments with high performance, resizing the cluster required backup and restore, which meant taking the cluster offline.
Earlier this month, we announced online cluster resizing within ElastiCache. ElastiCache for Redis now provides the ability to add and remove shards from a running cluster. You can now dynamically scale out and even scale in your Redis cluster workloads to adapt to changes in demand. ElastiCache resizes the cluster by adding or removing shards and redistributing keys uniformly across the new shard configuration, all while the cluster continues to stay online and serve requests. No application changes are needed.
Scaling with elasticity
Having closely watched ElastiCache evolve over the years, I am delighted to see ElastiCache being used by thousands of customers – including the likes of Airbnb, Hulu, McDonalds, Adobe, Expedia, Hudl, Grab, Duolingo, PBS, HERE, and Ubisoft. ElastiCache for Redis delivers predictable microsecond latencies and is super easy to use. Our customers are using ElastiCache for Redis in their most demanding applications, supporting millions of users. Whether it is gaming, adtech, travel, or retail—speed wins, it's simple.
As the use cases for Redis continue to grow, customers have demanded more flexibility in scaling their workloads dynamically, while continuing to be highly available and serving incoming traffic. To give you some examples, I've been talking to a few gaming companies lately, and their conversations are about the need for speed and flexibility in scaling, both in and out. They deal with high variability in workloads based on game adoption or seasonality, such as upcoming holidays. If a game leaderboard surges because of a new game title, and tons of players flock to play the game, gaming platforms want to resize the cluster online to handle the bigger load. But as demand decreases, they should just as easily be able to scale-in the environment to optimize costs, all while staying online and serving incoming requests.
Our retail customers have shared similar challenges about managing workload surges and declines driven by big sale events. Some customers have also shared their experiences of trying to self-manage Redis workloads and implement online cluster resizing, for workloads where offline cluster resizing was not an option. While open source Redis comes with primitives to help reshard a cluster, they are inadequate. In addition to the cost of self-management, customers have to deal with failures during cluster resizing. Failures can leave the cluster in an irrecoverable state, potentially causing data loss and extended downtime until the cluster can be fixed manually.
At Amazon, we have always focused on innovating on behalf of the customer. With online cluster resizing, our goal was to design a fully managed experience for cluster resharding, which would support both scale-out and scale-in and retain open source compatibility. It has been an exciting journey—one of thought leadership and innovation—that has enabled us to bring the promise of more elasticity and the flexibility to resize workloads, while retaining availability, consistency, and performance.
Under the hood
In a Redis cluster, the key space is split into slots (16,384 slots) and slots are distributed across shards. When a cluster is resharded, these slots need to be redistributed. Apps using Redis are able to pick this up, as Redis clients can auto-discover and keep up-to-date with changes in slot assignment. However, the slots must be moved manually on the server side. Cluster resizing is a complex problem as it involves changing the number of shards and migrating data, while serving read and write requests on the same dataset. A resharding operation to scale out involves adding shards, creating a plan for redistributing slots, migrating the slots, and finally transferring slot ownership across shards, after the slots are migrated.
Atomic slot migration
Online cluster resizing in ElastiCache uses atomic slot migration instead of the atomic key migration that open source Redis comes with. When a key is migrated to the target shard, ElastiCache maintains a copy of the key at the source shard, which retains ownership of the key until the entire slot and all its keys are migrated. This has several benefits:
- Because all the keys in the slot continue to be owned by the source shard, the dataset is never in a slot-split situation. This makes it easy to support operations such as multi-key commands, transactions, and LUA scripts, thereby providing full API coverage while cluster resharding is in progress.
- While slot migration is in progress, the source shard continues to support requests related to keys that have been migrated. This minimizes the time window requiring client redirection, improving latency during migration operation.
- Key ownership stays with the source shard, so replicas in the source shard have up-to-date information on the keys. If there is a failover, the replicas can continue serving commands with the latest key status and there is no data loss.
- The system is more robust. Any errors such as target out of memory, which may halt migration, are easy to recover from, because the source shard has full ownership of the key.
We have also made other enhancements along the way. One important addition is the use of multi-threaded operations at the source shard. Slot migration at the source shard is executed in parallel as a separate thread from the main I/O thread. As a result, key migration no longer blocks I/O on the source, ensuring no availability impact. Additionally, to maintain data consistency, all data mutations during the migration operation are asynchronously replicated to the target shard.
Online cluster resizing is a fantastic addition for our ElastiCache for Redis customers. You can resize your ElastiCache for Redis 3.2.10 cluster to scale- out or scale in, without any application side changes. For more information about getting started with clustered Redis and trying to reshard a cluster, see Online Cluster Resizing.
Many of our customers share my excitement:
Duolingo is the free, science-based, language education platform that has organically become the most popular way to learn languages online. With over 200 million users and seven billion language exercises completed each month, the company's mission is to make education free, fun, and accessible to all. "Amazon ElastiCache has played an absolutely critical part in our infrastructure from the beginning," said Max Blaze, Staff Operations Engineer at Duolingo. "As we have grown, we have pushed the limits of what is possible with single-shard clusters. ElastiCache for Redis online resharding will allow us to easily scale our Redis clusters horizontally as we grow, greatly simplifying the management of our many Redis clusters, empowering us to scale quickly while also reducing cost across our caching layers, and continue to grow with minimal changes to our current services.
Dream11 , India's #1 fantasy sports platform with a growing user base, has over 14 million users in South Asia. "We have been using ElastiCache for Redis with sharded configuration since its launch last year, supporting over 14 million users playing fantasy games of cricket, football, and kabaddi. With peak demand of 1.5 million requests per minute and workloads surging by 10X quickly, our platform requires scaling on-demand and without downtime. This feature enables us to scale-in and scale-out our platform to support the fluctuating game demand, and not having to over provision," said Abhishek Ravi, CIO.
"At SocialCode , our data and intelligence allow Fortune 500 marketers to know and connect with their customers by harnessing the most important digital media platforms -- like Facebook, Instagram, Twitter, Pinterest, Snapchat, and YouTube." Using the new online resharding feature of ElastiCache for Redis will allow us to scale out our ever-growing Audience Intelligence product as we continue to on-board brand data. The ability to perform these scaling operations without downtime is priceless!"
For the What's New announcement, see Amazon ElastiCache for Redis introduces dynamic addition and removal of shards while continuing to serve workloads.