Rethinking the 'production' of data

This article, titled "Daten müssen strategischer Teil des Geschäfts werden" ("Data must become a strategic part of the business"), appeared in German last week in the "IT und Datenproduktion" column of WirtschaftsWoche.

How companies can use ideas from mass production to create business with data

Strategically, IT doesn't matter. That was the provocative thesis of a much-talked-about article from 2003 in the Harvard Business Review by the US publicist Nicholas Carr. Back then, companies spent more than half of their entire investment on IT, in a non-differentiating way. In a world in which tools are equally accessible for every company, they wouldn't offer any competitive advantage – so went the argument. The author recommended steering investments toward strategically relevant resources instead. In the years that followed, many companies outsourced their IT activities because they no longer regarded them as being part of the core business.

A new age

Nearly 15 years later, the situation has changed. In today's era of global digitalization there are many examples that show that IT does matter. Developments like cloud computing, the internet of things, artificial intelligence, and machine learning are proving that IT has (again) become a strategic business driver. This is transforming the way companies offer products and services to their customers today. Take the example of industrial manufacturing: in prototyping, drafts for technologically complex products are no longer physically produced; rather, their characteristics can be tested in a purely virtual fashion at every location across the globe by using simulations. The German startup SimScale makes use of this trend. The founders had noticed that in many companies, product designers worked in a very detached manner from the rest of production. The SimScale platform can be accessed through a normal web browser. In this way, designers are part of an ecosystem in which the functionalities of simulations, data and people come together, enabling them to develop better products faster.

Value-added services are also playing an increasingly important role for both companies and their customers. For example, Kärcher, the maker of cleaning technologies, manages its entire fleet through the cloud solution "Kärcher Fleet". It transmits data from the company's cleaning devices, for example about the status of maintenance and loading, when the machines are used, and where they are located. The benefit for customers: authorized users can view this data and therefore manage their inventories across different sites, making the maintenance processes much more efficient.

Kärcher benefits as well: By developing this service, the company gets exact insight into how the machines are actually used by its customers. By knowing this, Kärcher can generate new top-line revenue in the form of subscription models for its analysis portal.

More than mere support

These examples underline that the purpose of software today is not solely to support business processes, but that software solutions have broadly become an essential element in multiple business areas. This starts with integrated platforms that can manage all activities, from market research to production to logistics. Today, IT is the foundation of digital business models, and therefore has a value-added role in and of itself. That can be seen when sales people, for example, interact with their customers in online shops or via mobile apps. Marketers use big data and artificial intelligence to find out more about the future needs of their customers. Breuninger, a fashion department store chain steeped in tradition, has recognized this and relies on a self-developed e-commerce platform in the AWS Cloud. Breuninger uses modern templates for software development, such as Self-Contained Systems (SCS), so that it can increase the speed of software development with agile and autonomous teams and quickly test new features. Each team acts according to the principle: "You build it, you run it". Hence, the teams are themselves responsible for the productive operation of the software. The advantage of this approach is that when designing new applications, there is already a focus on the operating aspects.

Value creation through data

In a digital economy, data are at the core of value creation, whereas physical assets are losing their significance in business models. Until 1992, the most highly valued companies in the S&P 500 Index were those that made or distributed things (for example the pharmaceutical industry, trade). Today, developers of technology (for example medical technology, software) and platform operators (social media enablers, credit card companies) are at the top. Also, trade in data contributes more to global growth than trade in goods. Therefore, IT has never been more important for strategy than it is now – not only for us, but for every company in the digital age. Anyone who wants to further develop their business digitally can't do that today without at the same time thinking about which IT infrastructure, which software, and which algorithms they need in order to achieve their plans.

If data take center stage, then companies must learn how to create added value out of them – namely by combining the data they own with external data sources and by using modern, automated analytics processes. This is done through software and IT services that are delivered through software APIs.

Companies that want to become successful and innovative digital players need to get better at building software solutions. We should ponder how we can organize the 'production' of data in such a way that we ultimately come out with a competitive advantage. We need mechanisms that enable the mass production of data using software and hardware capabilities. These mechanisms need to be lean, seamless and effective. At the same time, we need to ensure that quality requirements can be met. Those are exactly the challenges that were solved for physical goods through the industrialization of manufacturing processes. A company that wants to industrialize 'software production' needs to find ideas on how to achieve the same kind of lean and qualitatively first-class mass production that has already occurred for industrial goods. And inevitably, the first place to look will be lean production approaches such as Kanban and Kaizen, or total quality management. In the 1980s, companies like Toyota revolutionized the production process by reengineering the entire organization and focusing the company on similar principles. Creating those conditions, both from an organizational and an IT standpoint, is one of the biggest challenges that companies face in the digital age.

Learn from lean

Can we transfer this success model to IT as well? The answer is yes. In the digital world, it is crucial to activate data-centric processes and continuously improve them. Thus, any obstacles that stand in the way of experimentation and the further development of new ideas should be removed as fast as possible. Every new IT project should be regarded as an idea that must go through a data factory – a fully equipped production site with common processes that can be easily maintained. The end-product is high-quality services or algorithms that support digital business models. Digital companies differentiate themselves through their ideas, data and customer relationships. Those that find a functioning digital business model the fastest will have a competitive edge. Above all, the barrier between software development and the operating business has to be overcome. The reason is that the success, speed, and frequency of these experiments depend on the performance of IT development, and at the same time on the relevance of the solutions for business operations. Autoscout24 has gained an enormous amount of agility through its cloud solution. The company now has 15 autonomous interdisciplinary teams working constantly to test and explore new services. The main goal in all this is to be able to quickly iterate experiments through the widest range of architectures, combine services with each other, and compare approaches.

In order to become as agile as Autoscout24, companies need a "machine" that produces ideas. Why not transfer the success formulas from industrial manufacturing and the principles of quality management to the creation of software?

German industrial companies in particular possess a manufacturing excellence that has been built up over many decades. Where applicable, they should do their best to transfer this knowledge to their IT, and in particular to their software development.

In many companies, internal IT knowhow has not developed fast enough in the last few years – quite contrary to the technological possibilities. Customers provide feedback online immediately after their purchase. Real-time analyses are possible through big data and software updates are generated daily through the cloud. Often, the IT organization and its associated processes couldn't keep up. As a consequence, specialist departments with the structures of yesterday are supposed to fulfill customer requirements of tomorrow. Bringing innovative products and services quickly to market is not possible with long-term IT sourcing cycles. It's no wonder that many specialist departments try to circumvent their own IT department, for example by shifting activities to the cloud, which offers many powerful IT building blocks through easy-to-use APIs for which companies previously had to operate complicated software and infrastructure. Such a decentralized 'shadow IT' delivers no improvements. The end effect is that the complexity of the system increases, which is not efficient. This pattern should be broken. Development and Operations need to work hand in hand instead of working sequentially, one after the other, as in the old world. And ideally, this should be done in many projects running in parallel. Under the heading of DevOps – the combination of "Development and Operations" – IT guru Gene Kim has described the core characteristics of this machinery.

Ensuring the flow

Kim argues that the organization must be built around the customer benefit and that the flow of projects must be as smooth as possible. Hurdles that block the creation of client benefits should be identified and removed. At Amazon this starts by staffing projects with cross-functional and interdisciplinary teams as a rule. Furthermore, for the sake of agility the teams should not exceed a certain size. We have a rule that teams should be exactly the size that allows everyone to feel full after eating two (large!) pizzas. This approach reduces the number of necessary handovers, increases responsibility, and allows the team to provide customers with software faster.

Incorporating feedback

The earlier client feedback flows back into the "production process", the better. In addition, companies must ensure that every piece of feedback is applied to future projects. To avoid getting lost in endless feedback loops, this should be organized in a lean way: Obtaining the feedback of internal and external stakeholders must by no means hamper the development process.

Learning to take risks

"Good intentions never work, you need good mechanisms to make anything happen," says Jeff Bezos. For that, you need a corporate culture that teaches employees to experiment constantly and deliver. With every new experiment, one should risk yet another small step forward beyond the previous one. At the same time, from every team we need data based on predefined KPIs about the impact of the experiments. And we need to establish mechanisms that take effect immediately if we go too far or if something goes wrong, for example if the solution never reached the customer.

Anyone who has tried this knows it's not easy to start your own digital revolution in the company and keep the momentum going. P3 advises cellular operators and offers its customers access to data that provide information about the quality of cellular networks (for example signal strength, broken connections, and data throughput) – worldwide and independent of the network operator and cellular provider. This allows the customers to come up with measures to expand their networks or with new offerings for a more efficient utilization of their capacity. By introducing DevOps tools, P3 can define an automated process that implements the required compute infrastructure in the AWS Cloud and deploys project-specific software packages with the push of a button. Moreover, the process definition can be revised by developers, the business or data scientists at any time, for example in order to develop new regions, add analytics software or implement new AWS services. Now P3 can focus fully on its core competence, namely developing its proprietary software. Data scientists can use their freed-up resources to analyze in real time data that are collected from around the world and put insights from the analysis at the disposal of their clients.

The cloud offers IT limitless possibilities on the technical side, from which new opportunities have been born. But it's becoming ever clearer what is required in order to make use of these opportunities. Technologies change faster than people. And individuals faster than entire organizations. Tackling these challenges is a strategic necessity. Changing the organization is the next bottleneck on the way to becoming a digital champion.

'Paris s'éveille'! Introducing the AWS EU (Paris) Region

Today, I'm happy to announce that the AWS EU (Paris) Region, our 18th technology infrastructure Region globally, is now generally available for use by customers worldwide. With this launch, AWS now provides 49 Availability Zones, with another 12 Availability Zones and four Regions in Bahrain, Hong Kong, Sweden, and a second AWS GovCloud (US) Region expected to come online by early 2019.

In France, you can find one of the most vibrant startup ecosystems in the world, a strong research community, excellent energy, telecom, and transportation infrastructure, a very strong agriculture and food industry, and some of the most influential luxury brands in the world. The cloud is an opportunity to stay competitive in each of these domains by giving companies freedom to innovate quickly. This is why tens of thousands of French customers already use AWS in Regions around the world. Starting today, developers, startups, and enterprises, as well as government, education, and non-profit organizations can leverage AWS to run applications and store data in France.

French companies are using AWS to innovate in a secure way across industries as diverse as energy, financial services, manufacturing, media, pharmaceuticals and health sciences, retail, and more. Companies of all sizes across France are also using AWS to innovate and grow, from startups like AlloResto, CaptainDash, Datadome, Drivy, Predicsis, Payplug, and Silkke to enterprises like Decathlon, Nexity, Soitec, TF1 as well as more than 80 percent of companies listed on the CAC 40, like Schneider Electric, Societe Generale, and Veolia.

We are also seeing strong adoption of AWS within the public sector, with organizations using AWS to transform the services they deliver to the citizens of France. Kartable, Les Restos du Coeur, OpenClassrooms, Radio France, SNCF, and many more are using AWS to lower costs and speed up their rate of experimentation so they can deliver reliable, secure, and innovative services to people across the country.

The opening of the AWS EU (Paris) Region adds to our continued investment in France. Over the last 11 years, AWS has expanded its physical presence in the country, opening an office in La Defense and launching Edge Network Locations in Paris and Marseille. Now, we're opening an infrastructure Region with three Availability Zones. We decided to locate the AWS data centers in the area of Paris, the capital and economic center of France because it is home to many of the world's largest companies, the majority of the French public sector, and some of Europe's most dynamic startups.

To give customers the best experience when connecting to the new Region, today we are also announcing the availability of AWS Direct Connect. Today, customers can connect to the AWS EU (Paris) Region via Telehouse Voltaire. In January 2018, customers will be able to connect via Equinix Paris, and later in the year via Interxion Paris. Customers that have equipment within these facilities can use Direct Connect to optimize their connection to AWS.

In addition to physical investments, we have also continually invested in people in France. For many years, we have been growing teams of account managers, solutions architects, trainers, business development, and professional services specialists, as well as other job functions. These teams are helping customers and partners of all sizes, including systems integrators and ISVs, to move to the cloud.

We have also been investing in helping to grow the entire French IT community with training, education, and certification programs. To continue this trend, we recently announced plans for AWS to train, at no charge, more than 25,000 people in France, helping them to develop highly sought-after skills. These people will be granted access to AWS training resources in France via existing programs such as AWS Academy, AWS Educate, and AWSome Days. They also get access to webinars delivered in French by AWS Technical Trainers and AWS Certified Trainers. To learn more about these trainings or discover when the next event will take place, visit: https://aws.amazon.com/fr/events/

All around us, we see AWS technologies fostering a culture of experimentation. I have been humbled by how much our French customers have been able to achieve using AWS technology. Over the past few months we've had Engie and Radio France at the AWS Summit, as well as Decathlon, Smatis, Soitec and Veolia at the AWS Transformation Days in Lille, Lyon, Nantes, Paris, and Toulouse. Everyone was talking about how they are using AWS to transform and scale their organizations. I, for one, look forward to seeing many more innovative use cases enabled by the cloud at the next AWS Summit in France!

Our AWS EU (Paris) Region is open for business now. We are excited to offer a complete portfolio of services, from our foundational technologies, such as compute, storage, and networking, to our more advanced solutions and applications such as artificial intelligence, IoT, machine learning, and serverless computing. We look forward to continuing to broaden this portfolio to include more services into the future. For more information about the new AWS EU (Paris) Region, or to get started now, I would encourage you to visit: https://aws.amazon.com/fr/paris/.

Expanding the AWS Cloud: Introducing the AWS China (Ningxia) Region

Today, I am happy to announce the general availability of AWS China (Ningxia) Region, operated by Ningxia Western Cloud Data Technology Co. Ltd. (NWCD). This is our 17th Region globally, and the second in China. To comply with China's legal and regulatory requirements, AWS has formed a strategic technology collaboration with NWCD to operate and provide services from the AWS China (Ningxia) Region. Founded in 2015, NWCD is a licensed data center and cloud services provider, based in Ningxia, China.

Coupled with the AWS China (Beijing) Region operated by Sinnet, the AWS China (Ningxia) Region, operated by NWCD, serves as the foundation for new cloud initiatives in China, especially in Western China. Both Regions are helping to transform businesses, increase innovation, and enhance the regional economy.

Thousands of customers in China are already using AWS services operated by Sinnet, to innovate in diverse areas such as energy, education, manufacturing, home security, mobile and internet platforms, CRM solutions, and the dairy industry, among others. These customers include large Chinese enterprises such as Envision Energy, Xiaomi, Lenovo, OPPO, TCL, Hisense, Mango TV, and Mengniu; well-known, fast growing startups including iQiyi, VIPKID, musical.ly, Xiaohongshu, Meitu, and Kunlun; and multinationals such as Samsung, Adobe, ThermoFisher Scientific, Dassault Systemes, Mapbox, Glu, and Ayla Networks. With AWS, Chinese customers can leverage world-class technologies both within China and around the world.

As this breadth of customers shows, we believe that AWS can and will serve China's innovation agenda. We are excited to collaborate with NWCD in Ningxia and Sinnet in Beijing to offer a robust portfolio of services. Our offerings range from our foundational service stack for compute, storage, and networking to our more advanced solutions and applications.

Starting today, China-based developers, startups, and enterprises, as well as government, education, and non-profit organizations, can use AWS to run their applications and store their data in the new AWS China (Ningxia) Region, operated by NWCD. Customers already using the AWS China (Beijing) Region, operated by Sinnet, can select the AWS China (Ningxia) Region directly from the AWS Management Console. New customers can request an account at www.amazonaws.cn to begin using both AWS China Regions.

Accelerate Machine Learning with Amazon SageMaker

Applications based on machine learning (ML) can provide tremendous business value. However, many developers find them difficult to build and deploy. As there are few individuals with this expertise, an easier process presents a significant opportunity for companies who want to accelerate their ML usage.

Though the AWS Cloud gives you access to the storage and processing power required for ML, the process for building, training, and deploying ML models has unique challenges that often block successful use of this powerful new technology.

The challenges begin with collecting, cleaning, and formatting training data. After the dataset is created, you must scale the processing to handle the data, which can often be a blocker. After this, there is often a long process of training that includes tuning the knobs and levers, called hyperparameters, that control the different aspects of the training algorithm. Finally, figuring out how to move the model into a scalable production environment can often be slow and inefficient for those that do not do it routinely.

At Amazon Web Services, we've committed to helping you unlock the value of your data through ML, through a set of supporting tools and resources that improve the ML model development experience. From the Deep Learning AMI and the distributed Deep Learning AWS CloudFormation template, to Gluon in Apache MXNet, we've focused on improvements that remove the roadblocks to development.

We also recently announced the Amazon ML Solutions Lab, which is a program to help you accelerate your use of ML in products and processes. As the adoption of these technologies continues to grow, customers have demanded a managed service for ML, to make it easier to get started.

Today, we are announcing the general availability of Amazon SageMaker. This new managed service enables data scientists and developers to quickly and easily build, train, and deploy ML models without getting mired in the challenges that slow this process down today.

Amazon SageMaker provides the following features:

  • Hosted Jupyter notebooks that require no setup, so that you can start processing your training dataset and developing your algorithms immediately.
  • One-click, on-demand distributed training that sets up and tears down the cluster after training.
  • Built-in, high-performance ML algorithms, re-engineered for greater speed, accuracy, and data throughput.
  • Built-in model tuning (hyperparameter optimization) that can automatically adjust hundreds of different combinations of algorithm parameters.
  • An elastic, secure, and scalable environment to host your models, with one-click deployment.

In the hosted notebook environment, Amazon SageMaker takes care of establishing secure network connections in your VPC and launching an ML instance. This development workspace also comes pre-loaded with the necessary Python libraries and CUDA drivers, attaches an Amazon EBS volume to automatically persist notebook files, and installs TensorFlow, Apache MXNet, and Keras deep learning frameworks. Amazon SageMaker also includes common examples to help you get started quickly.

For training, you simply indicate the type and quantity of ML instances you need and initiate training with a single click. Amazon SageMaker then sets up the distributed compute cluster, installs the software, performs the training, and tears down the cluster when complete. You only pay for the resources that you use and never have to worry about the underlying infrastructure.

Amazon SageMaker also reduces the amount of time spent tuning models using built-in hyperparameter optimization. This technology automatically adjusts hundreds of different combinations of parameters, to quickly arrive at the best solution for your ML problem. With high-performance algorithms, distributed computing, managed infrastructure, and hyperparameter optimization, Amazon SageMaker drastically decreases the training time and overall cost of building production systems.
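To make the idea of trying many parameter combinations concrete, here is a minimal sketch of one simple tuning strategy, random search, in plain Python. It is not how Amazon SageMaker tunes models internally; the objective function, parameter names, and ranges are hypothetical stand-ins for "train a model and measure its validation loss".

```python
import random


def validation_loss(learning_rate, regularization):
    """Toy stand-in for training a model and scoring it on validation data.

    Hypothetical objective whose minimum is at learning_rate=0.1,
    regularization=0.01.
    """
    return (learning_rate - 0.1) ** 2 + (regularization - 0.01) ** 2


def random_search(objective, search_space, trials, seed=0):
    """Evaluate `trials` random combinations; return (best_params, best_loss)."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best_params, best_loss = None, float("inf")
    for _ in range(trials):
        # Draw one value per hyperparameter from its (low, high) range.
        params = {
            name: rng.uniform(low, high)
            for name, (low, high) in search_space.items()
        }
        loss = objective(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


# Hypothetical search ranges for two hyperparameters.
search_space = {"learning_rate": (0.001, 1.0), "regularization": (0.0, 0.1)}
best_params, best_loss = random_search(validation_loss, search_space, trials=200)
print(best_params, best_loss)
```

In a real setting each trial is a full training run, which is why running the trials on managed, distributed infrastructure matters for the total tuning time.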

When you are ready to deploy, Amazon SageMaker offers an elastic, secure, and scalable environment to host your ML models, with one-click deployment. After training, Amazon SageMaker provides the model artifacts for deployment to EC2 or anywhere else. You then specify the type and number of ML instances. Amazon SageMaker takes care of launching the instances, deploying the model, and setting up the HTTPS endpoint for your application to achieve low latency / high throughput prediction.
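As an illustration of what calling such an HTTPS endpoint from an application might look like, the sketch below wraps the `invoke_endpoint` call of the boto3 `sagemaker-runtime` client. The endpoint name, the CSV request format, and the JSON response format are assumptions; what a given model accepts and returns depends on how it was built and deployed.

```python
import json


def predict(runtime_client, endpoint_name, features):
    """Send one CSV-encoded record to a hosted model and decode the reply.

    `runtime_client` is expected to behave like
    boto3.client("sagemaker-runtime"). The CSV input and JSON output
    formats are assumptions about the deployed model.
    """
    payload = ",".join(str(value) for value in features)
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Accept="application/json",
        Body=payload,
    )
    # The response body is a stream; read it fully before decoding.
    return json.loads(response["Body"].read())


# Usage (assumes boto3 is installed, AWS credentials are configured, and an
# endpoint with the hypothetical name "my-endpoint" has been deployed):
#   import boto3
#   client = boto3.client("sagemaker-runtime")
#   print(predict(client, "my-endpoint", [5.1, 3.5, 1.4, 0.2]))
```

Passing the client in as a parameter keeps the function easy to exercise with a stub, without credentials or a live endpoint.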

In production, Amazon SageMaker manages the compute infrastructure to perform health checks, apply security patches, and conduct other routine maintenance, all with built-in Amazon CloudWatch monitoring and logging.

Before Amazon SageMaker, you were faced with a tradeoff between the flexibility to use different frameworks and the ease of use of a single platform. At AWS, we believe in giving choices, so Amazon SageMaker removes that problem. You can now use the tools of your choice, with a single environment for training and hosting ML models.

Amazon SageMaker provides a set of built-in algorithms for traditional ML. For deep learning, Amazon SageMaker provides you with the ability to submit MXNet or TensorFlow scripts, and use the distributed training environment to generate a deep learning model. If you use Apache Spark, you can use Amazon SageMaker's library to leverage the advantages of Amazon SageMaker from a familiar environment. You can even bring your own algorithms and frameworks, in Docker containers, and use Amazon SageMaker to manage the training and hosting environments. Just like in Amazon RDS, where we support multiple engines like MySQL, PostgreSQL, and Aurora, we support multiple frameworks in Amazon SageMaker.

Finally, one of the best aspects of Amazon SageMaker is its modular architecture. You can use any combination of its building, training, and hosting capabilities to fit your workflow. For instance, you may use the build and training capabilities to prepare a production-ready ML model, and then deploy the model to a device on the edge, such as AWS DeepLens. Or, you may use only its hosting capabilities to simplify the deployment of models that you've already trained elsewhere. The flexibility of Amazon SageMaker's architecture enables you to easily incorporate its benefits into your existing ML workflows in whatever combination is best.

Amazon SageMaker is available today to all customers, in US East (N. Virginia), US East (Ohio), US West (Oregon), and EU West (Ireland). Try Amazon SageMaker for free and get started today!

Scaling Amazon ElastiCache for Redis with Online Cluster Resizing

Amazon ElastiCache embodies much of what makes fast data a reality for customers looking to process high volume data at incredible rates, faster than traditional databases can manage. Developers love the performance, simplicity, and in-memory capabilities of Redis, making it among the most popular NoSQL key-value stores. Redis's microsecond latency has made it a de facto choice for caching. Its support for advanced data structures (for example, lists, sets, and sorted sets) also enables a variety of in-memory use cases such as leaderboards, in-memory analytics, messaging, and more.

Four years ago, as part of our AWS fast data journey, we introduced Amazon ElastiCache for Redis, a fully managed, in-memory data store that operates at microsecond latency. Since then, we have added support for Redis clusters, enabling customers to run faster and more scalable workloads. ElastiCache for Redis cluster configuration supports up to 15 shards and enables customers to run Redis workloads with up to 6.1 TB of in-memory capacity in a single cluster. While Redis cluster configuration enabled larger deployments with high performance, resizing the cluster required backup and restore, which meant taking the cluster offline.

Earlier this month, we announced online cluster resizing within ElastiCache. ElastiCache for Redis now provides the ability to add and remove shards from a running cluster. You can now dynamically scale out and even scale in your Redis cluster workloads to adapt to changes in demand. ElastiCache resizes the cluster by adding or removing shards and redistributing keys uniformly across the new shard configuration, all while the cluster continues to stay online and serve requests. No application changes are needed.

Scaling with elasticity

Having closely watched ElastiCache evolve over the years, I am delighted to see ElastiCache being used by thousands of customers – including the likes of Airbnb, Hulu, McDonalds, Adobe, Expedia, Hudl, Grab, Duolingo, PBS, HERE, and Ubisoft. ElastiCache for Redis delivers predictable microsecond latencies and is super easy to use. Our customers are using ElastiCache for Redis in their most demanding applications, supporting millions of users. Whether it is gaming, adtech, travel, or retail—speed wins, it's simple.

As the use cases for Redis continue to grow, customers have demanded more flexibility in scaling their workloads dynamically, while continuing to be highly available and serving incoming traffic. To give you some examples, I've been talking to a few gaming companies lately, and their conversations are about the need for speed and flexibility in scaling, both in and out. They deal with high variability in workloads based on game adoption or seasonality, such as upcoming holidays. If a game leaderboard surges because of a new game title, and tons of players flock to play the game, gaming platforms want to resize the cluster online to handle the bigger load. But as demand decreases, they should just as easily be able to scale-in the environment to optimize costs, all while staying online and serving incoming requests.

Our retail customers have shared similar challenges about managing workload surges and declines driven by big sale events. Some customers have also shared their experiences of trying to self-manage Redis workloads and implement online cluster resizing, for workloads where offline cluster resizing was not an option. While open source Redis comes with primitives to help reshard a cluster, they are inadequate. In addition to the cost of self-management, customers have to deal with failures during cluster resizing. Failures can leave the cluster in an irrecoverable state, potentially causing data loss and extended downtime until the cluster can be fixed manually.

At Amazon, we have always focused on innovating on behalf of the customer. With online cluster resizing, our goal was to design a fully managed experience for cluster resharding, which would support both scale-out and scale-in and retain open source compatibility. It has been an exciting journey—one of thought leadership and innovation—that has enabled us to bring the promise of more elasticity and the flexibility to resize workloads, while retaining availability, consistency, and performance.

Under the hood

In a Redis cluster, the key space is split into slots (16,384 slots) and slots are distributed across shards. When a cluster is resharded, these slots need to be redistributed. Apps using Redis are able to pick this up, as Redis clients can auto-discover and keep up-to-date with changes in slot assignment. However, the slots must be moved manually on the server side. Cluster resizing is a complex problem as it involves changing the number of shards and migrating data, while serving read and write requests on the same dataset. A resharding operation to scale out involves adding shards, creating a plan for redistributing slots, migrating the slots, and finally transferring slot ownership across shards, after the slots are migrated.
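
To make the slot mapping concrete, here is a minimal Python sketch of how a key is assigned to one of the 16,384 slots. It follows the open source Redis Cluster specification (CRC16 of the key modulo 16,384, with an optional hash tag in braces). The function name `key_slot` and the sample keys are illustrative assumptions, not part of any ElastiCache API.

```python
import binascii

NUM_SLOTS = 16384  # fixed number of hash slots in a Redis cluster


def key_slot(key: bytes) -> int:
    """Return the hash slot for a key, following the Redis Cluster spec.

    If the key contains a non-empty hash tag such as b"{user1}", only the
    bytes between the first "{" and the next "}" are hashed, so related
    keys can be placed in the same slot.
    """
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end > start + 1:  # ignore empty tags like "{}"
            key = key[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is CRC16-CCITT (XMODEM),
    # the CRC16 variant named in the Redis Cluster specification.
    return binascii.crc_hqx(key, 0) % NUM_SLOTS


if __name__ == "__main__":
    # Hypothetical keys, for illustration only.
    for k in (b"leaderboard:game42", b"{user1}:cart", b"{user1}:profile"):
        print(k.decode(), "->", key_slot(k))
```

Because both `{user1}` keys hash only the tag, they land in the same slot, which is what makes multi-key commands on them possible in a cluster.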

Atomic slot migration

Online cluster resizing in ElastiCache uses atomic slot migration instead of the atomic key migration that open source Redis comes with. When a key is migrated to the target shard, ElastiCache maintains a copy of the key at the source shard, which retains ownership of the key until the entire slot and all its keys are migrated. This has several benefits:

  • Because all the keys in the slot continue to be owned by the source shard, the dataset is never in a slot-split situation. This makes it easy to support operations such as multi-key commands, transactions, and Lua scripts, thereby providing full API coverage while cluster resharding is in progress.
  • While slot migration is in progress, the source shard continues to support requests related to keys that have been migrated. This minimizes the time window requiring client redirection, improving latency during the migration operation.
  • Key ownership stays with the source shard, so replicas in the source shard have up-to-date information on the keys. If there is a failover, the replicas can continue serving commands with the latest key status and there is no data loss.
  • The system is more robust. Any errors such as target out of memory, which may halt migration, are easy to recover from, because the source shard has full ownership of the key.

We have also made other enhancements along the way. One important addition is the use of multi-threaded operations at the source shard. Slot migration at the source shard is executed in parallel as a separate thread from the main I/O thread. As a result, key migration no longer blocks I/O on the source, ensuring no availability impact. Additionally, to maintain data consistency, all data mutations during the migration operation are asynchronously replicated to the target shard.
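
The ownership rule described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not ElastiCache's implementation: the class and method names are hypothetical, each `Shard` holds just the keys of the one slot being migrated, replication of writes to the target is done synchronously for simplicity (the service replicates asynchronously), and threading is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    name: str
    # Assumption: holds only the keys of the single slot being migrated.
    data: dict = field(default_factory=dict)


class SlotMigration:
    """Toy model of migrating one slot from a source to a target shard.

    The source keeps its copy of every key and stays the owner until the
    whole slot has been copied; only then does ownership flip.
    """

    def __init__(self, source: Shard, target: Shard):
        self.source = source
        self.target = target
        self.owner = source                # source owns the slot for now
        self.pending = list(source.data)   # keys still to be copied

    def copy_next_key(self) -> bool:
        """Copy one key to the target; return False when none are left."""
        if not self.pending:
            return False
        key = self.pending.pop()
        if key in self.source.data:
            self.target.data[key] = self.source.data[key]
        return True

    def write(self, key, value):
        """Writes go to the owner; during migration they are also
        replicated to the target so its copy does not go stale."""
        self.owner.data[key] = value
        if self.owner is self.source:
            self.target.data[key] = value

    def read(self, key):
        """Reads are always served by the current owner."""
        return self.owner.data.get(key)

    def complete(self):
        """Copy any remaining keys, then hand the slot to the target."""
        while self.copy_next_key():
            pass
        self.owner = self.target
        self.source.data.clear()

    def abort(self):
        """Recover from a failed migration: the source never lost
        ownership, so the partial copy on the target is discarded."""
        self.target.data.clear()
        self.pending = list(self.source.data)


if __name__ == "__main__":
    src = Shard("source", {"player:1": 10, "player:2": 20})
    dst = Shard("target")
    migration = SlotMigration(src, dst)
    migration.copy_next_key()
    migration.write("player:1", 99)        # served by the source
    print(migration.owner.name, migration.read("player:1"))
    migration.complete()
    print(migration.owner.name, migration.read("player:1"))
```

In this model an error on the target is handled by `abort()`, which loses nothing because the source still holds every key.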

Online cluster resizing is a fantastic addition for our ElastiCache for Redis customers. You can resize your ElastiCache for Redis 3.2.10 cluster to scale out or scale in, without any application-side changes. For more information about getting started with clustered Redis and trying to reshard a cluster, see Online Cluster Resizing.

Many of our customers share my excitement:

  • Duolingo is the free, science-based, language education platform that has organically become the most popular way to learn languages online. With over 200 million users and seven billion language exercises completed each month, the company's mission is to make education free, fun, and accessible to all. "Amazon ElastiCache has played an absolutely critical part in our infrastructure from the beginning," said Max Blaze, Staff Operations Engineer at Duolingo. "As we have grown, we have pushed the limits of what is possible with single-shard clusters. ElastiCache for Redis online resharding will allow us to easily scale our Redis clusters horizontally as we grow, greatly simplifying the management of our many Redis clusters, empowering us to scale quickly while also reducing cost across our caching layers, and continue to grow with minimal changes to our current services."

  • Dream11, India's #1 fantasy sports platform with a growing user base, has over 14 million users in South Asia. "We have been using ElastiCache for Redis with sharded configuration since its launch last year, supporting over 14 million users playing fantasy games of cricket, football, and kabaddi. With peak demand of 1.5 million requests per minute and workloads surging by 10X quickly, our platform requires scaling on demand and without downtime. This feature enables us to scale in and scale out our platform to support the fluctuating game demand, without having to over-provision," said Abhishek Ravi, CIO.

  • "At SocialCode, our data and intelligence allow Fortune 500 marketers to know and connect with their customers by harnessing the most important digital media platforms, like Facebook, Instagram, Twitter, Pinterest, Snapchat, and YouTube. Using the new online resharding feature of ElastiCache for Redis will allow us to scale out our ever-growing Audience Intelligence product as we continue to on-board brand data. The ability to perform these scaling operations without downtime is priceless!"

For the What's New announcement, see Amazon ElastiCache for Redis introduces dynamic addition and removal of shards while continuing to serve workloads.

Today marks the 10-year anniversary of Amazon's Dynamo whitepaper, a milestone that made me reflect on how much innovation has occurred in the area of databases over the last decade, and a good reminder of why taking a customer-obsessed approach to solving hard problems can have lasting impact beyond your original expectations.

It all started in 2004 when Amazon was running Oracle's enterprise edition with clustering and replication. We had an advanced team of database administrators and access to top experts within Oracle. We were pushing the limits of what was a leading commercial database at the time and were unable to sustain the availability, scalability and performance needs that our growing Amazon business demanded.

Our straining database infrastructure on Oracle led us to evaluate if we could develop a purpose-built database that would support our business needs for the long term. We prioritized focusing on requirements that would support high-scale, mission-critical services like Amazon's shopping cart, and questioned assumptions traditionally held by relational databases such as the requirement for strong consistency. Our goal was to build a database that would have the unbounded scalability, consistent performance and the high availability to support the needs of our rapidly growing business.

A deep dive on how we were using our existing databases revealed that they were frequently not used for their relational capabilities. About 70 percent of operations were of the key-value kind, where only a primary key was used and a single row would be returned. About 20 percent would return a set of rows, but still operate on only a single table.

With these requirements in mind, and a willingness to question the status quo, a small group of distributed systems experts came together and designed a horizontally scalable distributed database that would scale out for both reads and writes to meet the long-term needs of our business. This was the genesis of the Amazon Dynamo database.

The success of our early results with the Dynamo database encouraged us to write Amazon's Dynamo whitepaper and share it at the 2007 ACM Symposium on Operating Systems Principles (SOSP conference), so that others in the industry could benefit. The Dynamo paper was well-received and served as a catalyst to create the category of distributed database technologies commonly known today as "NoSQL."

Of course, no technology change happens in isolation, and at the same time NoSQL was evolving, so was cloud computing. As we began growing the AWS business, we realized that external customers might find our Dynamo database just as useful as we found it within Amazon.com. So, we set out to build a fully hosted AWS database service based upon the original Dynamo design.

The requirements for a fully hosted cloud database service needed to be at an even higher bar than what we had set for our Amazon internal system. The cloud-hosted version would need to be:

  • Scalable – The service would need to support hundreds of thousands, or even millions of AWS customers, each supporting their own internet-scale applications.
  • Secure – The service would have to store critical data for external AWS customers which would require an even higher bar for access control and security.
  • Durable and Highly-Available – The service would have to be extremely resilient to failure so that all AWS customers could trust it for their mission-critical workloads as well.
  • Performant – The service would need to be able to maintain consistent performance in the face of diverse customer workloads.
  • Manageable – The service would need to be easy to manage and operate. This was perhaps the most important requirement if we wanted a broad set of users to adopt the service.

With these goals in mind, in January 2012 we launched Amazon DynamoDB, our cloud-based NoSQL database service designed from the ground up to support extreme scale, with the security, availability, performance and manageability needed to run mission-critical workloads.

Today, DynamoDB powers the next wave of high-performance, internet-scale applications that would overburden traditional relational databases. Many of the world's largest internet-scale businesses such as Lyft, Tinder and Redfin as well as enterprises such as Comcast, Under Armour, BMW, Nordstrom and Toyota depend on DynamoDB's scale and performance to support their mission-critical workloads.

DynamoDB is used by Lyft to store GPS locations for all their rides, Tinder to store millions of user profiles and make billions of matches, Redfin to scale to millions of users and manage data for hundreds of millions of properties, Comcast to power their XFINITY X1 video service running on more than 20 million devices, BMW to run its car-as-a-sensor service that can scale up and down by two orders of magnitude within 24 hours, Nordstrom for their recommendations engine reducing processing time from 20 minutes to a few seconds, Under Armour to support its connected fitness community of 200 million users, Toyota Racing to make real time decisions on pit-stops, tire changes, and race strategy, and another 100,000+ AWS customers for a wide variety of high-scale, high-performance use cases.

With all the real-world customer use, DynamoDB has proven itself on those original design dimensions:

  • Scalable – DynamoDB supports customers with single tables that serve millions of requests per second, store hundreds of terabytes, or contain over 1 trillion items of data. In support of Amazon Prime Day 2017, the biggest day in Amazon retail history, DynamoDB served over 12.9 million requests per second. DynamoDB operates in all AWS Regions (16 geographic Regions now, with announced plans for six more Regions in Bahrain, China, France, Hong Kong, Sweden, and the United States), so you can have a scalable database in the geographic region you need.
  • Secure – DynamoDB provides fine-grained access control at the table, item, and attribute level, integrated with AWS Identity and Access Management. VPC Endpoints give you the ability to control whether network traffic between your application and DynamoDB traverses the public Internet or stays within your virtual private cloud. Integration with Amazon CloudWatch, AWS CloudTrail, and AWS Config enables support for monitoring, audit, and configuration management. SOC, PCI, ISO, FedRAMP, HIPAA BAA, and DoD Impact Level 4 certifications allow customers to meet a wide range of compliance standards.
  • Durable and Highly-Available – DynamoDB maintains data durability and 99.99 percent availability in the event of a server, a rack of servers, or an Availability Zone failure. DynamoDB automatically re-distributes your data to healthy servers to ensure there are always multiple replicas of your data without you needing to intervene.
  • Performant – DynamoDB consistently delivers single-digit millisecond latencies even as your traffic volume increases. In addition, DynamoDB Accelerator (DAX), a fully managed, highly available, in-memory cache, further speeds up DynamoDB response times from milliseconds to microseconds and can continue to do so at millions of requests per second.
  • Manageable – DynamoDB eliminates the need for manual capacity planning, provisioning, monitoring of servers, software upgrades, applying security patches, scaling infrastructure, monitoring, performance tuning, replication across distributed datacenters for high availability, and replication across new nodes for data durability. All of this is done for you automatically and with zero downtime so that you can focus on your customers, your applications, and your business.
  • Adaptive Capacity – DynamoDB intelligently adapts to your table's unique storage needs, by scaling your table storage up by horizontally partitioning it across many servers, or down with Time To Live (TTL), which deletes items that you marked to expire. DynamoDB provides Auto Scaling, which automatically adapts your table throughput up or down in response to actual traffic to your tables and indexes. Auto Scaling is on by default for all new tables and indexes.

Ten years ago, we never would have imagined the lasting impact our efforts on Dynamo would have. What started out as an exercise in solving our own needs in a customer-obsessed way turned into a catalyst for a broader industry movement towards non-relational databases, and ultimately, an enabler for a new class of internet-scale applications.

As we say at AWS, it is still Day One for DynamoDB. We believe we are in the midst of a transformative period for databases, and the adoption of purpose-built databases like DynamoDB is only getting started. We expect that the next ten years will see even more innovation in databases than the last ten. I know the team is working on some exciting new things for DynamoDB – I can't wait to share them with you over the upcoming months.

As-Salaam-Alaikum: The cloud arrives in the Middle East!

Today, I am excited to announce plans for Amazon Web Services (AWS) to bring an infrastructure Region to the Middle East! This move is another milestone in our global expansion and mission to bring flexible, scalable, and secure cloud computing infrastructure to organizations around the world. Based in Bahrain, this will be the first Region for AWS in the Middle East. The Region will be in the heart of Gulf Cooperation Council (GCC) countries, and we're aiming to have it ready by early 2019. This Region will consist of three Availability Zones at launch, and it will provide even lower latency to users across the Middle East.

This news marks the 22nd AWS Region we have announced globally. We already have 44 Availability Zones across 16 geographic Regions that customers can use today. Another five AWS Regions (and 14 Availability Zones), in China, France, Hong Kong, and Sweden plus another AWS GovCloud (US) Region in the United States, are coming online by the end of 2018.

I'm also excited to announce today that we are launching an AWS Edge Network Location in the United Arab Emirates (UAE) in the first quarter of 2018. This will bring Amazon CloudFront, Amazon Route 53, AWS Shield, and AWS WAF to the region and add to the 84 points of presence AWS has around the world. Despite this rapid growth, we don't plan to slow down or stop there: we will bring infrastructure everywhere needed to meet our customers' expectations.

2017 continues to be a busy year for AWS in the Middle East. Back in January we opened offices in the region to serve our rapidly growing customer base. We now have a presence in Dubai, UAE and Manama, Bahrain with teams of account managers, solutions architects, partner managers, professional services consultants, support staff, and various other functions, so that customers can directly engage with AWS. For the new AWS infrastructure Region we will also be hiring datacenter engineers, support engineers, engineering operations managers, security specialists, and many more. We are continually hiring in the Middle East, so those people looking to join our dynamic and rapidly growing team should visit www.amazon.jobs.

In addition to infrastructure, offices, and jobs, another investment AWS is making for its customers in the Middle East, and around the world, is to run our business in the most environmentally friendly way. One of the important criteria in launching this AWS Region is the opportunity to power it with renewable energy. We chose Bahrain in part due to the country's focus on executing renewable energy goals and its readiness to construct a new solar power facility to meet our power needs. I'm pleased to announce that the Bahrain Energy and Water Authority (EWA) will construct a solar farm that will supply renewable energy to power this infrastructure Region. EWA expects to bring the 100 MW solar farm online in 2019, making it the country's first utility-scale renewable energy project.

You might not know that AWS has a long history of working with customers in the Middle East. We have been supporting the growth of organizations in this part of the world since the early days of our business. We have supported the development of technology skills across the region with Training and Certification programs to help customers develop skills to design, deploy, and operate their infrastructure and applications on the AWS Cloud. We run a range of programs to give people cloud skills, from AWSome Days – a one-day workshop-based training for technical professionals – to online resources such as webinars, whitepapers, articles, and tutorials that help to educate people about AWS.

In the education sector we have been supporting the development of technology and cloud skills amongst tertiary institutes in the Middle East through the AWS Educate program. This provides students and educators with the resources needed to accelerate cloud-related learning. AWS Educate is already available for students attending institutes such as King Abdullah University of Science and Technology in Saudi Arabia, the Higher Colleges of Technology in UAE, Bahrain Polytechnic, University of Bahrain, as well as Oman College of Management and Technology, the Jordan University of Science and Technology, and many others across the region.

For those not in tertiary education, but looking to launch their own business in the Middle East, we have AWS Activate. This gives startups access to guidance and 1:1 time with AWS experts, as well as web-based training, self-paced labs, customer support, third-party offers, and up to $100,000 USD in AWS service credits. We also work with a number of incubators and accelerators in the region. In Saudi Arabia AWS works with the Badir Program for Technology Incubators and Accelerators at King Abdulaziz City for Science and Technology (KACST). Working with Badir, AWS is providing startups access to technology resources as well as expert advice to help Saudi youth entrepreneurship and to grow new businesses in the Kingdom. We also work with AstroLabs in the UAE and the Cloud 10 Scalerator in Bahrain, as well as a number of international startup accelerator and incubator organizations active in the region, such as 500 Startups, Startupbootcamp, and Techstars. For more details on AWS Activate visit https://aws.amazon.com/activate/.

Through supporting new and existing businesses across the Middle East, organizations of all sizes – in UAE, Saudi Arabia, Kuwait, Jordan, Egypt, Bahrain, and other countries – have been increasingly moving their mission-critical applications to AWS. Some of the Middle East's most established enterprises, such as Actel, Al Tayer Group, Batelco, flydubai, Hassan Allam, Middle East Broadcasting Center, Silah Gulf, Souq.com, Union Insurance, United Arab Shipping Company, and many others, are using AWS to drive cost savings, accelerate innovation, and speed time-to-market.

One story from the Middle East I particularly like is flydubai, the leading low-cost airline in the region. flydubai chose to build their online check-in platform on AWS and went from design to production in four months. It is now being used by thousands of passengers a day. They have also reduced the lead time for new infrastructure services from up to 10 weeks to a matter of hours. Given the seasonal fluctuations in demand for flights, flydubai also needs IT infrastructure that allows it to cope with spikes in demand, making this a great use case for cloud.

AWS also works with a number of government organizations across the Middle East. The Bahrain Ministry of Education, the Ministry of Justice, and the Bahrain Institute of Public Administration (BIPA) are moving workloads to AWS. The BIPA has moved their Learning Management System to AWS, reducing costs by over 90%. Another government organization using AWS to reduce costs and increase agility is the Bahrain Information & eGovernment Authority (iGA). The iGA is the government department in charge of moving all government services online and is also responsible for ICT governance and procurement for the Bahrain government. Earlier this year the iGA launched a cloud-first policy, requiring that new government workloads evaluate cloud-based services first. Through adopting a cloud-first policy, they have helped to reduce the government procurement process for new technology from months to less than two weeks. They are also migrating 700 government websites, with more than 50 TB of data, onto AWS, helping them to meet their goal of decommissioning their hosting platform by the end of 2017.

In addition to enterprises and government institutions, startups in the region are also choosing AWS as the foundation for their business and to scale rapidly and expand their geographic reach in minutes. These startups include Alpha Apps, Blu Loyalty, Cequens, DevFactory, Dubizzle, Fetchr, Genie9, Mawdoo3.com, Namshi, OneGCC, Opensooq.com, Payfort, Tajawal, and Ubuy, as well as the Middle Eastern unicorn Careem. Careem runs totally on AWS and over the past five years has grown 10 times in size every year. Another cool startup that comes from the Middle East is Anghami. Anghami is a music service that uses AWS to add over 10,000 tracks a day to its catalogue using Amazon S3. Anghami serves over 750 million monthly streams. Having only been founded in 2012, Anghami has grown rapidly and now has over 50 million users, and offers instant access to over 26 million songs, making it the number one music platform in the Middle East and North Africa (MENA) region.

Alongside customers, we also work with a vibrant partner ecosystem across the Middle East, including AWS Partner Network (APN) Partners that have built cloud practices and innovative technology solutions on AWS. AWS APN Consulting and Technology Partners in the Middle East that are helping customers to migrate to the cloud include Al Moayyed Computers, Batelco, C5, du, DXC Technology, Falcon 9, Infonas, Integra Technologies, ITQAN Cloud, Human Technologies, Kaar Technologies, Navlink, Redington, Zain, and many others. As we head toward the opening of the AWS Middle East Region we look forward to working with many more partners.

Despite being active in the Middle East for many years, and the rapid growth we have seen, it is still Day One for us at AWS. We are excited to see the new applications, businesses, and entire industries that will grow in the Middle East in the coming years thanks to the cloud. We also look forward to working with customers from startups to enterprise, public to private sector, and many more as we grow our business in the Middle East and around the world. For more information on our activities in the Middle East, including webinars, meetups, customer case studies, and more, please visit the AWS Middle East page at https://aws.amazon.com/aws-me/.

This article titled "Wie Unternehmen vom Vormarsch des maschinellen Lernens profitieren können" appeared in German last week in the "Digitalisierung" column of WirtschaftsWoche.

When a technology has its breakthrough can often only be determined in hindsight. In the case of artificial intelligence (AI) and machine learning (ML), this is different. ML is the part of AI that describes rules and recognizes patterns from large amounts of data in order to predict future data. Both concepts are virtually omnipresent and at the top of most buzzword rankings.

Personally, I think – and this is clearly linked to the rise of AI and ML – that there has never been a better time than today to develop smart applications and use them. Why? Because three things are coming together. First: Users across the globe are capturing data digitally, whether this is in the physical world through sensors or GPS, or online through click stream data. As a result, there is a critical mass of data available. Secondly, there is enough affordable computing capacity in the cloud for companies and organizations, no matter what their size, to use intelligent applications. And thirdly, an "algorithmic revolution" has taken place, meaning it is now possible to train trillions of algorithms simultaneously, making the whole machine learning process much faster. This has allowed for more research, which has resulted in reaching the "critical mass" in knowledge that is needed to kick off an exponential growth in the development of new algorithms and architectures.

We may have come a relatively long way with AI, but the progress came quietly. After all, during the last 50 years, AI and ML were fields that had only been accessible to an exclusive circle of researchers and scientists. That is now changing, as packages of AI and ML services, frameworks and tools are today available to all sorts of companies and organizations, including those that don't have dedicated research groups in this field. The management consultants at McKinsey expect that the global market for AI-based services, software and hardware will grow annually by 15-25% and reach a volume of around USD 130 billion in 2025. A number of start-ups are using AI algorithms for all things imaginable – searching for tumors in medical images, helping people learn foreign languages, or automating claims handling at insurance companies. At the same time, entirely new categories of applications are being created whereby a natural conversation between man and machine is taking center-stage.

Progress through machine learning

Is the hype surrounding AI and ML even justified? Definitely, because they offer business and society fascinating possibilities. With the help of digitization and high-performance computers, we are able to replicate human intelligence in some areas, such as computer vision, and even surpass the intelligence of humans. We are creating very diverse algorithms for a wide range of application areas and turning these individual pieces into services so that ML is available for everyone. Packaged into applications and business models, ML can make our life more pleasant or safer. Take autonomous driving: 90% of car accidents in the US can be traced to "human failure". The assumption is that the number of accidents will decline over the long term if vehicles drive autonomously. In aviation, this has already been reality for a long time.

MIT pioneers Erik Brynjolfsson and Andrew McAfee predict that the macroeconomic effect of the so-called "second machine age" will be comparable to what the steam engine once unleashed when it replaced humans' muscular strength ("the first machine age"). Many are uncomfortable with the idea that an artificial intelligence exists alongside human intelligence. That is understandable. We must therefore discuss – parallel to the technological developments – how humans and AI can co-exist in the future; the moral and ethical aspects that arise; how to ensure we have a good grip on AI; and which legal parameters we need in order to manage all this. Answering these questions will be just as important as the effort to solve the technological challenges, and neither dogmas nor ideologies will help. Instead, what's needed is an objective, broad-based debate that takes into account the wellbeing of society as a whole.

Machine Learning at Amazon

For the past 20 years, thousands of software engineers at Amazon have been working on ML. We dare to claim that we are the company that has been applying AI and ML as a business technology the longest. We know that innovative technologies always take off whenever barriers to entry fall for market participants.

That is the case right now with AI and ML. In the past, anyone who wanted to use AI had to start from scratch: develop algorithms and feed them with enormous amounts of data – even if the application was later needed only for a strictly confined context. This is referred to as "weak" AI. Many of the consumer interfaces that everyone is familiar with today, such as recommendations, similarities or autofill functions for search prediction, are ML driven. By now, ML can also predict inventory levels or vendor lead times, detect customer problems and automatically deduce how to solve them, and discover counterfeit goods and sort out abusive reviews, thereby protecting our customers from fraud. But that is only the tip of the iceberg. At Amazon, we are sitting on billions of historical order records, which allows us to create other AI/ML-based models for many different kinds of functionalities, for example programming interfaces that developers can use to analyze images, turn text into lifelike speech, or create chatbots. But ultimately, there is something to be found for everyone who wants to define models, train them, and then scale. Pre-configured, attuned libraries and deep learning frameworks are widely available, which allow anyone to get started very fast.

Companies like Netflix, Nvidia, or Pinterest use our capabilities in ML and deep learning. More and more layers are being created in a kind of ecosystem on which companies and organizations can 'dock' their business – depending on how deep they want to, and are able to, immerse themselves in the subject matter. What is decisive is the openness of the layers and the reliable availability of the infrastructure. In the past, AI technologies were so expensive that it was hardly worth it to use them. Today, AI and ML technologies are available off the shelf, and they can be called up according to one's individual requirements. They form the basis for new business models. Even users who are not AI specialists can very easily and affordably incorporate the building blocks into their own services. In particular, small and medium-sized companies with innovative strength can benefit. They do not have to learn any complex ML algorithms and technologies, and they can experiment without incurring high costs.

Artificial intelligence helps to satisfy the customer

One of the most advanced areas of application is e-commerce. AI-supported pre-selection mechanisms help companies to free their customers' decision making from complexity. The ultimate goal is customer satisfaction. If there are only three types of toothpaste, the customer can easily pick one and feel good about it. When more than 50 kinds are on offer, the choice becomes complicated. You have to decide, but you're not sure if the decision is the right one. The more possibilities there are, the more difficult it becomes for the customer. Our best-known algorithms come from this field: filtering product suggestions based on one's purchase history of products with similar attributes, or on the behavior of other customers who were interested in similar things.

Of course, consistent quality also contributes to the satisfaction of the customer. Intelligent support makes life easier for the provider and the customer. For Amazon Fresh, for example, we have developed algorithms that learn how fresh groceries have to look, how long this state lasts, and when food should no longer be sold. Airlines or rail transport companies could also use this for their quality control by running an algorithm based on the image data of the freight; the algorithm would recognize damaged goods and automatically sort them out.

If you can predict demand, you can plan more efficiently

In B2B and B2C businesses, it is critical that goods are available quickly. It is for this reason that we at Amazon have developed algorithms that can predict the daily demand for goods. This is particularly complex for fashion goods, which are always available in many different sizes and variations and for which reorder possibilities are very limited. Information about past demand, among other things, is fed into our system, as are the fluctuations that can occur with seasonal goods, the effect of special offers, and the sensitivity of customers to price shifts. Today we can predict precisely how many shirts in a certain size and color will be sold on a defined day. We have tackled this issue and made the technology available to other companies as a web service. MyTaxi, for example, benefits from our ML-based service to plan at what time and at which place the customer will need the vehicle.

New division of labor

But AI is much more than just forecasting. In the field of fulfillment, which is relevant for numerous industry sectors, we are exploring how AI can contribute the most to taking another step away from a Tayloristic work pattern. Applied in robots, AI can free people from routine activities that are physically difficult and often stressful. Machines are very good at, and sometimes even outperform humans in, tasks that are complicated for a human to do, such as finding the optimal route in a warehouse for a certain number of orders and transporting heavy goods to the point where they are sent to the customer. For supposedly easy tasks, by contrast, the robot is overwhelmed; an example is recognizing a box that has landed on the wrong shelf. So how do we bring together the best of both players? By letting intelligent robots learn from humans how to identify the right goods, take on various orders and navigate their way autonomously through the warehouse on the most efficient route. This is how we take away the most tedious part of the task and shift resources to more interaction with the customer.

Our client SCDM uses the core idea of freeing up resources for "human" strengths, but in a completely different context. SCDM is a service provider that supports banks and insurance companies with digitization. Using AI, SCDM enables its customers to classify documents that are of very different formats (PDF, Excel or photos), for example a report about the performance of an investment product that contains hundreds of pages. By scanning hundreds of thousands of documents simultaneously, SCDM's algorithm recognizes which document is relevant for a specific request, finds out where relevant data for a specific type of preparation is located, and then extracts the data from the document. As a result, there is less bias and fewer errors in the number crunching, and more time for human interaction with important stakeholders like investors, analysts and other customers.

Machine learning in education, medicine and development aid

In addition to their potential for things like efficiency and productivity, ML and AI can also be used in education. Duolingo, which offers free language course apps, uses text-to-language algorithms to assess and correct learners' pronunciation. In medicine, AI supports doctors in analyzing X-ray, CT or MRI images. The World Bank also uses AI in order to implement infrastructure programs, development aid and other measures in a more targeted manner in the future.

More room for optimism

Despite all these developments, many people from academia, business and government have a critical view of ML and AI. There have been warnings that a new super-intelligence is jeopardizing our civilization – and these warnings have been effective in attracting publicity.

However, neither hysteria nor euphoria should be allowed to get the upper hand in the public debate. What we need instead is a pragmatic-optimistic view of the emerging possibilities. AI enables us to get rid of tasks in our work which damage our health or where machines are better than we are. Not with the goal of making ourselves redundant. Rather, in order to gain more personal and economic freedom – for interpersonal relationships, for our creativity and for everything that we humans can do better than machines. That is what we should strive for. If we don't, we will ultimately forgo the economic and societal opportunities that we could have grasped.

Improving Customer Service with Amazon Connect and Amazon Lex

Customer service is central to the overall customer experience that all consumers are familiar with when communicating with companies. That experience is often tested when we need to ask for help or have a question to be answered. Unfortunately, we've become accustomed to providing the same information multiple times, waiting on hold, and generally spending a lot more time than we expected to resolve our issue when we call customer service.

When you call for customer assistance, you often need to wait for an agent to become available after navigating a set of menus. This means that you're going to wait on hold regardless of whether your issue is simple or complex. Once connected, the systems that power call centers generally don't do a good job of using and sharing available information. Therefore, you often start out anonymous and can't be recognized until you've gone through a scripted set of questions. If your issue is complex, you may end up repeating the same information to each person you talk to, because context is not provided with the handoff. It's easy to end up frustrated by the experience, even if your issue is successfully resolved.

At Amazon, customer obsession is a fundamental principle of how we operate, and it drives the investments we make. Making sure that customers have a great experience when they need to call us is something that we've invested a lot of time in. So much so, that in March 2017, we announced Amazon Connect, which is the result of nearly ten years of work to build cloud-based contact centers at scale to power customer service for more than 50 Amazon teams and subsidiaries, including Amazon.com, Zappos, and Audible. The service allows any business to deliver better over-the-phone customer service at lower cost.

When we set out to build Amazon Connect, we thought deeply about how artificial intelligence could be applied to improve the customer experience. AI has incredible potential in this area. Today, AWS customers are using the cloud to better serve their customers in many different ways. For instance, Zillow trains and retrains 7.5 million models every day to provide highly specific home value estimates to better inform buyers and sellers. KRY is helping doctors virtually visit patients and accurately diagnose ailments by applying machine learning to symptoms. Netflix is using machine learning to provide highly personalized recommendations to over 100 million subscribers. There are really exciting projects everywhere you look, including call centers.

When Amazon Connect launched, we spoke about the integration with Amazon Lex. One of the really interesting trends in machine learning lately has been the rise of chatbots, because they are well suited to fulfilling customer requests with natural language. Amazon Lex, which uses the same conversational technology as Amazon Alexa, is Amazon Web Services' deep-learning powered chatbot platform. By linking Amazon Lex chatbots into the Amazon Connect contact flow, customers are able to get help immediately without relying on menus or specific voice commands. For example, an Amazon Lex driven conversation with your dentist's office might look like this…

Connect: "Hello, thanks for calling. Is this Jeff?"

Jeff: "Yes"

Connect: "I see you have a cleaning appointment this Friday. Are you calling to confirm?"

Jeff: "No, actually."

Connect: "Ok, what are you calling about?"

Jeff: "I'd like to change my appointment to be next Monday."

Connect: "No problem, I have availability on Monday July 3rd at 11:00 AM. Does that work?"

Jeff: "That's fine."

Connect: "Great. I have booked an appointment for you on Monday, July 3rd at 11:00 AM. Is there anything else I can help you with?"

Jeff: "Can you send me a text confirmation?"

Connect: "Sure. I have sent a text message confirmation of your appointment to your cell. Can I do anything more for you?"

Jeff: "No, that's great. Bye."

The chatbot is able to quickly and naturally handle the request without waiting for an agent to become available, and the customer was never presented with menus or asked for information the office already had. AWS Lambda functions made the corresponding calls to the database and scheduling software, making sure that the interaction happened quickly and at extremely low cost. The workflow-based functionality of Amazon Lex and Amazon Connect also helps to reduce mistakes by making sure interactions play out consistently every time.
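To make the role of the AWS Lambda function more concrete, here is a minimal sketch of what a fulfillment handler for the rescheduling request could look like. It assumes the event and response shapes of the original Amazon Lex Lambda interface (`currentIntent`, `sessionAttributes`, and a `dialogAction` of type `Close`); please check these against the current Lex documentation. The intent name `RescheduleAppointment`, the slot `AppointmentDate`, the `customerId` session attribute, and the in-memory `OPEN_SLOTS` and `BOOKINGS` tables standing in for the scheduling software are all hypothetical.

```python
# Stand-ins for the office's scheduling software (hypothetical data).
OPEN_SLOTS = {"2017-07-03": ["11:00"]}
BOOKINGS = {}


def close(fulfillment_state, text, session_attributes=None):
    """Build a 'Close' response that ends the dialog with a plain-text message."""
    return {
        "sessionAttributes": session_attributes or {},
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": fulfillment_state,
            "message": {"contentType": "PlainText", "content": text},
        },
    }


def lambda_handler(event, context):
    """Fulfill the (hypothetical) RescheduleAppointment intent."""
    intent = event["currentIntent"]
    session = event.get("sessionAttributes") or {}
    if intent["name"] != "RescheduleAppointment":
        return close("Failed", "Sorry, I can't help with that yet.", session)

    date = intent["slots"].get("AppointmentDate")
    times = OPEN_SLOTS.get(date, [])
    if not times:
        return close("Failed", f"I have no availability on {date}.", session)

    time = times.pop(0)  # reserve the first free time on that day
    BOOKINGS[session.get("customerId", "unknown")] = (date, time)
    return close(
        "Fulfilled",
        f"I have booked an appointment for you on {date} at {time}.",
        session,
    )
```

In a real deployment, the two dictionaries would be replaced by calls to the database and the scheduling system; the handler itself stays small because the conversation flow is managed by Amazon Lex and Amazon Connect.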

If the customer's issue cannot be resolved by the chatbot, Amazon Lex is able to pass on the full context of the conversation to a human representative. This keeps the customer from wasting time repeating answers to questions and lets the representative focus 100% of their time on solving the problem, which increases the odds that the customer is going to complete the call feeling positive about the experience.

Today, we're announcing the general availability of Amazon Lex integration with Amazon Connect. We've also enhanced the speech recognition models used by Amazon Lex to support integration with other call center providers as well, so that all telephony systems can start using AI to improve customer interactions.

We think artificial intelligence has a lot of potential to improve the experience of both customers and service operations. Customers can get to a resolution fast with more personalization, and human representatives will be able to spend more time resolving customer questions.

Getting Started: Amazon Connect is available to all customers in the US East (N. Virginia) region. You can get started by visiting https://aws.amazon.com/connect. Additional information on Amazon Lex integration can be found at https://aws.amazon.com/connect/connect-lexchatbot.

Stop waiting for perfection and learn from your mistakes

This article titled "Wartet nicht auf Perfektion – lernt aus euren Fehlern!" appeared in German last week in the "Digitalisierung" column of WirtschaftsWoche.

"Man errs as long as he doth strive." Goethe, the German prince of poets, knew that already more than 200 years ago. His words still ring true today, but with a crucial difference: Striving alone is not enough. You have to strive faster than the rest. And while there's nothing wrong with striving for perfection, in today's digital world you can no longer wait until your products are near perfection before offering them to your customers. If you do, you will fall behind in your market.

So if you can't wait for perfection, what should you do instead? I believe the answer is to experiment aggressively with your product development, accepting the possibility that some of your experiments will fail.

Anyone who has listened to, or worked with, management gurus knows their mantra: Failure is a necessary part of progress. That's true, but there's often a big gap between the management theory and the reality on the ground. People want to experiment and learn from things that go wrong. But in the flurry of day-to-day business, they're not given enough time to really reflect on the cause of an error and what to do differently next time.

The solution is to find a systematic approach that prevents errors from repeating themselves.

From perfection to anti-fragility

In finding such a systematic way, you first need to distinguish between two types of errors that can happen in your company: those of technology and those of human decision-making. The nice thing is: if you know how to deal effectively with the first, you might end up being better in the second, making better decisions. The financial mathematician and essayist Nassim Taleb offers an interesting take on this issue. He has argued that errors are incredibly valuable because they lead to innovation. He uses the term 'anti-fragility' to make his point. Today's digital business models require smaller, frequent releases to reduce risk. That means the technologies underpinning these new business models must be more than just robust. They must be 'anti-fragile'. The main feature of anti-fragile technology is that it can 'err' without falling apart. In fact, a crisis can make it even better.

At Amazon, we also require our systems and customer solutions to be anti-fragile, and we do that by designing our systems to stand the test of time. Our systems must be able to evolve and become more resilient to failure. They must become more powerful and more feature-rich over time as a result of learning from customer feedback and any failure modes they may encounter while operating the systems.

An example of a German company that has become 'anti-fragile' is HARTING, the world's leading provider of heavy pluggable connectors for machines and plants. HARTING shows how to think a step ahead about the meaning of quality standards in the digital world. Quality and trust are the most important values for this traditional company, and Industry 4.0 and the digital transformation have already been important focus areas for them since 2011. Even though it was hard to accept at first, HARTING has meanwhile realized that errors are inevitable. For that reason, its development switched to agile methods. It also uses the "minimum-viable-product" approach and relies on microservices for its software. Working this way, HARTING can discard things and create new things more easily. All in all, HARTING has become faster.

That can be seen with HARTING MICA, an edge computing solution that enables older machines and plants to get a digital retrofit. The body and hardware still reflect HARTING's standard of perfection. But for the software, the goal is "good enough", because a microservice is neither ever finished nor perfect. As a result, wrong decisions and mistakes can be corrected very quickly and systems can mature faster, approaching the state of antifragility. If the requirements change or better software technologies become available, each microservice can be thrown out and a new one created. That's how you gain speed and quickly digitize old machines and connect them to the cloud within a manageable cost framework.

Taking the dread out of mistakes

If you want to become anti-fragile, more than robust, like HARTING and other companies, you need to proactively look for the weak spots in a system as you experiment. In a system that should evolve, all sorts of errors will happen that you weren't able to predict, especially when systems need to scale into unknown territories. So subject your system to continuous failures and make subsystems artificially fail using tools like Netflix's Chaos Monkey.
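The idea of making subsystems fail on purpose can be tried out on a toy model before touching a real system. The following sketch is not Chaos Monkey and does not use its API; it simply simulates a pool of web servers, terminates a random one, and checks that requests are still answered. The classes `Instance` and `Pool` and the function `inject_failure` are hypothetical names chosen for this illustration.

```python
import random


class Instance:
    """A simulated server that can be switched off."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


class Pool:
    """Routes each request to the first instance that answers."""

    def __init__(self, instances):
        self.instances = instances

    def handle(self, request):
        for instance in self.instances:
            try:
                return instance.handle(request)
            except ConnectionError:
                continue  # fail over to the next instance
        raise RuntimeError("no healthy instance left")


def inject_failure(pool, rng):
    """Terminate one randomly chosen live instance; return it, or None if none is left."""
    alive = [i for i in pool.instances if i.alive]
    if not alive:
        return None
    victim = rng.choice(alive)
    victim.alive = False
    return victim


pool = Pool([Instance(f"web-{n}") for n in range(3)])
rng = random.Random(42)
victim = inject_failure(pool, rng)
print(f"terminated {victim.name}; pool still answers: {pool.handle('GET /')}")
```

Running such an experiment regularly shows whether the failover path really works, instead of finding out during an actual outage.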

If you do all of this, you will start to objectify errors at your company and make dealing with errors a matter of normality. And when errors become 'business as usual', no one will be afraid of taking a risk, trying out a new idea, a new product or a new service and seeing what happens when customers interact with it. That's how you quickly find solutions that really work in the future.

At Amazon, our approach for systematically and constructively dealing with errors is called the "cause of error" method. It refrains from seeking "culprits". Instead it documents learning experiences and derives actions that ultimately improve the availability of our systems.

From root cause to innovation

The method first calls for fixing an error by analyzing its immediate root cause and taking steps to mitigate the damage and restore the initial running state as quickly as possible. But we are not content with that result. We go further, trying to extract the maximum amount of insight from the incident. And this process begins as soon as everything is working again for the client.

A key element of our cause-of-error method is asking 5 'Why?' questions (a technique that originated in quality control in manufacturing). This is important because it determines the fundamental root of the problem.

Take the case of a website: Why was it down last Friday? The web servers reported timeouts. Why were there timeouts? Because our web servers were overloaded and couldn't cope with the high traffic. Why were the web servers overloaded? Because we didn't have enough web servers to handle all requests at peak times. Why didn't we have enough web servers? Because we didn't consider possible peaks in demand in our planning. Why didn't we take peaks in demand into account in our planning? By the end of this process, we know exactly what happened and which clients were affected. Then we're in a position to distill an action plan that ensures that specific error doesn't happen again.

Quite often, applying this cause-of-error approach allows us to find breakthrough innovations, in the spirit of Nassim Taleb. That's how the solution Auto Scaling was created, after a certain client segment was fighting with strongly fluctuating hits on their website. When the load increases for a website, Auto Scaling automatically spins up an additional web server to service the rising number of requests. Conversely, when the load subsides, Auto Scaling turns off web servers that are not needed in order to save cost.
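The scaling behavior described above boils down to a simple rule: add servers when load rises, remove them when it falls, and stay within fixed limits. Here is a minimal sketch of such a rule. It is not the Auto Scaling API; the function `desired_servers` and all numbers (capacity per server, target utilization, minimum and maximum fleet size) are assumptions made for illustration.

```python
import math


def desired_servers(requests_per_sec, capacity_per_server=100,
                    target_utilization=0.7, min_servers=2, max_servers=20):
    """Return how many servers keep average utilization at or below the target."""
    needed = math.ceil(requests_per_sec / (capacity_per_server * target_utilization))
    # Never scale below the minimum or above the maximum fleet size.
    return max(min_servers, min(max_servers, needed))


# Hypothetical load curve: quiet, rising, peak, beyond the limit, quiet again.
for load in [50, 400, 1200, 3000, 300]:
    print(f"{load:>5} req/s -> {desired_servers(load)} servers")
```

The lower bound keeps the site available during quiet periods, and the upper bound caps cost if traffic spikes far beyond what was planned for.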

What it reveals is: Organizations need to look beyond superficial success. This is true for the development of systems as well as business models. If you want to remain agile in a complex environment, you must follow this path, even if it means leaving the comfort zone. If we transfer these ideas into an organizational context, three aspects might be worth considering:

1. Embrace error as a matter of fact

Jeff Bezos once said about Amazon: "I believe we are the best place in the world to fail." That inspires a lot of our people to experiment, find errors and turn them into something innovative. A statement like this encourages your people to actively look for errors, and to turn them into pieces of innovation. And: reward employees when they find errors. What we have learned from our development work at Amazon is that you need to always look beyond the surface of an error. Some of our best products have been born from errors.

2. Make do with incomplete information

German companies have a tradition of being thorough and perfectionist. In the digital world, however, you need to loosen those principles a bit. Technology is changing so fast; you need to be fast too. Make decisions even if the information you have is not as complete as you would like. Jeff Bezos put his finger on that when he wrote in his most recent letter to shareholders that "most decisions should probably be made with somewhere around 70% of the information you wish you had. If you wait for 90%, in most cases, you're probably being slow. Plus, either way, you need to be good at quickly recognizing and correcting bad decisions. If you're good at course correcting, being wrong may be less costly than you think, whereas being slow is going to be expensive for sure."

3. Praise the value of learning

I've stressed the need for companies to have a systematic approach to how they deal with errors. But your approach will only work if it's part of your overall culture. Make sure you understand your DNA and know what people are thinking and talking about on the work floor. Openly praising experimentation in product development and encouraging people to find errors will come across as empty rhetoric if your employees really do have reason to fear repercussions for themselves personally if they make mistakes.

It is a matter of leadership to foster and shape a culture of experimentation that is practiced day in, day out.

Whatever companies come up with in order to systematically learn from mistakes, it will make them better in competing in the digital world. And it will give them the freedom and courage to take their systems, solutions and business models to a higher level.