Back when Jeff Bezos filled orders in his garage and drove packages to the post office himself, crunching the numbers on costs, tracking inventory, and forecasting future demand was relatively simple. Fast-forward 25 years, and Amazon's retail business has more than 175 fulfillment centers (FCs) worldwide, with over 250,000 full-time associates shipping millions of items per day.
Amazon's worldwide financial operations team has the incredible task of tracking all of that data (think petabytes). At Amazon's scale, a miscalculated metric, like cost per unit, or delayed data can have a huge impact (think millions of dollars). The team is constantly looking for ways to get more accurate data, faster.
That's why, in 2019, they had an idea: build a data lake that can support one of the largest logistics networks on the planet. It would later become known internally as the Galaxy data lake. The Galaxy data lake was built in 2019, and teams across the organization are now moving their data into it.
A data lake is a centralized secure repository that allows you to store, govern, discover, and share all of your structured and unstructured data at any scale. Data lakes don't require a pre-defined schema, so you can process raw data without having to know what insights you might want to explore in the future. The following figure shows the key components of a data lake.
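To make the point about schemas concrete, here is a minimal sketch of the "schema on read" idea, assuming raw events are stored as JSON lines. The record fields (`order_id`, `fc`, `units`, `cost`, `carrier`) and the helper names are hypothetical and chosen only for illustration; they do not describe the Galaxy data lake itself.

```python
import json

# Raw events land in the lake as-is; no schema is enforced at write time.
# All field names and values below are made up for illustration.
raw_records = [
    '{"order_id": 1, "fc": "SEA1", "units": 3, "cost": 12.60}',
    '{"order_id": 2, "fc": "SEA1", "units": 1, "cost": 5.10, "carrier": "X"}',
    '{"order_id": 3, "fc": "DUB2", "units": 2}',
]


def read_with_schema(lines, fields):
    """Apply a schema at read time: keep only the requested fields,
    filling any field a record lacks with None."""
    for line in lines:
        record = json.loads(line)
        yield {name: record.get(name) for name in fields}


def cost_per_unit(lines):
    """Compute cost per unit over the raw data, skipping records with no cost."""
    total_cost = 0.0
    total_units = 0
    for row in read_with_schema(lines, ["units", "cost"]):
        if row["cost"] is None:
            continue
        total_cost += row["cost"]
        total_units += row["units"]
    return total_cost / total_units


if __name__ == "__main__":
    print(cost_per_unit(raw_records))
    print(list(read_with_schema(raw_records, ["order_id", "carrier"])))
```

The same raw records answer two different questions (cost per unit, and which orders name a carrier) without either question having been anticipated when the data was written.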
Have you ever received a call from your bank because they suspected fraudulent activity? Most banks can automatically identify when spending patterns or locations have deviated from the norm and then act immediately. Many times, this happens before victims even notice that something is off. As a result, the impact of identity theft on a person's bank account and life can be managed before it's even an issue.
Having a deep understanding of the relationships in your data is powerful like that.
Consider the relationships between diseases and gene interactions. By understanding these connections, you can search for patterns within protein pathways to find other genes that may be associated with a disease. This kind of information could help advance disease research.
The deeper the understanding of the relationships, the more powerful the insights. With enough relationship data points, you can even make predictions about the future (like with a recommendation engine). But as more data is connected, and the size and complexity of the connected data increases, the relationships become more complicated to store and query.
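As a rough illustration of querying relationships, the sketch below follows the gene example: starting from genes already linked to a disease, it walks shared pathways to find other candidate genes. The identifiers (`G1`, `P1`, `D1`) and the function name are hypothetical, and the in-memory dictionaries stand in for whatever graph store would be used in practice.

```python
from collections import defaultdict

# Hypothetical edges, for illustration only.
gene_pathway = [("G1", "P1"), ("G2", "P1"), ("G3", "P2"), ("G4", "P2"), ("G5", "P3")]
gene_disease = [("G1", "D1"), ("G3", "D1")]


def candidate_genes(disease):
    """Return genes that share a pathway with a gene already linked to
    `disease`, excluding the genes that are already known."""
    pathway_to_genes = defaultdict(set)
    gene_to_pathways = defaultdict(set)
    for gene, pathway in gene_pathway:
        pathway_to_genes[pathway].add(gene)
        gene_to_pathways[gene].add(pathway)

    known = {gene for gene, linked in gene_disease if linked == disease}

    # Two hops: known gene -> pathway -> every gene on that pathway.
    candidates = set()
    for gene in known:
        for pathway in gene_to_pathways[gene]:
            candidates |= pathway_to_genes[pathway]
    return sorted(candidates - known)


if __name__ == "__main__":
    print(candidate_genes("D1"))
```

Even this two-hop query needs two indexes built by hand. Each extra hop or relationship type adds more bookkeeping, which is the storage and query complexity the paragraph above describes.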
During AWS re:Invent 2019, we announced a number of High Performance Computing (HPC) innovations including the Amazon EC2 M6g, C6g, and R6g instances powered by next-generation Arm-based AWS Graviton2 Processors. We also recently announced that new AMD-powered, compute-optimized EC2 instances are in the works.
Today, I'm happy to share some exciting news about our HPC solutions. On November 18, AWS won six HPCwire Readers' and Editors' Choice Awards at SC19, the International Conference for High Performance Computing, Networking, Storage, and Analysis.
Today, I am happy to announce our plans to open a new AWS Region in Spain in late 2022 or early 2023! I'm excited by the opportunities the availability of hyperscale infrastructure will bring to Spanish organizations of all sizes. When the AWS Europe (Spain) Region is launched, developers, startups, and enterprises, as well as government, education, and non-profit organizations will be able to run their applications and serve end users across the region from data centers located in Spain.
Currently, AWS provides 69 Availability Zones across 22 infrastructure regions worldwide, with announced plans for thirteen more Availability Zones and four more Regions in Indonesia, Italy, South Africa, and Spain in the next few years. The new AWS Europe (Spain) Region will consist of three Availability Zones (AZs) at launch, and will be AWS's seventh Region in Europe, joining existing Regions in Dublin, Frankfurt, London, Paris, and Stockholm, and the upcoming Milan Region launching in early 2020. AZs are data centers in separate, distinct locations within a single Region. They are engineered to be operationally independent of other AZs, with independent power, cooling, and physical security, and are connected via a low-latency network. AWS customers focused on running highly available applications can architect their applications to run in multiple AZs to achieve even higher fault tolerance.
Today is another milestone for us in Spain. This Region adds to other investments we have been making, over the past years, to provide customers with advanced and secure cloud technologies.
There are places so remote, so harsh that humans can't safely explore them (for example, hundreds of miles below the earth, areas that experience extreme temperatures, or on other planets). These places might have important data that could help us better understand earth and its history, as well as life on other planets. But they usually have little to no internet connection, making the challenge of exploring environments inhospitable for humans seem even more impossible.
How do we push the boundaries of what's possible?
The answer to this question is actually on your phone, your smartwatch, and billions of other places on earth—it's the Internet of Things (IoT). Connected devices allow us to extend our senses to remote locations, such as a robot carrying out work on Mars or monitoring remote oil wells.
This is the exciting future for IoT, and it's closer than you think. Already, IoT is delivering deep and precise insights to improve virtually every aspect of our lives. Here are a few examples:
- IoT sensors in a factory can monitor and predict equipment failure before an accident.
- Healthcare providers can provide remote monitoring of patient health—improving patient care.
- Security cameras can better protect people with real-time notifications.
Because these IoT devices are powered by microprocessors or microcontrollers that have limited processing power and memory, they often rely heavily on AWS and the cloud for processing, analytics, storage, and machine learning. But as the number of IoT devices and use cases grows, people are finding that managing these connected devices presents new challenges. Sometimes an internet connection is weak or not available at all, as is often the case in remote locations. For some applications, a trip to the cloud and back isn't possible because of latency requirements (for example, an autonomous car interpreting its environment in real time).
There's also the cost of sending data to the cloud to consider. Some sensors, like those in factories, collect an incredible amount of data, and sending it all to the cloud could get expensive. These barriers are driving some people to the edge, literally.
In this post, I want to talk about edge computing: the ability to place compute resources and decision-making capabilities in disparate locations, often with intermittent or no connectivity to the cloud. In other words, edge computing processes data closer to where it's created.
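One way to picture this is a device that reacts to readings locally, condenses raw data into summaries, and only uploads when a connection happens to be available. The sketch below is a simplified illustration under those assumptions. The `EdgeAggregator` class, its parameters, and the message format are all hypothetical and do not represent any particular AWS service or SDK.

```python
from statistics import mean


class EdgeAggregator:
    """Buffers sensor readings on the device and queues compact messages
    for the cloud instead of forwarding every raw reading."""

    def __init__(self, window_size, alert_threshold):
        self.window_size = window_size
        self.alert_threshold = alert_threshold
        self.buffer = []  # raw readings in the current window
        self.outbox = []  # messages waiting for connectivity

    def ingest(self, reading):
        # Decide locally, without a round trip to the cloud.
        if reading > self.alert_threshold:
            self.outbox.append({"type": "alert", "value": reading})
        self.buffer.append(reading)
        # Once the window is full, replace the raw readings with one summary.
        if len(self.buffer) >= self.window_size:
            self.outbox.append({
                "type": "summary",
                "count": len(self.buffer),
                "mean": mean(self.buffer),
                "max": max(self.buffer),
            })
            self.buffer = []

    def flush(self, send, connected):
        """Send queued messages if connected; otherwise keep them for later.
        Returns the number of messages sent."""
        if not connected:
            return 0
        sent = 0
        while self.outbox:
            send(self.outbox.pop(0))
            sent += 1
        return sent


if __name__ == "__main__":
    aggregator = EdgeAggregator(window_size=3, alert_threshold=100)
    for value in [10, 20, 150, 30]:
        aggregator.ingest(value)
    print(aggregator.flush(print, connected=False))  # nothing is sent while offline
    aggregator.flush(print, connected=True)
```

The sketch touches each of the three barriers: the alert is raised on the device (latency), queued messages survive an outage (connectivity), and four raw readings shrink to two messages (cost).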
Innovation has always been part of the Amazon DNA, but about 20 years ago, we went through a radical transformation with the goal of making our iterative process—"invent, launch, reinvent, relaunch, start over, rinse, repeat, again and again"—even faster. The changes we made affected both how we built applications and how we organized our company.
Back then, we had only a small fraction of the number of customers that Amazon serves today. Still, we knew that if we wanted to expand the products and services we offered, we had to change the way we approached application architecture.
The giant, monolithic "bookstore" application and giant database that we used to power Amazon.com limited our speed and agility. Whenever we wanted to add a new feature or product for our customers, like video streaming, we had to edit and rewrite vast amounts of code on an application that we'd designed specifically for our first product—the bookstore. This was a long, unwieldy process requiring complicated coordination, and it limited our ability to innovate fast and at scale.
I'm happy to announce today that the new AWS Middle East (Bahrain) Region is now open! This is our first AWS Region in the Middle East and I'm excited by the opportunities the availability of hyperscale infrastructure will bring to organizations of all sizes. Starting today, developers, startups, and enterprises, as well as government, education, and non-profit organizations can run their applications and serve end users across the region from data centers located in the Middle East.
With this launch, our infrastructure now spans 69 Availability Zones across 22 geographic regions around the world. We have also announced plans for nine more Availability Zones in three more AWS Regions in Indonesia, Italy, and South Africa coming online in the next few years. The new AWS Middle East (Bahrain) Region offers three Availability Zones (AZs) at launch. AZs are data centers in separate, distinct locations within a single Region. They are engineered to be operationally independent of other AZs, with independent power, cooling, and physical security, and are connected via a low-latency network. AWS customers focused on running highly available applications can architect their applications to run in multiple AZs to achieve even higher fault tolerance.
A few months ago, I wrote the post "Amazon Aurora ascendant: How we designed a cloud-native relational database," and now I'm excited to share some news about the people behind the service. This week, the developers of Amazon Aurora won the 2019 Association for Computing Machinery's (ACM) Special Interest Group on Management of Data (SIGMOD) Systems Award. The award recognizes "an individual or set of individuals for the development of a software or hardware system whose technical contributions have had significant impact on the theory or practice of large-scale data management systems."
Customers often ask me how AWS maintains security at scale as we continue to grow so rapidly. They want to make sure that their data is secure in the AWS Cloud, and they want to understand how to better secure themselves as they grow.
Last year, I spent some time in Jakarta visiting HARA, an AWS customer. They've created a way to connect small farms in developing nations to banks and distributors of goods, like seeds, fertilizer, and tools. Traditionally, rural farms have been ignored by the financial world, because they don't normally have the information required to open an account or apply for credit. With HARA, this hard-to-obtain data on small farms is collected and authenticated, giving these farmers access to resources they've never had before.
A major component of the system that HARA created is blockchain. This is a technology used to build applications where multiple parties can interact through a peer-to-peer network and record immutable transactions with no central trusted authority. HARA has had to develop additional technologies to make their application work on Ethereum, a popular open source blockchain framework.
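To give a feel for what "immutable transactions" means, here is a minimal sketch of a hash-linked record chain: each record stores the hash of the one before it, so changing an earlier record breaks every later link. This illustrates the tamper-evidence idea only. It is not Ethereum, it is not HARA's implementation, and it leaves out the peer-to-peer network and consensus entirely. All function names and sample data are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Append a new block that is linked to the current tip of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    block = {
        "index": index,
        "data": data,
        "prev_hash": prev_hash,
        "hash": block_hash(index, data, prev_hash),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    expected_prev = GENESIS_HASH
    for position, block in enumerate(chain):
        if block["index"] != position or block["prev_hash"] != expected_prev:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["prev_hash"]):
            return False
        expected_prev = block["hash"]
    return True


if __name__ == "__main__":
    ledger = []
    append_block(ledger, {"farm": "A", "hectares": 2})  # made-up sample data
    append_block(ledger, {"farm": "B", "hectares": 5})
    print(is_valid(ledger))
    ledger[0]["data"]["hectares"] = 20  # tamper with an earlier record
    print(is_valid(ledger))
```

In a real blockchain, many independent parties hold copies of the chain and agree on new blocks, which is what removes the need for a central trusted authority. The hash links shown here only make tampering detectable.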