Trusting The Cloud


Published: August 31st, 2016

After returning from the first-ever manned space shuttle flight, John Young was asked if he was worried about making the mission. He responded, "Anyone who sits on top of the largest hydrogen-oxygen fueled system in the world; knowing they're going to light the bottom--and doesn't get a little worried--does not fully understand the situation."

In the years that followed, the world would learn how risky space travel really was.

People take all kinds of risks. Most of us drive to work each day, risking everything and putting our trust in complete strangers to play a controlled game of chicken, passing just a few feet away from each other at 90 miles an hour.

In business we also take risks. We want to reach higher, so we invest our time and our money in the hopes of getting a big return. Sure, we sometimes call it the stock market, or an employee, or a service, or even a client. But in the end we're putting our trust in people.

At Regtix, we put our trust in Amazon (Amazon's AWS cloud service). They have many powerful products and services. But when it all comes down to it, we're not just putting our trust in Amazon's products; we're trusting their people.

Last week, Amazon suffered a catastrophic outage. It was nothing short of historic. For more than 3 days, thousands of customers were completely shut down with no access to their data. At the time of writing this, I'm still not sure exactly what caused the outage.

While 100% of our infrastructure is provided by Amazon, Regtix was not affected at all by this outage. The primary reason for this is that, unlike the sites that went down during the outage, Regtix is set up to withstand failures across multiple data centers run by Amazon. And we're not the only ones. Countless other services (including some very popular ones like Netflix) did not go down because they were set up to handle multiple failures.

I could go into a great deal of technical detail about how Amazon dropped the ball and how this outage is unacceptable. But there are plenty of people doing that. And I don't disagree with the masses of angry customers and critics. Amazon needs to answer to them, and their response will be important in telling us more about the people behind Amazon.

Instead, I'd like to explain that, at Regtix, we're still proud to work with Amazon and we're planning to stick with them, even after a crisis like this.

There are dozens (maybe hundreds now) of cloud providers. New ones are popping up every day. But most experts agree that when it comes to innovation, Amazon is light-years ahead of the pack.

When I first started working with Amazon, they had already built up an impressive and reliable feature set. I was always happy with what I could do on their platform. But after a while, they started picking up momentum when it came to developing and deploying new capabilities that made true cloud infrastructure better. Soon they were deploying features and functionality faster than I could learn how to use them.

We're not talking about new "buttons" in an interface here. We're talking about major new systems that answered questions and problems I thought were impossible to resolve. Almost on a monthly basis for the last few years, Amazon has been pushing the limits on what you can do with their systems.

In addition to an ever-expanding set of highly valuable features, Amazon also provides the ultimate solutions for integration. More than any other service provider I've ever heard of, Amazon provides complete control of their cloud services by allowing customers (like Regtix) to connect and make changes through programming.

In fact, when Amazon launches most new features, they don't provide any way to access these features other than through their programming API. This power makes it possible for us to solve any problem with deploying, monitoring, and scaling our infrastructure and content.

Amazon, unlike every other cloud provider out there, extensively uses their own platform on a massive scale. And even if other providers wanted to do this, they really couldn't. Amazon.com provides the perfect test environment to experiment with and learn 1) what features are needed, 2) how to build them, and 3) whether or not they are actually working.

Another one of my favorite things about Amazon is something that most people don't like about them: they don't provide technical support for free. I love this because it makes me realize that Amazon is building an infrastructure as a service--NOT "support" as a service. They build and provide access to their products in a way that gives their customers complete control. You can't call in and ask them to do something that you don't have access to because they've already given you access to everything.

I realize that this kind of service is not for everyone. And Amazon does provide Premium Support options for a reasonable cost. But for me, I love knowing that if we get into it enough, we can understand and manage everything in our infrastructure the same way Amazon engineers could.

Our dedication to Amazon all comes down to their ability, their drive, and their track record for innovation. In fact, I believe Amazon is doing some of the most important work in the world. Second only to energy technology, I believe that the kinds of ideas Amazon (and other technology companies like them) develop and deploy are driving the future of communication and commerce.

Are we going to make changes to what we are doing in reaction to this outage? Yes. We learned a lot by watching what happened with this failure and we're going to react.

Are we going to leave Amazon? No.

Are we going to spread our service across multiple cloud providers? Probably not. Sure, we could try to increase redundancy by doing that, but it's kind of like asking whether or not we should put parachutes in commercial airliners (which of course, we don't). The truth is, balancing across multiple cloud providers is difficult and expensive. And it's questionable whether that kind of setup would even save you in a crisis. Sure, it could save some applications in some specific situations, but it can't save everyone.

Am I worried? Of course I am. Anyone who "doesn't get a little worried--does not fully understand the situation." I'm worried that Amazon might have another catastrophic failure. I'm worried that they might go out of business. I have lots of fears about what could happen with Amazon.

Do I trust them? Yes, I trust them at least enough to take the risk. What am I risking? Personally, I'm risking a lot for myself and the people with whom I work (and our families). But I'm trusting that Amazon (i.e., Amazon's people) will continue to take risks, continue to be innovative, and continue to provide the best solutions out there.

I wasn't born when the first manned space mission took flight. But I do remember watching the Challenger explosion in a classroom as a young boy. It was devastating. It made us question, "Why are we doing this?"

The Amazon outage may seem insignificant compared to the Challenger accident, and in many ways it can't be compared. But I think there are some important similarities. My biggest fear about Amazon is that people might undermine their innovation by attacking the people who are pushing the limits of a technology that is so important for everyone's future.

For the Challenger, an answer to "Why are we doing this?" came from Ronald Reagan:

"Our nation is indeed fortunate that we can still draw on an immense reservoir of courage, character, and fortitude, that we are still blessed with heroes like those of the Space Shuttle Challenger. Man will continue his conquest of space to reach out for new goals and ever-greater achievements."

It takes a certain kind of courage to launch yourself into space, trusting your life to technology, mechanics, ideas, and predictions. But it also takes courage to continue putting your time and energy into building something great that has inherent risks, especially if one of those risks is that the world will come after you if you make a mistake.

I hope that, in addition to reacting to the reasons behind the recent failure, Amazon will continue their pattern of innovation without missing a step. The truth is that Regtix couldn't be what it is without Amazon--and other companies like them--that are pushing technology and innovation to the outer limits.

