
The Paralysis of Choice in Cloud

14 November 2021


Rak Siv
Engineering

Software development in the era of cloud computing has grown in complexity. The challenges to be solved aren't limited to code or infrastructure management; they now involve disparate systems and services managed by cloud providers and third parties.

Security, performance, and scalability still require significant time and expertise to handle appropriately. Cloud providers recognize the challenges faced by teams and offer a variety of services that solve typical development problems, such as API gateways, publish/subscribe message processing, queue management, document storage, and secret management, each promising to be reliable, performant, secure, and scalable. This can lead to the assumption that productivity will spike, because your team will not have to exert additional effort to build, test, or maintain the foundational services required by most projects.

These innovations give development teams a huge head start in migrating or developing new software in the Cloud. However, that development productivity comes with a trade-off: because the services are proprietary, the team must make early design decisions that may impact their product for years. This decision making is further complicated by the vast number of service options, even from a single Cloud Provider.

In 2004, the American psychologist Barry Schwartz published a book titled 'The Paradox of Choice'. It argued that while having choices is critical to an individual's well-being, an overwhelming number of choices creates stress and reduces overall happiness. While this is a slight oversimplification, we can certainly see parallels when thinking about software development.

Consider some of the assessments needed for each software component or service, each requiring careful consideration before a choice can be made:

  • Upfront and ongoing costs
  • Vendor relationship management
  • Security implications
  • Architectural fit
  • Future organization goals
  • Support and Maintenance (both internal and external)

Even after we've analyzed all of the above and made our choice, we still face one or more of the following risks:

  • Being locked in and reliant on a specific technology
  • Limited ability to migrate software to other platforms without significant refactoring or rewriting of code
  • Requiring Cloud Provider skill-sets in the development team (security, infrastructure, and operations teams will likely always require an element of this)

Additionally, we're often forced to make these decisions early in the development lifecycle, when the system's future needs are unknown and challenging to predict.

Illustrating the number of choices

AWS, GCP, and Azure each boast well over 100 services. Even within a single Cloud, many of these services overlap or provide alternatives to achieve similar goals, each with its own subtle strengths and weaknesses.

Let's take a look at one of the most important services, 'Compute': where and how you execute the code for your application. We'll include DigitalOcean in this list because they support Compute; note, however, that they specialize in Infrastructure as a Service (IaaS) and have limited offerings for Platform as a Service (PaaS) or Software as a Service (SaaS).

Compute yields several different options and the cloud providers themselves have noticed the complexity in choice here. GCP and Azure both offer decision trees to help developers get acquainted with the options and how they might go about narrowing them down. Both also explicitly state that each option should be analyzed in much greater detail and offer specialized consulting with a Cloud Architect to make the appropriate choice. Interestingly, a handful of newer compute services are missing from these documents, meaning developers need to dig even deeper to assess the options.

What can we do to help development teams break free from the paralysis of choice?

Delaying decisions

One advantage of Nitric is that it minimizes upfront decisions by maintaining inter-service and inter-cloud portability. The ability to change services without changing application code means you can start building immediately. Instead of beginning with a detailed analysis of the services to be used and their tradeoffs, you can start building your product, service, or prototype. Nitric provides compute, events, documents, blob storage, secrets, APIs, and many other cloud services, implemented as interchangeable plugins with sensible defaults. As your system is built, tested, and used, its needs will naturally become clear. If it becomes necessary to use an alternate service for reasons such as cost, scalability, or security, this can be done by changing the selected plugin, without modifying the application code.
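To make the plugin idea concrete, here is a minimal Python sketch. It is not the Nitric SDK: `DocumentStore`, `InMemoryStore`, `JsonFileStore`, and `register_user` are hypothetical names chosen purely for illustration. The point is only that application code written against an interface keeps working when the implementation behind it is swapped.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional, Protocol


class DocumentStore(Protocol):
    """Hypothetical plugin interface; not part of any real SDK."""

    def set(self, key: str, doc: dict) -> None: ...

    def get(self, key: str) -> Optional[dict]: ...


class InMemoryStore:
    """Stand-in for a local development plugin."""

    def __init__(self) -> None:
        self._docs = {}

    def set(self, key: str, doc: dict) -> None:
        self._docs[key] = dict(doc)

    def get(self, key: str) -> Optional[dict]:
        return self._docs.get(key)


class JsonFileStore:
    """Stand-in for a different backing service (here, JSON files on disk)."""

    def __init__(self, directory: Path) -> None:
        self._dir = directory

    def set(self, key: str, doc: dict) -> None:
        (self._dir / f"{key}.json").write_text(json.dumps(doc))

    def get(self, key: str) -> Optional[dict]:
        path = self._dir / f"{key}.json"
        return json.loads(path.read_text()) if path.exists() else None


def register_user(store: DocumentStore, user_id: str, email: str) -> Optional[dict]:
    """Application code: depends only on the interface, never on a plugin."""
    store.set(user_id, {"email": email})
    return store.get(user_id)


# Swapping the plugin is a one-line change; register_user is untouched.
memory_result = register_user(InMemoryStore(), "u1", "ada@example.com")
with tempfile.TemporaryDirectory() as tmp:
    file_result = register_user(JsonFileStore(Path(tmp)), "u1", "ada@example.com")
```

Both calls return the same document, even though one store lives in memory and the other on disk. A real plugin would wrap a managed cloud service instead, but the application-facing shape would stay the same.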

We firmly believe that development teams should never be forced to lock themselves into a cloud provider or service. Software should be portable and be able to be moved to another provider or service if/when the need arises. When the upfront lock-in of a cloud provider is taken out of the equation, you can realize the promised benefits of cloud-native development, such as speed to market and reduced operating costs.

The following table shows the decisions we've made to support fundamental cloud service offerings. The services named in the 'Feature' column are provided by the Nitric APIs, which interact with managed cloud services on behalf of your applications to achieve API consistency. This enables you to write one version of code that will have functional equivalence across any of the listed cloud providers.

Going back to our 'Compute' example from before - instead of deciding between 27 or more options from 4 cloud providers, you now just use 'Compute' as a service from the Nitric framework and start writing code. You can stick with the defaults to get started and still decide on a different cloud or service later.

| Feature | AWS | Azure | Google Cloud | Local |
| --- | --- | --- | --- | --- |
| APIs | API Gateway | API Management | API Gateway | Custom |
| Collections | DynamoDB | Cosmos DB | FireStore | BoltDB |
| Messaging: Topics | SNS | Event Grid | PubSub | Custom |
| Messaging: Queues | SQS | Storage Queues | PubSub Pull Subscription | Custom |
| Schedules | CloudWatch Event Bridge | 🚧 In Progress | Cloud Scheduler | currently unavailable |
| Secrets | Secrets Manager | Key Vault | Secret Manager | Custom |
| Storage | S3 | Blob Storage | Cloud Storage | MinIO |
| Compute (Handlers): APIs, Schedules, Topics | Lambda | Container Apps | CloudRun | Docker |
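One way to read the table is as a lookup from a feature and a provider to a managed service. The sketch below encodes three of its rows in Python. The `SERVICE_MAP` structure and the `resolve_service` and `providers_supporting` helpers are illustrative assumptions, not how Nitric stores this mapping internally.

```python
from typing import List, Optional

PROVIDERS = ["AWS", "Azure", "Google Cloud", "Local"]

# Rows copied from the table; None marks an entry the table lists as
# in progress or currently unavailable.
SERVICE_MAP = {
    "Collections": {
        "AWS": "DynamoDB",
        "Azure": "Cosmos DB",
        "Google Cloud": "FireStore",
        "Local": "BoltDB",
    },
    "Storage": {
        "AWS": "S3",
        "Azure": "Blob Storage",
        "Google Cloud": "Cloud Storage",
        "Local": "MinIO",
    },
    "Schedules": {
        "AWS": "CloudWatch Event Bridge",
        "Azure": None,
        "Google Cloud": "Cloud Scheduler",
        "Local": None,
    },
}


def resolve_service(feature: str, provider: str) -> Optional[str]:
    """Return the managed service backing a feature on a provider, if any."""
    return SERVICE_MAP[feature][provider]


def providers_supporting(features: List[str]) -> List[str]:
    """List the providers that have a service for every requested feature."""
    return [
        provider
        for provider in PROVIDERS
        if all(SERVICE_MAP[feature][provider] is not None for feature in features)
    ]


storage_on_aws = resolve_service("Storage", "AWS")
portable_targets = providers_supporting(["Collections", "Storage"])
schedule_targets = providers_supporting(["Storage", "Schedules"])
```

With these rows, a project that uses only Collections and Storage can target all four columns, while one that also uses Schedules is limited to AWS and Google Cloud until the remaining entries are filled in.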

Gains in Productivity

We've previously discussed how having too many choices can be detrimental. Now let's discuss the benefits to productivity when upfront choices are minimized.

In any development scenario, there are countless ways to solve a problem. One way to achieve a healthy level of productivity is to establish a toolset of well-understood and trusted tools. By looking at the commonalities of solutions in fintech and other verticals, we've identified the cloud services that are regularly used in implementations. A comprehensive list of supported services can be found on our references page.

A development team is most powerful when it has a well-defined goal and exactly the right tools to execute it. Context switching and a development team's ability to stay focused have been hot topics for years. The constant need for debates and meetings around a technology change can erode a team's ability to focus on application architecture. Teams working within well-defined boundaries can stay focused and get on with the tasks they are most skilled in.

Nitric encourages teams to develop code in the languages they are most comfortable with. We currently support TypeScript/JavaScript, Java, Python, and Go, with extensibility to other languages, including .NET, available by feature request or community development. Support for these languages is achieved via a lightweight API.

We've also implemented a local run feature that lets you validate and test your applications by running Nitric and your application locally. This avoids expensive publish operations when you are testing, spot-checking, or showing off your work along the way.

Let's see an example

This example demonstrates publishing events to a 'user-registered' topic and configuring a 'welcome-user' function, which will subscribe to the topic.

When a developer wants to publish an event for services utilizing a publish/subscribe pattern, they simply fire an event - all they need to worry about is the name of the topic, the payload, and when they'd like to fire it! Nitric handles the choice of service, invocation permissions, configuration, and the dispatch operation to publish an event. On the flip side, Nitric also handles the permissions and configuration required for subscribers.
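The publish/subscribe flow described here can be sketched in a few lines of Python. This is not the Nitric SDK: `LocalEvents` is a hypothetical in-memory broker standing in for whichever managed topic service is selected, and the payload shape is an assumption. Only the 'user-registered' topic and the 'welcome-user' subscriber come from the example in this article.

```python
from collections import defaultdict
from typing import Callable


class LocalEvents:
    """Hypothetical in-memory broker standing in for a managed topic service."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        """Register a handler to be called for every event on the topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: dict) -> int:
        """Deliver the payload to each subscriber; return how many were notified."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


events = LocalEvents()
sent_messages = []


def welcome_user(payload: dict) -> None:
    """Subscriber: reacts to a registration without knowing who published it."""
    sent_messages.append(f"Welcome, {payload['name']}!")


events.subscribe("user-registered", welcome_user)

# The publisher only needs the topic name, the payload, and the moment to fire.
delivered = events.publish("user-registered", {"name": "Ada"})
```

Neither the publisher nor the subscriber mentions a cloud provider. In a deployed system the broker would be a managed service, but the calling code would keep this shape.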

[Image: code and configuration example]

Human-readable topic names are mapped to internal topic names and system-level identifiers in the various providers, so developers are not required to understand how each cloud provider manages topics. Most importantly and conveniently, all permissions required to fire or handle the event are generated automatically. There is no need to log into an administration console and map topic identifiers to services with specific permission levels.
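As a rough illustration of that mapping, the Python sketch below turns a human-readable topic name into a provider-style identifier and derives the grants a deployment would need. The identifier shapes follow the public AWS SNS ARN and Google Cloud Pub/Sub resource-name formats. Everything else is an assumption made for the example: the `Grant` record, the `signup-api` publisher, the account and project values, and the idea that grants are modeled this way are not taken from Nitric.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Grant:
    """Hypothetical record of one generated permission."""

    service: str   # the function or API receiving the permission
    action: str    # "publish" or "subscribe"
    resource: str  # provider-level topic identifier


def provider_topic_id(provider: str, topic: str, account: str,
                      region: str = "us-east-1") -> str:
    """Map a human-readable topic name to a provider-style identifier."""
    if provider == "aws":
        return f"arn:aws:sns:{region}:{account}:{topic}"
    if provider == "gcp":
        return f"projects/{account}/topics/{topic}"
    raise ValueError(f"unsupported provider: {provider}")


def generate_grants(resource: str, publishers: List[str],
                    subscribers: List[str]) -> List[Grant]:
    """Derive one grant per declared publisher and subscriber."""
    grants = [Grant(service, "publish", resource) for service in publishers]
    grants += [Grant(service, "subscribe", resource) for service in subscribers]
    return grants


aws_topic = provider_topic_id("aws", "user-registered", account="123456789012")
gcp_topic = provider_topic_id("gcp", "user-registered", account="my-project")
grants = generate_grants(aws_topic, ["signup-api"], ["welcome-user"])
```

The developer only ever writes `user-registered`. The provider-specific identifier and the list of grants are derived from it, which is the work that would otherwise be done by hand in an administration console.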

As the example shows, the code does not rely on any one cloud provider. When deployed, it will function in whichever provider you ultimately determine is right for you. You can also change your mind later and deploy to a different cloud provider, as long as that provider has equivalent support for the services used in your project.

Finally, let's not forget about the impact on our friends from IT security!

Standardization of Cloud SaaS offerings via Nitric can also help bring service visibility to IT departments. Shadow IT has become an issue for many organizations that struggle to keep on top of the security monitoring, compliance, and testing required by their organization or industry.

When a security team is presented with a fairly concrete list of available services, there are fewer surprises. They can proactively proceed with verification and validation procedures to ensure that the technology being used is safe and appropriate. At the end of the day, we all want to keep our IT security teams happy!