The likelihood of dealing with enterprise IT gremlins¹ is heightened at certain times of the year for any DevOps team. My brother, who works in IT Disaster Recovery for a healthcare agency, reminded me of this at our most recent Thanksgiving gathering. He had to address four hours of downtime right before the holiday, after a DevOps-related change was pushed to the production system instead of a test environment. Sound familiar?

Whether it’s a holiday, the close of the quarter, or “go live” day, any number of factors can put extra stress on IT staff and give network gremlins more of a chance to plague an enterprise. Although not as mischievous as mythical gremlins, sloppiness causes its own trouble and unexpected failures, threatening security as well as contributing to downtime and poor performance.

Self-Service Resources and IT Automation

Keeping gremlins at bay requires a solid plan for self-service options and IT automation. End users need access to hardened resources and processes when the people who hold the keys to those resources are on PTO or swamped with other high-priority projects.

Leaving users waiting in the dust for resources or an update can push them toward workarounds or shortcuts, and you don’t want anyone in your organization going rogue during stressful times. The more self-service IT that enterprise IT and DevOps teams enable, the less likely people are to fend for themselves.

Making any DevOps practice or IT process bulletproof against occasional mishaps is nearly impossible, but reducing the likelihood is worth the effort, using approaches such as the following:

A centrally managed platform like CloudBolt can get any IT organization on the right path to avoiding the “gremlin” effect, especially as we approach another holiday season, when schedules and priorities will undoubtedly be different for many enterprises.

1—Gremlins are unexplained problems or faults

Over time, computing has gone from mainframes to bare-metal servers to on-premises virtualization to cloud server instances and containerization to serverless computing. What’s next, codeless computing? Probably not, but luckily serverless computing isn’t anything quite that bizarre. The server element for executing code is essentially abstracted away from developers, and the model is new enough that we’re still in the Wild West.

Serverless Computing Explained

Serverless computing is a fancy way of saying that you don’t have to worry about servers when you want to execute code; the model is often referred to as Function-as-a-Service (FaaS). Major cloud providers already have compute capacity ready for anyone to reserve, whether for running virtual machines (VMs) or containerized microservices.

For public cloud providers, why not take it one step further and run isolated code on demand as a way to make more money? This is great for developers who need to continuously add services and features to their application stack but don’t want to fuss with managing the infrastructure.

Major cloud providers offer these serverless computing options with an emphasis on the payment model:

As great as these services are, though, we still have to contend with The Good, The Bad, and The Ugly.

The Good

The good is the on-demand nature of this computing strategy at low cost. Suppose an application developer wants to give an aging application architecture a quick lift with a small feature that checks an Internet of Things (IoT) sensor in a smart home, such as an air-quality sensor, and automatically suggests or orders a new air filter. Instead of adding the compute infrastructure needed to serve many thousands of application subscribers, they can develop an on-demand function that only needs to run occasionally.
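As a rough sketch of that scenario, an occasional air-quality check could be a single small handler. The function name, event fields, and threshold below are illustrative assumptions, not any specific provider’s API:

```python
# Illustrative FaaS-style handler for an occasional IoT air-quality check.
# The event shape, threshold, and suggestion logic are hypothetical examples,
# not a real cloud provider's interface.

AQI_REPLACE_THRESHOLD = 150  # assumed cutoff for suggesting a new filter


def handler(event, context=None):
    """Runs on demand (e.g., a few times a day) instead of on dedicated servers."""
    aqi = event.get("air_quality_index")
    if aqi is None:
        return {"status": "error", "message": "missing air_quality_index"}
    if aqi >= AQI_REPLACE_THRESHOLD:
        return {
            "status": "ok",
            "suggest_filter_order": True,
            "reason": f"AQI {aqi} exceeds threshold {AQI_REPLACE_THRESHOLD}",
        }
    return {"status": "ok", "suggest_filter_order": False}
```

Because the function is tiny and stateless, the provider can spin it up only when an event arrives and bill for just those invocations, which is exactly the appeal described above.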

The Bad

The bad is that these functions can become complicated and hard to manage, especially if they must run for more than five minutes at a time within an application process. They must also be accessed through a private API gateway, and dependencies from common libraries must be packaged into them, which can be terribly inefficient compared to containerization. The more complicated the coding required, the less likely a serverless function is to suit the application architecture well. For more information, see What is Serverless Architecture? What are its Pros and Cons?

The Ugly

The ugly is that there is currently no standardization of serverless computing across the different public providers. Vendor lock-in becomes a risk as these enticing functions as code, with their low prices, become addicting to some developers and enterprises. They cannot be ported around as easily as containers can.
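To make the lock-in concrete: even the entry-point signature differs per platform, so the same logic has to be re-wrapped for each provider. A rough Python sketch follows; the handler shapes are approximations of the general patterns used by AWS Lambda, Google Cloud Functions, and Azure Functions, and exact signatures, request types, and return values are provider- and runtime-specific:

```python
# Hypothetical example: one piece of provider-neutral logic wrapped three ways.
# Signatures are approximations; real runtimes impose their own types,
# return-value conventions, packaging, and IAM on top of this.


def check_air_quality(aqi):
    """Provider-neutral logic: suggest a new filter above an assumed threshold."""
    return {"suggest_filter_order": aqi >= 150}


def aws_handler(event, context):
    # AWS Lambda pattern: a plain dict event plus a context object.
    return check_air_quality(event["air_quality_index"])


def gcf_handler(request):
    # Google Cloud Functions (HTTP) pattern: a request object with get_json().
    return check_air_quality(request.get_json()["air_quality_index"])


def azure_main(req):
    # Azure Functions pattern: an HttpRequest-like object, also with get_json().
    return check_air_quality(req.get_json()["air_quality_index"])
```

The business logic is identical, yet each wrapper, its deployment packaging, and its surrounding permissions model must be rebuilt per platform, which is the migration friction Rick Kilcoyne describes below.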

As Rick Kilcoyne, VP of Solutions Architecture at CloudBolt, stated in a recent article:

“…tantalizing as serverless computing is, one must be fully aware that moving code between serverless platforms is extremely difficult and only made more so by cloud vendor specific libraries, paradigms, and IAM. Serverless computing is the technological equivalent of a snare trap as there’s virtually no way to easily migrate from one platform to another once committed.”

Roundup

Serverless computing should definitely be part of any enterprise hybrid cloud strategy. Just as a hybrid cloud application mixes public and private clouds, it can also mix infrastructure technologies such as virtualization, containerization, and serverless functions. Our CloudBolt hybrid cloud management platform helps you manage it all from one place.

To see how CloudBolt makes serverless computing easier, check out a demo.

At CloudBolt, we believe that software solutions should be easy to maintain, manage, and understand. We also believe they should be self-regulating and self-healing when possible. You will see a focus on this starting in 8.4—Tallman and continuing through our 9.x releases, which will give you better visibility into CloudBolt’s internal status, provide management capabilities directly from the web UI, and reduce the number of times you need to SSH into the CB VM to check things or perform actions.

CloudBolt 8.4—Tallman introduces a new Admin page called “System Status,” which provides several tools for checking on the health of CloudBolt itself.

The System Status Page in 8.4—Tallman

To see the System Status page in your newly installed or upgraded CloudBolt 8.4—Tallman, navigate to Admin > Support Tools > System Status. You will see a page that looks a bit like this:

There are three main parts of this page.

1. CloudBolt Mode

This section provides a way to put CloudBolt into admin-only maintenance mode, which prevents any user who is not a Super Admin or CloudBolt admin from logging in or navigating this CloudBolt instance. This is useful when you need to perform maintenance on CloudBolt (e.g., upgrading it or making changes to the database) and want to prevent users from accessing it while it is in an intermediate state, but you yourself need to perform some preparation and verification within the CB UI before and after the maintenance.

2. Job Engine

This section shows the status of each job engine worker, each running on a different CloudBolt VM now that active-active job engines are supported. It also shows a chart of all jobs run in the last hour and day, per job engine. When things are healthy and the job engines are not near their maximum concurrency limit, jobs should be split fairly evenly across the workers.

3. Health Checks

This section has several kinds of checks:

Ensuring the health of the systems that underlie CloudBolt can help you quickly home in on the root cause of an issue, and we hope that the System Status page will reduce the time it takes to troubleshoot and resolve issues with CloudBolt.

What’s Next for the System Status Page

We have some ideas for what we might add next:

If any of these seem like they would be especially useful to you, we’d love to hear it to help us prioritize. We’d also love to hear any additional ideas you have for this new page!