Automatic scaling of Azure SQL Databases


Build It Better, February 21, 2019

Introduction

Enable hosts most of its DealTrack clients in Microsoft Azure, taking advantage of the wide range of resources it offers, including Azure SQL Databases. In general, each DealTrack client has their own SQL database within Azure containing their own data, and these databases vary wildly in usage patterns and performance needs. Unfortunately, this can make it difficult to decide on the appropriate performance tiers, often resulting in resources being over-provisioned for peak usage, and therefore overpaying. Azure attempts to resolve this issue by allowing these databases to be placed in an elastic pool, where all databases in the pool share resources, allowing performance to stay high while keeping costs low.

Enable has taken this one step further by combining these elastic pools with the fact that the performance tier of an elastic pool can be scaled dynamically without any downtime for the system. We have implemented a solution capable of providing high performance as and when the system requires it, while scaling down the performance tier during periods of low activity.

This results in Enable being able to offer its clients a solution with a solid user experience and strong computational performance, while truly minimizing the associated running costs.

What is database scaling?

In the case of DealTrack, it is not the physical size of the database that fluctuates much, but rather the resources required when operating. Resources in Azure SQL Database are defined using the Database Transaction Unit (DTU), which represents a blended measure of CPU, memory, reads, and writes. Similarly, shared resources in an elastic pool are measured in elastic Database Transaction Units, or eDTUs. Azure allows its users to assign different numbers of eDTUs to elastic pools, where the user pays for the number assigned, and it is this number of eDTUs that we are dynamically scaling here.

The above is known as the DTU-based purchasing model. Azure also offers a vCore-based purchasing model, which simply offers a different way of assigning resources. Although this post uses DTUs as an example, everything in this article applies to vCores in the same way.

Enable would like to be able to automatically scale its Azure databases based on a set of rules, for example, to scale the database up if eDTU usage exceeds 70% for 10 minutes straight, as this may be a sign that more resources are required. Another example would be to scale the database on a schedule, providing more resources during working hours and fewer at night when the system is not being used.

Unfortunately, although Azure does provide the ability to manually scale databases through the Azure portal, there is no way to configure the elastic pool to scale automatically in such a way. While alerts can be set up to notify if certain conditions are met, this still requires someone to log in to the portal and manually scale the database, remembering to scale the database back down at a later date.

Scaling up databases manually would not only take up a lot of staff time, but would also require a member of staff to be available 24/7 to scale databases, as well as introducing the possibility of human error. Enable has therefore developed an automated solution to this problem, capable of scaling entirely without human intervention.

Automated solution

The automated solution has been implemented as a console application, hosted in Azure, where it runs as a WebJob. Using the Quartz job scheduling library, a job runs every 5 minutes to determine whether scaling is required and, if it is, what eDTU value to scale up or down to. The solution uses the Azure Management Libraries for .NET Core to obtain current data on the state of the Azure elastic pools, and to tell Azure to scale the elastic pools.
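As a sketch of how that scheduling might be wired up with the Quartz.NET library (the `AutoScaleJob` name and everything inside it are illustrative, not the actual DealTrack implementation):

```csharp
using System.Threading.Tasks;
using Quartz;
using Quartz.Impl;

// Hypothetical job that performs the rule evaluation described below.
public class AutoScaleJob : IJob
{
    public Task Execute(IJobExecutionContext context)
    {
        // Evaluate the auto-scale rules and, if needed, ask Azure to scale.
        return Task.CompletedTask;
    }
}

public static class Program
{
    public static async Task Main()
    {
        var scheduler = await StdSchedulerFactory.GetDefaultScheduler();
        await scheduler.Start();

        var job = JobBuilder.Create<AutoScaleJob>().Build();

        // Fire immediately, then every 5 minutes thereafter.
        var trigger = TriggerBuilder.Create()
            .StartNow()
            .WithSimpleSchedule(s => s.WithIntervalInMinutes(5).RepeatForever())
            .Build();

        await scheduler.ScheduleJob(job, trigger);
    }
}
```

This requires the Quartz NuGet package; in a WebJob the host keeps the process alive between firings.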

To determine whether or not scaling is required, the solution evaluates an ordered collection of IAutoScaleRules. Evaluating an individual rule will always return one of two types of IAutoScaleResult:

  • If NoAutoScaleResult is returned, it means the rule did not apply or was not satisfied, and indicates that the next rule in the collection should be evaluated (it does not necessarily mean we shouldn’t scale).
  • If AutoScaleResult is returned, this conclusively tells the system whether or not it should scale. If such a result is returned, no subsequent rules will be evaluated, and the system will carry out the action as suggested by the returned result. This could be either to scale the database up, scale it down, or not scale it at all.
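Under stated assumptions about the shape of those types (the post names them but not their exact signatures), the first-match evaluation described above might be sketched as follows; the `DelegateRule` helper exists only so the sketch is self-contained:

```csharp
using System;
using System.Collections.Generic;

// Result types as described above: NoAutoScaleResult means "ask the next
// rule"; AutoScaleResult is a conclusive decision.
public interface IAutoScaleResult { }

public sealed class NoAutoScaleResult : IAutoScaleResult { }

public sealed class AutoScaleResult : IAutoScaleResult
{
    public AutoScaleResult(bool shouldScale, int capacity)
    {
        ShouldScale = shouldScale;
        Capacity = capacity;
    }

    public bool ShouldScale { get; }
    public int Capacity { get; }
}

public interface IAutoScaleRule
{
    IAutoScaleResult Evaluate(DateTime utcNow);
}

// Illustrative helper so rules can be defined inline; not part of the
// solution described in the post.
public sealed class DelegateRule : IAutoScaleRule
{
    private readonly Func<DateTime, IAutoScaleResult> _evaluate;
    public DelegateRule(Func<DateTime, IAutoScaleResult> evaluate) => _evaluate = evaluate;
    public IAutoScaleResult Evaluate(DateTime utcNow) => _evaluate(utcNow);
}

public static class AutoScaleEvaluator
{
    // Walk the ordered rules; the first conclusive AutoScaleResult wins and
    // no subsequent rules are evaluated.
    public static AutoScaleResult Evaluate(
        IEnumerable<IAutoScaleRule> rules, DateTime utcNow)
    {
        foreach (var rule in rules)
        {
            if (rule.Evaluate(utcNow) is AutoScaleResult conclusive)
            {
                return conclusive;
            }
        }

        // No rule was conclusive: take no action.
        return new AutoScaleResult(shouldScale: false, capacity: 0);
    }
}
```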

The first rule to be evaluated in our solution is the ScalingInProgressAutoScaleRule, which simply determines whether the elastic pool in question is currently in the process of scaling. If it is, then an AutoScaleResult is returned telling the system not to take any action. If NoAutoScaleResult is returned, the system will move on to the next rule, which is the CoolDownPeriodAutoScaleRule. This rule demands that no scaling has taken place within the previous x minutes (this is configurable), and will tell the system not to scale if such recent activity has been found.
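The cool-down check itself is simple. A minimal sketch, assuming the rule is told when the pool last scaled (the member names here are assumptions, not the actual implementation):

```csharp
using System;

// Sketch of the cool-down check: scaling is suppressed while the pool is
// still inside the configured cool-down window.
public sealed class CoolDownPeriodAutoScaleRule
{
    private readonly TimeSpan _coolDownPeriod;

    public CoolDownPeriodAutoScaleRule(TimeSpan coolDownPeriod)
    {
        _coolDownPeriod = coolDownPeriod;
    }

    // True when scaling happened within the configured window and should
    // therefore be suppressed.
    public bool IsInCoolDown(DateTime lastScaledUtc, DateTime utcNow)
    {
        return utcNow - lastScaledUtc < _coolDownPeriod;
    }
}
```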

Custom rules

Assuming neither of the above two rules has been satisfied, the ordered “custom” rules are evaluated next. The custom rules are configured using JSON, and can be based on almost any trigger desired. To illustrate how these rules work, an example JSON configuration of three rules is given below.

"Rules": [
    {
        "Name": "Auto scale rule #1",
        "Schedule": {
            "StartTime": "07:00:00",
            "EndTime": "19:00:00",
            "DaysOfWeek": [ "Monday", "Tuesday", "Wednesday", "Thursday", "Friday" ]
        },
        "Trigger": {
            "Metric": "dtu_consumption_percent",
            "Operator": "GreaterThanOrEqual",
            "Aggregation": "Average",
            "Threshold": 70,
            "TimeWindow": "00:10:00"
        },
        "Action": {
            "Capacity": 200
        }
    },
    {
        "Name": "Auto scale rule #2",
        "Schedule": {
            "StartTime": "07:00:00",
            "EndTime": "19:00:00",
            "DaysOfWeek": [ "Monday", "Tuesday", "Wednesday", "Thursday", "Friday" ]
        },
        "Trigger": {
        },
        "Action": {
            "Capacity": 100
        }
    },
    {
        "Name": "Default rule",
        "Schedule": {
            "StartTime": "00:00:00",
            "EndTime": "23:59:59",
            "DaysOfWeek": [ "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday" ]
        },
        "Trigger": {
        },
        "Action": {
            "Capacity": 50
        }
    }
]

It may seem confusing at first glance, but it’s not so bad once you understand what the rules are saying. Every rule comes with two conditions, a Schedule and a Trigger. When a rule is evaluated, if both the Schedule and the Trigger are satisfied, the Action will be enforced. An empty Trigger, as in the second and third rules above, is always considered satisfied.

Let’s break down the above rules:

  • Auto scale rule #1 — The Schedule is set from 7am to 7pm on weekdays only, meaning that the rule has to be evaluated at those times for it to be satisfied. The Trigger indicates that the average DTU consumption over the 10-minute TimeWindow needs to be greater than or equal to 70% for it to be satisfied. If both of these conditions are met, the Action returns a desired eDTU capacity of 200, meaning the elastic pool will be scaled to a value of 200 eDTUs.
  • Auto scale rule #2 — This rule has no Trigger configured, but it does have the same 7am to 7pm on weekdays only Schedule. In this case, if the current time falls anywhere inside that Schedule, the Action returns a desired eDTU capacity of 100.
  • Default rule — This is the final rule of any configuration. The Schedule is configured for all day and every day, without any Trigger, meaning if the first two rules are not satisfied (e.g. it’s currently a Sunday), the eDTU capacity should be set to 50.

At their core, these rules are based on, and evaluated using, specifications built up using the specification pattern. An example of this can be found in the code responsible for evaluating whether or not the current time falls within a rule's Schedule.

// Build a specification from the rule's Schedule: the allowed days of the
// week, combined with the allowed time-of-day interval.
var daysOfWeek = _schedule.DaysOfWeek.ToArray();
var interval = new Interval(_schedule.StartTime, _schedule.EndTime);

_specification = new DayOfWeekSpecification(daysOfWeek)
    .And(new TimeOfDaySpecification(interval));

// If the current time satisfies both specifications, the rule is satisfied
// and a conclusive result is returned.
if (_specification.IsSatisfiedBy(context.UtcNow))
{
    return new AutoScaleResult(
        shouldScale: true,
        capacity: _capacity);
}

In the above, two specifications (DayOfWeekSpecification and TimeOfDaySpecification) are chained together using the And specification. If the current time falls within the scheduled days and times, the rule is satisfied, and an AutoScaleResult is returned.
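For readers unfamiliar with the pattern, here is a minimal self-contained sketch of such specifications. It is a simplification of the excerpt above: the time-of-day specification takes plain start and end TimeSpans rather than the Interval type used there, and every name beyond those appearing in the excerpt is an assumption.

```csharp
using System;

// A specification answers a single yes/no question about a candidate value.
public interface ISpecification<T>
{
    bool IsSatisfiedBy(T candidate);
}

// Combines two specifications: satisfied only when both are.
public sealed class AndSpecification<T> : ISpecification<T>
{
    private readonly ISpecification<T> _left;
    private readonly ISpecification<T> _right;

    public AndSpecification(ISpecification<T> left, ISpecification<T> right)
    {
        _left = left;
        _right = right;
    }

    public bool IsSatisfiedBy(T candidate) =>
        _left.IsSatisfiedBy(candidate) && _right.IsSatisfiedBy(candidate);
}

public static class SpecificationExtensions
{
    // Fluent chaining, as in the .And(...) call in the excerpt above.
    public static ISpecification<T> And<T>(
        this ISpecification<T> left, ISpecification<T> right) =>
        new AndSpecification<T>(left, right);
}

// Satisfied when the candidate falls on one of the allowed days.
public sealed class DayOfWeekSpecification : ISpecification<DateTime>
{
    private readonly DayOfWeek[] _daysOfWeek;

    public DayOfWeekSpecification(params DayOfWeek[] daysOfWeek) =>
        _daysOfWeek = daysOfWeek;

    public bool IsSatisfiedBy(DateTime candidate) =>
        Array.IndexOf(_daysOfWeek, candidate.DayOfWeek) >= 0;
}

// Satisfied when the candidate's time of day lies within [start, end].
public sealed class TimeOfDaySpecification : ISpecification<DateTime>
{
    private readonly TimeSpan _start;
    private readonly TimeSpan _end;

    public TimeOfDaySpecification(TimeSpan start, TimeSpan end)
    {
        _start = start;
        _end = end;
    }

    public bool IsSatisfiedBy(DateTime candidate) =>
        candidate.TimeOfDay >= _start && candidate.TimeOfDay <= _end;
}
```

Keeping each condition as its own small object is what makes the rules composable: a schedule is just a day-of-week specification and a time-of-day specification joined with And.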

Conclusion

Enable has built a flexible and extensible solution capable of automatically scaling Azure elastic pools based on highly configurable rules and triggers. The solution is capable of scaling databases on a simple schedule if that is all that is desired, as well as being able to tackle more complex scenarios as required.

Using this custom solution, Enable is able to offer users a smooth user experience with performance wherever and whenever needed, while keeping hosting costs low.
