Scheduling and Autoscaling

Scheduling and autoscaling go hand in hand, as where in the cluster GameServers are provisioned impacts how to autoscale fleets up and down (or if you would even want to)

Cluster Autoscaler

Kubernetes has a cluster node autoscaler that works with a wide variety of cloud providers.

The default scheduling strategy (Packed) is designed to work with the Kubernetes autoscaler out of the box.

The autoscaler will automatically add Nodes to the cluster when GameServers don’t have room to be scheduled on the clusters, and then scale down when there are empty Nodes with no GameServers running on them.

This means that scaling Fleets up and down can be used to control the size of the cluster, as the cluster autoscaler will adjust the size of the cluster to match the resource needs of one or more Fleets running on it.

To enable and configure autoscaling on your cloud provider, check their connector implementation, or their cloud specific documentation.

Google Kubernetes Engine

Amazon Elastic Kubernetes Service

Azure Kubernetes Service

Fleet Autoscaling

Fleet autoscaling is the only type of autoscaling that exists in Code Blind. It is currently available as a buffer autoscaling strategy or as a webhook driven strategy, such that you can provide your own autoscaling logic.

Have a look at the Create a Fleet Autoscaler quickstart, the Create a Webhook Fleet Autoscaler quickstart, and the Fleet Autoscaler Specification for details.

Autoscaling Concepts

To facilitate autoscaling, we need to combine several concepts and functionality, as described below.

Allocation Scheduling

Allocation scheduling refers to the order in which GameServers, and specifically their backing Pods are chosen from across the Kubernetes cluster within a given Fleet when allocation occurs.

Pod Scheduling

Each GameServer is backed by a Kubernetes Pod. Pod scheduling refers to the strategy that is in place that determines which node in the Kubernetes cluster the Pod is assigned to, when it is created.

Fleet Scale Down Strategy

Fleet Scale Down strategy refers to the order in which the GameServers that belong to a Fleet are deleted, when Fleets are shrunk in size.

Fleet Scheduling

There are two scheduling strategies for Fleets - each designed for different types of Kubernetes Environments.

Packed

apiVersion: "agones.dev/v1"
kind: Fleet
metadata:
  name: simple-game-server
spec:
  replicas: 100
  scheduling: Packed
  template:
    spec:
      ports:
      - containerPort: 7654
      template:
        spec:
          containers:
          - name: simple-game-server
            image: us-docker.pkg.dev/codeblind/examples/simple-server:0.27

This is the default Fleet scheduling strategy. It is designed for dynamic Kubernetes environments, wherein you wish to scale up and down as load increases or decreases, such as in a Cloud environment where you are paying for the infrastructure you use.

It attempts to pack as much as possible into the smallest set of nodes, to make scaling infrastructure down as easy as possible.

This affects the Cluster autoscaler, Allocation Scheduling, Pod Scheduling and Fleet Scale Down Scheduling.

Cluster Autoscaler

When using the “Packed” strategy, Code Blind will ensure that the Cluster Autoscaler doesn’t attempt to evict and move GameServer Pods onto new Nodes during gameplay.

If a gameserver can tolerate being evicted (generally in combination with setting an appropriate graceful termination period on the gameserver pod) and you want the Cluster Autoscaler to compact your cluster by evicting game servers when it would allow the Cluster Autoscaler to reduce the number of nodes in the cluster, Controlling Disruption describes how to choose the .eviction setting appropriate for your GameServer or Fleet.

Allocation Scheduling Strategy

Under the “Packed” strategy, allocation will prioritise allocating GameServers to nodes that are running on Nodes that already have allocated GameServers running on them.

Pod Scheduling Strategy

Under the “Packed” strategy, Pods will be scheduled using the PodAffinity with a preferredDuringSchedulingIgnoredDuringExecution affinity with hostname topology. This attempts to group together GameServer Pods within as few nodes in the cluster as it can.

Fleet Scale Down Strategy

With the “Packed” strategy, Fleets will remove Ready GameServers from Nodes with the least number of Ready and Allocated GameServers on them. Attempting to empty Nodes so that they can be safely removed.

Distributed

apiVersion: "agones.dev/v1"
kind: Fleet
metadata:
  name: simple-game-server
spec:
  replicas: 100
  scheduling: Distributed
  template:
    spec:
      ports:
      - containerPort: 7654
      template:
        spec:
          containers:
          - name: simple-game-server
            image: us-docker.pkg.dev/codeblind/examples/simple-server:0.27

This Fleet scheduling strategy is designed for static Kubernetes environments, such as when you are running Kubernetes on bare metal, and the cluster size rarely changes, if at all.

This attempts to distribute the load across the entire cluster as much as possible, to take advantage of the static size of the cluster.

This affects Allocation Scheduling, Pod Scheduling and Fleet Scale Down Scheduling.

Cluster Autoscaler

Since this strategy is not aimed at clusters that autoscale, this strategy does nothing for the cluster autoscaler.

Allocation Scheduling Strategy

Under the “Distributed” strategy, allocation will prioritise allocating GameServers to nodes that have the least number of allocated GameServers on them.

Pod Scheduling Strategy

Under the “Distributed” strategy, Pod scheduling is provided by the default Kubernetes scheduler, which will attempt to distribute the GameServer Pods across as many nodes as possible.

Fleet Scale Down Strategy

With the “Distributed” strategy, Fleets will remove Ready GameServers from Nodes with at random, to ensure a distributed load is maintained.


Last modified February 28, 2024: initial publish (7818be8)