Package for automatically setting GOMAXPROCS based on ECS task and container CPU limits.
Go is not CFS-aware (golang/go#33803), and Uber's automaxprocs is unable to set GOMAXPROCS for ECS (uber-go/automaxprocs#66). This led to gomaxecs.
```sh
go get -u github.com/rdforte/gomaxecs
```
```go
package main

import _ "github.com/rdforte/gomaxecs"

func main() {
	// Your application logic here.
}
```
GOMAXPROCS is both an environment variable and a function in the runtime package. It limits the number of operating-system threads that can execute user-level Go code simultaneously. If GOMAXPROCS is not set, it defaults to runtime.NumCPU, the number of logical CPU cores available to the current process. For example, if I decide to run my Go application on my shiny new 8-core Mac Pro, then GOMAXPROCS will default to 8. We can override this default with the runtime.GOMAXPROCS function to configure the number of system threads our Go application can execute.
CFS (the Completely Fair Scheduler) was introduced to the Linux kernel in version 2.6.23 and is the default process scheduler in Linux. Its main purpose is to ensure that each process gets a fair share of the CPU proportional to its priority. In Docker, every container has access to all of the host's resources, within the limits of the kernel scheduler, but Docker also provides the means to limit these resources by modifying the container's cgroup on the host machine.
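On cgroup v1 hosts, these CFS limits are exposed as cpu.cfs_quota_us and cpu.cfs_period_us under the cgroup filesystem. As a rough sketch (paths and availability vary by host and cgroup version, so this is illustrative only), the effective CPU count is simply quota divided by period:

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// cpusFromCFS converts a CFS quota/period pair (both in microseconds)
// into an effective CPU count. A quota of -1 means "unlimited".
func cpusFromCFS(quotaUs, periodUs int64) float64 {
	if quotaUs <= 0 || periodUs <= 0 {
		return -1 // no limit set
	}
	return float64(quotaUs) / float64(periodUs)
}

// readInt reads a single integer from a cgroup file,
// e.g. /sys/fs/cgroup/cpu/cpu.cfs_quota_us.
func readInt(path string) (int64, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return strconv.ParseInt(strings.TrimSpace(string(b)), 10, 64)
}

func main() {
	quota, err1 := readInt("/sys/fs/cgroup/cpu/cpu.cfs_quota_us")
	period, err2 := readInt("/sys/fs/cgroup/cpu/cpu.cfs_period_us")
	if err1 != nil || err2 != nil {
		fmt.Println("no cgroup v1 CFS files found on this host")
		return
	}
	fmt.Println("effective CPUs:", cpusFromCFS(quota, period))
}
```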
Let's imagine a scenario where we configure our ECS task to use 8 CPUs and our container to use 4 vCPUs.
```json
{
  "containerDefinitions": [
    {
      "cpu": 4096 // Limit container to 4 vCPUs
    }
  ],
  "cpu": "8192", // Task uses 8 CPUs
  "memory": "16384",
  "runtimePlatform": {
    "cpuArchitecture": "X86_64",
    "operatingSystemFamily": "LINUX"
  }
}
```
The ECS task CPU period is locked at 100ms.
The CPU period is the window of time, in microseconds, over which the kernel calculates the amount of CPU time allotted to each task. With the configuration above, this is 4 vCPUs multiplied by 100ms, giving the task 400ms of CPU time per period (4 × 100ms).
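The quota arithmetic can be written out directly (numbers taken from the example configuration above):

```go
package main

import "fmt"

// quotaPerPeriod returns the CPU time (in ms) a task may consume per
// CFS period, given its vCPU limit and the period length in ms.
func quotaPerPeriod(vCPUs, periodMs int) int {
	return vCPUs * periodMs
}

func main() {
	// ECS locks the period at 100ms; the container limit is 4 vCPUs.
	fmt.Printf("quota per period: %dms\n", quotaPerPeriod(4, 100))
}
```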
If all is well with our Go application, goroutines will be scheduled on 4 threads across 4 cores.
Threads scheduled on cores 1, 3, 6, 8
For each 100ms period, our Go application consumes the full 400ms of its 400ms quota, i.e. 100% of the CPU quota.
However, Go is NOT CFS-aware (golang/go#33803), so GOMAXPROCS will default to using all 8 cores of the task.
Now our Go application uses all 8 cores, with 8 threads executing goroutines. After 50ms of execution we reach our CPU quota: 8 threads × 50ms gives 400ms. As a result, CFS throttles our CPU resources, meaning no more CPU will be allocated until the next period. Our application sits idle, doing nothing, for a full 50ms.
If our Go application has an average latency of 50ms, a request to our service can now take up to 150ms to complete, tripling the latency.
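The throttling above is easy to sketch as arithmetic: with 8 threads running flat out, the quota is exhausted halfway through the period, and the remainder is spent throttled.

```go
package main

import "fmt"

// throttledPerPeriod returns how many ms of each CFS period are spent
// throttled when `threads` threads run flat out against a quota of
// `quotaMs` per `periodMs` window.
func throttledPerPeriod(quotaMs, periodMs, threads float64) float64 {
	busy := quotaMs / threads // time until the quota is exhausted
	if busy >= periodMs {
		return 0 // quota outlasts the period: no throttling
	}
	return periodMs - busy
}

func main() {
	// 4 vCPUs x 100ms = 400ms quota, but GOMAXPROCS defaulted to 8 threads.
	fmt.Printf("idle per period: %.0fms\n", throttledPerPeriod(400, 100, 8))
}
```

With 4 threads instead, the quota lasts the whole period and no throttling occurs.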
In Kubernetes this issue is easy to solve, as Uber's automaxprocs handles it. So why not use Uber's automaxprocs, and what is the reason behind the gomaxecs package? Well, Uber's automaxprocs does not work for ECS (uber-go/automaxprocs#66), because the cgroup cpu.cfs_quota_us is set to -1 🥲. The workaround is to leverage the ECS Task Metadata endpoint as a means of sourcing the container limits and setting GOMAXPROCS at runtime.
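A simplified sketch of that idea follows. ECS exposes the metadata endpoint through the ECS_CONTAINER_METADATA_URI_V4 environment variable; the JSON field names and the rounding policy below are assumptions for illustration, not the package's actual implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
	"net/http"
	"os"
	"runtime"
)

// containerMetadata captures just the CPU limit from the ECS container
// metadata response (field names assumed for illustration).
type containerMetadata struct {
	Limits struct {
		CPU float64 `json:"CPU"`
	} `json:"Limits"`
}

// maxProcsFor rounds a fractional vCPU limit up to a whole number of
// threads, never going below 1.
func maxProcsFor(cpuLimit float64) int {
	if cpuLimit <= 0 {
		return runtime.NumCPU() // no limit found: fall back to the default
	}
	return int(math.Max(1, math.Ceil(cpuLimit)))
}

func main() {
	uri := os.Getenv("ECS_CONTAINER_METADATA_URI_V4")
	if uri == "" {
		return // not running on ECS; keep the runtime default
	}
	resp, err := http.Get(uri)
	if err != nil {
		return
	}
	defer resp.Body.Close()

	var meta containerMetadata
	if err := json.NewDecoder(resp.Body).Decode(&meta); err != nil {
		return
	}
	procs := maxProcsFor(meta.Limits.CPU)
	runtime.GOMAXPROCS(procs)
	fmt.Println("GOMAXPROCS set to", procs)
}
```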
If anyone has any good ideas on how this package can be improved, all contributions are welcome.
100 Go Mistakes was the main source of inspiration for this package. The examples were borrowed from the book and modified to suit ECS.
Released under the MIT License.