Dotnet Core Docker Container Leaks RAM on Linux and causes OOM

C#, Linux, Docker, Ubuntu, .Net Core

C# Problem Overview


I am running Dotnet Core 2.2 in a Linux container in Docker.

I've tried many different configuration/environment options - but I keep coming back to the same problem of running out of memory ('docker events' reports an OOM).

In production I'm hosting on Ubuntu. For Development, I'm using a Linux container (MobyLinux) on Docker in Windows.

I've gone back to running the Web API template project, rather than my actual app. I am literally returning a string and doing nothing else. If I call it about 1,000 times from curl, the container will die. The garbage collector does not appear to be working at all.

Tried setting the following environment variables in the docker-compose:

DOTNET_RUNNING_IN_CONTAINER=true
DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=true
ASPNETCORE_preventHostingStartup=true

Also tried the following in the docker-compose:

mem_reservation: 128m
mem_limit: 256m
memswap_limit: 256m

(these only make it die faster)

Tried setting the following to true or false, no difference:

ServerGarbageCollection

I have also tried running as a Windows container instead. This doesn't OOM, but it does not seem to respect the memory limits either.

I have already ruled out HttpClient and EF Core, since my example uses neither. I have read that listening on port 443 may be a problem: if I leave the container running idle all day, it has used up some more memory by the end of the day (not a massive amount, but it grows).

Example of what's in my API:

// GET api/values/5
[HttpGet("{id}")]
public ActionResult<string> Get(int id)
{
    return "You said: " + id;
}

Calling with Curl example:

curl -X GET "https://localhost:44329/api/values/7" -H  "accept: text/plain" --insecure

(repeated 1,000 or so times)
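To reproduce this without pasting the command by hand, the repeated call can be scripted. This is only a minimal sketch: the function name hammer and the default count are illustrative, and the URL in the usage line is the one from the curl example above.

```shell
#!/usr/bin/env bash
# hammer URL [COUNT]
# Sends COUNT sequential GET requests (default 1000) and prints a tally of
# the HTTP status codes that came back. "hammer" is an illustrative name.
hammer() {
  local url="$1"
  local count="${2:-1000}"
  local i=0
  while [ "$i" -lt "$count" ]; do
    # -s: no progress output, -o /dev/null: discard the body,
    # -w: print only the status code, --insecure: accept the dev certificate
    curl -s -o /dev/null -w '%{http_code}\n' --insecure \
      -X GET "$url" -H "accept: text/plain"
    i=$((i + 1))
  done | sort | uniq -c
}

# Usage:
# hammer "https://localhost:44329/api/values/7" 1000
```

A status of 000 in the tally means curl could not connect, which is what to expect once the container has been killed.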

Expected: RAM usage to remain low for a very primitive request

Actual: RAM usage continues to grow until failure
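To see the growth as numbers rather than waiting for the OOM, memory can be sampled with docker stats while the requests run. A rough sketch: the function name sample_mem and the container name in the usage line are assumptions (docker-compose generates its own container names, so check docker ps first).

```shell
#!/usr/bin/env bash
# sample_mem CONTAINER [SAMPLES] [INTERVAL_SECONDS]
# Prints one timestamped memory reading per sample for the given container.
# "sample_mem" is an illustrative name, not an existing tool.
sample_mem() {
  local container="$1"
  local samples="${2:-10}"
  local interval="${3:-5}"
  local i=0
  while [ "$i" -lt "$samples" ]; do
    # --no-stream takes a single reading; .MemUsage prints "used / limit"
    printf '%s %s\n' "$(date +%H:%M:%S)" \
      "$(docker stats --no-stream --format '{{.MemUsage}}' "$container")"
    i=$((i + 1))
    sleep "$interval"
  done
}

# Usage (container name is a placeholder):
# sample_mem webapplication1_1 60 5
#
# In a second terminal, OOM kills can be watched as they happen:
# docker events --filter event=oom
```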

Full Dockerfile:

FROM microsoft/dotnet:2.2-aspnetcore-runtime AS base
WORKDIR /app
EXPOSE 80
EXPOSE 443

FROM microsoft/dotnet:2.2-sdk AS build
WORKDIR /src
COPY ["WebApplication1/WebApplication1.csproj", "WebApplication1/"]
RUN dotnet restore "WebApplication1/WebApplication1.csproj"
COPY . .
WORKDIR "/src/WebApplication1"
RUN dotnet build "WebApplication1.csproj" -c Release -o /app

FROM build AS publish
RUN dotnet publish "WebApplication1.csproj" -c Release -o /app

FROM base AS final
WORKDIR /app
COPY --from=publish /app .
ENTRYPOINT ["dotnet", "WebApplication1.dll"]

docker-compose.yml

version: '2.3'

services:
  webapplication1:
    image: ${DOCKER_REGISTRY-}webapplication1
    mem_reservation: 128m
    mem_limit: 256m
    memswap_limit: 256m
    cpu_percent: 25
    build:
      context: .
      dockerfile: WebApplication1/Dockerfile

docker-compose.override.yml

version: '2.3'

services:
  webapplication1:
    environment:
      - ASPNETCORE_ENVIRONMENT=Development
      - ASPNETCORE_URLS=https://+:443;http://+:80
      - ASPNETCORE_HTTPS_PORT=44329
      - DOTNET_RUNNING_IN_CONTAINER=true
      - DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=true
      - ASPNETCORE_preventHostingStartup=true
    ports:
      - "50996:80"
      - "44329:443"
    volumes:
      - ${APPDATA}/ASP.NET/Https:/root/.aspnet/https:ro
      - ${APPDATA}/Microsoft/UserSecrets:/root/.microsoft/usersecrets:ro

I'm running Docker CE Engine 18.09.1 on Windows and 18.06.1 on Ubuntu. To confirm - I have also tried in Dotnet Core 2.1.

I've also given it a try in IIS Express - the process gets to around 55MB, and that's while literally spamming it with requests from multiple threads, etc.

When they're all done, it goes down to around 29-35MB.

C# Solutions


Solution 1 - C#

This could be because garbage collection (GC) is not being executed.

This open issue looks very similar:

https://github.com/dotnet/runtime/issues/851

One solution that worked on a virtualized Ubuntu 18.04.4 machine was switching to Workstation garbage collection (GC):

<PropertyGroup>
    <ServerGarbageCollection>false</ServerGarbageCollection>
</PropertyGroup>

https://github.com/dotnet/runtime/issues/851#issuecomment-644648315

https://github.com/dotnet/runtime/issues/851#issuecomment-438474207

https://docs.microsoft.com/en-us/dotnet/standard/garbage-collection/workstation-server-gc
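If rebuilding with the project property is inconvenient, the same GC mode can also be selected per container through an environment variable. This is a sketch assuming .NET Core 2.x/3.x, where the runtime reads COMPlus_gcServer (0 = workstation, 1 = server); the function name and the image name in the usage line are placeholders.

```shell
#!/usr/bin/env bash
# run_workstation_gc IMAGE [extra docker run options...]
# Starts IMAGE with Server GC switched off via COMPlus_gcServer=0.
# "run_workstation_gc" is an illustrative name.
run_workstation_gc() {
  local image="$1"
  shift
  # Any remaining arguments are passed to docker run before the image name
  docker run --rm -e COMPlus_gcServer=0 "$@" "$image"
}

# Usage (image name is a placeholder):
# run_workstation_gc webapplication1 -m 256m -p 50996:80
```

The same variable can be added to the environment: list of the docker-compose.override.yml instead.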

This is another finding:

> After further investigations I've noticed that there is big difference between my servers in amount of available logical CPUs count (80 vs 16). After some googling I came across this topic dotnet/runtime#622 that leads me to an experiments with CPU/GC/Threads settings.
>
> I was using --cpus constraint in stack file; explicitly set System.GC.Concurrent=true, System.GC.HeapCount=8, System.GC.NoAffinitize=true, System.Threading.ThreadPool.MaxThreads=16 in runtimeconfig.template.json file; update image to a 3.1.301-bionic sdk and 3.1.5-bionic asp.net runtime - I made all this things in a various combinations and all of this had no effect. Application just hangs until gets OOMKilled.
>
> The only thing that make it work with Server GC is --cpuset-cpus constraint. Of course, explicit setting of available processors is not an option for a docker swarm mode. But I was experimenting with available cpus to find any regularity. And here I got a few interesting facts.
>
> What is interesting, previously I have migrated 3 other backend services to a new servers cluster and they all go well with a default settings. Their memory limit is set to 600 Mb but in fact they need about 400 Mb to run. Things go wrong only with memory-consuming applications (I have two of those), it requires 3 Gb to build in-memory structures and runs with a 6 Gb constraint.
>
> It keeps working in any range between [1, 35] available cpus and gets hanging when cpus count is 36.

https://github.com/dotnet/runtime/issues/851#issuecomment-645237830
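The --cpuset-cpus workaround from that comment can be tried from the command line. A minimal sketch: cpuset_range is a hypothetical helper that turns a CPU count into the 0-(N-1) range string that docker run --cpuset-cpus accepts, and the image name in the usage line is a placeholder.

```shell
#!/usr/bin/env bash
# cpuset_range N
# Prints the cpuset string covering the first N logical CPUs, e.g. "0-7" for 8.
# "cpuset_range" is an illustrative helper, not a Docker command.
cpuset_range() {
  local n="$1"
  if [ "$n" -lt 1 ]; then
    echo "cpu count must be >= 1" >&2
    return 1
  fi
  if [ "$n" -eq 1 ]; then
    echo "0"
  else
    echo "0-$((n - 1))"
  fi
}

# Usage: pin the container to the first 8 logical CPUs (image is a placeholder)
# docker run --rm --cpuset-cpus "$(cpuset_range 8)" -m 6g my-memory-heavy-image
```

Varying the count this way is one way to look for a threshold like the one reported in the comment on your own hardware.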

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Peter Mills | View Question on Stackoverflow
Solution 1 - C# | Ogglas | View Answer on Stackoverflow