Dotnet Core Docker Container Leaks RAM on Linux and causes OOM
Tags: C#, Linux, Docker, Ubuntu, .NET Core

Problem Overview
I am running Dotnet Core 2.2 in a Linux container in Docker.
I've tried many different configuration/environment options - but I keep coming back to the same problem of running out of memory ('docker events' reports an OOM).
In production I'm hosting on Ubuntu. For Development, I'm using a Linux container (MobyLinux) on Docker in Windows.
I've gone back to running the Web API template project, rather than my actual app. I am literally returning a string and doing nothing else. If I call it about 1,000 times from curl, the container will die. The garbage collector does not appear to be working at all.
Tried setting the following environment variables in the docker-compose:
DOTNET_RUNNING_IN_CONTAINER=true
DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=true
ASPNETCORE_preventHostingStartup=true
Also tried the following in the docker-compose:
mem_reservation: 128m
mem_limit: 256m
memswap_limit: 256m
(these only make it die faster)
Tried setting the following to true or false, no difference:
ServerGarbageCollection
I have tried running as a Windows container instead; this doesn't OOM, but it does not seem to respect the memory limits either.
I have already ruled out HttpClient and EF Core, as I'm not even using them in my example. I have also read a bit about listening on port 443 being a problem: if I leave the container running idle all day long and check at the end of the day, it has used up some more memory (not a massive amount, but it grows).
Example of what's in my API:
// GET api/values/5
[HttpGet("{id}")]
public ActionResult<string> Get(int id)
{
    return "You said: " + id;
}
Calling with Curl example:
curl -X GET "https://localhost:44329/api/values/7" -H "accept: text/plain" --insecure
(repeated 1,000 or so times)
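For reference, something like the following shell loop reproduces the repeated calls (a sketch only; the port and path assume the compose setup shown below):

for i in $(seq 1 1000); do
  curl -X GET "https://localhost:44329/api/values/$i" -H "accept: text/plain" --insecure -s -o /dev/null
done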
Expected: RAM usage to remain low for a very primitive request
Actual: RAM usage continues to grow until failure
Full Dockerfile:
FROM microsoft/dotnet:2.2-aspnetcore-runtime AS base
WORKDIR /app
EXPOSE 80
EXPOSE 443
FROM microsoft/dotnet:2.2-sdk AS build
WORKDIR /src
COPY ["WebApplication1/WebApplication1.csproj", "WebApplication1/"]
RUN dotnet restore "WebApplication1/WebApplication1.csproj"
COPY . .
WORKDIR "/src/WebApplication1"
RUN dotnet build "WebApplication1.csproj" -c Release -o /app
FROM build AS publish
RUN dotnet publish "WebApplication1.csproj" -c Release -o /app
FROM base AS final
WORKDIR /app
COPY --from=publish /app .
ENTRYPOINT ["dotnet", "WebApplication1.dll"]
docker-compose.yml
version: '2.3'

services:
  webapplication1:
    image: ${DOCKER_REGISTRY-}webapplication1
    mem_reservation: 128m
    mem_limit: 256m
    memswap_limit: 256m
    cpu_percent: 25
    build:
      context: .
      dockerfile: WebApplication1/Dockerfile
docker-compose.override.yml
version: '2.3'

services:
  webapplication1:
    environment:
      - ASPNETCORE_ENVIRONMENT=Development
      - ASPNETCORE_URLS=https://+:443;http://+:80
      - ASPNETCORE_HTTPS_PORT=44329
      - DOTNET_RUNNING_IN_CONTAINER=true
      - DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=true
      - ASPNETCORE_preventHostingStartup=true
    ports:
      - "50996:80"
      - "44329:443"
    volumes:
      - ${APPDATA}/ASP.NET/Https:/root/.aspnet/https:ro
      - ${APPDATA}/Microsoft/UserSecrets:/root/.microsoft/usersecrets:ro
I'm running Docker CE Engine 18.09.1 on Windows and 18.06.1 on Ubuntu. To confirm, I have also tried this on Dotnet Core 2.1.
I've also given it a try in IIS Express: the process gets to around 55 MB while literally spamming it from multiple threads, etc.
When they're all done, it goes back down to around 29-35 MB.
C# Solutions
Solution 1 - C#
This could be because garbage collection (GC) is never being executed.
Looking at this open issue, it looks very similar:
https://github.com/dotnet/runtime/issues/851
One solution that made Ubuntu 18.04.4 work on a virtualized machine was using workstation garbage collection (GC):
<PropertyGroup>
  <ServerGarbageCollection>false</ServerGarbageCollection>
</PropertyGroup>
https://github.com/dotnet/runtime/issues/851#issuecomment-644648315
https://github.com/dotnet/runtime/issues/851#issuecomment-438474207
https://docs.microsoft.com/en-us/dotnet/standard/garbage-collection/workstation-server-gc
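If rebuilding with that project property isn't convenient, the same switch can also be flipped at runtime. A minimal sketch, assuming the standard GC environment variable (COMPlus_gcServer, which newer runtimes also accept as DOTNET_gcServer), added under the service in the compose override file:

    environment:
      # 0 = workstation GC, 1 = server GC
      - COMPlus_gcServer=0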
This is another finding:
> After further investigation I've noticed that there is a big difference
> between my servers in the number of available logical CPUs (80 vs 16).
> After some googling I came across this topic, dotnet/runtime#622, which
> led me to experiment with CPU/GC/thread settings.
>
> I was using the --cpus constraint in the stack file; explicitly set
> System.GC.Concurrent=true, System.GC.HeapCount=8,
> System.GC.NoAffinitize=true and System.Threading.ThreadPool.MaxThreads=16
> in the runtimeconfig.template.json file; and updated the image to the
> 3.1.301-bionic SDK and 3.1.5-bionic ASP.NET runtime. I tried all of these
> things in various combinations and none of it had any effect: the
> application just hangs until it gets OOMKilled.
>
> The only thing that makes it work with Server GC is the --cpuset-cpus
> constraint. Of course, explicitly setting the available processors is not
> an option for Docker swarm mode, but I experimented with the available
> CPUs to look for any regularity, and found a few interesting facts.
>
> Interestingly, I previously migrated 3 other backend services to the new
> server cluster and they all run well with default settings. Their memory
> limit is set to 600 MB, but in fact they need about 400 MB to run. Things
> go wrong only with the memory-consuming applications (I have two of
> those); these require about 3 GB to build their in-memory structures and
> run with a 6 GB constraint.
>
> It keeps working with any number of available CPUs in the range [1, 35]
> and starts hanging when the CPU count is 36.
https://github.com/dotnet/runtime/issues/851#issuecomment-645237830
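For reference, the runtimeconfig.template.json mentioned in that comment would look roughly like the following. This is only a sketch: the file sits next to the .csproj, its configProperties are merged into the generated runtimeconfig.json at build time, and the values are simply the ones the commenter experimented with, not recommendations.

{
  "configProperties": {
    "System.GC.Concurrent": true,
    "System.GC.HeapCount": 8,
    "System.GC.NoAffinitize": true,
    "System.Threading.ThreadPool.MaxThreads": 16
  }
}

The --cpuset-cpus workaround is an ordinary Docker option, e.g. docker run --cpuset-cpus="0-15" ..., or cpuset: "0-15" under the service in a version 2 compose file.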