Google Cloud Platform: how to monitor memory usage of VM instances


Memory Problem Overview


I have recently performed a migration to Google Cloud Platform, and I really like it.

However, I can't find a way to monitor the memory usage of the Dataproc VM instances. As you can see in the attachment, the console provides utilization info about CPU, disk and network, but not about memory.

Without knowing how much memory is being used, how can I tell whether more memory is needed?


Memory Solutions


Solution 1 - Memory

Installing the Stackdriver agent on GCE VMs lets you monitor additional metrics such as memory. Stackdriver also offers alerting and notification features. Note, however, that agent metrics are only available for premium tier accounts.

See this answer for Dataproc VMs.

Solution 2 - Memory

Without the agent, the built-in Compute Engine metrics only report RAM usage for the E2 family at the moment. Other machine families such as N1 and N2 are not covered.

See the latest documentation for what is supported: https://cloud.google.com/monitoring/api/metrics_gcp#gcp-compute


Solution 3 - Memory

You can read /proc/meminfo, a virtual file in the proc file system, to get information on current memory usage. A simple bash script can read the memory figures from /proc/meminfo, run periodically as a cron job, and send an alert email if memory usage exceeds a given threshold.

See this link: https://pakjiddat.netlify.app/posts/monitoring-cpu-and-memory-usage-on-linux
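
The approach described above could be sketched roughly as follows. This is only a minimal illustration, not the script from the linked post: the function names `mem_used_percent` and `check_memory` and the `THRESHOLD` variable are hypothetical, and it assumes a kernel that exposes `MemAvailable` in /proc/meminfo.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints used memory as a whole-number percentage.
# It reads MemTotal and MemAvailable (both in kB) from a meminfo-style file.
mem_used_percent() {
  local meminfo_file="${1:-/proc/meminfo}"
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%d\n", int((total - avail) * 100 / total) }' \
    "$meminfo_file"
}

# Assumed alert threshold in percent; override via the environment.
THRESHOLD="${THRESHOLD:-90}"

# Prints a status line and returns non-zero when usage reaches the threshold.
check_memory() {
  local used
  used="$(mem_used_percent "${1:-/proc/meminfo}")"
  if [ "$used" -ge "$THRESHOLD" ]; then
    echo "ALERT: memory usage at ${used}% (threshold ${THRESHOLD}%)"
    return 1
  fi
  echo "OK: memory usage at ${used}%"
}
```

To use it from cron, one option is to add a final `check_memory` call to the file and schedule it every few minutes; the non-zero return status on an alert can then trigger whatever notification command is available on the VM, such as the email the answer mentions.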

Solution 4 - Memory

The most up-to-date answer here.

How to see memory usage in GCP?

  1. Install the agent on your virtual machine. Takes less than 5 minutes.
curl -sSO https://dl.google.com/cloudagents/add-monitoring-agent-repo.sh
sudo bash add-monitoring-agent-repo.sh
sudo apt-get update
sudo apt-get install stackdriver-agent

The snippet above should install the most recent version of the agent, but for an up-to-date guide you can always refer to https://cloud.google.com/monitoring/agent/installation#joint-install.

  2. After it's installed, the additional metrics should appear within a minute or two in the Monitoring section of GCP: https://console.cloud.google.com/monitoring


Explanation and why it's invisible by default?

Metrics such as CPU usage or memory usage can be collected in different places. CPU usage, for instance, is something the host (the machine whose special software runs your virtual machine) can collect. Memory usage is different: memory is managed by the operating system inside your virtual machine. The host cannot really know how much of it is in use, because all it sees in the memory given to that virtual machine is a stream of bytes.

That is why agents are installed inside the virtual machine: they collect the metrics from the inside and ship them somewhere they can be interpreted. There are many types of agents available, but Google promotes its own, the Monitoring Agent, which integrates well with the rest of the GCP suite.

Solution 5 - Memory

The agent metrics page may be useful: https://cloud.google.com/monitoring/api/metrics_agent

You'll need to install Stackdriver. See: https://app.google.stackdriver.com/?project="your project name"

The stackdriver metrics page will provide some guidance. You will need to change the "project name" (e.g. sinuous-dog-133823) to suit your account:

https://app.google.stackdriver.com/metrics-explorer?project=sinuous-dog-133823&timeSelection={"timeRange":"6h"}&xyChart={"dataSets":[{"timeSeriesFilter":{"filter":"metric.type="agent.googleapis.com/memory/bytes_used" resource.type="gce_instance"","perSeriesAligner":"ALIGN_MEAN","crossSeriesReducer":"REDUCE_NONE","secondaryCrossSeriesReducer":"REDUCE_NONE","minAlignmentPeriod":"60s","groupByFields":[],"unitOverride":"By"},"targetAxis":"Y1","plotType":"LINE"}],"options":{"mode":"COLOR"},"constantLines":[],"timeshiftDuration":"0s","y1Axis":{"label":"y1Axis","scale":"LINEAR"}}&isAutoRefresh=true

This REST call will get you the memory usage (the agent.googleapis.com/memory/bytes_used metric). You will need to replace the project ID in the request path with your own and adjust the other parameters to suit your needs.

GET /v3/projects/sinuous-cat-233823/timeSeries?filter=metric.type="agent.googleapis.com/memory/bytes_used" resource.type="gce_instance"& aggregation.crossSeriesReducer=REDUCE_NONE& aggregation.alignmentPeriod=+60s& aggregation.perSeriesAligner=ALIGN_MEAN& secondaryAggregation.crossSeriesReducer=REDUCE_NONE& interval.startTime=2019-03-06T20:40:00Z& interval.endTime=2019-03-07T02:51:00Z& $unique=gc673 HTTP/1.1
Host: content-monitoring.googleapis.com
authorization: Bearer <your token>
cache-control: no-cache
Postman-Token: 039cabab-356e-4ee4-99c4-d9f4685a7bb2
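
The same request can be sketched with curl against the public Cloud Monitoring endpoint. This is a hedged illustration rather than the answerer's exact call: the function name `list_memory_usage` and the `CURL` override (handy for printing the arguments instead of sending them) are hypothetical, and the filter joins the two conditions with an explicit `AND`.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the Cloud Monitoring timeSeries.list call.
# Arguments: project ID, start time, end time (RFC 3339), OAuth access token.
list_memory_usage() {
  local project="$1" start="$2" end="$3" token="$4"
  # -G sends the --data-urlencode pairs as a URL-encoded query string.
  "${CURL:-curl}" -sS -G \
    "https://monitoring.googleapis.com/v3/projects/${project}/timeSeries" \
    -H "Authorization: Bearer ${token}" \
    --data-urlencode 'filter=metric.type="agent.googleapis.com/memory/bytes_used" AND resource.type="gce_instance"' \
    --data-urlencode "interval.startTime=${start}" \
    --data-urlencode "interval.endTime=${end}" \
    --data-urlencode "aggregation.alignmentPeriod=60s" \
    --data-urlencode "aggregation.perSeriesAligner=ALIGN_MEAN"
}
```

A call might look like `list_memory_usage my-project 2019-03-06T20:40:00Z 2019-03-07T02:51:00Z "$(gcloud auth print-access-token)"`, where `my-project` is a placeholder for your own project ID.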

Solution 6 - Memory

VM memory metrics are not available by default; they require the Cloud Monitoring Agent.

The UI you are showing is Dataproc, which already has the agent installed but disabled by default, so you don't have to reinstall it. To enable the Cloud Monitoring Agent for Dataproc clusters, set --properties dataproc:dataproc.monitoring.stackdriver.enable=true when creating the cluster. You can then monitor VM memory and create alerts in the Cloud Monitoring UI (it is not integrated with the Dataproc UI yet).
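
As a rough sketch of how that property might be passed at cluster creation time: the function name `create_monitored_cluster`, the `GCLOUD` override and the example cluster name and region are hypothetical, while the property itself is the one quoted above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: creates a Dataproc cluster with the monitoring
# agent enabled. Arguments: cluster name, region.
create_monitored_cluster() {
  local cluster="$1" region="$2"
  "${GCLOUD:-gcloud}" dataproc clusters create "$cluster" \
    --region="$region" \
    --properties="dataproc:dataproc.monitoring.stackdriver.enable=true"
}
```

For example, `create_monitored_cluster demo-cluster us-central1` would create a cluster with default settings plus the monitoring property; any other flags your cluster needs would have to be added to the command.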

Also see this related question: https://stackoverflow.com/questions/68403172/dataproc-vm-memory-and-local-disk-usage-metrics

Solution 7 - Memory

This article is now out of date: the Stackdriver agent is a legacy agent and has been replaced by the Ops Agent. Please read the latest GCP articles about migrating to the Ops Agent.
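
For reference, installing the Ops Agent on a single VM might look like the sketch below. It assumes the repository script and `--also-install` flag described in Google's Ops Agent installation guide, so check that guide before relying on it; the function name `install_ops_agent` and the `RUN` dry-run variable are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: downloads Google's repo script and installs the Ops Agent.
# Set RUN=echo to print the commands instead of executing them.
install_ops_agent() {
  local run="${RUN:-}"
  $run curl -sSO https://dl.google.com/cloudagents/add-google-cloud-ops-agent-repo.sh
  $run sudo bash add-google-cloud-ops-agent-repo.sh --also-install
}
```

Running `RUN=echo install_ops_agent` first shows exactly what would be executed on the VM.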

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Daniele B | View Question on Stackoverflow
Solution 1 - Memory | Carlos | View Answer on Stackoverflow
Solution 2 - Memory | user2314327 | View Answer on Stackoverflow
Solution 3 - Memory | Nadir Latif | View Answer on Stackoverflow
Solution 4 - Memory | Adam Sibik | View Answer on Stackoverflow
Solution 5 - Memory | StvnBrkdll | View Answer on Stackoverflow
Solution 6 - Memory | Dagang | View Answer on Stackoverflow
Solution 7 - Memory | WDAdmin | View Answer on Stackoverflow