Tracking *maximum* memory usage by a Python function

Python Problem Overview


I want to find the maximum amount of RAM allocated during a call to a Python function. There are other questions on SO about tracking RAM usage:

https://stackoverflow.com/questions/110259/python-memory-profiler

https://stackoverflow.com/questions/552744/how-do-i-profile-memory-usage-in-python

but the tools described there seem geared toward reporting memory usage at the moment a method such as heap() (in the case of guppy) is called. What I want to profile is a function in an external library that I can't modify, which grows to use a lot of RAM and then frees it once the function returns. Is there any way to find out the peak amount of RAM used during the function call?

Python Solutions


Solution 1 - Python

This is possible with memory_profiler. The function memory_usage returns a list of values that represent the memory usage over time (by default sampled every 0.1 seconds). If you need the maximum, just take the max of that list. A small example:

from memory_profiler import memory_usage
from time import sleep

def f():
    # a function with growing
    # memory consumption
    a = [0] * 1000
    sleep(.1)
    b = a * 100
    sleep(.1)
    c = b * 100
    return a

mem_usage = memory_usage(f)
print('Memory usage (in chunks of .1 seconds): %s' % mem_usage)
print('Maximum memory usage: %s' % max(mem_usage))

In my case (memory_profiler 0.25) it prints the following output (values are in MiB):

Memory usage (in chunks of .1 seconds): [45.65625, 45.734375, 46.41015625, 53.734375]
Maximum memory usage: 53.734375
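
Since the question is about a library function that cannot be modified, it may help to see how arguments can be forwarded to it. The following is a minimal sketch, assuming memory_profiler is installed and relying on memory_usage accepting a (function, args, kwargs) tuple. The helper name peak_memory_mib and the 0.01 s sampling interval are illustrative choices, not part of the original answer:

```python
def peak_memory_mib(func, *args, interval=0.01, **kwargs):
    """Return the highest memory sample (MiB) seen while func(*args, **kwargs) runs.

    peak_memory_mib is a hypothetical helper name; interval is the sampling
    period in seconds, so spikes shorter than that can be missed.
    """
    # Imported lazily so that defining the helper does not itself require
    # memory_profiler; it is only needed when a measurement is taken.
    from memory_profiler import memory_usage

    # memory_usage runs the callable and samples process memory while it executes.
    samples = memory_usage((func, args, kwargs), interval=interval)
    return max(samples)


# Example call (sorted stands in for the external library function):
# peak = peak_memory_mib(sorted, list(range(10**6)), reverse=True)
```

Because the measurement is sampled, a shorter interval makes it less likely that a brief spike is missed, at the cost of more overhead.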

Solution 2 - Python

This question seemed rather interesting, and it gave me a reason to look into Guppy / Heapy; for that I thank you.

I tried for about 2 hours to get Heapy to monitor a function call / process without modifying its source, with zero luck.

I did find a way to accomplish your task using the built-in Python library resource. Note that the documentation does not indicate what units the ru_maxrss value is in. Another SO user noted that it was in kB. Running Mac OSX 7.3 and watching my system resources climb during the test code below, I believe the returned values to be in bytes, not kilobytes. (Both observations hold: ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.)

A 10,000 ft view of how I used the resource library to monitor the library call: launch the function in a separate (monitorable) thread and track the system resources for the process from the main thread. Below are the two files you need to run to test it out.

Library Resource Monitor - whatever_you_want.py

import resource
import time

from stoppable_thread import StoppableThread


class MyLibrarySniffingClass(StoppableThread):
    def __init__(self, target_lib_call, arg1, arg2):
        super(MyLibrarySniffingClass, self).__init__()
        self.target_function = target_lib_call
        self.arg1 = arg1
        self.arg2 = arg2
        self.results = None

    def startup(self):
        # Override the startup hook
        print("Calling the Target Library Function...")

    def cleanup(self):
        # Override the cleanup hook
        print("Library Call Complete")

    def mainloop(self):
        # Start the library call
        self.results = self.target_function(self.arg1, self.arg2)

        # Stop the thread when complete
        self.stop()


def SomeLongRunningLibraryCall(arg1, arg2):
    max_dict_entries = 2500
    delay_per_entry = .005

    some_large_dictionary = {}
    dict_entry_count = 0

    while True:
        time.sleep(delay_per_entry)
        dict_entry_count += 1
        # list(...) so that each entry really holds 10000 ints on Python 3
        some_large_dictionary[dict_entry_count] = list(range(10000))

        if len(some_large_dictionary) > max_dict_entries:
            break

    print(arg1 + " " + arg2)
    return "Good Bye World"


if __name__ == "__main__":
    # Lib testing code
    mythread = MyLibrarySniffingClass(SomeLongRunningLibraryCall, "Hello", "World")
    mythread.start()

    # ru_maxrss is the process-wide peak RSS: bytes on macOS, kB on Linux
    start_mem = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    delta_mem = 0
    max_memory = 0
    memory_usage_refresh = .005  # Seconds

    while True:
        time.sleep(memory_usage_refresh)
        delta_mem = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss - start_mem
        if delta_mem > max_memory:
            max_memory = delta_mem

        # Uncomment this line to see the memory usage during run-time
        # print("Memory Usage During Call: %d MB" % (delta_mem / 1000000.0))

        # Check to see if the library call is complete
        if mythread.isShutdown():
            print(mythread.results)
            break

    # Divide by 1000000.0 for bytes (macOS); use 1000.0 where ru_maxrss is in kB (Linux)
    print("\nMAX Memory Usage in MB: " + str(round(max_memory / 1000000.0, 3)))

Stoppable Thread - stoppable_thread.py

import threading
import time

class StoppableThread(threading.Thread):
    def __init__(self):
        super(StoppableThread, self).__init__()
        self.daemon = True
        self.__monitor = threading.Event()
        self.__monitor.set()
        self.__has_shutdown = False
        
    def run(self):
        '''Overloads the threading.Thread.run'''
        # Call the User's Startup functions
        self.startup()
        
        # Loop until the thread is stopped
        while self.isRunning():
            self.mainloop()
            
        # Clean up
        self.cleanup()
        
        # Flag to the outside world that the thread has exited
        # AND that the cleanup is complete
        self.__has_shutdown = True
        
    def stop(self):
        self.__monitor.clear()
        
    def isRunning(self):
        return self.__monitor.is_set()
        
    def isShutdown(self):
        return self.__has_shutdown
        
        
    ###############################
    ### User Defined Functions ####
    ###############################
    
    def mainloop(self):
        '''
        Expected to be overwritten in a subclass!!
        Note that Stoppable while(1) is handled in the built in "run".
        '''
        pass
        
    def startup(self):
        '''Expected to be overwritten in a subclass!!'''
        pass
        
    def cleanup(self):
        '''Expected to be overwritten in a subclass!!'''
        pass
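
On Python 3 the same idea fits in one function built on a plain threading.Thread. This is only a sketch under the assumptions already stated (the Unix-only resource module, and ru_maxrss units that depend on the platform). The names run_and_watch_maxrss and build_big_list are hypothetical and used purely for illustration:

```python
import resource
import threading


def run_and_watch_maxrss(func, *args, poll_interval=0.005, **kwargs):
    """Run func in a worker thread and poll ru_maxrss from the calling thread.

    Returns (result, growth), where growth is the increase of the process-wide
    peak RSS in the platform's ru_maxrss units (kB on Linux, bytes on macOS).
    """
    outcome = {}

    def worker():
        outcome["result"] = func(*args, **kwargs)

    start = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    peak = start
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    while thread.is_alive():
        # join() with a timeout doubles as the polling delay
        thread.join(poll_interval)
        peak = max(peak, resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
    # result is None if the worker raised before storing a value
    return outcome.get("result"), peak - start


def build_big_list(n):
    # Hypothetical stand-in for the external library call
    data = [list(range(100)) for _ in range(n)]
    return len(data)


result, growth = run_and_watch_maxrss(build_big_list, 100000)
print("result:", result, "peak RSS growth:", growth)
```

Because ru_maxrss is already a high-water mark, a single reading after the call gives the same growth; polling is only useful for watching the value climb. If the process had already peaked higher before the call, the reported growth is 0.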

  

Solution 3 - Python

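This appears to work under Windows; I don't know about other operating systems. The value is the peak working set size in bytes. (Newer psutil releases no longer provide get_ext_memory_info(); on Windows the same peak_wset field is available from process.memory_info().)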

In [50]: import os

In [51]: import psutil

In [52]: process = psutil.Process(os.getpid())

In [53]: process.get_ext_memory_info().peak_wset
Out[53]: 41934848

Solution 4 - Python

You can use the Python standard-library module resource to get the peak memory usage.

import resource
resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

This gives the peak resident set size of the process: in kilobytes on Linux, but in bytes on macOS. To convert the Linux value to MiB, divide by 1024.
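
ru_maxrss is reported in kilobytes on Linux but in bytes on macOS, so a small wrapper can normalise it. A sketch, assuming only those two platforms need to be distinguished (peak_rss_mib is a hypothetical helper name):

```python
import resource
import sys


def peak_rss_mib():
    """Peak resident set size of the current process so far, in MiB."""
    maxrss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        # macOS reports ru_maxrss in bytes
        return maxrss / (1024 * 1024)
    # Linux reports ru_maxrss in kilobytes
    return maxrss / 1024


before = peak_rss_mib()
data = [0] * 5_000_000  # stand-in for the call being measured
after = peak_rss_mib()
print("peak RSS grew by %.1f MiB" % (after - before))
```

The difference is only meaningful if the call pushes the process past its earlier peak; otherwise it is 0.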

Solution 5 - Python

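An improvement on @Vader B's answer, which did not work for me out of the box. In shells such as bash, plain time is a shell keyword that does not report memory, so call the external time binary by its full path and pass --verbose: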

$ /usr/bin/time --verbose  ./myscript.py
        Command being timed: "./myscript.py"
        User time (seconds): 16.78
        System time (seconds): 2.74
        Percent of CPU this job got: 117%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.58
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 616092   # WE NEED THIS!!!
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 432750
        Voluntary context switches: 1075
        Involuntary context switches: 118503
        Swaps: 0
        File system inputs: 0
        File system outputs: 800
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

Solution 6 - Python

The standard Unix utility time tracks the maximum memory usage of a process, along with other useful statistics for your program.

Example output (maxresident is the maximum memory usage, in kilobytes):

> time python ./scalabilty_test.py
45.31user 1.86system 0:47.23elapsed 99%CPU (0avgtext+0avgdata 369824maxresident)k
0inputs+100208outputs (0major+99494minor)pagefaults 0swaps
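
A similar figure can be collected from Python when the workload can be run as a separate process. This is a rough sketch, assuming a Unix system: it runs a command with subprocess and then reads ru_maxrss for terminated children via resource.RUSAGE_CHILDREN. The helper name child_peak_rss and the inline command are illustrative only:

```python
import resource
import subprocess
import sys


def child_peak_rss(argv):
    """Run argv to completion and return ru_maxrss for waited-for children.

    The value is in kilobytes on Linux and in bytes on macOS. It covers the
    largest child this process has waited for so far, so measure one command
    per parent process to get an unambiguous number.
    """
    subprocess.run(argv, check=True)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


peak = child_peak_rss([sys.executable, "-c", "data = [0] * 10_000_000"])
print("child peak RSS:", peak)
```

Running the workload in its own process also keeps the measurement independent of whatever the parent process allocated earlier.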

Solution 7 - Python

The free command gets its information from /proc/meminfo, which you can read directly on a Linux system:

~ head /proc/meminfo
MemTotal:        4039168 kB
MemFree:         2567392 kB
MemAvailable:    3169436 kB
Buffers:           81756 kB
Cached:           712808 kB
SwapCached:            0 kB
Active:           835276 kB
Inactive:         457436 kB
Active(anon):     499080 kB
Inactive(anon):    17968 kB

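I have created a decorator class to measure the memory consumption of a function. It compares the system-wide MemFree value before and after the call, so it reports memory that is still held when the function returns (and it is affected by other processes), not the peak during the call.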

class memoryit:
    """Decorator class that prints the memory consumption of a
    function/method, averaged over a number of iterations."""

    @staticmethod
    def FreeMemory():
        with open('/proc/meminfo') as file:
            for line in file:
                if 'MemFree' in line:
                    free_memKB = line.split()[1]
                    return float(free_memKB) / (1024 * 1024)    # returns GBytes float

    def __init__(self, function):
        self.function = function

    def __call__(self, *args, iterations=1, **kwargs):
        before = memoryit.FreeMemory()
        for i in range(iterations):
            result = self.function(*args, **kwargs)
        after = memoryit.FreeMemory()
        print('%r memory used: %2.3f GB' % (self.function.__name__, (before - after) / iterations))
        return result

Function to measure consumption:

@memoryit
def MakeMatrix (dim):
    matrix = []   
    for i in range (dim):
        matrix.append([j for j in range (dim)])
    return (matrix)

Usage:

print ("Starting memory:", memoryit.FreeMemory()) 
m = MakeMatrix(10000)    
print ("Ending memory:", memoryit.FreeMemory() )

Printout:

Starting memory: 10.58599853515625
'MakeMatrix' memory used: 3.741 GB
Ending memory: 6.864116668701172
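
Note that MemFree is a system-wide figure. On Linux the kernel also exposes a per-process peak: the VmHWM line of /proc/self/status is the resident-set high-water mark of the current process. This is a different technique from the decorator above, sketched under the assumption of a Linux /proc filesystem. parse_kb_field and peak_rss_kb are hypothetical helper names:

```python
def parse_kb_field(text, field):
    """Return the integer kB value of `field` from /proc-style 'Name:  123 kB' text."""
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name.strip() == field:
            return int(rest.split()[0])
    raise KeyError(field)


def peak_rss_kb(status_path="/proc/self/status"):
    """Peak resident set size (VmHWM) of the current process, in kB (Linux only)."""
    with open(status_path) as status_file:
        return parse_kb_field(status_file.read(), "VmHWM")


# The parser also works on /proc/meminfo-style text:
sample = "MemTotal:        4039168 kB\nMemFree:         2567392 kB\n"
print(parse_kb_field(sample, "MemFree"))  # prints 2567392
```

Calling peak_rss_kb() before and after the function of interest shows whether the call raised the process's peak, independent of what other processes are doing.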

Solution 8 - Python

I have been struggling with this task as well. After experimenting with psutil and the methods from Adam's answer, I wrote a function (credit to Adam Lewis) to measure the memory used by a specific function. People may find it easier to grab and use.

  1. measure_memory_usage

  2. test measure_memory_usage

I found that materials on threading and on overriding superclass methods are really helpful for understanding what Adam is doing in his scripts. Sorry, I cannot post the links because of my "2 links" maximum limitation.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type          Original Author     Original Content on Stackoverflow
Question              astrofrog           View Question on Stackoverflow
Solution 1 - Python   Fabian Pedregosa    View Answer on Stackoverflow
Solution 2 - Python   Adam Lewis          View Answer on Stackoverflow
Solution 3 - Python   sethp               View Answer on Stackoverflow
Solution 4 - Python   VPS                 View Answer on Stackoverflow
Solution 5 - Python   JenyaKh             View Answer on Stackoverflow
Solution 6 - Python   Vader B             View Answer on Stackoverflow
Solution 7 - Python   user-asterix        View Answer on Stackoverflow
Solution 8 - Python   Gatsby              View Answer on Stackoverflow