Lazy evaluation in Python
Python Problem Overview
What is lazy evaluation in Python?
One website said:
In Python 3.x the range() function returns a special range object which computes elements of the list on demand (lazy or deferred evaluation):
>>> r = range(10)
>>> print(r)
range(0, 10)
>>> print(r[3])
3
What is meant by this?
Python Solutions
Solution 1 - Python
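The object returned by range() (or xrange() in Python 2.x) is known as a lazy iterable.
Instead of storing the entire range, [0, 1, 2, ..., 9], in memory, it stores only a definition of the range, like for (i=0; i<10; i+=1), and computes each value only when it is needed (AKA lazy evaluation). Strictly speaking, a range object is a lazy sequence rather than a generator: it can still be indexed and iterated over more than once.
Generators are the more general lazy construct. A generator lets you return a list-like stream of values, but there are some differences: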
- A list stores all of its elements when it is created. A generator produces the next element only when it is needed.
- A list can be iterated over as many times as you need. A generator can be iterated over only once.
- A list supports access by index. A generator does not; it only produces its values once, from start to end.
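The three differences above can be checked directly. The following is a minimal sketch; the variable names are made up for illustration. It also shows that range, although lazy, behaves like a sequence and not like a generator.

```python
# Illustrative names only; none of these come from the original answer.
numbers = [x * 2 for x in range(5)]        # list: every value stored up front
lazy_numbers = (x * 2 for x in range(5))   # generator: values produced on demand

# A list can be iterated over repeatedly and indexed.
first_pass = list(numbers)
second_pass = list(numbers)
third_item = numbers[2]

# A generator is exhausted after a single pass.
gen_first_pass = list(lazy_numbers)
gen_second_pass = list(lazy_numbers)       # nothing left to produce

# Indexing a generator raises TypeError.
fresh = (x * 2 for x in range(5))
try:
    fresh[2]
    indexing_failed = False
except TypeError:
    indexing_failed = True

# range is lazy too, but it is a sequence: it can be re-iterated and indexed.
r = range(5)
range_passes = (list(r), list(r))
range_item = r[2]

print(gen_first_pass, gen_second_pass, indexing_failed, range_item)
```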
A generator can be created in two ways:
(1) Very similar to a list comprehension:
# this is a list: all 5000000 x/2 values are created immediately; uses []
lis = [x/2 for x in range(5000000)]
# this is a generator: each x/2 value is created only when it is needed; uses ()
gen = (x/2 for x in range(5000000))
(2) As a function, using yield to return the next value:
# this is also a generator: it runs until a yield occurs and returns that result.
# on the next call it picks up where it left off and continues until the next yield...
def divby2(n):
    num = 0
    while num < n:
        yield num/2
        num += 1

# same as (x/2 for x in range(5000000))
print(divby2(5000000))  # prints a generator object; no values have been computed yet
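To see the "picks up where it left off" behaviour, one option is to pull values out one at a time. This is a small sketch that repeats divby2 so it runs on its own; next() and itertools.islice are standard Python, and the variable names are illustrative.

```python
from itertools import islice

def divby2(n):
    """Yield num/2 for num = 0, 1, ..., n-1, one value per request."""
    num = 0
    while num < n:
        yield num / 2
        num += 1

gen = divby2(5000000)      # creating the generator computes nothing yet

first = next(gen)          # runs the body up to the first yield
second = next(gen)         # resumes right after that yield

# islice takes only the next three values; the rest are never computed.
next_three = list(islice(gen, 3))

print(first, second, next_three)
```

Only five of the five million values are ever produced here, which is the practical benefit of the lazy approach.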
Note: Even though range(5000000) is lazy in Python 3.x, [x/2 for x in range(5000000)] is still a list. range(...) does its job and produces x one at a time, but the entire list of x/2 values is computed when the list is created.
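A rough way to observe the difference is to compare the container sizes with sys.getsizeof. This is only a sketch: the exact byte counts depend on the Python build, and getsizeof measures the container itself, not the float objects it refers to. A smaller n is assumed here so it runs quickly.

```python
import sys

n = 100000  # smaller than 5000000 so the sketch runs quickly

eager = [x / 2 for x in range(n)]   # list comprehension: all values built now
lazy = (x / 2 for x in range(n))    # generator expression: nothing built yet

eager_size = sys.getsizeof(eager)   # grows with n
lazy_size = sys.getsizeof(lazy)     # small, and does not depend on n

# sum() pulls values from the generator one at a time,
# so the full list never has to exist in memory.
total = sum(lazy)

print(eager_size, lazy_size, total)
```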
Solution 2 - Python
In a nutshell, lazy evaluation means that the object is evaluated when it is needed, not when it is created.
In Python 2, range returns a list. This means that if you give it a large number, it computes the whole range up front and returns it at the time of creation:
>>> i = range(100)
>>> type(i)
<type 'list'>
In Python 3, however, you get a special range object:
>>> i = range(100)
>>> type(i)
<class 'range'>
It is only evaluated when you consume it. In other words, it produces the numbers in the range only when you actually need them.
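A quick way to convince yourself of this in Python 3 is to build a range far too large to store as a list. The sketch below uses illustrative variable names; every operation is answered from the start, stop and step values without producing the numbers in between.

```python
big = range(10**12)            # returns immediately; no numbers are stored

size = len(big)                # derived from start, stop and step
item = big[123456789]          # computed arithmetically, not by counting up
present = 999_999_999_999 in big   # integer membership is also arithmetic

evens = range(0, 10**12, 2)
last_even = evens[-1]          # the final element, again without iterating

print(size, item, present, last_even)
```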
Solution 3 - Python
A GitHub repo named python patterns and Wikipedia tell us what lazy evaluation is:
"Delays the eval of an expr until its value is needed and avoids repeated evals."
range in Python 3 is not complete lazy evaluation, because it doesn't avoid repeated evaluation.
A more classic example of lazy evaluation is cached_property:
import functools

class cached_property(object):
    def __init__(self, function):
        self.function = function
        functools.update_wrapper(self, function)

    def __get__(self, obj, type_):
        if obj is None:
            return self
        val = self.function(obj)
        # store the result on the instance so later lookups bypass this descriptor
        obj.__dict__[self.function.__name__] = val
        return val
cached_property (a.k.a. lazy_property) is a decorator which converts a function into a lazily evaluated property. The first time the property is accessed, the function is called to compute the result; that value is then reused on every later access.
For example:
class LogHandler:
    def __init__(self, file_path):
        self.file_path = file_path

    @cached_property
    def load_log_file(self):
        with open(self.file_path) as f:
            # the file is so big that reading all of it takes about 2s
            return f.read()

log_handler = LogHandler('./sys.log')
# only the first access costs 2s.
print(log_handler.load_log_file)
# the return value is cached on the log_handler object.
print(log_handler.load_log_file)
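Since Python 3.8 the standard library ships functools.cached_property, which behaves much like the hand-written descriptor above. The sketch below uses a hypothetical Report class and a call counter (both invented for illustration, so no log file is needed) to show that the function body runs only once.

```python
from functools import cached_property  # standard library, Python 3.8+

class Report:  # hypothetical class, used only for this illustration
    def __init__(self, values):
        self.values = values
        self.compute_calls = 0  # counts how often the property body runs

    @cached_property
    def total(self):
        self.compute_calls += 1
        return sum(self.values)

report = Report([1, 2, 3])
calls_before = report.compute_calls   # nothing computed yet

first = report.total                  # first access runs the function
second = report.total                 # second access reuses the stored value
calls_after = report.compute_calls

print(calls_before, first, second, calls_after)
```

The cached value lives in the instance's __dict__, so deleting the attribute (del report.total) would force a recomputation on the next access.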
To use the proper term, a lazy Python object like range is designed more along the lines of the call_by_need pattern than full lazy evaluation.