Is there a clever way to pass the key to defaultdict's default_factory?
Python Problem Overview
A class has a constructor which takes one parameter:
class C(object):
    def __init__(self, v):
        self.v = v
    ...
Somewhere in the code, it is useful for values in a dict to know their keys.
I want to use a defaultdict with the key passed to newborn default values:
d = defaultdict(lambda : C(here_i_wish_the_key_to_be))
Any suggestions?
Python Solutions
Solution 1 - Python
It hardly qualifies as clever - but subclassing is your friend:
from collections import defaultdict

class keydefaultdict(defaultdict):
    def __missing__(self, key):
        # Called by d[key] when key is absent.
        if self.default_factory is None:
            raise KeyError(key)
        else:
            ret = self[key] = self.default_factory(key)
            return ret

d = keydefaultdict(C)
d[x]  # returns C(x)
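One caveat worth knowing with this approach: `__missing__` is only called by subscript lookup (`d[key]`), not by `get()` or membership tests. A minimal sketch, repeating the class above so it runs on its own; the `Node` class is a hypothetical stand-in for `C`:

```python
from collections import defaultdict


class keydefaultdict(defaultdict):
    """defaultdict variant that passes the missing key to default_factory."""

    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        ret = self[key] = self.default_factory(key)
        return ret


class Node:
    """Hypothetical value type that remembers the key it was created for."""

    def __init__(self, v):
        self.v = v


d = keydefaultdict(Node)

first = d["a"]           # missing key: __missing__ builds Node("a") and stores it
second = d["a"]          # key now present: the stored instance is returned

print(first.v)           # a
print(first is second)   # True
print(d.get("b"))        # None, get() does not trigger __missing__
print("b" in d)          # False, no entry was created for "b"
```

So code that relies on `d.get(key)` or `key in d` will not create values on demand; only `d[key]` does.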
Solution 2 - Python
No, there is not.
The defaultdict implementation cannot be configured to pass the missing key to the default_factory out of the box. Your only option is to implement your own defaultdict subclass, as suggested by @JochenRitzel, above.
But that isn't "clever" or nearly as clean as a standard library solution would be (if it existed). Thus the answer to your succinct, yes/no question is clearly "No".
It's too bad the standard library is missing such a frequently needed tool.
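To see the limitation concretely: `defaultdict` calls `default_factory` with no arguments, so a factory that expects the key fails at lookup time. A small sketch (the key `'x'` is arbitrary):

```python
from collections import defaultdict

# A factory that expects the key as its single argument.
d = defaultdict(lambda key: key.upper())

error = None
try:
    d["x"]
except TypeError as exc:
    # defaultdict invoked the lambda with zero arguments.
    error = exc
    print("TypeError:", exc)

# The failed lookup did not insert anything.
print(len(d))  # 0
```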
Solution 3 - Python
I don't think you need defaultdict here at all. Why not just use the dict.setdefault method?
>>> d = {}
>>> d.setdefault('p', C('p')).v
'p'
That will, of course, create many instances of C, because the default argument is evaluated on every call. If that is an issue, I think the simpler approach will do:
>>> d = {}
>>> if 'e' not in d: d['e'] = C('e')
It would be quicker than the defaultdict or any other alternative as far as I can see.
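The difference between the two snippets can be made visible by counting constructor calls. This is a sketch; the `instances` counter is added purely for illustration and is not part of the original `C`:

```python
class C:
    instances = 0  # illustrative counter, not in the original class

    def __init__(self, v):
        C.instances += 1
        self.v = v


d = {}

# setdefault evaluates C('p') before the call, on every iteration,
# even though only the first instance is stored.
for _ in range(3):
    d.setdefault("p", C("p"))
after_setdefault = C.instances
print(after_setdefault)  # 3

# The explicit membership test constructs an instance only when needed.
for _ in range(3):
    if "e" not in d:
        d["e"] = C("e")
after_membership_test = C.instances - after_setdefault
print(after_membership_test)  # 1
```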
ETA regarding the speed of the in test vs. using a try-except clause:
>>> import timeit
>>> def g():
        d = {}
        if 'a' in d:
            return d['a']
>>> timeit.timeit(g)
0.19638929363557622
>>> def f():
        d = {}
        try:
            return d['a']
        except KeyError:
            return
>>> timeit.timeit(f)
0.6167065411074759
>>> def k():
        d = {'a': 2}
        if 'a' in d:
            return d['a']
>>> timeit.timeit(k)
0.30074866358404506
>>> def p():
        d = {'a': 2}
        try:
            return d['a']
        except KeyError:
            return
>>> timeit.timeit(p)
0.28588609450770264
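The numbers above depend on the machine and the Python version. To rerun the comparison as a script, a sketch along these lines could be used; the function names and the `number` value are arbitrary choices to keep the run short and readable:

```python
import timeit


def missing_with_in():
    d = {}
    if 'a' in d:
        return d['a']


def missing_with_try():
    d = {}
    try:
        return d['a']
    except KeyError:
        return


def present_with_in():
    d = {'a': 2}
    if 'a' in d:
        return d['a']


def present_with_try():
    d = {'a': 2}
    try:
        return d['a']
    except KeyError:
        return


results = {}
for func in (missing_with_in, missing_with_try, present_with_in, present_with_try):
    # timeit.timeit accepts a callable; number is the count of calls timed.
    results[func.__name__] = timeit.timeit(func, number=100_000)
    print(f"{func.__name__}: {results[func.__name__]:.4f} s")
```

Comparing the "missing" pair against the "present" pair shows whether raising and catching KeyError is the expensive part on your setup.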
Solution 4 - Python
Here's a working example of a dictionary that automatically adds a value. The demonstration task is finding duplicate files in /usr/include. Note that customizing the dictionary PathDict only requires four lines:
import os

class FullPaths:
    def __init__(self, filename):
        self.filename = filename
        self.paths = set()

    def record_path(self, path):
        self.paths.add(path)

class PathDict(dict):
    def __missing__(self, key):
        ret = self[key] = FullPaths(key)
        return ret

if __name__ == "__main__":
    pathdict = PathDict()
    # Group every path under its bare file name.
    for root, _, files in os.walk('/usr/include'):
        for f in files:
            path = os.path.join(root, f)
            pathdict[f].record_path(path)
    # Report file names that occur in more than one directory.
    for fullpath in pathdict.values():
        if len(fullpath.paths) > 1:
            print("{} located in {}".format(fullpath.filename, ','.join(fullpath.paths)))
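To try the same idea without scanning /usr/include, here is a sketch that builds a small temporary tree instead. The directory layout and file names (`a/stdio.h`, `b/stdio.h`, `a/math.h`) are made up for the demonstration:

```python
import os
import tempfile


class FullPaths:
    def __init__(self, filename):
        self.filename = filename
        self.paths = set()

    def record_path(self, path):
        self.paths.add(path)


class PathDict(dict):
    def __missing__(self, key):
        # Build the value from the missing key and store it.
        ret = self[key] = FullPaths(key)
        return ret


with tempfile.TemporaryDirectory() as top:
    # Made-up layout: stdio.h appears twice, math.h once.
    for rel in ("a/stdio.h", "b/stdio.h", "a/math.h"):
        path = os.path.join(top, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()

    pathdict = PathDict()
    for root, _, files in os.walk(top):
        for f in files:
            pathdict[f].record_path(os.path.join(root, f))

duplicates = sorted(fp.filename for fp in pathdict.values() if len(fp.paths) > 1)
print(duplicates)  # ['stdio.h']
```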
Solution 5 - Python
Another way you can potentially achieve the desired functionality is by using a decorator:
from typing import Any, Callable

def initializer(cls: type):
    def argument_wrapper(*args: Any, **kwargs: Any) -> Callable[[], 'X']:
        # Capture the constructor arguments; build the instance later.
        def wrapper():
            return cls(*args, **kwargs)
        return wrapper
    return argument_wrapper

@initializer
class X:
    def __init__(self, *, some_key: int, foo: int = 10, bar: int = 20) -> None:
        self._some_key = some_key
        self._foo = foo
        self._bar = bar

    @property
    def key(self) -> int:
        return self._some_key

    @property
    def foo(self) -> int:
        return self._foo

    @property
    def bar(self) -> int:
        return self._bar

    def __str__(self) -> str:
        return f'[Key: {self.key}, Foo: {self.foo}, Bar: {self.bar}]'
Then you can create a defaultdict like so:
>>> d = defaultdict(X(some_key=10, foo=15, bar=20))
>>> print(d['baz'])
[Key: 10, Foo: 15, Bar: 20]
>>> print(d['qux'])
[Key: 10, Foo: 15, Bar: 20]
The default_factory will create new instances of X with the specified arguments. Note that the missing key itself ('baz' or 'qux' here) is not passed to the constructor.
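For comparison, the standard library's `functools.partial` gives the same kind of fixed-argument factory without a decorator. A sketch, using a hypothetical undecorated class `Config` in place of `X`; as with the decorator, the missing key is not passed to the constructor:

```python
from collections import defaultdict
from functools import partial


class Config:
    """Hypothetical stand-in for X, left undecorated."""

    def __init__(self, *, some_key, foo=10, bar=20):
        self.some_key = some_key
        self.foo = foo
        self.bar = bar


# partial(...) returns a zero-argument callable, which is what default_factory needs.
d = defaultdict(partial(Config, some_key=10, foo=15))

baz = d["baz"]
qux = d["qux"]
print(baz.some_key, baz.foo, baz.bar)  # 10 15 20
print(baz is qux)                      # False, each missing key gets its own instance
```

The class stays usable as an ordinary constructor, since only the factory passed to the defaultdict is wrapped.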
Of course, this would only be useful if you know that the class will be used in a default_factory. Otherwise, in order to instantiate an individual class you would need to do something like:
x = X(some_key=10, foo=15)()
Which is kind of ugly... If you wanted to avoid this, however, at the cost of some added complexity, you could also add a keyword parameter like factory to the argument_wrapper, which would allow for generic behaviour:
from typing import Any, Callable, Union

def initializer(cls: type):
    def argument_wrapper(
        *args: Any, factory: bool = False, **kwargs: Any
    ) -> Union[Callable[[], 'X'], 'X']:
        def wrapper():
            return cls(*args, **kwargs)
        # factory=True defers construction; otherwise build the instance now.
        if factory:
            return wrapper
        return cls(*args, **kwargs)
    return argument_wrapper
Where you could then use the class like so:
>>> print(X(some_key=10, foo=15))
[Key: 10, Foo: 15, Bar: 20]
>>> d = defaultdict(X(some_key=15, foo=15, bar=25, factory=True))
>>> print(d['baz'])
[Key: 15, Foo: 15, Bar: 25]