Get the name of a pandas DataFrame
PythonPandasDataframeAttributesPython Problem Overview
How do I get the name of a DataFrame and print it as a string?
Example:
boston
(var name assigned to a csv file)
import pandas as pd
boston = pd.read_csv('boston.csv')
print('The winner is team A based on the %s table.) % boston
Python Solutions
Solution 1 - Python
You can name the dataframe with the following, and then call the name wherever you like:
import pandas as pd
df = pd.DataFrame( data=np.ones([4,4]) )
df.name = 'Ones'
print df.name
>>>
Ones
Hope that helps.
Solution 2 - Python
Sometimes df.name
doesn't work.
you might get an error message:
> 'DataFrame' object has no attribute 'name'
try the below function:
def get_df_name(df):
name =[x for x in globals() if globals()[x] is df][0]
return name
Solution 3 - Python
In many situations, a custom attribute attached to a pd.DataFrame
object is not necessary. In addition, note that pandas
-object attributes may not serialize. So pickling will lose this data.
Instead, consider creating a dictionary with appropriately named keys and access the dataframe via dfs['some_label']
.
df = pd.DataFrame()
dfs = {'some_label': df}
Solution 4 - Python
From here what I understand DataFrames are: >DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. You can think of it like a spreadsheet or SQL table, or a dict of Series objects.
And Series are: >Series is a one-dimensional labeled array capable of holding any data type (integers, strings, floating point numbers, Python objects, etc.).
Series have a name
attribute which can be accessed like so:
In [27]: s = pd.Series(np.random.randn(5), name='something')
In [28]: s
Out[28]:
0 0.541
1 -1.175
2 0.129
3 0.043
4 -0.429
Name: something, dtype: float64
In [29]: s.name
Out[29]: 'something'
EDIT: Based on OP's comments, I think OP was looking for something like:
>>> df = pd.DataFrame(...)
>>> df.name = 'df' # making a custom attribute that DataFrame doesn't intrinsically have
>>> print(df.name)
'df'
Solution 5 - Python
Here is a sample function: 'df.name = file` : Sixth line in the code below
def df_list():
filename_list = current_stage_files(PATH)
df_list = []
for file in filename_list:
df = pd.read_csv(PATH+file)
df.name = file
df_list.append(df)
return df_list
Solution 6 - Python
I am working on a module for feature analysis and I had the same need as yours, as I would like to generate a report with the name of the pandas.Dataframe being analyzed. To solve this, I used the same solution presented by @scohe001 and @LeopardShark, originally in https://stackoverflow.com/a/18425523/8508275, implemented with the inspect library:
import inspect
def aux_retrieve_name(var):
callers_local_vars = inspect.currentframe().f_back.f_back.f_locals.items()
return [var_name for var_name, var_val in callers_local_vars if var_val is var]
Note the additional .f_back term since I intend to call it from another function:
def header_generator(df):
print('--------- Feature Analyzer ----------')
print('Dataframe name: "{}"'.format(aux_retrieve_name(df)))
print('Memory usage: {:03.2f} MB'.format(df.memory_usage(deep=True).sum() / 1024 ** 2))
return
Running this code with a given dataframe, I get the following output:
header_generator(trial_dataframe)
> --------- Feature Analyzer ----------
> Dataframe name: "trial_dataframe"
> Memory usage: 63.08 MB