Use TQDM Progress Bar with Pandas

Tags: Python, Pandas, Tqdm

Python Problem Overview


Is it possible to use a tqdm progress bar when importing and indexing large datasets with Pandas?

Here is an example of some 5-minute data that I am importing, converting with to_datetime, and setting as the index. It takes a while, and it would be nice to see a progress bar.

import pandas as pd

# Import the CSV file into a pandas DataFrame, convert the 'Gmt time' column to datetime, and set it as the index
eurusd_ask = pd.read_csv('EURUSD_Candlestick_5_m_ASK_01.01.2012-05.08.2017.csv')
eurusd_ask.index = pd.to_datetime(eurusd_ask.pop('Gmt time'))

Python Solutions


Solution 1 - Python

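iterrows() returns a generator with no length, so pass the row count from df.shape[0] as total:

from tqdm import tqdm

# total lets tqdm show a percentage and an estimated time remaining
for index, row in tqdm(df.iterrows(), total=df.shape[0]):
    print("index", index)
    print("row", row)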

Solution 2 - Python

# Create the bar manually and advance it by one for each row
with tqdm(total=Df.shape[0]) as pbar:
    for index, row in Df.iterrows():
        pbar.update(1)
        ...  # process the row here
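
The same manual-update pattern can be applied to the import step itself by reading the file in chunks. The following is only a sketch under stated assumptions: the in-memory CSV, the column name "Open", and the chunk size of 5 are made up for illustration, and the total row count has to be known (or counted) beforehand.

```python
import io

import pandas as pd
from tqdm import tqdm

# Hypothetical in-memory CSV standing in for the real 5-minute data file.
csv_text = "Gmt time,Open\n" + "".join(
    f"2012-01-01 00:{m:02d}:00,{1.29 + m / 1000:.3f}\n" for m in range(0, 60, 5)
)
total_rows = csv_text.count("\n") - 1  # data rows, excluding the header line

chunks = []
with tqdm(total=total_rows, desc="reading") as pbar:
    # chunksize makes read_csv yield DataFrames of at most 5 rows each
    for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=5):
        chunks.append(chunk)
        pbar.update(len(chunk))  # advance by the number of rows just read

# Combine the chunks, then index by the parsed timestamps as in the question
df = pd.concat(chunks, ignore_index=True)
df.index = pd.to_datetime(df.pop("Gmt time"))
```

For a real file, a much larger chunk size would probably be more sensible, since very small chunks add overhead.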

Solution 3 - Python

There is a workaround for tqdm > 4.24. As per https://github.com/tqdm/tqdm#pandas-integration:

import pandas as pd
from tqdm import tqdm

# Register `pandas.progress_apply` and `pandas.Series.map_apply` with `tqdm`
# (can use `tqdm_gui`, `tqdm_notebook`, optional kwargs, etc.)
tqdm.pandas(desc="my bar!")
# Convert each value to a pandas Timestamp while the bar advances
eurusd_ask['t_stamp'] = eurusd_ask['Gmt time'].progress_apply(lambda x: pd.Timestamp(x))
eurusd_ask.set_index(['t_stamp'], inplace=True)
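
To try progress_apply without the original data file, a small self-contained sketch might look like the following. The sample rows, the "Open" column, and the day-first format string are assumptions for illustration, not taken from the real file.

```python
import pandas as pd
from tqdm import tqdm

# Register progress_apply on pandas objects
tqdm.pandas(desc="parsing timestamps")

# Hypothetical sample rows; the day-first timestamp format is an assumption
df = pd.DataFrame({
    "Gmt time": ["01.01.2012 00:00:00", "01.01.2012 00:05:00", "01.01.2012 00:10:00"],
    "Open": [1.2935, 1.2936, 1.2934],
})

# Parse one value at a time so the bar can advance per row
df["t_stamp"] = df["Gmt time"].progress_apply(
    lambda s: pd.to_datetime(s, format="%d.%m.%Y %H:%M:%S")
)
df = df.drop(columns="Gmt time").set_index("t_stamp")
```

Parsing row by row is generally slower than a single vectorized pd.to_datetime call on the whole column, so the progress bar comes at some cost in speed.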

Solution 4 - Python

You could fill a pandas DataFrame line by line: read the file normally and add each new line as a new row to the DataFrame. This would be a fair bit slower than using pandas' own reading methods.
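
A rough sketch of the line-by-line idea is shown below. Note one deliberate change from the description above: instead of appending to the DataFrame row by row, it collects the rows in a list and builds the DataFrame once at the end, which is usually faster. The in-memory CSV and the "Open" column are hypothetical stand-ins for the real file.

```python
import csv
import io

import pandas as pd
from tqdm import tqdm

# Hypothetical in-memory file; for real data, open the CSV file with newline="" instead.
csv_text = "Gmt time,Open\n2012-01-01 00:00:00,1.2935\n2012-01-01 00:05:00,1.2936\n"

with io.StringIO(csv_text) as handle:
    reader = csv.reader(handle)
    header = next(reader)  # first line holds the column names
    # Without a total, tqdm shows a running row count rather than a percentage
    rows = [row for row in tqdm(reader, desc="rows", unit="row")]

# Build the DataFrame in one step, then fix dtypes and set the datetime index
df = pd.DataFrame(rows, columns=header)
df["Open"] = df["Open"].astype(float)
df.index = pd.to_datetime(df.pop("Gmt time"))
```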

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | sslack88 | View Question on Stack Overflow
Solution 1 - Python | Arjun Kava | View Answer on Stack Overflow
Solution 2 - Python | jmcgrath207 | View Answer on Stack Overflow
Solution 3 - Python | Zeke Arneodo | View Answer on Stack Overflow
Solution 4 - Python | ZeerakW | View Answer on Stack Overflow