Use TQDM Progress Bar with Pandas
Problem Overview
Is it possible to use a tqdm progress bar when importing and indexing large datasets with Pandas?
Here is an example of some 5-minute data that I am importing, indexing, and converting with to_datetime. It takes a while, and it would be nice to see a progress bar.
import pandas as pd

# Import the CSV file into a Pandas DataFrame, convert the time column to datetime, and set it as the index
eurusd_ask = pd.read_csv('EURUSD_Candlestick_5_m_ASK_01.01.2012-05.08.2017.csv')
eurusd_ask.index = pd.to_datetime(eurusd_ask.pop('Gmt time'))
Python Solutions
Solution 1 - Python
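iterrows() returns a generator with no length, so pass the row count from the DataFrame's shape as the total:
from tqdm import tqdm

# df is an existing DataFrame
for index, row in tqdm(df.iterrows(), total=df.shape[0]):
    print("index", index)
    print("row", row)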
Solution 2 - Python
from tqdm import tqdm

# df is an existing DataFrame; advance the bar by one for each row
with tqdm(total=df.shape[0]) as pbar:
    for index, row in df.iterrows():
        pbar.update(1)
        ...
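The same manual-update pattern can be applied to the import step from the question. The sketch below assumes the file is read in chunks with the chunksize argument of pd.read_csv and that the number of data rows is known or counted beforehand. The helper name read_csv_with_progress and the in-memory sample data are hypothetical and only there to make the example runnable.

```python
import io

import pandas as pd
from tqdm import tqdm

# Small in-memory CSV standing in for the real file (made-up values).
csv_text = "Gmt time,Open\n" + "".join(
    f"2012-01-01 00:{m:02d}:00,{1.29 + m / 1000:.3f}\n" for m in range(0, 60, 5)
)


def read_csv_with_progress(source, total_rows, chunksize=4):
    """Hypothetical helper: read a CSV in chunks and advance a tqdm bar per chunk."""
    chunks = []
    with tqdm(total=total_rows, desc="reading") as pbar:
        for chunk in pd.read_csv(source, chunksize=chunksize):
            chunks.append(chunk)
            pbar.update(len(chunk))  # advance by the rows just read
    return pd.concat(chunks, ignore_index=True)


total_rows = csv_text.count("\n") - 1  # data rows, excluding the header line
df = read_csv_with_progress(io.StringIO(csv_text), total_rows)
df.index = pd.to_datetime(df.pop("Gmt time"))
```

For a file on disk, pass the path instead of the StringIO object. The bar advances once per chunk, so a smaller chunksize gives smoother updates at the cost of more overhead.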
Solution 3 - Python
With tqdm > 4.24 you can use its pandas integration, as described at https://github.com/tqdm/tqdm#pandas-integration:
import pandas as pd
from tqdm import tqdm
# Register `pandas.progress_apply` and `pandas.Series.map_apply` with `tqdm`
# (can use `tqdm_gui`, `tqdm_notebook`, optional kwargs, etc.)
tqdm.pandas(desc="my bar!")
# Call pd.Timestamp on each value; returning pd.Timestamp itself would store the class, not a timestamp
eurusd_ask['t_stamp'] = eurusd_ask['Gmt time'].progress_apply(lambda x: pd.Timestamp(x))
eurusd_ask.set_index(['t_stamp'], inplace=True)
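As a quick sanity check, the element-wise conversion with a progress bar can be compared against the vectorized pd.to_datetime call from the question. This is a minimal sketch on made-up sample rows; only the column name 'Gmt time' comes from the question, and the timestamp format is an assumption.

```python
import pandas as pd
from tqdm import tqdm

# Register progress_apply on pandas objects.
tqdm.pandas(desc="parsing timestamps")

# Made-up sample rows; the column name follows the question.
df = pd.DataFrame({
    "Gmt time": ["2012-01-01 00:00:00", "2012-01-01 00:05:00", "2012-01-01 00:10:00"],
    "Open": [1.2940, 1.2941, 1.2939],
})

# Element-wise conversion, one tqdm tick per value.
with_bar = df["Gmt time"].progress_apply(lambda x: pd.Timestamp(x))

# Vectorized conversion, as in the question (no progress bar).
vectorized = pd.to_datetime(df["Gmt time"])

# True when both approaches produce the same timestamps.
same_values = bool((with_bar == vectorized).all())
```

Element-wise apply is generally slower than the vectorized call, so the progress bar trades some speed for visibility.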
Solution 4 - Python
You could fill a pandas DataFrame line by line: read the file normally and add each line as a new row to the DataFrame. This would be a fair bit slower than using Pandas' own reading functions.
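A rough sketch of this idea follows. It differs from the description in one way: the rows are collected in a list and the DataFrame is built once at the end, because appending to a DataFrame row by row copies the data on every step. The helper name read_rows_with_progress and the sample data are hypothetical.

```python
import csv
import io

import pandas as pd
from tqdm import tqdm

# Made-up sample standing in for the real file.
csv_text = (
    "Gmt time,Open\n"
    "2012-01-01 00:00:00,1.2940\n"
    "2012-01-01 00:05:00,1.2941\n"
    "2012-01-01 00:10:00,1.2939\n"
)


def read_rows_with_progress(handle, total_rows=None):
    """Hypothetical helper: read CSV rows one at a time behind a tqdm bar."""
    reader = csv.reader(handle)
    header = next(reader)  # first line holds the column names
    rows = [row for row in tqdm(reader, total=total_rows, desc="rows")]
    return pd.DataFrame(rows, columns=header)


df = read_rows_with_progress(io.StringIO(csv_text), total_rows=3)
df["Open"] = df["Open"].astype(float)  # csv.reader yields strings
df.index = pd.to_datetime(df.pop("Gmt time"))
```

If total_rows is left as None, tqdm still shows a running count and rate, just without a percentage.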