When training a deep learning model, it helps to have a progress bar that estimates how long the process will take. To achieve this, we can use the external Python library tqdm. In this post, we will use tqdm to show a progress bar while loading data in the training loop.


Installation

You can install the tqdm package by running

pip3 install tqdm


Important: This post was created in a Jupyter Notebook. Some progress bars in this post will only show if you're running in a Jupyter Notebook. If you'd like to follow along, you can click Open in Colab and run the code below.


Basic Usage

How to import a tqdm object

  • use from tqdm import tqdm if you're using a terminal

  • use from tqdm.notebook import tqdm if you're using a Jupyter Notebook
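If you are not sure which environment your code will run in, tqdm also offers from tqdm.auto import tqdm (used later in this post), which picks the notebook widget inside Jupyter and falls back to the plain text bar in a terminal. A minimal sketch:

```python
# tqdm.auto selects the right implementation for the environment:
# the notebook widget inside Jupyter, the text bar in a terminal.
from tqdm.auto import tqdm

for i in tqdm(range(1000), desc="Auto-detected bar"):
    pass
```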

Commonly used parameters in a tqdm object

  • total: the total number of expected iterations; specify it when the iterable has no length, or to override it, e.g. 300

  • desc: a description shown next to your progress bar, e.g. "My Progress Bar"

  • disable: set this to True to disable the progress bar

Syntax

tqdm(iterable, total=100, desc="Text you want", disable=False)
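As a sketch of these parameters working together (the verbose flag here is a made-up switch, not part of tqdm):

```python
from tqdm import tqdm

verbose = True  # hypothetical flag controlling whether the bar is shown

# A generator has no len(), so total= tells tqdm how many items to expect.
items = (i * i for i in range(300))
for x in tqdm(items, total=300, desc="My Progress Bar", disable=not verbose):
    pass
```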


Example 1

from tqdm.notebook import tqdm

for i in tqdm(range(int(10e6)), desc="My Progress Bar"):
    pass


Use tqdm to keep track of batches in DataLoader

Step 1: Initiating a DataLoader

We will create a DataLoader with the training split of the ag_news dataset.

from datasets import load_dataset
agnews = load_dataset('ag_news')
train_dataset = agnews['train']
Using custom data configuration default
Downloading and preparing dataset ag_news/default (download: 29.88 MiB, generated: 30.23 MiB, post-processed: Unknown size, total: 60.10 MiB) to /root/.cache/huggingface/datasets/ag_news/default/0.0.0/bc2bcb40336ace1a0374767fc29bb0296cdaf8a6da7298436239c54d79180548...
Dataset ag_news downloaded and prepared to /root/.cache/huggingface/datasets/ag_news/default/0.0.0/bc2bcb40336ace1a0374767fc29bb0296cdaf8a6da7298436239c54d79180548. Subsequent calls will reuse this data.

The dataset has two data fields, text and label.

train_dataset
Dataset({
    features: ['text', 'label'],
    num_rows: 120000
})

Now, we can initiate a DataLoader with the ag_news training data.

from torch.utils.data import DataLoader
train_dataloader = DataLoader(train_dataset,batch_size=64)
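With 120,000 rows and a batch size of 64, the loader yields 1875 batches, which is the total we will pass to tqdm below. A quick sanity check (pure Python, no torch needed):

```python
import math

num_rows = 120_000  # rows in the ag_news training split
batch_size = 64

# DataLoader yields ceil(rows / batch_size) batches; the last one is
# partial unless the row count divides evenly (here it does: 64 * 1875 = 120000).
num_batches = math.ceil(num_rows / batch_size)
print(num_batches)  # 1875
```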

If you need more detailed information about loading datasets and using DataLoader, you can check the Hugging Face Datasets and PyTorch DataLoader documentation.


Step 2: Using tqdm to add a progress bar while loading data

from tqdm.notebook import tqdm

for batch_index, data in tqdm(enumerate(train_dataloader),
                              total=len(train_dataloader),
                              desc="My Progress Bar"):
  text = data['text']
  label = data['label']

  # print batch information every 700 batches
  if batch_index % 700 == 0:
    print(f'\nBatch {batch_index}\n first text: {text[0]},\n first label: {label[0]}')
Batch 0
 first text: Wall St. Bears Claw Back Into the Black (Reuters) Reuters - Short-sellers, Wall Street's dwindling\band of ultra-cynics, are seeing green again.,
 first label: 2

Batch 700
 first text: Sweden to Return Remains of Aborigines (AP) AP - The skeletal remains of 15 Aborigines are being returned home for reburial, nearly 90 years after a Swedish zoologist smuggled them out of Australia for display in a Stockholm museum.,
 first label: 3

Batch 1400
 first text: Kashmiris waiting for festival and peace to come SRINAGAR, Nov. 13 (XinhuaNET) -- Markets are overcrowded, traffic jam is heavy and the shops are jostling with shoppers in the capital city of Srinagar in the Indian-administered Kashmir as theholy Moslem festival of Eid approaches here.,
 first label: 0


Issues: tqdm printing to new line in Jupyter notebook

Case 1: import from tqdm in a Jupyter Notebook

Below is what will happen if we use from tqdm import tqdm instead of from tqdm.notebook import tqdm in a Jupyter Notebook.

from tqdm import tqdm

for batch_index, data in tqdm(enumerate(train_dataloader),
                              total=len(train_dataloader),
                              desc="My Progress Bar"):
  text = data['text']
  label = data['label']

  # print batch information every 700 batches
  if batch_index % 700 == 0:
    print(f'\nBatch {batch_index}\n first text: {text[0]},\n first label: {label[0]}')

My Progress Bar:   1%|▏         | 27/1875 [00:00<00:06, 269.76it/s]
Batch 0
 first text: Wall St. Bears Claw Back Into the Black (Reuters) Reuters - Short-sellers, Wall Street's dwindling\band of ultra-cynics, are seeing green again.,
 first label: 2
My Progress Bar:  39%|███▉      | 730/1875 [00:03<00:04, 236.05it/s]
Batch 700
 first text: Sweden to Return Remains of Aborigines (AP) AP - The skeletal remains of 15 Aborigines are being returned home for reburial, nearly 90 years after a Swedish zoologist smuggled them out of Australia for display in a Stockholm museum.,
 first label: 3
My Progress Bar:  77%|███████▋  | 1442/1875 [00:05<00:01, 241.85it/s]
Batch 1400
 first text: Kashmiris waiting for festival and peace to come SRINAGAR, Nov. 13 (XinhuaNET) -- Markets are overcrowded, traffic jam is heavy and the shops are jostling with shoppers in the capital city of Srinagar in the Indian-administered Kashmir as theholy Moslem festival of Eid approaches here.,
 first label: 0
My Progress Bar: 100%|██████████| 1875/1875 [00:07<00:00, 238.31it/s]

You can see that if you print something within the for loop, the progress bar is redrawn on a new line after every print.
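If you want to print messages from inside the loop without breaking the bar, tqdm provides tqdm.write(), which prints the message above the bar instead of colliding with it:

```python
from tqdm import tqdm

for i in tqdm(range(1000), desc="My Progress Bar"):
    if i % 500 == 0:
        # tqdm.write prints without disturbing the progress bar
        tqdm.write(f"reached iteration {i}")
```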


Case 2: running a Python script that imports tqdm in a Jupyter Notebook

If you are using a GPU cloud platform such as Colab, you may have to run your Python script from inside a Jupyter notebook.

In such a case, you can

  1. use from tqdm.auto import tqdm to import tqdm
  2. use %run instead of !python


Example

Suppose we have a file called example.py with the code below:

# example.py
from tqdm.auto import tqdm

for i in tqdm(range(int(10e6)), desc="My Progress Bar"):
    # print a message every 2,000,000 iterations
    if i % 2000000 == 0:
        print(i, 'sample message')

Here's what we get if we use the !python command to run example.py

You can see that the progress bar renders as plain text and is pushed onto a new line by every printed message.

!python example.py
My Progress Bar:   0% 0/10000000 [00:00<?, ?it/s]0 sample message
My Progress Bar:  20% 1981993/10000000 [00:00<00:03, 2552169.70it/s]2000000 sample message
My Progress Bar:  38% 3807601/10000000 [00:01<00:02, 2602892.40it/s]4000000 sample message
My Progress Bar:  59% 5937689/10000000 [00:02<00:01, 2665162.41it/s]6000000 sample message
My Progress Bar:  78% 7823203/10000000 [00:03<00:00, 2664702.85it/s]8000000 sample message
My Progress Bar: 100% 10000000/10000000 [00:03<00:00, 2601683.79it/s]

To fix this, we can use the %run command instead of !python to run example.py

%run example.py
0 sample message
2000000 sample message
4000000 sample message
6000000 sample message
8000000 sample message

Now the progress bar displays as expected.


Use trange to keep track of epochs

trange(*args, **kwargs) is a shortcut for tqdm(range(*args), **kwargs)


Example

Using tqdm

from tqdm.notebook import tqdm
from time import sleep

for i in tqdm(range(10), desc="Text you want"):
  sleep(.1)

Using trange

from tqdm.notebook import trange

from time import sleep
for i in trange(10, desc="Text you want"):
  sleep(.1)

You can see that the outputs are the same.


Training with multiple epochs

We often train a deep learning model with more than one epoch. In this case, we can use trange to keep track of the progress of epochs.

Example

from tqdm.notebook import trange, tqdm


for i in trange(3, desc='Epoch'):
  print('\nEpoch', i)
  for batch_index, data in tqdm(enumerate(train_dataloader),
                                total=len(train_dataloader),
                                desc="Text You Want",
                                # disable=True,
                                # file=sys.stdout,
                                # initial=1000
                                ):
    text = data['text']
    label = data['label']

    # print batch information every 1000 batches
    if batch_index % 1000 == 0:
      print(f'\nBatch {batch_index}\n first text: {text[0]},\n first label: {label[0]}')

Epoch 0

Batch 0
 first text: Wall St. Bears Claw Back Into the Black (Reuters) Reuters - Short-sellers, Wall Street's dwindling\band of ultra-cynics, are seeing green again.,
 first label: 2

Batch 1000
 first text: Microsoft Set to Deliver New Windows Service Pack Beta Microsoft is poised to deliver a new interim build of its Windows Server 2003 SP1 (Service Pack 1) to testers. Windows Server 2003 SP1 is the server complement to the recently released Windows XP SP2 (Service ,
 first label: 3

Epoch 1

Batch 0
 first text: Wall St. Bears Claw Back Into the Black (Reuters) Reuters - Short-sellers, Wall Street's dwindling\band of ultra-cynics, are seeing green again.,
 first label: 2

Batch 1000
 first text: Microsoft Set to Deliver New Windows Service Pack Beta Microsoft is poised to deliver a new interim build of its Windows Server 2003 SP1 (Service Pack 1) to testers. Windows Server 2003 SP1 is the server complement to the recently released Windows XP SP2 (Service ,
 first label: 3

Epoch 2

Batch 0
 first text: Wall St. Bears Claw Back Into the Black (Reuters) Reuters - Short-sellers, Wall Street's dwindling\band of ultra-cynics, are seeing green again.,
 first label: 2

Batch 1000
 first text: Microsoft Set to Deliver New Windows Service Pack Beta Microsoft is poised to deliver a new interim build of its Windows Server 2003 SP1 (Service Pack 1) to testers. Windows Server 2003 SP1 is the server complement to the recently released Windows XP SP2 (Service ,
 first label: 3
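When nesting bars like this, a new inner bar is created every epoch. Passing leave=False to the inner tqdm clears each finished inner bar, so only the persistent epoch bar remains. A minimal sketch with dummy data standing in for the real DataLoader:

```python
from tqdm.auto import trange, tqdm

for epoch in trange(3, desc='Epoch'):
    # leave=False removes the inner bar once the epoch finishes,
    # keeping the display to a single persistent epoch bar.
    for batch in tqdm(range(100), desc=f'Epoch {epoch} batches', leave=False):
        pass
```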