Build an Image Classification Model using PyTorch

Deep Learning has gained a lot of momentum in the past decade, with applications ranging from speech recognition to medical imaging and even predicting stock prices. Frameworks like PyTorch let us build deep learning models quickly while leveraging the power of graphics processing units (GPUs) and other high-level features!

The National Institute on Deafness and Other Communication Disorders (NIDCD) indicates that the 200-year-old American Sign Language is a very complex language (of which letter gestures are only part) but is the primary language for many deaf North Americans. With that in mind, on September 18th, 2020, IBM Developer Advocates Sidra Ahmed and Fawaz Siddiqi conducted a workshop on how to build and train a deep learning model to classify American Sign Language (ASL) gestures into 29 classes (the 26 letters of the ASL alphabet, plus space, delete, and nothing), which can later be used to help hard-of-hearing people communicate with others as well as with computers.

The webinar was divided into two parts. The first was conducted by Sidra, who explained the foundational concepts behind Artificial Intelligence, Machine Learning, and Deep Learning using a timeline. She also talked about a few interesting applications of Deep Learning, including Natural Language Processing (NLP), drug discovery, and speech recognition.

Furthermore, she gave the audience a technical step-by-step walkthrough of the inner workings of a traditional neural network architecture as well as the Convolutional Neural Network (CNN) architecture.

Lastly, she concluded the first half by discussing PyTorch, an open source deep learning framework and Python-based scientific computing package. She also highlighted the high-level features that make PyTorch a compelling player in the field of deep learning.
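One of the features mentioned in this half, PyTorch's automatic differentiation (autograd), can be illustrated with a tiny sketch (not from the workshop material, just a minimal example):

```python
import torch

# For y = x**2, autograd computes dy/dx = 2x without any manual calculus.
x = torch.tensor(3.0, requires_grad=True)
y = x ** 2
y.backward()        # populates x.grad with dy/dx evaluated at x = 3
print(x.grad)       # tensor(6.)
```

This same mechanism is what lets PyTorch train deep networks: gradients of the loss with respect to every parameter are computed automatically.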

After the first half of the workshop, Fawaz led the hands-on session, which focused on building and training a deep learning model to classify American Sign Language (ASL) gestures into 29 classes (the 26 letters of the ASL alphabet, plus space, delete, and nothing). The dataset used is publicly available on Kaggle. Fawaz ran the notebook cells and did an in-depth walkthrough of each cell, from data exploration through the whole process of model building and training.
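To give a flavor of what such a model looks like, here is a minimal PyTorch CNN sketch for a 29-class classifier. The layer sizes, input resolution, and class count of 29 are assumptions for illustration, not the workshop notebook's exact architecture:

```python
import torch
import torch.nn as nn

class ASLClassifier(nn.Module):
    """Minimal CNN sketch: two conv blocks followed by a linear head.
    Assumes 3x64x64 RGB inputs and 29 output classes
    (26 letters, space, delete, nothing)."""
    def __init__(self, num_classes: int = 29):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # After two 2x2 poolings, a 64x64 image becomes 16x16 with 32 channels.
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = ASLClassifier()
logits = model(torch.randn(1, 3, 64, 64))  # one fake RGB image
print(logits.shape)  # torch.Size([1, 29]) — one score per class
```

Training would then pair this with `nn.CrossEntropyLoss` and an optimizer such as `torch.optim.Adam`, as is standard for multi-class image classification.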

Here’s the flow and architecture of the workshop exercise:

  1. Log in to Watson Studio.
  2. Get your Kaggle API credentials.
  3. Run the Jupyter Notebook in Watson Studio.
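For step 2, the Kaggle API reads credentials from a `kaggle.json` token file. A setup sketch (paths and the dataset slug are assumptions; check the code pattern for the exact dataset used):

```shell
# Download kaggle.json from kaggle.com -> Account -> "Create New API Token",
# then place it where the Kaggle CLI expects it:
mkdir -p ~/.kaggle
mv ~/Downloads/kaggle.json ~/.kaggle/kaggle.json
chmod 600 ~/.kaggle/kaggle.json   # the CLI warns if the file is world-readable

# Install the CLI and pull an ASL alphabet dataset (slug is an assumption):
pip install kaggle
kaggle datasets download -d grassknoted/asl-alphabet
```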

The attendees followed along with the code pattern and interacted really well. Fawaz and Sidra answered questions throughout the session, keeping it highly engaging.

At the end of the webinar, the attendees were introduced to a few more tools and frameworks that can be used to build deep learning models, as well as some IBM Developer resources. Many attendees gave positive feedback and appreciation for the webinar, wanting to learn more about the use case at hand.

Interested in doing this by yourself?

Sign up for IBM Cloud here

Find the hands-on code pattern here

You can find the workshop material (slides) here