Lecture 4 - Image Classification with Convolutional Neural Networks

Livestream Link: https://www.youtube.com/watch?v=TN9fMYQxw4E

Lecture Timing: June 13 (Sat), 8:30 AM PST/9:00 PM IST

Notebooks & References:

If the input images are in color, is it better to convert them to black & white before training the model, or to train on the color images?


Hello, I did not formally register, but I have submitted assignments 1-3, and plan to submit the remaining ones. Will I still get a certificate at the end?

It depends on your goal and the scenario.

  • Color is usually better for results.
  • B/W is better for speed, and when color is not helping in any way.
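
To make the speed argument concrete, here is a minimal sketch in plain Python (no PyTorch needed). It assumes a tiny made-up 2x2 image and uses the common ITU-R BT.601 luma weights for the conversion. The helper name `to_grayscale` is hypothetical, not something from the lecture notebook.

```python
# Illustrative only: convert a tiny RGB "image" to grayscale in plain Python.
# The weights are the common ITU-R BT.601 luma coefficients (an assumption;
# other conversions exist).

def to_grayscale(rgb_image):
    """Map each (r, g, b) pixel to a single brightness value."""
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in rgb_image
    ]

# A hypothetical 2x2 color image with channel values in [0, 255].
rgb_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

gray_image = to_grayscale(rgb_image)

# Count how many numbers the model would receive per image.
values_rgb = sum(len(pixel) for row in rgb_image for pixel in row)
values_gray = sum(len(row) for row in gray_image)
print(values_rgb, values_gray)
```

The grayscale version carries a third of the input values, which is where the speed-up comes from. The trade-off is that the hue information is gone, so this only makes sense when color does not help distinguish the classes.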

Yes, since you’re completing the assignments.


@PrajwalPrashanth Please explain the conv2d and conv3d layers specifically, and when to use each one.
Also, can you go through the documentation at https://pytorch.org/docs/master/generated/torch.nn.Conv2d.html so that we learn how to read this kind of math-heavy documentation?

Conv2d - For images
Conv3d - For videos
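
One way to get comfortable with the math in the Conv2d docs is to turn the output-size formula into a few lines of code. The sketch below is plain Python, not PyTorch itself. The function name `conv_output_size` and the example sizes are made up for illustration. The formula is the one from the "Shape" section of the documentation, applied to one spatial dimension at a time.

```python
# Output-size formula from the Conv2d/Conv3d "Shape" section, for ONE dimension:
#   out = floor((in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
# Conv2d applies it to height and width; Conv3d applies it to depth
# (e.g. video frames), height and width.

def conv_output_size(size_in, kernel_size, stride=1, padding=0, dilation=1):
    """Hypothetical helper: output length along a single dimension."""
    return (size_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

# Image case (Conv2d input is N x C x H x W): a 32x32 image, 3x3 kernel.
same_size = conv_output_size(32, kernel_size=3, stride=1, padding=1)
halved = conv_output_size(32, kernel_size=3, stride=2, padding=1)

# Video case (Conv3d input is N x C x D x H x W): 16 frames, kernel depth 3.
frames_out = conv_output_size(16, kernel_size=3)

print(same_size, halved, frames_out)
```

So the only real difference between the two layers is how many dimensions the kernel slides over: two (height, width) for images, three (depth/time, height, width) for videos or volumetric data.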


This is tough :sweat_smile: I always use Google to find blogs and videos that explain them in simple words.

Visualizations are great tools for understanding.


How significant is it to add non-linearity to a model?
And is it necessary to add it with every kind of layer and in every type of model?

Okay, but these blogs don’t help me that much.

Highly significant. Non-linearity is one of the reasons why deep learning models achieve good results.

Can you elaborate on this?

Regarding the kaggle competition:

  1. Is there any way to “commit” a local notebook to Kaggle?
  2. How do I download the dataset to run it locally? I see there’s an option for “Download all”, but does Kaggle provide some sort of API to download it directly in a notebook?

Non-linearity is very important. Think about a function f that takes an input x and outputs y, i.e. f(x) = y. If f is linear, all you can get is a linear combination of the input, for example f(x) = A*x + B. Stacking more linear layers does not help, because a composition of linear functions is still linear. Now think about a task in which you want to map x = image_of_a_monkey to the label “monkey”. The input space consists of images and the output space consists of names, and the relationship between them is far too complex for a linear map, so it becomes imperative to use non-linear functions that can do the task.
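
A small sketch of the point about linear functions, in plain Python with scalar inputs. The coefficients and the helper names (`linear`, `stacked_linear`, `stacked_relu`) are made up for illustration. ReLU is used here as the non-linearity, which is an assumption about which activation you pick.

```python
# Two stacked linear functions collapse into one linear function,
# but putting a ReLU between them produces something no single
# A*x + B can reproduce.

def linear(a, b):
    """Return the function x -> a*x + b."""
    return lambda x: a * x + b

def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)

f = linear(2.0, 1.0)    # first "layer":  2x + 1
g = linear(-3.0, 4.0)   # second "layer": -3x + 4

def stacked_linear(x):
    # g(f(x)) = -3*(2x + 1) + 4 = -6x + 1, still of the form A*x + B
    return g(f(x))

def collapsed(x):
    # The single linear function equivalent to the two stacked layers
    return -6.0 * x + 1.0

def stacked_relu(x):
    # Same layers, with a non-linearity in between
    return g(relu(f(x)))

xs = [-2.0, -1.0, 0.0, 1.0]
print([stacked_linear(x) for x in xs])
print([stacked_relu(x) for x in xs])
```

With the ReLU in place the output is flat for x <= -0.5 and then slopes downward, so it has a "kink". A model built only from linear layers can never produce that shape, no matter how many layers it has.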


Yes, https://github.com/Kaggle/kaggle-api contains all the information you asked for and more.

Can the concept of adding non-linearity always be used in models, whether they are purely ML-based or purely DL-based?
Do different types of layers (sequential, convolution, activation, cropping) affect this concept, or vice versa?

Can we use:

    classes = ("../train")  # since we set path to data dir

instead of:

    classes = os.listdir(data_dir + "/train")
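
A quick way to check this is to run both lines and print the results. The sketch below builds a throwaway folder so that it runs anywhere. The class folder names `cat` and `dog` are hypothetical stand-ins for whatever is inside your own `train` directory.

```python
import os
import tempfile

# Build a temporary data_dir/train/<class> layout (hypothetical class names).
with tempfile.TemporaryDirectory() as data_dir:
    for name in ("cat", "dog"):
        os.makedirs(os.path.join(data_dir, "train", name))

    # Parentheses around a single string do nothing: this is just a str.
    as_string = ("../train")

    # os.listdir actually reads the folder and returns the entry names.
    # Its order is arbitrary, so sort it for a stable result.
    as_listing = sorted(os.listdir(data_dir + "/train"))

print(type(as_string).__name__, as_string)
print(as_listing)
```

So the two are not interchangeable: the first gives you the path text itself, while the second gives you the list of class folder names, which is what is wanted for `classes`.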


Why are the Featured Blogs not there?
They are missing :upside_down_face:

I am not sure I understand this notation:


What is this star for?

What does the random_seed function do? How does this work?

We already have the batch_size variable

Do we have to download the data each time we run the notebook? Is there a way to find and use the already existing dataset on Kaggle?