Share Your Work - Assignment 3

Share your notebooks and blog posts from the third assignment here.

IMPORTANT NOTE: There was an error in the assignment notebook. We were using the full dataset for training, validation and test loaders.

Please update this cell in your notebook: https://jovian.ml/aakashns/03-cifar10-feedforward/v/6#C29
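Roughly, the fix amounts to something like this (the exact split sizes in the linked cell may differ) - the key point is that the three loaders must wrap three different datasets:

```python
from torch.utils.data import DataLoader, random_split
from torchvision.datasets import CIFAR10
from torchvision.transforms import ToTensor

# Training + validation come from the 50k training split, test from the 10k test split
dataset = CIFAR10(root='data/', download=True, transform=ToTensor())
test_dataset = CIFAR10(root='data/', train=False, transform=ToTensor())

val_size = 5000                      # assumed split; check the linked cell
train_size = len(dataset) - val_size
train_ds, val_ds = random_split(dataset, [train_size, val_size])

batch_size = 128
train_loader = DataLoader(train_ds, batch_size, shuffle=True, num_workers=4, pin_memory=True)
val_loader = DataLoader(val_ds, batch_size * 2, num_workers=4, pin_memory=True)
test_loader = DataLoader(test_dataset, batch_size * 2, num_workers=4, pin_memory=True)
```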

5 Likes

Here’s my 3rd assignment notebook.

https://jovian.ml/sebgolos/03-cifar10-feedforward

Managed to achieve 55.46% accuracy, but I think I’m gonna push it to the limits later.

I wanted to use conv layers, but I felt that would be cheating :stuck_out_tongue: So this is linear layers only.

BTW, I highly suggest using a local setup. The GPU on Kaggle seems to be a bit slow.

11 Likes

My Assignment 3

I will work more on this over the week, adjusting the hyperparameters. I used a local setup.

Hey, here is my submission for the 3rd assignment.

I used Google Colab.
My model has 3 layers (1024, 1024, 10), trained in stages with learning rates 0.1, 0.01, 0.001, 0.0001 for 60, 25, 20, 10 epochs respectively.

Here are the results:
accuracy: 57%, loss: 2.16
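In case it helps anyone reproduce this, the staged training boils down to something like the following (a sketch, assuming the `fit(epochs, lr, model, train_loader, val_loader)` helper from the assignment notebook):

```python
# Train the same model in four stages with decreasing learning rates.
history = []
for epochs, lr in [(60, 0.1), (25, 0.01), (20, 0.001), (10, 0.0001)]:
    history += fit(epochs, lr, model, train_loader, val_loader)
```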

4 Likes

Hi everyone,

here is my notebook for the 3rd Assignment!
I used a feedforward model with 3 hidden layers. The layer sizes of my model are [3x32x32, 1024, 512, 64, 10], and I used the Adam optimizer.
I managed to achieve an 80% accuracy on the test data.
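For reference, a model with those layer sizes looks roughly like this (the ReLU activations and the 1e-3 learning rate are assumptions on my part; see the notebook for the exact code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CIFAR10FeedForward(nn.Module):
    """Feedforward net with layer sizes [3*32*32, 1024, 512, 64, 10]."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(3 * 32 * 32, 1024)
        self.fc2 = nn.Linear(1024, 512)
        self.fc3 = nn.Linear(512, 64)
        self.fc4 = nn.Linear(64, 10)

    def forward(self, xb):
        out = xb.view(xb.size(0), -1)   # flatten the 3x32x32 images
        out = F.relu(self.fc1(out))
        out = F.relu(self.fc2(out))
        out = F.relu(self.fc3(out))
        return self.fc4(out)            # raw logits for cross-entropy

model = CIFAR10FeedForward()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```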

Thanks for the lecture today, it was amazing to build my first deep neural network!

Aris

12 Likes

Some great work up there. Mine’s here

3 Likes

Hi everyone, model accuracy is heavily dependent on your epochs and learning rates. I achieved 99.25% accuracy with my model after tweaking the hyperparameters a few times. I’ve attempted to explain why in my Medium blog post here. You can also find my notebook here. I will probably experiment with using neural networks for pneumonia image classification, so stay tuned throughout the rest of the week.

EDIT: As stated below by @aakashns, there was an error in the data loaders. For anyone who reviewed my notebook and blog post, I apologize. I should’ve been a bit more suspicious, since the model was showing the same accuracy for both training and testing. The new accuracy of my model is around 55%, which is much more understandable since the model only looks at individual pixels rather than the full picture. I assume this is why we have convolutional networks. Again, I apologize for any misunderstanding. I’ve updated my notebook here and my blog post here.

12 Likes

Wow, that’s insane! And I thought my final accuracy was “good” :sweat:. Is there a reason for the input sizes you chose for your hidden layers (1536, 768, 384, 128)?

Node sizes between layers should fall between the size of your input and the size of your output, so in this case somewhere between 10 and 3072. However, I don’t think node sizes matter as much as the optimal number of epochs combined with the correct learning rate. With the correct number of epochs and learning rate, I could probably create a one-layer model that would achieve a comparable accuracy.

EDIT: Above, I was speaking about the initial accuracy I had come up with, which was about 99%. That was incorrect due to the data loader error. My new accuracy was around 55%. I still believe that a single-layer network with the correct hyperparameters could achieve a comparable accuracy of around 55%.

1 Like

Hey guys,
My submission: https://jovian.ml/glmiotto/03-cifar10-feedforward

I tried four models. The last one got 99.69% accuracy; however, since I didn’t experiment much with varying epochs, this was likely a bit brute-force, as I used a 6-layer structure (output sizes 1024, 512, 256, 128, 32, 10).

I was afraid the model would not focus on the important stuff with these node sizes but it worked out, though training was understandably a bit slow.

I used decreasing learning rates, starting fairly high at 0.6 and going down to 0.005 until the final plateau, always 5-10 epochs per learning rate. I would love some input on how to design a good learning-rate / epoch progression in a way that uses fewer layers.
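For what it’s worth, one way to get that kind of decaying schedule without hand-picking every stage is a built-in scheduler. A rough sketch (the `train_one_epoch` helper is hypothetical, standing in for one pass over the training data):

```python
import torch

# Decay the learning rate geometrically each epoch instead of picking stages
# by hand; 0.6 * 0.91**50 ends up around the 0.005 floor mentioned above.
optimizer = torch.optim.SGD(model.parameters(), lr=0.6)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.91)

for epoch in range(50):
    train_one_epoch(model, train_loader, optimizer)  # hypothetical helper
    scheduler.step()                                 # lr *= 0.91 after each epoch
```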

Edit: well, duh. I did notice that the final validation results were identical to the test set results, but didn’t really look into why - I guess the 99% accuracy was too juicy to question. Turns out the loaders all had the same data…

I’ll re-submit today with the update @aakashns, thank you for clarifying.

3 Likes

I’m sorry, but there is literally no way you can achieve 99% using feedforward networks (the state of the art on CIFAR-10 is 99.37%; in fact, if your result were correct, it would be the second-best result ever reported). You must be doing something wrong here.

1 Like

Remember to DETACH the validation loss! You guys achieving 99+% are overfitting! @allenkong221 @glmiotto
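i.e. something like this in the validation step (a sketch based on the model class in the assignment notebook; `accuracy` is the helper defined there):

```python
# Method on the model class from the notebook
def validation_step(self, batch):
    images, labels = batch
    out = self(images)                   # forward pass
    loss = F.cross_entropy(out, labels)
    acc = accuracy(out, labels)
    # Detach so the computation graph isn't kept alive across validation batches
    return {'val_loss': loss.detach(), 'val_acc': acc}
```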

5 Likes

Damn, that’s a good catch!

I hadn’t noticed this myself. I usually apply torch.no_grad() to everything that shouldn’t participate in gradient computation, but since this model was provided, I assumed it was correct :stuck_out_tongue:

Now I’ll have to fight a bit again :smiley:
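Roughly what I mean, sketched against the `evaluate` helper from the notebook:

```python
import torch

@torch.no_grad()                 # no computation graph is built inside here
def evaluate(model, val_loader):
    model.eval()
    outputs = [model.validation_step(batch) for batch in val_loader]
    return model.validation_epoch_end(outputs)
```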

1 Like

Updated my notebook… And achieved 99.804% :open_mouth:
Either I’ve built a state-of-the-art network, or something is wrong:

  • this is not the real CIFAR10 dataset for some reason,
  • the grads are still leaking somewhere.

https://jovian.ml/sebgolos/03-cifar10-feedforward

1 Like

IMPORTANT NOTE: There was an error in the assignment notebook. We were using the full dataset for training, validation and test loaders.

Please update this cell in your notebook: https://jovian.ml/aakashns/03-cifar10-feedforward/v/6#C29

3 Likes

I was about to comment that… XD

@Sebgolos @bryankeithflynn @baroooo @richardso21 @anivorlis @jazz215 @allenkong221 @glmiotto @vijayabhaskar96

Please see my update above. There was an error in the assignment notebook. Please try again with the updated data loaders.

2 Likes

@aakashns Please also update the notebook and announce that the validation loss should be detached.

That shouldn’t be a problem, as far as I can tell. Detaching the loss should not affect model performance.

4 Likes

Hey guys, I did a little tweaking around, but I can’t seem to get the network beyond 49%. If it can do better, what else can I try? And what’s the approximate upper limit, so I know when it’s time to venture into new architectures?

(I’ve detached the loss as @vijayabhaskar96 suggested, and I’ve also wrapped the validation part in torch.no_grad() - if that’s correct.)

Note: The data loader changes mentioned by @aakashns have been taken into account.

Accuracy: 51% (after a bit more tweaking)

Here’s my work (I forgot to upload it when I first posted the question :stuck_out_tongue:):

1 Like