Share Your Work - Assignment 3

Hey guys,
My submission:

I tried four models. The last one got 99.69% accuracy, though since I didn’t experiment much with varying epochs, this was likely a bit brute-force: I used a 6-layer structure (output sizes 1024, 512, 256, 128, 32, 10).

I was afraid the model would not focus on the important stuff with these node sizes but it worked out, though training was understandably a bit slow.

I used decreasing learning rates, starting fairly high at 0.6 and going down to 0.005 at the final plateau, with 5-10 epochs per learning rate. Would love some input on how to design a good learning-rate / epoch-count progression in a way that uses fewer layers.
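For reference, here’s a minimal sketch of the kind of staged schedule I mean; the exact stages and the training-epoch body are placeholders, and the model is a dummy stand-in:

```python
import torch
import torch.nn as nn

# Dummy stand-in model, just to show the schedule mechanics
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

# Staged schedule: start high, shrink toward the plateau,
# with 5-10 epochs per learning rate
schedule = [(0.6, 5), (0.1, 5), (0.02, 10), (0.005, 10)]

for lr, epochs in schedule:
    # Fresh optimizer per stage (you could also mutate param_groups["lr"])
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for epoch in range(epochs):
        pass  # run your usual training epoch here with this optimizer
```

PyTorch also ships built-in schedulers (e.g. `torch.optim.lr_scheduler.StepLR`) that can replace the manual stage list.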

Edit: well, duh. I did notice that the last training-validation set results were equal to the testing set results but didn’t really look into why - I guess the 99% accuracy was too juicy to question. Turns out the loaders all had the same data…

I’ll re-submit today with the update @aakashns, thank you for clarifying.


I’m sorry, but there is literally no way you can achieve 99% using feedforward networks (the state of the art itself is 99.37% on CIFAR-10; in fact, your result would be the 2nd state-of-the-art result if it were correct). You must be doing something wrong here.

1 Like

Remember to DETACH the validation loss! You guys achieving 99+% are overfitting! @allenkong221 @glmiotto


Damn, that’s a good catch!

I hadn’t noticed this myself. I usually apply torch.no_grad() to everything that shouldn’t participate in gradient computation, but since this model was provided, I assumed it was correct :stuck_out_tongue:
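For anyone else fixing this, a sketch of a validation step that does both things mentioned in this thread (wrap evaluation in torch.no_grad() and detach the loss before accumulating it); the model/loader names are placeholders, not the assignment’s exact code:

```python
import torch
import torch.nn.functional as F

def evaluate(model, val_loader):
    model.eval()
    losses, correct, total = [], 0, 0
    with torch.no_grad():  # no graph is built here, so nothing accumulates
        for xb, yb in val_loader:
            out = model(xb)
            loss = F.cross_entropy(out, yb)
            # .detach() is redundant inside no_grad(), but it is what saves
            # you if you accumulate losses outside a no_grad() block
            losses.append(loss.detach())
            correct += (out.argmax(dim=1) == yb).sum().item()
            total += yb.size(0)
    return torch.stack(losses).mean().item(), correct / total
```

Without the detach, each stored loss keeps its whole computation graph alive, which bloats memory (it doesn’t change the numbers, as noted later in the thread).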

Now I’ll have to fight a bit again :smiley:

1 Like

Updated my notebook… and achieved 99.804% :open_mouth:
Either I’ve built a state-of-the-art network, or there is something wrong:

  • this is not the real CIFAR10 dataset for some reason,
  • the grads are still leaking somewhere.


IMPORTANT NOTE: There was an error in the assignment notebook. We were using the full dataset for training, validation and test loaders.

Please update this cell in your notebook:
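(I don’t have the exact replacement cell, but the gist of the fix is that the three loaders must not share data. A hedged sketch of the split logic with stand-in tensors; sizes mirror CIFAR-10’s 50,000 training images, and the real notebook would load torchvision’s CIFAR10 dataset instead:)

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Stand-in for the CIFAR10 training set: 50,000 samples with dummy
# features (real images are 3x32x32 tensors loaded via torchvision).
dataset = TensorDataset(torch.randn(50_000, 1), torch.randint(0, 10, (50_000,)))

# The key fix: train and validation must be DISJOINT subsets of the
# training data, and the test loader must use the held-out test split,
# not the same dataset as the other two.
val_size = 5_000
train_ds, val_ds = random_split(dataset, [len(dataset) - val_size, val_size])

train_loader = DataLoader(train_ds, batch_size=128, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=256)
```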


I was about to comment that… XD

@Sebgolos @bryankeithflynn @baroooo @richardso21 @anivorlis @jazz215 @allenkong221 @glmiotto @vijayabhaskar96

Please see my update above. There was an error in the assignment notebook. Please try again with the updated data loaders.


@aakashns Please also update and announce to detach the validation loss function

That shouldn’t be a problem, as far as I can tell. Detaching the loss should not affect model performance.


Hey guys, I did a little tweaking and can’t seem to get the network beyond 49%. If it can do any better, what else can I try? And what’s the approximate upper limit, so I know when it’s time to venture into new architectures?

(I’ve detached the loss as @vijayabhaskar96 reminded, and I’ve also wrapped the validation part in torch.no_grad() - if that’s correct)

Note: Changes done to the dataset as mentioned by @aakashns have been taken into account

Accuracy: 51% (after a bit more tweaking)

Here’s my work (forgot to upload it at the time of posting the question :stuck_out_tongue:):

1 Like

The new notebook is giving some error:

in forward(self, xb)
     10     out = xb.view(xb.size(0), -1)
     11     # Get intermediate outputs using hidden layer
---> 12     out = self.linear1(xb)
     13     # Apply activation function
     14     out = F.relu(out)

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/ in __call__(self, *input, **kwargs)
    548         result = self._slow_forward(*input, **kwargs)
    549     else:
--> 550         result = self.forward(*input, **kwargs)
    551     for hook in self._forward_hooks.values():
    552         hook_result = hook(self, input, result)

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/ in forward(self, input)
     86     def forward(self, input):
---> 87         return F.linear(input, self.weight, self.bias)
     89     def extra_repr(self):

/opt/conda/lib/python3.7/site-packages/torch/nn/ in linear(input, weight, bias)
   1610         ret = torch.addmm(bias, input, weight.t())
   1611     else:
-> 1612         output = input.matmul(weight.t())
   1613     if bias is not None:
   1614         output += bias

RuntimeError: size mismatch, m1: [24576 x 32], m2: [3072 x 500] at /opt/conda/conda-bld/pytorch_1587428398394/work/aten/src/THC/generic/

What’s your model architecture?
There’s some problem with the input sizes.

Generally, when you get a mismatch error, all you have to care about is that b = c:

m1 is [a x b] which is [batch size x in features]

m2 is [c x d] which is [in features x out features]
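The traceback above fits that pattern: line 10 computes the flattened tensor `out`, but line 12 passes the un-flattened `xb` to `self.linear1`. A sketch of the corrected forward (the 500-unit hidden size comes from the `m2: [3072 x 500]` in the error; the second layer is a placeholder):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CIFAR10Model(nn.Module):
    def __init__(self):
        super().__init__()
        # in_features must match the flattened image: 3 * 32 * 32 = 3072
        self.linear1 = nn.Linear(3 * 32 * 32, 500)
        self.linear2 = nn.Linear(500, 10)

    def forward(self, xb):
        out = xb.view(xb.size(0), -1)  # flatten [bs, 3, 32, 32] -> [bs, 3072]
        out = self.linear1(out)        # the bug was self.linear1(xb), which
        out = F.relu(out)              # fed the un-flattened tensor in
        return self.linear2(out)

model = CIFAR10Model()
logits = model(torch.randn(4, 3, 32, 32))  # output shape [4, 10]
```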

I restarted and the problem seems to have gone away. Thanks for your reply.

Here is my attempt at assignment 3: Assignment 3 Notebook. I have tried different architectures (3, 4, and 5 layers), but the test accuracy stays around 50%. It’s like the model doesn’t want to learn anymore. :slight_smile:

1 Like

Hello guys, just want to ask a silly question: how do I get the
DATASET_URL from Kaggle? What I normally do is open a notebook in Kaggle.
Can anyone help? ahha

Here is the link to my assignment 3.
I tried adding more hidden layers and different learning rates, but I could not get beyond 52%. The more epochs I ran, the more it stayed constant, so I don’t think increasing epochs will help here.

Please check and provide feedback if any.

1 Like

Here is a link: I am assuming you want to use Kaggle data in Colab.


Alright, thanks. I guess it’s the same thing using a notebook.