Lecture 1 - PyTorch Basics & Linear Regression

Why is squared loss used as opposed to the absolute value?

You could use the absolute value as well, but squaring exaggerates the larger errors. For instance, an error of 2 becomes 4 when squared, and an error of 3 becomes 9. The larger an error is, the more exaggerated it becomes; that way, it is easier to spot which predictions cause the bulk of the loss and hence which weights need to be adjusted.
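To make the comparison concrete, here is a tiny sketch in plain Python. The error values are made up for illustration and are not taken from the lecture notebook.

```python
# Hypothetical per-prediction errors (prediction minus target).
errors = [0.5, 2.0, 3.0]

abs_losses = [abs(e) for e in errors]  # absolute-value loss per prediction
sq_losses = [e ** 2 for e in errors]   # squared loss per prediction

for e, a, s in zip(errors, abs_losses, sq_losses):
    print(f"error={e}  absolute={a}  squared={s}")
```

The errors of 2 and 3 grow to 4 and 9 when squared, while the error of 0.5 shrinks to 0.25.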



Try running it on Colab.

At the same time, if the difference between the prediction and the desired output falls below 1, the loss becomes a bit less useful, because squaring anything below 1 makes the number even smaller (and so provides less information on how to change the weights). In this setting it was OK, because the numbers were usually above 1. But I’ve noticed that when working, for example, with normalized images (with colors in the 0 to 1 or -1 to 1 range), MSE loss starts to struggle to converge toward a minimum. This can become a problem when working with a sigmoid or tanh activation function.
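One way to look at this point is through the size of the gradient with respect to the error: for a squared error e² it is 2e, which shrinks as e gets small, while for |e| it has constant magnitude 1. A rough sketch with made-up error values (not from the notebook):

```python
# Hypothetical errors, two of them below 1 and one above.
errors = [0.1, 0.5, 2.0]

# d/de of e**2 is 2*e: it shrinks together with the error.
sq_grads = [2 * e for e in errors]

# d/de of |e| is the sign of e: its size does not depend on the error.
abs_grads = [1.0 if e > 0 else -1.0 if e < 0 else 0.0 for e in errors]

for e, gs, ga in zip(errors, sq_grads, abs_grads):
    print(f"error={e}  squared-loss grad={gs}  absolute-loss grad={ga}")
```

This only shows the per-error gradient sizes; whether it explains slow convergence in a given model is a separate question.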

When the gradient is negative, we increase the value of the weights by adding a portion of the gradient to the weights, and when the gradient is positive, we decrease the value of the weights by subtracting a portion of the gradient from the weights. Is my statement correct?
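A small sketch of the update may help here. In practice both cases are handled by one rule: subtract the learning rate times the gradient, so a negative gradient raises the weight and a positive one lowers it. The one-weight model, the data point and the learning rate below are assumptions chosen for illustration.

```python
def grad(w, x=2.0, y=6.0):
    # Derivative of the squared error (w*x - y)**2 with respect to w.
    return 2 * x * (w * x - y)

def step(w, lr=0.05):
    # One rule for both signs: subtract a portion of the gradient.
    return w - lr * grad(w)

w_low, w_high = 1.0, 5.0  # the loss is smallest at w = 3

print(grad(w_low), step(w_low))    # negative gradient, weight goes up
print(grad(w_high), step(w_high))  # positive gradient, weight goes down
```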

For anyone struggling with the online notebooks it may be worth a shot running them locally on your machine. I’ve thrown together a quick guide on how I installed on Windows using VSCode, if anyone is interested.


Conda on Windows is not very stable with PyTorch, especially when we need to use GPUs, which is going to be the case for the next lectures. It has numerous compatibility issues with the CUDA libraries necessary to make it work.

May I suggest that you not run things locally; otherwise you can spend a lot of time trying to configure things and make them work instead of focusing on the code and the concepts from the lecture. I just forked the notebooks and am running them right here on Jovian, which uses Binder as a kernel where things are already configured.

Just a suggestion from having spent numerous hours trying to run things on Windows.


  • The next thing I did was to follow the GitHub link, where I found this solution.

Use .retain_grad() if you want the gradient for a non-leaf Tensor. Or make sure you have the leaf Tensor if you have a non-leaf Tensor by mistake.

  • Then I searched for information about leaf and non-leaf tensors but could not find enough; it would be great if someone could help me with this as well.

/srv/conda/envs/notebook/lib/python3.7/site-packages/torch/tensor.py:746: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won’t be populated during autograd.backward(). If you indeed want the gradient for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations.
warnings.warn("The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad "
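For anyone hitting the same warning, here is a minimal sketch of the difference. A tensor you create directly with requires_grad=True is a leaf; a tensor produced by an operation on it is a non-leaf, and its .grad is only kept if you call .retain_grad() before backward. The names w, y and z are made up for this example.

```python
import torch

w = torch.tensor(3.0, requires_grad=True)  # created directly: a leaf tensor
y = w * 2                                  # result of an operation: non-leaf
y.retain_grad()                            # ask autograd to keep y.grad
z = y ** 2
z.backward()

print(w.is_leaf, y.is_leaf)  # True False
print(w.grad)                # dz/dw = 8 * w = 24
print(y.grad)                # dz/dy = 2 * y = 12
```

Without the y.retain_grad() line, reading y.grad gives None and triggers the warning quoted above.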


About the questions in “Further Reading”

I don’t know if the answer could be:
“you can’t backward on many non-leaf tensors”, because “grad can be implicitly created only for scalar outputs” (I don’t know what that means either).

So a solution could be to execute y.sum().backward()

But is this what you expect? I’m trying to work through these end questions.
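A short sketch of what that error message refers to, assuming a small made-up tensor x: calling backward() with no arguments only works on a single number (a scalar), so on a vector output it raises the error, while reducing the vector with sum() first works.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

y = x * 2  # y has three elements, so it is not a scalar
try:
    y.backward()
    error_message = None
except RuntimeError as exc:
    # Expected: "grad can be implicitly created only for scalar outputs"
    error_message = str(exc)
print(error_message)

y = x * 2           # recompute, then reduce to a scalar before backward
y.sum().backward()
print(x.grad)       # each element of x contributes 2 to the sum
```

Whether this is the answer the notebook's question is looking for is a guess; it only shows that y.sum().backward() avoids the error.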

Yes! To minimise the loss as evident from the weight-loss graph.

I got it to work by running this command in the environment:
conda install -c defaults intel-openmp -f

Also make sure Conda is updated to the latest version.


That should not cause an issue @edsenmichaelcy. Depending on your operating system, some underlying dependencies of PyTorch/Jupyter may not get installed on your system. But the notebook should work.

Hi @alvertosk84 and @danny, thanks for reporting your errors. Try the solution shared by @Luay: conda install -c defaults intel-openmp -f

