Lecture 1 - PyTorch Basics & Linear Regression

Hello, I just finished going through the first notebook, 01-pytorch-basics, and I was trying some things when I encountered this warning. I am a beginner in PyTorch.

What I did up to this point:

  • The next thing I did was to follow the GitHub link from the warning, and I found this solution:

Use .retain_grad() if you want the gradient for a non-leaf Tensor. Or make sure you access the leaf Tensor if you have a non-leaf Tensor by mistake.

  • Then I searched about leaf and non-leaf tensors but could not find enough information; it would be great if someone could help me with this as well.

Steps to reproduce:
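Something along these lines (a minimal sketch, not my exact cell, roughly following the y = w * x + b example from the notebook) produces it:

import torch

# leaf tensors: created directly by the user
w = torch.tensor(3., requires_grad=True)
b = torch.tensor(1., requires_grad=True)
x = torch.tensor(2.)

# non-leaf tensor: the result of an operation on other tensors
y = w * x + b
y.backward()

print(w.grad)  # tensor(2.) -- populated, because w is a leaf
print(y.grad)  # None -- accessing this prints the warning below, because y is not a leaf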

Warning message:

/srv/conda/envs/notebook/lib/python3.7/site-packages/torch/tensor.py:746: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won’t be populated during autograd.backward(). If you indeed want the gradient for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations.
warnings.warn("The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad "


About the questions in “Further Reading”

I don’t know if the answer could be:
“you can’t call backward on non-scalar tensors”, because “grad can be implicitly created only for scalar outputs” (I don’t know what that means either).

So a solution could be to execute y.sum().backward()

But is this what you expect? I’m trying to work through the questions at the end.

How do you know if I watched the lecture 100%?

What should I do when Binder takes a long time to start an environment?

Yes! To minimise the loss, as is evident from the loss vs. weight graph.

I got it to work by running this command in the environment:
conda install -c defaults intel-openmp -f

Also make sure Conda is updated to the latest version.


It’s all about being genuine. If you haven’t watched the lecture, you won’t be able to work on the assignments if you’re a beginner. One thing leads to another @anis-bensaci8

It is for us to learn. That’s why it’s free!
Getting only a certificate doesn’t mean anything if you can’t apply the skills.


That should not cause an issue @edsenmichaelcy. Depending on your operating system, some underlying dependencies of PyTorch/Jupyter may not get installed on your system. But the notebook should still work.

Hi @alvertosk84 and @danny, thanks for reporting your errors. Try the solution shared by @Luay: conda install -c defaults intel-openmp -f

@jazz215 There’s no confirmation. Also, we’ve made marking attendance optional for lecture 1. cc @viratsatheesh29

There’s an option to vote to indicate that you have watched the video, and if you already have the knowledge you can go ahead and complete the assignment.


@kumarsuraj9450 Binder takes 2-10 minutes to install your dependencies. There may also be some queueing time since it is a free service. Generally, I click “Run on Binder” as soon as I open a notebook, and then I read through the notebook on Jovian, while it loads up on Binder. Hope that helps!


Yes, as far as I know, you can only run .backward on scalars (i.e. single numbers).
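For example (a quick sketch of my own, not from the notebook):

import torch

x = torch.randn(3, requires_grad=True)
y = x * 2            # y is a vector, not a scalar

# y.backward()       # RuntimeError: grad can be implicitly created only for scalar outputs
y.sum().backward()   # reduce to a scalar first, then backward works
print(x.grad)        # tensor([2., 2., 2.])

(Alternatively, you can pass an explicit gradient argument, e.g. y.backward(torch.ones(3)), which comes to the same thing here.)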


@PrajwalPrashanth I fixed it already, thanks :smile:

May I ask, in the linear regression chapter, how do we get w11 and b1 in an equation such as

yield_apple  = w11 * temp + w12 * rainfall + w13 * humidity + b1

I don’t understand how w11 and b1 are obtained.

@edsenmichaelcy This is an assumption we are making: that the yield of apples (the output variable) is a weighted sum of the temperature, rainfall and humidity (the input variables). The numbers w11, w12, etc. are the weights we give to each input variable. The goal of linear regression is to figure out a good set of weights.
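To make that concrete, here is a tiny sketch with made-up numbers (the weights and inputs below are placeholders, not values from the notebook):

# made-up weights and bias for the apple-yield equation
w11, w12, w13 = 0.3, 0.2, 0.5    # weights for temp, rainfall, humidity
b1 = 10.0                        # bias

temp, rainfall, humidity = 73.0, 67.0, 43.0
yield_apple = w11 * temp + w12 * rainfall + w13 * humidity + b1
print(yield_apple)               # 66.8 -- training adjusts w11, w12, w13, b1 until such predictions match the data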


Another reason for squaring the loss is that the absolute value function is not differentiable at its minimum (it has a sharp corner there), whereas a quadratic function is differentiable at all points.
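A small sketch of my own to illustrate: the gradient of the squared error shrinks smoothly as the prediction approaches the target, while the gradient of the absolute error stays at ±1 right up to the corner.

import torch

target = torch.tensor(3.0)
for guess in [2.0, 2.9, 2.99]:
    p1 = torch.tensor(guess, requires_grad=True)
    ((p1 - target) ** 2).backward()     # squared error
    p2 = torch.tensor(guess, requires_grad=True)
    (p2 - target).abs().backward()      # absolute error
    print(guess, p1.grad.item(), p2.grad.item())
# squared-error gradients: -2.0, -0.2, -0.02 (shrink near the minimum)
# absolute-error gradients: -1.0, -1.0, -1.0  (constant until the kink)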


Thanks for giving a detailed description @akashdeep-ghosh!

Basically it goes something like this:

  • If you track the dependencies, you’ll realize that y depends on (x, w, b) and z depends on y.
  • This is why “w” and “b” are called leaf tensors (they are not computed from any other tensor), while “y” and “z” are non-leaf tensors.
  • When you call z.backward(), PyTorch implicitly works backwards through y to compute w.grad and b.grad (this is the chain rule of differentiation, if you remember it from calculus).
  • Now, the warning simply indicates that .grad is only populated for leaf tensors like w and b. If you also want the gradient of a non-leaf tensor like y, call y.retain_grad() before the backward pass; otherwise accessing y.grad just returns None and prints this warning.
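Here’s a rough sketch of those dependencies (my own toy example, not the exact notebook cell):

import torch

x = torch.tensor(2.)
w = torch.tensor(3., requires_grad=True)   # leaf
b = torch.tensor(1., requires_grad=True)   # leaf

y = w * x + b       # non-leaf: depends on (x, w, b)
y.retain_grad()     # ask autograd to keep y.grad as well
z = 2 * y           # non-leaf: depends on y

z.backward()
print(w.grad)       # tensor(4.) = dz/dy * dy/dw = 2 * x   (chain rule)
print(b.grad)       # tensor(2.) = dz/dy * dy/db = 2 * 1
print(y.grad)       # tensor(2.) = dz/dy, available only because of retain_grad()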

@jonathanloscalzo hope this also answers your question from the “Further Reading” section.

Indeed, Windows has caused us a lot of issues in the past. That’s the main reason we run everything on the cloud.

Technically, we always subtract the gradient (scaled by the learning rate; in the examples below the learning rate is taken as 1 to keep the arithmetic simple).

  • Suppose the weight is 1.5 and the gradient is 0.5 (positive). Then we need to decrease the weight to decrease the loss, right? By subtracting the gradient, new weight = 1.5 - 0.5 = 1.0
  • Suppose the weight is 1.5 and the gradient is -0.5 (negative). Then we need to increase the weight to decrease the loss, right? By subtracting the gradient, new weight = 1.5 - (-0.5) = 1.5 + 0.5 = 2.0
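In code, the same two cases (a sketch; lr = 1 as above, whereas in practice it’s a small value like 1e-5):

weight, lr = 1.5, 1.0

for grad in [0.5, -0.5]:
    new_weight = weight - lr * grad
    print(grad, "->", new_weight)
# 0.5  -> 1.0   (positive gradient: weight decreases)
# -0.5 -> 2.0   (negative gradient: weight increases)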

Thanks, I got it :grin:


I am getting NaN when computing the loss using PyTorch. Can someone help me with this and explain why it occurs?

How does one know how big the batch size should be for a particular training dataset?

When we initialize the weights and bias with torch.randn, how do we pick the size of the tensor with respect to inputs and outputs?
Ex:
w = torch.randn(2, 3, requires_grad=True)
Where do the values 2 and 3 come from?

We have 3 inputs (temp, rainfall, humidity) and we want 2 outputs (crop yield for apples, crop yield for oranges), so the weight matrix has 2 rows and 3 columns. Tell me if that answers your question.
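A quick shape check (a sketch, assuming 5 rows of training data):

import torch

inputs = torch.randn(5, 3)                  # 5 rows, 3 columns: temp, rainfall, humidity
w = torch.randn(2, 3, requires_grad=True)   # 2 outputs (apples, oranges) x 3 inputs
b = torch.randn(2, requires_grad=True)      # one bias per output

preds = inputs @ w.t() + b                  # (5, 3) @ (3, 2) + (2,) -> (5, 2)
print(preds.shape)                          # torch.Size([5, 2])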