UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed

Question

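I am building logistic regression from scratch in PyTorch, but I run into a problem when updating the trainable parameters (the weights and bias). Here is my implementation: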

class LogisticRegression():
    
    def __init__(self, n_iter, lr):
        self.n_iter = n_iter
        self.lr = lr
    
    def fit(self, dataset):
        device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        n = next(iter(dataset))[0].shape[1]
        self.w = torch.zeros(n, requires_grad=True).to(device)
        self.b = torch.tensor(0., requires_grad=True).to(device)
        
        for i in range(self.n_iter):
            with tqdm(total=len(dataset)) as pbar:
                for x, y in dataset:
                    x = x.to(device)
                    y = y.to(device)
                    y_pred = self.predict(x.float())
                    loss = self.loss(y, y_pred)
                    loss.backward()
                    with torch.no_grad():
                        print(self.w, self.b)
                        self.w -= self.w.grad * self.lr
                        self.b -= self.b.grad * self.lr
                        self.w.grad.zero_()
                        self.b.grad.zero_()
                    pbar.update(1)
            print(f'Epoch: {i} | Loss: {loss}')
    
    def loss(self, y, y_pred):
        y_pred = torch.clip(y_pred, 1e-7, 1 - 1e-7)
        return -torch.mean(
                y * torch.log(y_pred + 1e-7) + 
                (1 - y) * torch.log(1 - y_pred + 1e-7),
            axis=0)
    
    def predict(self, x):
        return self.sigmoid(torch.matmul(x, self.w) + self.b)
    
    def sigmoid(self, x):
        return 1/(1 + torch.exp(-x))

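As you can see, when I fit the model to the dataset, I initialize the weights and bias with zeros and set requires_grad=True so that I can access the gradients later. I used the sklearn breast cancer dataset: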

X, y = load_breast_cancer(return_X_y=True) # load dataset
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2) # train test split

# convert all numpy arrays to torch tensor
x_train = torch.tensor(x_train)
x_test = torch.tensor(x_test)
y_train = torch.tensor(y_train)
y_test = torch.tensor(y_test)

# Making it a Torch dataset then into DataLoader
train_dataset = torch.utils.data.TensorDataset(x_train, y_train)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=32)

test_dataset = torch.utils.data.TensorDataset(x_test, y_test)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=32)

log = LogisticRegression(n_iter=10, lr=0.001)
log.fit(train_loader)

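As soon as I fit the logistic regression on the dataset, it gives me the error below (I also added a print statement inside the logistic regression just before the gradient update, and the output clearly shows that both tensors have a grad_fn):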

tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], device='cuda:0', grad_fn=<ToCopyBackward0>) tensor(0., device='cuda:0', grad_fn=<ToCopyBackward0>)

TypeError: unsupported operand type(s) for *: 'NoneType' and 'float'

Just before this error, it emits this UserWarning:

UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the gradient for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations.

I need help resolving this error so that the gradient update works and the model trains successfully.

Answer 1

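The features of the breast cancer dataset span a very wide range of values, roughly from 0.001 to 1000, and their variance is large as well. This affects the gradients: when they become too large, training becomes unstable and eventually produces NaN. To remove this dependence on feature scale, the usual practice is to normalize the data after splitting, for example: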

from sklearn import preprocessing
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split 

X, y = load_breast_cancer(return_X_y=True) # load dataset
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2) # train test split

scaler = preprocessing.StandardScaler().fit(x_train)  # computing mean and variance of train data
x_train = scaler.transform(x_train) # normalizing train data
x_test = scaler.transform(x_test)   # normalizing test data based on statistics of train

After that, training should be stable.
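
Separately from scaling, the printed `grad_fn=<ToCopyBackward0>` in the question suggests that `self.w` and `self.b` are the outputs of `.to(device)` rather than leaf tensors, which would explain why their `.grad` is `None`. Below is a minimal sketch of one common way around this, assuming the parameters can be created directly on the target device. The random batch, the seed, and `n_features = 30` are illustrative stand-ins, not the question's data, and `torch.sigmoid` with `binary_cross_entropy` is swapped in for the hand-written loss only to keep the example short.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
n_features = 30  # assumption: matches the 30 features of the breast cancer dataset

# Create the parameters directly on the target device, so they stay leaf tensors.
# Calling .to(device) afterwards on a different device would return a non-leaf copy.
w = torch.zeros(n_features, requires_grad=True, device=device)
b = torch.tensor(0., requires_grad=True, device=device)

# Hypothetical random batch, used only to exercise one update step.
x = torch.randn(32, n_features, device=device)
y = torch.randint(0, 2, (32,), device=device).float()

y_pred = torch.sigmoid(x @ w + b)
loss = F.binary_cross_entropy(y_pred, y)
loss.backward()  # populates w.grad and b.grad because w and b are leaves

grads_populated = w.grad is not None and b.grad is not None

with torch.no_grad():
    # Manual gradient descent step, as in the question's fit loop.
    w -= 0.001 * w.grad
    b -= 0.001 * b.grad
    w.grad.zero_()
    b.grad.zero_()

print(w.is_leaf, b.is_leaf, grads_populated)
```

In the question's `fit`, this would correspond to passing `device=device` to `torch.zeros` and `torch.tensor` instead of calling `.to(device)` on the result.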

python

logistic-regression

deep-learning

pytorch