使用 XLNet 进行情绪分析——设置正确的重塑参数

Using XLNet for sentiment analysis - setting the correct reshape parameters

this link之后,我正在尝试使用自己的数据进行情绪分析。但是我得到这个错误:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<timed exec> in <module>

<ipython-input-41-5f2f35b7976e> in train_epoch(model, data_loader, optimizer, device, scheduler, n_examples)
      7 
      8     for d in data_loader:
----> 9         input_ids = d["input_ids"].reshape(4,64).to(device)
     10         attention_mask = d["attention_mask"].to(device)
     11         targets = d["targets"].to(device)

RuntimeError: shape '[4, 64]' is invalid for input of size 64

当我尝试运行这个代码时

history = defaultdict(list)
best_accuracy = 0

for epoch in range(EPOCHS):
    print(f'Epoch {epoch + 1}/{EPOCHS}')
    print('-' * 10)

    train_acc, train_loss = train_epoch(
        model,
        train_data_loader,     
        optimizer, 
        device, 
        scheduler, 
        len(df_train)
    )

    print(f'Train loss {train_loss} Train accuracy {train_acc}')

    val_acc, val_loss = eval_model(
        model,
        val_data_loader, 
        device, 
        len(df_val)
    )

    print(f'Val loss {val_loss} Val accuracy {val_acc}')
    print()

    history['train_acc'].append(train_acc)
    history['train_loss'].append(train_loss)
    history['val_acc'].append(val_acc)
    history['val_loss'].append(val_loss)

我知道这个错误与我的数据形状有关,但我不确定如何找到正确的 reshape 参数来完成这项工作。

但是您还没有发布样本数据,但很明显您是如何使用 reshape function 的。关于你的问题 reshape d["input_ids"] 变成 (4,64) 那么 d["input_ids"] 应该是一个大小为 256 的数组,但实际上在你正在喂养的数据集中尺寸为 64

的模型

所以你需要根据你的数据如何重塑 d["input_ids"] 之类的 (1,64) or (2,32) or (4,16) 等,它的倍数是 64。

只是说明相同:

>>> a = np.arange(256).reshape(4,64)
>>> a
array([[  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,
         13,  14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,
         26,  27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,
         39,  40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50,  51,
         52,  53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63],
       [ 64,  65,  66,  67,  68,  69,  70,  71,  72,  73,  74,  75,  76,
         77,  78,  79,  80,  81,  82,  83,  84,  85,  86,  87,  88,  89,
         90,  91,  92,  93,  94,  95,  96,  97,  98,  99, 100, 101, 102,
        103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115,
        116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127],
       [128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140,
        141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153,
        154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166,
        167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179,
        180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191],
       [192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204,
        205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217,
        218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230,
        231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243,
        244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255]])

您示例中的形状 [4,64] 实际上是 [批量大小,max_sequence_length]

所以也许你可以用你的价值观替换它们...