Size of the training data of GPT2-XL pre-trained model

With the Hugging Face transformers library, it is possible to use the pre-trained GPT2-XL language model, but I can't find which dataset it was trained on. Is it the same trained model that OpenAI used for their paper (trained on the 40GB dataset called WebText)?

The GPT2-XL model is the largest (1542M parameters) of the four architectures detailed in the paper you linked. It was trained on the same data as the other three, namely the WebText dataset you mention.
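For reference, here is a minimal sketch of loading that checkpoint with the Hugging Face transformers library. The hub model name is `gpt2-xl`; note that the count reported by `num_parameters()` may differ slightly from the paper's 1542M figure depending on how embedding weights are counted.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load the pre-trained GPT2-XL checkpoint (trained by OpenAI on WebText).
tokenizer = GPT2Tokenizer.from_pretrained("gpt2-xl")
model = GPT2LMHeadModel.from_pretrained("gpt2-xl")

# Roughly 1.5B parameters -- the "1542M" model from the GPT-2 paper.
print(f"{model.num_parameters():,} parameters")
```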