AttributeError: 'GPT2TokenizerFast' object has no attribute 'max_len'

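I'm using the Hugging Face transformers library and get the following message when running run_lm_finetuning.py: AttributeError: 'GPT2TokenizerFast' object has no attribute 'max_len'. Has anyone else run into this problem, or does anyone know how to fix it? Thanks!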

My full experiment run:

mkdir experiments

for epoch in 5
do
python run_lm_finetuning.py \
--model_name_or_path distilgpt2 \
--model_type gpt2 \
--train_data_file small_dataset_train_preprocessed.txt \
--output_dir experiments/epochs_$epoch \
--do_train \
--overwrite_output_dir \
--per_device_train_batch_size 4 \
--num_train_epochs $epoch
done

The GitHub issue "AttributeError: 'BertTokenizerFast' object has no attribute 'max_len'" contains the fix:

The run_language_modeling.py script is deprecated in favor of language-modeling/run_{clm, plm, mlm}.py.

If not, the fix is to change max_len to model_max_length.
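If you have to keep a script that still reads max_len, a small compatibility helper is one way to apply the rename described above. This is only a sketch: tokenizer_max_length is a hypothetical helper name, not part of transformers, and the stand-in objects below are used instead of real tokenizers so the snippet runs without downloading a model. It assumes newer tokenizers expose model_max_length while older ones only expose max_len.

```python
from types import SimpleNamespace


def tokenizer_max_length(tokenizer):
    """Return the tokenizer's maximum sequence length.

    Hypothetical helper: prefers the newer `model_max_length` attribute
    and falls back to the older `max_len` name if that is all there is.
    """
    if hasattr(tokenizer, "model_max_length"):
        return tokenizer.model_max_length
    return tokenizer.max_len


# Stand-ins for real tokenizer objects (assumption: only the attribute name differs).
new_style = SimpleNamespace(model_max_length=1024)
old_style = SimpleNamespace(max_len=512)

print(tokenizer_max_length(new_style))
print(tokenizer_max_length(old_style))
```

In the fine-tuning script itself, the same idea means replacing each tokenizer.max_len with tokenizer.model_max_length (or with a call to a helper like the one above).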

I solved it with this command:

pip install transformers==3.0.2