How to use a language model for prediction after fine-tuning?
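I trained/fine-tuned a Spanish RoBERTa model that was recently pre-trained for a variety of NLP tasks other than text classification.
Since the baseline model seems promising, I want to fine-tune it for a different task: text classification, more precisely sentiment analysis of Spanish tweets, and use it to predict labels for tweets I have scraped.
Preprocessing and training seem to work fine. However, I don't know how to use the model for prediction afterwards.
I'll leave out the preprocessing part, since I don't think there is a problem with it.
Code: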
# Training with native TensorFlow
import tensorflow as tf
from transformers import TFAutoModelForSequenceClassification
## Model Definition
model = TFAutoModelForSequenceClassification.from_pretrained("BSC-TeMU/roberta-base-bne", from_pt=True, num_labels=3)
## Model Compilation
optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5)
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
metric = tf.metrics.SparseCategoricalAccuracy()
model.compile(optimizer=optimizer,
              loss=loss,
              metrics=metric)
## Fitting the data
# The dataset is already batched, so no separate batch_size argument is needed
history = model.fit(train_dataset.shuffle(1000).batch(64), epochs=3)
Output:
/usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py:337: UserWarning: Passing `gradient_checkpointing` to a config initialization is deprecated and will be removed in v5 Transformers. Using `model.gradient_checkpointing_enable()` instead, or if you are using the `Trainer` API, pass `gradient_checkpointing=True` in your `TrainingArguments`.
"Passing `gradient_checkpointing` to a config initialization is deprecated and will be removed in v5 "
Some weights of the PyTorch model were not used when initializing the TF 2.0 model TFRobertaForSequenceClassification: ['roberta.embeddings.position_ids']
- This IS expected if you are initializing TFRobertaForSequenceClassification from a PyTorch model trained on another task or with another architecture (e.g. initializing a TFBertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFRobertaForSequenceClassification from a PyTorch model that you expect to be exactly identical (e.g. initializing a TFBertForSequenceClassification model from a BertForSequenceClassification model).
Some weights or buffers of the TF 2.0 model TFRobertaForSequenceClassification were not initialized from the PyTorch model and are newly initialized: ['classifier.dense.weight', 'classifier.dense.bias', 'classifier.out_proj.weight', 'classifier.out_proj.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Epoch 1/5
16/16 [==============================] - 35s 1s/step - loss: 1.0455 - sparse_categorical_accuracy: 0.4452
Epoch 2/5
16/16 [==============================] - 18s 1s/step - loss: 0.6923 - sparse_categorical_accuracy: 0.7206
Epoch 3/5
16/16 [==============================] - 18s 1s/step - loss: 0.3533 - sparse_categorical_accuracy: 0.8885
Epoch 4/5
16/16 [==============================] - 18s 1s/step - loss: 0.1871 - sparse_categorical_accuracy: 0.9477
Epoch 5/5
16/16 [==============================] - 18s 1s/step - loss: 0.1031 - sparse_categorical_accuracy: 0.9714
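Question:
How can I use the model after fine-tuning it for text classification/sentiment analysis? (I want to create a predicted label for each tweet I scraped.)
What would be a good way to approach this?
I tried saving the model, but I don't know where to find it and how to use it afterwards: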
# Save the model
model.save_pretrained('Twitter_Roberta_Model')
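On where the saved model ends up: `save_pretrained` is given a relative path here, so the directory should be created under the current working directory of the Python process. Below is a minimal sketch, using only the standard library, that resolves that location and reports which of the expected files are present. The file names `config.json` and `tf_model.h5` are what TensorFlow models in `transformers` typically write; treat them as an assumption for your installed version. `describe_saved_model` is a hypothetical helper, not part of any library.

```python
import os

# Files a TensorFlow model saved with save_pretrained() is expected to contain.
# Assumption: the exact names may differ between transformers versions.
EXPECTED_FILES = ("config.json", "tf_model.h5")

def describe_saved_model(save_dir):
    """Return the absolute path of save_dir and which expected files exist in it."""
    abs_dir = os.path.abspath(save_dir)  # relative paths resolve against os.getcwd()
    present = {
        name: os.path.isfile(os.path.join(abs_dir, name))
        for name in EXPECTED_FILES
    }
    return abs_dir, present

abs_dir, present = describe_saved_model("Twitter_Roberta_Model")
print(abs_dir)   # where the model directory should be
print(present)   # which of the expected files were found
```

If both files are reported as present, passing the same directory name to `TFAutoModelForSequenceClassification.from_pretrained` should reload the fine-tuned model.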
I also tried adding it to a HuggingFace pipeline as shown below, but I'm not sure whether this works correctly.
from transformers import AutoTokenizer, pipeline
classifier = pipeline('sentiment-analysis',
                      model=model,
                      tokenizer=AutoTokenizer.from_pretrained("BSC-TeMU/roberta-base-bne"))
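For reading the pipeline's result: a text-classification pipeline returns one dictionary per input with a `label` and a `score`, and when the model config has no custom `id2label`, the labels are generic names such as `LABEL_0`. The sketch below maps those generic names to readable ones in plain Python. The mapping in `ID2NAME` (0 = negative, 1 = neutral, 2 = positive) is an assumption and has to match the integer encoding used for the training labels. `readable_labels` is a hypothetical helper, and `sample_output` is made-up illustrative data, not real model output.

```python
# Hypothetical mapping; it must match the label encoding used during training.
ID2NAME = {0: "negative", 1: "neutral", 2: "positive"}

def readable_labels(pipeline_output):
    """Convert [{'label': 'LABEL_2', 'score': 0.9}, ...] into (name, score) pairs."""
    results = []
    for item in pipeline_output:
        class_id = int(item["label"].rsplit("_", 1)[1])  # 'LABEL_2' -> 2
        results.append((ID2NAME[class_id], item["score"]))
    return results

# Made-up data in the shape a classification pipeline returns.
sample_output = [
    {"label": "LABEL_2", "score": 0.91},
    {"label": "LABEL_0", "score": 0.67},
]
print(readable_labels(sample_output))
```

With the real pipeline, the call would presumably look like `readable_labels(classifier(list_of_tweets))`, where `list_of_tweets` is a list of strings.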
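Although this example is for a specific model (DistilBert), the prediction code below should work similarly (with small modifications for your needs). You only need to replace the DistilBert classes with the ones for your model (TFAutoModelForSequenceClassification) and, of course, make sure to use the right tokenizer.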
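import tensorflow as tf
from transformers import DistilBertTokenizerFast, TFDistilBertForSequenceClassification

# Tokenizer matching the base checkpoint (assumed: it is not defined in the original snippet)
tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-uncased')
# Rebuild the architecture, then load the fine-tuned weights saved earlier
loaded_model = TFDistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased')
loaded_model.load_weights('./distillbert_tf.h5')
input_text = "The text on which I test"
input_text_tokenized = tokenizer.encode(input_text,
                                        truncation=True,
                                        padding=True,
                                        return_tensors="tf")
prediction = loaded_model(input_text_tokenized)
prediction_logits = prediction[0]  # the first output element holds the logits
prediction_probs = tf.nn.softmax(prediction_logits, axis=1).numpy()
print(f'The prediction probs are: {prediction_probs}')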
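The snippet above stops at probabilities, while the question asks for one label per tweet. As a minimal sketch of that last step, the code below applies softmax and argmax to a batch of logits in plain Python, without TensorFlow, so the logic is easy to inspect. `LABEL_NAMES` is a hypothetical ordering (index 0 = negative, 1 = neutral, 2 = positive) that has to match the training labels, and `example_logits` are made-up numbers, not model output.

```python
import math

# Hypothetical label order; it must match the integer labels used for training.
LABEL_NAMES = ["negative", "neutral", "positive"]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_labels(batch_logits):
    """Return one (label name, probability) pair per row of logits."""
    predictions = []
    for logits in batch_logits:
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)  # argmax
        predictions.append((LABEL_NAMES[best], probs[best]))
    return predictions

# Made-up logits for two inputs with three classes each.
example_logits = [[-1.2, 0.3, 2.5], [1.9, 0.1, -0.7]]
for label, prob in predict_labels(example_logits):
    print(label, round(prob, 3))
```

With the real model, something like `predict_labels(prediction_logits.numpy().tolist())` should give the same kind of result; `tf.argmax(prediction_logits, axis=1)` is the equivalent one-liner if only the class index is needed.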