How can I use a HuggingFace model for prediction after fine-tuning?


I’ve trained/fine-tuned a Spanish RoBERTa model that has recently been pre-trained for a variety of NLP tasks except for text classification.

Since the baseline model seems to be promising, I want to fine-tune it for a different task: text classification, more precisely, sentiment analysis of Spanish Tweets and use it to predict labels on scraped tweets I have.

The preprocessing and the training seem to work correctly. However, I don’t know how I can use this mode afterwards for prediction.

I’ll leave out the preprocessing part because I don’t think there seems to be an issue.


# Training with native TensorFlow 
from transformers import TFAutoModelForSequenceClassification

## Model Definition
model = TFAutoModelForSequenceClassification.from_pretrained("BSC-TeMU/roberta-base-bne", from_pt=True, num_labels=3)

## Model Compilation
optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5)
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
metric = tf.metrics.SparseCategoricalAccuracy()

## Fitting the data 
history =, epochs=3, batch_size=64)


/usr/local/lib/python3.7/dist-packages/transformers/ UserWarning: Passing `gradient_checkpointing` to a config initialization is deprecated and will be removed in v5 Transformers. Using `model.gradient_checkpointing_enable()` instead, or if you are using the `Trainer` API, pass `gradient_checkpointing=True` in your `TrainingArguments`.
  "Passing `gradient_checkpointing` to a config initialization is deprecated and will be removed in v5 "
Some weights of the PyTorch model were not used when initializing the TF 2.0 model TFRobertaForSequenceClassification: ['roberta.embeddings.position_ids']
- This IS expected if you are initializing TFRobertaForSequenceClassification from a PyTorch model trained on another task or with another architecture (e.g. initializing a TFBertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFRobertaForSequenceClassification from a PyTorch model that you expect to be exactly identical (e.g. initializing a TFBertForSequenceClassification model from a BertForSequenceClassification model).
Some weights or buffers of the TF 2.0 model TFRobertaForSequenceClassification were not initialized from the PyTorch model and are newly initialized: ['classifier.dense.weight', 'classifier.dense.bias', 'classifier.out_proj.weight', 'classifier.out_proj.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Epoch 1/5
16/16 [==============================] - 35s 1s/step - loss: 1.0455 - sparse_categorical_accuracy: 0.4452
Epoch 2/5
16/16 [==============================] - 18s 1s/step - loss: 0.6923 - sparse_categorical_accuracy: 0.7206
Epoch 3/5
16/16 [==============================] - 18s 1s/step - loss: 0.3533 - sparse_categorical_accuracy: 0.8885
Epoch 4/5
16/16 [==============================] - 18s 1s/step - loss: 0.1871 - sparse_categorical_accuracy: 0.9477
Epoch 5/5
16/16 [==============================] - 18s 1s/step - loss: 0.1031 - sparse_categorical_accuracy: 0.9714


How can I use the model after fine-tuning for text classification/sentiment analysis? (I want to create a predicted label for each tweet I scraped.)

What would be a good way of approaching this?

I’ve tried to save the model, but I don’t know where I can find it and use then:

# Save the model

I’ve also tried to just add it to a HuggingFace pipeline like the following. But I’m not sure if this works correctly.

classifier = pipeline('sentiment-analysis', 


Although this is an example for a specific model (DistilBert), the following prediction code should work similarly (small modifications according to your needs). You just need to replace the distillbert according to your model (TFAutoModelForSequenceClassification) and of course ensure the proper tokenizer is used.

    loaded_model = TFDistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased')
    input_text = "The text on which I test"
    input_text_tokenized = tokenizer.encode(input_text,
    prediction = loaded_model(input_text_tokenized)
    prediction_logits = prediction[0]
    prediction_probs = tf.nn.softmax(prediction_logits,axis=1).numpy()
    print(f'The prediction probs are: {prediction_probs}')

Answered By – Timbus Calin

This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply

(*) Required, Your email will not be published