Paraphrase detection model

Hello, I want to test the pretrained model for paraphrase detection.
I could not find the model. Should I train it myself on the dataset from paraphraser.ru?
I guess I should change some paths in the config paraphraser_rubert.json and then run:
python -m deeppavlov predict deeppavlov/configs/classifiers/paraphraser_rubert.json

Hi @alissiawells,
Sorry for the late reply.
You can download the model itself and required files by running

python -m deeppavlov download paraphraser_rubert

To evaluate it on the paraphraser.ru dataset you can run

python -m deeppavlov evaluate [-d] paraphraser_rubert

(-d downloads the model if you have not run the download command earlier)

If you want to run the model on your own data, you can do it from Python code:

from deeppavlov import build_model, configs

# download=True is optional; it fetches the model files if you have not downloaded them yet
model = build_model(configs.classifiers.paraphraser_rubert, download=True)

print(model([text_a_1, text_a_2, text_a_3, ...],
            [text_b_1, text_b_2, text_b_3, ...]))

You can also run the model on all of your data, split into batches:

model.batched_call(list_of_texts_a, list_of_texts_b, batch_size=64)
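
To illustrate what batching means here, below is a minimal sketch that splits two parallel lists into batches and calls a model on each batch. Note that call_in_batches and fake_model are hypothetical names used only for this illustration and are not part of DeepPavlov; with a real model you would simply use model.batched_call as shown above.

```python
from typing import Callable, Sequence


def call_in_batches(model: Callable[[Sequence[str], Sequence[str]], list],
                    texts_a: Sequence[str],
                    texts_b: Sequence[str],
                    batch_size: int = 64) -> list:
    """Call `model` on consecutive slices of two parallel lists and join the results."""
    if len(texts_a) != len(texts_b):
        raise ValueError("texts_a and texts_b must have the same length")
    results = []
    for start in range(0, len(texts_a), batch_size):
        end = start + batch_size
        # Each call sees at most `batch_size` pairs.
        results.extend(model(texts_a[start:end], texts_b[start:end]))
    return results


def fake_model(batch_a, batch_b):
    # Hypothetical stand-in for a built model: 1 if the two texts are identical, else 0.
    return [int(a == b) for a, b in zip(batch_a, batch_b)]


predictions = call_in_batches(fake_model,
                              ["a", "b", "c"],
                              ["a", "x", "c"],
                              batch_size=2)
print(predictions)
```

The two lists must stay aligned: the i-th element of the first list is compared with the i-th element of the second.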

Hope this helps

Thank you! Does the library now support returning a metric or probabilities at inference time?

To get probabilities, add "return_probas": true to the block with "class_name": "bert_classifier" in your configuration file.
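
If you prefer to patch the config from code rather than edit the file by hand, here is a minimal sketch. It assumes the pipeline components are listed under config["chainer"]["pipe"]; enable_probas and toy_config are hypothetical names for illustration, and the toy config is not the real paraphraser_rubert config.

```python
import copy


def enable_probas(config: dict, class_name: str = "bert_classifier") -> dict:
    """Return a copy of `config` with "return_probas": true set on matching blocks."""
    patched = copy.deepcopy(config)
    # Assumption: components are listed under config["chainer"]["pipe"].
    for block in patched["chainer"]["pipe"]:
        if block.get("class_name") == class_name:
            block["return_probas"] = True
    return patched


# Toy config for illustration only; a real config has more blocks and parameters.
toy_config = {
    "chainer": {
        "pipe": [
            {"class_name": "some_preprocessor"},
            {"class_name": "bert_classifier"},
        ]
    }
}

patched = enable_probas(toy_config)
print(patched["chainer"]["pipe"][-1])
```

With a real config you would load the JSON file with json.load, patch it, and save it back. As far as I know build_model also accepts a dict, but please check that for your version.
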
Or for a built model you can just run

model[-1].return_probas = True

and your model will start returning probabilities instead of class indexes.
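
Once you have probabilities, you may want to turn them back into labels with your own threshold. Below is a minimal sketch; to_labels is a hypothetical helper, the numbers are made up, and it assumes each output is either a single probability or a list of per-class probabilities with the "paraphrase" class last. Please check the actual output format of your model first.

```python
from typing import List, Sequence, Union

Proba = Union[float, Sequence[float]]


def to_labels(probas: Sequence[Proba], threshold: float = 0.5) -> List[int]:
    """Map probabilities to 0/1 labels using a custom threshold."""
    labels = []
    for p in probas:
        # Assumption: for per-class outputs, the last entry is the "paraphrase" class.
        positive = p[-1] if isinstance(p, (list, tuple)) else p
        labels.append(int(positive >= threshold))
    return labels


# Made-up probabilities for illustration; not real model output.
example = [[0.9, 0.1], [0.2, 0.8], [0.45, 0.55]]
labels = to_labels(example, threshold=0.6)
print(labels)
```

If the model returns NumPy arrays, convert them with .tolist() before passing them to this helper.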