Returning probabilities as a NER model output

I am currently using the ner_rus_bert model and want to build a precision-recall plot. For that I need tag probabilities as a model output, but I cannot find how to get them. Also, if it is possible, how do I match these probabilities to the B- and I- tags, so that I know where a new entity starts?

Please don’t recommend using ner_rus_bert_probas instead, as I trained the model from scratch on my own set of tags, and the process took too much time to repeat. Thanks!

@egorhowtocode Thank you for your interest!

You can add “probas” to the chainer output like this:

{
  "dataset_reader": {
    "class_name": "conll2003_reader",
    "data_path": "{DOWNLOADS_PATH}/total_rus/",
    "dataset_name": "collection_rus",
    "provide_pos": false
  },
  "dataset_iterator": {
    "class_name": "data_learning_iterator"
  },
  "chainer": {
    "in": [
      "x"
    ],
    "in_y": [
      "y"
    ],
    "pipe": [ ... ],
    "out": [
      "x_tokens",
      "y_pred",
      "probas"
    ]
  },
...
}

The “probas” variable is a list with shape (# Tokens, # Tags); it contains the tag probabilities for every token. The order of the tags is defined in the tags.dict file.
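
For illustration, here is a rough sketch of how you could load a config with “probas” in the chainer output and map each probability column back to a tag name. The config path, the tags.dict path, and its format (one tag per line, optionally followed by a count) are placeholders you should adapt:

# Rough sketch: load a config whose chainer "out" includes "probas"
# and map each probability column back to a tag name.
# The config path and the tags.dict path/format below are placeholders.
from deeppavlov import build_model

model = build_model("path/to/your_ner_config.json", download=False)

# The returned tuple follows the order of the chainer "out" list.
tokens, tags, probas = model(["I live in London"])

# tags.dict is assumed to hold one tag per line (optionally followed by a count).
with open("path/to/tags.dict", encoding="utf-8") as f:
    tag_names = [line.split()[0] for line in f if line.strip()]

for word, word_probas in zip(tokens[0], probas[0]):
    best = max(range(len(word_probas)), key=lambda i: word_probas[i])
    print(word, tag_names[best], word_probas[best])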

Here is an example of the output I got from the ner_rus_bert config:

[
  [
    [
      "I",
      "live",
      "in",
      "London"
    ]
  ],
  [
    [
      "O",
      "O",
      "O",
      "B-LOC"
    ]
  ],
  [
    [
      [
        0.9997153810429316,
        0.00003316910272597246,
        0.0000439740756288545,
        0.000043977252467300344,
        0.000048637673234870496,
        0.00009564136416580716,
        0.00001921948884537947
      ],
      [
        0.9996180748017754,
        0.00005889125741379888,
        0.000040992818869201765,
        0.00011231036393879228,
        0.000048028430895354033,
        0.00008674310547533573,
        0.00003495922163206121
      ],
      [
        0.9995755606207712,
        0.00004476913443404276,
        0.000037930143643938715,
        0.00011780307564147073,
        0.000035597903526226606,
        0.0001531972432171844,
        0.00003514187876575006
      ],
      [
        0.002357293996870223,
        0.0001088633071097526,
        0.0001423921705586363,
        0.00027232972999110416,
        0.00016875317645286607,
        0.9968211665071064,
        0.00012920111191096204
      ]
    ]
  ]
]

Hi, Maksim Savkin! Thank you very much for your quick reply. It has just come to my mind that maybe there is a parameter responsible for a probability threshold whose value I could change iteratively to produce alternative classifications and plot a precision-recall curve from that information? Is there such a parameter? It would be easier than working with the probabilities myself.

If not, can you please tell me what steps I should take with these per-token probabilities to infer, with a concrete probability, whether a specific word refers to an entity? Alternatively, I would like to obtain probabilities for words instead of probabilities for tokens. Referring to your example, I would like to see that the probability of ‘London’ being tagged B-LOC is, for instance, 95%. Sorry if my request feels silly; I am new to NLP.

Sorry for misleading you: the “y_pred” and “probas” variables already contain predictions for every word, not for every subword token. In ner_rus_bert, words and subword tokens are named “x_tokens” and “x_subword_tokens” respectively.

See the example below:

[
  [
    [
      "I",
      "live",
      "in",
      "Londondon"
    ]
  ],
  [
    [
      "O",
      "O",
      "O",
      "B-LOC"
    ]
  ],
  [
    [
      "[CLS]",
      "I",
      "live",
      "in",
      "London",
      "##don",
      "[SEP]"
    ]
  ],
  [
    [
      [
        0.9998205670532392,
        0.000022740661445636296,
        0.000025579680073423476,
        0.000044798410923201316,
        0.000031688666368575275,
        0.00003988555794564256,
        0.000014739970004354457
      ],
      [
        0.9997798478452604,
        0.00003286809339116068,
        0.00002142539783809065,
        0.0000819787621632602,
        0.00003085032524957288,
        0.0000325300085121201,
        0.000020499567585306254
      ],
      [
        0.9998075040457522,
        0.000022445523418634413,
        0.0000190175051971648,
        0.0000684719446003878,
        0.00002127966304236002,
        0.00004192089321123403,
        0.00001936042477773189
      ],
      [
        0.0017040792035628157,
        0.00024287244563637154,
        0.0005544705489203987,
        0.00014723784924860816,
        0.00174474047046536,
        0.9955289851851594,
        0.00007761429700697706
      ]
    ]
  ]
]
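
To answer your question about a specific word: you can simply index into this structure. Here is a rough sketch, assuming the chainer output order x_tokens, y_pred, x_subword_tokens, probas as in the example above, and a placeholder B-LOC column index that you should look up in tags.dict:

# Rough sketch: probability that the word "Londondon" carries the B-LOC tag.
B_LOC_INDEX = 5  # placeholder, check the tag order in your tags.dict

tokens, tags, subword_tokens, probas = model(["I live in Londondon"])
word_index = tokens[0].index("Londondon")
print("P(Londondon is B-LOC) =", probas[0][word_index][B_LOC_INDEX])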

Right now DeepPavlov does not support probability thresholding. You would probably need to save the predicted probabilities somewhere and perform the thresholding manually.
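
As a rough sketch of that manual thresholding (using scikit-learn’s precision_recall_curve; the model, the evaluation data, the B-LOC column index, and the one-tag-versus-rest framing are placeholders you would need to adapt):

# Rough sketch: precision-recall curve for one tag (here B-LOC) via manual thresholding.
# `model`, `sentences`, `gold_tags` and B_LOC_INDEX are placeholders;
# gold_tags is assumed to be aligned with the model's word tokenization.
from sklearn.metrics import precision_recall_curve
import matplotlib.pyplot as plt

y_true, y_score = [], []
for sentence, sentence_gold in zip(sentences, gold_tags):
    tokens, tags, subword_tokens, probas = model([sentence])
    for gold_tag, word_probas in zip(sentence_gold, probas[0]):
        y_true.append(1 if gold_tag == "B-LOC" else 0)
        y_score.append(word_probas[B_LOC_INDEX])

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
plt.plot(recall, precision)
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.show()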

Ok! Thank you very much! I tried it and everything worked well!
