@egorhowtocode Thank you for your interest!
You can add “probas” to the chainer output like this:
{
  "dataset_reader": {
    "class_name": "conll2003_reader",
    "data_path": "{DOWNLOADS_PATH}/total_rus/",
    "dataset_name": "collection_rus",
    "provide_pos": false
  },
  "dataset_iterator": {
    "class_name": "data_learning_iterator"
  },
  "chainer": {
    "in": ["x"],
    "in_y": ["y"],
    "pipe": [ ... ],
    "out": [
      "x_tokens",
      "y_pred",
      "probas"
    ]
  },
  ...
}
For each input sentence, the “probas” variable is a list of shape (# tokens, # tags): it contains the probability of every tag for every token. The order of the tags is defined in the tags.dict file.
Here is an example of the output I got from the ner_rus_bert config:
[
  [
    ["I", "live", "in", "London"]
  ],
  [
    ["O", "O", "O", "B-LOC"]
  ],
  [
    [
      [
        0.9997153810429316,
        0.00003316910272597246,
        0.0000439740756288545,
        0.000043977252467300344,
        0.000048637673234870496,
        0.00009564136416580716,
        0.00001921948884537947
      ],
      [
        0.9996180748017754,
        0.00005889125741379888,
        0.000040992818869201765,
        0.00011231036393879228,
        0.000048028430895354033,
        0.00008674310547533573,
        0.00003495922163206121
      ],
      [
        0.9995755606207712,
        0.00004476913443404276,
        0.000037930143643938715,
        0.00011780307564147073,
        0.000035597903526226606,
        0.0001531972432171844,
        0.00003514187876575006
      ],
      [
        0.002357293996870223,
        0.0001088633071097526,
        0.0001423921705586363,
        0.00027232972999110416,
        0.00016875317645286607,
        0.9968211665071064,
        0.00012920111191096204
      ]
    ]
  ]
]
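If you only need the confidence of the predicted tag, one way to get it is to take the largest probability in each token's row. Below is a minimal sketch in plain Python using the rows for "I" and "London" from the output above. The helper name `best_tag_indices` is hypothetical and not part of DeepPavlov. To turn an index into a tag name you still need the order from your own tags.dict; in this output, indices 0 and 5 appear to correspond to "O" and "B-LOC", judging by the predicted tags.

```python
# Tag probabilities for the tokens "I" and "London", copied from the output above.
probas = [
    [0.9997153810429316, 0.00003316910272597246, 0.0000439740756288545,
     0.000043977252467300344, 0.000048637673234870496, 0.00009564136416580716,
     0.00001921948884537947],
    [0.002357293996870223, 0.0001088633071097526, 0.0001423921705586363,
     0.00027232972999110416, 0.00016875317645286607, 0.9968211665071064,
     0.00012920111191096204],
]


def best_tag_indices(token_probas):
    """Return (tag index, probability) of the most likely tag for each token.

    Hypothetical helper for illustration; the index refers to the tag order
    defined in tags.dict, which is not reproduced here.
    """
    result = []
    for row in token_probas:
        idx = max(range(len(row)), key=row.__getitem__)
        result.append((idx, row[idx]))
    return result


best = best_tag_indices(probas)
for token, (idx, p) in zip(["I", "London"], best):
    print(f"{token}: tag index {idx}, probability {p:.4f}")
```

For a full model output you would apply the same helper to each sentence in the “probas” batch, since the outermost list holds one entry per input sentence.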