Error when training the DeepPavlov DaNetQA model on my own data

I'm trying to train the DaNetQA model on my own data in JSON format. For example, here is one of the entries in the train.json file:

{"question": "Вднх - это выставочный центр?", "passage": "«Вы́ставочный центр» — станция Московского монорельса.", "label": true}

The other files (validation.json and test.json) have the same format.
But I get this error:

AttributeError: 'Value' object has no attribute 'names'

What should I do to train the DaNetQA model on my data?

Thank you in advance.

(P.S. I tried training another model, insults_kaggle_bert, in the same way, and training on my data succeeded.)

My full code:

!pip install -q deeppavlov
!pip install transformers
!pip install datasets

from deeppavlov import train_model
from deeppavlov.core.commands.utils import parse_config

# Load the stock DaNetQA config by name.
model_config = parse_config("russian_superglue_danetqa_rubert")
# Point the dataset reader at the folder that holds my train/validation/test files.
model_config['dataset_reader']['path'] = "/content/mydata/"
# Train the model on my data.
model = train_model(model_config)

Full description of the error:

WARNING:datasets.builder:Using custom data configuration mydata-e6b49fefee9b8d55
Downloading and preparing dataset json/mydata to /root/.cache/huggingface/datasets/json/mydata-e6b49fefee9b8d55/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51...
Downloading data files: 100%
3/3 [00:00<00:00, 108.16it/s]
Extracting data files: 100%
3/3 [00:00<00:00, 99.23it/s]
Generating test split:
0/0 [00:00<?, ? examples/s]
Dataset json downloaded and prepared to /root/.cache/huggingface/datasets/json/mydata-e6b49fefee9b8d55/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51. Subsequent calls will reuse this data.
100%
3/3 [00:00<00:00, 65.13it/s]
2023-01-08 13:10:05.918 ERROR in 'deeppavlov.core.common.params'['params'] at line 108: Exception in <class 'deeppavlov.dataset_iterators.huggingface_dataset_iterator.HuggingFaceDatasetIterator'>
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/deeppavlov/core/common/params.py", line 102, in from_params
    component = obj(**dict(config_params, **kwargs))
  File "/usr/local/lib/python3.8/dist-packages/deeppavlov/core/data/data_learning_iterator.py", line 49, in __init__
    self.train = self.preprocess(data.get('train', []), *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/deeppavlov/dataset_iterators/huggingface_dataset_iterator.py", line 56, in preprocess
    lb = data.info.features[label].names[lb]
AttributeError: 'Value' object has no attribute 'names'
ERROR:deeppavlov.core.common.params:Exception in <class 'deeppavlov.dataset_iterators.huggingface_dataset_iterator.HuggingFaceDatasetIterator'>
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/deeppavlov/core/common/params.py", line 102, in from_params
    component = obj(**dict(config_params, **kwargs))
  File "/usr/local/lib/python3.8/dist-packages/deeppavlov/core/data/data_learning_iterator.py", line 49, in __init__
    self.train = self.preprocess(data.get('train', []), *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/deeppavlov/dataset_iterators/huggingface_dataset_iterator.py", line 56, in preprocess
    lb = data.info.features[label].names[lb]
AttributeError: 'Value' object has no attribute 'names'
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-14-2f9574126ac3> in <module>
      1 
      2 
----> 3 model= train_model(model_config)

5 frames
/usr/local/lib/python3.8/dist-packages/deeppavlov/dataset_iterators/huggingface_dataset_iterator.py in preprocess(self, data, features, label, use_label_name, *args, **kwargs)
     54             if use_label_name and lb != -1:
     55                 # -1 label is used if there is no label (test set)
---> 56                 lb = data.info.features[label].names[lb]
     57             dataset += [(feat, lb)]
     58         return dataset

AttributeError: 'Value' object has no attribute 'names'

Hey @Yuri, thank you very much for your question.

We integrated the Russian SuperGLUE (RSG) models in one of our recent releases. The dataset_reader of the RSG models is based on Hugging Face datasets. The current release contains pretrained models and does not support retraining. In the dataset_reader config, path is the dataset path argument (e.g., russian_super_glue) and name is the name of the dataset configuration (e.g., danetqa).

Hugging Face datasets does support loading from local and remote files, so you can add loading from a file by modifying the HuggingFaceDatasetReader class.
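
Before wiring local files into the reader, it may help to confirm that each file parses and carries the three fields from the sample entry in the question. The sketch below is only an illustration: it assumes the files use the JSON Lines layout (one object per line), which the question does not state, and `check_jsonl` and `REQUIRED_FIELDS` are hypothetical names, not part of DeepPavlov or datasets.

```python
import json
import tempfile
from pathlib import Path

# Field names and types taken from the sample entry in the question.
REQUIRED_FIELDS = {"question": str, "passage": str, "label": bool}


def check_jsonl(path):
    """Return a list of (line_number, problem) tuples for a JSON Lines file."""
    problems = []
    with open(path, encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((lineno, f"invalid JSON: {exc.msg}"))
                continue
            if not isinstance(record, dict):
                problems.append((lineno, "not a JSON object"))
                continue
            for field, expected in REQUIRED_FIELDS.items():
                if field not in record:
                    problems.append((lineno, f"missing field {field!r}"))
                elif not isinstance(record[field], expected):
                    problems.append((lineno, f"field {field!r} is not {expected.__name__}"))
    return problems


# Demo on a temporary file: one well-formed entry and one with a string label.
good = {"question": "Q?", "passage": "P.", "label": True}
bad = {"question": "Q?", "passage": "P.", "label": "yes"}
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "train.json"
    sample.write_text(
        "\n".join(json.dumps(r, ensure_ascii=False) for r in (good, bad)),
        encoding="utf-8",
    )
    demo_problems = check_jsonl(sample)

print(demo_problems)
```

An empty result for train.json, validation.json and test.json would suggest that the problem lies in how the label column is typed after loading, not in the files themselves.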

@Vasily, thank you. Unfortunately, I still have not solved my previous error.

You write:

As I understand it, you are describing how I can adjust the loading of my data files for training, and suggesting that my error comes from how this data is loaded. But in this screenshot (the line highlighted in yellow, code section 48) we can see that huggingface_dataset_iterator has read my files. So, in my opinion, the error has some other cause.
To give more detail about the error, I printed the variable (the line highlighted in purple, code line 58). This variable is a 'Value' object, which has no attribute 'names'.

Maybe I need to adjust some files?

I see what you mean. It seems you need to define the label classes somehow, e.g. label_classes=["False", "True"].

You can do so by loading the dataset with a local loading script. I believe there should be another way to define the names as well.
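
To illustrate why the names matter: the failing line in the traceback looks up `.names` on the label column's feature description. A column loaded from plain JSON booleans is described by a `Value`, which has no `names`, while a `ClassLabel` does. The sketch below uses tiny stand-in classes, not the real `datasets.Value` and `datasets.ClassLabel`, and a simplified `preprocess` that only mirrors the loop shown in the traceback; `schema` and `rows` are hypothetical names. It also shows the `use_label_name` parameter that appears in the `preprocess` signature in the traceback. Whether the rest of the DaNetQA pipeline accepts either variant has not been verified here.

```python
class Value:
    """Stand-in for a plain typed column: it carries a dtype but no label names."""

    def __init__(self, dtype):
        self.dtype = dtype


class ClassLabel:
    """Stand-in for a categorical column that knows its label names."""

    def __init__(self, names):
        self.names = list(names)


def preprocess(rows, schema, label="label", use_label_name=True):
    """Simplified mirror of the loop around line 56 in the traceback."""
    dataset = []
    for row in rows:
        lb = row[label]
        if use_label_name and lb != -1:
            # -1 marks a missing label (test set), as in the original comment.
            lb = schema[label].names[lb]
        dataset.append(((row["question"], row["passage"]), lb))
    return dataset


rows = [
    {"question": "q1", "passage": "p1", "label": True},
    {"question": "q2", "passage": "p2", "label": False},
]

# 1) Boolean column without names: reproduces the reported AttributeError.
try:
    preprocess(rows, {"label": Value("bool")})
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# 2) Column with names: True/False index the list like 1/0.
named = preprocess(rows, {"label": ClassLabel(["False", "True"])})

# 3) Name lookup switched off: labels pass through unchanged.
raw = preprocess(rows, {"label": Value("bool")}, use_label_name=False)

print(error_message)
print([lb for _, lb in named])
print([lb for _, lb in raw])
```

In the real library, the counterpart of case 2 would be giving the label column a `ClassLabel` feature with names ["False", "True"], and the counterpart of case 3 would be setting `use_label_name` to false in the dataset_iterator section of the config. Both are suggestions to try, not confirmed fixes.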