How to use 2 types of ODQA models in same notebook?

jchat · August 20, 2019, 11:33am

Hello. I would like to use 2 ODQA models in one notebook to answer the same question: the first based on the text articles I trained the model on, and the second based on Wikipedia. I cannot get this to work with the following example code, which I found in a Medium article. It fails with AttributeError: 'NoneType' object has no attribute 'split'

Any help is appreciated. Basically, I would like to have 2 types of models in the same notebook. I am very new to this library, so can you provide a minimal example to accomplish this? Thank you.

from deeppavlov import configs
from deeppavlov.core.commands.infer import build_model

# Download all the SQuAD models
squad = build_model(configs.squad.multi_squad_noans_infer, download=True)
# Do not download the ODQA model; we have just trained it
odqa = build_model(configs.odqa.en_odqa_infer_wiki, download=False)
# Download the Wikipedia-based ODQA model for odqa2
odqa2 = build_model(configs.odqa.en_odqa_infer_wiki, download=True)

val_q = "What causes accidents?"
answer1 = odqa([val_q])   # expected: answer based on my own trained data
answer2 = odqa2([val_q])  # expected: answer based on Wikipedia

yoptar · August 20, 2019, 11:49am

Hi @jchat,
Everything should work fine like this. Can you provide us with the full exception stack trace?
Also, are you using the same config for both odqa and odqa2?

jchat · August 20, 2019, 1:11pm

Hi @yoptar. I am using the same config for odqa and odqa2. Maybe this is causing the problem? Can you guide me on how to specify different model configs within the same notebook? I will also share the full stack trace for the error soon. odqa2 works perfectly on its own, and odqa works perfectly on its own, but using the two together (as in my previous comment) causes the problem. The code I used to train on my own data is below. Thanks

from deeppavlov import configs, train_model
from deeppavlov.core.common.file import read_json

model_config = read_json(configs.doc_retrieval.en_ranker_tfidf_wiki)
model_config["dataset_reader"]["data_path"] = "/content/drive/My Drive/MyData/News_Data"
model_config["dataset_reader"]["dataset_format"] = "txt"
doc_retrieval = train_model(model_config)

yoptar · August 20, 2019, 1:51pm

The problem might actually be caused by memory deficiency. ODQA models are really heavy and can require about 20GB of RAM for one model.

jchat · August 20, 2019, 5:10pm

@yoptar please find the full exception stack trace below. I do not get this error if I run odqa or odqa2 independently in different notebooks; it only arises when both are run together. From the error alone, I cannot tell where the problem is, whether it is in fact a memory issue, or whether it can be solved. If it can be solved, that would be great. Thanks

----> 7 answer1 = odqa([val_q])
      8 answer2 = odqa2([val_q])

2 frames

/usr/local/lib/python3.6/dist-packages/deeppavlov/models/preprocessors/odqa_preprocessors.py in __call__(self, batch_docs)
     70         for doc in docs:
     71             if self.paragraphs:
---> 72                 split_doc = doc.split('\n\n')
     73                 split_doc = [sd.strip() for sd in split_doc]
     74                 split_doc = list(filter(lambda x: len(x) > 40, split_doc))

AttributeError: 'NoneType' object has no attribute 'split'

jchat · August 20, 2019, 5:13pm

Is it possible that odqa2 is overwriting the odqa config or something like that? I find that only odqa2 works in this notebook; odqa does not. I guess the NoneType error shows that odqa has not been initialised or is not available. Is there a simple way to use odqa in addition to odqa2? Also, please note that if I comment out the odqa line in the code and use only odqa2 in the notebook, it works perfectly. It seems that odqa is not available for use and has been overwritten by odqa2.

my-master · August 21, 2019, 1:20pm

Hello @jchat!

Let’s try to troubleshoot this issue.
You have 2 models trained on different data, right? When ODQA is being trained, it creates a .db file with your indexed documents and a .npz file with the TF-IDF matrix. Please note that the .db and .npz files are unique to each dataset. So if you'd like to use 2 models, you need 2 different configs, and the names of the .db and .npz files must be different in each config; otherwise they will be overwritten.
Can you please provide the contents of your configuration files for odqa and odqa2?
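
A minimal sketch of the "2 different configs" idea above, written against plain dictionaries so it runs without DeepPavlov installed. The helper name with_renamed_artifacts, the toy config layout, and the file names enwiki.db, enwiki_tfidf_matrix.npz, news.db and news_tfidf_matrix.npz are all assumptions for illustration; check the actual paths in your installed config files before relying on them.

```python
def with_renamed_artifacts(config, replacements):
    """Return a copy of a config dict with substrings replaced in every string value.

    Hypothetical helper, not part of DeepPavlov: it only walks dicts and lists.
    """
    def rewrite(node):
        if isinstance(node, dict):
            return {key: rewrite(value) for key, value in node.items()}
        if isinstance(node, list):
            return [rewrite(item) for item in node]
        if isinstance(node, str):
            for old, new in replacements.items():
                node = node.replace(old, new)
            return node
        return node  # numbers, booleans, None stay as they are

    return rewrite(config)


# Toy stand-in for a ranker config; the real one has more fields (assumption).
toy_config = {
    "dataset_reader": {"save_path": "{DOWNLOADS_PATH}/odqa/enwiki.db"},
    "chainer": {"pipe": [{"save_path": "{MODELS_PATH}/odqa/enwiki_tfidf_matrix.npz"}]},
}

# Give the custom-data model its own .db and .npz names so that it cannot
# overwrite, or be overwritten by, the Wikipedia files.
news_config = with_renamed_artifacts(
    toy_config,
    {"enwiki.db": "news.db", "enwiki_tfidf_matrix.npz": "news_tfidf_matrix.npz"},
)
```

With a real setup, the same helper could be applied to the dictionary returned by read_json(configs.doc_retrieval.en_ranker_tfidf_wiki) before calling train_model, and to the ODQA config before calling build_model. If the ODQA config refers to the ranker config or to the .db file by path, those references would presumably need the new names as well.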

warrenfelsh · January 18, 2021, 6:51am

An AttributeError is raised when an object does not have the attribute you tried to access. 'NoneType' object has no attribute 'split' indicates that the object you are calling split on is None, meaning there is no string to split. So you need to check that the value is not None before splitting. Something like…

if val is not None:
    # ...
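
To make that check concrete, here is a small sketch that mirrors the splitting steps visible in the stack trace above (split on blank lines, strip, keep pieces longer than 40 characters) and records which documents were None instead of failing. The function split_paragraphs is hypothetical and is not DeepPavlov's code. It only helps to diagnose the problem: a None document still means the retriever found an ID that is missing from the .db file being read.

```python
def split_paragraphs(docs, min_length=40):
    """Split documents into paragraphs, skipping documents that are None.

    Returns the paragraphs and the positions of the missing documents.
    """
    paragraphs = []
    missing = []
    for index, doc in enumerate(docs):
        if doc is None:
            # Remember which document was missing rather than raising.
            missing.append(index)
            continue
        parts = [part.strip() for part in doc.split("\n\n")]
        paragraphs.extend(part for part in parts if len(part) > min_length)
    return paragraphs, missing


long_text = "A" * 50
paragraphs, missing = split_paragraphs([long_text + "\n\nshort", None])
```

Here paragraphs holds only the 50-character paragraph and missing holds the position of the None entry.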
