Complete guide on multilingual QA model implementation

zlobendog · April 21, 2021, 7:06pm

Hi everyone!

First off, I really like everything about DeepPavlov: its clear and concise website, its documentation, and its presence on Medium.
However, I can’t seem to find a complete walk-through tutorial for an M-BERT-based multilingual QA model.

Basically, here’s what I want to achieve:

I want a model I can access either via an API or (even better!) through Telegram. I want this model to be additionally trained on my own documentation and legislation, if that is even required. Maybe it isn’t, I’m not sure.
The overall idea is that I can ask questions about these policies and/or legislation in English, Russian, and Azerbaijani and get the answers from them.
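
To make the API part of the goal concrete, here is a minimal sketch of what querying such a model over HTTP could look like. It assumes the model is already being served as a REST endpoint. The URL, the port, and the field names `context_raw` and `question_raw` are assumptions and should be checked against whatever the served model's config actually declares as inputs.

```python
import json
import urllib.request

# Assumption: the model is served locally; adjust host, port, and path as needed.
MODEL_URL = "http://localhost:5000/model"


def build_payload(contexts, questions):
    """Pair each question with its context in a batch-style JSON body."""
    if len(contexts) != len(questions):
        raise ValueError("contexts and questions must have the same length")
    # Assumption: these field names match the inputs declared in the model config.
    return {"context_raw": list(contexts), "question_raw": list(questions)}


def ask(contexts, questions, url=MODEL_URL, timeout=30):
    """POST the payload to the model endpoint and return the decoded JSON reply."""
    payload = build_payload(contexts, questions)
    # ensure_ascii=False keeps Russian and Azerbaijani text readable in the body.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, a call such as `ask([policy_text], ["Who approves leave requests?"])` would send one context and one question, where `policy_text` is a hypothetical variable holding the contents of one .txt file. The shape of the reply depends on the model and is not assumed here.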

And here’s where I’m struggling:

What format do my own documents need to be in for the model to work? Currently I have .txt files that I read into variables. Should I apply some additional sanitization? Replace “\n” with “”, for example? I can’t find any info on this.
Should I train this model on my data first? I tried to start by locating the relevant part of the config with:
model_config = json.load(open(configs.squad.squad_bert_multilingual_freezed_emb))
pprint(model_config['dataset_reader'])
but that didn’t work: there is no "dataset_reader" key there.
How does one combine different models? I would like to have a bot that opens with chit-chat and then answers questions using the model above, if that is at all possible.
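
As an illustration of the sanitization question above, here is a minimal sketch of one way raw .txt content could be cleaned before being passed to a QA model as context. Whether DeepPavlov needs any of this is exactly the open question, so treat it as an assumption rather than a requirement. The function name `split_into_paragraphs` is hypothetical. It keeps paragraph breaks instead of deleting every “\n”, on the assumption that short, self-contained contexts suit BERT-style readers with a bounded input length.

```python
import re
import unicodedata


def split_into_paragraphs(raw):
    """Normalize Unicode and whitespace, then return a list of clean paragraphs."""
    # Use one canonical form for accented and composed characters.
    text = unicodedata.normalize("NFC", raw)
    # Unify Windows and old Mac line endings.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Collapse runs of spaces, tabs, and non-breaking spaces into one space.
    text = re.sub(r"[ \t\u00a0]+", " ", text)
    # Drop spaces that sit directly around a newline.
    text = re.sub(r" *\n *", "\n", text)
    # Two or more newlines are treated as a paragraph break.
    paragraphs = re.split(r"\n{2,}", text)
    # Inside a paragraph, join hard-wrapped lines with a single space.
    paragraphs = [" ".join(p.split("\n")).strip() for p in paragraphs]
    # Discard empty paragraphs left over from leading or trailing blank lines.
    return [p for p in paragraphs if p]
```

Each returned paragraph could then be used as a separate context, instead of feeding a whole document in as one string.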

And, just in general, I would appreciate links to articles and tutorials that go through a similar process from start to finish, so I can learn the best practices, so to speak. I am absolutely new to this. Something more comprehensive than the Colab notebooks available in the documentation.

P.S. I speak English and Russian, so feel free to answer in whatever language you are most comfortable with.
P.P.S. If this is the wrong subforum for this kind of question, I’m sorry. Please move this post to a more appropriate one.
P.P.P.S. If no such tutorial exists, I’d be happy to write one for you once I get it working.

rusdes · August 12, 2021, 12:54pm

Hi, did you figure it out? I’m facing a similar problem.
