How to have control over ODQA module output?

Note: This question was posted to this forum as kindly suggested by Vasily on Medium https://medium.com/p/9004a6406963/responses/show.

Thank you once again for a very nice article and library. A quick question: when I use my own text articles instead of the PLOS One articles you used in the notebook, is there a way to control the output of the ODQA module? I received the suggestion to try changing the top_n parameter in the config file, but I am not sure how to do that. E.g. in the Colab notebook from the DeepPavlov article on Medium (https://colab.research.google.com/github/deepmipt/dp_notebooks/blob/master/DP_ODQA.ipynb), is there a simple way to do this?

For example:

If I ask a question, sometimes I only get a one-word answer. But suppose I want a longer answer to be generated (even if it may not be completely correct): is there a simple way to specify the length of the generated output? If you can point me to a relevant resource, that would be great. Happy to cite this repo and paper in my research too, as it is the best one on ODQA I’ve come across so far. Cheers

Hello @jchat,

Happy to hear that you liked our ODQA!

As for your initial question about changing the length of the output answer, it is surprisingly difficult to implement. When the retrieval part returns top_n documents, they are split into smaller chunks to feed to the reader. When we chunk them, we don’t keep the chunk indices or the word indices within the original document. Moreover, the reader can truncate chunks that are too long. So by the time the reader returns the best result, the information about the word indices in the original document is lost.
What we can do is restore the surrounding words from the same chunk in which the best answer was found. Even this requires changing about 10 lines of source code, so it is not a simple way either.
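To illustrate the idea of restoring surrounding words from the chunk, here is a minimal sketch in plain Python. It is not the library's code: the function name expand_answer, the window parameter, and the assumption that the chunk text and the character offset of the answer are available at that point are all hypothetical.

```python
def expand_answer(chunk: str, answer: str, answer_start: int, window: int = 10) -> str:
    """Return the answer plus up to `window` words on each side, taken from the same chunk.

    Hypothetical helper: assumes the chunk text and the character offset of the
    answer inside that chunk are both known.
    """
    answer_end = answer_start + len(answer)
    if chunk[answer_start:answer_end] != answer:
        # The offset does not match; fall back to the first occurrence of the answer.
        answer_start = chunk.find(answer)
        if answer_start == -1:
            # The answer is not in this chunk, so there is nothing to expand.
            return answer
        answer_end = answer_start + len(answer)

    words_before = chunk[:answer_start].split()
    words_after = chunk[answer_end:].split()
    # Guard against window == 0, because words_before[-0:] would return the whole list.
    left = words_before[-window:] if window > 0 else []
    right = words_after[:window] if window > 0 else []
    return " ".join(left + [answer] + right)


chunk = "The mitochondrion is the powerhouse of the cell and produces ATP."
answer = "powerhouse"
expanded = expand_answer(chunk, answer, chunk.find(answer), window=2)
print(expanded)  # is the powerhouse of the
```

The expansion can never cross the chunk boundary, which matches the limitation described above: only words from the same chunk can be recovered, not the position in the original document.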

The top_n parameter sets the number of relevant documents found by the ranker that are fed to the reader. It is set inside the ranker config: if you open the ODQA config, you can find the path to the ranker config in it. Open that ranker config and set "top_n" there.
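If you prefer to change the value from code (for example in a Colab cell) rather than by hand, a minimal sketch could look like the following. It only uses the Python standard library and treats the config as ordinary JSON; the helper name set_top_n and the sample config contents are hypothetical and stand in for the real ranker config, whose exact layout may differ.

```python
import json
from typing import Any


def set_top_n(config: Any, top_n: int) -> int:
    """Recursively set every "top_n" key in a parsed JSON config.

    Returns the number of keys that were updated, so the caller can check
    that the parameter was actually found.
    """
    changed = 0
    if isinstance(config, dict):
        for key, value in config.items():
            if key == "top_n":
                config[key] = top_n
                changed += 1
            else:
                changed += set_top_n(value, top_n)
    elif isinstance(config, list):
        for item in config:
            changed += set_top_n(item, top_n)
    return changed


# Hypothetical stand-in for a ranker config; in practice you would load the
# real file with json.load(open(path_to_ranker_config)).
ranker_config = {
    "chainer": {
        "pipe": [
            {"class_name": "hypothetical_vectorizer"},
            {"class_name": "hypothetical_ranker", "top_n": 5},
        ]
    }
}

updated = set_top_n(ranker_config, 20)
print(updated)                             # 1
print(json.dumps(ranker_config, indent=2))  # config with "top_n": 20
```

After the change, write the dictionary back to the ranker config file with json.dump so that the ODQA config, which refers to the ranker config by path, picks up the new value. If updated is 0, the key was not found and the file you opened is probably not the ranker config.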