Hello! I have a question about the methods applied to models.
Is there any method with which you can get a list of all the words in the model? (Suppose I want to get all the words from “ru_syntagrus_joint_parsing”.) Or is there a method that immediately shows whether the word being processed is present in the dictionary or not?
Thanks in advance!
I do not understand what you mean by all words in the model. The lemmatization part is based on the pymorphy analyzer, which can process out-of-vocabulary words as well. The tagging and parsing components do not use dictionaries in any form.
I mean, I need to find out whether a given word is out-of-vocabulary or not. For example, pymorphy marks out-of-vocabulary words with FakeDictionary.
OOV words for the lemmatization column are exactly the OOV words for pymorphy. For other parts of the output, there is no such notion.
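If it helps, here is a minimal sketch of checking this against pymorphy directly. It assumes the pymorphy2 package and its `MorphAnalyzer.word_is_known` method; the helper names `split_by_vocabulary` and `make_default_analyzer` are made up for illustration and are not part of the model's API. It also assumes the word reaches pymorphy unchanged, which may not hold if the pipeline normalizes tokens first.

```python
def split_by_vocabulary(words, analyzer):
    """Split words into (known, unknown) lists.

    `analyzer` is any object exposing word_is_known(word) -> bool,
    for example a pymorphy2.MorphAnalyzer instance (assumption).
    """
    known, unknown = [], []
    for word in words:
        if analyzer.word_is_known(word):
            known.append(word)
        else:
            unknown.append(word)
    return known, unknown


def make_default_analyzer():
    """Hypothetical helper: build a pymorphy2 analyzer on demand."""
    import pymorphy2  # third-party package: pip install pymorphy2
    return pymorphy2.MorphAnalyzer()
```

To try it, call `split_by_vocabulary(["стол", "бутявка"], make_default_analyzer())`. Words that land in the second list are the ones pymorphy has to guess at, so their lemmas in the model output are predictions rather than dictionary lookups.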