Fitting of loaded faq model can not be continued

Hi

I tried to train DeepPavlov on my own FAQ, but it does not seem to work, even though I changed the configuration file:

/usr/local/lib/python3.7/site-packages/deeppavlov/configs/faq/tfidf_logreg_en_faq.json

"data_path": "/Users/michaelwechner/deeppavlov/faq_school_en.csv"

and then ran

python3 -m deeppavlov train tfidf_logreg_en_faq

But then I receive the following warning:

2019-12-26 13:10:54.137 INFO in 'deeppavlov.core.data.simple_vocab'['simple_vocab'] at line 115: [loading vocabulary from /Users/michaelwechner/.deeppavlov/models/faq/mipt/en_mipt_faq_v4/en_mipt_answers.dict]
2019-12-26 13:10:54.138 INFO in 'deeppavlov.models.sklearn.sklearn_component'['sklearn_component'] at line 203: Loading model sklearn.linear_model:LogisticRegression from /Users/michaelwechner/.deeppavlov/models/faq/mipt/en_mipt_faq_v4/logreg.pkl
2019-12-26 13:10:54.138 INFO in 'deeppavlov.models.sklearn.sklearn_component'['sklearn_component'] at line 210: Model sklearn.linear_model.logisticLogisticRegression loaded with parameters
2019-12-26 13:10:54.138 WARNING in 'deeppavlov.models.sklearn.sklearn_component'['sklearn_component'] at line 216: Fitting of loaded model can not be continued. Model can be fitted from scratch.If one needs to continue fitting, please, look at warm_start parameter

I tried to find a way to set the warm_start parameter, but unfortunately could not find anything that helped.

Any idea what I might be doing wrong?

Thanks

Michael

Hi Michael,

The simplest way would be to remove the /Users/michaelwechner/.deeppavlov/models/faq/mipt/en_mipt_faq_v4/ directory before training.

But if I were you, I would change all the save_path and load_path parameters in the config to suit my own setup and remove the download block from metadata.
It can look something like this: https://gist.github.com/yoptar/6442bd215deaee24337e5372602f8a79

I also think it is better not to store your own configuration files inside the package itself, because they can be deleted when you update the package, for example. You can run

python -m deeppavlov train path/to/your/config.json

to run deeppavlov with any config.
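
The steps above can be sketched with the Python standard library alone. This is a minimal sketch, not an official DeepPavlov tool: the function name make_custom_config and every path passed to it are hypothetical, and it assumes the config has the dataset_reader / chainer.pipe / metadata layout used by the FAQ configs, with the download block stored under metadata.

```python
import json
from pathlib import Path, PurePosixPath


def make_custom_config(src_config, dst_config, data_path, model_dir):
    """Write a copy of a DeepPavlov-style config that uses its own paths."""
    config = json.loads(Path(src_config).read_text(encoding="utf-8"))

    # Point the dataset reader at your own FAQ file.
    config["dataset_reader"]["data_path"] = str(data_path)

    # Redirect every component's save_path/load_path into model_dir,
    # keeping only the original file name (for example logreg.pkl).
    for component in config["chainer"]["pipe"]:
        for key in ("save_path", "load_path"):
            if key in component:
                file_name = PurePosixPath(component[key]).name
                component[key] = f"{model_dir}/{file_name}"

    # Drop the download block from metadata, if there is one.
    config.get("metadata", {}).pop("download", None)

    # Store the result outside the installed package.
    Path(dst_config).write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

The file written to dst_config can then be passed to the command above in place of path/to/your/config.json.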

Hi Aleksei

Great, thank you very much! It is working now :)

I have another question, but will open another thread/topic, because it is not directly related to this problem.

Thanks

Michael

Hi there,

I have a similar problem. Please help.

test_autoFaq.json

{
  "dataset_reader": {
    "class_name": "faq_reader",
    "x_col_name": "Question",
    "y_col_name": "Answer",
    "data_path": "{ROOT_PATH}/downloads/test_autoFaq_train.csv"
  },
  "dataset_iterator": {
    "class_name": "data_learning_iterator"
  },
  "chainer": {
    "in": "q",
    "in_y": "y",
    "pipe": [
      {
        "class_name": "ru_tokenizer",
        "in": "q",
        "id": "my_tokenizer",
        "lemmas": true,
        "out": "q_token_lemmas"
      },
      {
        "ref": "my_tokenizer",
        "in": "q_token_lemmas",
        "out": "q_lem"
      },
      {
        "in": [
          "q_lem"
        ],
        "out": [
          "q_vect"
        ],
        "fit_on": [
          "q_lem"
        ],
        "id": "tfidf_vec",
        "class_name": "sklearn_component",
        "save_path": "{MODEL_PATH}/tfidf.pkl",
        "load_path": "{MODEL_PATH}/tfidf.pkl",
        "model_class": "sklearn.feature_extraction.text:TfidfVectorizer",
        "infer_method": "transform"
      },
      {
        "id": "answers_vocab",
        "class_name": "simple_vocab",
        "fit_on": [
          "y"
        ],
        "save_path": "{MODEL_PATH}/answers.dict",
        "load_path": "{MODEL_PATH}/answers.dict",
        "in": "y",
        "out": "y_ids"
      },
      {
        "in": "q_vect",
        "fit_on": [
          "q_vect",
          "y_ids"
        ],
        "out": [
          "y_pred_proba"
        ],
        "class_name": "sklearn_component",
        "main": true,
        "save_path": "{MODEL_PATH}/logreg.pkl",
        "load_path": "{MODEL_PATH}/logreg.pkl",
        "model_class": "sklearn.linear_model:LogisticRegression",
        "infer_method": "predict_proba",
        "C": 1000,
        "penalty": "l2"
      },
      {
        "in": "y_pred_proba",
        "out": "y_pred_ids",
        "class_name": "proba2labels",
        "max_proba": true
      },
      {
        "in": "y_pred_ids",
        "out": "y_pred_answers",
        "ref": "answers_vocab"
      }
    ],
    "out": [
      "y_pred_answers",
      "y_pred_proba"
    ]
  },
  "train": {
    "evaluation_targets": [],
    "class_name": "fit_trainer"
  },
  "metadata": {
    "variables": {
      "ROOT_PATH": "~/.deeppavlov",
      "DOWNLOADS_PATH": "{ROOT_PATH}/downloads",
      "MODELS_PATH": "{ROOT_PATH}/models",
      "MODEL_PATH": "{MODELS_PATH}/faq/test_autoFaq"
    }
  }
}

Terminal

(base) C:\Users\gyastrebkov>python -m deeppavlov train test_autoFaq
2020-03-11 11:06:00.97 INFO in 'deeppavlov.core.common.file'['file'] at line 30: Interpreting 'test_autoFaq' as 'C:\Users\gyastrebkov\AppData\Roaming\Python\Python37\site-packages\deeppavlov\configs\faq\test_autoFaq.json'
[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\gyastrebkov\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\gyastrebkov\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package perluniprops to
[nltk_data]     C:\Users\gyastrebkov\AppData\Roaming\nltk_data...
[nltk_data]   Package perluniprops is already up-to-date!
[nltk_data] Downloading package nonbreaking_prefixes to
[nltk_data]     C:\Users\gyastrebkov\AppData\Roaming\nltk_data...
[nltk_data]   Package nonbreaking_prefixes is already up-to-date!
2020-03-11 11:06:02.308 WARNING in 'deeppavlov.models.sklearn.sklearn_component'['sklearn_component'] at line 219: Cannot load model from C:\Users\gyastrebkov\.deeppavlov\models\faq\test_autoFaq\tfidf.pkl
2020-03-11 11:06:02.308 INFO in 'deeppavlov.models.sklearn.sklearn_component'['sklearn_component'] at line 166: Initializing model sklearn.feature_extraction.text:TfidfVectorizer from scratch
2020-03-11 11:06:02.319 INFO in 'deeppavlov.models.sklearn.sklearn_component'['sklearn_component'] at line 109: Fitting model sklearn.feature_extraction.text:TfidfVectorizer
2020-03-11 11:06:02.322 INFO in 'deeppavlov.models.sklearn.sklearn_component'['sklearn_component'] at line 241: Saving model to C:\Users\gyastrebkov\.deeppavlov\models\faq\test_autoFaq\tfidf.pkl
2020-03-11 11:06:02.329 INFO in 'deeppavlov.core.data.simple_vocab'['simple_vocab'] at line 101: [saving vocabulary to C:\Users\gyastrebkov\.deeppavlov\models\faq\test_autoFaq\answers.dict]
2020-03-11 11:06:02.334 WARNING in 'deeppavlov.models.sklearn.sklearn_component'['sklearn_component'] at line 219: Cannot load model from C:\Users\gyastrebkov\.deeppavlov\models\faq\test_autoFaq\logreg.pkl
2020-03-11 11:06:02.334 INFO in 'deeppavlov.models.sklearn.sklearn_component'['sklearn_component'] at line 166: Initializing model sklearn.linear_model:LogisticRegression from scratch
2020-03-11 11:06:02.341 INFO in 'deeppavlov.models.sklearn.sklearn_component'['sklearn_component'] at line 109: Fitting model sklearn.linear_model:LogisticRegression
2020-03-11 11:06:02.350 INFO in 'deeppavlov.models.sklearn.sklearn_component'['sklearn_component'] at line 241: Saving model to C:\Users\gyastrebkov\.deeppavlov\models\faq\test_autoFaq\logreg.pkl
2020-03-11 11:06:02.420 INFO in 'deeppavlov.models.sklearn.sklearn_component'['sklearn_component'] at line 203: Loading model sklearn.feature_extraction.text:TfidfVectorizer from C:\Users\gyastrebkov\.deeppavlov\models\faq\test_autoFaq\tfidf.pkl
2020-03-11 11:06:02.422 INFO in 'deeppavlov.models.sklearn.sklearn_component'['sklearn_component'] at line 210: Model sklearn.feature_extraction.textTfidfVectorizer loaded  with parameters
2020-03-11 11:06:02.422 WARNING in 'deeppavlov.models.sklearn.sklearn_component'['sklearn_component'] at line 216: Fitting of loaded model can not be continued. Model can be fitted from scratch.If one needs to continue fitting, please, look at `warm_start` parameter
2020-03-11 11:06:02.427 INFO in 'deeppavlov.core.data.simple_vocab'['simple_vocab'] at line 115: [loading vocabulary from C:\Users\gyastrebkov\.deeppavlov\models\faq\test_autoFaq\answers.dict]
2020-03-11 11:06:02.430 INFO in 'deeppavlov.models.sklearn.sklearn_component'['sklearn_component'] at line 203: Loading model sklearn.linear_model:LogisticRegression from C:\Users\gyastrebkov\.deeppavlov\models\faq\test_autoFaq\logreg.pkl
2020-03-11 11:06:02.431 INFO in 'deeppavlov.models.sklearn.sklearn_component'['sklearn_component'] at line 210: Model sklearn.linear_model._logisticLogisticRegression loaded  with parameters
2020-03-11 11:06:02.432 WARNING in 'deeppavlov.models.sklearn.sklearn_component'['sklearn_component'] at line 216: Fitting of loaded model can not be continued. Model can be fitted from scratch.If one needs to continue fitting, please, look at `warm_start` parameter

What am I doing wrong?
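
For reference, in the log above both models are initialized from scratch, fitted, and saved first; the warning appears only afterwards, when the saved files are loaded again. The warm_start parameter named in the warning is a constructor parameter of scikit-learn's LogisticRegression. The sketch below is only an assumption, not something confirmed in this thread: it supposes that sklearn_component forwards warm_start to the estimator in the same way as the C and penalty keys in the config above. The helper name enable_warm_start is hypothetical.

```python
import json


def enable_warm_start(config, model_class="sklearn.linear_model:LogisticRegression"):
    """Return a copy of config with warm_start set on matching sklearn components."""
    updated = json.loads(json.dumps(config))  # deep copy via a JSON round trip
    for component in updated["chainer"]["pipe"]:
        if component.get("model_class") == model_class:
            # Assumption: extra keys such as C, penalty, and warm_start
            # are passed on to the scikit-learn estimator.
            component["warm_start"] = True
    return updated
```

Only components whose model_class matches are changed, so the TfidfVectorizer block, which has no warm_start parameter, is left as it is.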