Based on the instructions from here Offline with Docker and the enhancements from here, I have prepared a Dockerfile that creates an image for offline use. It works perfectly with the ner_ontonotes_bert_mult and ner_rus_bert models. However, it fails to work offline with ner_rus_convers_distilrubert_6L, because even after building, it still tries to connect to the network.
My Dockerfile:
FROM deeppavlov/base-cpu:0.17.6
RUN sed -i 's/mipt/pavlovteam/g' /base/DeepPavlov/deeppavlov/requirements/bert_dp.txt
RUN python -m deeppavlov install ner_rus_convers_distilrubert_6L && \
    python -m deeppavlov download ner_rus_convers_distilrubert_6L && \
    pip3 install --upgrade protobuf==3.20.0
CMD python -m deeppavlov riseapi ner_rus_convers_distilrubert_6L -p 5000
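From the traceback further below, the failure comes from AutoTokenizer.from_pretrained reaching out to the Hugging Face Hub at startup. One variant I considered is pre-caching the tokenizer at build time and telling transformers to use only the local cache afterwards. This is a sketch, not a verified fix: the model id DeepPavlov/distilrubert-base-cased-conversational is my guess for what the 6L config references, and TRANSFORMERS_OFFLINE may not be honored by older transformers releases.

```dockerfile
FROM deeppavlov/base-cpu:0.17.6

RUN sed -i 's/mipt/pavlovteam/g' /base/DeepPavlov/deeppavlov/requirements/bert_dp.txt

RUN python -m deeppavlov install ner_rus_convers_distilrubert_6L && \
    python -m deeppavlov download ner_rus_convers_distilrubert_6L && \
    pip3 install --upgrade protobuf==3.20.0

# Pre-cache the tokenizer so from_pretrained() can resolve it offline.
# NOTE: the model id below is an assumption; it should match the
# "vocab_file" value in the ner_rus_convers_distilrubert_6L config.
RUN python -c "from transformers import AutoTokenizer; \
AutoTokenizer.from_pretrained('DeepPavlov/distilrubert-base-cased-conversational')"

# Tell transformers to rely on the local cache only (supported in
# newer transformers releases; may be a no-op in older ones).
ENV TRANSFORMERS_OFFLINE=1

CMD python -m deeppavlov riseapi ner_rus_convers_distilrubert_6L -p 5000
```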
I even tried to add a single prediction during the image build so that it downloads everything it needs, using the following line:
RUN python -m deeppavlov predict ner_rus_convers_distilrubert_6L -f /etc/passwd
It does download something and runs the prediction, but after a disconnect and restart it still tries to fetch something and fails with the following error:
2023-02-27 12:08:11.3 ERROR in 'deeppavlov.core.common.params'['params'] at line 112: Exception in <class 'deeppavlov.models.preprocessors.torch_transformers_preprocessor.TorchTransformersNerPreprocessor'>
Traceback (most recent call last):
File "/base/DeepPavlov/deeppavlov/core/common/params.py", line 106, in from_params
component = obj(**dict(config_params, **kwargs))
File "/base/DeepPavlov/deeppavlov/models/preprocessors/torch_transformers_preprocessor.py", line 322, in __init__
self.tokenizer = AutoTokenizer.from_pretrained(vocab_file, do_lower_case=do_lower_case)
File "/base/venv/lib/python3.7/site-packages/transformers/models/auto/tokenization_auto.py", line 435, in from_pretrained
return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "/base/venv/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 1680, in from_pretrained
user_agent=user_agent,
File "/base/venv/lib/python3.7/site-packages/transformers/file_utils.py", line 1279, in cached_path
local_files_only=local_files_only,
File "/base/venv/lib/python3.7/site-packages/transformers/file_utils.py", line 1495, in get_from_cache
"Connection error, and we cannot find the requested files in the cached path."
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/base/DeepPavlov/deeppavlov/__main__.py", line 4, in <module>
main()
File "/base/DeepPavlov/deeppavlov/deep.py", line 113, in main
start_model_server(pipeline_config_path, args.https, args.key, args.cert, port=args.port)
File "/base/DeepPavlov/deeppavlov/utils/server/server.py", line 179, in start_model_server
model = build_model(model_config)
File "/base/DeepPavlov/deeppavlov/core/commands/infer.py", line 62, in build_model
component = from_params(component_config, mode=mode, serialized=component_serialized)
File "/base/DeepPavlov/deeppavlov/core/common/params.py", line 106, in from_params
component = obj(**dict(config_params, **kwargs))
File "/base/DeepPavlov/deeppavlov/models/preprocessors/torch_transformers_preprocessor.py", line 322, in __init__
self.tokenizer = AutoTokenizer.from_pretrained(vocab_file, do_lower_case=do_lower_case)
File "/base/venv/lib/python3.7/site-packages/transformers/models/auto/tokenization_auto.py", line 435, in from_pretrained
return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "/base/venv/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 1680, in from_pretrained
user_agent=user_agent,
File "/base/venv/lib/python3.7/site-packages/transformers/file_utils.py", line 1279, in cached_path
local_files_only=local_files_only,
File "/base/venv/lib/python3.7/site-packages/transformers/file_utils.py", line 1495, in get_from_cache
"Connection error, and we cannot find the requested files in the cached path."
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.
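To check whether the build-time predict actually populated the cache that the runtime lookup uses, I inspect the transformers cache directory inside the container. This is a sketch assuming transformers' usual default location (~/.cache/huggingface/transformers, overridable via TRANSFORMERS_CACHE); if the listing is empty at runtime, the files were cached under a different path or user.

```python
import os
from pathlib import Path

def list_transformers_cache():
    """Return the file names in the transformers download cache, if any."""
    cache_dir = Path(os.environ.get(
        "TRANSFORMERS_CACHE",
        str(Path.home() / ".cache" / "huggingface" / "transformers")))
    if not cache_dir.exists():
        return []
    return sorted(p.name for p in cache_dir.iterdir())

print(list_transformers_cache() or "cache is empty or missing")
```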
Do you have an idea of how to initialize this model for offline use?