Offline with Docker

Hi,
First of all, thanks for the wonderful tool, with ready to use models… really impressive.

I was testing it, mainly the entity recognition, and I need to run it on a closed (offline) server. I got the image from Docker Hub and started a container as explained:

```
docker run -e CONFIG=ner_ontonotes_bert_mult -p 5555:5000 \
    -v ~/my_dp_components:/root/.deeppavlov \
    -v ~/my_dp_envs:/venv \
    deeppavlov/base-cpu
```

I tested it, and on this machine I can start the image without "-e CONFIG=ner_ontonotes_bert_mult" (as it redoes the downloads for the git repositories, for example).
So now I have a setup that works on my internet-connected machine and is configured to do multilingual NER.

However, I then save this image (docker save deeppavlov/base-cpu:latest > deeppavlov/base-cpu.tar) and load it on an offline machine (no Internet connection by any means) with cat deeppavlov/base-cpu.tar | docker load. So far so good, it reads the image, but when I try to run it I get the following log:
docker run -p 5555:5000   -v ~/my_dp_components:/root/.deeppavlov     -v ~/my_dp_envs:/venv  f1744f543abc


INFO[2020-03-02T17:06:57.280084210+01:00] shim containerd-shim started                  address="/containerd-shim/moby/3d5db182bed9a39377da1c5fd7ba3f64640a263ce758fe48f182bb3c6673ba0f/shim.sock" debug=false pid=63470
email-validator not installed, email fields will be treated as str.
To install, run: pip install email-validator
2020-03-02 16:07:07.350 INFO in 'deeppavlov.core.common.file'['file'] at line 30: Interpreting 'ner_ontonotes_bert_mult' as '/base/DeepPavlov/deeppavlov/configs/ner/ner_ontonotes_bert_mult.json'
Requirement already satisfied: tensorflow==1.14.0 in ./venv/lib/python3.7/site-packages (1.14.0)
Requirement already satisfied: keras-preprocessing>=1.0.5 in ./venv/lib/python3.7/site-packages/Keras_Preprocessing-1.1.0-py3.7.egg (from tensorflow==1.14.0) (1.1.0)
Requirement already satisfied: numpy<2.0,>=1.14.5 in ./venv/lib/python3.7/site-packages/numpy-1.16.4-py3.7-linux-x86_64.egg (from tensorflow==1.14.0) (1.16.4)
Requirement already satisfied: grpcio>=1.8.6 in ./venv/lib/python3.7/site-packages (from tensorflow==1.14.0) (1.27.2)
Requirement already satisfied: termcolor>=1.1.0 in ./venv/lib/python3.7/site-packages (from tensorflow==1.14.0) (1.1.0)
Requirement already satisfied: wrapt>=1.11.1 in ./venv/lib/python3.7/site-packages (from tensorflow==1.14.0) (1.12.0)
Requirement already satisfied: wheel>=0.26 in ./venv/lib/python3.7/site-packages (from tensorflow==1.14.0) (0.33.6)
Requirement already satisfied: google-pasta>=0.1.6 in ./venv/lib/python3.7/site-packages (from tensorflow==1.14.0) (0.1.8)
Requirement already satisfied: gast>=0.2.0 in ./venv/lib/python3.7/site-packages (from tensorflow==1.14.0) (0.3.3)
Requirement already satisfied: protobuf>=3.6.1 in ./venv/lib/python3.7/site-packages (from tensorflow==1.14.0) (3.11.3)
Requirement already satisfied: six>=1.10.0 in ./venv/lib/python3.7/site-packages/six-1.13.0-py3.7.egg (from tensorflow==1.14.0) (1.13.0)
Requirement already satisfied: absl-py>=0.7.0 in ./venv/lib/python3.7/site-packages (from tensorflow==1.14.0) (0.9.0)
Requirement already satisfied: astor>=0.6.0 in ./venv/lib/python3.7/site-packages (from tensorflow==1.14.0) (0.8.1)
Requirement already satisfied: keras-applications>=1.0.6 in ./venv/lib/python3.7/site-packages/Keras_Applications-1.0.8-py3.7.egg (from tensorflow==1.14.0) (1.0.8)
Requirement already satisfied: tensorboard<1.15.0,>=1.14.0 in ./venv/lib/python3.7/site-packages (from tensorflow==1.14.0) (1.14.0)
Requirement already satisfied: tensorflow-estimator<1.15.0rc0,>=1.14.0rc0 in ./venv/lib/python3.7/site-packages (from tensorflow==1.14.0) (1.14.0)
Requirement already satisfied: setuptools in ./venv/lib/python3.7/site-packages (from protobuf>=3.6.1->tensorflow==1.14.0) (42.0.2)
Requirement already satisfied: h5py in ./venv/lib/python3.7/site-packages/h5py-2.9.0-py3.7-linux-x86_64.egg (from keras-applications>=1.0.6->tensorflow==1.14.0) (2.9.0)
Requirement already satisfied: markdown>=2.6.8 in ./venv/lib/python3.7/site-packages (from tensorboard<1.15.0,>=1.14.0->tensorflow==1.14.0) (3.2.1)
Requirement already satisfied: werkzeug>=0.11.15 in ./venv/lib/python3.7/site-packages (from tensorboard<1.15.0,>=1.14.0->tensorflow==1.14.0) (1.0.0)
WARNING: You are using pip version 19.3.1; however, version 20.0.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Collecting git+https://github.com/deepmipt/bert.git@feat/multi_gpu
  Cloning https://github.com/deepmipt/bert.git (to revision feat/multi_gpu) to /tmp/pip-req-build-8jzbnfgx
  Running command git clone -q https://github.com/deepmipt/bert.git /tmp/pip-req-build-8jzbnfgx
  fatal: unable to access 'https://github.com/deepmipt/bert.git/': Could not resolve host: github.com
ERROR: Command errored out with exit status 128: git clone -q https://github.com/deepmipt/bert.git /tmp/pip-req-build-8jzbnfgx Check the logs for full command output.
WARNING: You are using pip version 19.3.1; however, version 20.0.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/base/DeepPavlov/deeppavlov/__main__.py", line 4, in <module>
    main()
  File "/base/DeepPavlov/deeppavlov/deep.py", line 109, in main
    install_from_config(pipeline_config_path)
  File "/base/DeepPavlov/deeppavlov/utils/pip_wrapper/pip_wrapper.py", line 58, in install_from_config
    install(r)
  File "/base/DeepPavlov/deeppavlov/utils/pip_wrapper/pip_wrapper.py", line 37, in install
    env=os.environ.copy())
  File "/usr/local/lib/python3.7/subprocess.py", line 363, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/base/venv/bin/python', '-m', 'pip', 'install', 'git+https://github.com/deepmipt/bert.git@feat/multi_gpu']' returned non-zero exit status 1.
INFO[2020-03-02T17:07:09.877358210+01:00] shim reaped                                   id=3d5db182bed9a39377da1c5fd7ba3f64640a263ce758fe48f182bb3c6673ba0f
INFO[2020-03-02T17:07:09.887232773+01:00] ignoring event                                module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"

And the container does not start. It seems to be trying to update bert (https://github.com/deepmipt/bert.git). Is there a way to run the Docker container completely offline?

Best regards...

-- Daniel

Hi @danielcamara,

DeepPavlov configs have different requirements (Python packages and model files), but the base images do not contain these requirements, since otherwise they would be too big. There are two possible approaches to running the model you need on a machine without an internet connection:

  1. Start a model on your internet-connected machine, mapping volumes for components and virtual environments (-v ~/my_dp_components:/root/.deeppavlov -v ~/my_dp_envs:/venv).
    The container will download all the necessary components to directories on your host and start a DeepPavlov model (the log line Uvicorn running on http://0.0.0.0:5000 (Press CTRL+C to quit) means that all requirements are installed and components are downloaded). Then copy the my_dp_components and my_dp_envs directories from the internet-connected host to the offline machine.
    On either the internet-connected or the offline machine, create the following Dockerfile:

    FROM deeppavlov/base-cpu:0.7.1
    
    CMD PATH=/venv/$CONFIG/bin:$PATH && \
        python -m deeppavlov riseapi $CONFIG -p 5000
    

    and run docker build -t my_custom_dp_image . in the directory containing it.
    The resulting image can now be used on the offline machine:

    docker run -e CONFIG=ner_ontonotes_bert_mult -p 5555:5000 \
        -v ~/my_dp_components:/root/.deeppavlov \
        -v ~/my_dp_envs:/venv \
        my_custom_dp_image
    
  2. If you only need to work with one model, it is better to build an image that contains all of the model's requirements. Build the image with the following Dockerfile (replace ner_ontonotes_bert_mult with any model you need):

    FROM deeppavlov/base-cpu:0.7.1
    
    RUN python -m deeppavlov install ner_ontonotes_bert_mult && \
        python -m deeppavlov download ner_ontonotes_bert_mult 
    
    CMD python -m deeppavlov riseapi ner_ontonotes_bert_mult -p 5000
    

    Run the following command on the internet-connected machine, in the directory with your Dockerfile, to build the image:

    docker build -t deeppavlov:ner_ontonotes_bert_mult .
    

    This image can now be copied to the offline machine and started with the following command:

    docker run -p 5555:5000 deeppavlov:ner_ontonotes_bert_mult
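
For moving the results of either approach to the offline machine, a minimal sketch is given below. It relies only on standard docker save, docker load and tar usage; the helper function names and the archive file names (dp_ner_image.tar, dp_volumes.tar.gz) are hypothetical and chosen just for illustration, and the image tag is assumed to be the one built in approach 2. The optional check starts the container with networking disabled, which may help to catch any remaining download attempts before the image is transferred.

```shell
#!/bin/sh
# Sketch only: function names and archive file names are hypothetical.

IMAGE="deeppavlov:ner_ontonotes_bert_mult"   # tag built in approach 2
ARCHIVE="dp_ner_image.tar"                   # hypothetical archive name

# Internet-connected machine: write the image to a tar archive.
export_image() {
    docker save -o "$ARCHIVE" "$IMAGE"
}

# Offline machine: read the archive back into the local image store.
import_image() {
    docker load -i "$ARCHIVE"
}

# Optional check before the transfer: start the container without any network.
# If the server still reaches the "Uvicorn running" log line, nothing was
# fetched at start-up. Published ports are not reachable with --network none,
# so watch the log output and stop the container with CTRL+C.
check_offline_start() {
    docker run --rm --network none "$IMAGE"
}

# Approach 1 only: pack the two host directories that hold the downloads.
# Files written by the container may belong to root, so sudo may be needed.
pack_volumes() {
    tar -czf dp_volumes.tar.gz -C "$HOME" my_dp_components my_dp_envs
}

# Approach 1 only: unpack them into the home directory on the offline machine.
unpack_volumes() {
    tar -xzf dp_volumes.tar.gz -C "$HOME"
}
```

A possible order of use: source the file, run check_offline_start and export_image on the internet-connected machine, copy the archive over by whatever means the closed environment allows, then run import_image on the offline machine. For approach 1, pack_volumes and unpack_volumes take the place of copying the two directories by hand, and the image to export would be my_custom_dp_image instead.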