The Russian distribution does not work

It does not work: almost everything started except ner and agent_1.

dff-generative-skill_1 | HTTPConnectionPool(host=‘dialogpt’, port=8091): Read timed out. (read timeout=4)
dff-generative-skill_1 | HTTPConnectionPool(host=‘dialogpt’, port=8091): Read timed out. (read timeout=4)
dff-generative-skill_1 | HTTPConnectionPool(host=‘dialogpt’, port=8091): Read timed out. (read timeout=4)
dff-generative-skill_1 | [2023-01-09 07:51:19 +0000] [307] [INFO] Booting worker with pid: 307
dialogpt_1 | 2023-01-09 07:51:20,553 - server - INFO - dialogpt inputs: [{‘speaker’: ‘human’, ‘text’: ‘привет’}]
agent_1 | Host ner:8021 not yet available…
agent_1 | Host ner:8021 not yet available…
dialogpt_1 | 2023-01-09 07:51:29,374 - server - INFO - dialogpt inputs: [{‘speaker’: ‘human’, ‘text’: ‘привет’}]
agent_1 | Host ner:8021 not yet available…
dialogpt_1 | 2023-01-09 07:51:38,525 - server - INFO - dialogpt inputs: [{‘speaker’: ‘human’, ‘text’: ‘привет’}]
agent_1 | Host ner:8021 not yet available…
agent_1 | Host ner:8021 not yet available…
dialogpt_1 | 2023-01-09 07:51:47,352 - server - INFO - dialogpt inputs: [{‘speaker’: ‘human’, ‘text’: ‘привет’}]
dff-generative-skill_1 | [2023-01-09 07:51:49 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:307)
dff-generative-skill_1 | [2023-01-09 07:51:49 +0000] [307] [INFO] Worker exiting (pid: 307)
dff-generative-skill_1 | HTTPConnectionPool(host=‘dialogpt’, port=8091): Read timed out. (read timeout=4)
dff-generative-skill_1 | HTTPConnectionPool(host=‘dialogpt’, port=8091): Read timed out. (read timeout=4)
dff-generative-skill_1 | HTTPConnectionPool(host=‘dialogpt’, port=8091): Read timed out. (read timeout=4)
dff-generative-skill_1 | [2023-01-09 07:51:49 +0000] [331] [INFO] Booting worker with pid: 331
agent_1 | Host ner:8021 not yet available…
dialogpt_1 | 2023-01-09 07:51:51,015 - server - INFO - dialogpt inputs: [{‘speaker’: ‘human’, ‘text’: ‘привет’}]
agent_1 | Host ner:8021 not yet available…
dialogpt_1 | 2023-01-09 07:51:59,276 - server - INFO - dialogpt inputs: [{‘speaker’: ‘human’, ‘text’: ‘привет’}]
agent_1 | Host ner:8021 not yet available…
dialogpt_1 | 2023-01-09 07:52:08,340 - server - INFO - dialogpt inputs: [{‘speaker’: ‘human’, ‘text’: ‘привет’}]
agent_1 | Host ner:8021 not yet available…
agent_1 | Host ner:8021 not yet available…
dream_dialogpt_1 exited with code 137
dff-generative-skill_1 | [2023-01-09 07:52:19 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:331)
dff-generative-skill_1 | [20

I start it like this:

sudo docker-compose -f docker-compose.yml -f assistant_dists/dream_russian/docker-compose.override.yml -f assistant_dists/dream_russian/dev.yml up --build

What am I doing wrong?

Good afternoon!
Please start ner on its own so we can see which error occurs when this container comes up.

sudo docker-compose -f docker-compose.yml -f assistant_dists/dream_russian/docker-compose.override.yml -f assistant_dists/dream_russian/dev.yml up --build ner

ner_1 | [nltk_data] Downloading package nonbreaking_prefixes to
ner_1 | [nltk_data] /root/nltk_data…
ner_1 | [nltk_data] Package nonbreaking_prefixes is already up-to-date!
ner_1 | 2023-01-09 13:43:26.485 INFO in ‘deeppavlov.core.data.simple_vocab’[‘simple_vocab’] at line 115: [loading vocabulary from /root/.deeppavlov/models/ner/mbert_dream_with_numbers_rus_ext/tag.dict]
ner_1 | 2023-01-09 13:43:26,485 - deeppavlov.core.data.simple_vocab - INFO - [loading vocabulary from /root/.deeppavlov/models/ner/mbert_dream_with_numbers_rus_ext/tag.dict]
ner_1 | Some weights of the model checkpoint at bert-base-multilingual-cased were not used when initializing BertForTokenClassification: [‘cls.predictions.transform.LayerNorm.weight’, ‘cls.seq_relationship.bias’, ‘cls.predictions.transform.LayerNorm.bias’, ‘cls.predictions.bias’, ‘cls.predictions.transform.dense.weight’, ‘cls.predictions.transform.dense.bias’, ‘cls.seq_relationship.weight’, ‘cls.predictions.decoder.weight’]
ner_1 | - This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
ner_1 | - This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
ner_1 | Some weights of BertForTokenClassification were not initialized from the model checkpoint at bert-base-multilingual-cased and are newly initialized: [‘classifier.weight’, ‘classifier.bias’]
ner_1 | You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
ner_1 | 2023-01-09 13:43:35.687 INFO in ‘deeppavlov.core.models.torch_model’[‘torch_model’] at line 153: Load path /root/.deeppavlov/models/ner/mbert_dream_with_numbers_rus_ext/model is given.
ner_1 | 2023-01-09 13:43:35,687 - deeppavlov.core.models.torch_model - INFO - Load path /root/.deeppavlov/models/ner/mbert_dream_with_numbers_rus_ext/model is given.
ner_1 | 2023-01-09 13:43:35.688 INFO in ‘deeppavlov.core.models.torch_model’[‘torch_model’] at line 160: Load path /root/.deeppavlov/models/ner/mbert_dream_with_numbers_rus_ext/model.pth.tar exists.
ner_1 | 2023-01-09 13:43:35,688 - deeppavlov.core.models.torch_model - INFO - Load path /root/.deeppavlov/models/ner/mbert_dream_with_numbers_rus_ext/model.pth.tar exists.
ner_1 | 2023-01-09 13:43:35.688 INFO in ‘deeppavlov.core.models.torch_model’[‘torch_model’] at line 161: Initializing TorchTransformersSequenceTagger from saved.
ner_1 | 2023-01-09 13:43:35,688 - deeppavlov.core.models.torch_model - INFO - Initializing TorchTransformersSequenceTagger from saved.
ner_1 | 2023-01-09 13:43:35.688 INFO in ‘deeppavlov.core.models.torch_model’[‘torch_model’] at line 168: Loading weights from /root/.deeppavlov/models/ner/mbert_dream_with_numbers_rus_ext/model.pth.tar.
ner_1 | 2023-01-09 13:43:35,688 - deeppavlov.core.models.torch_model - INFO - Loading weights from /root/.deeppavlov/models/ner/mbert_dream_with_numbers_rus_ext/model.pth.tar.
dream_ner_1 exited with code 137

This problem points to a RAM issue (exit code 137). It seems you do not have enough RAM for your containers. Try building the distribution on a server with more resources.
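
For context on why exit code 137 suggests a memory problem: by shell convention, an exit code above 128 means the process was ended by a signal, and the signal number is the code minus 128. So 137 corresponds to signal 9 (SIGKILL), which is what the kernel's out-of-memory killer sends. Below is a minimal sketch of that decoding. The helper name `describe_exit_code` is hypothetical and is not part of Dream or Docker, and SIGKILL can also come from other sources (for example a manual `docker kill`), so this narrows the cause down rather than proving it.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe an exit code using the 128 + signal-number convention.

    Hypothetical helper for illustration only.
    """
    if code == 0:
        return "exited normally"
    if code > 128:
        signum = code - 128
        try:
            # Resolve the signal number to its name on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            # The number is not a known signal on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with error status {code}"


if __name__ == "__main__":
    # 137 is the code reported for dream_ner_1 and dream_dialogpt_1 above.
    print(describe_exit_code(137))
```

To check whether Docker itself recorded an out-of-memory kill for a stopped container, `sudo docker inspect --format '{{.State.OOMKilled}}' dream_ner_1` prints `true` or `false`.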

I have 64 GB of RAM, and during execution memory usage does not go above 25 GB.
GPU memory is 4 GB.
Is that enough?