Certificate verify failed: self signed certificate in certificate chain

I am trying to install Dream. In several Dockerfiles, every pip call ends with an error like this:

RUN pip install --upgrade pip:
1.205 Collecting pip
1.244   WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1076)'))': /packages/8a/6a/19e9fe04fca059ccf770861c7d5721ab4c2aebc539889e97c7977528a53b/pip-24.0-py3-none-any.whl
1.773   WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1076)'))': /packages/8a/6a/19e9fe04fca059ccf770861c7d5721ab4c2aebc539889e97c7977528a53b/pip-24.0-py3-none-any.whl
2.806   WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1076)'))': /packages/8a/6a/19e9fe04fca059ccf770861c7d5721ab4c2aebc539889e97c7977528a53b/pip-24.0-py3-none-any.whl
4.837   WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1076)'))': /packages/8a/6a/19e9fe04fca059ccf770861c7d5721ab4c2aebc539889e97c7977528a53b/pip-24.0-py3-none-any.whl
8.868   WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1076)'))': /packages/8a/6a/19e9fe04fca059ccf770861c7d5721ab4c2aebc539889e97c7977528a53b/pip-24.0-py3-none-any.whl
8.897 ERROR: Could not install packages due to an EnvironmentError: HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Max retries exceeded with url: /packages/8a/6a/19e9fe04fca059ccf770861c7d5721ab4c2aebc539889e97c7977528a53b/pip-24.0-py3-none-any.whl (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1076)')))

I faced the same error earlier while installing DeepPavlov, and also with the other Dockerfiles while installing Dream. I solved those errors by running export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt right after activating my virtual environment. However, I have not managed to make these certificates visible in the “agent”, “confidence-based-response-selector” and “emotion-classification-deepy” modules. I don’t know why the other modules work.
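
One possible reason for the difference: a variable exported in the WSL shell is not inherited by the RUN steps of docker build, which only see what the image itself defines (for example through ENV or ARG). A minimal diagnostic sketch, assuming it is run inside one of the failing build steps, prints which CA-related settings the process actually sees. The helper name describe_ca_settings is made up for illustration.

```python
import os
import ssl

# Variables commonly consulted for a CA bundle: REQUESTS_CA_BUNDLE and
# CURL_CA_BUNDLE by requests, SSL_CERT_FILE by OpenSSL, PIP_CERT by pip.
CA_ENV_VARS = ("REQUESTS_CA_BUNDLE", "CURL_CA_BUNDLE", "SSL_CERT_FILE", "PIP_CERT")


def describe_ca_settings(environ=None):
    """Return the CA-related settings visible to the current process."""
    environ = os.environ if environ is None else environ
    report = {}
    for name in CA_ENV_VARS:
        path = environ.get(name)
        # A variable only helps if it is set AND points to a real file.
        report[name] = {"path": path, "exists": bool(path) and os.path.isfile(path)}
    defaults = ssl.get_default_verify_paths()
    report["openssl_default_cafile"] = defaults.cafile
    report["openssl_default_capath"] = defaults.capath
    return report


if __name__ == "__main__":
    for key, value in describe_ca_settings().items():
        print(f"{key}: {value}")
```

If every variable shows up as unset inside the container, the certificate settings from the host never reached that build step.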

I use Python 3.10 in WSL on Windows 11. I didn’t manage to apply what was suggested for macOS. The problem might be connected with our company firewall, which also prevents me from watching videos, including the one with another possible solution suggested here. I’ve already asked for help with a workaround, but a clear solution would be better even if the workaround worked (it doesn’t).

So, what to do?

It seems I’ve found at least a partial solution in this Stack Overflow question: “How to add a custom CA Root certificate to the CA Store used by pip in Windows?”

Concretely, this part:

pip config set global.cert path/to/ca-bundle.crt
pip config list
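
The file passed to global.cert typically has to contain the company root certificate in addition to the public roots, because pip then verifies against that file. A minimal sketch, assuming both are available as PEM files, concatenates them into one bundle. The function name build_ca_bundle and the paths in the usage note are hypothetical.

```python
from pathlib import Path

PEM_BEGIN = "-----BEGIN CERTIFICATE-----"


def build_ca_bundle(sources, destination):
    """Concatenate PEM files into one bundle and return the certificate count."""
    parts = []
    for source in sources:
        text = Path(source).read_text(encoding="utf-8").strip()
        if PEM_BEGIN not in text:
            # DER-encoded (binary) certificates must be converted to PEM first.
            raise ValueError(f"{source} does not look like a PEM certificate file")
        parts.append(text)
    bundle = "\n".join(parts) + "\n"
    Path(destination).write_text(bundle, encoding="utf-8")
    return bundle.count(PEM_BEGIN)
```

For example, build_ca_bundle(["/etc/ssl/certs/ca-certificates.crt", "ca-my-company.crt"], "ca-bundle.crt") would produce a file that can then be given to pip config set global.cert.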

Most downloads work now, but not all, and some work only intermittently. I also ran into several errors about package versions that are no longer available from pip, but those seem easy to fix.

One step further…

Now all the modules except the emotion_classification_deepy skill install successfully. Sometimes (in about 50% of attempts) the build already fails at the RUN pip install --upgrade pip line, and I don’t know why. The error message is the same as before. But the main error comes later.

At RUN python -m deeppavlov download emo_bert.json, I get the following error report:

0.993 2024-10-07 10:52:38.309 INFO in 'deeppavlov.core.data.utils'['utils'] at line 97: Downloading from http://files.deeppavlov.ai/deeppavlov_data/emotion_classification/emo_bert3.tar.gz?config=emo_bert to /mnt/c/Users/pavel.veselsky/DeepPavlov/Dream/models/emo_bert3.tar.gz
1.116 2024-10-07 10:52:38.432 WARNING in 'deeppavlov.core.data.utils'['utils'] at line 146: Download failed: HTTPSConnectionPool(host='files.deeppavlov.ai', port=443): Max retries exceeded with url: /deeppavlov_data/emotion_classification/emo_bert3.tar.gz?config=emo_bert (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1007)'))), retrying
1.116 2024-10-07 10:52:38.432 INFO in 'deeppavlov.core.data.utils'['utils'] at line 97: Downloading from http://files.deeppavlov.ai/deeppavlov_data/emotion_classification/emo_bert3.tar.gz?config=emo_bert to /mnt/c/Users/pavel.veselsky/DeepPavlov/Dream/models/emo_bert3.tar.gz
1.246 2024-10-07 10:52:38.562 WARNING in 'deeppavlov.core.data.utils'['utils'] at line 146: Download failed: HTTPSConnectionPool(host='files.deeppavlov.ai', port=443): Max retries exceeded with url: /deeppavlov_data/emotion_classification/emo_bert3.tar.gz?config=emo_bert (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1007)'))), retrying
1.246 2024-10-07 10:52:38.562 INFO in 'deeppavlov.core.data.utils'['utils'] at line 97: Downloading from http://files.deeppavlov.ai/deeppavlov_data/emotion_classification/emo_bert3.tar.gz?config=emo_bert to /mnt/c/Users/pavel.veselsky/DeepPavlov/Dream/models/emo_bert3.tar.gz
1.374 2024-10-07 10:52:38.690 WARNING in 'deeppavlov.core.data.utils'['utils'] at line 146: Download failed: HTTPSConnectionPool(host='files.deeppavlov.ai', port=443): Max retries exceeded with url: /deeppavlov_data/emotion_classification/emo_bert3.tar.gz?config=emo_bert (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1007)'))), retrying
1.374 2024-10-07 10:52:38.690 INFO in 'deeppavlov.core.data.utils'['utils'] at line 97: Downloading from http://files.deeppavlov.ai/deeppavlov_data/emotion_classification/emo_bert3.tar.gz?config=emo_bert to /mnt/c/Users/pavel.veselsky/DeepPavlov/Dream/models/emo_bert3.tar.gz
1.509 Traceback (most recent call last):
1.509   File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 466, in _make_request
1.509     self._validate_conn(conn)
1.509   File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1095, in _validate_conn
1.509     conn.connect()
1.509   File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 730, in connect
1.509     sock_and_verified = _ssl_wrap_socket_and_match_hostname(
1.509   File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 909, in _ssl_wrap_socket_and_match_hostname
1.509     ssl_sock = ssl_wrap_socket(
1.509   File "/usr/local/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 469, in ssl_wrap_socket
1.509     ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls, server_hostname)
1.509   File "/usr/local/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 513, in _ssl_wrap_socket_impl
1.509     return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
1.509   File "/usr/local/lib/python3.10/ssl.py", line 513, in wrap_socket
1.509     return self.sslsocket_class._create(
1.509   File "/usr/local/lib/python3.10/ssl.py", line 1071, in _create
1.509     self.do_handshake()
1.509   File "/usr/local/lib/python3.10/ssl.py", line 1342, in do_handshake
1.509     self._sslobj.do_handshake()
1.509 ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1007)
1.509
1.509 During handling of the above exception, another exception occurred:
1.509
... (many paragraphs about other CERTIFICATE_VERIFY_FAILED errors) ...
------
failed to solve: process "/bin/sh -c python -m deeppavlov download emo_bert.json" did not complete successfully: exit code: 1

I tried changing the line to RUN python -m pip install http://files.deeppavlov.ai/deeppavlov_data/emotion_classification/emo_bert3.tar.gz and added a few lines that handle the certificates (copy, chmod 644 and pip config set global.cert). The message changed:

0.558 Collecting http://files.deeppavlov.ai/deeppavlov_data/emotion_classification/emo_bert3.tar.gz
3.097   Downloading http://files.deeppavlov.ai/deeppavlov_data/emotion_classification/emo_bert3.tar.gz (1109.3 MB)
194.9      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 GB 5.7 MB/s eta 0:00:00
194.9 ERROR: Exception:
194.9 Traceback (most recent call last):
194.9   File "/usr/local/lib/python3.10/tarfile.py", line 1870, in gzopen
194.9     t = cls.taropen(name, mode, fileobj, **kwargs)
194.9   File "/usr/local/lib/python3.10/tarfile.py", line 1847, in taropen
194.9     return cls(name, mode, fileobj, **kwargs)
194.9   File "/usr/local/lib/python3.10/tarfile.py", line 1707, in __init__
194.9     self.firstmember = self.next()
194.9   File "/usr/local/lib/python3.10/tarfile.py", line 2622, in next
194.9     raise e
194.9   File "/usr/local/lib/python3.10/tarfile.py", line 2595, in next
194.9     tarinfo = self.tarinfo.fromtarfile(self)
194.9   File "/usr/local/lib/python3.10/tarfile.py", line 1285, in fromtarfile
194.9     buf = tarfile.fileobj.read(BLOCKSIZE)
194.9   File "/usr/local/lib/python3.10/gzip.py", line 301, in read
194.9     return self._buffer.read(size)
194.9   File "/usr/local/lib/python3.10/_compression.py", line 68, in readinto
194.9     data = self.read(len(byte_view))
194.9   File "/usr/local/lib/python3.10/gzip.py", line 488, in read
194.9     if not self._read_gzip_header():
194.9   File "/usr/local/lib/python3.10/gzip.py", line 436, in _read_gzip_header
194.9     raise BadGzipFile('Not a gzipped file (%r)' % magic)
194.9 gzip.BadGzipFile: Not a gzipped file (b'\xfd7')
194.9
194.9 The above exception was the direct cause of the following exception:
194.9
194.9 Traceback (most recent call last):
194.9   File "/usr/local/lib/python3.10/site-packages/pip/_internal/cli/base_command.py", line 105, in _run_wrapper
194.9     status = _inner_run()
194.9   File "/usr/local/lib/python3.10/site-packages/pip/_internal/cli/base_command.py", line 96, in _inner_run
194.9     return self.run(options, args)
194.9   File "/usr/local/lib/python3.10/site-packages/pip/_internal/cli/req_command.py", line 67, in wrapper
194.9     return func(self, options, args)
194.9   File "/usr/local/lib/python3.10/site-packages/pip/_internal/commands/install.py", line 379, in run
194.9     requirement_set = resolver.resolve(
194.9   File "/usr/local/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 76, in resolve
194.9     collected = self.factory.collect_root_requirements(root_reqs)
194.9   File "/usr/local/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 539, in collect_root_requirements
194.9     reqs = list(
194.9   File "/usr/local/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 495, in _make_requirements_from_install_req
194.9     cand = self._make_base_candidate_from_link(
194.9   File "/usr/local/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 232, in _make_base_candidate_from_link
194.9     self._link_candidate_cache[link] = LinkCandidate(
194.9   File "/usr/local/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 303, in __init__
194.9     super().__init__(
194.9   File "/usr/local/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 158, in __init__
194.9     self.dist = self._prepare()
194.9   File "/usr/local/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 235, in _prepare
194.9     dist = self._prepare_distribution()
194.9   File "/usr/local/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 314, in _prepare_distribution
194.9     return preparer.prepare_linked_requirement(self._ireq, parallel_builds=True)
194.9   File "/usr/local/lib/python3.10/site-packages/pip/_internal/operations/prepare.py", line 527, in prepare_linked_requirement
194.9     return self._prepare_linked_requirement(req, parallel_builds)
194.9   File "/usr/local/lib/python3.10/site-packages/pip/_internal/operations/prepare.py", line 598, in _prepare_linked_requirement
194.9     local_file = unpack_url(
194.9   File "/usr/local/lib/python3.10/site-packages/pip/_internal/operations/prepare.py", line 180, in unpack_url
194.9     unpack_file(file.path, location, file.content_type)
194.9   File "/usr/local/lib/python3.10/site-packages/pip/_internal/utils/unpacking.py", line 326, in unpack_file
194.9     untar_file(filename, location)
194.9   File "/usr/local/lib/python3.10/site-packages/pip/_internal/utils/unpacking.py", line 179, in untar_file
194.9     tar = tarfile.open(filename, mode, encoding="utf-8")
194.9   File "/usr/local/lib/python3.10/tarfile.py", line 1817, in open
194.9     return func(name, filemode, fileobj, **kwargs)
194.9   File "/usr/local/lib/python3.10/tarfile.py", line 1874, in gzopen
194.9     raise ReadError("not a gzip file") from e
194.9 tarfile.ReadError: not a gzip file
------
failed to solve: process "/bin/sh -c python -m pip install http://files.deeppavlov.ai/deeppavlov_data/emotion_classification/emo_bert3.tar.gz" did not complete successfully: exit code: 2

This looks like exactly the same problem that I failed to work around earlier. Since I appear to be the only one who can’t unpack the downloaded file, I suspect my company’s firewall of breaking it even with the correct certificate. But that would be strange… where else could the problem be?
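
A detail in the traceback may point away from the firewall: the bytes b'\xfd7' that gzip rejects are how an xz stream begins (its signature is FD 37 7A 58 5A 00), while a gzip stream starts with 1F 8B. So the file may simply be an xz-compressed tarball with a .tar.gz name, which pip opens strictly as gzip because of the extension. A minimal sketch using only the standard library checks the signature and opens the archive with automatic detection. The function names are illustrative.

```python
import tarfile

# Leading bytes ("magic numbers") of common compressed streams.
MAGIC_NUMBERS = {
    b"\x1f\x8b": "gzip",
    b"\xfd7zXZ\x00": "xz",
    b"BZh": "bzip2",
}


def sniff_compression(header: bytes) -> str:
    """Guess the compression format from the first bytes of a file."""
    for magic, name in MAGIC_NUMBERS.items():
        if header.startswith(magic):
            return name
    return "unknown"


def sniff_file(path) -> str:
    """Read the first bytes of a file and guess its compression format."""
    with open(path, "rb") as handle:
        return sniff_compression(handle.read(8))


def list_members(path):
    """List archive members, letting tarfile detect the compression itself."""
    # Mode "r:*" ignores the file extension and inspects the content instead.
    with tarfile.open(path, "r:*") as archive:
        return archive.getnames()
```

GNU tar also detects the compression by itself when reading, so a plain tar -xvf on the downloaded file is a quick cross-check.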

Even though the emo_bert3.tar.gz file seemed corrupted, or “not a gzip file”, to the Python code invoked by pip, I managed to extract it. First, I renamed it to emo_bert3.tar and then ran tar -xvf emo_bert3.tar in my WSL command line (Ubuntu 22.04). I moved the extracted files to a subdirectory and ran tar -czvf emo_bert4.tar.gz . from there. Now I can open the archive even from Windows. I updated my code and tried to install emotion_classification_deepy from the new archive. I got the following error:

0.518 Processing ./emo_bert4.tar.gz
12.50 ERROR: file:///app/emo_bert4.tar.gz does not appear to be a Python project: neither 'setup.py' nor 'pyproject.toml' found.
------
failed to solve: process "/bin/sh -c pip install /app/emo_bert4.tar.gz" did not complete successfully: exit code: 1

This seems to be intended: the archive apparently was never meant to be a pip-installable package, and it would be too strange if some files were removed while unpacking and repacking the tarball. So the workaround with pip install from the archive, whether online or on my disk, is not viable. But how do I pass the certificate to the emo_bert.json script and the scripts called from it?

Finally, I needed to patch DeepPavlov/deeppavlov/core/data/utils.py, line 105, to:

r = requests.get(url, stream=True, headers=headers, verify='ca-my-company.crt')

Now almost everything works. However, Emotion Classification Deepy relies on an old version of DeepPavlov (0.12.0, if I’m correct). To apply my certificate hack, I tried to upgrade the annotator to DeepPavlov 1.6.0, but after a few days I don’t think it is worth trying any more. Other ways I can think of are:

  1. make the request in DeepPavlov/deeppavlov/core/data/utils.py find the certificate without any hacks, or

  2. install old DP 0.12.0 alongside the 1.6.0 I already have.
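
For option 1, a minimal sketch, assuming the certificate is copied into the image and its path is exposed through REQUESTS_CA_BUNDLE (for example with ENV in the Dockerfile). resolve_verify is a hypothetical helper, not part of DeepPavlov.

```python
import os

# Variables that requests itself consults for a CA bundle.
CA_ENV_VARS = ("REQUESTS_CA_BUNDLE", "CURL_CA_BUNDLE")


def resolve_verify(environ=None):
    """Return a value suitable for the `verify` argument of requests.get."""
    environ = os.environ if environ is None else environ
    for name in CA_ENV_VARS:
        path = environ.get(name)
        if path:
            if not os.path.isfile(path):
                # Fail early with a clear message instead of a later SSL error.
                raise FileNotFoundError(f"{name} points to a missing file: {path}")
            return path
    # Nothing configured: keep the default certificate verification.
    return True
```

The patched line would then read requests.get(url, stream=True, headers=headers, verify=resolve_verify()) instead of hard-coding the file name. requests also reads REQUESTS_CA_BUNDLE by itself when verify is left at its default, so setting the variable inside the image might already be enough without any patch.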

I understand no. 1 would be a lot cleaner, but I’ve already spent days looking for a solution, to no avail. Can anyone help me with this?