This is somewhat tangential to the precise question you asked, but it may nevertheless be interesting. At a high level: while the compression approaches you applied may not be able to compress encrypted data, there are schemes that can in fact compress encrypted data significantly, provided the secret key is available at decompression time (but not during compression, of course).
To my knowledge, the first work proposing compression of encrypted data is by Johnson et al. [1]. The main insight is that while the compression algorithm sees only the encrypted stream, if we assume the decoder has access to the secret key, then it can combine decoding and decryption into one step, enabling compression gains via a reduction to the Wyner-Ziv source coding problem. I'll give a brief account of the results, restricting to the lossless setting for simplicity.
Structure: Suppose the secret key $K^n$ is drawn uniformly from a secret-key codebook $\mathcal{K}_n \subset \{0,1\}^n$ of size $2^{nR'}$, and the data $X^n \in \{0,1\}^n$ is drawn iid from some law $p$ and encrypted to give $Y^n = X^n \oplus K^n$, where the XOR is applied bitwise. An encoder then maps $Y^n$ to a length-$nR$ bitstring $B$, without knowledge of $K^n$ but knowing the secret-key codebook $\mathcal{K}_n$, and a decoder uses $(B, K^n)$ to produce $\hat{X}^n$ (implicitly, we're assuming the decoder receives the secret key over some secure channel). Say that the rate pair $(R', R)$ is achievable with Wyner-sense perfect secrecy if we can find an encoder and decoder ensuring $P(X^n \neq \hat{X}^n) \to 0$ as $n \to \infty$, while an eavesdropper who sees only $B$ learns essentially nothing, i.e. the equivocation satisfies $\tfrac{1}{n} H(X^n \mid B) \to H(X)$.
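To see concretely why your off-the-shelf compressor fails on the ciphertext, here is a small sketch (names and parameters are my own, not from [1]): a highly compressible sparse plaintext is XORed with a uniform one-time-pad key, and `zlib`, which sees only the ciphertext, can no longer find any redundancy.

```python
import os
import zlib

n = 100_000
# Sparse plaintext: mostly 0x00 bytes, so it is highly compressible.
plaintext = bytes(0x01 if b < 3 else 0x00 for b in os.urandom(n))
key = os.urandom(n)                                 # uniform one-time-pad key
ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))

print(len(zlib.compress(plaintext)))   # far below n: plaintext compresses well
print(len(zlib.compress(ciphertext)))  # about n: ciphertext looks uniform
```

The ciphertext is marginally uniform regardless of $p$, so any keyless compressor that only sees $Y^n$ is stuck; the structure of $X^n$ is only recoverable jointly with $K^n$.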
The insight is that, as far as the encoder is concerned, the secret key $K^n$ is just side information available at the decoder. Using this, the paper argues that if you construct $\mathcal{K}_n$ by drawing $2^{nR'}$ iid sequences according to the law $p$ of $X$ itself, and use Slepian-Wolf coding of $Y^n$ with $K^n$ as decoder side information, then the rate pair $(R', R) = (H(X), H(X))$ is achievable with Wyner-sense perfect secrecy. In other words, you can compress down to the optimal rate and retain $H(X)$ bits of secrecy per symbol (which is the best possible, since the source itself has only $H(X)$ bits of entropy per symbol).
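A toy sketch of the syndrome-based Slepian-Wolf idea may help (this is my own illustration, not the construction in [1], which uses capacity-approaching codes and a general iid source). Assume the sparse source puts at most one 1 in each 7-bit block. The encoder, seeing only the ciphertext $Y^n$, sends the 3-bit Hamming(7,4) syndrome of each block, a 7-to-3 compression. The decoder, holding the key, notes that $H(y \oplus k) = Hy \oplus Hk$ is the syndrome of $x = y \oplus k$, and since $x$ has weight at most 1, that syndrome pins $x$ down exactly.

```python
import numpy as np

# Parity-check matrix of the Hamming(7,4) code: its 7 columns are exactly the
# nonzero 3-bit vectors, so any weight-<=1 pattern is identified by its syndrome.
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]], dtype=np.uint8)

def encode(y_block):
    """Encoder sees only ciphertext: emit the 3-bit syndrome of a 7-bit block."""
    return (H @ y_block) % 2

def decode(s, k_block):
    """Decoder knows the key: s XOR H@k is the syndrome of x = y XOR k."""
    t = (s + H @ k_block) % 2
    x_hat = np.zeros(7, dtype=np.uint8)
    if t.any():
        # The syndrome t equals the column of H at the position of the single 1.
        pos = int(np.flatnonzero((H == t[:, None]).all(axis=0))[0])
        x_hat[pos] = 1
    return x_hat

rng = np.random.default_rng(0)
x = np.zeros(7, dtype=np.uint8)
x[rng.integers(7)] = 1                             # sparse source block
k = rng.integers(0, 2, size=7, dtype=np.uint8)     # one-time-pad key block
y = x ^ k                                          # ciphertext block

s = encode(y)                  # 3 bits on the wire instead of 7
assert (decode(s, k) == x).all()
```

The encoder never touches the key; it exploits only the fact that the decoder's side information $K^n$ is close (in Hamming distance) to $Y^n$ whenever $X^n$ is sparse, which is exactly the Slepian-Wolf setup.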
[1]. Johnson, M., Ishwar, P., Prabhakaran, V., Schonberg, D., & Ramchandran, K. (2004). On compressing encrypted data. IEEE Transactions on Signal Processing, 52(10), 2992-3006.
Note that the paper is 20 years old and well cited, so this theory has likely evolved since. I haven't read much on the topic beyond this, but you can explore it to the extent of your interest by chasing down papers that cite it.