I'm requesting a set of images from one of my APIs. The API returns each image as a JSON document that contains data about the resource together with a single property that holds the image as Base64.

An example of the JSON being returned:

{
    "id": 548613,
    "filename": "00548613.png",
    "pictureTaken": "2020-03-30T11:38:21.003",
    "isVisible": true,
    "lotcode": 23,
    "company": "05",
    "concern": "46",
    "base64": "..."
}

The correct content of the Base64
The incorrectly parsed Base64

This is done with the Python 3 requests library. When I receive a successful response from the API, I attempt to decode the body to JSON using:

url = self.__url__(f"/rest/all/V1/products/{sku}/images")
headers = self.__headers__()
r = requests.get(url=url, headers=headers)
if r.status_code == 200:
    return r.json()
elif r.status_code == 404:
    return None
else:
    raise IOError(
        f"Error retrieving product '{sku}', got {r.status_code}: '{r.text}'")

Calling .json() results in the Base64 content being corrupted: some parts are missing and others are replaced with different characters. After seeing this post, I tried decoding the content manually using r.content.decode() with both the utf-8 and ascii options to see if the encoding was the problem. Sadly, this didn't work. I know the response from the server is correct: it works with Postman, and calling print(r.content) results in a JSON document containing the valid Base64.
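To pin down exactly which characters are lost or replaced, a small diagnostic like the one below could be run on the Base64 string obtained from .json() and on a known-good copy (for example, one saved from Postman). This is only a sketch: the helper names `find_invalid_base64_chars` and `first_difference` are hypothetical and not part of requests, and it assumes the API uses the standard Base64 alphabet rather than the URL-safe one.

```python
import string

# Standard Base64 alphabet plus padding (RFC 4648).
# Assumption: the API does not use the URL-safe variant ('-' and '_').
BASE64_ALPHABET = set(string.ascii_letters + string.digits + "+/=")


def find_invalid_base64_chars(text):
    """Return (index, character) pairs for characters outside the Base64 alphabet."""
    return [(i, ch) for i, ch in enumerate(text) if ch not in BASE64_ALPHABET]


def first_difference(expected, actual):
    """Return the index where two strings first differ, or None if they are equal."""
    for i, (a, b) in enumerate(zip(expected, actual)):
        if a != b:
            return i
    # One string is a prefix of the other: they differ where the shorter one ends.
    if len(expected) != len(actual):
        return min(len(expected), len(actual))
    return None
```

Knowing the offset of the first mismatch and the offending characters makes it easier to tell an encoding problem (consistent substitutions) from truncation (a length mismatch at the end).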

How would I go about deserializing the response from the API to get the valid Base64?

  • @Trenton I assume you mean the Base64. Sadly, I cannot share it because I do not have ownership of the serialized resources. Commented Jun 15, 2020 at 18:00
  • @Harjan Take a random image of a duck. Convert it to Base64. Put that Base64 in a request like the one you provided and see if the problem arises. If yes, post that request so we can try. Commented Jun 15, 2020 at 20:05
  • @Trenton I have added some Base64. It should be a 1024x1024 picture of a pink and white box when parsed correctly. Commented Jun 16, 2020 at 7:10

1 Answer

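import base64
import re
...
# Pull the value of the "base64" property straight out of the raw response bytes.
match = re.search(rb'"base64":\s*"(?P<base>[^"]*)"', r.content)
b64text = match.group("base")
# The payload is an image, so the result stays as bytes instead of being decoded as UTF-8.
decode = base64.b64decode(b64text)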

Since you're saying "calling print(r.content) results in the valid Base64", it's just a matter of decoding the base64.
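If r.content really does hold the valid Base64, as the question states, another option is to skip the regular expression and hand the raw bytes to the standard json module. The sketch below is an alternative to the snippet above, not the same technique; the function name `image_bytes_from_body` is hypothetical, and the "base64" property name is taken from the example JSON in the question.

```python
import base64
import json


def image_bytes_from_body(raw_body):
    """Parse a JSON response body given as bytes and decode its "base64" property."""
    # json.loads accepts bytes encoded as UTF-8, UTF-16 or UTF-32.
    document = json.loads(raw_body)
    # validate=True raises binascii.Error (a ValueError subclass) on any
    # character outside the Base64 alphabet instead of silently skipping it.
    return base64.b64decode(document["base64"], validate=True)
```

It would be called as `image_bytes_from_body(r.content)`. Using `validate=True` means corrupted input fails loudly rather than producing a broken image.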

7 Comments

Good suggestion. I think this might have worked if it was just Base64 that was being returned. Calling this on my content results in the entire JSON response being decoded from Base64.
@Harjan then it's just a matter of extracting the base64 data from the text directly, see my answer for an example implementation.
I tried your edited solution, but calling r.content or r.text results in the same corrupted Base64. Extracting works, but parsing is not possible because the result still contains the illegal characters.
@Harjan Check the response's Content-Type and charset. If the header does not declare a charset, requests has to fall back to a default or a guess, and that is probably not what your API is using. Set the appropriate value using r.encoding and retry. Have you tried using urllib to see whether it reproduces this behaviour?
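The encoding check suggested in the last comment could look roughly like this. It is a sketch only: `describe_encoding` and `decode_body` are hypothetical helpers, and the `SimpleNamespace` object is a stand-in so the example runs without a network call. With a real response from requests, the same attributes (`headers`, `encoding`, `apparent_encoding`, `content`) would be read from `r`.

```python
from types import SimpleNamespace


def describe_encoding(response):
    """Collect what the response says, and what was guessed, about its text encoding."""
    return {
        "content_type": response.headers.get("Content-Type"),
        "encoding": response.encoding,                    # taken from the headers, may be None
        "apparent_encoding": response.apparent_encoding,  # guessed from the body bytes
    }


def decode_body(response, charset):
    """Decode the raw body with an explicitly chosen charset."""
    return response.content.decode(charset)


# Stand-in for a requests.Response (assumption: a JSON API that declares no charset).
fake_response = SimpleNamespace(
    headers={"Content-Type": "application/json"},
    encoding=None,
    apparent_encoding="ascii",
    content=b'{"base64": "aGVsbG8="}',
)

print(describe_encoding(fake_response))
print(decode_body(fake_response, "utf-8"))
```

If the reported encoding differs from what the API actually sends, assigning the right value to `r.encoding` before reading `r.text` is the adjustment the comment describes.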