1

I have the two following encoded string :

base64_str1 = 'eyJzZWN0aW9uX29mZnNldCI6MiwiaXRlbXNfb2Zmc2V0IjozNiwidmVyc2lvbiI6MX0%3D'
base64_str2 = 'eyJzZWN0aW9uX29mZnNldCI6MCwiaXRlbXNfb2Zmc2V0IjowLCJ2ZXJzaW9uIjoxfQ%3D%3D'

Using Base64 online decoder/encoder , the results are as follow (which are the right results) :

base64_str1_decoded = '{"section_offset":2,"items_offset":36,"version":1}7'
base64_str2_decoded = '{"section_offset":0,"items_offset":0,"version":1}'

However, when I tried to encode base64_str1_decoded or base64_str2_decoded back to Base64, I'm not able to obtain the initial base64 strings.

For instance, the ouput for the following code :

base64_str2_decoded = '{"section_offset":0,"items_offset":0,"version":1}'
recoded_str2 = base64.b64encode(bytes(base64_str2_decoded, 'utf-8'))
print(recoded_str2)

# output = b'eyJzZWN0aW9uX29mZnNldCI6MCwiaXRlbXNfb2Zmc2V0IjowLCJ2ZXJzaW9uIjoxfQ=='
# expected_output = eyJzZWN0aW9uX29mZnNldCI6MCwiaXRlbXNfb2Zmc2V0IjowLCJ2ZXJzaW9uIjoxfQ%3D%3D

I tried changing the encoding scheme but can't seem to make it work.

1
  • In the future, it may be more useful to use something like Code Sand Box to test your encoding and show a complete example when possible. The online encoder mentioned is very helpful but it's not a 1:1 comparison for output as shown buy it stripping the b' from the final output, which is used in Python to denote that the string is actually a bytes object. I put together a simple example here: codesandbox.io/p/sandbox/peaceful-ramanujan-w6hk52 Commented Jan 21, 2023 at 2:18

1 Answer 1

1

Notice that extra 7 at the end of base64_str1_decoded? That's because your input strings are incorrect. They have escape codes required for URLs. %3D is an escape code for =, which is what should be entered into the online decoder instead. You'll notice the 2nd string in the decoder has an extra ÃÜ on the next line you haven't shown due to using %3D%3D instead of ==. That online decoder is allowing invalid base64 to be decoded.

To correctly decode in Python use urllib.parse.unquote on the string to remove the escaping first:

import base64
import urllib.parse

base64_str1 = 'eyJzZWN0aW9uX29mZnNldCI6MiwiaXRlbXNfb2Zmc2V0IjozNiwidmVyc2lvbiI6MX0%3D'
base64_str2 = 'eyJzZWN0aW9uX29mZnNldCI6MCwiaXRlbXNfb2Zmc2V0IjowLCJ2ZXJzaW9uIjoxfQ%3D%3D'

# Demonstrate Python decoder detects invalid B64 encoding
try:
    print(base64.b64decode(base64_str1))
except Exception as e:
    print('Exception:', e)
try:
    print(base64.b64decode(base64_str2))
except Exception as e:
    print('Exception:', e)

# Decode after unquoting...
base64_str1_decoded = base64.b64decode(urllib.parse.unquote(base64_str1))
base64_str2_decoded = base64.b64decode(urllib.parse.unquote(base64_str2))
print(base64_str1_decoded)
print(base64_str2_decoded)

# See valid B64 encoding.
recoded_str1 = base64.b64encode(base64_str1_decoded)
recoded_str2 = base64.b64encode(base64_str2_decoded)
print(recoded_str1)
print(recoded_str2)

Output:

Exception: Invalid base64-encoded string: number of data characters (69) cannot be 1 more than a multiple of 4
Exception: Incorrect padding
b'{"section_offset":2,"items_offset":36,"version":1}'
b'{"section_offset":0,"items_offset":0,"version":1}'
b'eyJzZWN0aW9uX29mZnNldCI6MiwiaXRlbXNfb2Zmc2V0IjozNiwidmVyc2lvbiI6MX0='
b'eyJzZWN0aW9uX29mZnNldCI6MCwiaXRlbXNfb2Zmc2V0IjowLCJ2ZXJzaW9uIjoxfQ=='

Note that the b'' notation is Python's indication that the object is a byte string as opposed to a Unicode string and is not part of the string itself.

Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.