I need to decode the base64 values embedded in LDIF (OpenLDAP) backups.

I found here a way to join continuation lines (lines starting with a space).

Then, based on the question "How to decode base64 text in xml file in Linux?", I tried to decode the base64 strings, but I can't get it to work.

My script is:

#Join lines starting with space
sed -n 'H; ${ x; s/\n//; s/\n //g; p}' "$FILE" > "$FILE_JOINED"

#Decode lines containing base64 (those with double colon)
sed -r 's/(:: )([[:graph:]]+)/\1 '"`grep -oP ':: [[:graph:]]+' "$FILE_JOINED" |cut -c 4- | base64 -d`"'/g' "$FILE_JOINED"

When I execute this, I get the following error:

sed: -e expression #1, char 297: unknown option to `s'

Here is an example of the "$FILE_JOINED" contents:

dn: olcDatabase={1}mdb,cn=config
objectClass: olcDatabaseConfig
objectClass: olcMdbConfig
olcDatabase: {1}mdb
olcDbDirectory: /var/lib/ldap
olcSuffix: dc=proxy,dc=ldap
olcAccess:: b25lIHZhbHVlCg==
olcAccess: {1}to filter=(&(objectClass=securityPrincipal)(!(pwdAccountLockedTime=*))) attrs=userPassword,shadowLastChange by dn="cn=Man1,ou=local,dc=proxy,dc=ldap" write by anonymous auth by self write by * none
olcAccess: {2} to * by * read
olcAddContentAcl: FALSE
olcLastMod: TRUE
olcMaxDerefDepth: 15
olcReadOnly: FALSE
olcRootDN: cn=Man1,ou=local,dc=proxy,dc=ldap
olcRootPW:: dmFsdWUgdHdvCg==
olcSyncUseSubentry: FALSE
olcSyncrepl:: dmFsdWUgdGhyZWUK
olcMirrorMode: TRUE

dn: olcOverlay={0}unique,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcUniqueConfig

(Note that the second command leaves the double colon (::) in place instead of reducing it to a single colon. I did that on purpose so that I can easily grep the output; I'll fix it later.)

The second command has a grep in it. How does it "select" the correct line to decode out of the whole file?

Here is the result of the grep command alone:

# grep -oP ':: [[:graph:]]+' x |cut -c 4- | base64 -d
one value
value two
value three

Could anybody please give me some pointers on how to decode the base64 values contained in an LDIF file?
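
On the question of how the grep "selects" a line: it does not. The shell evaluates the backquoted command substitution once, before sed starts, so every matching line would receive the same concatenated text. A minimal sketch on a hypothetical two-line sample (the attribute names `a` and `b` and the variable names are made up for illustration):

```shell
# Hypothetical sample: two base64 values ("one" and "two").
sample='a:: b25l
b:: dHdv'

# The command substitution runs once, over the whole input.
decoded=$(printf '%s\n' "$sample" | grep -o ':: [[:graph:]]*' | cut -c 4- | base64 -d)

# sed then inserts that single string into every matching line.
result=$(printf '%s\n' "$sample" | sed -E "s/(:: )([[:graph:]]+)/\1$decoded/")

printf '%s\n' "$result"
```

Both lines come out with the same value, `onetwo`, which is why the decoding has to happen per line rather than in one substitution computed up front.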

  • The problem you're likely having is that the output of base64 -d may include a / and so is terminating the sed statement too early. You may need to force any / to be quoted to protect it. Commented Aug 30, 2018 at 1:20
  • Thank you for your answer. In the end I changed my approach: instead of using a grep command, I echo the value to be decoded straight into base64. Commented Aug 30, 2018 at 2:20
  • see my answer (using perl and MIME::Base64) to a very similar (almost a dupe) question at unix.stackexchange.com/a/735968 Commented Feb 18, 2023 at 5:04
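
The first comment's point can be reproduced in isolation. The sketch below assumes a decoded value that contains `/` and escapes the characters that are special in a sed replacement before splicing it in. The sample line and the variable names are illustrative only.

```shell
# Hypothetical decoded value containing the sed delimiter.
value='a/b'

# Put a backslash in front of "/", "&" and "\" so sed treats them literally.
escaped=$(printf '%s' "$value" | sed 's|[/&\\]|\\&|g')

# Without the escaping step, sed would stop at the "/" inside the value
# and try to parse the remainder as flags of the s command.
result=$(printf 'x:: QQ==\n' | sed -E "s/(:: )([[:graph:]]+)/\1$escaped/")

printf '%s\n' "$result"
```

Escaping handles `/`, `&` and `\`, but a value containing a newline would still break the sed expression, so this is only a partial remedy.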

3 Answers

I found a way to do it:

sed -r 's/(.*:)(: )([[:graph:]]+)/echo "\1 `echo -n '\\3' |base64 -d`"/ge' "$FILE_JOINED"
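
The `e` flag makes GNU sed hand each rewritten line to a shell for execution, so the attribute text is also interpreted by that shell. As an alternative, the same decoding can be sketched as a plain shell loop. This is not the method used above: `decode_ldif` is a hypothetical helper name, it expects already-unfolded LDIF on standard input, and because command substitution strips trailing newlines it is only suitable for text values.

```shell
# decode_ldif (hypothetical helper): decode "attr:: base64" lines read from stdin.
decode_ldif() {
    while IFS= read -r line || [ -n "$line" ]; do
        attr=${line%%:*}          # text before the first colon
        rest=${line#"$attr"}      # remainder, starting at that colon
        case $rest in
            ':: '*)
                value=${rest#':: '}
                # The value reaches base64 on stdin, not on a command line.
                printf '%s: %s\n' "$attr" "$(printf '%s' "$value" | base64 -d)"
                ;;
            *)
                printf '%s\n' "$line"
                ;;
        esac
    done
}
```

It could be used as `decode_ldif < "$FILE_JOINED"`. Only lines whose first colon is immediately followed by `: ` are decoded, so a `::` that appears later inside an ordinary value is left alone.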

And if you want to fold the long lines (based on this answer):

sed -r 's/(.*:)(: )([[:graph:]]+)/echo "\1 `echo -n '\\3' |base64 -d`"/ge' "$FILE_JOINED" | \
awk -v WIDTH=76 '
{
    space="";
    while (length>WIDTH) {
        print substr($0,1,WIDTH);
        space=" ";
        $0=space substr($0,WIDTH+1);
    }
    print;
}
'
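
To check that folding of this kind is reversible, the awk logic can be wrapped in a function and paired with an unfolding step. This is a sketch under assumptions: `fold_ldif` and `unfold_ldif` are hypothetical names, the width of 20 is chosen only to keep the sample short, and the sample line is made up.

```shell
# fold_ldif WIDTH: wrap lines longer than WIDTH; continuations start with a space.
fold_ldif() {
    awk -v WIDTH="${1:-76}" '{
        while (length($0) > WIDTH) {
            print substr($0, 1, WIDTH)
            $0 = " " substr($0, WIDTH + 1)
        }
        print
    }'
}

# unfold_ldif: append each continuation line (leading space) to the previous line.
unfold_ldif() {
    awk '
        /^ / { sub(/^ /, ""); printf "%s", $0; next }
        NR > 1 { printf "\n" }
        { printf "%s", $0 }
        END { if (NR) printf "\n" }
    '
}

line='description: abcdefghijklmnopqrstuvwxyz'
folded=$(printf '%s\n' "$line" | fold_ldif 20)
restored=$(printf '%s\n' "$folded" | unfold_ldif)
printf '%s\n' "$folded"
```

Each output line, including the leading space on continuation lines, is at most WIDTH characters long, and unfolding the result gives back the original line.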

In case anybody needs it, here is the whole script.

[Note: unlike the preceding command, the AWK command in the script leaves comment lines (lines beginning with "#") alone.]

#!/bin/bash

FILE=$1

# Quote every expansion so that paths containing spaces keep working
DIR=$(dirname "$FILE")
pushd "$DIR"

WIDTH=76

FILE=$(basename "$FILE")
FILE_JOINED="$(basename "$FILE" .ldif)-una-linea.ldif"
FILE_DECODED="$(basename "$FILE" .ldif)-decodificado.ldif"

echo
echo DIR: $DIR
echo FILE: $FILE
echo FILE_JOINED: $FILE_JOINED
echo FILE_DECODED: $FILE_DECODED

sed -n 'H; ${ x; s/\n//; s/\n //g; p}' "$FILE" > "$FILE_JOINED"

sed -r 's/(.*:)(: )([[:graph:]]+)/echo "\1 `echo -n '\\3' |base64 -d`"/ge' "$FILE_JOINED" | \
awk -v WIDTH=$WIDTH -v space=" " '
/^[^#]/ {
    while (length>WIDTH) {
        print substr($0,1,WIDTH);
        $0=space substr($0,WIDTH+1);
    }
    print;
}
/^[#]|^$/ {
    print;
}
' > "$FILE_DECODED"

rm "$FILE_JOINED"

UPDATE 20180830

There was a shell-expansion bug: "*" characters were not preserved but were replaced with a list of file names.

The fix was to put double quotes around the argument of the first echo command. The commands and the script shown above already include this fix.

The ERRONEOUS command was:

sed -r 's/(.*:)(: )([[:graph:]]+)/echo \1 `echo -n '\\3' |base64 -d`/ge' "$FILE_JOINED"
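
The expansion described in this update can be reproduced outside sed. The sketch below uses a scratch directory with two hypothetical files so that the unquoted `*` has something to match. The value is modelled on the `olcAccess` lines in the question.

```shell
workdir=$(mktemp -d)                      # scratch directory for the demonstration
touch "$workdir/file1" "$workdir/file2"
value='by * read'

# Unquoted: the shell expands "*" to the file names in the current directory.
unquoted=$(cd "$workdir" && echo $value)

# Quoted: the "*" reaches echo unchanged.
quoted=$(cd "$workdir" && echo "$value")

printf '%s\n%s\n' "$unquoted" "$quoted"
rm -r "$workdir"
```

The first line printed is `by file1 file2 read` and the second is `by * read`.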

UPDATE 20180830-b

The AWK command was also modifying comments, and it shouldn't have.

The PREVIOUS command was:

awk -v WIDTH=$WIDTH '
BEGIN {
    space=" ";
}
{
    while (length>WIDTH) {
        print substr($0,1,WIDTH);
        $0=space substr($0,WIDTH+1);
    }
    print;
}
' > $FILE_DECODED

Building on your base64 approach, I combined the unwrap and decode steps into a single pipeline:

sed -n ':loop N; s/\n //; t loop; P; D' | sed -r 's/(.*:)(: )([[:graph:]]+)/echo "\1 `echo -n '\\3' |base64 -d`"/ge' 

The sed command piped into awk in elysch’s answer didn't work for me on some of my data, so I replaced that one line with the following Perl:

perl -pe 'use MIME::Base64; s/(.*:)(: )([[:graph:]]+)/$1 . " " . decode_base64($3)/e' "$FILE_JOINED"

Pipe that into awk if you want.

  • Can you characterize the data that caused the OP’s answer to fail, and describe how it failed? Commented Jul 2, 2024 at 22:10
  • I think the issue is the length of the data. The one entry I was testing was 295,000 characters long. Using shorter data worked, but long data did not. It just didn't send anything to 'base64'. perl didn't mind the long data. Commented Jul 4, 2024 at 12:21
