Skip first 3 bytes of a file

Question

I am using AIX 6.1 ksh shell.

I want to use a one-liner to do something like this:

cat A_FILE | skip-first-3-bytes-of-the-file

I want to skip the first 3 bytes of the first line; is there a way to do this?

Jonathan Leffler · Accepted Answer · 2017-02-03 23:40:03Z

29

Old school — you could use dd:

dd if=A_FILE bs=1 skip=3

The input file is A_FILE, the block size is 1 byte, and dd skips the first 3 'blocks' (bytes). With some variants of dd, such as GNU dd, you could write bs=1c here, and use alternatives like bs=1k to read in blocks of 1 kilobyte in other circumstances. The dd on AIX does not support this, it seems; the BSD (macOS Sierra) variant doesn't support c but does support k, m, g, etc.
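A minimal sketch of the dd approach, wrapped in a helper so it can be tried on a throwaway file. The name skip3_dd and the sample content are assumptions for illustration only. dd writes its record counts to standard error, so the sketch discards them.

```shell
#!/bin/sh
# skip3_dd is a hypothetical helper name, not part of the original answer.
# It copies FILE to stdout, leaving out the first 3 bytes.
skip3_dd() {
    # bs=1: one-byte blocks; skip=3: skip 3 input blocks (= 3 bytes).
    # dd reports its record counts on stderr, so discard them.
    dd if="$1" bs=1 skip=3 2>/dev/null
}

# Demo on a temporary sample file (assumed content).
sample=$(mktemp)
printf 'ABCdef\nsecond line\n' > "$sample"
skip3_dd "$sample"
rm -f "$sample"
```

When if= is omitted, dd reads standard input, so the same operands should also work as a pipeline stage, as in cat A_FILE | dd bs=1 skip=3.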

There are other ways to achieve the same result, too:

sed '1s/^...//' A_FILE

This works only if the first line has 3 or more characters; with fewer, the line is left unchanged.

tail -c +4 A_FILE

And you could use Perl, Python and so on too.
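The sed and tail variants behave differently when the first line is short: sed edits line 1 only if it has three characters to remove, while tail -c counts bytes regardless of line boundaries. A small sketch with made-up sample data (the variable names are assumptions):

```shell
#!/bin/sh
# Sample data is an assumption: the first line has only 2 characters.
sample=$(mktemp)
printf 'ab\ncdef\n' > "$sample"

# sed: the pattern ^... needs 3 characters on line 1, so nothing is removed.
sed_out=$(sed '1s/^...//' "$sample")

# tail: output starts at byte 4, so "ab" plus the newline (3 bytes) are dropped.
tail_out=$(tail -c +4 "$sample")

printf 'sed:  %s\n' "$sed_out"
printf 'tail: %s\n' "$tail_out"
rm -f "$sample"
```

Which behaviour is wanted depends on whether "first 3 bytes" refers to the file or to the first line.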

edited Feb 3, 2017 at 23:40

answered Oct 24, 2012 at 15:38

Jonathan Leffler

Thanks for your help. Both the sed and the tail commands work in AIX 6.1. For the dd command, it should be dd if=A_FILE bs=1 skip=3 in AIX 6.1

Alvin SIU
– Alvin SIU

2012-10-25 13:55:39 +00:00
Commented Oct 25, 2012 at 13:55
You may want to read from standard input instead, e.g. cat A_FILE | tail -c +4, with GNU tail.

MUY Belgium
– MUY Belgium

2013-11-08 07:57:16 +00:00
Commented Nov 8, 2013 at 7:57
2

Warning: using dd like this will slow down the whole process by several orders of magnitude. bs=1 sets the block size to a single byte which prevents efficient file IO. Switching the parameters, e.g. bs=3 skip=1 helps a little bit, but using tail is much more efficient anyway. I did not test sed for speed.

Lena Schimmel
– Lena Schimmel

2023-12-02 12:11:42 +00:00
Commented Dec 2, 2023 at 12:11
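The block-size suggestion in the comment above can be checked with a quick sketch: on a regular file, bs=3 skip=1 skips one 3-byte block and should produce the same output as bs=1 skip=3. The variable names and sample data are assumptions.

```shell
#!/bin/sh
sample=$(mktemp)
printf 'ABCdefghij\n' > "$sample"

# One-byte blocks, skipping three of them (slow on large files).
slow=$(dd if="$sample" bs=1 skip=3 2>/dev/null)
# One three-byte block skipped; the rest is copied 3 bytes at a time.
fast=$(dd if="$sample" bs=3 skip=1 2>/dev/null)

if [ "$slow" = "$fast" ]; then
    echo "same output: $fast"
else
    echo "outputs differ"
fi
rm -f "$sample"
```

This only compares the results; it does not measure speed.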

squiguy · Accepted Answer · 2012-10-24 15:29:51Z

26

Instead of using cat, you can use tail like this:

tail -c +4 FILE

This will print out the entire file except for the first 3 bytes. Consult man tail for more information.
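Since the question asks for a pipeline stage, note that tail reads standard input when no file is given. A minimal sketch of a reusable filter follows; skip_bytes is a hypothetical name, and the arithmetic reflects that tail -c +K starts output at byte K, counting from 1.

```shell
#!/bin/sh
# skip_bytes N: hypothetical filter that drops the first N bytes of stdin.
skip_bytes() {
    # tail -c +K begins output at the K-th byte, so K = N + 1.
    tail -c "+$(( $1 + 1 ))"
}

# Usage in a pipeline (sample input is made up):
printf 'ABCdef\n' | skip_bytes 3
```

With the file from the question, this would be used as cat A_FILE | skip_bytes 3.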

answered Oct 24, 2012 at 15:29

squiguy

Don't know about AIX, but on Solaris you must use /usr/xpg4/bin/tail, at least on my machine. Good tip nonetheless!

BellevueBob
– BellevueBob

2012-10-24 19:34:32 +00:00
Commented Oct 24, 2012 at 19:34
1

@BobDuell It's hard to post something that is compatible with every OS.

squiguy
– squiguy

2012-10-24 20:10:20 +00:00
Commented Oct 24, 2012 at 20:10
Yes, it works in AIX 6.1

Alvin SIU
– Alvin SIU

2012-10-25 13:54:50 +00:00
Commented Oct 25, 2012 at 13:54
@AlvinSIU Good to know. Glad I could help.

squiguy
– squiguy

2012-10-25 15:43:24 +00:00
Commented Oct 25, 2012 at 15:43
Thank you, this is a much better choice for working with large files with a tiny amount of garbage at the beginning. I used dd over an ssh connection to get a file image and I needed to remove the "[sudo] password for X:" at the beginning of the resulting file.

Compholio
– Compholio

2022-06-19 23:11:29 +00:00
Commented Jun 19, 2022 at 23:11

Sergiy Kolodyazhnyy · Accepted Answer · 2017-02-04 03:59:30Z

1

If Python is available on the system, a small Python script can use the seek() function to start reading at the nth byte, like so:

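#!/usr/bin/env python3
import shutil
import sys

# Usage: skip_bytes.py FILE N  (copy FILE to stdout, skipping the first N bytes)
with open(sys.argv[1], 'rb') as fd:
    fd.seek(int(sys.argv[2]))
    # Copy the raw bytes unchanged; decoding and strip() would alter the data
    shutil.copyfileobj(fd, sys.stdout.buffer)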

And usage would be like so:

$ ./skip_bytes.py input.txt 3

Note that byte offsets start at 0 (the first byte is at offset 0), so specifying 3 positions the read at offset 3, which is the 4th byte; the first 3 bytes are skipped.

answered Feb 4, 2017 at 3:59

Sergiy Kolodyazhnyy

csherrell · Accepted Answer · 2016-02-04 03:42:10Z

I recently needed to do something similar. I was helping with a field support issue and needed to let a technician see real-time plots as they were making changes. The data is in a binary log that grows throughout the day. I have software that can parse and plot the data from logs, but it is currently not real time. What I did was capture the size of the log before I started processing the data, then enter a loop that processes the data and, on each pass, creates a new file containing only the bytes that have not yet been processed.

#!/usr/bin/env bash

# I named this little script hackjob.sh
# The purpose of this is to process an input file and load the results into
# a database. The file is constantly being updated, so this runs in a loop
# and every pass it creates a new temp file with bytes that have not yet been
# processed.  It runs about 15 seconds behind real time so it's
# pseudo real time.  This will eventually be replaced by a real time
# queue based version, but this does work and surprisingly well actually.

set -x

# Current date in YYYYMMDD format
DATE=`date +%Y%m%d`

INPUT_PATH=/path/to/my/data
IFILE1=${INPUT_PATH}/${DATE}_my_input_file.dat

OUTPUT_PATH=/tmp
OFILE1=${OUTPUT_PATH}/${DATE}_my_input_file.dat

# Capture the size of the original file
SIZE1=`ls -l ${IFILE1} | awk '{print $5}'`

# Copy the original file to /tmp
cp ${IFILE1} ${OFILE1}

while :
do
    sleep 5

    # process_my_data.py ${OFILE1}
    rm ${OFILE1}
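    # Copy only the bytes of IFILE1 that have not been processed yet
    dd skip=${SIZE1} bs=1 if=${IFILE1} of=${OFILE1}
    # Advance the offset by the number of bytes actually copied, so that
    # data appended while dd was running is not skipped on the next pass
    COPIED=`ls -l ${OFILE1} | awk '{print $5}'`
    SIZE1=$((SIZE1 + COPIED))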

    echo

    DATE=`date +%Y%m%d`

done

If only because I'm in that kind of mood and don't like coding against the output of ls: have you considered using stat -c'%s' "${IFILE1}" instead of that ls|awk combo? That is, assuming GNU coreutils...
– jimbobmcgee, Commented Oct 26, 2016 at 18:51
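The same incremental idea can be sketched without one-byte dd blocks, by tracking a byte offset and letting tail -c copy everything after it. This is only an illustration under assumptions: copy_new_bytes, the temporary file names, and the use of wc -c to get sizes are not from the original script.

```shell
#!/bin/sh
# copy_new_bytes SRC DST OFFSET (hypothetical helper):
# writes the bytes of SRC that follow the first OFFSET bytes into DST
# and prints the new offset (old offset + bytes copied).
copy_new_bytes() {
    tail -c "+$(( $3 + 1 ))" "$1" > "$2"
    copied=$(wc -c < "$2")
    echo $(( $3 + copied ))
}

# Demo with a growing log (made-up data).
log=$(mktemp)
chunk=$(mktemp)
printf 'first' > "$log"
offset=$(copy_new_bytes "$log" "$chunk" 0)          # chunk holds "first"
printf 'second' >> "$log"
offset=$(copy_new_bytes "$log" "$chunk" "$offset")  # chunk holds "second"
echo "offset=$offset chunk=$(cat "$chunk")"
rm -f "$log" "$chunk"
```

Deriving the new offset from the number of bytes actually copied, rather than from a later size check of the source, avoids skipping data that arrives between the copy and the check.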
