1

The idea is that the text file has 150 rows where each row is a string of 1024 bits (a representation of a 32x32 image).

What I want to achieve is an array of 150 elements where every element is an array of size 1024.

With the code below I get an array of 150 elements, each with the value inf. Is there a way to convert those values to vectors using numpy's loadtxt directly?

Thank you in advance!

import numpy as np

data = np.loadtxt("digits.txt")
  • Do the lines in the text file consist of the ASCII characters 0 and 1? If not, how are the binary vectors represented in the file? Commented Sep 8, 2019 at 14:30
  • It actually seems a bit strange to store binary data in a text file. Why not directly call it a binary file? You could just use .read(128) for each 1024 bits... no newlines ("rows") needed. Commented Sep 8, 2019 at 14:52
  • Did you read the loadtxt docs? The default dtype is float. If you have 1000 digits without a delimiter, it tries to make one number from that, so an inf value is likely. Commented Sep 8, 2019 at 16:01
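
The overflow described in the last comment can be reproduced without NumPy. This is a minimal sketch, assuming a line made only of ASCII digits (the 1024-character string is made up for illustration). Python's float conversion reads the whole digit run as one number that is too large for a 64-bit float, which is consistent with the inf values seen in the question.

```python
import math

# Hypothetical line: 1024 digits with no delimiter between them
line = "1" * 1024

# Parsed as one decimal number, roughly 1.1e1023, which exceeds the
# largest finite 64-bit float (about 1.8e308), so the result overflows
value = float(line)

print(math.isinf(value))
```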

2 Answers

1

If each line is exactly the same length and contains only the characters 0 and 1, you can use numpy.genfromtxt, with delimiter=1. When the argument delimiter is a single integer, genfromtxt treats each line as a sequence of fixed-width fields. The value given to delimiter specifies the field width.

For example, suppose the file 01.txt contains

0001
1010
1111
0000
1001

Here's how you can use genfromtxt to read that into a NumPy integer array with shape (5, 4):

In [2]: import numpy as np

In [3]: data = np.genfromtxt('01.txt', delimiter=1, dtype=np.int8)

In [4]: data
Out[4]: 
array([[0, 0, 0, 1],
       [1, 0, 1, 0],
       [1, 1, 1, 1],
       [0, 0, 0, 0],
       [1, 0, 0, 1]], dtype=int8)
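
For the 150 x 1024 case in the question, the same result can also be reached without genfromtxt. The following is only a sketch of an alternative, not the method used in this answer: load_bit_rows is a hypothetical helper, the temporary file stands in for digits.txt, and the code assumes every line holds exactly 1024 ASCII 0/1 characters. It does not check that the characters really are 0 or 1.

```python
import os
import tempfile

import numpy as np


def load_bit_rows(path, side=32):
    """Read lines of ASCII '0'/'1' into an array of shape (n_rows, side, side)."""
    rows = []
    with open(path, "rb") as fh:
        for raw in fh:
            line = raw.strip()
            if not line:
                continue  # ignore blank lines
            # ASCII '0' is 48 and '1' is 49, so subtracting ord("0") leaves 0 or 1
            bits = np.frombuffer(line, dtype=np.uint8) - ord("0")
            if bits.size != side * side:
                raise ValueError(f"expected {side * side} characters, got {bits.size}")
            rows.append(bits)
    return np.stack(rows).reshape(-1, side, side)


# Stand-in for the question's file: 3 rows of 1024 characters each
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "digits.txt")
    with open(path, "w") as fh:
        fh.write("0" * 1024 + "\n")
        fh.write("1" * 1024 + "\n")
        fh.write("01" * 512 + "\n")
    images = load_bit_rows(path)

print(images.shape)
```

Dropping the final reshape would give the flat (n_rows, 1024) layout the question asks for.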

0

Supposing your text file contains 128 characters in each line (excluding the newline character), with each character representing 1 byte / 8 bits, you could use

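import numpy as np

# file is the path to the text file; dtype=str replaces the removed alias np.str
data = np.loadtxt(file, dtype=str)
bits_arr = []
for line in data:
    byte_arr = np.frombuffer(line.encode('UTF-8'), dtype=np.uint8) # UTF-8 assumed
    bits_arr.append(np.unpackbits(byte_arr).reshape(32,32)) # 128 bytes -> 1024 bits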

bits_arr will then contain 1 "32x32 bitmap" for each line. Note that reshape(32,32) will fail if an invalid number of bytes (!=128) is read in a line.
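
The byte-to-bit expansion and the reshape failure mentioned above can be checked in isolation. This is a small sketch with made-up data: 128 bytes stand in for one 128-character line, and the first byte is chosen so the bit order is visible.

```python
import numpy as np

# Made-up data: 128 bytes standing in for one 128-character line
raw = bytes([0b10100000]) + bytes(127)

byte_arr = np.frombuffer(raw, dtype=np.uint8)
# unpackbits expands each byte into 8 bits, most significant bit first
bitmap = np.unpackbits(byte_arr).reshape(32, 32)

print(bitmap.shape)
print(bitmap[0, :8])

# A line with the wrong number of bytes cannot be reshaped to 32x32
try:
    np.unpackbits(np.frombuffer(bytes(100), dtype=np.uint8)).reshape(32, 32)
except ValueError as exc:
    print("reshape failed:", exc)
```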

Sidenote: it is probably more efficient here to read the file line by line instead of carrying around all the overhead of np.loadtxt, since you don't actually use what that function can do for you. The code could therefore be simplified to

bits_arr = []
with open(file, 'rb') as binfile:
    for line in binfile: # iterate over every line, not just the first
        line = line.rstrip(b'\r\n') # remove only the newline char
        byte_arr = np.frombuffer(line, dtype=np.uint8)
        bits_arr.append(np.unpackbits(byte_arr).reshape(32,32))
