I have the following code:
datadicts = [ ]
with open("input.txt") as f:
    for line in f:
        datadicts.append({'col1': line[':'], 'col2': line[':'], 'col3': line[':'], 'col4': line[':']})
df = pd.DataFrame(datadicts)
df = df.drop([0])
print(df)
I am using a text file (that is not formatted) to pull chunks of data from. When the text file is opened, it looks something like this, except on a way bigger scale:
00 2381    1.3 3.4 1.8 265879 Name 
34 7879    7.6 4.2 2.1 254789 Name 
45 65824   2.3 3.4 1.8 265879 Name 
58 3450    1.3 3.4 1.8 183713 Name 
69 37495   1.3 3.4 1.8 137632 Name 
73 458913  1.3 3.4 1.8 138024 Name 
Here are the things I'm having trouble doing with this data:
- I only need the second, third, sixth, and seventh columns of data. The issue with this one, I believe I've solved with my code above by reading the individual lines and creating a dataframe with the columns necessary. I am open to suggestions if anyone has a better way of doing this.
 - I need to skip the first row of data. This one, the open feature doesn't have a skiprows attribute, so when I drop the first row, I also lose my index starting at 0. Is there any way around this?
 - I need the resulting dataframe to look like a nice clean dataframe. As of right now, it looks something like this:
 
Col1   Col2   Col3 Col4
2381    3.4 265879 Name 
7879    4.2 254789 Name 
65824   3.4 265879 Name 
3450    3.4 183713 Name 
37495   3.4 137632 Name 
458913  3.4 138024 Name 
Everything is right-aligned under the column and it looks strange. Any ideas how to solve this?
- I also need to be able to perform Statistic Analysis on the columns of data, and to be able to find the Name with the highest data and the lowest data, but for some reason, I always get errors because I think that, even though I've got all the data set up as a dataframe, the values inside the dataframe are reading as objects instead of integers, strings, floats, etc.
 
So, if my data is not analyzable using Python functions, does anyone know how I can fix this to make the data be able to run correctly?
Any help would be greatly appreciated. I hope I've laid out all of my needs clearly. I am new to Python, and I'm not sure if I'm using all the proper terminology.
read_csvshould do a lot good here, you can try using it as -df = pd.read_csv("input.txt", sep=' ', skipinitialspace=True, names=['col1', 'col2', 'col3', 'col4', 'col5', 'col6', 'col7'])