New to Python here.
I am trying to find the most active IP address in a log.txt file and write it to another text file. My first step is to extract all the IP addresses; the second is to find the one that occurs most often. But I am stuck on the first step:
with open('./log_input/log.txt', 'r+') as f:
    # loop over the lines in the text file
    for line in f:
        # split line at whitespace
        cols = line.split()
        # get last column
        byte_size = cols[-1]
        # get the first column [0]
        ip_addresses = cols[0]
        # remove brackets
        byte_size = byte_size.strip('[]')
        # write the byte size in the resource file
        resource_file = open('./log_output/resources.txt', 'a')
        resource_file.write(byte_size + '\n')
        resource_file.truncate()
        # write the ip addresses in the host file
        host_file = open('./log_output/hosts.txt', 'a')
        host_file.seek(0)
        host_file.write(ip_addresses + '\n')
        host_file.truncate()
    resource_file.close()
    host_file.close()
The problem is that in the new hosts.txt file, the IP addresses are written again (appended) instead of overwriting what was already there. I tried this too:
resource_file = open('./log_output/resources.txt', 'w')
host_file = open('./log_output/hosts.txt', 'w')
and 'w+' and so on, but 'w' or 'w+' leaves only one IP address in the hosts file.
Can someone guide me through this?
Sample Input File
www-c2.proxy.aol.com - - [01/Jul/1995:00:03:52 -0400] "GET /history/skylab/skylab-1.html HTTP/1.0" 200 1659
isdn6-34.dnai.com - - [01/Jul/1995:00:03:52 -0400] "GET /images/kscmap-tiny.gif HTTP/1.0" 200 2537
isdn6-34.dnai.com - - [01/Jul/1995:00:03:52 -0400] "GET /images/ksclogosmall.gif HTTP/1.0" 200 3635
ix-ftw-tx1-24.ix.netcom.com - - [01/Jul/1995:00:03:52 -0400] "GET /shuttle/countdown/count.gif HTTP/1.0" 200 40310
host_file = open('./log_output/hosts.txt', 'a') opens an outdated version of the file and then, as it reassigns host_file, the prior loop's data is flushed to the file. Close the thing after you use it or put it in a with clause.
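One way to apply the advice above is sketched here. It is not the only way to do it. Both output files are opened once, before the loop, in 'w' mode inside a with clause, so a rerun starts from empty files and every line of the run is written through the same handle. The function name extract_hosts and its parameters are made up for this example, and the Counter is an assumed way to get at the "most active" host the question asks about.

```python
from collections import Counter


def extract_hosts(log_path, hosts_path, resources_path):
    """Write each line's host and byte size to separate files.

    Returns a Counter mapping host -> number of log lines.
    (Hypothetical helper; the name and signature are illustrative.)
    """
    counts = Counter()
    # Open every file exactly once. 'w' truncates the outputs up front,
    # and the with clause closes (and flushes) them when the block ends.
    with open(log_path) as log, \
         open(hosts_path, "w") as hosts, \
         open(resources_path, "w") as resources:
        for line in log:
            cols = line.split()
            if not cols:
                continue  # skip blank lines
            host, byte_size = cols[0], cols[-1]
            hosts.write(host + "\n")
            resources.write(byte_size + "\n")
            counts[host] += 1
    return counts
```

Calling extract_hosts('./log_input/log.txt', './log_output/hosts.txt', './log_output/resources.txt').most_common(1) on the four sample lines in the question should give [('isdn6-34.dnai.com', 2)], since that host appears twice and the others once. The output directory is assumed to exist already.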