0

I have a config file in the following format:

IP,username,logfile

IP,username,logfile1

IP,username,logfile2

The code below stores the lines of the text file in a list, but I need help with code that can determine whether the name of logfile is the same as logfile1. Please help.

import csv

config_file_path = "config15.txt"  # read config file and assign IP,username,logfile,serverpath,localpath
with open(config_file_path, 'r', newline='') as file:  # the file is closed automatically
    reader = csv.reader(file)
    all_rows = [row for row in reader]  # store the config file rows in a list

The code above produces:

[['127.0.0.1', 'new34', 'logfile'], ['127.0.0.1', 'new34', 'logfile1']]

I want code that compares the names of logfile and logfile1 and returns True if they are the same.

4
  • Can you add sample input and sample output? Commented Jun 14, 2019 at 12:05
  • @0xPrateek if the names of logfile and logfile1 are the same, it should return true as output Commented Jun 14, 2019 at 12:09
  • By logfile and logfile1 do you mean any two of the filenames, or just the second and third ones in the config file, or something else? What if there are two pairs of equal filenames? What if three filenames are equal? Commented Jun 14, 2019 at 12:10
  • @RoryDaulton any two filenames. I have edited the question for more clarity Commented Jun 14, 2019 at 12:14

3 Answers

1

Use a simple loop with a set that records the logfile names already seen.

Example:

all_rows = [['127.0.0.1', 'new34', 'logfile1'], ['127.0.0.1', 'new34', 'logfile1']]
def check_row(data):
    seen = set()
    for i in data:
        if i[-1] in seen:
            return True
        else:
            seen.add(i[-1])
    return False


print(check_row(all_rows))  #True
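
To tie this back to the file format in the question, here is a minimal sketch that reads the rows with csv and applies the same set-based check. The function name has_duplicate_logfile is hypothetical, and an in-memory io.StringIO stands in for config15.txt so the example runs on its own. It assumes the logfile name is always the last field of each non-empty row.

```python
import csv
import io


def has_duplicate_logfile(lines):
    """Return True if any two rows share the same last field (the logfile name)."""
    seen = set()
    for row in csv.reader(lines):
        if not row:  # csv.reader yields [] for blank lines; skip them
            continue
        name = row[-1].strip()
        if name in seen:
            return True
        seen.add(name)
    return False


# In-memory stand-in for the config file (assumed sample data).
sample = io.StringIO(
    "127.0.0.1,new34,logfile\n"
    "\n"
    "127.0.0.1,new34,logfile1\n"
    "127.0.0.1,new34,logfile\n"
)
print(has_duplicate_logfile(sample))  # True
```

With a real file, the same function can be given the open file object, for example inside a `with open(config_file_path, newline='') as f:` block.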

0

If this really is your file format, it would be easier to read it as a DataFrame:

import pandas as pd
df = pd.read_csv('config15.txt', sep=',', header=None, names=['ip', 'un', 'lf'])  # or just change the extension to *.csv
dupldf = df[df.duplicated(['lf'])]  # find duplicate rows

If dupldf is empty, there are no duplicated values.


0

So, if I've understood correctly, you are looking for duplicate logfile names. First you need a list (or vector) of the logfile names, e.g.:

logfiles = [row[-1] for row in reader]

This list contains the logfile names. Now I suggest using NumPy, a very large Python library with many useful methods (well worth knowing if you write Python):

import numpy as np
logfiles = np.array(logfiles)  # convert the list into a NumPy array
i, j = np.where(logfiles[:, np.newaxis]==logfiles[np.newaxis, :])

logfiles[i] are the duplicated elements, i.e. logfiles[i] == logfiles[j]. Clearly each element is also equal to itself, so you have to discard the pairs for which i == j:

index2save = np.where(i != j)[0]  # keep only pairs that match at two different positions
i = i[index2save]

Now i holds the indices of the duplicated elements, and logfiles[i] gives the repeated names. Hope this helps!
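
If NumPy is not available, a similar result can be sketched with the standard library alone. This swaps the pairwise comparison for a plain grouping by name and is not the approach above. The function name duplicate_logfiles is hypothetical, and the sample rows are assumed data in the shape shown in the question.

```python
from collections import defaultdict


def duplicate_logfiles(rows):
    """Map each logfile name that occurs more than once to its row indices."""
    positions = defaultdict(list)
    for index, row in enumerate(rows):
        positions[row[-1]].append(index)  # last field is the logfile name
    return {name: idxs for name, idxs in positions.items() if len(idxs) > 1}


# Assumed sample rows in the same shape as all_rows in the question.
all_rows = [
    ['127.0.0.1', 'new34', 'logfile'],
    ['127.0.0.1', 'new34', 'logfile1'],
    ['127.0.0.1', 'new34', 'logfile'],
]
dups = duplicate_logfiles(all_rows)
print(dups)        # {'logfile': [0, 2]}
print(bool(dups))  # True
```

The returned dictionary is empty when every name is unique, so `bool(dups)` gives the True/False answer the question asks for.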
