1

I'm a moderator of a forum and I need to prune all the bots that register there.
As shown below, I can list all the users in this format:

Username number_of_mssages register_date

Example:

- Thurman Valsin0190    0       Sat Jan 14, 2012 5:00 pm
- Rubye Tones01AD   0       Sat Jan 14, 2012 4:59 pm

I need a very simple Python program that parses each line of a text file and extracts only the nicknames from lines like the ones above:

- Thurman Valsin0190
- Rubye Tones01AD

This means the program has to delete, for each line, the 0 and everything after it. The text is taken from a .txt file.
I know this is not that difficult, but I don't know much Python. Thanks in advance!

3
  • What you are calling a username appears to be two separate things -- a nickname and a username. Commented Jan 14, 2012 at 16:41
  • "Thurman Valsin0190" is the username. Commented Jan 14, 2012 at 16:43
  • Will that 0 always be 0, or can it be any one-digit number? Commented Jan 14, 2012 at 17:20

4 Answers

3

It's not really a Python question; it's a regex/string-parsing question...

Is it correct to say that every line contains the nickname, a tab character, and then a 0?

Then it should be as simple as:

(assuming line contains a single line from the file)

nickname = line.split("\t")[0]
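Applied to several lines, the same idea might look like the sketch below. It assumes the columns really are tab-separated, which the question does not confirm; the function name `extract_nicknames` and the sample data are made up for illustration.

```python
def extract_nicknames(lines):
    """Return the text before the first tab of each non-blank line."""
    nicknames = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        # Assumption: a tab separates the username from the message count.
        nicknames.append(line.split("\t")[0].strip())
    return nicknames


# Illustrative sample, written with explicit tab separators.
sample = [
    "Thurman Valsin0190\t0\tSat Jan 14, 2012 5:00 pm\n",
    "\n",
    "Rubye Tones01AD\t0\tSat Jan 14, 2012 4:59 pm\n",
]
print(extract_nicknames(sample))
```

If the file turns out to use runs of spaces instead of tabs, `split("\t")` returns the whole line unchanged, so it is worth checking the separator first.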

3

Consider using regular expressions:

import re

pattern = re.compile(r'(.*?)\s+0\s+')
pattern.findall('- Thurman Valsin0190    0       Sat Jan 14, 2012 5:00 pm')[0] 
# - Thurman Valsin0190
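To run a pattern like this over every line rather than one hard-coded string, something along these lines could work. This is only a sketch: it assumes the message count may be any integer (not just 0) and that the date column always starts with a three-letter English weekday, as in the sample. `LINE_RE` and `nickname_from` are hypothetical names.

```python
import re

# Assumptions: columns are separated by runs of whitespace, the message
# count is an integer, and the date starts with an abbreviated weekday.
LINE_RE = re.compile(r'^(.*?)\s+\d+\s+(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun)\b')


def nickname_from(line):
    """Return the username part of a line, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.group(1).strip() if match else None


lines = [
    "- Thurman Valsin0190    0       Sat Jan 14, 2012 5:00 pm",
    "- Rubye Tones01AD   0       Sat Jan 14, 2012 4:59 pm",
]
for line in lines:
    print(nickname_from(line))
```

Anchoring on the weekday keeps a digit inside a username from being mistaken for the message count, at the cost of depending on that date format.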

2 Comments

I need a generic program, not only for a single username.
@user963658 This is generic: the pattern matches the substring that comes before a run of whitespace, a 0, and another run of whitespace. Give it a try.
1

Why not split on the 0 with its surrounding spaces (or tabs) included in the split key, so that other zeros in the line don't cause a split:

with open("filename.txt", "r") as f:
    for line in f:
        nick = line.split(" 0 ")[0].strip() # OR .split("\t0\t") if those are tabs
        print(nick)

2 Comments

And what if I need to save the results to the same file, deleting the unparsed lines? (I only need a text file with the parsed lines, the ones containing just the username.)
Run this program from a shell and redirect its output to another .txt file. Then delete the original .txt file and rename the new file to the original name, e.g.:

python script.py > out.txt
rm filename.txt
mv out.txt filename.txt
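The same write-then-rename steps can also be done from Python itself. The sketch below is one possible way, assuming " 0 " is the separator as in the answer above; the function name `rewrite_with_nicknames` and the ".tmp" suffix are hypothetical choices, and `os.replace` requires Python 3.3 or later.

```python
import os


def rewrite_with_nicknames(path, sep=" 0 "):
    """Replace the file at `path` with one username per line."""
    tmp_path = path + ".tmp"  # hypothetical temporary file name
    with open(path, "r") as src, open(tmp_path, "w") as dst:
        for line in src:
            nick, found, _rest = line.partition(sep)
            if found:  # lines without the separator are dropped
                dst.write(nick.strip() + "\n")
    # Swap the new file in only after it has been written completely.
    os.replace(tmp_path, path)
```

Writing to a separate file first means the original is left untouched if the script fails partway through.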
0

Parse each line by splitting on the " 0 " string, e.g., in extract-nickname.py:

#!/usr/bin/env python
import fileinput

for line in fileinput.input():
    nick, sep, rest = line.partition(" 0 ")
    if sep:
        print(nick.strip())

It assumes that nicknames can't contain " 0 " and that leading/trailing whitespace is not part of a nickname. Otherwise, you could use line.partition("\t") if a tab character separates Username from number_of_mssages.

Example

$ python extract-nickname.py log.txt
- Thurman Valsin0190
- Rubye Tones01AD

If you need to change the file in place, you can pass inplace=True to fileinput.input().
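A minimal sketch of that in-place variant follows, under the same " 0 " separator assumption. The function name `prune_in_place` is hypothetical, and using `fileinput.input()` as a context manager needs Python 3.2 or later.

```python
import fileinput


def prune_in_place(path, sep=" 0 "):
    """Rewrite `path` so each matching line holds only the username."""
    # With inplace=True, fileinput redirects standard output into the
    # file being read, so print() writes the replacement content.
    with fileinput.input(files=[path], inplace=True) as lines:
        for line in lines:
            nick, found, _rest = line.partition(sep)
            if found:  # lines without the separator are dropped
                print(nick.strip())
```

Passing `backup=".bak"` as well would keep a copy of the original file, which may be a sensible precaution while testing.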
