0

Sorry for the ignorance, I'm just starting out and haven't been able to find a good answer to this anywhere else. I have an HTML file saved as plain text, and I want to pull a string out of one line in it. The line looks like this:

<li><strong>Password: XXXXXX</strong></li>

The file contains more than one line like this, and it is the second one that I want. The only part of it that I want is XXXXXX, and I would prefer to discard everything else in the file. The string changes often, so I can't just grep for its value. Thanks for any help.

3 Answers

2
$ cat file
<li><strong>Password: AAAAAA</strong></li>
<li><strong>Password: XXXXXX</strong></li>
<li><strong>Password: ZZZZZZ</strong></li>

$ awk 'sub(/.*<li><strong>Password: /,"") && sub(/<\/strong><\/li>.*/,"") && ++c==2' file
XXXXXX
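
For anyone reading along, here is the same one-liner spread out with comments, as a rough sketch. The function name nth_password and its N argument are made up for illustration and are not part of the answer above; the early exit is also an addition.

```shell
# nth_password N [FILE...]
# Hypothetical wrapper: print the password from the Nth matching line.
# Reads standard input when no FILE is given.
nth_password() {
    n=$1
    shift
    # 1st sub(): delete everything up to and including "<li><strong>Password: ".
    #            It returns 0 on lines that do not match, so those are skipped.
    # 2nd sub(): delete "</strong></li>" and anything after it.
    # ++c == n : count the lines that passed both tests and act on the Nth.
    awk -v n="$n" '
        sub(/.*<li><strong>Password: /, "") &&
        sub(/<\/strong><\/li>.*/, "") &&
        ++c == n { print; exit }
    ' "$@"
}
```

With the sample file above, calling nth_password 2 file should print the same XXXXXX line as the one-liner.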

3 Comments

This did not work. Rather than writing anything to a file, I just piped the text from curl. I don't think this would cause a problem, so not sure why it didn't work.
In what way did it "not work"? No output, wrong output, error messages, something else? Piping from curl would make no difference vs reading a file. Are you sure your input was exactly the format you posted? Are you sure there were at least 2 occurrences of the regexp in the input?
There was no output. I tried forwarding it to a file instead of trying to see it in standard output, and still nothing. I copied and pasted it, and just changed how awk received the string. Yes, there are definitely 2 occurrences of the regular expression.
0

something like this should work:

cat c.txt |grep "Password:"|awk '{print $2}'|awk -F "<" '{print $1}'|sed -n 2p

1 Comment

@sullivnc it's best to wait at least a couple of hours before accepting an answer. The first answer you get is often not the best one, and once an answer has been accepted your question gets less attention than it otherwise would. In this case the answer you accepted has a UUOC (useless use of cat), uses a pipe of 4 separate commands where 1 would do, and will fail for various values of the password. So it's not a very good solution, and in fact every other answer you've received so far is better than this one.
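
To make the "will fail given various values of the password" point concrete, here is a small sketch that runs both approaches on the same input. The password "two words" and the function names are invented for the demonstration; the two commands themselves are the ones posted in this thread, minus the cat.

```shell
# Sample input; the second password contains a space (made-up value).
sample() {
    printf '%s\n' \
        '<li><strong>Password: AAAAAA</strong></li>' \
        '<li><strong>Password: two words</strong></li>'
}

# The accepted pipeline without the cat: splitting on whitespace
# keeps only the first word of the password.
pipeline_version() {
    grep "Password:" | awk '{print $2}' | awk -F "<" '{print $1}' | sed -n 2p
}

# The single awk command from the other answer: strips the surrounding
# tags and keeps whatever is between them.
single_awk_version() {
    awk 'sub(/.*<li><strong>Password: /,"") && sub(/<\/strong><\/li>.*/,"") && ++c==2'
}

sample | pipeline_version     # prints: two
sample | single_awk_version   # prints: two words
```

A password containing a "<" would trip the pipeline in a similar way, because its third stage cuts at the first "<".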
0

Just set NR to the number of the line you want.

awk -F'[: <]' 'NR == 1 {print $5}' file 
XXXXXX
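
If the line number is not known in advance, a variant that counts matching lines instead of using a fixed NR might look like the sketch below. The helper name second_password is hypothetical. It keeps the same field separators, so it assumes the line starts directly with <li> (no leading spaces) and that the password contains no space, colon, or "<".

```shell
# second_password [FILE...]
# Hypothetical helper: print the password from the second line that
# contains "Password:". Reads standard input when no FILE is given.
second_password() {
    # Fields are split on ":", " " and "<", which puts the password in $5
    # for a line of the form <li><strong>Password: XXXXXX</strong></li>.
    awk -F'[: <]' '/Password:/ && ++c == 2 { print $5; exit }' "$@"
}
```

It can be fed from a pipe as well, for example from curl, since awk reads standard input when no file is named.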
