
I'm trying to grep 3 fields for the strings a, b and c. I know that this can be done with

grep -E 'a|b|c'

However, I also want to grep for the strings x, y and z, including the following line. I know that this can be done with

grep -A1 'x'

So my question is: is it possible to combine all of these into a single command? E.g. something like this (I know this command doesn't work, it's just an example):

grep -E 'a|b|c' -A1 'x|y|z'

If there is a better way without grep, or even with Python, that would be helpful. I only resorted to grep because I thought it would be faster than reading the file line by line in Python. Cheers!

EDIT: So I have a big file with recurring sections, it looks something like this:

{
    "source_name": [
        "$name"
    ],
    "source_line": [
        52
    ],
    "source_column": [
        1161
    ],
    "source_file": [
        "/somerandomfile"
    ],
    "sink_name": "fwrite",
    "sink_line": 55,
    "sink_column": 1290,
    "sink_file": "/somerandomfile",
    "vuln_name": "vuln",
    "vuln_cwe": "CWE_862",
    "vuln_id": "17d99d109da8d533428f61c430d19054c745917d0300b8f83db4381b8d649d83",
    "vuln_type": "taint-style"
}                      

This section between the {} repeats throughout the file. What I'm trying to grep is the source_name, source_line and source_file lines together with the line below each of them, along with the vuln_name, sink_file and sink_line lines. So the sample output should be:

    "source_name": [
        "$name"
    "source_line": [
        52
    "source_file": [
        "/somerandomfile"
    "sink_line": 55,
    "sink_file": "/somerandomfile",
    "vuln_name": "vuln",
  • Why is there a need to combine these commands? Commented Nov 20, 2018 at 14:36
  • @JonahBishop makes my life a bit easier by having the output follow each other, instead of being split up. If that makes any sense Commented Nov 20, 2018 at 14:39
  • Try grep -Poz 'a|b|c|(x|y|z).*\R.*' file Commented Nov 20, 2018 at 14:59

2 Answers


This Python script should be able to do the job, and it allows for some ad-hoc customization that would be hard to fit into a dense grep command:

my_grep.py

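import re
import sys

# Lines matching the first pattern are printed on their own;
# lines matching the second pattern are printed with the line that follows.
first = re.compile(sys.argv[1])
second = re.compile(sys.argv[2])
with open(sys.argv[3]) as f:
  content = f.readlines()

for idx, line in enumerate(content):
  if second.search(line):
    # end="" because every line read from the file already ends with a newline
    print(line, end="")
    if idx + 1 < len(content):
      print(content[idx + 1], end="")
  elif first.search(line):
    # elif, so a line matching both patterns is printed only once
    print(line, end="")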

You can generate your desired output like this:

 python my_grep.py 'sink_line|sink_file|vuln_name' 'source_name|source_line|source_file' input_file

This assumes that your input file is called input_file.
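If you would rather collect the matches in a variable than print them, the same idea can be wrapped in a generator. This is only a minimal sketch: the function name grep_with_next and its parameter names are made up for illustration, and it assumes the same two-pattern convention as the script above (plain matches, and matches that also pull in the following line).

```python
import re


def grep_with_next(lines, plain_pattern, context_pattern):
    """Yield lines matching plain_pattern, plus lines matching
    context_pattern together with the line that follows each of them."""
    plain = re.compile(plain_pattern)
    context = re.compile(context_pattern)
    emit_next = False  # True when the previous line matched context_pattern
    for line in lines:
        line = line.rstrip("\n")
        is_context = bool(context.search(line))
        if emit_next or is_context or plain.search(line):
            yield line  # each line is yielded at most once
        emit_next = is_context


if __name__ == "__main__":
    sample = [
        '    "source_line": [\n',
        '        52\n',
        '    ],\n',
        '    "sink_line": 55,\n',
        '    "vuln_name": "vuln",\n',
    ]
    matches = list(grep_with_next(sample, "sink_line|vuln_name", "source_line"))
    for match in matches:
        print(match)
```

Because the function accepts any iterable of lines, you can pass an open file object directly and the file is read line by line instead of being loaded into memory first.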


1 Comment

This works fine and makes it easy for me to modify the output to my liking or assign output to variables. Thanks dude!

AWK

awk supports range patterns, which match every line from one that matches pattern1 through the next one that matches pattern2:

awk '/(aaa|bbb|ccc)/,/[xyz]/' data.txt
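To make the range semantics concrete, here is a rough Python sketch of what such a range pattern selects. The function name range_lines is hypothetical and this is an approximation of awk's behaviour, not its implementation. Note that a range prints every line between the two matches, which is not quite the same as "a match plus the one following line" from the question.

```python
import re


def range_lines(lines, start_pattern, end_pattern):
    """Yield every line from a start match through the next end match,
    inclusive, roughly like awk '/start/,/end/'."""
    start = re.compile(start_pattern)
    end = re.compile(end_pattern)
    in_range = False
    for line in lines:
        line = line.rstrip("\n")
        if not in_range:
            if not start.search(line):
                continue  # outside any range and no new range starts here
            in_range = True
        yield line
        # The end pattern is also tested on the line that opened the range,
        # so a single line can both start and finish a range.
        if end.search(line):
            in_range = False


if __name__ == "__main__":
    data = ["aaa", "middle", "x", "after", "ccc y", "tail"]
    for selected in range_lines(data, "(aaa|bbb|ccc)", "[xyz]"):
        print(selected)
```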

PYTHON

Python allows you to compile regular expressions for speed and you can call the script as a single command by putting it in a file.

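import re

pattern1 = re.compile("a|b|c")
pattern2 = re.compile("x|y|z")
saw_pattern1 = False

# Open in text mode so the str patterns above can be applied to each line
with open("data.txt", "r") as fin:
    for line in fin:
        # search() matches anywhere in the line; match() only matches at the start
        if saw_pattern1 and pattern2.search(line):
            print("do stuff")
        # Remember whether this line matched pattern1 for the next iteration
        saw_pattern1 = pattern1.search(line)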
