
Say I want to search for '123456789', but only where it appears near 'firstname'. Is there a way to do that?

I have no idea how to approach this. Maybe piping greps together could work?

What kind of algorithm is best suited for this purpose?

Input:

search_string1='firstname' search_string2='123456789'
proximity_#_of_lines=10

Output:

Find instances of search_string1 and search_string2 that are within 10 lines of each other, i.e. if the two strings occur within 10 lines of each other, write the match to a file.

  • It's still unclear what is the output. You can keep your sentence, but provide the expected output. Commented Sep 19, 2023 at 3:45
  • Do you want a yes/no answer, the line that contains 123456789 if it's within 10 lines of a line containing the substring firstname, or the other way around? Is your data in some common format, like JSON, YAML, CSV, XML or other? Commented Sep 19, 2023 at 6:58
  • 1) There is a log file named application.log. 2) When I do a proximity search and it matches: a) either show me only the file in that case, or b) give me a way to highlight the search terms and view them. 3) My files are general log files with a .log extension (normal text files). Commented Sep 19, 2023 at 8:27
  • When asked for clarifications, please edit your question to include the answers. For text-processing questions like this one, be sure to provide a (possibly anonymized) example of the log file you want to process, the search strings, and what the output should look like for your example input. Commented Sep 21, 2023 at 13:34
  • Please edit your question to include statements of requirements for multiple matches, overlapping ranges, start/end on the same line, regexp metachars treatment, substrings treatment, etc. and sample input/output that covers your requirements so we can best help you. See how-do-i-find-the-text-that-matches-a-pattern for more info on some of that. Commented Oct 2, 2023 at 18:04

2 Answers


A standard approach could be:

what_we_want='123456789'
context='firstname'
distance=10
grep -E -e "${context}" -C "${distance}" file_to_look_into | grep -E -e "${what_we_want}" -C "${distance}"

The first grep restricts the search to the lines within $distance lines of each line matching $context. The second grep then checks whether $what_we_want occurs in those 2*$distance+1 lines.

If you only want the matching line itself as the result, drop the -C "${distance}" from the second grep.
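
If only the names of the matching files are wanted, the same pipeline can be wrapped in a loop. The following is a minimal sketch, not a tested tool: the function name files_with_nearby and its argument order are made up for illustration, and it uses -F so that both search terms are treated as fixed strings rather than regular expressions (the command above uses -E instead).

```shell
#!/bin/sh
# Hypothetical helper: print each file in which TARGET occurs within
# DISTANCE lines of a line containing CONTEXT.
# usage: files_with_nearby CONTEXT TARGET DISTANCE FILE...
files_with_nearby() {
    context=$1 target=$2 distance=$3
    shift 3
    for f in "$@"; do
        # First grep: keep only the lines around each CONTEXT match.
        # Second grep: -q prints nothing and just reports whether
        # TARGET occurs anywhere in that reduced output.
        if grep -F -C "$distance" -e "$context" -- "$f" |
            grep -q -F -e "$target"; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example, files_with_nearby firstname 123456789 10 *.log would list only the log files that contain such a pair.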


Assuming each string can appear only once in your input, then using any awk (untested):

#!/usr/bin/env bash

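awk -v str1='string1' -v str2='string2' -v prox=10 '
    BEGIN { rc = 1 }                 # fail unless both strings are found
    index($0,str1) { nr1 = NR }      # line number where str1 was seen
    index($0,str2) { nr2 = NR }      # line number where str2 was seen
    nr1 && nr2 {
        delta = (nr1 > nr2 ? nr1 - nr2 : nr2 - nr1)
        rc = ( delta > prox )        # 0 (success) if within prox lines
        exit
    }
    END { exit rc }
' file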

if (( $? == 0 )); then
    printf 'success\n'
else
    printf 'failure\n'
fi
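
If the strings can occur more than once, one possible approach is a single pass that remembers the most recent line on which each string was seen and reports a pair whenever the other string turns up within prox lines. The sketch below is only an illustration under those assumptions: the function name proximity_pairs and the "str1-line str2-line" output format are invented here, each hit is paired only with the nearest preceding occurrence of the other string, and backslashes in the search strings would be interpreted by awk's -v.

```shell
#!/bin/sh
# Hypothetical helper: print "LINE_OF_STR1 LINE_OF_STR2" for every hit
# that lies within MAX_LINES lines of the most recent hit of the other string.
# usage: proximity_pairs STR1 STR2 MAX_LINES FILE
proximity_pairs() {
    awk -v str1="$1" -v str2="$2" -v prox="$3" '
        {
            has1 = index($0, str1)
            has2 = index($0, str2)
            if (has1 && has2)
                print NR, NR                  # both on the same line
            else if (has1 && last2 && NR - last2 <= prox)
                print NR, last2               # str1 shortly after str2
            else if (has2 && last1 && NR - last1 <= prox)
                print last1, NR               # str2 shortly after str1
            # Remember the latest line number of each string.
            if (has1) last1 = NR
            if (has2) last2 = NR
        }
    ' "$4"
}
```

A call such as proximity_pairs firstname 123456789 10 application.log > matches.txt would write the line-number pairs to a file; empty output means no pair was found.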
  • They don't appear just once, though! Commented Oct 2, 2023 at 17:46
  • Then edit your question to state that and any other unstated requirements and add concise, testable sample input and expected output that demonstrates your requirements so we have something we can copy/paste to test a potential solution with. Commented Oct 2, 2023 at 18:02
  • I'll try to desensitize the data and make a log file. Commented Oct 2, 2023 at 18:11
  • Right, that is how people typically provide the "sample input" part of a [mcve], but remember to create and post the "expected output" part of it too. And make sure to cover the difficult rainy-day cases you can think of or you'll probably end up with a solution that only works for the sunny day cases. Commented Oct 2, 2023 at 18:18
