Ed Morton

Tim

How to search a PDF or text file by a regex pattern and output the locations of matches?

Is there a PDF viewer that lets me search a document's text with a regular expression?

In case there isn't one, I am thinking about extracting the text and layout from a PDF file with

less my.pdf > mytextfile

or pdftotext -layout. In the resulting text file, pages are separated by a form-feed character (Ctrl-L), and lines are separated by a line-feed character.

How can I find all matches of a given pattern in the text file and output their locations (the page number, and the line number within that page)?
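
To make the desired output concrete, here is a minimal sketch in Python of one way this could work. It assumes the extracted text uses a form feed between pages and a line feed between lines, as described above. The function name `find_matches` and the sample text are made up for illustration, and matches that span more than one line are not found.

```python
import re
from typing import Iterator, Tuple


def find_matches(text: str, pattern: str) -> Iterator[Tuple[int, int, str]]:
    """Yield (page_number, line_number_within_page, matched_text)."""
    regex = re.compile(pattern)
    # Pages are assumed to be separated by a form feed ("\f", Ctrl-L).
    for page_no, page in enumerate(text.split("\f"), start=1):
        # Line numbers restart at 1 on every page.
        for line_no, line in enumerate(page.split("\n"), start=1):
            # Report every match on the line, not just the first one.
            for match in regex.finditer(line):
                yield page_no, line_no, match.group(0)


# Made-up sample standing in for the contents of mytextfile.
sample = "intro\nsee Fig. 12\n\fresults\nFig. 3 and Fig. 4\n"
for page_no, line_no, matched in find_matches(sample, r"Fig\. \d+"):
    print(f"page {page_no}, line {line_no}: {matched}")
```

For a real file, the sample string would be replaced by the contents of the extracted text file, for example `open("mytextfile", encoding="utf-8").read()`. A trailing form feed at the end of the file only produces an empty last page, which yields no matches.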