Minimal data for the regex
\documentclass{article}
\begin{document}
-------------------------------------------------------------
A           B       C               D
Header      Aligned Aligned         Aligned
----------- ------- --------------- -------------------------
First       row     12.0            Example of a row that
                                    spans multiple lines.
Second      row     5.0             Here's another one. Note
                                    the blank line between
                                    rows.
-------------------------------------------------------------
Table: Here's the caption. It, too, may span
multiple lines.
\section{Lorem Ipsun}
Hello world!
-------------------------------------------------------------
A           B       C               D
Header      Aligned Aligned         Aligned
----------- ------- --------------- -------------------------
First       row     12.0            Example of a row that
                                    spans multiple lines.
Second      row     5.0             Here's another one. Note
                                    the blank line between
                                    rows.
-------------------------------------------------------------
Table: Here's the caption. It, too, may span
multiple lines.
\end{document}
Desired output
-------------------------------------------------------------
A           B       C               D
Header      Aligned Aligned         Aligned
----------- ------- --------------- -------------------------
First       row     12.0            Example of a row that
                                    spans multiple lines.
Second      row     5.0             Here's another one. Note
                                    the blank line between
                                    rows.
-------------------------------------------------------------
Table: Here's the caption. It, too, may span
multiple lines.
-------------------------------------------------------------
A           B       C               D
Header      Aligned Aligned         Aligned
----------- ------- --------------- -------------------------
First       row     12.0            Example of a row that
                                    spans multiple lines.
Second      row     5.0             Here's another one. Note
                                    the blank line between
                                    rows.
-------------------------------------------------------------
Table: Here's the caption. It, too, may span
multiple lines.
I want to extract all tables from a LaTeX document.
Pseudocode
- Match a line with more than 7 "-" characters in a row, then everything up to "Table:". Include the line with "Table:" but nothing after that line.
- Repeat step 1 until the end of the file.
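The two steps above can be sketched as a plain line-by-line loop, without any multi-line regex. This is only a minimal sketch: the name `scan_tables` is made up for illustration, and it follows the pseudocode literally, so a caption that continues on a second line is cut off after the "Table:" line.

```perl
use strict;
use warnings;

# Hypothetical helper: collect tables by scanning chomped lines.
# A table starts at a line made only of 8 or more dashes and ends
# at the first following line that begins with "Table:".
sub scan_tables {
    my (@lines) = @_;
    my (@tables, @current);
    my $inside = 0;
    for my $line (@lines) {
        # Step 1: a full line of dashes opens a table.
        $inside = 1 if !$inside && $line =~ /^-{8,}\s*$/;
        next unless $inside;
        push @current, $line;
        # The "Table:" line is kept and closes the table.
        if ($line =~ /^Table:/) {
            push @tables, join "\n", @current;
            @current = ();
            $inside  = 0;
        }
    }
    return @tables;
}

# Usage: perl scan_tables.pl file.tex
if (@ARGV) {
    chomp(my @lines = <>);
    print "$_\n\n" for scan_tables(@lines);
}
```

The header separator (dashes with spaces in between) and the bottom border do not open a new table here, because the loop is already inside one when it reaches them.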
My attempt
The first step
[-]{10,777}$
Then, to include everything except the word "Table:"
((?!Table:).)*$
And finally, to include everything on the line with "Table:"
^(?=.*?\Table:\b)
All combined
[-]{10,777}$((?!Table:).)*$^(?=.*?\Table:\b)
which does not work. Something is wrong, but I do not know what.
How can I match such an environment reliably with a regex in Perl?
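A few things seem to keep the combined pattern from matching. Without the /m modifier, `^` and `$` anchor only at the ends of the whole string. Without /s, `.` never crosses a newline, so `((?!Table:).)*` cannot span several lines. The final `(?=...)` is a zero-width lookahead, so it never adds the caption line to the match. And `\b` right after `Table:` fails when a space follows, because a colon followed by a space is not a word boundary. The following is a minimal sketch of one possible fix, not a definitive solution. The names `$table_re` and `extract_tables` are made up for illustration, and it assumes the caption ends at a blank line, as pandoc multiline tables require.

```perl
use strict;
use warnings;

# Hypothetical pattern for one pandoc-style multiline table plus caption.
# /m lets ^ match at every line start; /x allows whitespace and comments.
my $table_re = qr{
    ^ -{8,} [ \t]* \n            # top border: a line of 8 or more dashes only
    (?: (?! Table: ) .* \n )+    # body: lines that do not begin with "Table:"
    Table: .*                    # first caption line
    (?: \n [ \t]* \S .* )*       # caption continuation, up to a blank line
}mx;

# Return every table found in the text, caption included.
sub extract_tables {
    my ($text) = @_;
    my @tables = $text =~ /($table_re)/g;
    return @tables;
}

# Usage: perl extract_tables.pl file.tex
if (@ARGV) {
    local $/;                    # slurp mode: read the whole input at once
    my $text = <>;
    print "$_\n\n" for extract_tables($text);
}
```

Unlike the pseudocode, this keeps caption continuation lines, which matches the desired output. If the caption is not followed by a blank line, the last group will keep consuming the lines after it, so the input layout matters.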



Use pandoc to parse the LaTeX file, then select the interesting tables in the result, then use pandoc again to convert the result back to LaTeX.