I'm using react-pdf, which creates HTML based on the PDF document passed to it. There is a <div> for each PDF page, with 50+ <span> elements (one per line of text) nested inside it, and I need to parse the text of those spans.

The goal is to use this to call element.scrollIntoView() when a specific string is found.
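For illustration, the search-and-scroll step could look roughly like the sketch below. The helper names findElementByText and scrollToText are hypothetical (they are not part of react-pdf), and the sketch assumes the span elements have already been collected into an array-like value.

```javascript
// Hypothetical helper: returns the first element whose text contains `needle`,
// or null if none does. `elements` may be an array, HTMLCollection, or NodeList.
function findElementByText(elements, needle) {
  return (
    Array.from(elements).find((el) => (el.textContent ?? "").includes(needle)) ??
    null
  );
}

// Hypothetical helper: scrolls the first matching element into view.
// Returns true if a match was found, false otherwise.
function scrollToText(elements, needle) {
  const match = findElementByText(elements, needle);
  if (match === null) return false;
  match.scrollIntoView({ behavior: "smooth", block: "center" });
  return true;
}
```

This only works once the spans actually exist in the DOM, which is the part that is failing for me.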

There seem to be two options for getting the elements: an HTMLCollection or a NodeList. Currently I get an HTMLCollection by doing:

const spanElementCollection = page.getElementsByTagName("span");

Logging this to the console shows the full HTMLCollection.

I've found numerous resources recommending that I use Array.from(HTMLCollection) to convert to an array so I can parse it. After converting, the array is always empty.

const spanElements = Array.from(spanElementCollection);

The console shows an empty array after the conversion.

Additionally, I've tried converting a NodeList to an array instead, using const spanElementList = await page.querySelectorAll('div.textLayer > span'); which always returns an empty NodeList, just like the empty array above.

Removing > span from querySelectorAll works to get the parent <div>, but I cannot get the ~50 NodeList <span> children into an array.

const spanElementList = await page.querySelectorAll('div.textLayer'); returns only the parent <div>.

All of this is being called from within an inputRef(() => {}), as recommended by the developer of the package.

I've already viewed the closest related issue here

  • This suggests that the HTML elements are populated after you create the HTML collection with getElementsByTagName. Please make sure you only execute your code (to create an array, or to use another method) when the document is fully loaded with all relevant HTML elements. Commented Oct 4, 2023 at 17:15
  • @trincot I believe you're correct and that dawned on me shortly after writing all of this out (the true magic of writing a stackoverflow post). I changed my approach, but I think implementing a useEffect would fix the problem entirely. Commented Oct 6, 2023 at 16:35
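
To illustrate the timing issue raised in the comments: an HTMLCollection is live, and the browser console evaluates a logged collection when it is expanded, so it can appear full even though it was empty at the moment Array.from() took its snapshot. A minimal sketch of one workaround is to retry the query until the spans exist. The helper name waitForElements and its intervalMs and timeoutMs options are hypothetical, not part of react-pdf; a useEffect or a MutationObserver would be alternative ways to run the code after the text layer has rendered.

```javascript
// Hypothetical helper: calls `query` repeatedly until it returns a non-empty
// array-like, then resolves with a static array snapshot of the result.
// Rejects with an Error if nothing is found within `timeoutMs`.
function waitForElements(query, { intervalMs = 50, timeoutMs = 5000 } = {}) {
  return new Promise((resolve, reject) => {
    const startedAt = Date.now();
    const check = () => {
      const found = Array.from(query());
      if (found.length > 0) {
        resolve(found);
      } else if (Date.now() - startedAt >= timeoutMs) {
        reject(new Error("Timed out waiting for elements"));
      } else {
        setTimeout(check, intervalMs);
      }
    };
    check();
  });
}

// Usage sketch (assumes `page` is the page element from the question):
// waitForElements(() => page.querySelectorAll("div.textLayer > span"))
//   .then((spans) => console.log(spans.length));
```

Polling is the bluntest option, but it makes no assumptions about when the library finishes rendering the text layer.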
