Timeline for Data scraping from Internet with Excel-VBA
Current License: CC BY-SA 4.0
13 events
| when | what | by | license | comment |
|---|---|---|---|---|
| Feb 19, 2019 at 21:12 | comment added | TinMan | | I had an issue with querySelectorAll not being supported prior to IE9 (I think?). Inserting `<meta http-equiv="X-UA-Compatible" content="IE=9">` into the `<head>` of the response text, prior to creating the HTMLFile, should fix it. |
| Feb 19, 2019 at 21:09 | comment added | TinMan | | If there are no elements with the class, JavaScript returns null and VBA sets the object to Nothing. Testing whether or not the element Is Nothing, which I do, should be sufficient. |
| Feb 19, 2019 at 18:32 | comment added | QHarr | | I think this line: `Set element = document.querySelector(".pagnDisabled")` needs tweaking. It will throw a 424 if the element is not present (unless I am missing where it is handled, which is entirely possible). Either handle the potential error or use `Set element = document.querySelectorAll(".pagnDisabled")` and then test whether `element.Length = 0`. You can also remove a loop by using `querySelectorAll("#s-results-list-atf li")`. Use typed functions where possible and add in the ByVal/ByRefs. |
| Dec 29, 2018 at 16:49 | comment added | TinMan | | @RyanWildry I agree. I could think of a better way to explain it. |
| Dec 29, 2018 at 15:52 | comment added | Ryan Wildry | | I was confused as I didn't see a callback; I guess this part is doing that: `While server.readyState <> READYSTATE_COMPLETE: DoEvents: Wend`. That's not really idiomatic async code, IMO, but I guess it works. |
| Dec 29, 2018 at 7:05 | comment added | TinMan | | @RyanWildry Although VBA runs synchronously, it does not have to wait on the XMLHTTP responses. By setting the `XMLHTTP.Open` `varAsync` parameter to `True`, you can have multiple connections running simultaneously: you don't have to wait for each connection to return its response before opening a new connection. My sample code is pretty crude. A better example with more dramatic results is my answer to: Retrieve data from eBird API and create multi-level hierarchy of locations. |
| Dec 29, 2018 at 2:01 | comment added | Ryan Wildry | | Not sure if I follow this part: "When ran the code parses 20 pages of results asynchronously in under 12 seconds". Where/how is the code running async? |
| Dec 27, 2018 at 11:37 | comment added | Vityata | | I am impressed by your remote-debugging skills! It worked! :) |
| Dec 27, 2018 at 11:34 | comment added | TinMan | | @Vityata very strange. Try removing the parameter names. |
| Dec 27, 2018 at 11:32 | comment added | Vityata | | I have just tried the amazon-scraper.xlsb and still got the same error in the same place. Currently on Excel 2010, 64-bit. |
| Dec 27, 2018 at 11:01 | comment added | TinMan | | @Vityata I got that error using a different version of the library. `CreateObject("MSXML2.XMLHTTP.6.0")` gave me the error but `CreateObject("MSXML2.XMLHTTP")` worked. Download amazon-scraper.xlsb. |
| Dec 27, 2018 at 10:03 | comment added | Vityata | | Thanks for the feedback! :) The library is a good idea indeed; I was trying to make the code portable through copy and paste, thus I was not using early binding. In your code I get an error here: `.Open bstrMethod:="GET", bstrUrl:=URL, varAsync:=False` - "448 - Named argument not found". I call it like this: `Debug.Print Join(getBooks("VBA").ToArray, vbNewLine)`. |
| Dec 27, 2018 at 5:11 | history answered | TinMan | CC BY-SA 4.0 | |
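
TinMan's Feb 19, 2019 comment above, about `querySelectorAll` not being available in old IE document modes, can be pictured with a short sketch. This is a hypothetical reconstruction, not code from the answer: the function name `LoadHtml` and the choice of `IE=9` are assumptions. The idea is simply to splice the X-UA-Compatible meta tag into the response's `<head>` before the text is loaded into a late-bound `htmlfile` document.

```vba
' Sketch only: force a newer IE document mode so querySelector/querySelectorAll
' are available on the htmlfile document.
Public Function LoadHtml(ByVal responseText As String) As Object
    Const META As String = "<meta http-equiv=""X-UA-Compatible"" content=""IE=9"">"
    Dim document As Object

    ' Assumption: the response contains a <head> tag; the meta tag is inserted
    ' right after it so it takes effect before the rest of the markup is parsed.
    responseText = Replace(responseText, "<head>", "<head>" & META, Compare:=vbTextCompare)

    Set document = CreateObject("htmlfile")
    document.write responseText

    Set LoadHtml = document
End Function
```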
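The exchange between QHarr and TinMan about the `.pagnDisabled` lookup comes down to two guard styles. The sketch below uses assumed names (`HasDisabledPager`, the `document` parameter) and shows both: test the single-element query for `Nothing`, or query a collection and test its `Length`.

```vba
' Guard style 1: querySelector returns null when nothing matches, which a
' late-bound VBA object variable sees as Nothing.
Public Function HasDisabledPager(ByVal document As Object) As Boolean
    Dim element As Object
    Set element = document.querySelector(".pagnDisabled")
    HasDisabledPager = Not element Is Nothing
End Function

' Guard style 2 (QHarr's suggestion): querySelectorAll always returns a node
' list, so testing its Length avoids run-time error 424 altogether.
Public Function HasDisabledPagerAlt(ByVal document As Object) As Boolean
    HasDisabledPagerAlt = document.querySelectorAll(".pagnDisabled").Length > 0
End Function
```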
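TinMan's Dec 29 explanation of the asynchronous requests, and Ryan Wildry's question about the `DoEvents` loop, can be illustrated with the following outline. It is an assumption-laden sketch, not the answer's code: the `FetchPages` name, the `&page=` query parameter, and the use of a `Collection` are invented for illustration. Every request is opened with the async flag set to `True` before any response is read, and a polling loop then drains them.

```vba
Public Function FetchPages(ByVal baseUrl As String, ByVal pageCount As Long) As Collection
    Const READYSTATE_COMPLETE As Long = 4
    Dim requests As New Collection
    Dim responses As New Collection
    Dim server As Object
    Dim index As Long

    ' Open every connection first; True = asynchronous, so Send returns at once.
    For index = 1 To pageCount
        Set server = CreateObject("MSXML2.XMLHTTP")
        server.Open "GET", baseUrl & "&page=" & index, True
        server.send
        requests.Add server
    Next

    ' Poll for completion; DoEvents keeps Excel responsive and lets the other
    ' connections keep downloading while we wait on the current one.
    For Each server In requests
        While server.readyState <> READYSTATE_COMPLETE
            DoEvents
        Wend
        responses.Add server.responseText
    Next

    Set FetchPages = responses
End Function
```

Whether this counts as "asynchronous" is exactly the point Ryan Wildry raised: the downloads overlap, but VBA itself still blocks in the polling loop.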
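Finally, the error-448 exchange between Vityata and TinMan ("try removing the parameter names") concerns named arguments on a late-bound object. A minimal sketch, with a placeholder URL and an assumed Sub name:

```vba
Public Sub DemoPositionalOpen()
    Const URL As String = "https://www.example.com"   ' placeholder only
    Dim server As Object
    Set server = CreateObject("MSXML2.XMLHTTP")       ' version-independent ProgID

    ' With some MSXML builds, named arguments on a late-bound object fail:
    '   server.Open bstrMethod:="GET", bstrUrl:=URL, varAsync:=False
    ' raises run-time error 448 ("Named argument not found"). Positional
    ' arguments sidestep the problem.
    server.Open "GET", URL, False
    server.send

    Debug.Print server.Status, Len(server.responseText)
End Sub
```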