Wanted: Reimplement WHATWG URL Parser in WASM #38708
Comments
|
Can we use Rust? |
|
As a vendored-in dependency, I personally wouldn't care. But we need to consider the impact on the prerequisites that are necessary to build Node.js. If vendoring the dependency requires someone to set up a whole Rust development environment to be able to build, then that may be problematic. If, alternatively, what we vendor in is just the WASM files and some glue code, such that we only need the WASM compiler, then it really doesn't matter what the source language is. |
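To illustrate what "just the wasm files and some glue code" could look like, here is a minimal sketch. The byte array is a hypothetical stand-in for a vendored, precompiled `.wasm` file: it is a tiny hand-assembled module exporting an `add` function, not a URL parser. The names `wasmBytes`, `wasmModule`, and `wasmInstance` are illustrative assumptions.

```javascript
// Hypothetical stand-in for a vendored, precompiled .wasm file:
// a hand-assembled module exporting add(i32, i32) -> i32.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section
  0x03, 0x02, 0x01, 0x00, // function section
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // code
]);

// The "glue code": compile and instantiate the binary at load time.
// No source-language toolchain is involved at this point.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule, {});

// Exported WASM functions are callable like ordinary JS functions.
const { add } = wasmInstance.exports;
const sum = add(2, 3);
console.log(sum);
```

With this shape, the source language only matters to whoever regenerates the binary; consumers of the vendored files only need a WebAssembly runtime.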
|
How much C/C++ experience would be needed? I have submitted small patches to Gecko before. |
|
The Servo WHATWG URL parser in Rust has CI set up for WASM. Compiling that and publishing the resulting WASM to, e.g., npm could maybe even be something they themselves would be interested in? https://github.com/servo/rust-url/blob/d673c4d5e22b3a8ac91b7f52faa45dc32a275f75/.github/workflows/main.yml |
|
Since the practice already exists in undici, can we first try to work on this as an npm module? |
|
I'll be happy with whatever works and keeps us spec compliant. It would be good to ensure that we can still quickly respond to changes in the base spec, so we'll need to make sure that whatever the implementation is it stays maintained. |
|
@jasnell Has anyone taken the issue? If not then I would like to work on this. |
I tried to find benchmark results for the mentioned undici migration to WASM and found only the following message: nodejs/undici#575. That is probably not the right place to look, since I don't see a significant performance improvement reported there. Could someone tell me where to find more information on existing experiments? |
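For a local baseline to compare any experiment against, a rough micro-benchmark sketch might look like the following. The sample inputs, the iteration count, and the `measureParseRate` helper are all assumptions made for illustration; this is not a representative corpus or Node.js's own benchmark suite.

```javascript
// Illustrative inputs only; a real benchmark would use a larger, varied corpus.
const inputs = [
  'https://example.com/',
  'http://user:pass@example.com:8080/a/b/../c?x=1&y=2#frag',
  'file:///tmp/some/file.txt',
];

// Hypothetical helper: parses every input `iterations` times and reports timing.
function measureParseRate(iterations) {
  const start = performance.now();
  let parsed = 0;
  for (let i = 0; i < iterations; i++) {
    for (const input of inputs) {
      const url = new URL(input);
      // Touch a property so the parse result is actually used.
      parsed += url.href.length > 0 ? 1 : 0;
    }
  }
  const elapsedMs = performance.now() - start;
  return { parsed, elapsedMs, perSecond: parsed / (elapsedMs / 1000) };
}

const result = measureParseRate(10000);
console.log(result);
```

Numbers from a loop like this are noisy and depend on the machine and Node.js version, so they are only useful for before/after comparisons in the same environment.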
|
So, simplified, in essence what it would take is to swap the Lines 79 to 102 in 910efc2 Similar to how the import of Considering that, I guess it may be easier to compile the existing C++ code to WebAssembly than to set up some wasm-pack / wasm-bindgen flow that compiles JS bindings for servo/rust-url using something like I did however open servo/rust-url#712 to see if they would be interested in publishing such bindings themselves. |
|
I hope my question is not too stupid, but may I ask how WASM is debugged? JS and C++ are quite easy to debug, but I have never debugged WASM before. |
|
In this context, you can use Chrome DevTools. It supports DWARF sections and whatnot, so it should "just work". |
|
@voxpelli ... yes, that's essentially it. The one complicating bit there are the |


The current WHATWG URL parser implementation is written in C/C++ and incurs a fairly significant cost crossing the JS/C++ boundary. It should be possible to realize a significant performance improvement by porting the implementation to WASM (similar to how the https://github.com/nodejs/undici project has seen a massive performance boost from moving the llhttp parser to WASM).
If someone wanted to pick up this effort, I am available to help mentor through it.
It will require C/C++ experience. What I would imagine would be best is setting it up as a separate project that we would vendor in as a dependency. Done right, it would have no breaking API changes.
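As a rough illustration of what "no breaking API changes" means in practice, the observable behavior of the `URL` class would need to stay identical regardless of what runs underneath. A minimal sketch of such a check, with an input chosen purely for illustration:

```javascript
// Any reimplementation must produce the same observable results as today.
const url = new URL('HTTP://EXAMPLE.com:80/a/../b?x=1#frag');

const observed = {
  href: url.href,         // scheme and host lowercased, default port dropped
  protocol: url.protocol,
  host: url.host,
  port: url.port,         // empty string, since 80 is the default for http
  pathname: url.pathname, // dot segments resolved
  search: url.search,
  hash: url.hash,
};
console.log(observed);

// Invalid input must keep throwing a TypeError.
let threwTypeError = false;
try {
  new URL('not a url');
} catch (err) {
  threwTypeError = err instanceof TypeError;
}
console.log(threwTypeError);
```

In a real port, the web-platform-tests URL test data would be the authoritative check; this snippet only shows the kind of behavior that has to be preserved.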