[WIP] Code-split components when component testing #20287
base: main
|
Could you help me understand where exactly the performance gains are coming from? That is, what used to happen that is no longer happening?
|
Currently, all of the examples being tested are compiled into a single
bundle.
In large apps, the JS bundle can become the bottleneck for running parallel
tests. In my testing I could see bottlenecks around:
1. Web server serving large bundles to parallel browsers
2. Browsers downloading the bundle
3. Parse and eval time being large for such monolithic bundles.
Ultimately for us, the speed of the test suite was directly proportional to
the amount of work (memory, CPU) the browser was doing.
With code-split bundles, the work done is proportional to the component being
tested. Without code splitting, the work done is proportional to the total
number of components in use.
I hope that makes some sense. If you’re aware of any medium- to large-scale
public codebase that uses component testing, I can perhaps share more
reproducible data.
On Mon, 23 Jan 2023 at 10:38 PM, Pavel Feldman ***@***.***> wrote:
Could you help me understand where exactly are the performance gains
coming from? I.e. what used to happen and is no longer happening.
--
Regards,
Shubham Kanodia
|
|
I guess I'm trying to understand which line gives the bundler a hint on whether it should split the code, and whether we can rely on that. As I read the code, you are still importing everything unconditionally. As for (1), (2) and (3), one would need to stress the system very hard to see an effect in (1) and (2). And even for (3), given that component testing reuses the browser context, everything should settle in the browser's disk and compile caches, so you should not see any amortized savings. Of course doing less work is always better, but I'm curious why the average test time has changed. Maybe we are under-utilizing the browser's compilation cache.
Dynamic imports establish code split points. See —
Yes, in our case we are talking about a large app, with the monolithic bundle standing at ~8MB. With a concurrency of ~10 browsers on, say, an M1 Mac, the web server needs to stream ~80MB at a time in the worst case, at which point the performance of the web server does start to matter. Most apps using Playwright might not be large enough for this, but they should still see smaller improvements.
This is interesting, because in our case the web server appears to serve the bundle for each test case (we logged the requests served). If disk caching or compile caching were involved, we wouldn't be seeing this, nor would we pay the parse cost again. Is that unexpected from your point of view? Lastly, the isolation benefits of code-split entry points (which align with one of Playwright's core philosophies) are a reason on their own, even if we discount the performance benefits.
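As a rough illustration of the arithmetic in this comment, the sketch below estimates the worst-case volume the web server streams when every parallel browser requests its bundle at once. The function name and the 0.5 MB per-component chunk size are hypothetical; only the ~8 MB bundle and the ~10 browsers come from the discussion.

```typescript
// Worst-case megabytes the web server streams when every parallel
// browser requests its bundle at the same moment.
function worstCaseServedMB(bundleSizeMB: number, parallelBrowsers: number): number {
  return bundleSizeMB * parallelBrowsers;
}

// Figures from the discussion: ~8 MB monolithic bundle, ~10 browsers.
const monolithicMB = worstCaseServedMB(8, 10);

// Assumption for illustration only: a code-split chunk for a single
// component weighs 0.5 MB. The real size depends on the component.
const splitMB = worstCaseServedMB(0.5, 10);

console.log({ monolithicMB, splitMB });
```

This ignores HTTP caching entirely, which matches the observation above that the bundle was being served again for each test case.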
I see, but it looks like all modules are loaded unconditionally, or why is it not happening? Every time |
278cea9 to 2d6d374
|
@pavelfeldman Ah, I can see why that might have been confusing. Our internal implementation of code splitting lived in userland, and I missed making the dynamic imports lazy while porting the changes into this PR. They should now be lazily loaded and evaluated only when needed.
See playwright/packages/playwright-test/src/plugins/vitePlugin.ts, lines 311 to 316 in 2d6d374.
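The difference between an eager and a lazy import registry can be sketched as follows. This is not the code from `vitePlugin.ts`; `fakeImport`, `eagerRegistry`, `lazyRegistry` and the component names are hypothetical stand-ins so the example runs without a bundler. In a real bundle the lazy entries would be written as `() => import('./Button')`.

```typescript
// Names of modules whose loading has been triggered, in order.
const evaluated: string[] = [];

// Hypothetical stand-in for `import('./Button')`: records that the module
// was requested and resolves to a module-like object.
function fakeImport(name: string): Promise<{ default: string }> {
  evaluated.push(name);
  return Promise.resolve({ default: name });
}

// Eager registry: each call runs while the object literal is built,
// so every component is loaded whether or not a test needs it.
const eagerRegistry = {
  Button: fakeImport('Button'),
  Modal: fakeImport('Modal'),
};
const evaluatedByEager = evaluated.slice();
evaluated.length = 0; // reset the log before the lazy variant

// Lazy registry: each entry is an arrow function, so nothing loads
// until a test asks for that specific component.
const lazyRegistry = {
  Button: () => fakeImport('Button'),
  Modal: () => fakeImport('Modal'),
};
const evaluatedBeforeMount = evaluated.slice();

// Mounting only Button triggers only Button's loader.
const buttonModule = lazyRegistry.Button();
const evaluatedAfterMount = evaluated.slice();

console.log(Object.keys(eagerRegistry), evaluatedByEager, evaluatedBeforeMount, evaluatedAfterMount);
buttonModule.then(m => console.log('loaded', m.default));
```

Wrapping the import in a function defers the call itself, which is what lets a bundler treat each entry as a separate chunk that is fetched only on demand.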
|
c827bf1 to 30c7d3e
|
I'll take a look at a few of the failing tests this week, but it would be good to know whether I'm heading in the right direction here.
|
Yes, with the latest changes that turn the import into an arrow function, I understand how it becomes lazy / module split!
30c7d3e to 9593fb6
await fs.promises.writeFile(buildInfoFile, JSON.stringify(buildInfo, undefined, 2));

for (const [filename, testInfo] of Object.entries(buildInfo.tests))
  setExternalDependencies(filename, testInfo.deps);
I think that's what is making the tests fail.
Code splitting ensures that the test JS / CSS is the minimum required for a test to run in isolation, which should reduce memory and CPU usage when executing tests. Code splitting also introduces script boundaries, which means that a runtime error in the code of another component would not cause the whole test suite to blow up. This comes at the price of a slight increase in bundling time if your module graph is large and you have a lot of components, but this should be offset by the fact that once compilation happens, running individual tests is a lot faster in such cases.
9593fb6 to 009538f


Code splitting ensures that the test JS / CSS is the minimum required for a test to run in isolation, which should reduce memory and CPU usage when executing tests.
Code splitting also introduces script boundaries, which means that a runtime error in the global code of another component would not lead to the entire test suite blowing up.
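The isolation effect of script boundaries can be modelled with a small sketch. It is a simplified model rather than Playwright's implementation: `loaders`, `runMonolithic`, `runSplit` and `outcome` are hypothetical names, and a synchronous function that throws stands in for a component whose global code fails at load time.

```typescript
// Hypothetical loaders standing in for component modules. BrokenWidget
// throws while its top-level (global) code is evaluated.
const loaders = {
  Button: (): string => 'Button ready',
  BrokenWidget: (): string => {
    throw new Error('BrokenWidget failed at load');
  },
};
type ComponentName = keyof typeof loaders;

// Monolithic model: every module is evaluated before any test runs,
// so one failing module takes every test down with it.
function runMonolithic(target: ComponentName): string {
  for (const name of Object.keys(loaders) as ComponentName[]) loaders[name]();
  return `${target} test ran`;
}

// Code-split model: only the module under test is evaluated.
function runSplit(target: ComponentName): string {
  loaders[target]();
  return `${target} test ran`;
}

// Converts a thrown error into a readable result string.
function outcome(run: () => string): string {
  try {
    return run();
  } catch (e) {
    return `failed: ${(e as Error).message}`;
  }
}

const monolithicButton = outcome(() => runMonolithic('Button'));
const splitButton = outcome(() => runSplit('Button'));
const splitBroken = outcome(() => runSplit('BrokenWidget'));
console.log({ monolithicButton, splitButton, splitBroken });
```

In this model the Button test fails under the monolithic run even though Button itself is fine, while under the split run only the BrokenWidget test fails.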
This comes at a price of a slight increase in bundling time if your module graph is large and you have a lot of components, but this should be offset by the fact that once compilation happens, running individual tests is a lot faster.
When running with and without code splitting on a large real-world test suite at Atlassian with 900+ tests, I noticed that the average time per test dropped from ~5 seconds / test to < 2 seconds / test with code splitting. Browsers also consumed less memory and CPU, which meant we could push for better parallelization on the same hardware.
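For a sense of scale, the per-test figures above can be turned into a rough wall-clock estimate. This is a back-of-the-envelope sketch, not a measurement: the function name is hypothetical, the worker count of 10 is an assumption, and it treats every test as taking the average time, using 900 tests and 2 seconds as round stand-ins for "900+" and "< 2 seconds".

```typescript
// Rough wall-clock estimate: tests are spread evenly over the workers
// and each test is assumed to take the average duration.
function estimateWallClockSeconds(tests: number, secondsPerTest: number, workers: number): number {
  return Math.ceil(tests / workers) * secondsPerTest;
}

// Assumption: 10 parallel workers. The per-test averages are from the text.
const withoutSplitting = estimateWallClockSeconds(900, 5, 10);
const withSplitting = estimateWallClockSeconds(900, 2, 10);

console.log({ withoutSplitting, withSplitting });
```

Under these assumptions the suite drops from about 450 seconds to about 180 seconds, before counting any extra parallelism made possible by the lower memory and CPU usage.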