DEV Community

Letting Playwright MCP Explore your site and Write your Tests

Debbie O'Brien on June 18, 2025

What if your tests could write themselves — just by using your app like a real user? In this post, we explore how the Playwright MCP (Model Contex...
 
Pratik sharma

This is so great

marinsky roma

It's a nice concept, but it doesn't work that smoothly yet, fortunately or not.

Debbie O'Brien

It takes a lot of prompt tweaking, and the LLMs change too. I think I did this with Claude 3.7, but if you try another model you will get different results. It's all still in the exploration stage, that's for sure, but yes, it's a fun concept and I am using it daily on all my projects. The more I use it, the better my prompts get.

marinsky roma

I've tried it with Claude 3.7 too, and I understand that the result can differ from run to run.
I see MCP as useful perhaps only for agent development, not for using agentic IDEs to implement automated tests autonomously. Because of the way LLMs work, the output right now (IMO) is mostly lookups such as "//input", or "//body" matched by its text, and only rarely lookups by role with valuable assertions. Too often, I have to remove useless code from the LLM's output.

Now I mostly prefer to use it only for prototyping, brainstorming, and rephrasing or explaining documentation and code, but never for end-to-end development.
I believe that's true for every experienced engineer.
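
One way to make that review step quicker is to count which locator styles a generated test uses before reading it line by line. The following is a minimal sketch, not part of Playwright or the Playwright MCP: `auditLocators`, `countMatches`, and the sample test lines are hypothetical names and data invented for illustration, and the regular expressions only catch the most common call shapes.

```typescript
// Hypothetical review helper, not part of Playwright or the Playwright MCP.
// It only counts locator styles in the text of a generated test file.
type LocatorAudit = { xpath: number; role: number; other: number };

function countMatches(source: string, pattern: RegExp): number {
  const found = source.match(pattern);
  return found ? found.length : 0;
}

function auditLocators(source: string): LocatorAudit {
  // locator('//...') or locator('xpath=//...'), with any quote style
  const xpath = countMatches(source, /\.locator\(\s*['"`](?:xpath=)?\/\//g);
  // getByRole(...) calls
  const role = countMatches(source, /\.getByRole\(/g);
  // every locator(...) or getByXxx(...) call
  const total = countMatches(source, /\.(?:locator|getBy[A-Z]\w*)\(/g);
  return { xpath, role, other: total - xpath - role };
}

// Invented sample of agent output, for illustration only.
const generatedTest = [
  "await page.locator('//input').fill('Garfield');",
  "await page.locator('xpath=//body').click();",
  "await page.getByRole('button', { name: 'Search' }).click();",
  "await expect(page.getByText('Garfield')).toBeVisible();",
].join("\n");

console.log(auditLocators(generatedTest));
```

A high `xpath` count relative to `role` is a hint that the test needs a human pass, or that the prompt should ask explicitly for role-based locators.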

Debbie O'Brien

We still always need a human in the loop, that's for sure, but it gets you off to a great starting point, and of course so much can and should be improved in this area. Thanks for the feedback and for trying it out.

Moses-Morris

Great 😊 one. This will save time when testing.

aakash paliwal

Thanks, nice one!

Marek Sirkovský

I followed the tutorial (Sonnet 3.5), but got a lot of "I apologize" responses, and the final one was:
"I apologize, but it seems that the website at debs-obrien.github.io/playwright-m... might be either temporarily unavailable or has been significantly changed. The site is not responding as expected, and the elements we're trying to interact with are not present on the page."

Debbie O'Brien

Sorry, it seems too many people read my blog post and started trying it out, so I am guessing my Azure function for the movies database is timing out. Thanks for reporting. Next time I will try to keep the demo simpler. Give it another try, though, and let me know.

Marek Sirkovský

Thanks a lot. I used a different model (GPT-4.1) today, and it seems to be working. The LLM created a simple test, but it does test something useful.
Nice. Thanks for your article!

Nate Liu

Hey, thanks for sharing. I am curious about the exploration part: how does it know what to explore? Did you at least provide some initial hints, such as that this is a movie database website, that I’m a movie lover, and that I usually use this website to do this and that?

I mean, depending on the user role, the “key functionalities” might differ between users.

Debbie O'Brien

It takes a page snapshot so it can see what's on the page. I find it explores pretty much in the same order as if you were using the site without a mouse, tabbing along: first the search field, then the theme toggle, then login, etc. You can be more precise and tell it to ignore certain areas or focus on others, or give it some existing tests and ask it to explore and find tests that have not been written. Maybe I will do that as my next post.
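
To make the "tabbing along" order concrete, here is a rough sketch of walking a page snapshot in document order and listing the interactive elements. Everything in it is an assumption for illustration: the `SnapshotNode` shape, the `INTERACTIVE_ROLES` set, and the sample page are invented and are not the Playwright MCP's actual snapshot format, and real tab order can also be changed by `tabindex`.

```typescript
// Hypothetical, simplified snapshot node; the real MCP snapshot format differs.
interface SnapshotNode {
  role: string;
  name?: string;
  children?: SnapshotNode[];
}

// Assumed set of roles a keyboard user would typically reach.
const INTERACTIVE_ROLES = new Set([
  "searchbox", "textbox", "button", "link", "checkbox", "switch", "combobox",
]);

// Depth-first walk in document order, collecting interactive elements.
function interactiveInOrder(node: SnapshotNode): string[] {
  const result: string[] = [];
  if (INTERACTIVE_ROLES.has(node.role)) {
    result.push(node.role + ' "' + (node.name || "") + '"');
  }
  for (const child of node.children || []) {
    result.push(...interactiveInOrder(child));
  }
  return result;
}

// Invented sample page, for illustration only.
const samplePage: SnapshotNode = {
  role: "document",
  children: [
    {
      role: "banner",
      children: [
        { role: "searchbox", name: "Search" },
        { role: "button", name: "Toggle" },
        { role: "link", name: "Log In" },
      ],
    },
    { role: "main", children: [{ role: "heading", name: "Popular" }] },
  ],
};

console.log(interactiveInOrder(samplePage));
```

A prompt that says "ignore the header" or "focus on the main content" amounts to pruning part of such a tree before exploring.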

Calvin Szeto

I’m trying to make this work for my team, and this is such a great start!

My struggle is with everything that happens after the agent writes the initial test (e.g. in your prompt, the instruction “Execute the test file and iterate until the test passes”).

With Playwright MCP, the agent has access to the accessibility tree and the network requests, which are critical to debugging and fixing a test the way a human would, particularly for choosing the right selectors and mocking network requests correctly (the latter may not be applicable for everyone; my team mocks API requests in order to test different scenarios).

I find that when the agent needs to actually get tests passing, it goes into a loop of analyzing screenshots and writing “debug” tests to output logs, but neither of these does a good job of giving it the information that the MCP does. It ends up writing very complex selectors or not getting the test correct at all.

It feels like I’m soo close with the amazing tools that the MCP provides, but I can’t actually incorporate it into my testing process! 😞 In the meantime, I can at least use it to scaffold out some initial tests like this demo does, and work from there.

Thank you for a fantastic demo! Really excited to see where this project goes from here!

Aditya

Cool project. I want to try it out. Can you please share the GitHub link?

Aditya

Also, I'm a tech writer myself and a dev.to mod. I liked your article, and I'll promote it so it reaches a wider audience.

Debbie O'Brien

Ohh, I don't have one, but I will try to create one.

Aditya

I want to ask you something. Can we connect on LinkedIn?
linkedin.com/in/aditya-sharma123 - this is me.

Tommy

Next time you ask ChatGPT to create an article for you, at least change the formatting and delete the icons so it's not sooo obvious...

Debbie O'Brien

Well, actually, my process is to record a video, upload it, ask Copilot to create a transcript, and then ask ChatGPT to create the blog post based on my transcript. This saves tons of time because I really don't have much time to share all the cool things I do. Thanks for the feedback; I will ask ChatGPT to keep that in mind for the next blog post I create from a transcript.

Aditya

For exactly this purpose, I created my own MCP server to write dev.to articles for me. You can check out my repo: github.com/extinctsion/mcp-py-devto. You can use the MCP tool to generate unpublished articles on dev.to and then tweak those drafts as needed. It is indeed a game changer for me!

Tarun Varshney

Does Playwright send any data outside the local machine? Copilot obviously would.

Debbie O'Brien

No, Playwright doesn't send or store any data.