What if your AI assistant could do more than just chat? What if it could actually do things?
We're not talking about just suggesting code snippets. We're talking about actively executing PowerShell commands, automating any website with JavaScript, compiling C# apps on the fly, and even orchestrating conversations between different AI models—all from a single, voice-controlled Windows application.
That's the reality with PowerShellGPT, a project I've poured countless hours into. Today, I'm thrilled to share it with the dev.to community.
The Core Concept: The Intelligent Feedback Loop
At its heart, PowerShellGPT is built on a simple but powerful idea: closing the loop between AI code generation and real-world execution. The process is a continuous cycle:
You Prompt: You issue a command via text or voice in one of over 80 languages.
AI Generates: The AI model (your choice!) generates PowerShell or JavaScript, wrapped in special tags like @PowerShellGPT@ or @JsGPT@.
PowerShellGPT Executes: My application detects these tags, asks for your permission (crucial for security), and then executes the code in the appropriate environment.
Feedback is Returned: The output, result, or any errors from the execution are captured and sent back to the AI as a new prompt.
This feedback loop is the magic ingredient. It allows the AI to understand the consequences of its code, debug its own errors, and perform complex, multi-step tasks.
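To make the cycle concrete, here is a minimal JavaScript sketch of one turn of the loop. It is not the application's actual source. The tag names come from this post, but the closing form @/JsGPT@ is assumed by analogy with @/PowerShellGPT@, and `extractTaggedBlocks`, `runTurn`, `askPermission`, and `execute` are hypothetical names used only for illustration.

```javascript
// Hypothetical sketch of the prompt -> generate -> execute -> feedback cycle.
// Assumption: @JsGPT@ blocks close with @/JsGPT@, mirroring @/PowerShellGPT@.
const TAG_PATTERN = /@(PowerShellGPT|JsGPT)@([\s\S]*?)@\/\1@/g;

// Pull every tagged code block out of a model response.
function extractTaggedBlocks(responseText) {
  const blocks = [];
  for (const match of responseText.matchAll(TAG_PATTERN)) {
    blocks.push({ environment: match[1], code: match[2].trim() });
  }
  return blocks;
}

// One turn of the loop. `askPermission` and `execute` are injected stand-ins
// for the approval dialog and the real PowerShell/browser runners.
async function runTurn(responseText, { askPermission, execute }) {
  const feedback = [];
  for (const block of extractTaggedBlocks(responseText)) {
    if (!(await askPermission(block))) {
      feedback.push(`[${block.environment}] execution declined by user`);
      continue;
    }
    try {
      const output = await execute(block);
      feedback.push(`[${block.environment}] output:\n${output}`);
    } catch (error) {
      feedback.push(`[${block.environment}] error:\n${error.message}`);
    }
  }
  // The joined text is what would go back to the model as its next prompt.
  return feedback.join("\n\n");
}
```

Errors are captured rather than thrown so that the model sees them on the next turn, which is what lets it debug its own code.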
The Arsenal: System & Web Automation
This isn't just a simple wrapper. PowerShellGPT provides two powerful, distinct execution environments.
- System Control via PowerShell (& a Revolutionary Console)
You can execute any PowerShell command generated by the AI. The output is displayed in the Console Browser. But here's the part I'm most proud of: the Console Browser's entire UI is a user-editable JavaScript plugin.
That's right. The PowerShell Console System JavaScript is user-editable and contains the HTML, CSS, and JS that render the console. Don't like the retro theme? Want to add a button that triggers your favorite script? You can build your own PowerShell interface. This level of customization is, as far as I know, unprecedented.
- Web Control via BrowserGPT
BrowserGPT is a full-fledged, multi-tabbed automation hub. The AI can inject JavaScript to automate any website. What makes it truly powerful are the advanced browser directives, a simple orchestration language I built from comments placed before the JavaScript code:
//run in tab ID [UserID]//...
//run in tab titlecontains [Text]//...
//load page in new tab [URL]//...
//orcreate [URL]//... (This one is huge: if the target tab doesn't exist, it creates it and navigates before running the code.)
//switch to target tab//
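As a rough illustration of how directives like these could be separated from the code that follows them, here is a small parser sketch. It rests on assumptions this post does not confirm: one directive per line, all directives before the code, and arguments written without the square brackets. The names `DIRECTIVES`, `parseDirective`, and `splitScript` are hypothetical.

```javascript
// Hypothetical parser for the comment directives listed above.
// Assumptions: one directive per line, directives precede the code,
// and arguments are written without the [ ] placeholder brackets.
const DIRECTIVES = [
  { name: "runInTabId", prefix: "run in tab ID ", takesArgument: true },
  { name: "runInTabTitleContains", prefix: "run in tab titlecontains ", takesArgument: true },
  { name: "loadPageInNewTab", prefix: "load page in new tab ", takesArgument: true },
  { name: "orCreate", prefix: "orcreate ", takesArgument: true },
  { name: "switchToTargetTab", prefix: "switch to target tab", takesArgument: false },
];

// Return { name, argument } for a directive line, or null for anything else.
function parseDirective(line) {
  // Greedy match so that the "//" inside a URL argument is kept intact.
  const wrapped = line.trim().match(/^\/\/(.*)\/\/$/);
  if (!wrapped) return null;
  const body = wrapped[1].trim();
  for (const directive of DIRECTIVES) {
    if (directive.takesArgument && body.startsWith(directive.prefix)) {
      return { name: directive.name, argument: body.slice(directive.prefix.length).trim() };
    }
    if (!directive.takesArgument && body === directive.prefix) {
      return { name: directive.name, argument: null };
    }
  }
  return null;
}

// Split a script into its leading directives and the JavaScript to inject.
function splitScript(script) {
  const lines = script.split(/\r?\n/);
  const directives = [];
  let index = 0;
  while (index < lines.length) {
    const directive = parseDirective(lines[index]);
    if (!directive) break; // first non-directive line starts the code
    directives.push(directive);
    index += 1;
  }
  return { directives, code: lines.slice(index).join("\n") };
}
```

Ordinary comments that happen to be wrapped in slashes are left in the code, because only the known directive names are accepted.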
This system allows you to build complex, multi-page web agents that can navigate, scrape data, and interact with elements across different tabs. It also supports two-way communication through window.chrome.webview.postMessage(), which lets you prompt the AI model from the browser using JavaScript.
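A page script could wrap that call in a small helper like the sketch below. `window.chrome.webview.postMessage` is the standard WebView2 call for sending a message to the host application. The message format PowerShellGPT expects is not described in this post, so a plain string is assumed, and `promptModel` is a hypothetical name.

```javascript
// Hypothetical helper: send a prompt from page script back to the host app.
// Assumption: the host accepts the prompt as a plain string message.
function promptModel(promptText, webview = globalThis.chrome?.webview) {
  if (!webview || typeof webview.postMessage !== "function") {
    return false; // not running inside a WebView2 host
  }
  webview.postMessage(promptText);
  return true;
}
```

Inside BrowserGPT, a call such as `promptModel("Summarize this page: " + document.title)` would hand the text to the host. Outside a WebView2 host the helper returns false instead of throwing.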
The Brains: Multi-AI Support & The Control Panel
You are not locked into a single ecosystem. Seamlessly switch between Gemini, Claude, ChatGPT, Grok, and local models via LM Studio.
Each AI model's browser window gets its own floating, draggable AI Output Control Panel. This lets you:
Toggle Output Mode: Decide if the [modelsresponse] placeholder in your scripts gets the AI's full text or just its last code block.
Recapture Response: Instantly re-process the AI's last output.
Trigger TTS: Use the integrated speaker icon to have the response read aloud.
(ChatGPT Specific): Choose between ChatGPT's native voice or the universal TTS system.
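The output-mode toggle can be pictured as a choice of what gets substituted for [modelsresponse]. The sketch below assumes that a "code block" means a Markdown fence of three backticks, which this post does not specify. The mode names `"full"` and `"lastCodeBlock"` and both function names are hypothetical.

```javascript
// Hypothetical sketch of the two output modes for [modelsresponse].
// Assumption: code blocks in the model's reply are Markdown fences.
const FENCE = "`".repeat(3);

// Return the body of the last fenced block, or null if there is none.
function lastCodeBlock(responseText) {
  const pattern = new RegExp(FENCE + "[^\\n]*\\n([\\s\\S]*?)" + FENCE, "g");
  let last = null;
  for (const match of responseText.matchAll(pattern)) {
    last = match[1];
  }
  return last === null ? null : last.replace(/\n$/, "");
}

// Replace every [modelsresponse] placeholder according to the chosen mode.
function fillModelsResponse(script, responseText, mode) {
  const value = mode === "lastCodeBlock"
    ? (lastCodeBlock(responseText) ?? responseText) // fall back to full text
    : responseText;
  // split/join avoids the special "$" patterns of String.prototype.replace.
  return script.split("[modelsresponse]").join(value);
}
```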
The Voice: Universal Voice Control & TTS
Voice Control: Speak your commands in over 80 languages with "Click to Talk," "Constant," and "Wake Word" modes.
Universal TTS: This is a centralized Text-To-Speech engine (powered by LazyPy) with over 1400 voices in 90 languages. It works for every connected AI. The system intelligently strips code from AI responses for clean, natural-sounding speech. You can even set different voices for different AIs by placing a [SETVOICE][VoiceName] marker within the text submitted for conversion to speech, perfect for orchestrating AI vs. AI debates.
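Here is a rough sketch of that preprocessing: strip the code, then split the remaining text into per-voice segments. It assumes a [SETVOICE][VoiceName] marker applies to the text that follows it, and that code arrives either in the @...@ tags or in Markdown fences. The function names and the voice names in the usage note are hypothetical.

```javascript
// Hypothetical TTS preprocessing: remove code, then split text by voice.
// Assumption: a [SETVOICE][Name] marker affects the text that follows it.
function stripCode(text) {
  const fence = "`".repeat(3);
  return text
    .replace(/@(PowerShellGPT|JsGPT)@[\s\S]*?@\/\1@/g, " ") // tagged blocks
    .replace(new RegExp(fence + "[\\s\\S]*?" + fence, "g"), " ") // fenced blocks
    .replace(/\s+/g, " ")
    .trim();
}

// Return [{ voice, text }, ...] in speaking order.
function splitByVoice(text, defaultVoice) {
  const clean = stripCode(text);
  const marker = /\[SETVOICE\]\[([^\]]+)\]/g;
  const segments = [];
  let voice = defaultVoice;
  let cursor = 0;
  for (const match of clean.matchAll(marker)) {
    const spoken = clean.slice(cursor, match.index).trim();
    if (spoken) segments.push({ voice, text: spoken });
    voice = match[1];
    cursor = match.index + match[0].length;
  }
  const tail = clean.slice(cursor).trim();
  if (tail) segments.push({ voice, text: tail });
  return segments;
}
```

For a debate, `splitByVoice("[SETVOICE][Alice]Tabs win. [SETVOICE][Bob]Spaces, obviously.", "Default")` yields one segment per speaker, each of which could then be sent to the speech engine with its own voice.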
Putting It All Together: From Voice Command to Compiled App
PowerShellGPT enables some truly mind-blowing workflows. Here’s the killer example: asking the AI to write, compile, and execute a C# application from a single voice prompt.
You Say: "Write an application in C# that shows a window with a button that when clicked toggles the visibility of the taskbar."
The AI Generates (and PowerShellGPT Executes):
@PowerShellGPT@
# C# source for a small WinForms app that toggles the taskbar
$csharpCode = @'
using System;
using System.Runtime.InteropServices;
using System.Windows.Forms;

public class TaskbarToggler {
    [DllImport("user32.dll")]
    private static extern IntPtr FindWindow(string lpClassName, string lpWindowName);

    [DllImport("user32.dll")]
    private static extern int ShowWindow(IntPtr hWnd, int nCmdShow);

    private const int SW_HIDE = 0;
    private const int SW_SHOW = 5;
    private static bool isVisible = true;
    private static IntPtr taskbarHandle = FindWindow("Shell_TrayWnd", null);

    public static void Toggle() {
        ShowWindow(taskbarHandle, isVisible ? SW_HIDE : SW_SHOW);
        isVisible = !isVisible;
    }
}

public class Program {
    [STAThread]
    public static void Main() {
        Application.EnableVisualStyles();
        Application.SetCompatibleTextRenderingDefault(false);

        Form form = new Form();
        form.Text = "Taskbar Toggle";
        form.Width = 300;
        form.Height = 150;

        Button toggleButton = new Button();
        toggleButton.Text = "Toggle Taskbar";
        toggleButton.Dock = DockStyle.Fill;
        toggleButton.Click += (sender, e) => TaskbarToggler.Toggle();

        form.Controls.Add(toggleButton);
        Application.Run(form);
    }
}
'@

# Write the source to disk and compile it with the .NET Framework compiler
$sourceFile = "TaskbarApp.cs"
Set-Content -Path $sourceFile -Value $csharpCode
$cscPath = "C:\Windows\Microsoft.NET\Framework\v4.0.30319\csc.exe"
$outputExe = "TaskbarApp.exe"

if (Test-Path $cscPath) {
    & $cscPath /nologo /t:winexe /out:$outputExe $sourceFile
    # Check the compiler's exit code too, so a stale .exe from an earlier run
    # is not mistaken for a successful build
    if ($LASTEXITCODE -eq 0 -and (Test-Path $outputExe)) {
        Add-Type -AssemblyName System.Windows.Forms
        $result = [System.Windows.Forms.MessageBox]::Show(
            "Run the program?", "Compilation successful",
            [System.Windows.Forms.MessageBoxButtons]::YesNo)
        if ($result -eq [System.Windows.Forms.DialogResult]::Yes) {
            Start-Process ".\$outputExe"
        }
    } else {
        Write-Output "Compilation failed"
    }
} else {
    Write-Output "C# compiler not found"
}
@/PowerShellGPT@
The result? A running C# application, created from scratch by a single voice command. The same principles apply to the included AI DJ (which automates YouTube) and Agent Vision (which uses a webcam and an image-to-text service to let the AI "see").
Advanced Agentic Features
For power users, there's even more depth:
Command Chaining: Link commands with a natural phrase (default: "and then").
Dynamic [KEYWORD] Commands: Create flexible voice templates where your speech dynamically fills in parts of a script.
The Agent Bridge: A command-line utility (agent_bridge.exe) to trigger actions in named PowerShellGPT instances from external scripts or even other agents.
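The first two features can be sketched in a few lines of JavaScript. This is only a guess at the mechanics: the post does not say how chaining or [KEYWORD] matching is implemented, so `splitChain`, `fillKeyword`, and the idea of pairing a spoken template with a script template are assumptions made for illustration.

```javascript
// Escape a string for literal use inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Split one utterance into separate commands on the chaining phrase.
function splitChain(utterance, phrase = "and then") {
  return utterance
    .split(new RegExp("\\s+" + escapeRegExp(phrase) + "\\s+", "i"))
    .map((part) => part.trim())
    .filter(Boolean);
}

// Match speech against a template such as "search for [KEYWORD]" and
// substitute the captured words into a script template.
// Returns null when the utterance does not fit the template.
function fillKeyword(template, script, utterance) {
  const [before, after = ""] = template.split("[KEYWORD]");
  const pattern = new RegExp(
    "^" + escapeRegExp(before) + "(.+?)" + escapeRegExp(after) + "$",
    "i"
  );
  const match = utterance.trim().match(pattern);
  return match ? script.split("[KEYWORD]").join(match[1].trim()) : null;
}
```

With these, "open notepad and then type hello" splits into two commands, and saying "search for WebView2 docs" against the template "search for [KEYWORD]" fills the script "Write-Output 'Searching for [KEYWORD]'" with the spoken words.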
Why I Built This & The Vision
I believe the future of AI is agentic. It's about giving these powerful models the tools to interact with the world, learn from the results, and take meaningful action. PowerShellGPT is my vision for that future on the Windows desktop—a platform for building your own powerful, personalized AI agents.
Where to Find It
Download & Learn More: PowerShellGPT.com
Watch the Demos: YouTube Channel: YouTube/@PowerShellGPT
I invite you to download it, experiment, and show me what amazing things you can create. I'm excited to see what you build.