Baseten reposted this
Today we're releasing TIM-Qwen3.6-27B on a new OpenAI and Anthropic compatible API. Last month I wrote that open models had finally caught up to frontier models on the work most people *actually* need AI to do. The bottleneck stopped being the model and started being the environment around it. This release is our next step to unlock open models with our co-designed runtime and post-training process, now delivered in an API format that developers already love. The newest iteration of our inference runtime, TIMRUN, compresses context on the fly without losing reasoning quality. On long-context agent workloads, that means 10x effective context window length, 3x concurrent throughput, and 49% lower latency compared to models using SGLang on the same GPU. If you have a project that uses the OpenAI or Claude SDKs, you can point it at our endpoint and try TIM-Qwen3.6-27B in a few minutes. Full post on this release linked below. (We're also excited to share this system with 250+ developers at our hackathon next week with Baseten, Cloudflare, and Wayfair as part of Boston TECH WEEK by a16z)