- π Make LLMs run on spyre accelerator and CPUs (yes CPUs)
- π© Bend vLLM into places it wasn't designed for
- ποΈ Fix build systems across architectures
- π¦ Make multi-arch containers actually behave
- β‘ CPU-only LLM inference pipelines
- ποΈ s390x(cpu) and spyre support for modern ML stacks
- βΈοΈ Infra that works beyond a single machine
π₯ squeezing every drop of performance.
β οΈ making PyTorch do questionable things
βΈοΈ running clean infra on Kubernetes


