Croktile Team
Introducing Croktile
We're excited to introduce Croktile, a new TileFlow programming language for GPU computing.
announcement, release
Today we're thrilled to publicly announce Croktile — a C++ EDSL (Embedded Domain Specific Language) designed to make GPU kernel programming dramatically simpler, safer, and more productive.
The Problem
Writing high-performance GPU kernels today is painful. Even experienced CUDA programmers spend significant time on:
- Manual index calculations that are error-prone
- DMA descriptor configuration that varies across hardware
- Shape mismatches that only surface as silent runtime bugs
- Code that's hard for AI tools to understand and optimize
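To make the first and third points concrete, here is a minimal sketch in plain C++ (not Croktile, and not taken from any Croktile source) of the hand-written index arithmetic a tiled kernel typically needs. The helper `load_tile` and its parameters are hypothetical names chosen for illustration, and the code runs on the CPU to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only: copies one tile out of a
// row-major matrix using manual index calculations. Elements that fall
// outside the matrix are left as zero padding.
std::vector<float> load_tile(const std::vector<float>& matrix,
                             std::size_t rows, std::size_t cols,
                             std::size_t tile_row, std::size_t tile_col,
                             std::size_t tile_h, std::size_t tile_w) {
    // The only shape check available here happens at run time.
    assert(matrix.size() == rows * cols);

    std::vector<float> tile(tile_h * tile_w, 0.0f);
    for (std::size_t i = 0; i < tile_h; ++i) {
        for (std::size_t j = 0; j < tile_w; ++j) {
            std::size_t r = tile_row * tile_h + i;  // global row index
            std::size_t c = tile_col * tile_w + j;  // global column index
            if (r < rows && c < cols) {
                // Row-major offset. Writing r * rows + c instead would
                // still compile, and would only misbehave on non-square
                // matrices.
                tile[i * tile_w + j] = matrix[r * cols + c];
            }
        }
    }
    return tile;
}
```

Every offset in this function is computed by hand, and nothing in the type system ties `rows`, `cols`, and the tile dimensions to the data they describe. Swapping two arguments of the same integer type goes unnoticed by the compiler, which is the kind of silent shape bug described above.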
Our Solution
Croktile introduces the TileFlow programming paradigm — a way to express data movement and computation on GPUs using high-level, shape-aware primitives.
With Croktile, a matrix multiplication kernel is just 12 lines of readable code, compared to 30+ lines of equivalent CUDA.
What's Next
We're working on:
- Extended tutorial documentation
- AI-driven auto-tuning tools
- Support for more GPU architectures
- Community-contributed kernel libraries
Stay tuned for more updates!