Croktile Team

Introducing Croktile

We're excited to introduce Croktile, a new TileFlow programming language for GPU computing.

announcement, release

Today we're thrilled to publicly announce Croktile, a C++ embedded domain-specific language (EDSL) designed to make GPU kernel programming dramatically simpler, safer, and more productive.

The Problem

Writing high-performance GPU kernels today is painful. Even experienced CUDA programmers spend significant time on:

  • Manual index calculations that are error-prone
  • DMA descriptor configuration that varies across hardware
  • Shape mismatches that only surface as silent runtime bugs
  • Code that's hard for AI tools to understand and optimize
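To make the first and third points concrete, here is a minimal sketch in plain C++ that runs on the CPU. It is not Croktile code and not a real GPU kernel; the function name `matmul_manual` and its signature are made up for illustration. It shows the kind of hand-written offset arithmetic a kernel author has to get right, and why a wrong stride or a mismatched shape is easy to miss.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative helper (hypothetical name): row-major C = A * B,
// with A of shape m x k and B of shape k x n.
std::vector<float> matmul_manual(const std::vector<float>& a,
                                 const std::vector<float>& b,
                                 std::size_t m, std::size_t k, std::size_t n) {
    // Shapes are plain integers here, so a mismatch can only be
    // caught by a runtime check like this one, if it is caught at all.
    assert(a.size() == m * k && b.size() == k * n);

    std::vector<float> c(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                // The row stride of A is k and the row stride of B is n.
                // Swapping them (a[i * n + p] or b[p * k + j]) still
                // compiles and can quietly produce wrong results.
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = acc;
        }
    }
    return c;
}
```

Nothing in the types ties `a`, `b`, and the three sizes together, so the compiler cannot help when one of them is wrong.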

Our Solution

Croktile introduces the TileFlow programming paradigm — a way to express data movement and computation on GPUs using high-level, shape-aware primitives.
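As a rough illustration of what "shape-aware" can mean in a C++ EDSL, the sketch below encodes tile shapes in the type system. This is not Croktile's API, which this post does not show: `Tile` and `matmul` are hypothetical names, and the code runs on the CPU. It only demonstrates the general idea that a shape mismatch can become a compile-time error instead of a runtime bug.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical type for illustration only; not part of Croktile.
// The shape (Rows x Cols) is part of the type, stored row-major.
template <std::size_t Rows, std::size_t Cols>
struct Tile {
    std::array<float, Rows * Cols> data{};  // zero-initialized

    float& at(std::size_t r, std::size_t c) {
        assert(r < Rows && c < Cols);
        return data[r * Cols + c];  // index math lives in one place
    }
    float at(std::size_t r, std::size_t c) const {
        assert(r < Rows && c < Cols);
        return data[r * Cols + c];
    }
};

// The shared dimension K appears in both parameter types, so calling
// this with mismatched inner dimensions fails to compile.
template <std::size_t M, std::size_t K, std::size_t N>
Tile<M, N> matmul(const Tile<M, K>& a, const Tile<K, N>& b) {
    Tile<M, N> c;
    for (std::size_t i = 0; i < M; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (std::size_t p = 0; p < K; ++p) {
                acc += a.at(i, p) * b.at(p, j);
            }
            c.at(i, j) = acc;
        }
    }
    return c;
}
```

With these definitions, multiplying a `Tile<2, 3>` by a `Tile<3, 2>` yields a `Tile<2, 2>`, while multiplying a `Tile<2, 3>` by a `Tile<2, 3>` is rejected by the compiler because `K` cannot be deduced consistently.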

With Croktile, a matrix multiplication kernel is just 12 lines of readable code, compared to 30+ lines of equivalent CUDA.

What's Next

We're working on:

  • Extended tutorial documentation
  • AI-driven auto-tuning tools
  • Support for more GPU architectures
  • Community-contributed kernel libraries

Stay tuned for more updates!