In Research · Cadence Labs

Kàkàņfò

Inference at the boundary.

A lightweight inference runtime built for edge hardware — ARM, RISC-V, and custom NPUs. Real-time machine learning at the device layer, with no cloud dependency and no compromise on capability.

Join the waitlist · What we're building

Active research programme · Pre-alpha · Waitlist open

Kàkàņfò (kah-kah-nnn-faw)

"Field agent; one who sees from afar and acts promptly."

The name reflects the core intent of the project — bringing perceptive intelligence to devices at the edge of a network, far from the cloud, operating in real time on constrained hardware with limited or no connectivity.

Where SEER gives visibility into AI systems that run in the cloud, Kàkàņfò brings intelligence to the hardware that sits at the physical boundary of the world — sensors, robots, medical devices, industrial equipment, and the billions of edge nodes that cloud-native ML cannot reach.

The Problem

Brilliant models. Useless at the edge.

Most machine learning assumes a network connection, a GPU server, and milliseconds to spare. For the majority of the world's hardware — industrial sensors, medical implants, mobile robots, satellite uplinks — none of those assumptions hold.

The world your model was built for
  • High-bandwidth internet connection always available
  • Data sent to a cloud GPU for inference — results returned
  • Hundreds of milliseconds of latency acceptable
  • Power budget measured in watts, not microwatts
  • Homogeneous, well-supported hardware (NVIDIA, x86)
  • Retrain and redeploy in hours — model updates are easy
The world Kàkàņfò is built for
  • Intermittent or zero network connectivity — decisions must happen locally
  • Inference runs on-device, in real time — no round-trip to the cloud
  • Latency measured in microseconds — a robotic arm cannot wait
  • Power budgets in milliwatts — battery-powered, energy-harvested
  • Heterogeneous hardware: ARM Cortex-M, RISC-V, custom ASICs, NPUs
  • Model updates are infrequent and carefully managed over-the-air
What We're Building

An inference runtime that
runs anywhere hardware runs.

Kàkàņfò is a software layer that sits between your trained ML model and the hardware it runs on. It handles the hard parts — memory layout, kernel scheduling, power management, quantisation — so the model runs efficiently regardless of what chip it lands on.

Hardware-native kernel compilation

Kàkàņfò compiles model operations directly to the instruction set of the target hardware — NEON for ARM, vector extensions for RISC-V, custom kernels for NPUs. No generic fallbacks. Runs as fast as the hardware allows.

📦

Aggressive quantisation with quality preservation

Models are quantised to INT8, INT4, or binary representations at runtime, with automatic calibration to preserve accuracy. A model that runs at 94% accuracy in FP32 typically runs at 91–93% at INT8 — with 4x less memory and 4x the speed.
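To make the idea concrete, here is a minimal sketch of symmetric per-tensor INT8 quantisation — the generic technique, not Kàkàņfò's implementation. The calibration step here is simply scaling by the largest-magnitude weight; a real runtime would calibrate on representative activations as well.

```python
import numpy as np

def quantize_int8(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor INT8 quantisation.

    The scale maps the largest-magnitude weight onto the INT8 range,
    so every value fits in one signed byte (4x smaller than FP32).
    """
    scale = float(np.abs(weights).max()) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an FP32 approximation of the original weights."""
    return q.astype(np.float32) * scale

weights = np.random.default_rng(0).normal(size=1024).astype(np.float32)
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)

# Round-trip error is bounded by half a quantisation step.
assert np.abs(weights - recovered).max() <= scale / 2 + 1e-6
# INT8 storage is a quarter of the FP32 footprint.
assert q.nbytes == weights.nbytes // 4
```

The accuracy loss quoted above comes from exactly this rounding error accumulating across layers, which is why calibration data matters: it sets the scale so that the values that actually occur at inference time use the full INT8 range.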

🔋

Power-aware scheduling

Kàkàņfò monitors available power budget in real time and adjusts inference precision and batch behaviour dynamically. A device on battery power infers at INT4. When plugged in, it upgrades to INT8 automatically — no code changes needed.
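The decision logic can be sketched as a simple policy over the measured power budget. The thresholds and the `select_precision` function below are illustrative assumptions, not the runtime's actual policy:

```python
def select_precision(power_budget_mw: float, plugged_in: bool) -> str:
    """Pick an inference precision from the available power budget.

    Thresholds are hypothetical: a plugged-in or high-budget device
    runs at INT8, a battery-powered one drops to INT4, and an
    energy-harvested node falls back to binary weights.
    """
    if plugged_in or power_budget_mw >= 50.0:
        return "INT8"
    if power_budget_mw >= 5.0:
        return "INT4"
    return "BINARY"

assert select_precision(120.0, plugged_in=False) == "INT8"
assert select_precision(10.0, plugged_in=False) == "INT4"
assert select_precision(0.8, plugged_in=False) == "BINARY"
assert select_precision(0.8, plugged_in=True) == "INT8"
```

The point of pushing this decision into the runtime is that application code never branches on power state: the same inference call runs everywhere, and the scheduler degrades precision instead of dropping frames.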

📡

Offline-first model management

Models are deployed to devices as compact binary packages over any transport — BLE, LoRa, USB, or direct flash write. Kàkàņfò handles versioning, rollback, and A/B deployment across fleets of devices, even in disconnected environments.
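A/B deployment with rollback is a standard OTA pattern, and a minimal sketch of the slot-swapping logic looks like this (the `ModelStore` class and its methods are hypothetical illustrations, not the Kàkàņfò API):

```python
class ModelStore:
    """Two-slot A/B model store: new packages are written to the
    standby slot, activation swaps slots, and rollback swaps back."""

    def __init__(self, initial_version: str):
        self.slots = {"A": initial_version, "B": None}
        self.active = "A"

    @property
    def standby(self) -> str:
        return "B" if self.active == "A" else "A"

    def deploy(self, version: str) -> None:
        # Write the incoming package to the inactive slot only;
        # the running model is never touched mid-update.
        self.slots[self.standby] = version

    def activate(self) -> str:
        if self.slots[self.standby] is None:
            raise RuntimeError("no package staged in standby slot")
        self.active = self.standby
        return self.slots[self.active]

    def rollback(self) -> str:
        # The previous version still lives in the now-standby slot.
        self.active = self.standby
        return self.slots[self.active]

store = ModelStore("v1.0")
store.deploy("v1.1")        # staged over BLE, LoRa, USB, or flash
assert store.activate() == "v1.1"
assert store.rollback() == "v1.0"
```

Because the previous package is never overwritten until a new one is staged, a device that loses power or fails validation mid-update can always boot the known-good slot — which is what makes this pattern safe on disconnected fleets.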

🧩

Universal model import

Import from TensorFlow Lite, ONNX, PyTorch Mobile, or CoreML. Kàkàņfò handles the conversion and optimisation internally. You train in whatever environment you prefer — Kàkàņfò handles the edge deployment.

Hardware support status
ARM Cortex-A (64-bit) · Active
ARM Cortex-M (32-bit) · Active
RISC-V RV64 · In research
RISC-V RV32 · In research
Custom NPU (via SDK) · In research
RP2040 / RP2350 · Planned
ESP32 family · Planned
Benchmark — MobileNetV2, Cortex-A53
Inference latency (INT8): 4.2 ms
Memory footprint: 1.4 MB
Power draw: 18 mW
Accuracy vs FP32 baseline: −1.8 pts
Binary size (runtime): 220 KB
Use Cases

Where Kàkàņfò works.

Any environment where intelligence is needed but cloud connectivity cannot be assumed. Kàkàņfò is designed for hardware that operates at the boundary — physically, economically, or logistically — of the connected world.

🏭

Industrial sensing

Predictive maintenance models running directly on vibration sensors and PLCs. Detect bearing failure, anomalous heat signatures, and process deviations in real time, at the machine, without sending data off-site.

Latency critical · Air-gapped networks
🤖

Robotics & autonomous systems

Perception and decision models running on the robot's own compute — no latency penalty from cloud round-trips. Object detection, path planning, and anomaly avoidance at the speed the hardware demands.

Sub-10ms latency · Battery powered
🏥

Medical devices

Classification and signal analysis on implantables, wearables, and point-of-care devices. Runs within strict power budgets and memory constraints, with the reliability standards medical hardware requires.

Regulatory-grade reliability · Ultra-low power
🌾

Remote & rural infrastructure

Agricultural sensors, environmental monitors, and utility infrastructure in areas with no reliable connectivity. Inference runs locally, data is summarised on-device, and summaries sync whenever connectivity is briefly available.

Intermittent connectivity · Solar powered
🚗

Automotive & mobility

In-vehicle classification and sensor fusion running on automotive-grade ARM MCUs. From ADAS features to fleet telematics — processed locally, with no dependency on mobile network availability.

ISO 26262 targets · Harsh environments
📱

Consumer IoT

On-device intelligence for smart home, wearable, and consumer electronics manufacturers who want to add ML capability without the cost, latency, or privacy exposure of cloud inference.

Cost sensitive · Privacy preserving
Research Status

Where we are. Where we're going.

Kàkàņfò is an active research programme inside Cadence Labs' Edge & Distributed Intelligence domain. We are currently in the pre-alpha phase — the core runtime is functional on ARM Cortex-A hardware, and RISC-V support is in development.

We are building the waitlist to understand the landscape of use cases and hardware targets before we move into closed alpha. Waitlist members directly influence our roadmap — we read every response and reach out to learn more.

If you're working on a problem that needs on-device ML, we want to hear about it now — not after we launch.

Join the waitlist
Q4 2025 · Complete
Core runtime — ARM Cortex-A, INT8 quantisation, ONNX import
Q1 2026 · Complete
ARM Cortex-M support, power-aware scheduling, TFLite import
Q2 2026 · In Progress
Offline model management, INT4 quantisation, fleet OTA prototype
Q3 2026 · Planned
RISC-V RV64 support, NPU SDK, closed alpha with waitlist partners
Q4 2026 · Planned — Developer Preview
RISC-V RV32, ESP32 port, binary distribution, developer preview SDK
Q4 2026 · Planned — Public Beta
Open SDK, docs, community programme
Early Access

Get in before we build the wrong thing.

We're opening a small, selective waitlist for teams who are working on edge ML problems right now. Waitlist members get early access to the pre-alpha SDK, a direct line to the research team, and the ability to shape what we build next.

We are not collecting emails to send a launch newsletter. We are looking for partners who have a real problem and real hardware. If that's you, tell us about it.

🔑

Pre-alpha SDK access

Hands-on access to the Kàkàņfò runtime before public release, with direct support from the engineering team.

🎯

Influence the roadmap

Your use case and hardware targets directly affect what we build next. We will reach out and ask questions.

🤝

Research collaboration

If your problem is novel enough, we may invite you into a formal research collaboration with the Cadence Labs team.

💸

Founder pricing

Waitlist members lock in founder pricing when Kàkàņfò launches commercially. That price will not be available after the public beta.

Join the waitlist

We read every submission. You will hear from us within 5 business days.