Platform · Architecture

FlexSys platform architecture.

How grid-aware compute actually works. The operator, the control plane, the gateway, the grid integration — and where the inference router fits.

Three pillars

Compute · Economics · Grid

One platform with three things to do: schedule the work, follow the price, respect the network.

01 · Compute

Kubernetes operator + CRDs

A FlexSysJob CRD captures every workload type — Train, ImageGen, VideoGen, InferenceService — with first-class ResourceRequirements, QoS tier and priority. Real scheduling, not best-effort.

  • FlexSysJob CRDs · Train / ImageGen / VideoGen / InferenceService
  • ResourceRequirements + QoS + priority
  • Open-source operator runs in our cluster or yours
02 · Economics

Real-time grid pricing

AEMO + grid spot pricing is baked directly into the scheduler. Workloads land in the cheapest region right now. Every routing decision writes a row in the audit trail you can inspect per request.

  • Live AEMO + spot pricing in the planner
  • Per-request routing decision audit trail
  • Same price for the customer regardless of where it lands
03 · Grid

CURTAIL + demand response

When the network asks for less load, FlexSys gracefully drains GPU work and bids the freed capacity back into the spot market. Built on Flipped Energy's wholesale + retail experience and the Transgrid DMIA project.

  • Graceful drain on curtailment signals
  • Bids freed capacity into wholesale spot
  • Transgrid DMIA · CSIRO · UQ M&V partners
The full picture

Five layers, all replaceable

From the customer SDK at the top, down to the GPU at the bottom. Each layer has a clear responsibility, a public CRD or HTTP surface, and a repo.

01Customer SDK02Gateway03ControlPlane04Operator05Cluster
01 · OpenAI client

Customer SDK

Your app keeps using the OpenAI SDK. We're a drop-in URL swap. Same chat/completions, completions and embeddings surface — same tokens, same usage object.

built onOpenAI Python · Node · any HTTP clientrepoOpenAI Python SDK
02 · Auth · Route · Meter

Gateway

YARP-based reverse proxy. Authenticates the API key, asks the router for a routing decision, forwards to the chosen backend, captures usage, stamps decision/cost/latency headers, and writes the audit trail.

built on.NET 9 · YARP · Asp.Net minimal APIsrepogitea.flipped.energy / flexsys-gateway
03 · Plan · Pricing · Audit

ControlPlane

The brain. Owns tenants, plans, model SKUs, pricing per 1M tokens, backend availability, and routing decisions. Exposes an admin API to activate/deactivate backends and inspect routing audit rows.

built on.NET 9 · EF Core · Postgres · open-telemetryrepogitea.flipped.energy / flexsys-control-plane
04 · Kubernetes CRDs

Operator

Watches FlexSysJob custom resources. When a model is activated, it materialises Deployment + Service in the cluster, advertises the URL back to the ControlPlane, and the router targets it on the next decision.

built onkubebuilder · controller-runtime · client-gorepogitea.flipped.energy / flexsys-operator
05 · GPU workloads

Cluster

The actual GPUs serving tokens. Self-hosted in our DC today, or in your cluster (BYOC). The agent reports health and price signal back to the operator. We never see your data plane.

built onkind / k3s / EKS · NVIDIA device plugin · Ollama / vLLMrepogitea.flipped.energy / flexsys-agent
Workload types

What you can submit

Four kinds of FlexSysJob, all sharing the same scheduler and audit trail. We mark each one honestly — what's exercised end-to-end, what's smoke-tested, and what's still scaffolded.

Train
shipped (smoke-tested in kind)

Fine-tuning, LoRA / QLoRA, distillation runs

resources
nvidia.com/gpu: 1–8 · 64–256Gi mem
qos
Best-effort or Burstable · low priority
ImageGen
shipped (smoke-tested in kind)

SDXL / Flux batch + interactive generation

resources
nvidia.com/gpu: 1 · 24–48Gi mem
qos
Burstable · medium priority
VideoGen
scaffolded, not exercised yet

Text-to-video / image-to-video pipelines

resources
nvidia.com/gpu: 1–4 · 64Gi+ mem
qos
Burstable · medium priority
InferenceService
shipped, live in this demo

Chat / completions / embeddings (this site's playground)

resources
nvidia.com/gpu: 1 · 16–48Gi mem
qos
Guaranteed · high priority
Bring your own cluster · Shape 3

Run the data plane in your DC

Same operator, your hardware, our control plane.

Control planeSaaS · oursAgentoutbound only · TLSYour clusterdata planeOperatorkube CRsAudit + pricingmetadata onlyGPUsprompts stay hereBYOC · SHAPE 3
  • Your data plane stays in your DC
    Tokens, prompts and responses never leave your network. We see usage counts and routing metadata, nothing more.
  • Same operator runs at Flipped + at you
    One open-source kube operator. Identical FlexSysJob CRs. What we run in our cluster is what runs in yours.
  • Outbound-only agent — no firewall changes
    The agent dials out to our control plane over TLS. No inbound NAT, no exposed ingress, no shared VPC.
Talk to us about your DC
opens email to sales@flipped.energy

The control plane is SaaS-hosted by Flipped Energy — that's where tenants, pricing and routing decisions live. The data plane (the operator and the GPUs) stays in your DC. The agent is outbound-only over TLS, so there are no firewall changes, no inbound NAT, and no shared VPC. We see usage counts and routing metadata; prompts and responses never leave your network.

Grid integration · CURTAIL + DMIA

This is real, not theoretical

A regulator-overseen demonstration project on the live Transgrid network.

FlexSys is being demonstrated on the live Transgrid network under the DMIA project (Demand Management Innovation Allowance), with independent panel endorsement secured March 2026. DMIA is a regulator-overseen demonstration project — this is real infrastructure on a real grid, not vapourware.

The CURTAIL leg of the platform pairs the inference scheduler with Flipped Energy’s existing wholesale and retail energy market integrations: when the network operator asks for less load, FlexSys drains GPU work without dropping in-flight requests, and bids the freed capacity back into the wholesale spot market.

StatusDMIA panel endorsement · March 2026
  • AEMO Pre-Dispatch integration
    The scheduler reads AEMO Pre-Dispatch and 5-minute spot signals directly. Workload placement reflects the price the grid is paying right now, not yesterday's tariff curve.
  • High-resolution DC metering
    Targeting IEC 61000-4-30 Class A measurement at the DC point of common coupling. Real instrumentation, regulator-grade.
  • Independent M&V partners
    CSIRO and the University of Queensland are independent measurement and verification partners on the FlexSys CURTAIL trial.
What we are not

Infrastructure, not a middleman

Three things FlexSys deliberately is not. Helps procurement triangulate.

Not a reseller

We don't margin up someone else's GPUs and pretend that's the product. We run the operator, the scheduler and the gateway ourselves.

Not just a router

The inference router on the homepage is one surface. The real product is the platform underneath: the operator, the CRDs, and the grid integration.

Not a marketplace

There is no bidding stack, no third-party seller fan-out, no opaque price discovery. One scheduler, one audit trail, one bill.

Bring it to us

Bring your AI workloads. Or your cluster. Or both.

Get a demo key and try the inference router in 30 seconds, or talk to engineering about running the operator in your own DC.

Get a demo API key Talk to engineering
engineering@flipped.energy · live operator demos by request