Decentralized GPU. On-Demand AI.
A decentralized GPU marketplace where developers run AI inference on demand and GPU owners monetize idle hardware through a simple API.
How it works
From request to response in four steps.
Submit
Send a model inference request through the SDK or REST API with your prompt and model choice.
Route
The broker evaluates all available hosts and routes the job to the cheapest GPU with sufficient VRAM (see the sketch after these steps).
Execute
The host agent loads the model, runs inference on its GPU, and begins generating output.
Stream
Results stream back to you in real time via WebSocket. Pay only for what you use.
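To make the routing step concrete, here is a minimal sketch of the selection rule: the cheapest host with enough free VRAM wins the job. The Host class and select_host function below are illustrative only, assuming each host advertises its free VRAM and an hourly price; this is not the broker's actual code.

```python
from dataclasses import dataclass

@dataclass
class Host:
    host_id: str
    free_vram_gb: float    # VRAM currently available on the host's GPU
    price_per_hour: float  # host's advertised rate in USD/hr

def select_host(hosts, required_vram_gb):
    """Pick the cheapest host that can fit the model (illustrative sketch)."""
    eligible = [h for h in hosts if h.free_vram_gb >= required_vram_gb]
    if not eligible:
        return None  # no host can run this model right now
    return min(eligible, key=lambda h: h.price_per_hour)

# Example: a 4 GB model (e.g. GPT-2 Medium) goes to the cheapest host with >= 4 GB free
hosts = [Host("a", 2.0, 0.05), Host("b", 8.0, 0.12), Host("c", 6.0, 0.10)]
print(select_host(hosts, 4).host_id)  # -> "c"
```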
Run AI models with five lines of code
Use the Python SDK to run inference on any supported model. Results stream back in real time. No GPU required on your end.
from sdk.compute import ComputeClient
client = ComputeClient(url, api_key="YOUR_KEY")  # url is your broker endpoint
job = client.run_model("gpt2", prompt="Hello")   # submit an inference job
for event in client.stream_job(job["job_id"]):   # results stream back over WebSocket
    print(event["text"], end="")

Earn by sharing your idle GPU
Install the host agent, register your GPU, and start earning. Auto-detection, automatic bidding, and transparent payouts.
12% platform fee
Automatic GPU detection
$0.12/hr average host earnings
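For a rough sense of the numbers above: with a 12% platform fee, a host keeps 88% of what a job bills. The snippet below is just that arithmetic, assuming the quoted $0.12/hr is the rate billed before the fee is taken out; it is not the platform's billing code.

```python
PLATFORM_FEE = 0.12  # 12% platform fee

def net_payout(gross_per_hour, hours):
    """Host earnings after the platform fee (illustrative arithmetic only)."""
    return gross_per_hour * hours * (1 - PLATFORM_FEE)

# Assuming $0.12/hr billed before the fee, 100 hours of sharing nets about $10.56
print(round(net_payout(0.12, 100), 2))  # -> 10.56
```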
Supported models
Run popular open-source models out of the box.
| Model | Parameters | VRAM Required | Status |
|---|---|---|---|
| GPT-2 | 124M | 2 GB | Available |
| GPT-2 Medium | 355M | 4 GB | Available |
| GPT-2 Large | 774M | 6 GB | Available |
| DistilGPT2 | 82M | 1.5 GB | Available |
| GPT-2 XL | 1.5B | 8 GB | Coming Soon |
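The VRAM column above also works as a quick capability check: a GPU can serve a model only if its free VRAM covers the requirement. A minimal sketch using the figures from the table; the model identifiers in MODEL_VRAM_GB are illustrative, not official SDK names.

```python
# VRAM requirements (GB) from the table above
MODEL_VRAM_GB = {
    "distilgpt2": 1.5,
    "gpt2": 2,
    "gpt2-medium": 4,
    "gpt2-large": 6,
    "gpt2-xl": 8,  # coming soon
}

def models_that_fit(free_vram_gb):
    """Models a host with the given free VRAM could serve (sketch only)."""
    return [m for m, need in MODEL_VRAM_GB.items() if need <= free_vram_gb]

print(models_that_fit(4))  # -> ['distilgpt2', 'gpt2', 'gpt2-medium']
```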
Architecture
A simple three-layer system: developers using the SDK or REST API, a broker that routes each job, and host agents running on GPU providers' hardware.
Start building on Infrintia
Run AI inference on decentralized GPUs or earn by sharing your idle hardware.
