TL;DR: A Pod is a cloud GPU machine managed by FlowScale AI that powers your ComfyUI workspace and any APIs/playgrounds you deploy from a project. Pods give you fine‑grained control over cost, performance, access, and scale—all without having to provision infrastructure yourself.

Why Pods Exist

Run ComfyUI

Every FlowScale AI project needs an attached pod to spin up a ComfyUI workspace in the cloud

Serve APIs & Playgrounds

Deploy workflows as API endpoints with a Playground UI

Scale on Demand

Serverless GPU containers that scale automatically based on traffic

Key Concepts

Pod Types & GPU Configurations

T4 - 16GB VRAM

Best for: Development, simple workflows, testing. Perfect for getting started and lightweight image generation tasks

A10G - 24GB VRAM

Best for: Enhanced development, moderate workloads. Cost-effective option with improved performance over T4

L4 - 24GB VRAM

Best for: Standard image generation, medium workloads. Balanced performance for most production use cases

L40S - 48GB VRAM

Best for: Advanced image/video generation, large models. High-memory GPU for complex multi-modal workflows

A100 - 40GB VRAM

Best for: Complex workflows, large model training. High-performance computing for demanding AI tasks

A100 - 80GB VRAM

Best for: Massive models, extensive batch processing. Maximum A100 memory for the largest workloads

H100 - 80GB VRAM

Best for: Largest models, real-time inference. Top-tier performance for the most demanding workloads

H200 - 141GB VRAM

Best for: Massive models, extreme workloads. Maximum memory capacity for the largest AI models

B200 - 180GB VRAM

Best for: Next-generation AI inference, cutting-edge models. Latest architecture with massive memory for breakthrough AI performance
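When choosing between the tiers above, VRAM is usually the hard constraint. As a purely illustrative sketch (not part of any FlowScale AI SDK), a helper that walks the table from cheapest to priciest and returns the first tier with enough memory might look like this:

```python
# Illustrative helper, NOT a FlowScale AI API: pick the first GPU tier
# from the table above that satisfies a VRAM requirement.
# Tiers are listed in rough ascending order of cost.
GPU_TIERS = [
    ("T4", 16),
    ("A10G", 24),
    ("L4", 24),
    ("L40S", 48),
    ("A100-40GB", 40),
    ("A100-80GB", 80),
    ("H100", 80),
    ("H200", 141),
    ("B200", 180),
]

def pick_gpu(required_vram_gb: int) -> str:
    """Return the first tier whose VRAM meets the requirement."""
    for name, vram in GPU_TIERS:
        if vram >= required_vram_gb:
            return name
    raise ValueError(f"No tier offers {required_vram_gb} GB of VRAM")

print(pick_gpu(20))  # A10G
print(pick_gpu(60))  # A100-80GB
```

In practice you would also weigh per-hour cost and architecture (e.g. H100 over A100‑80GB for real-time inference), which this sketch deliberately ignores.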

Creating a Pod

1. Navigate to Pods: go to Pods → ➕ New Pod.
2. Select GPU Type: pick a GPU type that balances memory requirements vs. price.
3. Configure Containers: set Allotted GPU Containers (start with 1 unless you expect heavy traffic).
4. Set Budget (Optional): tick Assign Credit Budget and enter a limit to control costs.
5. Create: click Create Pod. ComfyUI is now one click away!

Pod Settings Configuration

Navigate to Pods › Your Pod › Settings to access three configuration tabs:
  • Rename your pod or add a description
  • Toggle Assign Credit Budget and manage spending limits
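The credit budget caps how much a pod can spend. As a rough sketch of the arithmetic involved (assumed semantics; FlowScale AI's actual accounting may differ), you can estimate how much runtime a budget leaves:

```python
# Rough sketch of credit-budget math (illustrative assumption, not the
# actual FlowScale AI billing model): estimate remaining runtime from
# the assigned budget, credits already spent, and the GPU's burn rate.
def remaining_hours(budget_credits: float,
                    spent_credits: float,
                    credits_per_hour: float) -> float:
    """Hours of runtime left before the credit budget is exhausted."""
    remaining = max(budget_credits - spent_credits, 0.0)
    return remaining / credits_per_hour

# e.g. a 100-credit budget with 40 credits spent on a 2-credit/hour GPU
print(remaining_hours(100, 40, 2))  # 30.0
```

The per-hour rate here is a hypothetical placeholder; check your pod's actual pricing when sizing a budget.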

How Pod Scaling Works

Key Phases:
  • Cold Start – Container boots and loads the workflow graph (and any required model weights)
  • Warm Pool – Containers stay resident until the Idle Timeout expires
  • Round‑Robin – FlowScale AI distributes jobs evenly across warm containers to maximize utilization
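The round-robin phase above can be sketched in a few lines. This is a minimal illustration of the general technique, not FlowScale AI's internal scheduler: each incoming job is handed to the next warm container in rotation, so no single container accumulates a backlog.

```python
# Minimal round-robin sketch (illustrative only; FlowScale AI's
# scheduler is internal): assign jobs to warm containers in rotation.
from itertools import cycle

def distribute(jobs, containers):
    """Map each job to a container in round-robin order."""
    rotation = cycle(containers)
    return [(job, next(rotation)) for job in jobs]

assignments = distribute(["job1", "job2", "job3", "job4", "job5"],
                         ["gpu-0", "gpu-1"])
for job, container in assignments:
    print(job, "->", container)
```

With 5 jobs and 2 containers, `gpu-0` gets jobs 1, 3, 5 and `gpu-1` gets jobs 2, 4: load stays within one job of even.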

Cost Control Best Practices

Monitoring & Performance

Real-Time Metrics

Monitor CPU, GPU, memory, and storage utilization in real time

Performance Analytics

Track execution times, throughput, and latency patterns

Error Tracking

Monitor error rates, failure types, and get debugging information

Cost Analytics

Track resource costs and identify optimization opportunities

Next Steps