Automated Provisioning Resource Allocation Workflow
Provisions real cloud infrastructure across multiple providers — from virtual servers on OpenStack, VMware, or Proxmox to Kubernetes namespaces on OpenShift. A deterministic 4-step pipeline with quota enforcement, approval gates, and automatic rollback on failure.
4 steps, deterministic flow
Every provisioning request progresses through exactly 4 steps, regardless of cloud provider. Each step tracks its own status. If any step fails, the platform automatically rolls back all allocated resources.
Allocate Resources
Create compute, storage, and network resources
Create VM with boot volume, attach security group
Configure Infrastructure
Set up networking, firewall rules, endpoints
Allocate floating IP, configure SSH/HTTP/HTTPS rules
Activate Service
Start the service and verify endpoints
Boot server, wait for ACTIVE status
Verify Health
Run health checks, confirm readiness
SSH connectivity test, HTTP probe
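The four-step flow with rollback on failure can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the platform's implementation: `run_pipeline`, the step callables, and the `rollback` hook are hypothetical names, and the retries that precede a failure are left out for brevity.

```python
from typing import Callable, List, Tuple

# Hypothetical step type: a step name plus a callable that appends whatever it
# created to the shared `allocated` list and raises an exception on failure.
Step = Tuple[str, Callable[[List[str]], None]]


def run_pipeline(steps: List[Step], rollback: Callable[[str], None]) -> str:
    """Run steps in order; on any failure, release resources newest-first."""
    allocated: List[str] = []
    for name, action in steps:
        try:
            action(allocated)
        except Exception:
            # Undo in reverse order so dependents go before what they rely on.
            for resource in reversed(allocated):
                rollback(resource)
            return f"rolled back after {name}"
    return "completed"
```

Passing the four steps (allocate, configure, activate, verify) as callables keeps the flow identical across providers; only the callables differ.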
11 states, full traceability
Every provisioning request is tracked through its complete lifecycle — from queued through the 4-step pipeline to completion, suspension, or rollback. Each transition publishes a Kafka event.
Queued
Request received, awaiting processing
Pending Approval
Approval rule matched — waiting for admin
Allocating
Step 1 — creating resources
Configuring
Step 2 — networking & firewall
Activating
Step 3 — starting service
Verifying
Step 4 — health checks
Completed
Infrastructure ready, credentials delivered
Suspended
Service stopped (non-payment or request)
Failed
Step failed after retries exhausted
Rolling Back
Resources being cleaned up
Rolled Back
All resources released, quota freed
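One way to model the 11 states is an enum with an explicit transition table. The table below is an assumption inferred from the state descriptions above (for example, that a suspended service returns to Completed on reactivation); the platform's actual rules may differ. Publishing an event on each transition is omitted here.

```python
from enum import Enum


class State(Enum):
    QUEUED = "queued"
    PENDING_APPROVAL = "pending_approval"
    ALLOCATING = "allocating"
    CONFIGURING = "configuring"
    ACTIVATING = "activating"
    VERIFYING = "verifying"
    COMPLETED = "completed"
    SUSPENDED = "suspended"
    FAILED = "failed"
    ROLLING_BACK = "rolling_back"
    ROLLED_BACK = "rolled_back"


S = State
# Assumed transition table (hypothetical, not taken from the platform).
ALLOWED = {
    S.QUEUED: {S.PENDING_APPROVAL, S.ALLOCATING},
    S.PENDING_APPROVAL: {S.ALLOCATING},
    S.ALLOCATING: {S.CONFIGURING, S.FAILED},
    S.CONFIGURING: {S.ACTIVATING, S.FAILED},
    S.ACTIVATING: {S.VERIFYING, S.FAILED},
    S.VERIFYING: {S.COMPLETED, S.FAILED},
    S.COMPLETED: {S.SUSPENDED},
    S.SUSPENDED: {S.COMPLETED},
    S.FAILED: {S.ROLLING_BACK},
    S.ROLLING_BACK: {S.ROLLED_BACK},
    S.ROLLED_BACK: set(),
}


def transition(current: State, target: State) -> State:
    """Return the target state, or raise if the move is not in the table."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```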
Multi-provider, unified interface
VPS Provisioning Flow
Provisions tenant-isolated Kubernetes environments. Each request creates 5 resources for complete namespace isolation with enforced quotas and network policies.
Namespace
Tenant isolation container
ResourceQuota
CPU, memory, storage limits enforced by K8s
LimitRange
Default pod resource requests and limits
RoleBinding
Tenant gets admin access within their namespace
NetworkPolicy
Ingress/egress isolation between tenants
Fully functional mock provider for end-to-end testing. Configurable success/failure rates, realistic delays, and fake resources with provider-specific endpoints.
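A mock provider with a configurable success rate and delay might look like the sketch below. The class name, fields, and the fake endpoint format are all hypothetical; the sketch only illustrates the idea of a provider that creates nothing real.

```python
import random
import time
from dataclasses import dataclass, field


@dataclass
class SimulatedProvider:
    """Hypothetical mock provider that fails a configurable share of calls."""

    success_rate: float = 0.9   # probability that a call succeeds
    delay_seconds: float = 0.0  # artificial latency added to each call
    rng: random.Random = field(default_factory=random.Random)

    def allocate(self, request_id: str) -> dict:
        time.sleep(self.delay_seconds)
        # rng.random() is in [0, 1), so 1.0 always succeeds and 0.0 always fails.
        if self.rng.random() >= self.success_rate:
            raise RuntimeError(f"simulated failure for {request_id}")
        # Fake resource with a made-up endpoint; no infrastructure is touched.
        return {
            "id": f"sim-{request_id}",
            "endpoint": f"sim://{request_id}.example.invalid",
        }
```

Injecting a seeded `random.Random` makes test runs repeatable.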
For vendor-managed services, the request pauses at “Awaiting Manual Input” until an admin enters the access details via the API. The customer receives the delivery email as normal.
Distributed, resilient
The provisioning worker is the engine that processes requests through the pipeline. Horizontally scalable via Redis distributed locks, with per-tenant fairness, automatic retry, and circuit breaker patterns on all provider calls.
Horizontally Scalable
Multiple worker instances run safely via Redis distributed locks
Per-Tenant Fairness
Max 5 concurrent requests per tenant prevents monopolization
Automatic Retry
Exponential backoff (5s → 10s → 20s → 40s → 5min cap), max 3 retries
Stuck Detection
Requests stuck for >30 minutes auto-fail
Graceful Shutdown
Releases all locks, waits up to 30s for in-flight work
Circuit Breaker
Bulkhead pattern on all provider calls prevents cascade failures
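The per-tenant cap of 5 concurrent requests can be illustrated with a small counter. This is an in-process stand-in only: with multiple worker instances the count would have to live in shared state (the text mentions Redis), and `TenantLimiter` is a hypothetical name.

```python
from collections import defaultdict

MAX_CONCURRENT_PER_TENANT = 5  # limit stated in the text


class TenantLimiter:
    """In-memory sketch of the per-tenant concurrency cap."""

    def __init__(self, limit: int = MAX_CONCURRENT_PER_TENANT) -> None:
        self.limit = limit
        self.active = defaultdict(int)  # tenant id -> requests in flight

    def try_acquire(self, tenant_id: str) -> bool:
        """Claim a slot for the tenant; return False if the cap is reached."""
        if self.active[tenant_id] >= self.limit:
            return False
        self.active[tenant_id] += 1
        return True

    def release(self, tenant_id: str) -> None:
        """Free a slot once a request finishes, fails, or is rolled back."""
        if self.active[tenant_id] > 0:
            self.active[tenant_id] -= 1
```

A worker would skip (not fail) a request whose tenant has no free slot and pick it up on a later pass.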
Exponential Backoff
5s → 10s → 20s → 40s → 5m cap
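The delay schedule can be computed with a one-line formula, assuming the delay doubles from a 5-second base and is capped at 5 minutes. The function name and the 1-based attempt numbering are assumptions; the retry limit is enforced elsewhere and is not modelled here.

```python
BASE_DELAY_S = 5    # first retry delay, in seconds
MAX_DELAY_S = 300   # 5-minute cap


def backoff_delay(attempt: int) -> int:
    """Delay in seconds before retry number `attempt` (1-based)."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    # Double the base delay for each earlier attempt, then apply the cap.
    return min(BASE_DELAY_S * 2 ** (attempt - 1), MAX_DELAY_S)
```

The first four attempts give 5, 10, 20, and 40 seconds; under the doubling assumption the cap is reached at the seventh attempt.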
Quotas & approval gates
Enforcement Flow
Define rules that gate provisioning on manual approval. Rules can be global or tenant-scoped with priority ordering. Conditions include minimum CPU, memory, storage, resource types, and regions.
Example Rule
“High-Resource VPS Approval”
minCpu: 16 · minMemory: 32GB · minStorage: 500GB
types: [VPS] · regions: [eu-west-1] · priority: 10
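Rule evaluation against a request might be sketched as below. Two things are assumptions here, since the text does not specify them: that all conditions of a rule must hold together, and that a higher priority number is evaluated first. Field and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ApprovalRule:
    name: str
    min_cpu: int
    min_memory_gb: int
    min_storage_gb: int
    types: List[str]
    regions: List[str]
    priority: int


@dataclass
class ProvisioningRequest:
    cpu: int
    memory_gb: int
    storage_gb: int
    resource_type: str
    region: str


def rule_matches(rule: ApprovalRule, req: ProvisioningRequest) -> bool:
    # Assumption: every condition must hold (logical AND).
    return (
        req.cpu >= rule.min_cpu
        and req.memory_gb >= rule.min_memory_gb
        and req.storage_gb >= rule.min_storage_gb
        and req.resource_type in rule.types
        and req.region in rule.regions
    )


def first_matching_rule(
    rules: List[ApprovalRule], req: ProvisioningRequest
) -> Optional[ApprovalRule]:
    # Assumption: rules with a higher priority number are checked first.
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        if rule_matches(rule, req):
            return rule
    return None
```

A request for which `first_matching_rule` returns a rule would enter Pending Approval; one that returns `None` proceeds directly.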
Intelligent cluster selection
Infrastructure clusters are registered and managed centrally. The worker automatically selects the best cluster by matching provider type, region, capacity, and health status.
Type Matching
Match provider type to cluster type — OpenStack, OpenShift, simulation, manual
Region Matching
Match requested region — eu-west-1, us-east-1, geographic proximity
Health Verification
Verify cluster is online and provisioning-enabled, skip degraded clusters
Capacity Check
Check available CPU, RAM, storage, and VM slots before allocating
Default Preference
Prefer default cluster in region, fall back to first with sufficient capacity
Connectivity Testing
Admins test cluster connectivity on demand — success/failure, latency in ms
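The selection criteria above can be combined into a single filter-then-prefer function. This is a sketch with hypothetical field names; it treats region matching as exact equality and leaves out geographic proximity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cluster:
    name: str
    provider_type: str
    region: str
    online: bool
    provisioning_enabled: bool
    free_cpu: int
    free_ram_gb: int
    free_storage_gb: int
    free_vm_slots: int
    is_default: bool = False


def select_cluster(
    clusters: List[Cluster],
    provider_type: str,
    region: str,
    cpu: int,
    ram_gb: int,
    storage_gb: int,
) -> Optional[Cluster]:
    """Return the default eligible cluster, else the first eligible one."""
    eligible = [
        c for c in clusters
        if c.provider_type == provider_type
        and c.region == region
        and c.online
        and c.provisioning_enabled
        and c.free_cpu >= cpu
        and c.free_ram_gb >= ram_gb
        and c.free_storage_gb >= storage_gb
        and c.free_vm_slots >= 1
    ]
    for c in eligible:
        if c.is_default:
            return c
    return eligible[0] if eligible else None
```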
Post-provisioning management
After provisioning completes, customers manage their services through suspend, reactivate, and deprovision actions. Real-time metrics are available for services provisioned on OpenStack, VMware, or Proxmox.
Tenants register webhooks for real-time provisioning events. Payloads are HMAC-signed with SHA-256 so receivers can verify them. Failed deliveries are retried automatically with exponential backoff, and a webhook is deactivated after 5 consecutive failures.
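On the receiving side, verifying an HMAC-SHA256 signature takes only the Python standard library. The sketch assumes the signature is sent as a hex digest of the raw request body; the header that carries it and the exact encoding are not specified in the text, so treat both as assumptions.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received: str) -> bool:
    """Compare in constant time to avoid leaking how many characters match."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected.encode(), received.encode())
```

Verification should run against the exact bytes received, before any JSON parsing or re-serialization, since a reformatted body produces a different digest.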
Provisioning requests can declare prerequisites via dependsOn fields. The worker uses topological sorting (Kahn's algorithm) to process requests in the correct order. Circular dependencies are detected and rejected.
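Kahn's algorithm for ordering requests by their prerequisites can be sketched as follows. The input shape (a mapping from request id to the ids it depends on) and the function name are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List


def provisioning_order(depends_on: Dict[str, List[str]]) -> List[str]:
    """Return request ids so that every prerequisite precedes its dependents."""
    indegree = {req: 0 for req in depends_on}
    dependents: Dict[str, List[str]] = {req: [] for req in depends_on}
    for req, prereqs in depends_on.items():
        for prereq in prereqs:
            if prereq not in depends_on:
                raise ValueError(f"unknown prerequisite: {prereq}")
            indegree[req] += 1
            dependents[prereq].append(req)

    # Start with requests that have no unmet prerequisites.
    ready = deque(req for req, degree in indegree.items() if degree == 0)
    order: List[str] = []
    while ready:
        req = ready.popleft()
        order.append(req)
        for dep in dependents[req]:
            indegree[dep] -= 1
            if indegree[dep] == 0:
                ready.append(dep)

    # Any request left unprocessed sits on a cycle.
    if len(order) != len(depends_on):
        raise ValueError("circular dependency detected")
    return order
```

For example, `provisioning_order({"db": [], "app": ["db"], "lb": ["app"]})` returns `["db", "app", "lb"]`, while a pair of requests that depend on each other raises `ValueError`.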
Platform in sync, real-time
Provisioning events flow through Kafka to keep the Order Service, Notification Service, Billing, and analytics synchronized across the entire platform.
provisioning.started
Request created
provisioning.step-completed
Pipeline step done
resource.allocated
Server/namespace created
service.activated
Endpoints ready
provisioning.completed
All 4 steps done
provisioning.failed
Step failed
service.suspended
Service stopped
service.reactivated
Service restarted
provisioning.rolled-back
Resources cleaned up
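A producer could validate event names against this list before publishing. In the sketch below only the nine event names come from the list above; the envelope fields (`type`, `requestId`, `timestamp`) are hypothetical, and the Kafka client call itself is left out.

```python
import json
from datetime import datetime, timezone
from typing import Optional

EVENT_TYPES = frozenset({
    "provisioning.started",
    "provisioning.step-completed",
    "resource.allocated",
    "service.activated",
    "provisioning.completed",
    "provisioning.failed",
    "service.suspended",
    "service.reactivated",
    "provisioning.rolled-back",
})


def build_event(
    event_type: str, request_id: str, now: Optional[datetime] = None
) -> str:
    """Serialize an event to JSON; the envelope shape is an assumption."""
    if event_type not in EVENT_TYPES:
        raise ValueError(f"unknown event type: {event_type}")
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    return json.dumps(
        {"type": event_type, "requestId": request_id, "timestamp": timestamp}
    )
```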
Automate your infrastructure
From request to running infrastructure — multi-provider, quota-enforced, fully observable, automatically recoverable.
Common Questions
The platform automatically rolls back all allocated resources. VMs are deleted, namespaces removed, floating IPs released, and the cluster's used-capacity count decremented. A provisioning.rolled-back event triggers the Order Service saga to process a refund via the Billing Service. No orphaned resources, no mystery charges.
Workers use Redis distributed locks for safe horizontal scaling. Per-tenant fairness limits each tenant to 5 concurrent requests, preventing monopolization. Exponential backoff handles transient failures, and stuck detection auto-fails requests that hang for more than 30 minutes.
Yes. Define approval rules with conditions like minimum CPU, memory, storage, resource types, and regions. Rules can be global or tenant-scoped with priority ordering. When a request matches, it enters Pending Approval and waits for admin action. Approved requests resume automatically.
Requests can declare prerequisites via dependsOn fields. The worker uses topological sorting (Kahn's algorithm) to process requests in the correct order. Circular dependencies are detected and rejected. Dependent requests are deferred until all prerequisites reach Completed status.