Platform Module

Automated Provisioning Resource Allocation Workflow

Provisions real cloud infrastructure across multiple providers — from virtual servers on OpenStack, VMware, or Proxmox to Kubernetes namespaces on OpenShift. A deterministic 4-step pipeline with quota enforcement, approval gates, and automatic rollback on failure.

Provisioning Pipeline

4 steps, deterministic flow

Every provisioning request progresses through exactly 4 steps, regardless of cloud provider. Each step tracks its own status. If any step fails, the platform automatically rolls back all allocated resources.

step 1

Allocate Resources

Create compute, storage, and network resources

Create VM with boot volume, attach security group

step 2

Configure Infrastructure

Set up networking, firewall rules, endpoints

Allocate floating IP, configure SSH/HTTP/HTTPS rules

step 3

Activate Service

Start the service and verify endpoints

Boot server, wait for ACTIVE status

step 4

Verify Health

Run health checks, confirm readiness

SSH connectivity test, HTTP probe

Each step: Pending → Running → Completed
Any failure → automatic rollback

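The flow above can be sketched in a few lines of Python. This is an illustration only, not the platform's code: `run_pipeline`, the step callables, the `rollback` callback, and the extra `FAILED` status are all hypothetical names introduced for the example.

```python
from enum import Enum
from typing import Callable, Dict, List, Tuple


class StepStatus(Enum):
    PENDING = "Pending"
    RUNNING = "Running"
    COMPLETED = "Completed"
    FAILED = "Failed"  # assumption: the page only names the first three


# Step names are taken from the pipeline description.
STEP_NAMES = ["Allocate Resources", "Configure Infrastructure",
              "Activate Service", "Verify Health"]


def run_pipeline(actions: List[Callable[[], None]],
                 rollback: Callable[[], None]) -> Tuple[Dict[str, StepStatus], bool]:
    """Run the four steps in order; call rollback() once on the first failure."""
    if len(actions) != len(STEP_NAMES):
        raise ValueError("expected exactly 4 step actions")
    status = {name: StepStatus.PENDING for name in STEP_NAMES}
    for name, action in zip(STEP_NAMES, actions):
        status[name] = StepStatus.RUNNING
        try:
            action()
        except Exception:
            status[name] = StepStatus.FAILED
            rollback()  # release everything allocated so far
            return status, False
        status[name] = StepStatus.COMPLETED
    return status, True
```

Steps after the failing one stay in Pending, which keeps the per-step status useful for diagnosing where a request stopped.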
Request Lifecycle

11 states, full traceability

Every provisioning request is tracked through its complete lifecycle — from queued through the 4-step pipeline to completion, suspension, or rollback. Each transition publishes a Kafka event.

Queued

Request received, awaiting processing

Pending Approval

Approval rule matched — waiting for admin

Allocating

Step 1 — creating resources

Configuring

Step 2 — networking & firewall

Activating

Step 3 — starting service

Verifying

Step 4 — health checks

Completed

Infrastructure ready, credentials delivered

Suspended

Service stopped (non-payment or request)

Failed

Step failed after retries exhausted

Rolling Back

Resources being cleaned up

Rolled Back

All resources released, quota freed
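One way to picture the lifecycle is as a small state machine. The 11 state names below come from the list above; the transition table is an assumption (the page lists states, not edges), and `publish` is a hypothetical callback standing in for the Kafka producer.

```python
from enum import Enum
from typing import Callable, Dict, Set


class State(Enum):
    QUEUED = "Queued"
    PENDING_APPROVAL = "Pending Approval"
    ALLOCATING = "Allocating"
    CONFIGURING = "Configuring"
    ACTIVATING = "Activating"
    VERIFYING = "Verifying"
    COMPLETED = "Completed"
    SUSPENDED = "Suspended"
    FAILED = "Failed"
    ROLLING_BACK = "Rolling Back"
    ROLLED_BACK = "Rolled Back"


S = State
# Assumed edges, for illustration only.
ALLOWED: Dict[State, Set[State]] = {
    S.QUEUED: {S.PENDING_APPROVAL, S.ALLOCATING},
    S.PENDING_APPROVAL: {S.ALLOCATING, S.FAILED},
    S.ALLOCATING: {S.CONFIGURING, S.FAILED},
    S.CONFIGURING: {S.ACTIVATING, S.FAILED},
    S.ACTIVATING: {S.VERIFYING, S.FAILED},
    S.VERIFYING: {S.COMPLETED, S.FAILED},
    S.COMPLETED: {S.SUSPENDED},
    S.SUSPENDED: {S.COMPLETED},
    S.FAILED: {S.ROLLING_BACK},
    S.ROLLING_BACK: {S.ROLLED_BACK},
    S.ROLLED_BACK: set(),
}


def transition(current: State, target: State,
               publish: Callable[[dict], None]) -> State:
    """Validate the move, emit one event for it, and return the new state."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    publish({"from": current.value, "to": target.value})
    return target
```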

Cloud Providers

Multi-provider, unified interface

OpenStack (production · direct REST API · live)
Nova · Compute · VM lifecycle: create, start, stop, delete, resize
Neutron · Networking · Security groups, floating IPs, private networks
Cinder · Block Storage · Volumes, snapshots, boot-from-volume
Keystone · Identity · Token management, service catalog discovery

VPS Provisioning Flow

1. Authenticate to Keystone: get token + service endpoints
2. Create security group with default rules (SSH, HTTP, HTTPS)
3. Resolve compute flavor and OS image from Nova catalog
4. Generate root password + cloud-init user data
5. Create server with NVMe boot volume block device mapping
6. Allocate and assign floating IP for external access
7. Tag with metadata (request ID, order ID, tenant ID)
8. Poll until server reaches ACTIVE state
9. Return IP, hostname, endpoints, credentials
VPS · Private Cloud · Managed Database · Object Storage
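Steps 4, 5, and 7 of the VPS flow roughly correspond to building one Nova "create server" request body. The sketch below only assembles that JSON payload and sends nothing; the helper names, the cloud-config contents, and all IDs are hypothetical, and selecting an NVMe-backed volume type is left out because it depends on the deployment.

```python
import base64
import secrets
import string
from typing import Dict


def generate_password(length: int = 24) -> str:
    """Random alphanumeric password from a cryptographically secure source."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def build_cloud_init(password: str) -> str:
    # Assumed minimal cloud-config; real user data is likely richer.
    return (
        "#cloud-config\n"
        "chpasswd:\n"
        "  expire: false\n"
        "  list: |\n"
        f"    root:{password}\n"
    )


def build_server_request(name: str, flavor_id: str, image_id: str,
                         volume_gb: int, security_group: str, network_id: str,
                         metadata: Dict[str, str], user_data: str) -> dict:
    """Body for POST /servers that boots from a new volume created from an image."""
    return {
        "server": {
            "name": name,
            "flavorRef": flavor_id,
            "networks": [{"uuid": network_id}],
            "security_groups": [{"name": security_group}],
            "metadata": metadata,  # e.g. request ID, order ID, tenant ID
            # Nova expects user data to be base64-encoded.
            "user_data": base64.b64encode(user_data.encode()).decode(),
            "block_device_mapping_v2": [{
                "boot_index": 0,
                "uuid": image_id,
                "source_type": "image",
                "destination_type": "volume",
                "volume_size": volume_gb,
                "delete_on_termination": True,
            }],
        }
    }
```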
OpenShift / Kubernetes (production · K8s REST API · live)

Provisions tenant-isolated Kubernetes environments. Each request creates 5 resources for complete namespace isolation with enforced quotas and network policies.

Namespace · Tenant isolation container
ResourceQuota · CPU, memory, storage limits enforced by K8s
LimitRange · Default pod resource requests and limits
RoleBinding · Tenant gets admin access within their namespace
NetworkPolicy · Ingress/egress isolation between tenants
endpoints: API server URL · OpenShift Console link
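As a rough illustration, the five resources could be expressed as the manifests below. The API versions and kinds are standard Kubernetes; the naming scheme, label, default limits, and the choice to bind the built-in `admin` ClusterRole are assumptions made for the example.

```python
from typing import List


def tenant_manifests(tenant: str, cpu: str, memory: str, storage: str,
                     admin_user: str) -> List[dict]:
    """Build the five manifests for one isolated tenant namespace."""
    ns = f"tenant-{tenant}"  # hypothetical naming scheme

    def meta(name: str) -> dict:
        return {"name": name, "namespace": ns}

    return [
        {"apiVersion": "v1", "kind": "Namespace",
         "metadata": {"name": ns, "labels": {"tenant": tenant}}},
        {"apiVersion": "v1", "kind": "ResourceQuota", "metadata": meta("tenant-quota"),
         "spec": {"hard": {"requests.cpu": cpu, "requests.memory": memory,
                           "requests.storage": storage}}},
        {"apiVersion": "v1", "kind": "LimitRange", "metadata": meta("tenant-limits"),
         "spec": {"limits": [{"type": "Container",
                              "defaultRequest": {"cpu": "100m", "memory": "128Mi"},
                              "default": {"cpu": "500m", "memory": "512Mi"}}]}},
        {"apiVersion": "rbac.authorization.k8s.io/v1", "kind": "RoleBinding",
         "metadata": meta("tenant-admin"),
         "roleRef": {"apiGroup": "rbac.authorization.k8s.io",
                     "kind": "ClusterRole", "name": "admin"},
         "subjects": [{"apiGroup": "rbac.authorization.k8s.io",
                       "kind": "User", "name": admin_user}]},
        # Only pods in the same namespace may talk to each other. A real policy
        # would usually also allow egress to cluster DNS.
        {"apiVersion": "networking.k8s.io/v1", "kind": "NetworkPolicy",
         "metadata": meta("same-namespace-only"),
         "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"],
                  "ingress": [{"from": [{"podSelector": {}}]}],
                  "egress": [{"to": [{"podSelector": {}}]}]}},
    ]
```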
Simulation (dev)

Fully functional mock provider for end-to-end testing. Configurable success/failure rates, realistic delays, and fake resources with provider-specific endpoints.

Manual (vendor)

For vendor-managed services. Request pauses at “Awaiting Manual Input” — admin enters access details via API. Customer receives delivery email as normal.

Worker Engine

Distributed, resilient

The provisioning worker is the engine that processes requests through the pipeline. Horizontally scalable via Redis distributed locks, with per-tenant fairness, automatic retry, and circuit breaker patterns on all provider calls.

Horizontally Scalable

Multiple worker instances run safely via Redis distributed locks

Per-Tenant Fairness

Max 5 concurrent requests per tenant prevents monopolization

Automatic Retry

Exponential backoff (5s → 10s → 20s → 40s → 5min cap), max 3 retries

Stuck Detection

Requests stuck for >30 minutes auto-fail

Graceful Shutdown

Releases all locks, waits up to 30s for in-flight work

Circuit Breaker

Bulkhead pattern on all provider calls prevents cascade failures
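The per-tenant fairness rule can be illustrated with a simple counter. This sketch is in-process only and uses hypothetical names; with several worker instances the counts would have to live in a shared store such as Redis.

```python
from collections import Counter


class TenantLimiter:
    """Caps the number of in-flight requests per tenant (5 per the page)."""

    def __init__(self, max_concurrent: int = 5) -> None:
        self.max_concurrent = max_concurrent
        self._active: Counter = Counter()

    def try_acquire(self, tenant_id: str) -> bool:
        """Reserve a slot; return False if the tenant is already at its cap."""
        if self._active[tenant_id] >= self.max_concurrent:
            return False
        self._active[tenant_id] += 1
        return True

    def release(self, tenant_id: str) -> None:
        """Free a slot when a request finishes, fails, or is rolled back."""
        if self._active[tenant_id] > 0:
            self._active[tenant_id] -= 1
```

A worker would call `try_acquire` before picking up a request and skip to another tenant's work when it returns False.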

automatic rollback
1. Worker detects non-retryable failure or max retries exhausted
2. Request transitions to Rolling Back
3. Provider deprovision(): VMs deleted, namespaces removed, IPs released
4. Cluster capacity decremented
5. Tenant quota decremented
6. provisioning.rolled-back event published to Order Service
7. Order saga triggers refund via Billing Service

Exponential Backoff

5s → 10s → 20s → 40s → 5m (cap)
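The delay schedule is consistent with doubling from a 5-second base under a 5-minute ceiling. A minimal sketch, assuming that formula; the function name and the continued doubling between 40s and the cap are illustrative assumptions.

```python
def retry_delay(attempt: int, base: float = 5.0, cap: float = 300.0) -> float:
    """Delay in seconds before retry number `attempt` (0-based)."""
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    # Clamp the exponent so very large attempt numbers cannot overflow a float.
    return min(base * (2 ** min(attempt, 32)), cap)
```

Production retry loops often add random jitter to each delay so that many failed requests do not retry at the same instant; that is omitted here.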

Governance

Quotas & approval gates

tenant quota system
Max Instances · Total number of active services
Max CPU · Total CPU cores across all services
Max Memory · Total RAM in GB
Max Storage · Total storage in GB

Enforcement Flow

Order Event → Check Quota
OK → Provision
Exceeded → Fail
no quota entry = unlimited · violations return detailed messages
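The enforcement rules above (four limits, no entry means unlimited, detailed violation messages) can be sketched as follows. The `Usage` type, field names, and message wording are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Usage:
    instances: int = 0
    cpu: int = 0
    memory_gb: int = 0
    storage_gb: int = 0


_FIELDS = (("instances", "Max Instances"), ("cpu", "Max CPU"),
           ("memory_gb", "Max Memory"), ("storage_gb", "Max Storage"))


def check_quota(quota: Optional[Usage], current: Usage, requested: Usage) -> List[str]:
    """Return one message per violated limit; an empty list means OK."""
    if quota is None:  # no quota entry = unlimited
        return []
    violations = []
    for field, label in _FIELDS:
        total = getattr(current, field) + getattr(requested, field)
        limit = getattr(quota, field)
        if total > limit:
            violations.append(f"{label} exceeded: {total} would be in use, limit is {limit}")
    return violations
```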
approval workflow

Define rules that gate provisioning on manual approval. Rules can be global or tenant-scoped with priority ordering. Conditions include minimum CPU, memory, storage, resource types, and regions.

1. Provisioning request created
2. Approval service evaluates all active rules
3. If rule matches → enters Pending Approval
4. Admin receives notification in dashboard
5. Approve → provisioning resumes automatically
6. Reject → request fails with rejection reason

Example Rule

“High-Resource VPS Approval”
minCpu: 16 · minMemory: 32GB · minStorage: 500GB
types: [VPS] · regions: [eu-west-1] · priority: 10
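Rule evaluation might look like the sketch below. Several details are assumptions, since the page does not spell them out: a rule matches only when every condition it sets is met, an unset condition matches anything, and a higher `priority` number is evaluated first. All type and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence


@dataclass(frozen=True)
class ApprovalRule:
    name: str
    priority: int
    min_cpu: Optional[int] = None
    min_memory_gb: Optional[int] = None
    min_storage_gb: Optional[int] = None
    types: Sequence[str] = ()
    regions: Sequence[str] = ()
    tenant_id: Optional[str] = None  # None = global rule


@dataclass(frozen=True)
class Request:
    tenant_id: str
    type: str
    region: str
    cpu: int
    memory_gb: int
    storage_gb: int


def matches(rule: ApprovalRule, req: Request) -> bool:
    """True when the request satisfies every condition the rule sets."""
    if rule.tenant_id is not None and rule.tenant_id != req.tenant_id:
        return False
    if rule.types and req.type not in rule.types:
        return False
    if rule.regions and req.region not in rule.regions:
        return False
    thresholds = ((rule.min_cpu, req.cpu), (rule.min_memory_gb, req.memory_gb),
                  (rule.min_storage_gb, req.storage_gb))
    return all(minimum is None or value >= minimum for minimum, value in thresholds)


def first_matching_rule(rules: Iterable[ApprovalRule],
                        req: Request) -> Optional[ApprovalRule]:
    """Highest-priority matching rule, or None if no approval is needed."""
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        if matches(rule, req):
            return rule
    return None
```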

Cluster Management

Intelligent cluster selection

Infrastructure clusters are registered and managed centrally. The worker automatically selects the best cluster by matching provider type, region, capacity, and health status.

Type Matching

Match provider type to cluster type — OpenStack, OpenShift, simulation, manual

Region Matching

Match requested region — eu-west-1, us-east-1, geographic proximity

Health Verification

Verify cluster is online and provisioning-enabled, skip degraded clusters

Capacity Check

Check available CPU, RAM, storage, and VM slots before allocating

Default Preference

Prefer default cluster in region, fall back to first with sufficient capacity

Connectivity Testing

Admins test cluster connectivity on demand — success/failure, latency in ms
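The selection criteria above can be combined into a single filter-then-prefer function. This is a sketch under assumptions: the `Cluster` fields are hypothetical, region matching is exact (no geographic proximity), and the default cluster is preferred only when it also passes the health and capacity checks.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cluster:
    name: str
    provider: str
    region: str
    online: bool
    provisioning_enabled: bool
    is_default: bool
    free_cpu: int
    free_ram_gb: int
    free_storage_gb: int
    free_vm_slots: int


def select_cluster(clusters: List[Cluster], provider: str, region: str,
                   cpu: int, ram_gb: int, storage_gb: int) -> Optional[Cluster]:
    """Pick the default eligible cluster, else the first eligible one, else None."""
    eligible = [
        c for c in clusters
        if c.provider == provider and c.region == region
        and c.online and c.provisioning_enabled
        and c.free_cpu >= cpu and c.free_ram_gb >= ram_gb
        and c.free_storage_gb >= storage_gb and c.free_vm_slots >= 1
    ]
    for cluster in eligible:
        if cluster.is_default:
            return cluster
    return eligible[0] if eligible else None
```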

Service Lifecycle

Post-provisioning management

After provisioning completes, customers manage their services through suspend, reactivate, and deprovision actions. Real-time metrics are available for services provisioned on OpenStack, VMware, or Proxmox.

Suspend · Stop the service; resources preserved, billing paused
Reactivate · Restart a suspended service, billing resumes
Deprovision · Permanently delete: VM destroyed, namespace removed, quota freed
Retry · Re-attempt a failed provisioning request
Manual Complete · Admin fills in access details for vendor-managed provider
webhook notifications

Tenants register webhooks for real-time provisioning events. HMAC-signed with SHA256 for verification. Automatic retry with exponential backoff and auto-deactivation after 5 consecutive failures.

HMAC-signed: HMAC-SHA256(secret, payload)
Auto retry — 3 attempts (5s, 30s, 5min)
Auto-deactivation after 5 failures
Full delivery history with status
Test delivery endpoint
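On the receiving side, a tenant can verify a delivery by recomputing the HMAC over the raw request body. A minimal sketch using Python's standard library; the hex encoding of the signature and how it is transported (for example in a header) are assumptions, since the page does not specify them.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, payload: bytes) -> str:
    """HMAC-SHA256 of the raw payload bytes, hex-encoded."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, payload: bytes, received_hex: str) -> bool:
    """Constant-time comparison to avoid leaking signature bytes via timing."""
    return hmac.compare_digest(sign_payload(secret, payload), received_hex)
```

Verification should run against the exact bytes received, before any JSON parsing or re-serialization, because even a whitespace change alters the digest.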
request dependencies

Provisioning requests can declare prerequisites via dependsOn fields. The worker uses topological sorting (Kahn's algorithm) to process requests in the correct order. Circular dependencies are detected and rejected.

dependsOn: [“vps_request_id”] → topological sort → deferred until completed
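For reference, Kahn's algorithm over a dependsOn map can be sketched as below. The function name and the input shape (request ID mapped to a list of prerequisite IDs) are assumptions made for the example.

```python
from collections import deque
from typing import Dict, List


def processing_order(depends_on: Dict[str, List[str]]) -> List[str]:
    """Order request IDs so that every prerequisite comes before its dependents."""
    nodes = set(depends_on) | {d for deps in depends_on.values() for d in deps}
    indegree = {n: 0 for n in nodes}
    dependents: Dict[str, List[str]] = {n: [] for n in nodes}
    for request, deps in depends_on.items():
        for dep in deps:
            indegree[request] += 1
            dependents[dep].append(request)

    # Start with requests that have no unmet prerequisites.
    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    order: List[str] = []
    while queue:
        current = queue.popleft()
        order.append(current)
        for nxt in dependents[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    # Any node left unprocessed is part of a cycle.
    if len(order) != len(nodes):
        raise ValueError("circular dependency detected")
    return order
```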
Event-Driven Integration

Platform in sync, real-time

Provisioning events flow through Kafka to keep the Order Service, Notification Service, Billing, and analytics synchronized across the entire platform.

Events per consumer: Notification (NTFY) 7 · Billing (BILL) 4 · Audit (AUDT) 3 · Order (ORDR) 0 · Portal (PRTL) 4

provisioning.started · Request created · Notification, Audit
provisioning.step-completed · Pipeline step done · Notification, Portal
resource.allocated · Server/namespace created · Audit, Billing
service.activated · Endpoints ready · Audit, Portal
provisioning.completed · All 4 steps done · Notification, Billing
provisioning.failed · Step failed · Notification, Billing
service.suspended · Service stopped · Notification, Portal
service.reactivated · Service restarted · Notification, Portal
provisioning.rolled-back · Resources cleaned up · Notification, Billing

Kafka streaming · exactly-once delivery · saga coordination · webhook relay

Automate your infrastructure

From request to running infrastructure — multi-provider, quota-enforced, fully observable, automatically recoverable.

FAQ

Common Questions

The platform automatically rolls back all allocated resources. VMs are deleted, namespaces removed, floating IPs released, and cluster capacity decremented. A provisioning.rolled-back event triggers the Order Service saga to process a refund via the Billing Service. No orphaned resources, no mystery charges.

Workers use Redis distributed locks for safe horizontal scaling. Per-tenant fairness limits each tenant to 5 concurrent requests, preventing monopolization. Exponential backoff handles transient failures, and stuck detection auto-fails requests that hang for more than 30 minutes.

Yes. Define approval rules with conditions like minimum CPU, memory, storage, resource types, and regions. Rules can be global or tenant-scoped with priority ordering. When a request matches, it enters Pending Approval and waits for admin action. Approved requests resume automatically.

Requests can declare prerequisites via dependsOn fields. The worker uses topological sorting (Kahn's algorithm) to process requests in the correct order. Circular dependencies are detected and rejected. Dependent requests are deferred until all prerequisites reach Completed status.