📞 +91 9494 4655 07 ✉ krishnac.balaga@gmail.com 🔗 linkedin.com/in/krishnabalaga 🐙 github.com/krishnac7 🌐 krishnac7.com

Krishna Balaga

Cloud & Agentic AI Systems Architect

10+ years of engineering, product & sales · 9+ at IBM · 5 progressive roles.

I architect enterprise AI systems at IBM — watsonx and Cloud Pak deployments for APAC's largest banks, telcos, GSIs, and governments, from reference architecture down to the install scripts. I turn the hardest systems into stunning POCs that close deals, then own the client deployments end-to-end — orchestrating partners, IBM Expert Labs, product, and support to land them in production.

On weekends: BLE wearables with ADPCM streaming at <300ms first-audio · multilingual voice bots with in-memory graph RAG (12–15× faster than Chroma) · a clinical research engine mapping symptom ↔ gene ↔ drug correlations from biomedical literature. Same muscle, different scale — whether the budget is $5M or zero, I ship what I design.

$5.2M+ enterprise AI deal won (2026)

4.7B tokens per 50 days on average (AI-native engineering)

4,000 users/day: public health platform shipped in 3 weeks (2021)

US Patent #19/452620 — 'Sustainable Charging Station Selection' (2026)

What I do

Career so far — every move a deliberate scope expansion

2016–2017 · Product Developer at Axelta Systems & Loop Reality, Hyderabad — stereo-vision rack management + traffic-density billboard scoring (OpenCV); ambient-monitoring hardware for refrigerated trucks (Arduino, ESP, embedded C); Android OS mods for custom hardware; VR-cycle induction-braking control + universal BLE vital-signs module integrated with Unreal Engine.

2017–2019 · 2 yrs

Solution Architect

CTO Executive Team · IBM

Architected internal Language Data crowdsourcing platform feeding IBM's NLP training
Architected Fleet Management — real-time IoT telemetry + event processing
Outstanding Technical Achievement Award, ISL Star Award, Needle Mover Award

2019–2020 · 1 yr

Developer Advocate

Data / AI Client Advocacy · IBM

Architected Data Lake for India's largest automotive client
Designed CBSE AI curriculum (national board) + 3 EdTech partners
Shipped P3DR Nepal — earthquake tolerance app piloted by Nepalese Government
Open-source Node-RED module (~120 weekly downloads)

2020–2022 · 2 yrs

Data Scientist & Cloud Solutions Engineer

Hybrid Cloud Build Team · ISA · IBM

Brought in 14 new enterprise logos, $210K USD direct revenue
Shipped Combat COVID in 3 weeks — 4,000 users/day, integrated into ICMR COVID Assist. IBM Global Research Accomplishment
Architected ML pipeline with drift detection + continuous retraining on Kubernetes (CP4D/OpenShift)
First joint CP4D + CP4S enablements in region — became internal playbook. Outstanding Innovator Award (2020)

2022–present · 3.5 yrs & counting

Advisory CSM Architect · watsonx APAC

Cloud & AI Solutions Lead · Partner Ecosystem Lead

Architected IBM's first-of-a-kind watsonx.governance deployment (2025), adopted as an IBM reference architecture across APAC. Separately, architected watsonx Orchestrate agentic platforms at GSIs: SAP PO processing (vision-LLM), SharePoint integration, ERM, vendor onboarding
Technical lead for 22+ enterprise accounts across India, Sri Lanka, and Singapore (BFSI / Telecom / Healthcare / Mfg / Govt / GSI). Also lead for partner enablement, Data & AI User Groups, CSM newsletter, and the CSM portal

Things I've shipped — systems I designed and delivered

Enterprise SaaS · platform I built

theFactory

Node.js 20 · esbuild · Redis · IBM Code Engine · watsonx

Drop a folder — it becomes a live customer demo. Plugin auto-discovery + runtime esbuild, Orchestrate-first LLM routing (timeout, caching, retryable 503s), 3-tier AES-256-GCM credentials, 29-method pluggable IStore. Hosts 12 verticalized demos (BFSI fraud, Telecom churn, Healthcare claims, Mfg predictive-maint, Govt citizen-service, etc.) used by IBM Sales / GSI Labs APAC.

The move: treated the runtime bundle as a plugin. Dropping a folder triggers esbuild → plugin manifest → auto-mount. 12 verticalized demos, one platform, zero forks.

Public health · shipped in 3 weeks

Combat COVID

Flutter · FastAPI · IBM Cloud · India's 2021 Delta wave

Oxygen / beds / blood locator. Peaked at 4,000 users/day, integrated into ICMR's COVID Assist. IBM Global Research Accomplishment.

The move: crowdsourced leads went through a dedup + staleness-decay queue before surfacing, so one fake entry couldn't poison the map; outbound rate-limited to stop hospital flooding.
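
A minimal sketch of that dedup + staleness-decay idea, not the production code. The `Lead` fields, the 6-hour half-life, and the 1.5 threshold are all illustrative assumptions: duplicates collapse onto a normalized key, and a lead surfaces only if its confirmation-weighted, time-decayed score clears the threshold, so one unconfirmed report never surfaces on its own.

```python
import math
from dataclasses import dataclass


@dataclass
class Lead:
    resource: str       # e.g. "oxygen", "bed", "blood" (hypothetical field names)
    phone: str          # contact number as submitted
    city: str
    reported_at: float  # Unix timestamp of the most recent report
    reports: int = 1    # confirmations merged into this lead


def dedup_key(lead: Lead) -> tuple:
    """Normalize the fields that identify 'the same' lead."""
    digits = "".join(ch for ch in lead.phone if ch.isdigit())[-10:]
    return (lead.resource.strip().lower(), lead.city.strip().lower(), digits)


def merge(leads: list) -> list:
    """Collapse duplicates, keeping the freshest timestamp and summing reports."""
    merged = {}
    for lead in leads:
        key = dedup_key(lead)
        if key in merged:
            kept = merged[key]
            kept.reports += lead.reports
            kept.reported_at = max(kept.reported_at, lead.reported_at)
        else:
            merged[key] = Lead(lead.resource, lead.phone, lead.city,
                               lead.reported_at, lead.reports)
    return list(merged.values())


def score(lead: Lead, now: float, half_life_s: float = 6 * 3600) -> float:
    """More confirmations raise the score; age halves it every half-life."""
    age = max(0.0, now - lead.reported_at)
    return math.log2(1 + lead.reports) * 0.5 ** (age / half_life_s)


def surface(leads: list, now: float, min_score: float = 1.5) -> list:
    """Return merged leads above the threshold, best first."""
    ranked = sorted(merge(leads), key=lambda l: score(l, now), reverse=True)
    return [l for l in ranked if score(l, now) >= min_score]


# Usage: two matching oxygen reports merge and surface; a lone bed report
# and a day-old blood lead stay hidden.
now = 1_000_000.0
visible = surface([
    Lead("Oxygen", "+91 98765 43210", "Pune", now - 60),
    Lead("oxygen", "9876543210", "pune ", now - 30),
    Lead("bed", "9000000001", "Pune", now - 10),
    Lead("blood", "9000000002", "Pune", now - 86400, reports=5),
], now)
```

The logarithm keeps a burst of duplicate submissions from outweighing freshness; a real system would also need to decide who counts as an independent reporter.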

Agentic AI · full-stack

Orchestrate — full-stack agentic

watsonx Orchestrate · MCP · FastMCP

Infra control plane (14 ops as MCP tools) · business workflow (vision-LLM w/ graceful fallback) · 6-agent supervisor · LLM routing fabric with retryable 503s.

The move: treated cloud infra itself as a first-class tool surface — 14 control-plane APIs wrapped as MCP so agents provision, not just advise.
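
The "retryable 503s" part of the routing fabric can be sketched as below. This is a generic illustration under stated assumptions, not the Orchestrate integration itself: `ProviderError`, `route`, and the stand-in providers are hypothetical names, and only status 503 is treated as transient.

```python
import time
from typing import Callable, Sequence

RETRYABLE_STATUSES = {503}  # assumption: only 503 is treated as transient


class ProviderError(Exception):
    """Hypothetical error type carrying an HTTP-style status code."""

    def __init__(self, status: int, message: str = ""):
        super().__init__(f"{status} {message}".strip())
        self.status = status


def route(prompt: str,
          providers: Sequence[tuple[str, Callable[[str], str]]],
          max_attempts: int = 3,
          base_delay_s: float = 0.2,
          sleep: Callable[[float], None] = time.sleep) -> tuple[str, str]:
    """Try providers in priority order; return (provider_name, reply)."""
    last_error = None
    for name, call in providers:
        for attempt in range(max_attempts):
            try:
                return name, call(prompt)
            except ProviderError as err:
                last_error = err
                if err.status not in RETRYABLE_STATUSES:
                    break  # not transient: fall back to the next provider
                if attempt < max_attempts - 1:
                    sleep(base_delay_s * 2 ** attempt)  # exponential backoff
    raise RuntimeError("all providers failed") from last_error


# Usage with stand-in providers (no network calls).
attempts = {"primary": 0}


def flaky_primary(prompt: str) -> str:
    attempts["primary"] += 1
    if attempts["primary"] < 3:
        raise ProviderError(503, "service unavailable")
    return f"primary says: {prompt}"


def rejecting(prompt: str) -> str:
    raise ProviderError(400, "bad request")


def backup(prompt: str) -> str:
    return f"backup says: {prompt}"


delays = []  # records backoff waits instead of actually sleeping
retried = route("hello", [("primary", flaky_primary), ("backup", backup)],
                sleep=delays.append)
fell_back = route("hello", [("strict", rejecting), ("backup", backup)],
                  sleep=delays.append)
```

Injecting `sleep` keeps the backoff testable without real waits; per-call timeouts and response caching would sit around `call(prompt)` and are left out here.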

Connected cars · India's largest automaker · live in production

A connected-cars data platform

Kafka · Spark · Hadoop · IBM Cloud Pak for Data · real-time telemetry

Data-lake and analytics architecture for India's largest automotive manufacturer — ingesting connected-car telemetry across the fleet, powering diagnostics, predictive maintenance, and fleet operations. Live in production.

The move: designed the ingest + stream-processing + analytics tiers so the OEM could run real-time fleet ops and historical deep-dives on the same lake without forking pipelines.

Consumer social · psych-informed

A spontaneous-social app

Flutter · Firebase · Riverpod · 14 systems · 2,138+ tests

Map, ephemeral events, in-event chat, tribes, nudge queue. 60+ SOPs, widget-test harness per feature, feature-first clean architecture.

The move: 3-vibe taxonomy (not 40 categories) to stop discovery fragmentation; nudge queue is capacity-aware — if you've been pinged recently, the system holds back instead of spamming.

Hardware · firmware → cloud

An AI memory wearable

nRF52840 firmware · 8kHz PDM · ADPCM/BLE 5.0 · Flutter macOS

Always-on memory trigger. VAD wakes it only when you speak, IMU double-tap gesture. Backend: WhisperX (STT + diarization) + Silero VAD + FastAPI summarizer.

The move: firmware-side VAD + 8:1 ADPCM over BLE 5.0 means the pendant transmits only speech, not continuous audio. Double-tap IMU gesture = no visible button, no social stigma.

Distributed multi-agent coding · cross-validation · open-source

Stoa

WebSocket daemon · append-only JSONL log · floor-control protocol · MCP + CLI adapters · Claude · Codex

You give the room one task; the agents self-split it, execute under shared dev tools and project rules, and cross-validate each other's patches before anything ships. Every patch, run, lock, review, and handoff lands in an append-only event log — the audit trail is the artifact. Patch/review/apply puts one agent's edit in front of another for sign-off; soft TTL locks prevent collisions; floor-control keeps the human on intent and final review.

The move: distributed intelligence with cross-validation across vendor boundaries — Claude and Codex don't call each other's APIs, they coordinate through shared room events under one rulebook. One task in; peer-reviewed multi-agent work out.
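
To make the soft TTL lock + append-only log pairing concrete, here is a small sketch. It assumes a hypothetical `Room` class and event names (`lock_acquired`, `lock_denied`); it is not Stoa's actual schema or protocol, only the shape of the idea: a lock expires on its own after its TTL, and every attempt is written as one JSONL event.

```python
import json
import tempfile
import time
from pathlib import Path


class Room:
    """Hypothetical room state: in-memory soft locks plus a JSONL event log."""

    def __init__(self, log_path, clock=time.time):
        self.log_path = Path(log_path)
        self.clock = clock
        self.locks = {}  # file path -> (agent, expires_at)

    def _append(self, kind, **fields):
        """Append one event as a single JSON line; earlier lines are never rewritten."""
        event = {"ts": self.clock(), "kind": kind, **fields}
        with self.log_path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(event) + "\n")
        return event

    def acquire(self, agent, path, ttl_s=120.0):
        """Grant the lock unless another agent holds an unexpired one."""
        now = self.clock()
        holder = self.locks.get(path)
        if holder and holder[0] != agent and holder[1] > now:
            self._append("lock_denied", agent=agent, path=path, held_by=holder[0])
            return False
        self.locks[path] = (agent, now + ttl_s)
        self._append("lock_acquired", agent=agent, path=path, ttl_s=ttl_s)
        return True


# Usage with a fake clock so lock expiry is deterministic.
now = [0.0]
room = Room(Path(tempfile.mkdtemp()) / "room.jsonl", clock=lambda: now[0])
first = room.acquire("claude", "src/app.py", ttl_s=60)  # granted
second = room.acquire("codex", "src/app.py")            # denied, lock still live
now[0] = 61.0                                           # TTL has lapsed
third = room.acquire("codex", "src/app.py")             # granted after expiry
kinds = [json.loads(line)["kind"]
         for line in room.log_path.read_text(encoding="utf-8").splitlines()]
```

Because the lock is soft, a crashed agent cannot block the room forever; the log still shows who held what and when.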

A few more worth naming — ibmcloudtoolkit: a 14-tool MCP server wrapping IBM Cloud provisioning APIs (VSIs, buckets, clusters, IAM, networks) so Orchestrate agents provision infra, not just describe it; open-source shell installers for IBM Cloud Paks (CP4D / CP4I / CP4App / CP4MCM / CP4Auto / IAF / Portworx); end-to-end MLOps on Cloud Pak for Data + OpenShift — drift detection, automated retraining, and model lifecycle across tier-1 banks (first joint CP4D + CP4S enablements in region, became internal playbook); air-gapped on-prem watsonx.ai + NVIDIA GPU-cluster deployments for APAC telcos; on-demand LLM batch-processing on NVIDIA NIM + HPC for bursty inference. 87 public GitHub repos for the rest.

What I'm deep in

Customer-Facing Engineering & GTM: Pre-sales solution architecture · executive technical advisory · discovery, evaluation & proof-of-value · reference-architecture authoring · architecture decision records & trade-off analysis · Day 0 → Day 1 → Day 2 engagements (design · deploy · operate) · co-sell with partners, GSIs & ISVs · program management across multi-stakeholder rollouts · customer adoption & expansion · user-group & newsletter programs · team leadership & uplift
Applied AI, Agentic Systems & AI Governance: Agentic system architecture — multi-agent orchestration & supervisor patterns · context engineering & eval-driven prompt design · tool-use, function-calling & parallel execution · MCP servers, sub-agents & agent skills · SDK-level integration with Claude Code & Agent SDK internals · LLM routing with fallback & retries · agentic RAG · Graph RAG · dataset curation & versioning · hardening demos into production · deployment at scale · AI governance as infrastructure — evals, guardrails, policy-as-code, model cards, audit trails · regulatory alignment (DPDP Act, GDPR, EU AI Act) · watsonx (.ai, Orchestrate, .governance)
Cloud, Platform & AI Infrastructure: Multi-cloud, hybrid, air-gapped & on-prem deployment (AWS · Azure · GCP · IBM Cloud) · Kubernetes & OpenShift platform ownership (operators, CRDs, multi-tenancy) · IaC & GitOps at platform scale (Terraform modules, ArgoCD-style delivery) · scale-to-zero serverless inference · GPU cluster architecture for training & inference · MLOps — drift detection & continuous retraining · vector databases & retrieval infra · model serving & gateways · zero-trust identity fabric (OIDC, SAML, SCIM, fine-grained RBAC) · FedRAMP & data-residency · service mesh, API gateway & BFF patterns · multi-region active-active with RPO/RTO targets · observability (OpenTelemetry, SLO/SLA) · FinOps
Distributed Systems, Data & Software Architecture: Domain-driven & clean architecture with pluggable ports & adapters · event-driven patterns (CQRS, saga) · legacy modernization (strangler-fig, anti-corruption layers) · multi-tenant SaaS · streaming data-lake (Kafka, Spark, Hadoop) · sharding, replication & change-data-capture · plugin & extensibility runtimes · resilience patterns — circuit breakers, bulkhead, backpressure, rate limiting · firmware-to-cloud pipelines (BLE, VAD, ADPCM streaming) · edge & hardware platforms (NVIDIA Jetson, IBM Z) · polyglot — Python, Node.js, TypeScript, Flutter/Dart, C/C++

Recognition — consistency over peaks

IBM 100% Club — 4 consecutive years (2022–2025)
US Patent #19/452620 — "Sustainable Charging Station Selection for Mobile Computer Systems" · Docket P202402359US01 (granted 2026)
IBM Global Research Accomplishment — Combat COVID (2021)
IBM Call for Code grant recipient — P3DR Nepal (earthquake-response app piloted by Nepalese Government)
Top CSR contributor — IBM (2019, 2021)
watsonx Challenge Winner · Q1 Client Reference Story Winner · TechXchange Market Challenge — ISA 2nd place (2025)
Responsible AI Agents BlogBuster — winner (2025)
33 BluePoints recognitions · 2,280+ points — from Directors down. Described by CSM Market Leader as "a standout expert in Data & AI."
Featured in APAC-wide "Mission Possible" comms by Hans Dekkers — GM & VP, IBM Technology Sales APAC
IBM Code Pattern contributor — reverse-image-search, IotImageAnalysis (adopted as IBM educational content), and other production GenAI samples in IBM's official open-source catalog (pre-watsonx era)
Returning intern-mentor (2025 + 2026 cohorts) · EduBot AI supervisor · cross-team architecture review lead

Speaking & thought leadership

14+ named external conference stages — IBM Think (×5), IBM TechXchange (×2), CYPHER (India's largest AI conference, 1,100+), IBM Dev Day (×9+, AI Track Workshop Lead), Sterling OMS User Group, Gartner India, IBM Data & AI User Group Mumbai (I organized it), Infosys External Speaker Series. 90+ total talks; max live audience 3,000+; Speaker NPS 9.6.

Judge — HackOn Agentic AI (1,140 devs, 220+ teams). SME — NatWest AI for IT Hackathon. Organized IBM Call for Code across Bangalore, Pune, Hyderabad, Odisha, Kerala. Authored IBM TechXchange 2024 lab — "Governing Private LLM Deployments."

Original research & thought leadership — differential-privacy pipeline for MaaS360 device data (analytics without exposing individual device identity); neural-pattern-mapping study for puzzle learning in mice — University of Toronto collaboration, applying ML to neuroscience data.

Education & certs

BTech Electronics & Communication — MVGR College of Engineering (2016) · IEEE Student Chapter President (2 years) · Solar car team (4th at Maruti Suzuki Solar Challenge finals)
BA Psychology (2021–2023) & certified counsellor, completed alongside full-time work at IBM. Consent, capacity, and nudge-design show up in my consumer work (capacity-as-feature, behavioral nudges, consent-first recording).
IBM L4 Certified across watsonx portfolio · watsonx.ai Practitioner Advanced · API-led Integration Practitioner Advanced · IBM Quantum Ambassador · Quantum Algorithms · Guardium (10+ IBM technical badges) · advancing to L5 Practitioner Mastery

Full write-ups, diagrams, and demos → krishnac7.com