Core Responsibilities
1. Multi-Cloud Architecture & IaC
- Architect scalable, secure, and resilient cloud solutions across Azure and GCP using Infrastructure-as-Code (IaC) tools like Terraform or Bicep.
- Manage multi-cloud infrastructure, optimizing for cost, performance, and reliability with cloud-native services.
- Leverage Azure components (e.g., AAD, Azure Functions, Event Hubs, Application Gateway, Virtual Network, NSG, Load Balancer, Azure Virtual Machines) for microservices and solutions.
- Leverage GCP components (e.g., Cloud IAM, Cloud Functions, Pub/Sub, Cloud Load Balancing, VPC, Firewall Rules, Cloud Run, GKE, and Compute Engine) for microservices and solutions.
- Implement IaC with version control and state management; orchestrate containerized apps using Docker and Kubernetes (AKS/GKE) with service mesh and autoscaling.
- Design disaster recovery, multi-region deployments, and governance via tagging and policies (e.g., Azure Policy).
- Manage databases such as Azure SQL, PostgreSQL, MySQL, MongoDB, or time-series databases (TSDB).
2. DevOps & CI/CD Leadership
- Build and maintain CI/CD pipelines using GitHub Actions, Jenkins, or Azure DevOps with parallel execution, artifact management, and automated rollbacks.
- Automate deployment strategies (blue-green, canary, rolling updates, feature flags) to minimize downtime and risk.
- Integrate cloud services, troubleshoot issues, and automate security (vulnerability scanning, secrets management, compliance).
- Write code in at least one of Java, Python, Go, or Node.js, with experience in SRE development and OpenTelemetry.
3. RunOps & FinOps
- Ensure platform reliability and SLAs through RunOps processes and monitoring with Prometheus, Grafana, ELK Stack, Datadog, or New Relic.
- Develop observability dashboards for Azure/GCP services, set monitoring thresholds, KPIs, and SLAs, and manage large-scale observability data.
- Use tools such as Log Analytics, AppDynamics, Splunk, and ServiceNow for monitoring and service management.
- Drive FinOps for cost optimization via resource rightsizing, autoscaling, and commitment-based discounts (e.g., Reservations, Savings Plans).
- Conduct compliance audits (ISO, SOC 2) and enforce security best practices.
4. Leadership & Team Development
- Mentor and train engineers, resolve client issues, and foster team morale.
- Proactively anticipate challenges, think strategically, and embrace continuous learning for professional growth.
Required Skills & Qualifications
- 10-15+ years of experience in cloud engineering or architecture, with a focus on large-scale cloud solutions.
- Expert in Azure (Azure DevOps, AAD, Security Center) and strong GCP experience (networking, security, compute/data services).
- Proficient in IaC (Terraform, Bicep/ARM), Git, and CI/CD (GitHub Actions, Jenkins).
- Skilled in FinOps, cloud security, IAM/SSO (AAD), and compliance frameworks.
- Strong leadership, communication, and mentoring skills.
- Certifications (preferred): Azure Solutions Architect Expert (AZ-305) or GCP Professional Cloud Architect.