Key Responsibilities
Cloud Operations & Reliability
- Operate and support multi-tenant SaaS workloads across multiple AWS accounts.
- Ensure high availability and resilience through proactive monitoring, troubleshooting, and incident response.
- Manage lifecycle tasks (patching, scaling, upgrades, backups, and DR exercises).
CI/CD & Automation
- Own and enhance CI/CD pipelines (ArgoCD, GitHub Actions) for reliable, repeatable deployments.
- Automate operational workflows (infrastructure, application releases, reporting).
- Support engineering teams with smooth delivery pipelines and self-service tooling.
FinOps & Cost Optimization
- Monitor, analyze, and optimize AWS usage across accounts.
- Drive cost savings through compute optimization (Graviton/AMD migrations, RDS tuning), storage tiering, and right-sizing.
- Partner with finance and engineering stakeholders to align cost efficiency with performance.
Database & Data Operations
- Operate, scale, and optimize MongoDB and RDS clusters in production.
- Monitor database performance, indexing, replication, and backup/restore processes.
- Collaborate with data engineering to ensure stable data pipelines and integrations.
Observability & Incident Response
- Implement and manage monitoring and logging with Splunk, Grafana, OpenTelemetry, and AWS CloudWatch.
- Define SLI/SLO metrics and drive continuous improvements in availability and performance.
- Lead incident response (P0/P1/P2), root cause analysis, and postmortems.
Security & Compliance
- Apply least-privilege IAM practices, patching, and hardening.
- Ensure compliance with healthcare and industry standards (GxP, GDPR, HIPAA, NIST).
- Support audit readiness (SOC 2, ISO
Required Experience & Qualifications
- Experience: 7+ years in DevOps or cloud infrastructure roles, with significant experience in SaaS and multi-tenant platforms. Proven track record of mentoring team members on cloud infrastructure projects.
- Cloud Expertise: Expert knowledge of AWS services, including VPC, IAM, EC2, S3, RDS, Lambda, EKS, AWS WAF, AWS EventBridge, and AWS CloudTrail.
- Containerization & Orchestration: Deep proficiency in Docker, Kubernetes, Helm, and associated ecosystem tools.
- CI/CD Proficiency: Expertise in CI/CD tools such as ArgoCD and GitHub Actions.
- Infrastructure as Code (IaC): Advanced experience with AWS CDK (TypeScript preferred) and CloudFormation.
- Networking: Strong understanding of AWS networking services such as VPCs, Transit Gateway, ALB, and Security Groups.
- Security: In-depth knowledge of IAM, AWS KMS, encryption standards, AWS WAF, and security compliance frameworks including NIST.
- Monitoring & Alerting: Extensive experience with OpenTelemetry, Splunk, Grafana, AWS CloudWatch, and AWS CloudTrail for monitoring and incident response.
- Data & ETL Pipelines: Familiarity with AWS Glue and Managed Kafka for real-time and batch data processing.
- Programming & Automation: Strong scripting and automation skills using TypeScript and Bash.
- Multi-Account AWS Management: Experience managing multiple AWS accounts with AWS Control Tower.
- Communication & Collaboration: Exceptional verbal and written communication skills, with the ability to explain complex technical concepts to diverse stakeholders.
Desired Experience & Qualifications
- Advanced expertise in AWS CDK, including building complex, reusable constructs and pipelines.
- Familiarity with Projen for automating CDK project configuration and management.
- Hands-on experience with Helm charts and Kubernetes manifests.
- Experience with monitoring and logging tools such as Splunk, Grafana, and AWS CloudWatch.
- Exposure to multi-tenant SaaS platforms and best practices.
- Experience working with AI tools and frameworks.
Personal Attributes
- Mentor & Leader: Enjoys mentoring team members and fostering a collaborative, innovation-driven team culture.
- Organized & Adaptable: Able to manage multiple priorities and thrive in a fast-paced environment.
- Innovative: Passionate about leveraging technology to solve complex problems and drive efficiency.
- Customer-Focused: Dedicated to building infrastructure that delivers measurable business and customer value.
Work Arrangement
This is an in-office role based in Shanghai, China, requiring at least three days per week on-site. Remote or travel flexibility is not available.
Join Evinova and redefine healthcare with us. Apply now to be part of a team that's transforming life sciences with technology, data, and innovation.