AI Ops / Guangzhou
  • Well-known global fintech company
  • 0-to-1 project build-out

About Our Client
Our client is a well-established organisation within the financial services sector.

Job Description
Responsibilities:

  • Design, build and continuously optimize cloud-native Kubernetes infrastructure and CI/CD pipelines for AI/ML workloads.

  • Implement end-to-end MLOps / AIOps automation to ensure high availability and SLA compliance across the full model lifecycle: development, training, deployment and inference.

  • Partner closely with upstream and downstream AI teams (algorithm, data, platform) to quickly locate and resolve production incidents, and drive improvements in system observability and stability.

Requirements:

  • Experience

5+ years of DevOps / SRE experience; hands-on experience delivering and operating AI/ML projects across their full lifecycle is preferred.

  • Technology Stack
  • Deep Kubernetes expertise: cluster federation, multi-tenant isolation, NetworkPolicy, CRD development and Operator authoring.
  • Proficient in CI/CD and IaC: Argo CD, Tekton, GitHub Actions; Terraform, Helm, Kustomize.
  • Hands-on with public clouds: AWS EKS, GCP GKE or Alibaba Cloud ACK; expert in Spot/Preemptible cost optimization and multi-AZ/Region disaster recovery architecture.

  • AI Collaboration

Familiar with GPU/NUMA scheduling, Docker image layer optimization, and Kubeflow / MLflow pipelines; able to work side by side with data scientists to debug model inference latency, offline batch jobs and A/B canary releases.

  • Language & Communication

Fluent English, usable as a working language; overseas study or multinational company experience preferred.

The Successful Applicant

  • Expert in end-to-end DevOps processes, with foundational AI knowledge and hands-on AI Ops experience; cloud-native delivery experience preferred.

  • Overseas education or work background; English usable as a working language.

  • Familiarity with the financial sector, especially banking, is a strong plus.

  • Candidates with other AI expertise, particularly ORM (Object-Relational Mapping) or VLM (Vision-Language Model) experience, are also warmly encouraged to apply or enquire.

What's On Offer

  • Highly competitive compensation package

  • Flexible and remote-friendly work arrangement

  • Greenfield 0-to-1 project build-out with steep technical growth

  • Global business exposure on an international stage

Contact: Ella Guo
Quote job ref: JN