Senior DevOps Engineer
Acclaim AI · São Paulo
Job description
About the role
We are seeking a Senior DevOps Engineer to join our AI‑focused team. You will design, build, and operate a microservices platform that runs on Kubernetes across multiple clouds and on‑premise environments and supports GPU‑accelerated ML inference services.
Key responsibilities
- Deploy, operate, and evolve a microservices‑based platform on Kubernetes clusters across AWS, GCP, and on‑premise (Rancher).
- Operate GPU‑based ML inference services (Triton Inference Server, vLLM) on RunPod, Scaleway, and Nebius.
- Build and maintain Docker images for all microservices and ensure a stable service lifecycle.
- Maintain and scale development and production clusters, debug deployments, investigate incidents, and troubleshoot performance.
- Develop and evolve custom Helm charts for each service.
- Design CI/CD pipelines using GitHub and GitLab for on‑premise customer deployments.
- Ensure platform compliance with SOC 2 requirements and improve security and compliance processes.
- Manage cluster access via NetBird VPN with role‑based access control.
- Deploy and manage infrastructure using Terraform and Ansible.
- Build and maintain observability systems for metrics and logs using Grafana, Prometheus, and the ELK stack.
- Continuously optimize IaC, IAM, observability, and CI/CD practices.
Required profile
- Minimum 5 years of experience in DevOps or Site Reliability Engineering roles.
- Strong hands‑on Linux system administration skills.
- Proven experience implementing SRE practices and building observability stacks.
- Ability to thrive in high‑uncertainty environments and rapidly learn new technologies.
- A proactive, strategic mindset that favors long‑term architectural solutions.
Required skills
- Linux
- Kubernetes
- AWS
- Google Cloud Platform
- Rancher
- Docker
- Helm
- GitHub
- GitLab
- Terraform
- Ansible
- Python
- Grafana
- Prometheus
- Loki
- ELK stack (Elasticsearch, Logstash, Kibana)
- PostgreSQL
- ClickHouse
- Kafka
- Superset
- NetBird VPN
- Triton Inference Server
- vLLM
What we offer
- Work on award‑winning AI products for leading tech corporations.
- Access to cutting‑edge technologies such as speech, NLP, generative AI and on‑premise deployment.
- High engineering standards with full ownership of production systems.
- Collaborative team focused on real‑world impact.
Posted 1 day ago
Expires in 1 month