Lead DevOps Engineer
Noxtua
Seniority: Senior
Model: Remote
Sector:
Salary: €98,000 – €118,000
Contract: Full-Time
Lead DevOps Engineer taking ownership of Noxtua's infrastructure — a managed Kubernetes environment across multiple stages on sovereign EU cloud infrastructure (OTC), alongside self-hosted GPU servers. You'll set technical direction for a small DevOps team of 4–5 engineers while staying hands-on with the systems themselves.
What you'll do
- Own and optimize Noxtua's infrastructure across OTC and self-hosted GPU servers — ensuring efficient architecture, reliable operation, and cost control.
- Lead and grow a team of 4–5 DevOps engineers, setting technical direction, supporting their development, and fostering a strong ownership mindset.
- Operate the self-managed GPU server fleet — provisioning, driver installation, hardening, and connectivity via Ansible — and manage provider SLAs to keep heavy AI workloads running reliably.
- Build and maintain infrastructure automation using Infrastructure as Code (Terraform & Ansible).
- Run the container platform on Kubernetes, support teams with Docker, and keep services stable, accessible, and secure.
- Set up and maintain monitoring and alerting (e.g., Prometheus, Grafana) to ensure system reliability and performance.
- Develop and maintain CI/CD pipelines and collaborate with development and AI teams to automate deployments.
What you'll need
- Experience leading or mentoring a team, setting technical direction, and balancing hands-on operations with people responsibility.
- Experience managing a fleet of servers and an understanding of the methodology behind it, including OS-level operations on real hardware and working with provider SLAs.
- Strong proficiency in Linux and Bash, plus a scripting language such as Python.
- Proven track record designing, operating, and cost-managing cloud-based architectures — ideally OTC, or transferable experience from AWS, Azure, or Google Cloud — with solid networking fundamentals.
- Strong focus on automating provisioning and configuration with Terraform and Ansible.
- Expertise in containerizing applications with Docker and running them at scale on Kubernetes.
- Ability to set up and maintain monitoring and alerting tools (e.g., Prometheus, Grafana).
Nice to have
- Experience with GPU servers.
What they offer
- 100% remote work possible with German residence; other countries upon request.
- Flexible working hours.
- 26 days of vacation, plus December 24th and 31st off, and 1 additional day per year of employment (up to 30 days).
- Equipment: Laptop (Lenovo or Mac), plus €1,000 net home office setup budget.
- Urban Sports Club membership discounts, depending on location.

