Senior DevOps Engineer - Video Production
Metaphysic.ai
About Metaphysic:
Metaphysic is the industry leader in developing AI technologies and machine learning research to create photorealistic content at internet scale. We were recently named one of the TIME100 Most Influential Companies of 2023 and are focused on the ethical development of AI to support the genius of human performance. We’re run by experienced founders and backed by some of the top investors in the world. We’re only just getting started in defining the next generation of content creation. Join our fast-growing team and help bring our groundbreaking vision to life.
Job Description:
The team behind our bespoke workflow automation system is looking for a Senior DevOps Engineer who is happy to dabble in Python. The team deals with a long list of interesting challenges, such as deployments to physical hardware in the data center, execution of GPU-heavy jobs at scale, and management of horizontally scalable NewSQL databases.
We’re looking for someone who thrives in a fast-paced startup environment and who is a great communicator and a fast learner. This is a great opportunity to shape an important part of Metaphysic's tech and grow while doing so.
Your Mission:
- Dive deep into implementing new tools, optimizations, and automation that make Metaphysic stand out.
- Be the bridge between stakeholders, connecting developers, management, and artists.
- Enable optimal utilization of our shared GPUs.
- Get our databases to scale horizontally.
- Test, examine, and analyze code written by others to keep our codebase top-notch.
- Embrace transparency: document your work, peer-review code, and collaborate with the team.
Must-Haves:
- Previous experience as a DevOps engineer or a similar software engineering role.
- 3+ years of experience in Kubernetes development and management.
- Knowledge of Kubernetes concepts, like pods, services, deployments, and stateful sets.
- Experience with container runtimes like Docker and containerd.
- Familiarity with Kubernetes networking, including CNI plugins, ingress controllers, and service meshes.
- Knowledge of infrastructure as code tools, such as Helm, Kustomize, or Terraform.
- Strong knowledge of programming languages, such as Python, Ruby, or Java.
- Excellent communication and collaboration skills.
- A passion for good system design.
Bonus Points for:
- Proven experience managing databases deployed to Kubernetes clusters.
- Working knowledge of NewSQL databases such as YugabyteDB or CockroachDB.
- Previous experience with DAG-based workflow automation systems such as Prefect or Argo.
- A problem-solving attitude with a highly collaborative team spirit.
- The ability to work 100% remotely with minimal supervision.
- Exceptional organizational skills.
- The unique ability to break down complex subjects into plain language.
- Experience with enterprise storage systems like Weka.
- Experience that overlaps with MLOps.
As part of our team, you’ll enjoy:
- The hustle of a startup with the impact of a global business.
- Tremendous opportunity to join one of the best and fastest-growing AI companies in the world.
- Working with an extraordinary team of smart, creative, fun, and highly motivated people.
- A fantastic culture and team: highly supportive, collaborative, transparent, and passionate about our tech and mission.
- Flexible working hours, including remote working – this role is solely remote 🙂
Metaphysic is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.