
Senior LLM Engineer

Maitai
Redwood City, CA, US
Full-time
$145K–$177K (estimated)

Required Skills

LLM
RAG
Python
R
TypeScript
Go
Rust
Scala
React
SQL
PostgreSQL
Redis
AWS
GCP
Kubernetes
GitHub Actions
Terraform
Git
Jira

Job Description

Maitai manages the LLM stack for enterprise companies, enabling the fastest and most reliable inference. The future of enterprise AI revolves around mosaics of small, domain-specific models powering responsive agents, and Maitai is well positioned to capture the market. If you're looking to get in early with a company redefining how large companies build with AI, let's talk.

High-level

Join Maitai to reshape how enterprise companies build with open-source LLMs. You'll be at the forefront, driving cutting-edge innovations in model fine-tuning, distillation, and automation to continuously enhance LLM performance. You'll collaborate directly with founders, engineers, and enterprise customers, building the core management layer that defines enterprise AI infrastructure. We're scaling rapidly and looking for engineers who deeply understand open-source LLM ecosystems and can confidently automate and optimize model improvements at scale.

Low-level

You will lead the fine-tuning, distillation, and deployment of open-source LLMs tailored for enterprise customers. Your role involves:

- Preparing, optimizing, and managing large-scale datasets for model training and continuous improvement.
- Developing and automating sophisticated fine-tuning pipelines to enhance model accuracy, reliability, and inference speed.
- Distilling models into smaller, faster, and more efficient variants without compromising performance.
- Implementing new platform features in Python and Go to facilitate seamless dataset curation, correction, and augmentation.
- Collaborating closely with our infrastructure team and partner model hosts to ensure scalable, reliable model deployments.
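The dataset preparation work above often starts with something like label balancing before fine-tuning. A minimal stdlib-only sketch (the `balance_dataset` helper and the `"label"` field are illustrative assumptions, not Maitai's actual pipeline):

```python
# Hypothetical sketch of dataset balancing for fine-tuning: downsample
# over-represented labels so no single group dominates the training mix.
import random
from collections import defaultdict

def balance_dataset(examples, label_key="label", seed=0):
    """Downsample each label group to the size of the smallest group."""
    groups = defaultdict(list)
    for ex in examples:
        groups[ex[label_key]].append(ex)
    cap = min(len(g) for g in groups.values())  # smallest group's size
    rng = random.Random(seed)
    balanced = []
    for group in groups.values():
        balanced.extend(rng.sample(group, cap))  # downsample to cap
    rng.shuffle(balanced)
    return balanced

# Imbalanced toy dataset: 8 "refund" examples vs. 2 "billing" examples.
data = (
    [{"label": "refund", "text": f"refund {i}"} for i in range(8)]
    + [{"label": "billing", "text": f"billing {i}"} for i in range(2)]
)
balanced = balance_dataset(data)
# Each label now contributes exactly 2 examples.
```

Real pipelines would weigh quality and coverage rather than downsampling blindly, which is part of why the posting calls balancing "an art as much as a science".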
Who You Are

- You've spent quite a bit of time in Unsloth notebooks and know your way around fine-tuning open-source models like Llama, Gemma, Mistral, etc.
- You recognize dataset balancing as an art as much as a science.
- You subscribe to r/LocalLLaMA.
- You've got a local model humming on your MacBook Pro to power Cursor.
- You enjoy attending local AI meetups to see what others are working on in the space.

Why Join Us?

- Massive technical challenges – Pioneering automated, continuous improvements for enterprise-grade open-source LLMs.
- Ownership and impact – Drive architecture decisions, shape core product offerings, and influence company strategy from day one.
- Elite, collaborative team – Join a fast-moving environment working alongside top-tier engineers.
- Equity upside – Early-stage, meaningful equity ownership.
- Zero red tape – Ship fast, iterate faster, and enjoy working without heavy processes or Jira epics.
- Amazing customers – Not to brag, but our customers are genuinely amazing to work with.

More You Need To Know

- In-person role in downtown Redwood City, CA. Caltrain or parking pass, lunches, and Starbucks/Philz coffee provided.
- Engineers own their product decisions. Engage directly with customers, set your own specs, and deliver meaningful features.
- Merit-based opportunity growth. Prove your capability, and we'll quickly expand your responsibilities.

About Us

Maitai ensures LLMs never fail by optimizing reliability, speed, and resilience. Acting as an intelligent proxy, we apply real-time autocorrections, route requests intelligently, and fine-tune models for maximum performance. We're experiencing explosive growth, are well-capitalized, and seizing a massive opportunity to redefine how enterprises build with AI. Our platform delivers AI models that significantly outperform closed-source alternatives in speed and accuracy, supported by robust online guardrails. Leading YC startups and public enterprises trust Maitai to manage their LLM infrastructure.

Infra

As LLMs are core to our customers' products, resiliency and uptime are our top priorities. Since we act as a proxy, our uptime must exceed that of the providers themselves. We're multi-cloud, multi-region, and built for seamless failover.
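The failover behavior described here can be sketched in a few lines: try each configured provider in priority order and fall back on failure. This is a stdlib-only illustration under assumed names (`ProviderError`, the provider labels, and the call signature are all hypothetical, not Maitai's actual routing layer):

```python
# Hypothetical sketch of multi-provider failover routing: call providers
# in priority order and return the first successful response.
class ProviderError(Exception):
    pass

def route_with_failover(providers, request):
    """providers: list of (name, callable). Returns (name, response)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(request)
        except ProviderError as exc:
            errors.append((name, exc))  # record the failure, fall through
    raise ProviderError(f"all providers failed: {errors}")

# Simulated providers: the first one is down, the second is healthy.
def flaky(request):
    raise ProviderError("upstream timeout")

def healthy(request):
    return {"completion": f"ok: {request}"}

name, resp = route_with_failover(
    [("aws-us-west", flaky), ("gcp-us-central", healthy)], "hello"
)
# The request falls over to the second provider.
```

A production proxy would add timeouts, health checks, and retry budgets on top of this ordering logic.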
Our infrastructure runs on Kubernetes, managed with Terraform, and deployed across AWS and GCP. We use GitHub Actions for CI/CD, with Datadog for monitoring, tracing, and performance insights.

Infra stack: Kubernetes, Terraform, AWS, GCP, GitHub Actions, PostgreSQL, Redis, Datadog.

Backend

Our backend is a set of microservices running Python with Quart for web services, plus Python-based fine-tuning jobs optimized for speed, cost, and accuracy. We use PostgreSQL for conventional data persistence and vector storage. Go is being introduced where performance gains are critical.

Tech stack: Python (Quart), Go (in transition), PostgreSQL.

Frontend

Tech stack: React (TypeScript).

Interview Process

1. Quick Chat (15-minute video call) – Let's discuss your experience, interests, and ambitions.
2. Tech Discussion – Get on a call and talk tech: what's going on in the industry, what you've worked with recently, the latest model you've fine-tuned, the last meetup you attended, etc.
3. Hands-On Technical – Join us at our office to work through a problem with our team.
4. In-person Meetup – Coffee or lunch with our team. Assess fit from both sides and move quickly to a decision.

Job Details

Employment Type

Full-time

Salary Range

$145K–$177K (estimated)

Location

Redwood City, CA, US