Senior LLM Engineer at Maitai

Maitai
Redwood City, California
Full-time
$118K–$160K (estimated)

Required Skills

LLM fine-tuning
Model distillation
Open-source LLM ecosystems
Dataset optimization
Python and Go development

Job Description

Maitai manages the LLM stack for enterprise companies, enabling the fastest and most reliable inference. The future of enterprise AI revolves around mosaics of small, domain-specific models powering responsive, capable agents, and Maitai is well positioned to capture the market. If you're looking to get in early with a company redefining how large companies build with AI, let's talk.

Join Maitai to reshape how enterprise companies build with open-source LLMs. You’ll be at the forefront, driving cutting-edge innovations in model fine-tuning, distillation, and automation to continuously enhance LLM performance. You’ll collaborate directly with founders, engineers, and enterprise customers, building the core management layer that defines enterprise AI infrastructure. We're scaling rapidly and looking for engineers who deeply understand open-source LLM ecosystems and can confidently automate and optimize model improvements at scale.
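For context on the distillation work mentioned above: distillation typically trains a smaller student model to match the softened output distribution of a larger teacher. A minimal, framework-free sketch of the classic softened-target loss (illustrative only, not Maitai's pipeline):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Scaled by T^2 (the usual convention) so gradients keep roughly
    the same magnitude as a hard-label loss at any temperature.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl
```

In practice this loss is usually mixed with a standard cross-entropy term on ground-truth labels; the sketch shows only the soft-target half.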

You will lead the fine-tuning, distillation, and deployment of open-source LLMs tailored for enterprise customers. Your role involves:

  • Preparing, optimizing, and managing large-scale datasets for model training and continuous improvement.
  • Developing and automating sophisticated fine-tuning pipelines to enhance model accuracy, reliability, and inference speed.
  • Distilling models to smaller, faster, and more efficient variants without compromising performance.
  • Implementing new platform features in Python and Go to facilitate seamless dataset curation, correction, and augmentation.
  • Collaborating closely with our infrastructure team and partner model hosts to ensure scalable, reliable model deployments.
What we're looking for:

  • You've spent quite a bit of time in Unsloth notebooks and know your way around fine-tuning open-source models such as Llama, Gemma, and Mistral.
  • You recognize dataset balancing as an art as much as a science.
  • You subscribe to r/LocalLLaMA.
  • You've got a local model humming on your MacBook Pro to power Cursor.
  • You enjoy attending local AI meetups to see what others are working on in the space.
What we offer:

  • Massive technical challenges – Pioneering automated, continuous improvements for enterprise-grade open-source LLMs.
  • Ownership and impact – Drive architecture decisions, shape core product offerings, and influence company strategy from day one.
  • Elite, collaborative team – Join a fast-moving environment working alongside top-tier engineers.
  • Equity upside – Early-stage, meaningful equity ownership.
  • Zero red tape – Ship fast, iterate faster, and enjoy working without heavy processes or Jira epics.
  • Amazing customers – Not to brag, but our customers are genuinely amazing to work with.
  • In-person role in downtown Redwood City, CA – Caltrain or parking pass, lunches, and Starbucks/Philz coffee provided.
  • Engineers own their product decisions – Engage directly with customers, set your own specs, and deliver meaningful features.
  • Merit-based growth – Prove your capability, and we'll quickly expand your responsibilities.
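On the dataset-balancing point above: one trivial baseline is stratified downsampling to the rarest class. This plain-Python sketch is illustrative only; real fine-tuning pipelines often prefer loss weighting or targeted augmentation over discarding data:

```python
import random
from collections import defaultdict

def balance_by_label(examples, key=lambda ex: ex["label"], seed=0):
    """Downsample each class to the size of the rarest class.

    `examples` is any list of records; `key` extracts the class label.
    Returns a shuffled, class-balanced subset.
    """
    rng = random.Random(seed)  # seeded for reproducible sampling
    buckets = defaultdict(list)
    for ex in examples:
        buckets[key(ex)].append(ex)
    n = min(len(b) for b in buckets.values())
    balanced = []
    for b in buckets.values():
        balanced.extend(rng.sample(b, n))
    rng.shuffle(balanced)
    return balanced
```

The "art" part is deciding when balancing like this helps: aggressively downsampling a dominant intent can starve the model of exactly the examples it sees most in production.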

Maitai ensures LLMs never fail by optimizing reliability, speed, and resilience. Acting as an intelligent proxy, we apply real-time autocorrections, route requests intelligently, and fine-tune models for maximum performance. We're experiencing explosive growth, are well-capitalized, and are seizing a massive opportunity to redefine how enterprises build with AI. Our platform delivers AI models that significantly outperform closed-source alternatives in speed and accuracy, supported by robust online guardrails. Leading YC startups and public enterprises trust Maitai to manage their LLM infrastructure.
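The intelligent routing described above can be pictured as ordered failover across model providers. A hypothetical sketch, not Maitai's actual proxy; the provider names and call signature are invented for illustration:

```python
def route_with_failover(prompt, providers, max_attempts=None):
    """Try providers in priority order, falling back on failure.

    `providers` is a list of (name, call) pairs, where `call` takes the
    prompt and either returns a completion or raises an exception.
    Returns (provider_name, completion) from the first success.
    """
    errors = {}
    for name, call in providers[:max_attempts]:
        try:
            return name, call(prompt)
        except Exception as exc:  # production code would catch narrower errors
            errors[name] = exc
    raise RuntimeError(f"all providers failed: {list(errors)}")
```

A real multi-region proxy would layer health checks, latency budgets, and circuit breakers on top of this ordering rather than retrying blindly.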

As LLMs are core to our customers' products, resiliency and uptime are our top priorities. Since we act as a proxy, our uptime must exceed that of the providers themselves. We’re multi-cloud, multi-region, and built for seamless failover. Our infrastructure runs on Kubernetes, managed with Terraform, and deployed across AWS and GCP. We use GitHub Actions for CI/CD, with Datadog for monitoring, tracing, and performance insights.

Infra stack: Kubernetes, Terraform, AWS, GCP, GitHub Actions, PostgreSQL, Redis, Datadog.

Our backend is a set of Python microservices using Quart for web services, alongside Python-based fine-tuning jobs optimized for speed, cost, and accuracy. We use PostgreSQL for conventional data persistence and vector storage. Go is being introduced where performance gains are critical.

Tech stack: Python (Quart), Go (in transition), PostgreSQL.

Frontend stack: React (TypeScript).

Interview Process

  • Quick Chat (15-minute video call) – Let's discuss your experience, interests, and ambitions.
  • Tech Discussion – Get on a call and talk tech: what's going on in the industry, what you've worked with recently, the latest model you've fine-tuned, the last meetup you were at.
  • Hands-On Technical – Join us at our office to work through a problem with our team.
  • In-person Meetup – Coffee or lunch with our team. Assess fit from both sides and move quickly to a decision.

Job Details

Employment Type

Full-time

Salary Range

$118K–$160K (estimated)

Location

Redwood City, California