Senior AI/Machine Learning Engineer

Manila Recruitment
πŸ‡΅πŸ‡­ Philippines – Remote
Full-time
$145K–$177K
Estimated

Required Skills

⏰ Full Time
🟠 Senior
πŸ€– Machine Learning Engineer
Kubernetes
Python
Ray
Docker
JavaScript
MongoDB
Node.js
React
TypeScript
Azure
Cloud
Google Cloud Platform
🟑 Mid-level
AWS
GraphQL
Microservices
Terraform
Apache
ETL
Java
PyTorch
Scala
Spark
SQL
TensorFlow
Go
Keras
Machine Learning
LLM
RAG
R

Job Description

πŸ“‹ Description

β€’ Design and implement agent evaluation pipelines that benchmark AI capabilities across real-world enterprise use cases
β€’ Build domain-specific benchmarks for product support, engineering ops, GTM insights, and other verticals relevant to modern SaaS
β€’ Develop performance benchmarks that measure and optimize latency, safety, cost-efficiency, and user-perceived quality
β€’ Create search- and retrieval-oriented benchmarks, including multilingual query handling, annotation-aware scoring, and context relevance
β€’ Partner with AI and infra teams to instrument models and agents with detailed telemetry for outcome-based evaluation
β€’ Drive human-in-the-loop and programmatic testing methodologies for fuzzy metrics like helpfulness, intent alignment, and resolution effectiveness
β€’ Contribute to the company's open evaluation tooling and benchmarking frameworks, shaping how the broader ecosystem thinks about SaaS AI performance

🎯 Requirements

β€’ 3 to 7 years of experience in systems, infra, or performance engineering roles with strong ownership of metrics and benchmarking
β€’ Fluency in Python and comfort working across full-stack and backend services
β€’ Experience building or using LLMs, vector-based search, or agentic frameworks in production environments
β€’ Familiarity with LLM model-serving infrastructure (e.g., vLLM, Triton, Ray, or custom Kubernetes-based deployments), including observability, autoscaling, and token streaming
β€’ Experience with model tuning workflows, including prompt engineering, fine-tuning (e.g., LoRA, DPO), and evaluation loops for post-training optimization
β€’ Deep appreciation for measuring what matters, whether it's latency under load, degradation in retrieval precision, or regression in AI output quality
β€’ Familiarity with evaluation techniques in NLP, information retrieval, or human-centered AI (e.g., RAGAS, Recall@K, BLEU)
β€’ Strong product and user intuition: you care about what the benchmark represents, not just what it measures
β€’ Experience contributing to academic or open-source benchmarking projects
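To make the Recall@K metric named in the requirements concrete, here is a minimal sketch of how it is typically computed for a retrieval benchmark. The function name and the document IDs are illustrative, not taken from the posting:

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top-k retrieved results."""
    top_k = set(retrieved_ids[:k])
    hits = len(top_k & set(relevant_ids))
    return hits / len(relevant_ids) if relevant_ids else 0.0

# Example: 2 of the 3 relevant docs appear in the top 4 results.
retrieved = ["d7", "d2", "d9", "d1", "d5"]
relevant = {"d1", "d2", "d3"}
print(recall_at_k(retrieved, relevant, k=4))  # 0.6666666666666666
```

In a benchmarking pipeline this would be averaged over a query set and tracked across model or index versions to catch regressions in retrieval precision.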

Job Details

Employment Type

Full-time

Salary Range

$145K–$177K

Estimated

Location

πŸ‡΅πŸ‡­ Philippines – Remote

Remote Work

Remote Friendly