Applied Research Scientist - Text-to-Speech (TTS) at Salient

Salient
San Francisco, California
Full-time
$102K–$118K (estimated)

Required Skills

Text-to-Speech (TTS) systems
Speech synthesis
Low-latency real-time inference
Voice cloning
Transformer-based voice models

Job Description

Salient is an AI workflow automation tool for lenders. We're working with the largest auto lenders in America and are hiring!

About the Role:

We’re looking for an Applied Research Scientist with expertise in Text-to-Speech (TTS) to help us push the boundaries of speech synthesis. You’ll work on developing high-quality, low-latency TTS systems that power real-world applications. The ideal candidate combines deep modeling knowledge with a strong engineering mindset to deliver robust, scalable solutions.

Responsibilities:

  • Develop and improve state-of-the-art (SOTA) TTS models

  • Optimize TTS systems for low-latency real-time inference, voice consistency, and prosody control

  • Contribute to internal tooling for data processing, model training, and inference benchmarking

  • Perform any relevant engineering tasks related to model training and serving, e.g., data ingestion, data cleaning, and evaluation

Requirements:

  • Proven track record developing SOTA TTS systems, or a PhD in speech synthesis

  • Experience with large-scale training, audio preprocessing, and distributed training

Nice to Have:

  • Familiarity with multilingual or code-switched TTS

  • Experience with voice cloning, style transfer, or emotion conditioning in speech

  • Contributions to academic publications or open-source projects in speech generation

About Salient

Salient is building the future of consumer loan servicing, starting by transforming the auto lending industry. Since our September 2023 launch, we've:

  • Scaled to more than $10M in ARR
  • Partnered with some of the largest consumer lenders in America
  • Become cash flow positive
  • Raised $65M in funding
  • Interfaced with more than 2 million unique US consumers
  • Processed over $150M in cash transactions
  • Prevented $30M in fraud
  • Identified 35k+ CFPB violations
  • Built an in-person office culture in San Francisco, CA

We're backed by top-tier investors including Andreessen Horowitz, Matrix, General Catalyst, and Y Combinator.

Please also apply here: https://jobs.ashbyhq.com/salient/84831b8e-b09a-491c-90a0-d604a2a95f0d?utm_source=jWxe7rB9Q6

Python + LLM stack + transformer-based voice models
