Salient is an AI workflow automation tool for lenders. We're working with the largest auto lenders in America and are hiring!
About the Role:
We’re looking for an Applied Research Scientist with expertise in Text-to-Speech (TTS) to help us push the boundaries of speech synthesis. You’ll work on developing high-quality, low-latency TTS systems that power real-world applications. The ideal candidate combines deep modeling knowledge with a strong engineering mindset to deliver robust, scalable solutions.
Responsibilities:
Develop and improve SOTA TTS models
Optimize TTS systems for low-latency real-time inference, voice consistency, and prosody control
Contribute to internal tooling for data processing, model training, and inference benchmarking
Perform any relevant engineering tasks related to model training and serving, e.g., data ingestion, data cleaning, and evaluation
Requirements:
Proven track record developing SOTA TTS systems, or a PhD in speech synthesis
Experience with large-scale training, audio preprocessing, and distributed training
Nice to Have:
Familiarity with multilingual or code-switched TTS
Experience with voice cloning, style transfer, or emotion conditioning in speech
Contributions to academic publications or open-source projects in speech generation
About Salient
Salient is building the future of consumer loan servicing, starting by transforming the auto lending industry. Since our September 2023 launch, we've:
Please also apply here: https://jobs.ashbyhq.com/salient/84831b8e-b09a-491c-90a0-d604a2a95f0d?utm_source=jWxe7rB9Q6
Python + LLM stack + transformer-based voice models
Full-time
$102K–$118K
San Francisco, California