Data Engineer

Shaped
New York City
Full-time
$125K–$157K (estimated)
Required Skills

Machine Learning
Python
R
Java
JavaScript
Go
Rust
Scala
Snowflake
BigQuery
Apache Flink
AWS
GCP
Kubernetes
Analytics

Job Description

Shaped is the fastest path to relevant recommendation and search systems. We help companies turn their behavioral data into truly relevant product and website experiences. We're a Series A company based in Brooklyn, New York, backed by top investors from Madrona and Y Combinator and by executives from Meta, Google, Amazon and Uber!

We are looking for a data engineer to design, build and optimize Shaped's real-time and batch streaming infrastructure. You will be a founding engineer working to reliably ingest customer data (with both batch and real-time processing) into our state-of-the-art AI discovery engine. As one of Shaped's early employees you will help shape our product, culture and vision.

Skills should include Python and data warehouses (such as ClickHouse, Snowflake, or BigQuery). Nice-to-have skills include dbt, Meltano, Kubernetes, and Apache Flink (or other stream processing frameworks).

We're excited to work with you. Come build the future of AI with us!

Customers typically use Shaped as follows:
Connect your data stack, e.g. data warehouse, database or analytics applications.
Define your model. This includes your optimization objective (e.g. clicks vs. purchases vs. shares), item and user catalogs, feature types and model types.
Consume your results from our real-time, scalable ranking endpoints.
Evaluate uplift and model results on our dashboard.

To power all of this, under the hood, we've built a multi-tenanted, real-time machine learning architecture which automatically sets up and ingests data in both real time and batch, transforms it, and stores it in our proprietary feature/vector store. Ranking models are continuously optimized and fine-tuned based on real-time feedback, ensuring customers see the most relevant and up-to-date results possible. From a machine-learning perspective, we use state-of-the-art large-scale neural encoding models to understand multi-modal data types such as image, text, audio and tabular data. We provide an exhaustive library of retrieval, ranking and ordering algorithms, which are selected based on the specified model definition.

We use both AWS and GCP for cloud, Kubernetes for serverless infrastructure, and Python, JavaScript and Rust for languages.

Job Details

Employment Type

Full-time

Salary Range

$125K–$157K (estimated)

Location

New York City