ZoomInfo is redefining how 40,000+ revenue teams find, engage, and win customers. Our next leap forward: lightning-fast, hyper-accurate information retrieval powered by Large & Small Language Models. We're assembling a best-in-class Applied AI group and are hiring a Senior Data Scientist to own core retrieval, NER, and aligned entity-resolution & knowledge-graph initiatives that touch billions of records and serve millions of daily queries.
What you will do:
* End-to-End Retrieval Modeling
  * Invent and productionize Transformer/RAG architectures that surface the right contact, company, or insight.
  * Drive quantization, distillation, and SLM fine-tuning so models stay fast and affordable at petabyte scale.
  * Prototype and launch hybrid dense/sparse retrieval pipelines on vector DBs (Pinecone, Weaviate, FAISS, OpenSearch).
* Named-Entity Recognition & Resolution
  * Own high-recall NER models that tag people, orgs, locations, and industry-specific entities across multi-language text.
  * Build cross-dataset entity-resolution frameworks that dedupe hundreds of millions of records with sub-second latency; enrich with knowledge-graph signals where valuable.
* Experimentation
  * Design large-scale A/B and back-testing plans; close the loop from experiment to KPI uplift.
* Cross-Functional Impact
  * Translate product goals into measurable ML KPIs; influence roadmap, capacity, and investment decisions.
  * Mentor junior scientists/engineers; publish internal requirements documents and external blogs; present at conferences.
What you will bring:
* 7+ years of hands-on ML/NLP experience (or 4+ years after a PhD or Master's), with at least two delivered, revenue-impacting products.
* Expertise in transformer stacks (BERT/GPT/T5), RAG, vector-based IR, and latency/throughput optimization.
* Proven track record of building NER or entity-resolution systems at 100M+ record scale; knowledge-graph experience is a plus.
* Strong applied research chops (PyTorch or TensorFlow) paired with software-engineering rigor (Python, Go/Java a plus).
* Desire to work within MLOps tools and frameworks: Docker, K8s, GitOps, Terraform, feature stores, model registries, automated retraining.
* Ability to persuade executive and non-technical audiences with data-driven storytelling; comfortable owning strategy & budget.
#LI-Hybrid