ZoomInfo

Senior Data Scientist - Information Retrieval & NLP

SF, CA, US

Hybrid
Full-time
2 days ago

Summary

ZoomInfo is redefining how 40,000+ revenue teams find, engage, and win customers. Our next leap forward: lightning-fast, hyper-accurate information retrieval powered by Large & Small Language Models. We're assembling a best-in-class Applied AI group and are hiring a Senior Data Scientist to own core retrieval, NER, and aligned entity-resolution & knowledge-graph initiatives that touch billions of records and serve millions of daily queries.

What you will do:

* End-to-End Retrieval Modeling
  * Invent and productionize Transformer/RAG architectures that surface the right contact, company, or insight.
  * Drive quantization, distillation, and SLM fine-tuning so models stay fast and affordable at petabyte scale.
  * Prototype and launch hybrid dense/sparse retrieval pipelines on vector DBs (Pinecone, Weaviate, FAISS, OpenSearch).
* Named-Entity Recognition & Resolution
  * Own high-recall NER models that tag people, orgs, locations, and industry-specific entities across multi-language text.
  * Build cross-dataset entity-resolution frameworks that dedupe hundreds of millions of records with sub-second latency; enrich with knowledge-graph signals where valuable.
* Experimentation
  * Design large-scale A/B and back-testing plans; close the loop from experiment to KPI uplift.
* Cross-Functional Impact
  * Translate product goals into measurable ML KPIs; influence roadmap, capacity, and investment decisions.
  * Mentor junior scientists/engineers; publish internal requirements documents, external blogs, and present at conferences.

What you will bring:

* 7+ yrs hands-on ML/NLP experience (or 4+ yrs post-PhD/Master's) with at least two delivered, revenue-impacting products.
* Expertise in transformer stacks (BERT/GPT/T5), RAG, vector-based IR, and latency/throughput optimization.
* Proven track record building NER or entity-resolution systems at 100M+ record scale; knowledge-graph experience is a plus.
* Strong applied research chops (PyTorch or TensorFlow) paired with software-engineering rigor (Python, Go/Java a plus).
* Desire to work within MLOps tools and frameworks: Docker, K8s, GitOps, Terraform, feature stores, model registries, automated retraining.
* Ability to persuade exec and non-tech audiences with data-driven storytelling; comfortable owning strategy & budget.