Data Scientist
Company: Eli Lilly and Company
Location: Indianapolis
Posted on: March 22, 2026
Job Description:
At Lilly, we unite caring with discovery to make life better for
people around the world. We are a global healthcare leader
headquartered in Indianapolis, Indiana. Our employees around the
world work to discover and bring life-changing medicines to those
who need them, improve the understanding and management of disease,
and give back to our communities through philanthropy and
volunteerism. We give our best effort to our work, and we put
people first. We’re looking for people who are determined to make
life better for people around the world.

Role Overview
Tech@Lilly
is looking for a Data Scientist to join the technology team
supporting Global Regulatory Affairs (GRA), Global Scientific
Communications (GSC), and Global Statistical Sciences (GSS). You
will apply data science, machine learning, and AI techniques to
solve real business problems — from building agents to automate
manual regulatory workflows and optimizing clinical trial processes
to building predictive models that drive smarter decisions across
the portfolio. This role sits at the intersection of data, domain
expertise, and engineering. You won’t just build models in
notebooks — you’ll partner with business stakeholders to understand
the problem and with leadership to quantify the impact of your
work.

What You’ll Be Doing
As a Data Scientist, you will work
across the portfolio to identify opportunities where data science
and AI can create measurable business value. You’ll analyze complex
datasets — regulatory documents, clinical trial data, submission
timelines, operational metrics — to uncover patterns, build
predictive models, and develop AI-powered solutions. You’ll design
and evaluate machine learning models, build NLP and generative AI
applications for regulatory and scientific content, and collaborate
closely with full stack engineers to move your work from prototype
to production. You will operate in a regulated GxP environment where
data integrity, reproducibility, and validation are not optional.

How You’ll Succeed
Partnering with business stakeholders across GRA, GSC, and GSS to
understand their workflows, identify high-impact problems, and frame
them as data science opportunities, translating business questions
into analytical approaches.

Developing and deploying machine learning models (classification,
regression, clustering, time-series forecasting) to solve problems
such as submission timeline prediction, document classification,
regulatory risk scoring, and resource optimization.

Building and evaluating NLP and generative AI solutions, leveraging
LLMs, RAG architectures, text extraction, entity recognition, and
document summarization to automate regulatory authoring, scientific
literature analysis, and content generation workflows.

Designing and executing experiments to evaluate model performance,
using rigorous statistical methods, A/B testing, and evaluation
frameworks (including RAGAS for RAG systems) to ensure solutions
meet quality and accuracy thresholds before deployment.

Designing and building AI agents and agentic workflows: creating
multi-step, tool-using systems that can autonomously execute complex
tasks such as regulatory document drafting, data extraction and
transformation, and cross-system orchestration, moving beyond
single-prompt interactions to production-grade agent architectures
that operate reliably in a validated environment.

Collaborating with full stack engineers and platform teams to
productionize models: building APIs, integrating into existing
applications, deploying on AWS infrastructure (Lambda, EKS,
SageMaker, Databricks), and monitoring model performance in
production.

Communicating findings and recommendations to both technical and
non-technical audiences, using data visualization, storytelling,
and clear business-impact framing to ensure your work drives actual
decisions.

Staying current with emerging techniques in machine learning,
generative AI, and data science, evaluating new tools,
frameworks, and approaches for applicability to the GRA/GSC/GSS
portfolio and sharing knowledge with the broader team.

Basic Qualifications
Bachelor’s degree in Data Science, Statistics, Computer Science,
Mathematics, or a related quantitative field

1 year of professional data science experience in Python, R, and
core data science libraries

Additional Skills & Preferences
Experience with machine learning frameworks and model deployment
patterns

Academic background in data science

Hands-on experience with NLP techniques and/or generative AI: LLM
APIs (OpenAI, Anthropic), RAG architectures, vector databases,
prompt engineering

Familiarity with cloud data platforms: AWS (SageMaker, Lambda, S3),
Databricks, or similar

Knowledge of statistical methods: hypothesis testing, experimental
design, Bayesian methods, regression analysis

Experience with SAS programming

Strong communication skills: ability to present technical findings
to non-technical audiences and translate business questions into
analytical frameworks

Collaborative mindset and experience working with cross-functional
teams including engineers, product owners, and business partners

Lilly is dedicated to helping individuals
with disabilities to actively engage in the workforce, ensuring
equal opportunities when vying for positions. If you require
accommodation to submit a resume for a position at Lilly, please
complete the accommodation request form (
https://careers.lilly.com/us/en/workplace-accommodation ) for
further assistance. Please note this is for individuals to request
an accommodation as part of the application process and any other
correspondence will not receive a response.

Lilly is proud to be an
EEO Employer and does not discriminate on the basis of age, race,
color, religion, gender identity, sex, gender expression, sexual
orientation, genetic information, ancestry, national origin,
protected veteran status, disability, or any other legally
protected status.

Our employee resource groups (ERGs) offer strong
support networks for their members and are open to all employees.
Our current groups include: Africa, Middle East, Central Asia
Network, Black Employees at Lilly, Chinese Culture Network,
Japanese International Leadership Network (JILN), Lilly India
Network, Organization of Latinx at Lilly (OLA), PRIDE (LGBTQ
Allies), Veterans Leadership Network (VLN), Women’s Initiative for
Leading at Lilly (WILL), enAble (for people with disabilities).
Learn more about all of our groups.

Actual compensation will depend
on a candidate’s education, experience, skills, and geographic
location. The anticipated wage for this position is $66,000 -
$165,000. Full-time equivalent employees also will be eligible for a
company bonus (depending, in part, on company and individual
performance). In addition, Lilly offers a comprehensive benefit
program to eligible employees, including eligibility to participate
in a company-sponsored 401(k); pension; vacation benefits;
eligibility for medical, dental, vision and prescription drug
benefits; flexible benefits (e.g., healthcare and/or dependent day
care flexible spending accounts); life insurance and death
benefits; certain time off and leave of absence benefits; and
well-being benefits (e.g., employee assistance program, fitness
benefits, and employee clubs and activities). Lilly reserves the
right to amend, modify, or terminate its compensation and benefit
programs in its sole discretion and Lilly’s compensation practices
and guidelines will apply regarding the details of any promotion or
transfer of Lilly employees.

#WeAreLilly
Keywords: Eli Lilly and Company, Bloomington, Data Scientist, Science, Research & Development, Indianapolis, Indiana