Statistics for Health Data Science
Students and researchers in the health sciences are faced with greater opportunity and challenge than ever before. The opportunity stems from the explosion in publicly available data that simultaneously informs and inspires new avenues of investigation. The challenge is that the analytic tools required go far beyond the standard methods and models of basic statistics. This textbook aims to equip health care researchers with the most important elements of a modern health analytics toolkit, drawing from the fields of statistics, health econometrics, and data science.
This textbook is designed to overcome students’ anxiety about data and statistics and to help them to become confident users of appropriate analytic methods for health care research studies. Methods are presented organically, with new material building naturally on what has come before. Each technique is motivated by a topical research question, explained in non-technical terms, and accompanied by engaging explanations and examples. In this way, the authors cultivate a deep (“organic”) understanding of a range of analytic techniques, their assumptions and data requirements, and their advantages and limitations. They illustrate all lessons via analyses of real data from a variety of publicly available databases, addressing relevant research questions and comparing findings to those of published studies. Ultimately, this textbook is designed to cultivate health services researchers that are thoughtful and well informed about health data science, rather than data analysts.
This textbook differs from the competition in its unique blend of methods and its determination to ensure that readers gain an understanding of how, when, and why to apply them. It provides the public health researcher with a way to think analytically about scientific questions, and it offers well-founded guidance for pairing data with methods for valid analysis. Readers should feel emboldened to tackle analysis of real public datasets using traditional statistical models, health econometrics methods, and even predictive algorithms.
Accompanying code and data sets are provided in an author site: https://roman-gulati.github.io/statistics-for-health-data-science/