Revisiting John Snow's Cholera Map
A Data Visualisation Case Study for Statistical Education
Keywords:
Visualisation, Exploratory Data Analysis, John Snow, Cholera, Spatial Data
Abstract
Data visualisation is a fundamental tool in statistical analysis, enabling the identification of patterns and relationships that might otherwise remain hidden in raw data. One of the most famous historical examples is John Snow's 1854 cholera map, which demonstrated the spatial clustering of cholera cases around a contaminated water pump in London. This study explores how Snow's visualisation can be incorporated into statistics education as an interactive case study. Revisiting Snow's cholera map in 2025 bridges foundational epidemiological reasoning and modern statistical practice, offering students an intuitive, historically grounded pathway into spatial thinking and data visualisation. Using R, we outline the steps involved in reproducing the map, demonstrating geospatial data manipulation, visualisation techniques, and spatial analysis. We discuss the pedagogical benefits of historical case studies in statistics courses, emphasising their role in fostering curiosity, critical thinking, and technical proficiency. We also explore how these methods extend beyond epidemiology to applications in public health, urban analytics, and environmental science. By integrating historical datasets with modern computational tools, educators can create engaging, hands-on learning experiences that reinforce core statistical principles while illustrating the real-world impact of data analysis.
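As a rough illustration of the kind of workflow the abstract describes, the sketch below redraws the map and runs one simple spatial summary. It is not the authors' code. It assumes the `HistData` and `ggplot2` packages are installed and uses the `Snow.deaths`, `Snow.pumps` and `Snow.streets` data frames shipped with `HistData`. The helper `nearest_pump()` is a hypothetical name introduced here, and straight-line distance is a simplifying assumption (walking distance along streets would be closer to Snow's own reasoning).

```r
# Minimal sketch: redraw Snow's map and count deaths by nearest pump.
# Assumption: HistData and ggplot2 are installed; this is illustrative only.
library(HistData)  # Snow.deaths (case, x, y), Snow.pumps (pump, label, x, y),
                   # Snow.streets (street, n, x, y)
library(ggplot2)

# Hypothetical helper: label of the pump closest to a point,
# measured by straight-line (Euclidean) distance.
nearest_pump <- function(x, y, pumps) {
  d <- sqrt((pumps$x - x)^2 + (pumps$y - y)^2)
  as.character(pumps$label[which.min(d)])
}

# Data manipulation: attach the nearest pump to every recorded death.
deaths <- Snow.deaths
deaths$nearest <- mapply(nearest_pump, deaths$x, deaths$y,
                         MoreArgs = list(pumps = Snow.pumps))

# Simple spatial summary: number of deaths whose nearest pump is each pump.
deaths_by_pump <- sort(table(deaths$nearest), decreasing = TRUE)
print(deaths_by_pump)

# Visualisation: streets as grey paths, deaths as points, pumps as triangles.
snow_map <- ggplot() +
  geom_path(data = Snow.streets, aes(x = x, y = y, group = street),
            colour = "grey60") +
  geom_point(data = deaths, aes(x = x, y = y),
             colour = "firebrick", size = 0.8, alpha = 0.6) +
  geom_point(data = Snow.pumps, aes(x = x, y = y),
             shape = 17, size = 3, colour = "navy") +
  geom_text(data = Snow.pumps, aes(x = x, y = y, label = label),
            vjust = -1, size = 3) +
  coord_equal() +  # keep map distances undistorted
  labs(title = "Cholera deaths and water pumps, London 1854",
       x = NULL, y = NULL) +
  theme_minimal()

print(snow_map)
```

In a classroom setting, students could compare the tabulated counts with the visual clustering on the map and then discuss how the conclusion might change under a street-network distance.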