
STATISTICS FOR HISTORIANS COURSE
UNIT 5: EXPLORATIVE DATA ANALYSIS

This unit is composed of two lectures which will introduce you to DataFrames with Pandas.

Lecture A: Exploring DataFrames with Pandas (Part I)
This session runs in an interactive notebook on MyBinder. Click for more information on how to access and run a notebook. An overview of all interactive materials is available here. 

This notebook introduces the Pandas library and explores tools for working programmatically with tabular data in Python. We take a closer look at realistic and complex metadata derived from the British Library catalogue and demonstrate how you can refine and reorganize information with the goal of studying trends over time.
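As a rough sketch of the kind of reorganisation described above, the snippet below derives a decade from a publication year and counts titles per decade. The rows and the column names (`title`, `year`, `decade`) are invented for illustration and are not taken from the British Library catalogue or from the notebook itself.

```python
import pandas as pd

# Hypothetical records: invented for illustration, not real catalogue data.
records = pd.DataFrame(
    {
        "title": ["A", "B", "C", "D", "E"],
        "year": [1841, 1847, 1852, 1859, 1863],
    }
)

# Derive a decade column by rounding each year down to the start of its decade.
records["decade"] = (records["year"] // 10) * 10

# Count how many titles fall in each decade, ordered chronologically.
titles_per_decade = records.groupby("decade")["title"].count()
print(titles_per_decade)
```

Grouping by a derived column such as a decade is one simple way to turn individual catalogue entries into a series that can be read as a trend over time.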


Lecture B: Exploring DataFrames with Pandas (Part II)
This session runs in an interactive notebook on MyBinder. Click for more information on how to access and run a notebook. An overview of all interactive materials is available here.

This notebook uses "synthetic" demographic data about age and gender in late-Victorian London. We discuss different types of variables and strategies for visualizing distributions. We then summarise the data using descriptive statistics, such as the mean and median. From a historical point of view, we investigate whether men were generally younger than women in late-Victorian London.
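To give a sense of what summarising by group can look like in Pandas, here is a minimal sketch that computes the mean and median age for each gender. The six rows and the column names (`gender`, `age`) are made up for this example; they are not the course's synthetic dataset and say nothing about Victorian London.

```python
import pandas as pd

# Hypothetical sample: invented ages, not the course's synthetic dataset.
people = pd.DataFrame(
    {
        "gender": ["M", "M", "M", "F", "F", "F"],
        "age": [20, 30, 40, 22, 35, 60],
    }
)

# Mean and median age for each gender, one row per group.
summary = people.groupby("gender")["age"].agg(["mean", "median"])
print(summary)
```

In this toy sample the mean for the "F" group (39) sits above its median (35) because of a single older person, which is one reason to report both statistics when comparing groups.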


KEY READINGS
* Konrad Jarausch, Kenneth Hardy, Quantitative Methods for Historians: A Guide to Research, Data, & Statistics (1991)
* Roderick Floud, An Introduction to Quantitative Methods for Historians (1973)
* Paul Kellstedt, Guy Whitten, The Fundamentals of Political Science Research (2008)
* Robert Fogel and Geoffrey Elton, Which Road to the Past? Two Views of History (1984)
[This book is a debate between an outspoken quantifier and a historian critical of quantification]



EXERCISES
Integrated into the lectures above.
