Mapping Ancient Forests: Bayesian Inference for Spatio-temporal Trends in Forest Composition Using the Fossil Pollen Proxy Record

Christopher J. Paciorek, Harvard School of Public Health
Jason S. McLachlan, University of Notre Dame


Ecologists use the relative abundance of fossil pollen in sediments to estimate how tree species abundances change over space and time. To predict historical forest composition and investigate how much information is available from such data, we build a Bayesian hierarchical model that predicts forest composition in central New England, USA, based on fossilized pollen from sediments in a network of ponds. After preprocessing the pollen composition time series from individual ponds, we estimate the critical relationships between abundances of taxa in the pollen record and abundances in actual vegetation for the modern and colonial periods, for which both pollen and direct vegetation data are available. For each time period, the Bayesian model relates pollen and vegetation data to a latent spatial process representing forest composition. For time periods in the past with only pollen data, we use the estimated model parameters to make predictions about the latent spatial process conditional on the relevant pollen data and, through the parameter estimates, conditional on information in the modern and colonial data. Careful parameterizations allow us to borrow strength across taxa, space, and time. We develop an innovative graphical assessment of feature significance to help infer which spatial patterns are reliably estimated.

Using this approach, we estimate the spatial distribution and relative abundances of tree species over the last 2000 years, with an assessment of uncertainty, and draw inference about how these patterns have changed over time. Cross-validation suggests that our feature significance approach can reliably indicate certain large-scale spatial features for many taxa, but that features on scales smaller than 50 km are difficult to distinguish, as are large-scale features for some taxa. The model fits also allow us to investigate covariate effects on taxon abundance over time, the degree of long-distance pollen dispersal, and differences among taxa in pollen dispersal characteristics.

The assessment of uncertainty in this paper is the critical advantage of our modeling approach over current ecological analyses. In addition, by analyzing the data in the context of such a model, we can investigate ecological hypotheses. Building ecological research on a solid statistical framework will allow us to extend this collaboration into several areas, such as population spread after the last ice age and the genetic consequences of population dynamics.