Assessing the quality of the fossil record is notoriously hard, and many recent attempts have used sampling proxies that can be questioned. For example, counts of geological formations and estimated outcrop areas might not be defensible as reliable sampling proxies: geological formations are units of enormously variable dimensions that depend on rock heterogeneity and fossil content (and so are not independent of the fossil record), and outcrop areas are not always proportional to rock exposure, which is probably a closer indicator of rock availability. It is shown that in many cases formation counts will always correlate with fossil counts, whatever the degree of sampling. It is not clear, in any case, that these proxies provide a good estimate of what is missing in the gap between the known fossil record and reality; rather, they largely explore the gap between known and potential fossil records. Further, using simple, single numerical metrics to correct global-scale raw data, or to model sampling-driven patterns, may be premature. There are perhaps four approaches to exploring the incompleteness of the fossil record: (1) regional-scale studies of geological completeness; (2) regional- or clade-scale studies of sampling completeness using comprehensive measures of sampling, such as numbers of localities or specimens or fossil quality; (3) phylogenetic and gap-counting methods; and (4) model-based approaches that compare sampling as one of several explanatory variables with measures of environmental change, singly and in combination. We suggest that palaeontologists, like other scientists, should accept that their data are patchy and incomplete, and use appropriate methods to deal with this issue in each analysis. All that matters is whether the data are adequate for a designated study or not.
A single answer to the question of whether the fossil record is driven by macroevolution or megabias is unlikely ever to emerge because of temporal, geographical, and taxonomic variance in the data.

The fossil record is far from perfect, and palaeontologists must be concerned about inadequacy and bias (Raup 1972; Benton 1998; Smith 2001, 2007a). Fundamental issues concerning the quality and completeness of the fossil record were enunciated clearly by Charles Darwin (1859, pp. 287–288), who wrote:

‘That our palaeontological collections are very imperfect, is admitted by every one. The remark of that admirable palaeontologist, the late Edward Forbes, should not be forgotten, namely, that numbers of our fossil species are known and named from single and often broken specimens, or from a few specimens collected on some one spot. Only a small portion of the surface of the earth has been geologically explored, and no part with sufficient care, as the important discoveries made every year in Europe prove. No organism wholly soft can be preserved. Shells and bones will decay and disappear when left on the bottom of the sea, where sediment is not accumulating. . . With respect to the terrestrial productions which lived during the Secondary and Palaeozoic periods, it is superfluous to state that our evidence from fossil remains is fragmentary in an extreme degree.’

Raup (1972) clarified the situation when he compared the ‘empirical’ model of Valentine (1969), a literal reading of the fossil record, with his ‘bias simulation model’ that explained the bulk of the apparent low diversity levels of marine invertebrates in the Palaeozoic as a sampling error. Two opposite viewpoints have been argued: either that the fossil record is good enough (e.g. Sepkoski et al. 1981; Benton 1995; Benton et al. 2000; Stanley 2007) or that it is not good enough (e.g. Raup 1972; Alroy et al. 2001, 2008; Peters & Foote 2002; Alroy 2010) to show the main patterns of global diversification through time.
A resolution between these opposite viewpoints does not appear close (Benton 2009; Erwin 2009; Marshall 2010).

From: McGowan, A. J. & Smith, A. B. (eds) Comparing the Geological and Fossil Records: Implications for Biodiversity Studies. Geological Society, London, Special Publications, 358, 63–94. DOI: 10.1144/SP358.6 0305-8719/11/$15.00 © The Geological Society of London 2011.

Key objective evidence for bias in the fossil record could be the extraordinary and ubiquitous correlation of sampling proxies and diversity curves: why is there such close tracking of measures of rock volume by palaeodiversity? There are three possible explanations: (1) rock volume/sampling drives the diversity signal (Peters & Foote 2001, 2002; Smith 2001, 2007a; Butler et al. 2011); (2) both signals reflect a third, or ‘common’, cause such as sea-level fluctuation (Peters 2005; Peters & Heim 2010); or (3) both signals are entirely or partially redundant (= identical) with each other. In reality, the close correlation probably reflects a combination of all three factors in different proportions in any test case, and so it is probably fruitless to prolong the debate about which of the three models is correct and which incorrect.

Much of the literature on the quality of the rock and fossil records has focused on marine settings. This reflects the interests of palaeontologists who engage with these questions, and the fact that many marine rock records are more complete than most terrestrial (continental) rock records. However, the terrestrial fossil record is worth considering for several reasons: terrestrial life today is much more diverse than marine life, perhaps representing 85% of modern biodiversity (May 1990; Vermeij & Grosberg 2010); terrestrial life includes many major taxa that are sensitive to atmospheric, temperature, and topographic change and so are key indicator species in studies of global change; and for many terrestrial groups (e.g.
angiosperms, insects, vertebrates) there are mature morphological and molecular phylogenies that enable cross-comparison between stratigraphic and cladistic data.

In this paper, we explore the use of sampling proxies, and suggest that some commonly used measures, notably formation counts and outcrop areas, may not be useful or accurate measures of sampling. Indeed, we suggest that there is probably no single numerical metric that captures all aspects of sampling (= rock volume, accessibility, effort), and that recent attempts to correct the raw data, or to model sampling-driven patterns, may be premature. We then look at some case studies of patchy fossil records in taxa with good phylogenetic data, and suggest that in some cases at least the rock volume and fossil occurrence measures are identical, and so correlate almost perfectly. Finally, we suggest that such global-scale confrontations of sampling proxies and fossil data are not adequate at present, and recommend instead study-scale approaches to detect and correct for sampling, involving direct evidence for missing data (e.g. Lazarus taxa; ghost ranges), direct evidence for sampling (e.g. number of localities or samples per time bin; fossil specimen completeness), and an integrated, model-based approach that incorporates sampling and explanatory models in explaining particular diversity curves.

The fossil record, reality and sampling

The fossil record, collector curves and