I have been thinking a lot about the problem of publication bias and analysis bias in empirical research.
Even though everyone knows it is wrong, it is common practice in empirical social science to select what results to report only after analysing data. This practice of "data fishing" can result in enormous bias and an unreliable body of published research. Peter van der Windt, Raul Sanchez de la Sierra, and I wrote a paper on the possible merits of a non-binding but comprehensive registration scheme. We describe the scope for bias under weak registration systems and discuss the likely effects of registration on the sort of research that gets produced and reported. "Fishing, Commitment, and Communication" (Preprint) (Political Analysis).
Here is a little app I made that illustrates the problem with data fishing. The point of the app is not to show that generating truly random numbers is hard; it is to show that for some problems you can always choose your tests to get the result you want. Here is another really nice recent demonstration of this idea from NYT.
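The same logic can be sketched in a few lines of Python. This is a minimal illustration and not the app itself: the function names, the sample sizes, the choice of 20 outcomes, and the normal-approximation test are all assumptions made for the example. Every outcome is pure noise, so any "significant" effect is a false positive. A researcher who commits to one outcome in advance should see such results about 5% of the time; a researcher who reports whichever of 20 independent outcomes looks best should see them at a rate near 1 - 0.95^20, or roughly 64%.

```python
import math
import random
import statistics


def two_sided_p(treated, control):
    """Approximate two-sided p-value for a difference in means.

    Uses a normal approximation (an assumption of this sketch),
    not an exact t-test.
    """
    diff = statistics.fmean(treated) - statistics.fmean(control)
    se = math.sqrt(
        statistics.variance(treated) / len(treated)
        + statistics.variance(control) / len(control)
    )
    z = diff / se
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_study(rng, n_per_arm=30, n_outcomes=20):
    """Return one p-value per outcome for a study with no true effects."""
    p_values = []
    for _ in range(n_outcomes):
        # Treated and control units are drawn from the same distribution,
        # so the true treatment effect on every outcome is zero.
        treated = [rng.gauss(0, 1) for _ in range(n_per_arm)]
        control = [rng.gauss(0, 1) for _ in range(n_per_arm)]
        p_values.append(two_sided_p(treated, control))
    return p_values


def false_positive_rates(n_studies=500, alpha=0.05, seed=1):
    """Compare a pre-specified test with reporting the best of many tests."""
    rng = random.Random(seed)
    registered = 0  # studies where the first (pre-specified) outcome is significant
    fished = 0      # studies where at least one outcome is significant
    for _ in range(n_studies):
        p_values = simulate_study(rng)
        registered += p_values[0] < alpha
        fished += min(p_values) < alpha
    return registered / n_studies, fished / n_studies


if __name__ == "__main__":
    registered_rate, fished_rate = false_positive_rates()
    print(f"pre-specified outcome significant: {registered_rate:.2f}")
    print(f"best of 20 outcomes significant:   {fished_rate:.2f}")
```

The only difference between the two rates is when the outcome is chosen: before seeing the data or after. Nothing about the data or the individual tests changes.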
Al Fang and Grant Gordon and I have another working paper on the effects of registration in medical sciences. We don't find much evidence that the way it is done there makes a difference. See here.
See here for a paper put together with three sections at APSA to think through what a registry for social sciences would look like.
See also our DeclareDesign project where we are developing a tool to clarify research designs and facilitate the development of pre-analysis plans.
2015: Promoting an Open Research Culture (Science) (with B Nosek and many others)
2014: Promoting Transparency in Social Science Work (Science) (with E Miguel and others)
2013: Monkey Business (Reprinted in the APSA Comparative Politics Newsletter)