Tuesday, April 26, 2016

On this day in 1986

Lost in an African Jungle*

It all began in L.A. California when I had to go to Africa on safari, to hunt the Wild Weirdo Snake. Because I was a scientist and had to study one. So I went to Africa to the big bush (the grassy swamp land). I was there but suddenly I got lost. Luckily, I brought a map. But it started to rain and my map got soaked. So I couldn't use that. But then, in the midst of the jungle, I heard what sounded like Indians. So, I ran, but they were coming from all directions. They came and got me. They had a Wild Weirdo Snake for a pet. So they wrapped it around me. I died. So they ate me. And for the rest of my life my head hangs from a stick.

The End

Is this the Wild Weirdo Snake? (source)




*Thanks to my mother for saving this.

Thursday, April 14, 2016

The last orang standing (no place to swing): why do we care?

Every time we see documentation of threats to the existence of particular species, the issue arises as to whether human activities are responsible.  And if so, there's usually a plea to stop the devastation. Recent examples that led me to think about what this means include the April 5 NY Times story about devastation being experienced by the orangutans in Indonesia, whose range is becoming ever more limited.  Here is an image from that story:

All alone in a devastated habitat.  NYTimes story 4/5/2016
The pathos of the image is evocative, and even heart wrenching, but fortunately this particular animal was rescued.  Here is a heart-warming image from the same story:


Rescue!  From the same Times story
And there was a recent, truly uplifting story, from Al Jazeera, about a conservationist who has spent decades carefully and patiently enabling orangs to return to the forest.

Orangutans are in trouble, and at least some people seem to care.  Indeed, the pathos is about much more than individual orangs orphaned to a cruel fate.  It is about the endangerment of their species itself. There are similar concerns about other species, such as the likely impending doom of other great apes (besides ourselves): chimps and gorillas, in particular.  But, in fact, why are we concerned?

Why the concern?
It may seem obvious.  After all, every individual dies--humans, ants, birds, our pets, our children, ourselves--and orangs.  We don't want to die or suffer, so maybe it's natural in some way for us to empathize with those who are dying or suffering.  Knowing we ourselves will die some day, we don't like to see other individuals die, and in a collective sense, we extend the same feelings of empathy rather automatically to whole species.

But while it may be uncomfortable to think about it, 'we care' doesn't apply to everyone and it's at least worth thinking a bit about why anyone would care.  Every species becomes extinct.  There will always be a last one standing (or, if there are still trees around, swinging).  And it, too, will go.  Even if we except those lineages of life that continue to produce offspring which some day we would dub with a new species name, many if not by far most species eventually disappear without issue.  Extinction is a permanent loss (even if some geneticists occasionally resuscitate a dodo, mammoth or Neanderthal from DNA), but so is every death.

Orangutans and chimpanzees may be cute fellows we can relate to, but there have been quite a few other ape species over the past few million years.  Each was presumably just as cute and cuddly and person-like in its own way as the orangs and chimps are today (not to mention those lumbering gorillas, and gibbons, those acrobatic swingers).  But something did each of them in, except for the one lineage that led to us--and it may have been that one--our ancestors--who did the others in at the time!  They didn't set up nature reserves for the other, unlucky, apes.

Extinction is a normal part of life.  So I again think it's fair to ask why our empathy is so poignant when we can document fates such as those of the orangs, or the many other endangered species, that we can see in the flesh.

The daily obituaries and the vanishing orangs are instances of business-as-usual, that happen to be taking place in our own particular time. It may well be that the specific forces at work today are uniquely due to the predominance of humans on the earth, but from a more distanced view, that is just one of many more or less unique eons or events during earth history.  We've been changing things very rapidly, but perhaps not particularly more dramatically than major cataclysmic meteor strikes or volcanic eruptions that have, or may have, quickly changed global climates and led to mass extinctions.

So even if the details of human agency are specific, the phenomena of change are generic.  The comings and goings of individual plants and animals, of species and ecosystems have always been specific to a given time and place.  That is the essence of evolution as we know it.  One could even say that, as biologists, we should be glad that we can see the theoretically hypothesized, inferred process of extinction in action in diverse ways.

It is also interesting to me that while we may rue the passing of a few orangs in a jungle, that means we are being unsympathetic with the people who need (or want) more rice fields or timber, or to make a living by selling ivory.  Indeed, in many ways we often seem less concerned about the many individuals of our own species who are daily subjected to marginal or lethal living circumstances, or who are bombed out of existence, also by humans. There are far, far more such victims--each of them individuals--than there are living orangs who could be subjected to such forlorn fates as are seen in these recent news stories.  So in some ways the 'we' who are so empathetic are sitting in protected privilege, and isn't that empathy a kind of self-flattering feeling?

Vegetarians have various reasons for following their diets, one of them being that they don't want to be the cause of death or suffering of animals.  This may be similar to sentiments about displaced orangs.  And of course, most people who can, do eat meat, even if we want to save the orangs, and are antagonistic to poachers or encroaching farmers.  It may seem less existential because we can, after all, always make more chickens or cows.  And if salmon become endangered, we try to stop catching them (for a while).  That is rather selfish even if there might be a twinge of compassion involved.

It is rather similar with climate change, I think.  We know very well that not even continents are forever.  Islands and shores come and they go. We owe today's gorgeous mountain ranges, and spectacles like the Grand Canyon, to yesterday's destruction of what was before.  Why do we care about climate change when, in essence, climates have always changed and the major effects of this epoch will occur decades or a century or more from now, when neither we nor even our children will be here to see them?  Yes, humans may be the immediate cause of this cycle, but even if we stopped driving and flying today, biogeography would change anyway, in its own ways and on its own times.

In every objective sense, climate change and the conflicts and dislocation that will be associated with it, will just be another part of the earth's long and dynamic history, an evolutionary history, not always pleasant, in which every today is left behind to become a yesterday to its tomorrow.

The meaning of 'life' is local
Obviously the key is what 'in every objective sense' means, or doesn't, to these feelings.  Perhaps the fact is that this is not a phenomenon in any objective sense.  Instead, it is personal and emotional. Is it a form of sappy nostalgia, or some strange empathy that we extend to a few furry friends if not always even to members of our own kind?  Is it deeper and more self-referential, an extension of the empathy for our children and close kin, that evolution has programmed in our particular species? Do we shed tears for the hapless orangs because we know our own similar fate awaits us?  Is our empathy for orangs and other endangered species a way of pretending, somehow, that by 'saving the planet' we are saving ourselves--a need to feel important, or a desire to avoid facing the fact that it will be me sometime soon, too?

To many people this sort of empathy gives life its 'meaning', a vague term that refers to values we choose to hold.  These values are subjective, and we know that such 'meaning' is just for our own personal, temporary lives.  Some deeply religious people believe God gave us the earth to exploit, yet other equally religious people believe we must cherish it and keep it as pristine as God made it. Indeed, the atheists I know are at least as empathetic to these values for reasons they might not even think need explaining: it's just how they feel about the cosmos.

Personal value systems are in essence how we choose to live and what to value in the time we happen to have here, even when we also know they make little difference to long-term nature.  It isn't nice to do so, but I think it's appropriate to note that even this kindly view is not so innocent: What people value is also what they so often seem to feel they must force others to value as well, which is more or less how we behave about sociopolitics generally.  We sympathize with the orangs and elephants but demonize the farmers who are clearing the forest, or the ivory hunters.

To me personally, at least, if we want to try to be objective about the cosmos, and we accept that evolution--with its emphasis on reproduction and survival, and that's pretty much it--explains how we got here, it's unclear why we should particularly care about anything other than what affects us directly, or perhaps indirectly in the sense of our children, and even that seems to be purely from survival instinct.  Maybe it just pleases us to see orang-rescue stories.  But I still find it curious that we care or even wish to prevent what we know very well has always happened, and is indeed the basis of the evolutionary processes that made us possible.  In some ways, we act as if extinction were some new phenomenon, newly ominous in the world.

So what if, in our time, it happens to be a few apes or coral or whatever that disappear?  The 'so what?' would have to be either that we are oblivious to the realities of evolutionary existence or that we know those realities but choose to hold some values that give us a sense of existential value or purpose, in our own lives, even if they are individually fleeting.

I myself like attempts to preserve what seems good in the world.  I am warmed by the fact that some lucky orangs will be given a chance at life, because somebody cares to do that for them.  I feel that way even if I know that their eventual death, out there in cruel Nature, is likely to be very unpleasant. In fact, it may be a blessing, because the last orang standing, or swinging, will be very lonely.  And I love our three cats!

I personally agree with sustainability people in thinking that we should cut down--way down--on our consumption and pollution, and on causing major ecological changes.  But I also realize that my feelings are just my own, probably rather egotistical, way of making it through the temporary maze.

There will be a last hurrah for the orangs.  It is likely to be soon.  But, despite all I've just said, I don't want to see it, because while it may be naive vanity, I too care.

Tuesday, March 29, 2016

Statistical Reform.....or Safe-harbor Treadmill Science?

We have recently commented on the flap in statistics circles about the misleading use of significance test results (p-values) rather than a more complete and forthright presentation of the nature of the results and their importance (three posts, starting here).  There has been a lot of criticism of what boils down to misrepresentative headlines publicizing what are in essence very minor results.  The American Statistical Association recently published a statement about this, urging clearer presentation of results.  But one may ask about this and the practice in general. Our recent set of posts discussed the science.  But what about the science politics in all of this?

The ASA is a trade organization whose job it is, in essence, to advance the cause and use of statistical approaches in science.  The statistics industry is not a trivial one.  There are many companies who make and market statistical analytic software.  Then there are the statisticians themselves and their departments and jobs.  So one has to ask: is the ASA statement, and the other hand-wringing, sincere and profound, or is it, and to what extent, a vested interest protecting its interests?  Is it a matter of finding a safe harbor in a storm?

Statistical analysis can be very appropriate and sophisticated in science, but it is also easily mis- or over-applied.  Without it, it's fair to say that many academic and applied fields would be in deep trouble; sociopolitical sciences and many biomedical sciences as well fall into this category.  Without statistical methods to compare and contrast sampled groups, these areas rest on rather weak theory.  Statistical 'significance' can be used to mask what is really low level informativeness or low importance under a patina of very high quantitative sophistication.  Causation is the object of science, but statistical methods too often do little more than describe some particular sample.

When a problem arises, as here, there are several possible reactions.  One is to stop and realize that it's time for deeper thinking: that current theory, methods, or approaches are not adequately addressing the questions that are being asked.  Another reaction is to do public hand-wringing and say that what this shows is that our samples have been too small, or our presentations not clear enough, and we'll now reform.  

But if the effects being found are, as is the case in this controversy, typically very weak and hence not very important to society, then the enterprise and the promised reform seem rather hollow. The reform statements have had almost no component that suggests that re-thinking is what's in order. In that sense, what's going on is a stalling tactic, a circling of wagons, or perhaps worse, a manufactured excuse to demand even larger budgets and longer-term studies, that is to demand more--much more--of the same.

The treadmill problem

If that is what happens, it will keep scientists and software outfits and so on on the same treadmill they've been on, the very treadmill that has led to the problem.  It will also be contrary to good science.  Good science should be forced by its 'negative' results to re-think its questions. This is, in general, how major discoveries and theoretical transformations have occurred.  But with the corporatization of academic professions, both commercial and in the sense of trade-unions, we have an inertial factor that may actually impede real progress.  Of course, those dependent on the business will vigorously resist or resent such a suggestion. That's normal and can be expected, but it won't help unless a spirited attack on the problems at hand goes beyond more-of-the-same.




Is it going to stimulate real new thinking, or mainly just strategized thinking for grants and so on?

So is the public worrying about this a holding action or a strategy? Or will we see real, rather than just symbolic, pro forma, reform? Based on the way things work these days, real reform seems unlikely.

There is a real bind here. Everyone depends on the treadmill and keeping it in operation. The labs need their funding and publication treadmills, because staff need jobs and professors need tenure and nice salaries. But if by far most findings in this arena are weak at best, then what journals will want to publish them? They have to publish something and keep their treadmill going. What news media will want to trumpet them, to feed their treadmill? How will professors keep their jobs or research-gear outfits sell their wares?

There is fault here, but it's widespread, a kind of silent conspiracy, and not everyone is even aware of it. It's been built up gradually over the past few decades, like the frog in slowly heating water who doesn't realize he's about to be boiled alive. We wear the chains we've forged in our careers. It's not just a costly matter, and one of understandable careerism. It's a threat to the integrity of the enterprise itself.
We have known many researchers who have said they have to be committed to a genetic point of view because that's what you have to do to get funded, to keep your lab going, to get papers in the major journals or have a prominent influential career. One person applying for a gene mapping study to find even lesser genomic factors than the few that were already well-established said, when it was suggested that rather than find still more genes, perhaps the known genes might now be investigated instead, "But, mapping is what I do!".  Many a conversation I've heard is a quiet boasting about applying for funding for work that's already been done, so one can try something else (that's not being proposed for reviewers to judge).

If this sort of 'soft' dishonesty is part of the game (and that's if you think it's 'soft'), and yet science depends centrally on honesty, why do we think we can trust what's in the journals?  How many seriously negating details are not reported, or buried in huge 'supplemental' files, or not visible because of intricate data manipulation? Gaming the system undermines the very core of science: its integrity.  Laughing about gaming the system adds insult to injury.  But gaming the system is being taught to graduate students early in their careers (it's called 'grantsmanship').


We have personally encountered this sort of attitude, expressed only in private of course, again and again in the last couple of decades during which big studies and genetic studies have become the standard operating mode in universities, especially biomedical science (it's rife in other areas like space research, too, of course).  


There's no bitter personal axe being ground here.  I've retired, had plenty of funding through the laboratory years, and our work was published and recognized.  The problem is one of science, not a personal one.  The challenge to understand genetics, development, causation and so forth is manifestly not an easy one, or these issues would not have arisen.

It's only human, perhaps, given that the last couple of generations of scientists systematically built up an inflated research community, and the industries that serve it, much of which depends on research grant funding, largely at the public trough, with jobs and labs at stake.  The members of the profession know this, but are perhaps too deeply immersed to do anything major to change it, unless some sort of crisis forces that upon us. People well-heeled in the system don't like these thoughts being expressed, but all but the proverbial 1%-ers, cruising along just fine in elite schools with political clout and resources, know there's a problem and know they dare not say too much about it.


The statistical issues are not the cause.  The problem is a combination of the complexity of biological organisms as they have evolved, and the simplicity of human desires to understand (and not to get disease).  We are pressured not just to understand, but to translate that into dramatically better public and individual health.  Sometimes it works very well, but we naturally press the boundaries, as science should.  But in our current system we can't afford to be patient.  So, we're on a treadmill, but it's largely a treadmill of our own making.

Wednesday, March 23, 2016

Playing the Big Fiddle while Rome burns?

We seem to have forgotten the trust-busting era that was necessary to control monopolistic acquisition of resources.  That was over a century ago, and now we're again allowing already huge companies to merge and coalesce.  It's rationalized in various ways, naturally, by those who stand to gain.  It's the spirit and the power structure of our times, for whatever reason.  Maybe that explains why the same thing is happening in science as universities coo over their adoption of 'the business model'.

We're inundated with jargonized ways of advertising to co-opt research resources, with our 'omics' and 'Big Data' labeling.  Like it or not, this is how the system is working in our media and self-promotional age.  One is tempted to say that, as with old Nero, it may take a catastrophic fire to force us to change.  Unfortunately, that imagery is apparently quite wrong.  There were no fiddles in Nero's time, and if he did anything about the fire it was to help sponsor various relief efforts for those harmed by it.  But whatever imagery you want, our current obsession with scaling up to find more and more that explains less and less is obvious. Every generation has its resource competition games, always labeled as for some greater good, and this is how our particular game is played.  But there is a fire starting, and at least some have begun smelling the smoke.

Nero plucks away.  Source: Wikipedia images, public domain
The smolder threatens to become an urgent fire, truly, and not just as a branding exercise.  It is a problem recognized not just by nay-saying cranks like us who object to how money is being burnt to support fiddling with more-of-the-same-not-much-new research.  It is an area where a major application of funds could have enormously positive impact on millions of people, and where causation seems to be quite tractable and understandable enough that you could even find it with a slide rule.

We refer to the serious, perhaps acute, problem of antibiotic resistance.  Different bugs are being discovered to be major threats, or to have evolved to become so, both for us and for the plants and animals who sacrifice their lives to feed us. Normal evolutionary dynamics, complemented by our agricultural practices, our population density and movement, and perhaps other aspects of our changing of local ecologies, are opening space for the spread of new or newly resistant pathogens.

This is a legitimate and perhaps imminent threat of a potentially catastrophic scale.  Such language is not an exercise in self-promotional rhetoric by those warning us of the problem. There is plenty of evidence that epidemic or even potentially pandemic shadows loom.  Ebola, zika, MRSA, persistent evolving malaria, and more should make the point and we have history to show that epidemic catastrophes can be very real indeed.

Addressing this problem rather than a lot of the wheel-spinning, money-burning activities now afoot in the medical sciences would be where properly constrained research warrants public investment.  The problem involves the ecology of the pathogens, our vulnerabilities as hosts, weaknesses in the current science, and problems in the economics of such things as antibacterial drugs or vaccinations.  These problems are tractable, with potentially huge benefit.

For a quick discussion, here is a link to an episode of the statistical watchdog BBC Radio program More Or Less on antibiotic resistance.  Of course there are many other papers and discussions as well.  We're caught between urgently increasing need, and the logistics, ecology, and economics that threaten to make the problem resistant to any easy fixes.

There's plenty of productive science that can be done that is targeted to individual causes that merit our attention, and for which technical solutions of the kind humans are so good at might be possible. We shouldn't wait to take antibiotic resistance seriously, but clearing away the logjam of resource commitments in genetic and epidemiological research, commitments to large but statistically weak efforts well into diminishing returns, or to research based on rosy promises where we know there are few flowers, will not be easy...but we are in danger of fiddling around detecting risk factors with ever-decreasing effect sizes until the fire spreads to our doorsteps.

Tuesday, March 22, 2016

The statistics of Promissory Science. Part II: The problem may be much deeper than acknowledged

Yesterday, I discussed current issues related to statistical studies of things like genetic or other disease risk factors.  Recent discussion has criticized the misuse of statistical methods, including a statement on p-values by the American Statistical Association.  As many have said, the over-reliance on p-values can give a misleading sense that significance means importance of a tested risk factor.  Many touted claims are not replicated in subsequent studies, and analysis has shown this may apply preferentially to the 'major' journals.  Critics have suggested that p-values not be reported at all, or reported only if other information like confidence intervals (CIs) and risk factor effect sizes is included (I would say prominently included). Strict adherence will likely undermine what even expensive major studies can claim to have found, and it will become clear that many purported genetic, dietary, etc., risk factors are trivial, unimportant, or largely uninformative.

However, today I want to go farther, and question whether even making these correctives doesn't go far enough, and would perhaps serve as a convenient smokescreen for far more serious implications of the same issue. There is reason to believe the problem with statistical studies is more fundamental and broad than has been acknowledged.

Is reporting p-values really the problem?
Yesterday I said that statistical inference is only as good as the correspondence between the mathematical assumptions of the methods and what is being tested in the real world.  I think the issues at stake rest on a deep disparity between them.  Worse, we don't and often cannot know which assumptions are violated, or how seriously.  We can make guesses and do all auxiliary tests and the like, but as decades of experience in the social, behavioral, biomedical, epidemiological, and even evolutionary and ecological worlds show us, we typically have no serious way to check these things.

The problem is not just that significance is not the same as importance. A somewhat different problem with standard p-value cutoff criteria is that many of the studies in question involve many test variables, such as complex epidemiological investigations based on long questionnaires, or genomewide association studies (GWAS) of disease. Normally, p=0.05 means that by chance one test in 20 will seem to be significant, even if there's nothing causal going on in the data (e.g., if no genetic variant actually contributes to the trait).  If you do hundreds or even many thousands of 0.05 tests (e.g., of sequence variants across the genome), even if some of the variables really are causative, you'll get so many false positive results that follow-up will be impossible.  A standard way to avoid that is to correct for multiple testing by using only p-values that would be achieved by chance only once in 20 times of doing a whole multivariable (e.g., whole genome) scan.  That is a good, conservative approach, but means that to avoid a litter of weak, false positives, you only claim those 'hits' that pass that standard.
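
To make the multiple-testing arithmetic concrete, here is a minimal simulation sketch (in Python; the sample sizes and the 10,000-test scan are hypothetical numbers chosen only for illustration, not from any real study). With nothing causal in the data at all, a scan at p < 0.05 still throws off hundreds of 'hits', while a Bonferroni-style study-wide correction removes essentially all of them:

```
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n_tests = 10_000                      # e.g., variants scanned across a genome
n_cases = n_controls = 1_000

# Pure-null scenario: no variant actually affects the trait at all.
p_values = np.empty(n_tests)
for i in range(n_tests):
    cases = rng.normal(0, 1, n_cases)        # trait values in cases
    controls = rng.normal(0, 1, n_controls)  # same distribution in controls
    p_values[i] = stats.ttest_ind(cases, controls).pvalue

print("hits at p < 0.05:", int((p_values < 0.05).sum()))   # roughly 500 false positives
print("hits after Bonferroni correction (p < 0.05 / n_tests):",
      int((p_values < 0.05 / n_tests).sum()))               # usually zero
```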

You know you're only accounting for a fraction of the truly causal elements you're searching for; the rest are the litter of weakly associated variables that you're willing to ignore in order to identify the most likely true ones.  This is good conservative science, but if your problem is to understand the beach, you are forced to ignore all the sand, though you know it's there.  The beach cannot really be understood by noting its few detectable big stones.

Sandy beach; Wikipedia, Lewis Clark

But even this sensible play-it-conservative strategy has deeper problems.

How 'accurate' are even these preferred estimates?
The metrics like CIs and effect sizes that critics are properly insisting be (clearly) presented along with or instead of p-values face exactly the same issues as the p-value: the degree to which what is modeled fits the underlying mathematical assumptions on which test statistics rest.

To illustrate this point, the Pythagorean Theorem in plane geometry applies exactly and universally to right triangles. But in the real world there are no right triangles!  There are approximations to right triangles, and the value of the Theorem is that the more carefully we construct our triangle the closer the square of the hypotenuse is to the sum of the squares of the other sides.  If your result doesn't fit, then you know something is wrong and you have ideas of what to check (e.g., you might be on a curved surface).

Right triangle; Wikipedia

In our statistical study case, knowing an estimated effect size and how unusual it is seems to be meaningful, but we should ask how accurate these estimates are.  But that question often has almost no testable meaning: accurate relative to what?  If we were testing a truth derived from a rigorous causal theory, we could ask by how many decimal places our answers differ from that truth.  We could replicate samples and increase accuracy, because the signal to noise ratio would systematically improve.  Were that to fail, we would know something was amiss, in our theory or our instrumentation, and have ideas how to find out what that was.  But we are far, indeed unknowably far, from that situation.  That is because we don't have such an externally derived theory, no analog to the Pythagorean Theorem, in important areas where statistical study techniques are being used.

In the absence of adequate theory, we have to concoct a kind of data that rests almost entirely on internal comparison to reveal whether 'something' of interest (often something we don't or cannot specify) is going on.  We compare data such as cases vs controls, which forces us to make statistical assumptions such as that, other than (say) exposure to coffee, our samples of diseased and normal subjects differ only in their coffee consumption, or that the distribution of variation in unmeasured variables is random with regard to coffee consumption among our case and control subjects. This is one reason, for example, that even statistically significant correlation does not imply causation or importance. The underlying, often unstated assumptions are often impossible to evaluate. The same problem relates to replicability: for example, in genetics, you can't assume that some other population is the same as the population you first studied.  Failure to replicate in this situation does not undermine a first positive study.  For example, a result of a genetic study in Finland cannot be replicated properly elsewhere because there's only one Finland!  Even another study sample within Finland won't necessarily replicate the original sample.  In my opinion, the need for internally based comparison is the core problem, and a major reason why theory-poor fields often do so poorly.
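
As a toy illustration of how fragile those internal-comparison assumptions can be, here is a minimal sketch (Python; the 'smoking' confounder, the effect sizes, and the sample size are invented for illustration, not from any real study). Coffee does nothing at all here, yet an unmeasured variable that drives both coffee drinking and disease produces a 'significant' coffee-disease association:

```
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical unmeasured confounder ('smoking') that raises both
# coffee drinking and disease risk; coffee itself has no effect here.
smoking = rng.binomial(1, 0.3, n)
coffee = rng.binomial(1, 0.2 + 0.4 * smoking)      # smokers drink more coffee
disease = rng.binomial(1, 0.05 + 0.10 * smoking)   # smoking raises risk; coffee does not

# Naive case-control style comparison of coffee drinkers vs non-drinkers.
table = [[np.sum((coffee == 1) & (disease == 1)), np.sum((coffee == 1) & (disease == 0))],
         [np.sum((coffee == 0) & (disease == 1)), np.sum((coffee == 0) & (disease == 0))]]
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"apparent coffee-disease association: p = {p:.3g}")  # typically 'significant'
```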

The problem is subtle
When we compare cases and controls and insist on a study-wide 5% significance level to avoid a slew of false-positive associations, we know we're being conservative as described above, but at least those variables that do pass the adjusted test criterion are really causal with their effect strengths accurately estimated.  Right?  No!

When you do gobs of tests, some very weak causal factor may by good luck pass your test. But of those many contributing causal factors, the estimated effect size of the lucky one that passes the conservative test is something of a fluke.  The estimated effect size may well be inflated, as experience in follow-up studies often or even typically shows.
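
This inflation, often called the winner's curse, is easy to reproduce in a sketch. The following simulation (Python; the effect size, standard error, and number of factors are made-up illustrative values) gives every factor the same small true effect, keeps only those that clear a stringent study-wide threshold, and shows that the surviving 'hits' report estimated effects roughly double the truth:

```
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_factors = 5_000       # hypothetical number of weak contributing factors tested
true_effect = 0.05      # every factor has the same small, real effect
se = 0.022              # sampling error of each estimate (made-up, same for all)

# Each factor's estimated effect is the truth plus sampling noise.
estimates = rng.normal(true_effect, se, n_factors)
p = 2 * stats.norm.sf(np.abs(estimates) / se)

# Keep only 'hits' that survive a stringent study-wide (Bonferroni-style) threshold.
hits = estimates[p < 0.05 / n_factors]
print("true effect:           ", true_effect)
print("number of hits:        ", hits.size)
print("mean effect among hits:", round(hits.mean(), 3) if hits.size else "no hits")
```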

In this sense it's not just p-values that are the problem, and providing ancillary values like CIs and effect sizes in study reports is something of a false pretense of openness, because all of these values are vulnerable to similar problems.  The promise to require these other data is a stopgap, or even a strategy to avoid adequate scrutiny of the statistical inference enterprise itself.

It is nobody's fault if we don't have adequate theory.  The fault, dear Brutus, is in ourselves, for using Promissory Science, and feigning far deeper knowledge than we actually have.  We do that rather than come clean about the seriousness of the problems.  Perhaps we are reaching a point where the let-down from over-claiming is so common that the secret can't be kept in the bag, and the paying public may get restless.  Leaking out a few bits of recognition and promising reform is very different from letting it all out and facing the problem bluntly and directly.  The core problem is not whether a reported association is strong or meaningful, but, more importantly, that we don't know, or don't know how to know.

This can be seen in a different way.   If all studies including negative ones were reported in the literature, then it would be only right that the major journals should carry those findings that are most likely true, positive, and important.  That's the actionable knowledge we want, and a top journal is where the most important results should appear.  But the first occurrence of a finding, even if it turns out later to be a lucky fluke, is after all a new finding!  So shouldn't investigators report it, even though lots of other similar studies haven't yet been done?  That could take many years or, as in the example of Finnish studies, be impossible.  We should expect negative results to be far more numerous and less interesting in themselves if we just tested every variable we could think of willy-nilly, but in fact we usually have at least some reason to look, so it is far from clear what fraction of negative results would undermine the traditional way of doing business.  Should we wait for years before publishing anything? That's not realistic.

If the big-name journals are still seen as the place to publish, and their every press conference and issue announcement is covered by the splashy press, why should they change?  Investigators may feel that if they don't stretch things to get into these journals, or just publish negative results, they'll be thought to have wasted their time or done poorly designed studies.  Besides normal human vanity, the risk is that they will not be able to get grants or tenure.  That feeling is the fault of the research, reputation, university, and granting systems, not the investigator.  Everyone knows the game we're playing. As it is, investigators and their labs have champagne celebrations when they get a paper in one of these journals, like winning a yacht race, which is a reflection of what one could call the bourgeois nature of the profession these days.

How serious is the problem?  Is it appropriate to characterize what's going on as fraud, hoax, or silent conspiracy?  Probably in some senses yes; at least there is certainly culpability among those who do understand the epistemological nature of statistics and their application.  Plow ahead anyway is not a legitimate response to fundamental problems.

When reality is closely enough approximated by statistical assumptions, causation can be identified, and we don't need to worry about the details.  Many biomedical and genetic, and probably even some sociological problems are like that.  The methods work very well in those cases.  But this doesn't gainsay the accusation that there is widespread over-claiming taking place and that the problem is a deep lack of sufficient theoretical understanding of our fields of interest, and a rush to do more of the same year after year.

It's all understandable, but it needs fixing.  To be properly addressed, an entrenched problem requires more criticism even than this one has been getting recently.  Until better approaches come along, we will continue wasting a lot of money in the rather socialistic support of research establishments that keep on doing science that has well-known problems.

Or maybe the problem isn't the statistics, after all?
The world really does, after all, seem to involve causation and at its basis seems to be law-like. There is truth to be discovered.  We know this because when causation is simple or strong enough to be really important, anyone can find it, so to speak, without big samples or costly gear and software. Under those conditions, numerous details that modify the effect are minor by comparison to the major signals.  Hundreds or even thousands of clear, mainly single-gene based disorders are known, for example.  What is needed is remediation, hard-core engineering to do something about the known causation.

However, these are not the areas where the p-value and related problems have arisen.  Those arise when very large and SASsy studies seem to be needed, and the reason is that the causal factors are weak and/or complex.  Along with trying to root out misrepresentation and failure to report the truth adequately, we should ask whether, perhaps, the results showing frustrating complexity are correct.

Maybe there is not a need for better theory after all.  In a sense the defining aspect of life is that it evolves not by the application of external forces as in physics, but by internal comparison--which is just what survey methods assess.  Life is the result of billions of years of differential reproduction, by chance and various forms of selection--that is, continual relative comparison by local natural circumstances.  'Differential' is the key word here.  It is the relative success among peers today that determines the genomes and their effects that will be here tomorrow.  In a way, in effect and if often unwittingly and for lack of better ideas, that's just the sort of comparison made in statistical studies.

From that point of view, the problem is that we don't want to face up to the resulting truth, which is that a plethora of changeable, individually trivial causal factors is what we find because that's what exists.  That we don't like that, don't report it cleanly, and want strong individual causation is our problem, not Nature's.

Wednesday, March 16, 2016

The statistics of Promissory Science. Part I: Making non-sense with statistical methods

Statistics is a form of mathematics, a way devised by humans for representing abstract relationships. Mathematics comprises axiomatic systems, which make assumptions about basic units such as numbers; basic relationships like adding and subtracting; and rules of inference (deductive logic); and then elaborates these to draw conclusions that are typically too intricate to reason out in other less formal ways.  Mathematics is an awesomely powerful way of doing this abstract mental reasoning, but when applied to the real world it is only as true or precise as the correspondence between its assumptions and real-world entities or relationships. When that correspondence is high, mathematics is very precise indeed, a strong testament to the true orderliness of Nature.  But when the correspondence is not good, mathematical applications verge on fiction, and this occurs in many important applied areas of probability and statistics.

You can't drive without a license, but anyone with R or SAS can be a push-button scientist.  Anybody with a keyboard and some survey generating software can monkey around with asking people a bunch of questions and then 'analyze' the results. You can construct a complex, long, intricate, jargon-dense, expansive survey. You then choose who to subject to the survey--your 'sample'.  You can grace the results with the term 'data', implying true representation of the world, and be off and running.  Sample and survey designers may be intelligent, skilled, well-trained in survey design, and of wholly noble intent.  There's only one little problem: if the empirical fit is poor, much of what you do will be non-sense (and some of it nonsense).

Population sciences, including biomedical, evolutionary, social and political fields, are experiencing an increasingly widely recognized crisis of credibility.  The fault is not in the statistical methods on which these fields heavily depend, but in the degree of fit (or not) to the assumptions--with the emphasis these days on the 'or not', and a frequent dismissal of the underlying issues in favor of a patina of technical, formalized results.  Every capable statistician knows this, but of course might be out of business if openly paying it enough attention. And many statisticians may be rather uninterested in, or too foggy about, the philosophy of science to understand what goes beyond the methodological technicalities.  Jobs and journals depend on not being too self-critical.  And therein lie rather serious problems.

Promissory science
There is the problem of the problems--the problems we want to solve, such as in understanding the cause of disease so that we can do something about it.  When causal factors fit the assumptions, statistical or survey study methods work very well.  But when causation is far from fitting the assumptions, the impulse of the professional community seems mainly to increase the size, scale, cost, and duration of studies, rather than to slow down and rethink the question itself.  There may be plenty of careful attention paid to refining statistical design, but basically this stays safely within the boundaries of current methods and beliefs, and the need for research continuity.  It may be very understandable, because one can't just quickly uproot everything or order up deep new insights.  But it may be viewed as abuse of public trust as well as of the science itself.

The BBC Radio 4 program called More Or Less keeps a watchful eye on sociopolitical and scientific statistical claims, revealing what is really known (or not) about them.  Here is a recent installment on the efficacy (or believability, or neither) of dietary surveys.  And here is a FiveThirtyEight link to what was the basis of the podcast.

The promotion of statistical survey studies to assert fundamental discovery has been referred to as 'promissory science'.  We are barraged daily with promises that if we just invest in this or that Big Data study, we will put an end to all human ills.  It's a strategy, a tactic, and at least the top investigators are very well aware of it.  Big long-term studies are a way to secure reliable funding and to defer delivering on promises into the vague future.  The funding agencies, wanting to seem prudent and responsible to taxpayers with their resources, demand some 'societal impact' section on grant applications.  But there is in fact little if any accountability in this regard, so one can say they are essentially bureaucratic window-dressing exercises.

Promissory science is an old game, practiced since time immemorial by preachers.  It boils down to promising future bliss if you'll just pay up now.  We needn't be (totally) cynical about this.  When we set up a system that depends on public decisions about resources, we will get what we've got.  But having said that, let's take a look at what is a growing recognition of the problem, and some suggestions as to how to fix it--and whether even these are really the Emperor of promissory science dressed in less gaudy clothing.

A growing at least partial awareness
The problem of results that are announced by the media, journals, universities, and so on but that don't deliver the advertised promises is complex but widespread.  In part because research has become so costly, some warning sirens are sounding when it becomes clear that the promised goods are not being delivered.

One widely known issue is the lack of reporting of negative results, or their burial in minor journals. Drug-testing research is notorious for this under-reporting.  It's too bad because a negative result on a well-designed test is legitimately valuable and informative.  A concern, besides corporate secretiveness, is that if the cost is high, taxpayers or share-holders may tire of funding yet more negative studies.  Among other efforts, including by NIH, there is a formal attempt called AllTrials to rectify the under-reporting of drug trials, and this does seem at least to be thriving and growing if incomplete and not enforceable.  But this non-reporting problem has been written about so much that we won't deal with it here.

Instead, there is a different sort of problem.  The American Statistical Association has recently noted an important issue, which is the use and (often) misuse of p-values to support claims of identified  causation (we've written several posts in the past about these issues; search on 'p-value' if you're interested, and the post by Jim Wood is especially pertinent).  FiveThirtyEight has a good discussion of the p-value statement.

The usual interpretation is that p represents the probability that, if there is in fact no causation by the test variable, its apparent effect arose just by chance.  So if the observed p in a study is less than some arbitrary cutoff, such as 0.05, it means essentially that if no causation were involved, the chance you'd see an association at least this strong anyway is no greater than 5%; that is, there is some evidence for a causal connection.
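
One way to make that interpretation concrete is a permutation version of the test, which asks the question directly rather than through a parametric formula. In this minimal sketch (Python; the two toy groups, their sizes, and the built-in difference are invented for illustration), we count how often a random relabeling of 'exposed' and 'unexposed' subjects produces a difference at least as large as the one observed:

```
import numpy as np

rng = np.random.default_rng(3)

# Toy data: an outcome measured in 'exposed' vs 'unexposed' subjects,
# with a small real difference built in.
exposed = rng.normal(0.3, 1.0, 50)
unexposed = rng.normal(0.0, 1.0, 50)
observed_diff = exposed.mean() - unexposed.mean()

# If exposure were causally irrelevant, the group labels would be arbitrary,
# so ask: how often does random relabeling give a difference at least this big?
pooled = np.concatenate([exposed, unexposed])
n_perm = 10_000
count = 0
for _ in range(n_perm):
    rng.shuffle(pooled)
    diff = pooled[:50].mean() - pooled[50:].mean()
    if abs(diff) >= abs(observed_diff):
        count += 1

print("observed difference:", round(observed_diff, 3))
print("permutation p-value:", count / n_perm)
```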

Trashing p-values is becoming a new cottage industry!  Now JAMA is on the bandwagon, with an article showing, in a survey of biomedical literature from the past 25 years covering well over a million papers, that a far disproportionate and increasing number of studies reported statistically significant results.  Here is the study on the JAMA web page, though it is not public domain yet.

Besides the apparent reporting bias, the JAMA study found that those papers generally failed to flesh out that result adequately.  Where are all the negative studies that statistical principles suggest should be found?  We don't see them, especially in the 'major' journals, as has been noted many times in recent years.  Just as importantly, authors often did not report confidence intervals or other measures of the degree of 'convincingness' that might illuminate the p-value. In a sense that means authors didn't say what range of effects is consistent with the data.  They reported a non-random effect, but often didn't give the effect size, that is, say how large the effect was even assuming that effect was unusual enough to support a causal explanation. So, for example, a statistically significant increase of risk from 1% to 1.01% is trivial, even if one could accept all the assumptions of the sampling and analysis.
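
To see how a trivial effect can still come out 'significant', here is a back-of-the-envelope sketch (Python; the ten-million-subject cohort and the observed 1.00% vs 1.01% risks are hypothetical numbers chosen only to mirror the example above), using the normal approximation for the difference of two proportions:

```
import numpy as np
from scipy import stats

# Hypothetical huge cohort: observed baseline risk 1.00%, 'exposed' risk 1.01%.
n = 10_000_000                          # subjects per group (made-up number)
p_exposed, p_unexposed = 0.0101, 0.0100

diff = p_exposed - p_unexposed
se = np.sqrt(p_exposed * (1 - p_exposed) / n + p_unexposed * (1 - p_unexposed) / n)
z = diff / se
p_value = 2 * stats.norm.sf(abs(z))
ci_low, ci_high = diff - 1.96 * se, diff + 1.96 * se

print(f"risk difference: {diff:.5f} (95% CI {ci_low:.5f} to {ci_high:.5f})")
print(f"p-value: {p_value:.3f}")        # 'significant', yet the effect is negligible
```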

Another vocal critic of what's afoot is John Ioannidis; in a recent article he levels both barrels against the misuse and mis- or over-representation of statistical results in biomedical sciences, including meta-analysis (the pooling of many diverse small studies into a single large analysis to gain sufficient statistical power to detect effects and test for their consistency).  This paper is a rant, but a well-deserved one, about how 'evidence-based' medicine has been 'hijacked', as he puts it.  The same must be said of 'precision genomic' or 'personalized' medicine, or 'Big Data', and other sorts of imitative sloganeering going on from many quarters who obviously see this sort of promissory science as what you have to do to get major funding.  We have set ourselves a professional trap, and it's hard to escape.  For example, the same author has been leading the charge against misrepresentative statistics for many years, and he and others have shown that the 'major' journals have in a sense the least reliable results in terms of their replicability.  But he's been raising these points in the same journals that he shows are culpable of the problem, rather than boycotting those journals.  We're in a trap!

These critiques of current statistical practice are the points getting most of the ink and e-ink.  There may be a lot of cover-ups of known issues, and even hypocrisy, in all of this, and perhaps more open or understandable tacit avoidance.  The industry (e.g., drug, statistics, and research equipment) has a vested interest in keeping the motor running.  Authors need to keep their careers on track.  And, in the fairest and non-political sense, the problems are severe.

But while these issues are real and must be openly addressed, I think the problems are much deeper. In a nutshell, I think they relate to the nature of mathematics relative to the real world, and the nature and importance of theory in science.  We'll discuss this tomorrow.

Tuesday, March 15, 2016

Obesity and diabetes: Actual epigenetics or just IVF?

This press release that appeared in my newsfeed titled "You are what your parents ate!" caught my eye because I'm a new mom of a new human and also because I study and teach human evolution.

So I clicked on it.

And after that title primed me to think about me!, the photo further encouraged my assumption that this is really all about humans.


"You are what your parents ate!"

But it's about mice. Yes, evolution, I know, I know. We share common ancestry with mice which is why they can be good experimental models for understanding our own biology. But we have been evolving separately from mice for a combined total of over 100 million years. Evolution means we're similar, yes, but evolution also means we're different.

Bah. It's still fascinating, mice or men, womice or women! So I kept reading and learned how new mice made with IVF--that is, made of eggs and sperm from lab-induced obese and diabetic mouse parents, but born of healthy moms--inherited the metabolic troubles of their biological parents. And by inherited, we're not talking genetically, because these phenotypes are lab-induced. We're talking epigenetically. So the eggs and sperm did it, but not the genomes they carry!

This isn't so surprising if you've been following the burgeoning field of epigenetics, but it's hard to look away. This fits with how we see secular increases in human obesity and adult-onset diabetes--it can't be genomic evolution, it must be epigenetic evolution, whatever that means!

As the press release says...
"From the perspective of basic research, this study is so important because it proves for the first time that an acquired metabolic disorder can be passed on epigenetically to the offspring via oocytes and sperm- similar to the ideas of Lamarck and Darwin," said Professor ...
Whole new ways of thinking are so exciting.

Except when you remember a two-year-old piece by Bethany Brookshire (because you use it to teach a course on sex and reproduction) which explained something that suggests we may have a major experimental problem with the study above.

In IVF, the sperm gets isolated (or "washed") from the semen.

You know what happens, to mice in particular, when there's no semen? Obesity and other symptoms of metabolic syndrome! There are placental differences too. This was published in PNAS.


"Offspring of male mice without seminal fluid had bigger placentas (top right) and increased body fat (bottom right) compared with offspring of normal male mice (left images)" from The fluid part of semen plays a seminal role by Bethany Brookshire.

So I went back to look at the original paper that the press release with the donut lady was about. I wanted to see if they are aware of this potential problem with IVF and whether it explains their findings, rather than the trendy concept of epigenetics...

So even though they titled it "Epigenetic germline inheritance of diet-induced obesity and insulin resistance," I wanted to see if they at least accounted for this trouble with semen, like how it's probably important, how its absence may bring about the same phenotypes they're tracking, and how IVF doesn't use semen.

But I don't have access to Nature Genetics.

Who has access to Nature Genetics, can check out the paper, and wants to write the ending of this blog post?

Step right up! Post your work in the comments (or email me holly_dunsworth@uri.edu, and please include a pdf of the paper so I can see too) and I'll paste it right here.

Update 12:19 pm
Two very good comments below are helpful. Please read those.

I'll add that I now have the pdf of the paper (but not the Supplemental portion where all the methods live and other important information resides). This quote from the second paragraph implies they do not agree with the finding of (or have forgotten about) the phenotypic variation apparently caused by sperm washed of their seminal fluid:
"The use of IVF enabled us to ensure that any inherited phenotype was exclusively transmitted via gametes."
As the second commenter (Anonymous) pointed out below, there does not appear to be a comparison of development or behavior between any of the IVF mice and mice made by mouse sex. So there is no way to tell whether their IVF mice exhibit the same metabolic changes that the semen/semenless study found. Therefore, it is neither possible to work the semen issue into the explanation nor to rule out its effects. Seems like a missed opportunity.

Completely unrelated and inescapable... I'm a little curious about how the authors decided to visualize their data like this: