33 Comments
Cool Librarian:

This makes me think about science in general, in which researchers are encouraged to generate theories that contain lots of complicated factors, but may not lead to results outside the lab. My cognitive distortions are telling me that this means that all researchers are basically overanalyzing the data, with the same fervor as severely anxious people do, to uncover the results that confirm their biases toward getting a particular outcome. This makes me question the reliability and usefulness of theoretical work in general, but what do you think?

Tommy Blanchard:

I think there's something to the idea that science has picked the low-hanging fruit, making further important findings more difficult and scarce. So the value of scientific findings might have an inverse power law thing going on. When we knew nothing, figuring out that a lot of illnesses are caused by germs was a big step forward.

I don't think this is reason to think the data is being overanalyzed, though! A lot of the world is just complicated, and to understand with increasing nuance we need increasingly complex models and theories.

Cool Librarian:

But how do you know if a model is too complex and you need to remove redundancies? Doesn’t Occam’s razor apply here?

Tommy Blanchard:

If we're talking about an actual statistical or machine learning model, there are standard tests to see if additional variables are adding explanatory power or not--like the Bayesian Information Criterion, or using cross-validation to check for overfitting. All else equal you should prefer a simple model, but you can just check whether a more complicated model with additional variables fits new data better.
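
For a concrete (if toy) version of that check, here's a minimal sketch in Python -- the synthetic data, the "simple vs complex" split, and the hand-rolled BIC formula are my own illustration, not anything from a real analysis:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 5))                 # five candidate predictors
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=1.0, size=n)  # only two matter

candidates = {"simple": X[:, :2], "complex": X}  # without vs. with the extra variables

for name, Xm in candidates.items():
    # Cross-validated R^2: does the extra complexity generalize to held-out data?
    cv_r2 = cross_val_score(LinearRegression(), Xm, y, cv=5, scoring="r2").mean()

    # BIC for a Gaussian linear model: n*log(RSS/n) + k*log(n); lower is better.
    fit = LinearRegression().fit(Xm, y)
    rss = np.sum((y - fit.predict(Xm)) ** 2)
    k = Xm.shape[1] + 1                      # coefficients plus intercept
    bic = n * np.log(rss / n) + k * np.log(n)

    print(f"{name}: mean CV R^2 = {cv_r2:.3f}, BIC = {bic:.1f}")
```

Here the three noise-only predictors shouldn't improve held-out R^2, and the BIC penalty should flag them as not worth keeping.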

John:

Paul Bloom wrote about this recently though not on this topic specifically.

https://open.substack.com/pub/paulbloom/p/its-fine-if-your-studies-dont-work

Ruv Draba:

CL, the rate at which institutions produce academics is tied to the availability of money to fund them, and not the availability of useful science -- and the principal products of academics are cost and papers rather than (say) novel and economically successful patents.

Further, the main gateway of scientific publication is peer review. While that does detect bias in data, it doesn't do such a good job of assessing the tractability of the question, or the public benefit of a prospective answer if such an answer were ever found.

So while papers are burgeoning, there's extensive evidence that the relevance of scientific output is falling -- not so much the accuracy (which remains high) as the disruptive insights that we'd embrace economically and socially because of their benefits.

However, this problem isn't just with theoretical or so-called 'pure' science; it shows up in applied science among academics too. Shifting more funding from pure to applied research alone is unlikely to fix it. To put it bluntly: academic scientists are picking stupider problems with each generation, and aren't being penalised for doing so.

Science communicator Sabine Hossenfelder covers the question here (https://www.youtube.com/watch?v=QtxjatbVb7M). She's very critical of what she calls 'bullshit science', and the same concern saw me leave academe for industry some decades ago.

Cool Librarian:

What do you mean by accuracy? Do you mean the degree of precision of the research tools used?

Ruv Draba:

CL, let's define 'understanding' to be, at minimum, our ability to correctly predict what's going to happen. (You can define it to mean more than this, but it can't mean less than this.) Science then is our attempt to understand whatever we can observe.

Understanding can be measured in precision and accuracy. Precision means how clearly we can explain what's going to happen; accuracy means how correct we are when we observe it.

Precision and accuracy are built on standards and measures. The idea is that over time these become more exacting, rather than less. This allows, for example, weather forecasting to become accurate over longer lead times (in the late 20th century you could get accurate weather forecasts three days out; now forecasts can be accurate up to ten days out).
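
If it helps to see those two words in numbers, here's a toy sketch in the narrower measurement sense of the terms (accuracy as how close forecasts are to what actually happened, precision as how tightly they cluster). The temperatures are invented purely for illustration:

```python
import numpy as np

observed = np.array([21.0, 19.5, 23.0, 22.0, 20.5])   # what actually happened (made up)
forecast = np.array([22.5, 20.0, 24.5, 23.0, 22.5])   # what was predicted (made up)

errors = forecast - observed
bias = errors.mean()     # accuracy: a systematic miss (about +1.3 degrees too warm here)
spread = errors.std()    # precision: how consistently it misses by that amount

print(f"bias (accuracy): {bias:+.2f} C, spread (precision): {spread:.2f} C")
```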

You correctly said that humans can have biases toward finding evidence that supports what they already believe (we're notorious for it.) There are multiple forms of this bias but a common one is 'confirmation bias': we rapidly recognise supporting evidence, but are slower to recognise when we're wrong.

In the sciences, peer review is very good at detecting confirmation bias, and the methods of science itself are very good at preventing this from occurring (though it still sometimes occurs in papers.) Unfortunately, keeping the biases out of precision and accuracy doesn't help with productivity when a growing number of scientists spend their days trying to understand insignificant, unhelpful or intractable things.

I hope this may help. I'm happy to give more info on it if you're interested.

Cool Librarian:

What is your definition of what is insignificant, unhelpful or intractable? You said scientists are picking stupider problems to study, which is an imprecise and frankly cynical take on scientific innovation. Or maybe you are referring to statistical significance?

Ruv Draba:

> What is your definition of what is insignificant, unhelpful or intractable?

My definitions are probably the ones you'd expect, CL. What I think you're looking for is some indication of the measures, so here they are.

We can measure the significance of a scientific result in arrears by looking at how a population responds once the result is published. There are multiple possible measures for this and I can list them, but the major measures show that while the volume of scientific papers increases, the population response remains static -- which means that the significance of scientific output, per paper, is falling.

We can measure 'helpful' as the benefit from population response according to needs that we recognise, so again it's measured in arrears. As a single example, the benefits of medical treatments or safety technologies can be measured actuarially as Years of Life Lost (YLL) that have been avoided. YLL is a measure that can apply to diverse calamities, from earthquakes through to road accidents or deaths from disease. There are multiple such measures, but YLL is a handy example of how different beneficial results can be measured in the same way. Generally, 'benefit' is a mixed bag. Some fields are doing amazing things almost routinely; for others it's less so. But the cost is also significant. As you might expect, if population response to scientific publication falls, benefit may fall too.
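
As a toy illustration of how a YLL figure gets put together (the reference age and the ages at death below are invented for the example):

```python
REFERENCE_AGE = 80  # assumed standard life expectancy for the example

ages_at_death = [45, 62, 71, 79, 85]  # hypothetical deaths from some cause

# Each premature death contributes the years it fell short of the reference age.
yll = sum(max(REFERENCE_AGE - age, 0) for age in ages_at_death)
print(f"Years of Life Lost: {yll}")   # 35 + 18 + 9 + 1 + 0 = 63

# A treatment or safety technology that prevented the two earliest deaths would
# avert 35 + 18 = 53 of those years -- one way to put a number on 'benefit'.
```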

Finally, we can measure the tractability of a problem in arrears by how long it takes to produce a target benefit. Please note that scientific inquiry doesn't happen arbitrarily -- there are grant applications, position papers and so on -- so we can tie investigation to the outcomes that it hopes to produce. There are multiple measures of 'how long': elapsed years, staff years, dollars, papers published. We have whole areas of science that claim to pursue key results, are throwing greater and greater resources at the problem, yet have repeatedly underestimated how long that will take. Such measures recognise an intractable problem.

> an imprecise and frankly cynical take on scientific innovation

As I hope you can see, once we have the measures, we can make them precise.

Unfortunately this evaluation isn't cynical, just disagreeable -- these problems are predicted by the economics of how we presently develop scientific capability and how we support, recognise and reward scientific inquiry. The trend is that they're worsening generationally. So while we could choose to ignore it, if we want a different result then we need to reform the structures by which we develop and engage scientific research, which also means reforming our academic institutions.

> maybe you are referring to statistical significance?

No, statistical significance is a different topic, CL. It concerns how much data we need before we can be confident that what we've detected is the effect we set out to find, rather than noise.
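
In practice that 'how much data' question is usually answered with a standard power calculation; here's a minimal sketch (the effect size, alpha and power are conventional illustrative choices, not values from this discussion):

```python
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(
    effect_size=0.5,   # a medium-sized difference between two groups (Cohen's d)
    alpha=0.05,        # conventional false-positive rate
    power=0.8,         # 80% chance of detecting the effect if it's really there
)
print(f"Roughly {n_per_group:.0f} participants per group")  # about 64
```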

Again, I hope that might help.

Cool Librarian:

Are you saying that even as more papers are being published, their results can not be replicated?

The Internet Wife:

i agree, i think there's a simpler way to do this. I'm not academically trained BUT i have studied everything in the article on my own, and i realize building an internal 'science system' that one tests is probably more understandable.

Cool Librarian:

How is this relevant to what I’m saying?

The Internet Wife:

just ignore it…was more a note for myself 😬

Ram:

it's brilliant how you explain stuff so easily. you're a great writer

Tommy Blanchard:

Thanks so much for this, truly. Even when getting views and "likes" on an article, it's hard to tell if I'm really doing anything worthwhile--they're just numbers. Comments like this are what keep me going.

Domenic C. Scarcella:

Didn't know you worked as a data scientist. Cool!

I worked briefly in web development, and had some learning units with data science students in bootcamp. When I saw what the DS folks were working with, I wondered if I should've chosen the DS track instead of WebDev! Despite being a writer, I'm more of a "math guy" and enjoyed using Python for coding algorithms.

Tommy Blanchard:

Yeah Python is fun! My undergrad was in computer science so I always had the coding skills and found programming fun, but the science background gave me stats and ML experience. When I left academia I was surprised these skills were considered valuable--I had no idea.

Dr. Joel M. Hoffman:

Very interesting! I wonder how much overlap there is between this kind of decision paralysis and perfectionism...

https://ancientwisdommodernlives.com/p/monday-motivation-perfectionism-stranglehold

Tommy Blanchard:

I suspect a lot!

Ruv Draba:

Tommy, thank you for this exposition, which I enjoyed immensely. Some additional information follows. I leave it to you and your readers as to whether they're insights. In any case, the critics of Buridan's Ass missed some practical points.

Another rule of thumb is the 'Rule of Three' which says that we can survive three minutes without air, three days without water or three weeks without food.

Which points out something we all know, but don't always consider: the rate of change of constraints alters the dynamics of the problem. So sometimes waiting ('do nothing') helps to resolve a problem that analysis alone can't readily resolve.

For example, even if hunger and thirst were equipoised for Buridan's Ass for an hour, they wouldn't be for a day. Donkeys are warm, medium-sized creatures. In all but the coldest climates, thirst would eventually beat hunger.

Further, donkeys are browsing animals, good at finding food in deserts and able to eat almost anything when there are grasses. Among donkey-keepers, they're known for 'getting fat by breathing air'.

Which also illustrates that the food you've supplied may not be all the food that a donkey could find. The value of a resource is contextualised by its availability, which also changes over time, and in response to effort. So even if the rationale for a decision is ambiguous today, it might not be tomorrow.

And the reason this all matters in data science is that data science itself is often quite slow to produce a sound answer. There's an old joke among data-warehouse developers that you spend 80% of the budget building the repository, and the 'other' 80% cleaning the data...

TL;DR: Sometimes the best answer is just to wait and watch the donkey.

Tommy Blanchard:

Maybe the rule of three is why modern versions usually put it between two equally attractive bales of hay.

Ruv Draba:

Mmm, but in Information Engineering, straw doesn't come in standard bales, but has been blown about, trampled and raked into piles; the piles are never the same size, age, or in the same place, and all are of unknown quality and mixed in with who knows what else.

Which is why information engineers don't go to philosophers for time-critical, budget-limited solutions, but *will* still carefully study the donkey.😏
