Two service announcements before we go into the topic of today's video, how psychology lies to you using statistics.
And so the first service announcement has again, unfortunately, to do with Richard Grannon.
Many of you have written to me to ask about my participation in some kind of product that Grannon has issued or published or is selling or whatever.
I want to clarify, I am not associated directly or indirectly with anything that Richard Grannon does. Definitely not with his products.
Thankfully, he is out of my life for good.
I am not working with him. I am not collaborating with him. Not now, not ever, in any shape, way, manner or form, in any forum, in any setting, in any seminar, ever again.
No Grannon, thank you.
I hope I made myself clear once and for all.
Stop writing to me.
I have no idea what this guy is selling you, nor do I want to be involved in any of his commercial enterprises.
Thank you very much for listening to the first service announcement and now the second one.
A seminar in Poland, Gdansk. Gdansk is a beautiful city in Poland.
At the end of March and beginning of April, if you want to register, I think the cost is about 20 euros just to cover the venue and the recording of the seminar.
So if you want to participate, there is a link in the description.
A link in the description.
Go to the description, click on the link, it will take you to a video by Dalia Zhukorska.
Dalia is the organizer of the seminar.
Write to her, communicate with her, see how you can secure your seat.
The seminar is about recent developments in Cluster B personality disorders and, of course, Cold Therapy.
The latest bleeding-edge, cutting-edge news.
And now let's go straight into lies, damned lies, and statistics.
Anyone who is studying psychology nowadays knows that a huge part of the syllabus, a huge part of the curriculum, revolves around mathematics, and especially statistics.
In their desperate attempt to appear to be scientists, psychologists have adopted mathematics as a way to align themselves with more respectable disciplines such as physics.
Yes, psychology is a pseudoscience, but it's still useful as a narrative of human affairs, the human condition and various states of mind.
Yet psychology uses mathematics, more specifically statistics, in ways which I find to be misleading, pernicious and extremely problematic.
I'm going to identify five issues with the use of statistics in psychology, but trust me, that's just the tip of a deeply submerged iceberg, starting with the fact that the vast majority of psychologists don't know how to use statistics properly.
And this coming from a physicist.
My name is Sam Vaknin. I'm the author of Malignant Self-Love: Narcissism Revisited. And I'm a professor of psychology and a professor of finance in the Center for International Advanced and Professional Studies, the outreach program of the SIAS-CIAPS consortium of universities.
And if that's not respectable enough for you, switch to another channel.
Let's start with the problems in using statistics.
First of all, you have to know what you're doing.
And as I mentioned, very few psychologists actually do.
But even if you know what you're doing and you mastered all the considerable number of techniques available, there is a problem.
The vast majority of psychological studies rely on a tiny sample.
It's not uncommon to see an N, the number of participants, of, let's say, 30 or 20 or even 6.
This creates a problem of normative validation: when your sample is very small, or when the selection of participants in your sample is skewed or wrong, you can't validate the outcome.
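As a rough illustration of why N matters so much, here is a minimal Python sketch, with a purely invented population, showing how wildly the result of a study scatters when N is 6 versus 600:

```python
import math
import random
import statistics

random.seed(42)

def scatter_of_study_means(n, studies=1000, mu=100, sigma=15):
    """Simulate many studies of size n drawn from the same population
    and report how widely their sample means scatter (the standard error)."""
    means = [statistics.mean(random.gauss(mu, sigma) for _ in range(n))
             for _ in range(studies)]
    return statistics.stdev(means)

for n in (6, 30, 600):
    empirical = scatter_of_study_means(n)
    theoretical = 15 / math.sqrt(n)  # the standard error shrinks only as the square root of n
    print(f"N = {n:>3}: study means scatter by about ±{empirical:.1f} (theory: ±{theoretical:.1f})")
```

With N = 6, the "finding" moves by several points from one hypothetical study to the next; with N = 600, it barely moves.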
We leave aside at this stage the problem that the subject matter of psychology is human beings, and human beings are mutable, changeable. It's impossible to replicate a study because a participant may have had a nightmare during the night or may have divorced over the preceding three months. People change, in short. You can't even test the same person twice, because the very act of testing changes the person. It's very reminiscent of the uncertainty principle in quantum physics.
Okay, but let's assume for a minute that by some miracle, there's a sizable sample, let's say 600 people or even 80,000 people.
I know of at least one study with 80,000 people.
Let's assume that the sample is large enough, because when the sample is not large enough, which is the case in 99% of the studies in psychology, the studies are useless.
Moreover, most of these studies cannot be replicated. And this is known as the replication crisis.
So how do you gather a sample?
First of all, you construct a profile: a profile of the cohort, of the population studied, of the demographic. Then you select people to fit the profile, to represent the cohort. This is called a representative sample.
A representative sample is a sample that eliminates possible biases.
And the problem is that small samples cannot accurately represent the entire cohort or group, while assessing large samples is nearly impossible in many cases, owing also to mathematical limitations.
Another problem is the number of potential biases, which is ginormous. It's simply huge.
For example, in market research, in polling, and in many branches of psychology, we have something called stratified random sampling: you break the sample into subgroups and then examine each subgroup separately.
This is, of course, very inaccurate, because any division is arbitrary and the subgroups are often very misleading when it comes to representing the entire group.
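For what it's worth, here is a minimal sketch of stratified random sampling in Python; the population, the age-group strata, and the sample size are all invented for illustration:

```python
import random
from collections import defaultdict

random.seed(0)

# A hypothetical population of 10,000 people, each assigned to an age-group stratum.
population = [{"id": i, "age_group": random.choice(["18-29", "30-49", "50+"])}
              for i in range(10_000)]

def stratified_sample(people, stratum_key, total_n):
    """Draw a sample in which each stratum appears in proportion
    to its share of the overall population."""
    strata = defaultdict(list)
    for person in people:
        strata[person[stratum_key]].append(person)

    sample = []
    for members in strata.values():
        share = len(members) / len(people)        # the stratum's share of the population
        k = round(total_n * share)                # proportional allocation
        sample.extend(random.sample(members, k))  # random draw within the stratum
    return sample

sample = stratified_sample(population, "age_group", total_n=300)
print(len(sample), "participants drawn across",
      len({p["age_group"] for p in sample}), "strata")
```

Note that the arbitrariness is visible right in the code: someone had to decide that age group, and not income or education, is the stratum that matters.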
Consider, for example, conducting a door-to-door survey or poll.
Well, it excludes all employed residents, who are out at work. Call people on their smartphones instead and conduct the poll or survey that way, and you exclude certain poor people who have no smartphones.
If you conduct a poll in the evening, it excludes people who socialize or run errands. It's extremely difficult to get a representative sample.
Indeed, it is safe to say that the vast majority of samples in most psychological studies are either way too small, way too tiny to give us any meaningful answers, or totally non-representative.
And this is only the beginning.
This is your blue professor of psychology leading you deep into the recesses of the fault lines of psychology.
The second problem: only when you identify the biases that limit a specific set of statistics can you determine the data's accuracy.
You need, in other words, to identify these biases and somehow eliminate them.
Consider, for example, the following bias.
People lie, a major bias. People lie. You ask them how many times they have cheated on their intimate partner. They lie.
What is their body count? They lie.
How many sexual partners they have had in the last year? They lie.
How many abortions they have had? They lie.
People lie, and not only about sex and not only about intimate matters. People lie all the time to make themselves look good or in order to compete with others, a process known as relative positioning.
Polls and surveys are extremely sensitive, extremely susceptible to bias and lying. People sometimes give you an answer because they think that you want them to give you this answer.
They avoid unpopular answers. There's a lot of peer pressure. Popular opinion polls are actually pretty useless because of that.
Another common bias is the way we present an average in the statistical process.
We can represent an average as the mean, the mode, or the median.
The mean usually yields the largest figure when the data are skewed upward, because it is the arithmetical average of all the numbers and gets pulled up by extreme values.
However, if we want the figure to appear smaller, we would use the median. The median is the middle figure: half the values fall below it and half above.
You can also use the mode. The mode is the value that occurs most frequently in the data.
As you see, even the statistical measure you choose has a massive impact on your outcomes.
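A minimal illustration, using invented, right-skewed figures, of how far apart the three "averages" can land on the same data:

```python
import statistics

# Invented, right-skewed figures (think self-reported incomes, in thousands).
incomes = [18, 20, 22, 22, 25, 27, 30, 35, 40, 250]

print("mean:  ", statistics.mean(incomes))    # 48.9 - dragged upward by the outlier
print("median:", statistics.median(incomes))  # 26.0 - the middle of the pack
print("mode:  ", statistics.mode(incomes))    # 22   - the most frequent value
```

Three honest-sounding "averages" of the same ten numbers, and the largest is more than double the smallest.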
Statistics are therefore extremely easy to manipulate.
There's also conscious bias: only favorable data is chosen for presentation, and inconvenient or unfavorable data is totally discarded.
Filtering out data this way is bad, unethical practice, but it is more common than you know.
Or emphasizing the favorable outcome at the expense of the unfavorable result.
We use different types of averages and so we consciously choose what we want to convey.
Conscious bias is less difficult to deal with than unconscious bias, because with unconscious bias we need to recreate the entire process. We need to find the source behind the statistics, and we need to understand the motivation or motivations of the scholars, the researchers, and the publishers of the statistics. Are they somehow invested in the poll or the survey or the study? What are the sources of financing? What are the conflicts of interest? Why did they choose a particular statistical representation?
It takes research, meta research. It's a mess and very few people bother.
Facts in psychology, therefore, especially facts founded on statistics, are highly suspect.
And then there is the issue of graphical presentation: charts, maps, other types of graphs. They're never what they seem. They're exceedingly misleading. They're built to mislead.
What do you choose?
A linear scale? An exponential scale perhaps?
How do you present the data? What are your x and y?
And so a researcher or presenter of data can skew the statistical graphic representation in a way that's very deceptive.
You change the numerical representation on a given axis. You zoom in on a rise in the chart and the publisher can easily manipulate you to see what he wants you to see. It's very seductive and it's very wrong.
Line charts, for example, are the most commonly used in psychological studies. And line charts are also the most easily manipulable and most commonly manipulated.
Even a small increase in the chart can be shown as a much larger rise simply by changing the numerical increments on one or both axes.
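Here is a minimal matplotlib sketch of that axis trick, with made-up scores: the same small rise plotted twice, once on a full y-axis and once on a truncated one:

```python
import matplotlib.pyplot as plt

# Made-up scores: a small, steady rise of about 2% over five years.
years = [2019, 2020, 2021, 2022, 2023]
scores = [100.0, 100.5, 101.0, 101.5, 102.0]

fig, (honest, zoomed) = plt.subplots(1, 2, figsize=(8, 3))

honest.plot(years, scores)
honest.set_ylim(0, 110)            # full axis: the rise looks as modest as it is
honest.set_title("Full y-axis")

zoomed.plot(years, scores)
zoomed.set_ylim(99.9, 102.1)       # truncated axis: the same rise looks dramatic
zoomed.set_title("Truncated y-axis")

plt.tight_layout()
plt.show()
```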
At first glance the chart seems impressive and very, very objective.
But if you take the time to check the numerical representation, it's very underwhelming. Trust me, bar charts can be deceptive as well.
When you change the width of the bar, when you show a truncated version of the bar, only the top half, the information on display can easily seem to represent something that is not true, that is counterfactual.
And so bars and charts are a serious problem.
Sizes, chosen widths and lengths, numerical axes: they can all mislead.
What about statistical fallacies?
Yes, there is such a thing.
Many techniques in statistics end in failure. One of the major fallacies in statistics arises when you use a specific type of formula.
For example, if you say: B follows A, therefore B must have been caused by A. It's a very, very ancient fallacy.
And so this formula can be easily turned around. And this is known as the post hoc fallacy. It is easy to misrepresent data when you use this formulation.
Both A and B could be the product of a third factor. Correlation is never causation.
And so representing A as the inevitable outcome of B or vice versa is almost always wrong.
We must look closely at the information when we are presented with such an argument.
B may follow A for any reason whatsoever. It could have occurred by chance.
You need to test and test again.
And then B may vanish altogether. We call it an artifact.
Testing continually may yield the same result.
It still doesn't mean that the result is valid because the testing methodology itself has a huge influence on the outcome.
Choosing the methodology often determines the outcome.
The variables B and A may be related in some way.
Causation is only one way of relatedness.
But is it?
There are instances of apparent causality between two factors that actually reflect an overriding third factor, as I said.
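A minimal simulation of exactly this situation, with invented variables: a hidden third factor C drives both A and B, which never influence each other, and yet the correlation between A and B comes out strongly positive (requires Python 3.10+ for statistics.correlation):

```python
import random
import statistics

random.seed(1)

# A hidden confounder C drives both A and B; A and B never cause each other.
C = [random.gauss(0, 1) for _ in range(5_000)]
A = [c + random.gauss(0, 0.5) for c in C]
B = [c + random.gauss(0, 0.5) for c in C]

# The Pearson correlation between A and B is strongly positive anyway (about 0.8).
print(round(statistics.correlation(A, B), 2))
```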
Another statistical manipulation is when data is presented that does not accurately represent the correlation of factors.
A positive correlation can suddenly become a negative correlation if it is applied beyond the information given, for example, in extrapolations.
So the correlation of factors sometimes exists, but we must look closely to determine all the factors.
Very often we connect A to B, we correlate them, we measure the correlation, and then we extrapolate it, we enlarge it, we expand and apply the outcome wrongly.
For example, rain and crops: a moderate, measured amount of rain goes with healthy crops; too much rain, and the crops are ruined. Beyond that range, the correlation breaks down.
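A minimal sketch of that breakdown, with an invented yield curve: rain helps the crops up to a point and then drowns them, so a correlation measured on the moderate range flips sign once the range is extended (again Python 3.10+ for statistics.correlation):

```python
import random
import statistics

random.seed(2)

def crop_yield(rain_mm):
    """Invented yield curve: crops benefit from rain up to about 50 mm, then suffer."""
    return -0.01 * (rain_mm - 50) ** 2 + 100 + random.gauss(0, 2)

moderate_rain = [random.uniform(0, 50) for _ in range(500)]
all_rain = [random.uniform(0, 120) for _ in range(500)]

r_moderate = statistics.correlation(moderate_rain, [crop_yield(r) for r in moderate_rain])
r_full = statistics.correlation(all_rain, [crop_yield(r) for r in all_rain])

print(f"rain 0-50 mm:  r = {r_moderate:+.2f}")  # strongly positive
print(f"rain 0-120 mm: r = {r_full:+.2f}")      # negative: "more rain, more crops" collapses
```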
Of course, not all statistics are misleading.
Statistics is a very useful tool, and believe it or not, it's even used in physics, for example in statistical mechanics.
Statistics is really an amazing development of the last 250 years. It is used in a variety of settings, for example, actuarial tables and insurance.
So statistics is a blessing, but you need to ask yourself: Who says so? How does he know? What's missing from the data? Did somebody change the subject? Does it make sense? What was the size of the sample? Was the sample representative?
Which statistical, mathematical measures were chosen, and why? Had different measures been chosen, what would the outcome have been?
Who presents the statistics is very crucial.
For example, if you have a feminist psychologist, that's a bias. I'm sorry to say. It's an ideology.
Conversely, if you have a racist psychologist, it's also somewhat of a bias.
Are the statistics coming from a source that has something to prove? Is the study financed, for example, by the tobacco companies?
There were numerous statistical studies financed by the tobacco companies at the time, in the 50s and 60s. Do these studies wish to sway us, to persuade us in a highly specific way?
We need to check for both conscious and unconscious biases.
Always scrutinize the validity of the source. Is the source reputable? Is it trustworthy? What previous work was published by this source? And how does this work tie in to the current study?
Look at the size of the sample. One thousand two hundred participants is one thing; one hundred and twenty, or twelve, is an entirely different picture.
Now, we do have measures of confidence and measures of significance. They tell us how valid the answer is, how likely it is that the answer is valid.
But even they are subject to both bias and mathematical inaccuracy.
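As one concrete illustration, here is a minimal sketch, with invented poll numbers, of a 95% confidence interval around a reported proportion, and of how much the sample size alone changes what that "confidence" is worth:

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Approximate 95% confidence interval for a proportion (normal approximation)."""
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Hypothetical poll in which 40% of respondents answer "yes".
for n in (30, 120, 1_200):
    low, high = proportion_ci(round(0.4 * n), n)
    print(f"N = {n:>5}: 40% yes, 95% CI roughly {low:.0%} to {high:.0%}")
```

The same headline figure, 40%, is worth very different things at N = 30 and at N = 1,200.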
Ask yourself, what figures are missing?
For example, did anyone bother to indicate the number of cases?
Yes, believe it or not. There are studies where the number of cases, n, is not indicated.
And when someone says fourteen percent of something, what is this something?
And when someone says eighty-six percent of something, that leaves fourteen percent, and maybe the important message is in the fourteen percent left out, not in the eighty-six percent.
For example, if I tell you, sixty percent of women don't cheat.
Well, forty percent of women do.
And that's where the problem lies. That's where the pain arises. That's where the agony resides.
We should focus on that.
So selectivity in presenting data is often meant to obscure, camouflage, masquerade and disguise. Choosing a median or a mode or a mean shifts the result substantially. Such choices should be scrutinized, should be questioned.
And omitting certain factors is also very important.
Why are some factors not mentioned?
This in itself is a misuse and misrepresentation of statistics.
Suddenly, in some studies, the subject changes. The study starts with a general presentation of its goals, its aims, and so on. And somewhere between the figures and the conclusion, something shifts.
Suddenly the outcomes, the results, the conclusions of the study have nothing to do with what the study purportedly set out to verify.
When a long-term trend is projected, for example, there is often no evidence to back up what is being represented.
And so, does the statistic make sense? Use your common sense, because common sense is a relatively reliable guide. Not your intuition; your intuition is wrong 50% of the time. But common sense is a reliable guide.
And if you come across a statistic that strikes you as nonsense, then feel free to question it, to delve deeper, to pull it apart, to unearth, explore and reveal the inner mechanisms and workings of the study.
The use of precise figures gives the erroneous impression of objectiveness and validity. That you use numbers doesn't make you objective. That you use figures doesn't make your claims valid. And that you use mathematics doesn't render your study scientific, nor does it render the entire discipline scientific.
People often fabricate, not intentionally, not maliciously, but because they want to support existing biases. This is called confirmation bias. Sometimes they round up figures; that's wrong too.
Many statistics cannot be that precise. We should question a statistic if it does not make sense, for example because it somehow assumes an infinite extension of a trend into the future. That's always wrong.
This use of numbers is a fallacy, because it creates a false impression of authority.
Take, for example, IQ. IQ test results beyond 140 or 160 cannot be normatively validated. I have an IQ of 190, and I can tell you it's totally meaningless, because according to statistics there are only about another eight people in the world with my IQ. That's a tiny sample. We can learn nothing from it, and therefore my IQ cannot be normatively validated.
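The arithmetic behind that claim is easy to check; here is a minimal sketch assuming the conventional normal model for IQ (mean 100, standard deviation 15) and a round world population of eight billion:

```python
import math

def people_above(iq, mean=100.0, sd=15.0, world_population=8_000_000_000):
    """Expected number of people above a given IQ, assuming scores follow
    a normal distribution with the conventional mean of 100 and SD of 15."""
    z = (iq - mean) / sd
    tail = 0.5 * math.erfc(z / math.sqrt(2))  # P(score > iq) under the normal curve
    return tail * world_population

print(round(people_above(190)))  # about 8 people worldwide
```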
The more we know what to look for, the easier it is to determine if a statistic in psychology is trustworthy.
We need to establish all these things.
How is the average presented?
We need to look closely at line graphs. We need to identify whether first impressions are real and the statistic is accurate.
There's a lot of play in statistics, and that is why we say: lies, damned lies, and statistics.