Tag Archives: statistics

By the numbers

There is an old ‘saying’, it comes from the late 70’s and it goes a little like this:

In the 50 years that followed we learned that the first option might be the prettiest, but you still end up with a working company. The second one is still an issue, and the third one is still under consideration, especially with the presumed setting of AI (or as I call it, NIP or Fake AI).

This all came to me when I was bombarded with charts and there are numerous ways that we are handed these charts, but it also gave me a consideration. You see, no matter how deep you believe the data to be true, it remains a consideration that any data is flawed and through that setting not entirely trustworthy. 

You see, this is the country with the most migrants, but what I am missing is where they came from. I saw another article in the BBC, which gave us ‘La dolce vita: Is Italy the new tax haven for the global rich?’ (at https://www.bbc.com/worklife/article/20260421-is-italy-the-new-tax-haven-for-the-global-rich). Here we see “In France you also have to pay a property tax (taxe foncière or land tax). “We don’t have that here for the prima casa (first home),” says Robert, although he notes “there is a high charge for refuse collection”. The best thing as far as he is concerned is that there is no inheritance tax on property you own in Italy up to €1 million ($1.1 million) and it’s only 4% beyond that threshold. In France the tax-free limit is much lower – €100,000 ($110,000) – and beyond that it’s a sliding scale up to a top rate of 45%.” The story is about the ‘global rich’? All this might be true, but I believe that there is a larger migration into Europe. There is the setting that Americans are leaving, one we got in the Wall Street Journal on February 25th 2026, where we saw “The U.S. experienced net negative migration in 2025, with an estimated loss of 150,000 people, a trend not seen since the Great Depression.” And if you are ‘really wealthy’, you skip Italy and go straight to Monaco, which is a zero tax nation. So that first chart is nice, but where they came from is more interesting, especially in the era 2026-2028.

We then get the second chart, which shows us where the youth is scientifically. Here we get the first issue. There is the consideration that these numbers are flawed in some cases. As some give us: “There are approximately 1.2 billion young people aged 15 to 24 globally”, and I know enough of the failing of data to give you the fact that there are no data sets giving us 1.2 billion records. As such plenty of nations have worked with mean values, and that is the first failing on that chart. Second, it is nice to see the USA in 17th position, but they have a population of 349 million and not all can afford to go to University. Then we get foreign students in MIT, UCLA, Princeton, Harvard and Yale. So how are they counted and what is disregarded? Several questions on a chart because the data is missing (and footnotes too). So these numbers might be indicative that those scoring over 500 are in a ‘safe’ place, but that is only if we accept this number. And the explanation of those scores, with added footnotes on what is regarded as ‘valid’, is up for grabs.

And then we get the main event, the one that baffled me for a moment, because it gave my thoughts optional validity. But then I need to be wary of a few settings, because without data a chart is merely a weighted result, and without N (total responses) there are reliability issues.

We now see the top countries by natural resource value. It gives me my validity, as the United States is shown to have $45T in value, and that is the setting that makes them optionally almost insolvent. Their debt is growing faster and faster; it is now said to be $38.9 trillion, which exceeds 100% of their Gross Domestic Product (GDP), and as we see it, they have almost spent the total of their natural resources. I have an issue with that, because the rare metals are not in that list, all whilst Wyoming, Utah, Colorado, New Mexico, and Arizona have them. As such that number is off (by a lot), and other nations have more (or less) natural resources than the chart sets out, all whilst these numbers are not given either. As such it is a nice chart, but incomplete and as such redundant. If I was to hazard a guess, this was a chart to show how ‘good’ Russia is doing, but as I never saw the data on it at all, I have my issues with it. All charts look pretty cool, but cool doesn’t pay the baker (or the butcher for that matter). As such we need to wonder what the chart was doing: not what they tell you, but what they aren’t telling you.
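For those who want to see the ‘almost spent’ part in numbers, here is a minimal sketch. It only uses the two figures quoted above ($45T in resource value and $38.9T in debt); the variable names are my own, and the chart's underlying data is not available, so this checks the arithmetic and nothing more.

```python
# Figures quoted in the text, in trillions of US dollars.
# The variable names are illustrative; no underlying data set was published.
us_natural_resources = 45.0  # resource value shown on the chart
us_national_debt = 38.9      # debt figure as reported

# Share of the charted resource value that the debt would consume.
debt_to_resources = us_national_debt / us_natural_resources

# What would be left if the debt were set against the resource value.
remaining = us_natural_resources - us_national_debt

print(f"Debt as a share of resource value: {debt_to_resources:.1%}")
print(f"Resource value left after debt: ${remaining:.1f}T")
```

If the rare metals were added to the resource side, the ratio would drop, which is exactly why the missing data matters.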

That was just my setting on this and there is a lot more to consider. So whilst the first chart gave us “The U.S. hosts 17% of the world’s migrants”, my initial question was “Based on what data?” People might give us the setting that the AI gave them the numbers, and we know that AI doesn’t yet exist. We are given the thought that it is merely DML, and that is done by a programmer, and that programmer might miss a few beats, to be optimistic (many more beats are likely to have been missed), and all this on flawed data?

So what was the designer of that chart trying to persuade you to consider, and what was ‘their’ issue? Because when someone makes a chart, they want you to look into a specific area, or not look in an area that also mattered. Have a great day, another Monday parked in front of my door, Vancouver still has the bulk of Sunday to get through. Ah well.


Filed under Finance, IT, Media, Science

A social direction

This happens, in all the stupidity, the harshness and the fatalities of war, we look in other directions, we look for the good in places, in people, in foods and in entertainment. Our bodies and our souls can only take so much negativity until we start seeking out positivity in any way we can. This is pretty much on all of us. The problem for some is that they CANNOT avoid the negativity. Through war, through social issues, through personal issues. It is a clambake of barriers that we set up and that keep us in place. We all have these moments and these time stages. We can try to avoid them, but the negativity draws in, just like positivity when it happens. So there I was sitting on the couch watching Blindspot season 4 on DVD when I saw ‘Saudi Arabia ranks 25th in UN World Happiness Report’ (at https://www.arabnews.com/node/2045881/saudi-arabia). Of all the things I expected to see, that was not one of them. To be honest, I have no idea where they were, but they moved up one step from 26 in a year. The full report (at https://happiness-report.s3.amazonaws.com/2021/WHR+22.pdf) gives us more. You see, the numbers show that they are one place behind the UAE and both are really close to the scores of France, Belgium, UK and US. Yet there is also the setting that Arab News gives us: “The report has been based on two key ideas: That happiness or life evaluation can be measured through opinion surveys, and that we can identify key determinants of well-being and thereby explain the patterns of life evaluation across countries,” That is a little more than I bargained for. I am not disputing the approach, but how many people? The PDF does give us that: 156 countries and 1853 observations (per nation I guess). Yet if that is the case and we know Saudi Arabia has 35 million people, we might see that stage. Yet Belgium has 12 million people and the US has 330 million people, so how is there a stage of equality? How can 1853 people be a genuine stage for happiness in the US?
How does the stage of opinions towards regression become a scale of happiness? How were these numbers created? Technical box 2 gives us more (page 20), but there is a larger issue. We see 2017 World Development Indicators (WDI) that came BEFORE covid. They use GDP time series from the OECD economic outlook no. 110 (edition December 2021) with the added ‘or if missing’, and there the problem lies. Statistical results connected to other statistical results. I once learned (1992) that this is a really wrong setting to work from. Apart from the stage that it could be based on very different people, there were different economic boundaries and other issues in play. But overall it took me three minutes to combine data into questions and reservations on this report. It is nice to see all these happy people pictures, but it is window dressing, and it makes me more apprehensive of the report rather than less. There is a feeling of orchestration. The image of a man wearing an ‘offline hustler’ t-shirt with the small caption of ‘every move won’t be posted’ merely brings out the negativity in me. And then there is ‘consistency of emotion changes across countries in the 5 weeks after the outbreak’. You see, what date was used for the 5 week stage? December in China? When? It matters because covid hit us at different times, and there seems to be no real explanation there. So how was Twitter used for these 1853 people? Is Twitter separate, and how many Twitter observations per nation? The list goes on and grows. Still, it is an impressive piece of work; if there was a way to get better and more complete explanations it could work. But I hesitate when page 144 gives me “we approached the analyses by 2 interlinked hypotheses. (1) balance/harmony matter to all people; and (2) balance/harmony are dynamics at the heart of well-being. As we have seen, both hypotheses were corroborated to some extent” Really? 1853 observations out of 330 million Americans? How does that show any level of corroboration?
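To put the ‘1853 observations’ question into numbers, here is a small sketch. It assumes, as I guessed above, that the 1853 observations apply per nation, and it uses the rounded populations mentioned in the text; the function name is my own and has nothing to do with how the report itself was calculated.

```python
# Assumption: 1853 observations per nation, as guessed in the text.
OBSERVATIONS = 1853

# Rounded populations as mentioned in the text.
populations = {
    "Saudi Arabia": 35_000_000,
    "Belgium": 12_000_000,
    "United States": 330_000_000,
}


def people_per_observation(population, observations=OBSERVATIONS):
    """Return how many inhabitants stand behind a single observation."""
    return population / observations


for country, population in populations.items():
    ratio = people_per_observation(population)
    print(f"{country}: one observation per {ratio:,.0f} people")
```

The same 1853 observations carry a very different load in Belgium than in the United States, which is the equality question I keep coming back to.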

The more of the report I saw, the more questions I ended up with. I wonder who else has a serious set of questions and I wonder when the media will ask Gallup more questions. Personally I doubt they will ever bother.


Filed under IT, Media, Politics, Science

What is the good place?

It’s Sunday, I am currently watching the original Planet of the Apes, it is one way to pass the time, just watch an actual classic. I have had an interesting day. With my mind on creativity, the series ‘the Good Place’ inspired me to a different miniseries. Consider a simple inventor, the man is a Civil Engineer, and he comes up with the bright notion of inventing a different kind of heavy metal filter, a way that is based on centrifugation. It is a simple yet novel idea, and he submits the idea and registers the patent. It is not a week later when he is suddenly invited for a meeting with an ‘interested party’, when he gets there he sees that the ‘interested parties’ involve a politician and a former employer. Before he knows what is happening, his patent is under attack, the politician and the former boss have sought legal assistance and they claim that the patent was stolen. In the end, after the court case they take 100% and he ends with nothing. Over time he learns that the judge was in on it, he had become a silent partner in the event that scores them $6,000,000,000 split 6 ways. 

This starts the plan where he starts to get even. The clock now jumps 12 years, the 6 are of course really happy, and during that event the civil engineer walks in and shoots the entire party: the thieves, their partners and 11 children.

The next moment we see that they are in heaven, the 6 families are there, and they have large mansions, cherubs that take care of their needs and they are seemingly happy. It is at this point that the floor comes down from under them. The idea is that the civil engineer got away from hell somehow and is now wreaking havoc in heaven, yet in all this, he is focused on the 6 families, taking one child after another, all to be collected and placed in a cage of demonic thorns, making the children sign over their souls for their parents’ non-prosecution of theft. All the kids agree and are dropped into hell; at that point the parents have to select heaven or hell. In the end only one accepts the exchange and jumps into hell with the key that will unshackle his child.

That is as far as I got, the link to the good place is seen when in the end it shows that they were all in hell, the heaven impression was done in a deal between the head demon (Asmodeus) and the civil engineer to get his revenge. The deal was his soul freely given to teach the thieves a lesson. Yet the civil engineer has another part in all this, as the thieves sign over their allegiance towards their soul for the absolution of their crime, the second plan comes into effect. It will be his ticket out of hell. 

That is as far as I got, it took about an hour to think it through, there is a lot more, but I will not bore you with that part, not for as long as it is not properly scripted. 

This was my state of mind as the news hit me on the Coronavirus, the idea that the US has 52,000 deaths to the coronavirus, 25% of all global deaths, and they are now reopening (to some degree) their country, apart from a president making (sarcastic or not) some claim that the body can be cured from the coronavirus by applying detergent to it. In all this, we get the realisation that Mickey Mouse and Donald Duck Trump both seem to live at Pennsylvania Avenue 1600; it seems that Disney characters are getting a living upgrade. Not bad for a weekend, is it?

Yet the idea of making an iteration of the Good Place wasn’t my initial idea, yet I reckon that this series has had its impact. My version is not a comedy, it is much darker, heaven almost exceptionally puritan, and hell is dark and fire red in aspects (I haven’t been there yet so my view on this place is still shaping). Yet if we agree that we are driven by the seven sins and the seven virtues, the trap we make for ourselves is that there is a lack of balance. If Chastity, Temperance, Charity, Diligence, Patience, Kindness and Humility are on one side, what is on the other? Pride? Avarice? Envy? Wrath? Lust? Gluttony? Sloth? As I see it, they nullify one another; like a seesaw we are on the twisting point between sin and virtue, so how can we choose the balance? If Lust and Chastity are on a seesaw and our setting is the axial, how can we select? Love is a combination of many facets, lust and chastity are part of it, but they do not stand alone and we are in a stage to keep the seesaw balanced. The issue is not lust or chastity, it is fear and greed. The US is reopening their places (to some extent). In all this the numbers are screaming part of the idiocy (as I see it): all these nations, the EU at almost 750 million citizens, then we get India and Asia, and in all this, 25% of the world’s registered deceased are in the US. Reopening stores and trying to get back into the swing of things is a choice and perhaps it is the better one, I do not have the knowledge to debunk it, but the larger healthcare message is lockdown, and there is wisdom in that too. Knowing what is best is not for me to say. I understand both sides, and as we see Bloomberg giving us one side, the NPR rolls in another direction. Like the seesaw, the axial of balance is in the middle, yet to what direction should we swing? The problem is that for most of us the balance point is influenced by fear and equally by greed.
Greed might not be pronounced outspokenly by a lot of people, but greed is a reason we must address. Even as for most it will be about the ability to pay the bills, is it any less greed driven? We might all see greed as evil (I at times do that too), yet the need to survive is also laced with greed. The need we have to pay our bills, we call it differently, but it remains a form of greed and not all greed is pure evil. This reminds me of an original Star Trek episode, it was called ‘the Enemy within’: the realisation that the good Kirk and the evil Kirk need one another. If a balance of ruthlessness and empathy is essential to make sensible decisions, we see another path that we all face. The more primal the drive, the more direct the balance between both elements is seen. We are at times driven to deny the negative emotions, yet an early lockdown, and the harsh decision (or logic) behind it, might have lowered the curve for the US. We cannot tell for certain, it is too late now, but the fact remains that in a nation with 325 million people, out of a world population of some 8,000 million, whilst there are over 2.5 million cases, the number of deceased in the USA represents 25% of all corona victims. The numbers seem to indicate that lowering the curve sooner would have been better, but I have always stated that there is more to this virus, and so far the US is still in a beginning stage. As such the 25% might be a low number for now, and the bad side is that the reopening of the US in a week might signal a very negative situation. We can speculate on this until we are old and grey, but reality will show us what will be soon enough.

We will all make up our own minds, some will blame 5G and burn down Wifi masts, some will blame the Chinese (again) and come up with more matters to prosecute, and in all this, the history of the other versions of corona is ignored. We can ask any cat, but they are seemingly merely the victims in all this. It seems that the bosses that employ a lot of us are soon to be seen as good and bad, which will upset the curve a lot, but no matter how we look towards the future and where we look, we need to find a balance within ourselves and propagate that outwards. That is essential, because no matter how this all goes, there is every indication that the month of May will see the first of several cycles of blaming the people around us.

 


Filed under Media, movies, Politics, Science

An outlying frame of prediction

The Guardian had another interesting article to present, it came online on Jan 1st, but I just read it a mere moment ago. The nice part is that this is about data; it is a little bit more about statistics, but I am not a statistician, I am a Data Miner. The title ‘Alarmingly for pollsters, EU referendum poll results depend heavily on methods’ gave me the jolt I needed (at http://www.theguardian.com/politics/2016/jan/01/eu-referendum-polling-results-depend-methods). From my point of view, the entire exercise is a failed event, no matter how you slice it. Before we go into the results, let’s take a quick look at the nations involved:

  1. UK, population 65,081,276
  2. France, population 67,063,000
  3. Germany, population 81,276,000
  4. Italy, population 60,963,000
  5. Spain, population 46,335,000
  6. Sweden, population 9,816,666
  7. Finland, population 5,475,000
  8. Denmark, population 5,673,000
  9. Portugal, population 10,311,000

Now look at two quotes: “It found strong support for the UK’s continuing membership, with an average of 53% of respondents favouring Britain’s continuing membership across nine other countries surveyed“, which might be fair enough, but then we get quote two, which is “Only in Norway, which is not a member of the European Union, would a slight plurality, of 34% to 27%, prefer to see the UK leave and join it outside the club“, this is interesting, because Norway is not one of the nine countries in the mix, which now implies that additional nations had been interviewed, so what happened, the others were less in favour?

Now we add the optional considerations: “ICM also investigated the appetite in all these countries to call time on their own membership, in the event that their country staged an in/out referendum“. So ICM had another reasoning entirely; the ‘in the event that their country staged a referendum‘ is central to this, because that means that the questionnaire, the hypotheses and the methodology would be different from the get go, which is not even that central in my thinking process, but it is elemental to the entire event. Now, the question becomes whether this is all part of ICM Research, a UK Market Research company, whether it was done as part of the umbrella called Creston Insight, or perhaps even by a third party and I am linking the wrong ICM to the wrong company.

These are all valid considerations and in my case the assumption was done intentionally (and most likely to be correct).

You see the paragraph in the Guardian “Alarmingly for the polling industry, however, the result substantially depends on the method used. Nineteen of the 21 polls were done online, and among these the average advantage for remain shrivels to a dangerously slim two points. But the two telephone surveys that have been undertaken point to far bigger pro-EU leads of 17 and 21 points” shows the issue for me. The paragraphs result in the question, were 19 nations interviewed? If so, why are they not all mentioned, in another option, were two methodologies used in the nine countries? One via phone and one via online, which makes perfect sense, but then an even amount of polls should have been used. All the article does is wonder how reliable the approach is, and if at all, are politicians even interested in doing it fair and square?

You see, if the results can sway a lingering vote (which is a given fact) then we can see that the poll could be used to sway some to ‘follow’ the largest group (with a tie a much harder thing to influence), but influence is a given.

For me, the number one issue was none of these items; in my case it was the mention at the very end. The quote “ICM interviewed a representative sample of at least 1000 adults online in each of nine European countries on 15 and 30 November 2015. Interviews in each country have been weighted to the profile of adults living within it” is the issue, because a sample of 1,000 can never ever be representative of a population of 81 million, not even representative of a population of 46 million; there is no amount of weighting that can give anything but the roughest of estimations. The more representative the sample is for households, the larger the interviewing sample needs to be. There might have been the slightest reliability if a sample of at least 10,000 was used per nation, and I use the word ‘slightest’ in the most liberal of ways. The moment we introduce gender, income and education, 10,000 might not slice it either. You see, yes, weighting can be applied, but then a single response could represent a group of 50,000-100,000. How reliable do you think that one voice would be regarding the other 49,999-99,999?
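To show where that 50,000-100,000 range comes from, here is a minimal sketch that divides each population from the list above by the stated sample of 1,000. Take it as a rough illustration only: the populations are total populations, whilst ICM interviewed adults, so the true number of adults behind each respondent would be somewhat lower. The names in the code are my own.

```python
# Sample size per nation as stated in the ICM quote ("at least 1000").
SAMPLE_PER_NATION = 1_000

# Total populations as listed earlier in this article.
# Assumption: total population is used as a stand-in for the adult population.
populations = {
    "UK": 65_081_276,
    "France": 67_063_000,
    "Germany": 81_276_000,
    "Italy": 60_963_000,
    "Spain": 46_335_000,
    "Sweden": 9_816_666,
    "Finland": 5_475_000,
    "Denmark": 5_673_000,
    "Portugal": 10_311_000,
}

# Number of inhabitants that a single respondent stands in for.
people_per_respondent = {
    country: population / SAMPLE_PER_NATION
    for country, population in populations.items()
}

# Print from the heaviest load per respondent to the lightest.
for country, load in sorted(people_per_respondent.items(), key=lambda kv: -kv[1]):
    print(f"{country:<9} one respondent per {load:>9,.0f} people")
```

For the five largest nations one voice stands in for somewhere between 46,000 and 81,000 people, whilst in Finland it is fewer than 6,000, so the same sample size does not mean the same thing in every country.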

1,000 might be budget based, but this would then reflect a budgeted population that holds no reliability at all.

Sampling can be a real science, but when we see frequency weighting to this amount, we can safely say that science has been replaced by educated guessing, which is not the way to go. Consider France for a moment. Consider that in regions people feel very different: in the two regions where Le Pen is powerful, they will not be in favour of the EEC at all; the other regions might be (read: might be). Now consider that France has 22 administrative regions, so in fairness we get roughly 50 responses per region, 25 males and 25 ladies, so per education level and perhaps even per age group, how much remains? How representative are these 25 people for that region? Now consider that not every region has the same population, so the 50 people representing the 11 million that make up one region get a very different weight from those representing the 4 million in Normandy. Are you catching on how utterly unreliable those numbers have become? And how is this done for the UK? Or did ICM decide to get in quick and fast so the capitals make up the bulk of the votes, which in the case of Sweden makes sense as the bulk lives in Stockholm, Goteborg or Malmo. So as there is a hint of truth that it might all be about methodology, the required setting can never be met by 1,000 responses per nation as I see it; in addition there is still the unlisted Norway. So either the article made a few jumps (which could be fair enough) or the reference to ICM in all this should be answerable to a lot more questions than the article is currently giving.
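The French example can be sketched in a few lines. This assumes the 1,000 responses are spread evenly over the 22 regions, which is my simplification and not something ICM stated, and it uses the rounded 50 responses per region and the 11 million and 4 million populations from the paragraph above. All names in the code are illustrative.

```python
# Assumption: responses are spread evenly over the regions (not stated by ICM).
RESPONDENTS = 1_000
REGIONS = 22

per_region = RESPONDENTS / REGIONS   # exact even split
per_gender = per_region / 2          # split once more by gender

print(f"Responses per region: {per_region:.1f}")
print(f"Responses per gender per region: {per_gender:.1f}")

# The text rounds the split up to 50 responses per region.
ROUNDED_PER_REGION = 50

# Regional populations as used in the text.
large_region_population = 11_000_000
normandy_population = 4_000_000

# How many inhabitants each of the 50 respondents has to stand in for.
weight_large = large_region_population / ROUNDED_PER_REGION
weight_normandy = normandy_population / ROUNDED_PER_REGION

print(f"Large region: one respondent per {weight_large:,.0f} people")
print(f"Normandy: one respondent per {weight_normandy:,.0f} people")
print(f"Weight ratio: {weight_large / weight_normandy:.2f}")
```

The exact split is a little under the 50 I used, which only makes the point stronger, and every extra split (education, age group) cuts the remaining group down again.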

I need to end this with one final quote: “if the huge differences between online and telephone surveys persist, one method or the other can expect to face a bruising referendum, because they cannot both be right“. From the parts I responded to, there is another option altogether: neither is correct. They are not flawed, but wrong for the simple fact of sampling size and the quote given, “in the event that their country staged an in/out referendum“, which means that there would have been a different hypothesis that needed answering, and even then the sample of 1,000 would never have been anywhere near useful.

A group of 9,000 can never be representative of a group surpassing a third of a billion. That should be massively clear to anyone from the get go, even more so when you consider the different lifestyles and values held in Scandinavian nations versus most of Western Europe, and that is just the tip of the statistical considerations.


Filed under IT, Media, Politics, Science