Referring to the previous

OK, this time it is not merely a case of Microsoft people (the non thinkers). I also left a piece out of the previous article. In part because I thought it was self evident, in part because it is a little harder to explain, not harder exactly, there is a lot more to this than meets the eye, but those in this environment will get it fast enough. You see, my solution might not help Elon Musk, he doesn’t have enough time. Even if his Tesla Mobile department gets cracking, he might not get the minimum numbers to convince a judge (I reckon 9%-11% would do the trick). First we need to look at a specific population. 

This is a representation of a fake account population. For arguments sake I kept an even distribution (which is not the case). The top segment are governments, really clever hackers and a few others. We won’t be able to get to them. Then we get the clever click-farms and trolls and last the eager beaver click-farms and stupid trolls. It’s the lower two segments that matter, the lowest tier is the easiest to get to, but will need work. The other one needs a lot more work and that is the path others have not trodden on (at least I think they have not).

In this there are two groups click-farms and trolls. We can get to a lot in the same way. A click farm get the revenue, by clicking (yes, it is that easy), the problem is that they need 10 clicks for a cent, as such China has a lot of these farms. People pressing buttons one after another. But here is the little surprise. There is a method, there are paths we can use to find them, and the lowest group first. There are all matters of ways that some hunters go through, because they have specific targets. In this case we also have targets, but the lowest group, it also means the most work. 

The simple click farm has one text. We need to find the first text that goes to any click farm, when we have that (from experience we know where the recipient is), so we know the text. Now we need to backtrace as much as possible and find EVERY transmitter (clicker) of that message. We do that by seeking 90 seconds before and 90 seconds after and seek the system for that text. Depending on how fast the click-farm is, we could find 200-300 click mobiles in that time, if needed we extent to 30 seconds in both direction. Now we have our first cluster. We can seek and capture these identities and set them in a database. The slightly more clever click-farm will do this with a collection of tweets (as such I showed you 3 text icons). This is also important. You see one cluster is fine, but we need a hell of a lot more, but we get a little help from the people at the click-farm, they tend to be lazy (or greed driven) so the more they transmit, the more money they get. As such we then seek who had these three messages in succession. Here we need to filter, some recipients are gullible and take anything that this click-farm sends out and some click-farm have recipient clusters. The salesperson often has a story to tell, but he’ll take any listener (even the useless ones). So we need to distinguish between the two. The recipient farm often does not send forward, some do. But now we start shaping an image. An image and a message path (a pattern) and these paths are not always the same, they can sent the covid misinformation one day and Russian propaganda the next. But in this way we get at least a dozen clusters. The problem is that this needs one hell of a server and optionally a rack of servers seeking in different regions. So Musk will have the hardware and he has the people, but does he have the time to get this all done? (perhaps he already did). And the third path is to engage with a click-farm to send your own message which you seek online. When you get that cluster you can seek what else came from there and then you have a nice setting to compare. You see, Twitter is about engaging an audience, the click-farm does not care so they are actually more exposed then others. Then there are the the click messages that hand over the #FF statement. It is risky, but the lower tier does this to get results faster and as such we get a node of connections and all connecting to clusters. 

I reckon that this approach could bank up to 20%-25% of the fake account, showing that the Twitter idea of 5% was a joke from day one and has been so for years. There are of course a few more ways to get there, but revealing them would also show the hand to the more clever click-farms and that is what we are trying to prevent. I reckon that it should be possible to get 90% of the green group and 30% of the yellow group. In this I set the graph to reveal equal groups, but the green group is 20% smaller and yellow group is at least 40% larger. The red group is relatively small, and does the most damage, but that was not the exercise, it was to show that the Twitter claim of 5% fake accounts was folly (from the start mind you) and I reckon that this could be relatively easy to show, but to get these numbers takes a serious amount of server power. It would even be better to set the results in something like IBM Modeller or Palantir Gotham to see where else the data leads, because that would become the next task, mapping the disinformation streams and how it is distributed. Even if these people do not break any laws, they are helping and propelling disinformation, optionally endangering their own nation and that too needs to be known. There comes a point where the right to be stupid is no longer an excuse.


Leave a comment

Filed under IT, Science

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.