Tag Archives: Palantir Gotham

The congressional sham

The papers are ‘covering’ the entire Facebook hearing live; several papers are on it and I think that this is a good thing. Yet most papers are not without flaws. Having written about the entire mess of data privacy since 2013, I regard this, to the best of my knowledge, as a Capitol sham at best (pun intended). You see, these so-called senators are all up in arms, and the Washington Post (at https://www.washingtonpost.com/news/the-switch/wp/2018/04/10/mark-zuckerberg-facebook-hearing-congress-testimony) gives us quotes like “from data privacy to Russian disinformation“. You see, it is a lot less about data privacy than it is about the Russians. The anti-communist gene in Americans is too strong; the Yanks get too emotional and become utterly useless in the process. So is it about the 44 senators grilling Mark Zuckerberg, is it about their limelight and their re-election visibility, or is it about global data privacy? I can guarantee you now that it will not be about the last part, and as such we will see a lot more warped issues shine on the congressional dance floor.

In that regard, when you read “They demanded new detail about how Facebook collects and uses data and elicited assurances that it will implement major improvements in protecting personal privacy“, it might be about that, but it will be a lot more about oversight and about how the US government wants to be able to ‘check’ all that data. They have wanted access to all that data since Facebook turned one year old. So when we see ‘Sen. Kennedy: “I don’t want to have to vote to regulate Facebook, but by god, I will. That depends on you.”‘, you had better believe that the ‘depends on you‘ can be read as ‘as long as you give us access to all your data‘, and that is where the shoe pinches.

So when we see “Several asked for detailed answers about how private, third-party companies, such as the political consultancy Cambridge Analytica, gained access to personal data on 87 million Facebook users, including 71 million Americans“, we see a valid question, yet one that did not require a congressional hearing, so it is merely the icing that hides the true base of the cake. It is the honourable Sen. John Thune (R-S.D.), chairman of the Commerce Committee, who gives us the first goods: “Many are incredibly inspired by what you’ve done. At the same time, you have an obligation, and it’s up to you, to ensure that dream doesn’t become a privacy nightmare for the scores of people who use Facebook”. You see: freedom of data and misuse of information, as seen with insurers, in statements like ‘Insurance companies warn that under certain circumstances, posting about your holidays on social media could result in your claim being declined if you are burgled‘. These senators were never really interested in any of this, whilst the entire insurance issue has been playing out since as early as 2010; they were likely too busy looking somewhere else. The entire privacy mess is a lot larger. We see this at the Regis University site when we take a look at: “A new survey by the National Cyber Security Alliance (NCSA) reveals nearly one in five Americans (19%) has been the victim of some form of cyber stalking, defined as any persistent and unwanted online contact with another individual. Through aggressive social media contact, repeated emails or other methods of online connectivity, cyber stalkers represent a serious and growing threat to men and women who otherwise wish to disengage from those who make them feel uncomfortable. Still, the NCSA report shows only 39% of those who believed they were being stalked online reported the incident to authorities“. So was there a senatorial hearing then? No, there was not.
In addition, we have a situation where one in five Americans is subject to stalking, yet in all those years almost nothing was done. Why is that? Is it because the overwhelming majority of these victims have tits and a vagina, or merely because they are less likely to be communist in nature?

Does this offend you?

Too bad; it is the direct consequence of inaction, which makes today’s issue almost a farce. I stated almost! So, is the issue that the data was downloaded, or that the data on millions of Americans is now in the hands of others and not in the hands of the US government? This loaded question is a lot more important than you might think.

The fact that this is a much larger farce is seen when the Democrat from Illinois decides to open his mouth. It is seen in “Sen. Richard Durbin (D-IL), asked Zuckerberg what hotel he stayed at Monday night and the names of anyone he messaged this week“; was it to break the ice? If all 44 senators do that, then we see evidence of why the US government can’t get anything done. It is actually another Democrat who gives rise to issues. Sen. Richard Blumenthal (D-Conn.) said, “We’ve seen the apology tours before… I don’t see how you can change your business model unless there are different rules of the road.” The man makes a good case, but I am not certain he is correct. You see, the rules of the road have been negated for some time, in several directions. Unless the US government is ready to lash out massively at any corporation found exploiting the privacy of social media members (and insurers are merely one part in all this), unless it is willing to protect the users of social media against corporate exploitation, Richard Blumenthal should not really be talking about traffic rules, should he? This links directly to the fact that 90% of hedge funds were using social media in 2014. Were they properly looked at? I wonder where those 44 senators were when all that went down.

The one part that will actually become a larger case comes from Massachusetts. “Democratic Sen. Edward J. Markey (Mass.) plans to introduce a new bill Tuesday called the CONSENT Act that would require social giants like Facebook and other major web platforms to obtain explicit consent before they share or sell personal data“. It will change the business model to one where data is no longer shared or sold, but where all of this is set up by Facebook and the advertiser merely gets the results of visibility in top-line reports. That is the path Facebook would likely push for: a more Google-like approach, in the vein of AdWords and Google Analytics. Facebook is to a much larger extent ready for this, and it is a likely path for Facebook to follow after all this. Yet in all this, the theatre of congress will go on a little longer; we will know soon enough. In the end, 44 senators will push regarding “The Federal Trade Commission is investigating violations of a 2011 consent decree over privacy policy at Facebook that could lead to record fines against the company“. In the end it will be about money, and as it is more likely that the data on Americans made it to Russia, the fine will be as astronomically high as they can possibly make it. They will state in some way that the debt of 21 trillion has nothing to do with that, or so they will claim. In the end, Mark Zuckerberg partially did this to himself; he will get fined and so he should. But the entire theatre, and the likelihood that the fine goes way overboard whilst in equal measure these senators will not chase the other transgressors, is a much larger case and calls for even more concern. You see, there is a much larger congressional sham in play. It was exposed by Clay Johnson, formerly of the Sunlight Foundation (more at http://www.congressfoundation.org/news/blog/912).
The issue is not merely “On the Hill, congressional staff do not have the tools that they need to quickly distill meaning from the overwhelming volume of communications that they receive on any given day“; it is that Facebook has been able to add well over 400% pressure to that inability. That given is also what drives the entire matter of division among American voters. I myself did not think that ‘fake’ news on events did any serious damage to Democrat Hillary Clinton; from my point of view, she did that all to herself through her inaction during the Benghazi events.

In the end, I believe that the bulk will go after Mark Zuckerberg for whatever reason they think they have, whilst all hiding behind the indignation of ‘transplanted data‘. The fact that doing this directly hit the value that the rest of his data has is largely ignored by nearly all players. In addition, the fact that the BBC gave us ‘More than 600 apps had access to my iPhone data‘ less than 12 hours ago is further evidence still. So when will these 44 senators summon Tim Cook? The BBC gives us “Data harvesting is a multibillion dollar industry and the sobering truth is that you may never know just how much data companies hold about you, or how to delete it“, and the fact that this is a given truth, and has been for a few years because you the consumer signed over your rights, is one of those ignored traffic rules. So the statement that Richard Blumenthal gave is a lot larger than even he might have considered. It is still a good point of view to have, yet this shows him to be either less correct on the whole, or it could be used as evidence that too many senators have been sitting on their hands for many years; in that matter, the less stated on the usefulness of the European Commission the better. So when we read “The really big data brokers – firms such as Acxiom, Experian, Quantium, Corelogic, eBureau, ID Analytics – can hold as many as 3,000 data points on every consumer, says the US Federal Trade Commission“, the fact that Equifax is missing from that list is also a matter for concern, especially when we consider the events that Palantir uncovered, whilst at the same time we ignore what Palantir Gotham is capable of. I wonder how many US senators are skating around that subject.
We see part of that evidence in Fortune, where (at http://fortune.com/2017/10/10/equifax-attack-avoiding-hacks/) we see “Lauren Penneys, who heads up business development at Palantir, advised companies to get their own data and IT assets in order—both to better understand what risks do exist and to improve readiness to respond when a breach does happen“. She is right, and she (validly) does not mention what Palantir Gotham is truly capable of when we combine the raw data from more than one corporate source. With the upcoming near-exponential growth of debt collection, which relies on data and on skip tracing through social media, we see a second issue, one these senators should have been aware of for well over two years. So how protective have they been of citizens against this invasion of privacy by the Wall Street golden child? Even in London, places like Burford Capital Ltd are more and more reliant on a range of social media data, and as such it will not be about traffic rules as the superrich are hunted down. We might not care about that, mainly because they are superrich. Yet as this goes on, how long until the well dries up and they cast their nets in a much wider setting?

We claim that we are humane and that we set the foundation for morally just actions, but are we? The BBC actually partially addresses this with: “Susan Bidel, senior analyst at Forrester Research in New York, who covers data brokers, says a common belief in the industry is that only “50% of this data is accurate” So why does any of this matter? Because this “ridiculous marketing data”, as Ms Dixon calls it, is now determining life chances”, and that is where the shoe truly hurts. At some point in the near future we will be denied chances and handed useless special rebates because the data did not match. We will be seen as a party person instead of a sports person, at which point our premiums will have been ‘accidentally’ 7% too high, and under that same persona we will be targeted for social events and not sport events. We will miss out twice, and soon thereafter fourfold; with each iteration of wrong data, the misconceptions will optionally double. All of it based on data we never signed up for or signed off on, so how screwed is all this, and how can this congressional hearing be seen as anything but a sham? Yes, some questions need to be answered and they should be, yet that could have been done in a very different setting. So as we see the Texan Republican as the joke he is, in my personal view, we see “Sen. Ted Cruz (R-TX) asked Zuckerberg about 2016 reports that the company had removed conservative political news from its trending stories box, and followed up with questions about its moderators’ political views. When Zuckerberg said he didn’t ask employees for their political views, Cruz followed up with “Why was Palmer Luckey fired?”“, and we wonder if he had anything substantial to work with at all. So when you wonder why Zuckerberg is being grilled, ask yourself: what was this about? Was it merely about abuse of data by a third party? If so, why is Tim Cook not sitting next to Zuckerberg?
More importantly, as I have been showing some of these issues for close to five years: why was action not taken sooner? Is that not the more pressing question to see answered?




Filed under Uncategorized

Room for Requirement

I looked at a few issues 3 days ago. I voiced them in my blog ‘The Right Tone‘ (at https://lawlordtobe.com/2016/09/21/the-right-tone/), one day later we see ‘MI6 to recruit hundreds more staff in response to digital technology‘ (at https://www.theguardian.com/uk-news/2016/sep/21/mi6-recruit-digital-internet-social-media), what is interesting here is the quote “The information revolution fundamentally changes our operating environment. In five years’ time there will be two sorts of intelligence services: those that understand this fact and have prospered, and those that don’t and haven’t. And I’m determined that MI6 will be in the former category“, now compare it to the statement I had made one day earlier “The intelligence community needs a new kind of technological solution that is set on a different premise. Not just who is possibly guilty, but the ability of aggregation of data flags, where not to waste resources“, which is just one of many sides needed. Alex Younger also said: “Our opponents, who are unconstrained by conditions of lawfulness or proportionality, can use these capabilities to gain increasing visibility of our activities which means that we have to completely change the way that we do stuff”, I reckon the American expression: ‘He ain’t whistling Dixie‘ applies.

You see, the issue goes deeper than mere approach; the issue at hand is technology. The technology needs to change, and the way data is handled requires evolution. I have been in the data field since the late 80’s and this field hasn’t changed much. Let’s face it, parsing data is not a field that has evolved much, for the mere reason that parsing is parsing and it is all about speed. So let me put it on a different vehicle: we are entering an age where the intelligence community is in the haulage business, hauling data, yet the container itself grows whilst the haulage is en route. So we need to find alternative ways to deal with the container’s content whilst it is en route.

Consider the data premise ‘the data that needs processing grows by 500 man-years of work on a daily basis‘: we either have to process smarter, create more ways to process, be smarter about what and how we process, or change the premise of time. Now let’s take another look, this time at a game: ‘No Man’s Sky’. This is not about gaming, but about design. For decades, game worlds were drawn and loaded: a map with its data map (quite literally so), usually the largest part of the entire game. Then 11 people decided to use a formula to procedurally generate 18 quintillion planets. They created a formula that maps a universe of planet-sized planets. This had never been done before, and that is the important part. The designer turned it all around and, moreover, is sitting on a solution worth millions, possibly even billions. The reason to use this example is that games are usually the first field where the edge of hardware is surpassed, broken and redesigned (and there is more at the end of this article); these are issues that require addressing in the data field too.
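To make that design concrete: the core trick of procedural generation is that the world is never stored, only derived on demand from coordinates and a seed. Here is a minimal sketch of that principle; the attribute names and the hash choice are purely illustrative and have nothing to do with the actual No Man’s Sky model.

```python
import hashlib

def planet_at(seed: int, x: int, y: int, z: int) -> dict:
    """Derive a planet's attributes from its coordinates alone.

    Nothing is stored: the same (seed, x, y, z) always yields the
    same planet, so an 18-quintillion-planet universe costs no disk
    space beyond the formula itself."""
    h = hashlib.sha256(f"{seed}:{x}:{y}:{z}".encode()).digest()
    return {
        "radius_km": 2000 + int.from_bytes(h[0:2], "big") % 10000,
        "has_water": h[2] % 4 == 0,
        "terrain_roughness": h[3] / 255.0,
    }
```

The point of the sketch is the trade: a few lines of deterministic derivation replace what would otherwise be an impossibly large stored map.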

Yet what approach would work?

That is pretty much the £1 billion question. Consider the following situation: data is being collected non-stop, minute by minute, and set into all kinds of data repositories. Now let’s take a fictive case. The chatter indicates that in 72 hours an attack will take place, somewhere in the UK. That gives us the premise:

  1. Who
  2. Where
  3. How

Now consider the data. If we have all the phone records: who has been contacting whom, through what methods, and when? You see, it isn’t about the data; it is about linking collections from different sources and finding the right needle, whilst the location, shape and size of the haystack are unknown. Now, let’s say that the terrorist was really stupid and his number is known. So now we have to get a list of all the numbers that this phone has dialled. Then comes the task of linking the information on these people (when they are not on pre-paid or burner phones). Next is the task of building a profile: contacts, places, and other information. The list goes on, and the complexity isn’t just the data; actual terrorists are not dumb and are usually massively paranoid, so there is a limit to the data available.
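That chain of ‘who dialled whom’ is, at its core, a graph expansion problem. A minimal sketch with made-up call records (nothing like a real call-detail system) shows how first- and second-degree contacts fall out of a single known number:

```python
from collections import defaultdict

# Hypothetical call-detail records: (caller, callee, timestamp)
records = [
    ("A", "B", "2016-09-20T10:00"),
    ("B", "C", "2016-09-20T11:30"),
    ("A", "D", "2016-09-21T09:15"),
    ("D", "E", "2016-09-22T14:00"),
]

def contacts_within(records, start, hops):
    """Breadth-first expansion from a known number: who it called,
    then who those numbers called, out to `hops` degrees."""
    graph = defaultdict(set)
    for caller, callee, _ in records:
        graph[caller].add(callee)
        graph[callee].add(caller)  # contact is treated as symmetric
    frontier, seen = {start}, {start}
    for _ in range(hops):
        frontier = {n for f in frontier for n in graph[f]} - seen
        seen |= frontier
    return seen - {start}
```

With real data the hard part is not this expansion but the linking that precedes it: deciding that two numbers, or a number and a person, are the same entity across sources.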

Now what if this was not reactive, but proactive?

What if the data from all the sources could be linked? Social media, e-mail, connections, forums, and that is just the directly stored data. When we add mobile devices (smartphones, tablets and laptops), a massive amount of additional data becomes available, and the amount of data from those sources is growing at an alarming rate. The challenge is to correctly link the data from these sources, together with added sources that contain aggregated data. So, how do you connect these different sources? I am not talking about usage; it is about unpaired data on different foundations, with no way to tell whether pairing leads to anything. For this I need to head towards a 2012 article by Hsinchun Chen (attached at end). Apart from the clarity that we see in the BI&A overview (Evolution, Application and Emerging Research), the interesting part is that even when we just look at it from a BI point of view, we see two paths missing. That is, they seem to be missing now; looking back to 2010-2011, the fact that Google and Apple grew a market in excess of 100% quarter on quarter was not anticipated to that degree. The image on page 1167 has Big Data Analytics and Mobile Analytics, yet Predictive Interactivity and Mobile Predictive Analytics were not part of the map, even though Predictive Analytics has been part of BI from 2005 onwards. Just in case you were wondering, I did not change subject: the software that this part of the intelligence world needs comes from the business side. A company usually sees a lot more business from 23 million global companies than it gets from 23 intelligence agencies. The BI part is often much easier to see and track whilst both needs are served. We see a shift in it all when we look at the table on page 1169. BI&A 3.0 now gets us the Gartner Hype Cycle with the key characteristics:

  1. Location-aware analysis
  2. Person-centred analysis
  3. Context-relevant analysis
  4. Mobile visualization & HCI
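The pairing problem described above, records on different foundations with no shared key, is in essence entity resolution. A toy sketch over invented data, scoring candidate pairs on fuzzy name similarity plus location agreement; real systems weigh far more signals and require far more care:

```python
from difflib import SequenceMatcher

# Two hypothetical sources with no shared key
social = [{"name": "J. Smith", "city": "Leeds"}]
telecom = [{"subscriber": "John Smith", "city": "Leeds"},
           {"subscriber": "Jane Smyth", "city": "York"}]

def pair_score(a, b):
    """Crude similarity: fuzzy name match plus exact city match.
    The 0.7/0.3 weights are arbitrary for this illustration."""
    name = SequenceMatcher(None, a["name"].lower(),
                           b["subscriber"].lower()).ratio()
    return 0.7 * name + 0.3 * (a["city"] == b["city"])

def best_match(record, candidates, threshold=0.6):
    """Return the best-scoring candidate, or None below the threshold."""
    best = max(candidates, key=lambda c: pair_score(record, c))
    return best if pair_score(record, best) >= threshold else None
```

The threshold is what makes this honest: below it, the system admits it cannot tell whether pairing leads to anything, which is exactly the gap described above.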

This is where we see the jump when we relate it to players like Palantir, which is now in the weeds prepping for war. TechCrunch (at https://techcrunch.com/2016/06/24/why-a-palantir-ipo-might-not-be-far-off/) mentioned in June that it had taken certain steps and had been preparing for an IPO. I cannot say how deep that part goes, yet when we line up a few parts we see an incomplete story. The headline in July was: ‘Palantir sues investor Marc Abramowitz for allegedly stealing company secrets‘; I think the story goes a little further than that. It is my personal belief that Palantir has figured something out. That part was seen 3 days ago (at http://www.defensenews.com/articles/dcgs-commentary); the two quotes that matter are “The Army’s Distributed Common Ground System (DCGS) is proof of this fact. For the better part of the last decade, the Army has struggled to build DCGS from the ground up as the primary intelligence tool for soldiers on the battlefield. As an overarching enterprise, DCGS is a legitimate and worthwhile endeavour, intended to compute and store massive amounts of data and deliver information in real time“, which gives us (actually just you, the reader) the background, whilst “What the Army has created, although well-intentioned, is a sluggish system that is difficult to use, layered with complications and unable to sustain the constant demands of intelligence analysts and soldiers in combat. The cost to taxpayers has been approximated at $4 billion“ gives us the realistic scope, and that all links back to the intelligence community. I think that someone at Palantir has worked out a few complications, making their product the one winning solution.
When I started to look into the matter, some parts did not make sense. Even if we take the third statement (which I was already aware of long before this year): “In legal testimony, an Army official acknowledged giving a reporter a “negative” and “not scientific” document about Palantir’s capabilities that was written by a staff member but formatted to appear like a report from the International Security Assistance Force. That same official stated that the document was not based on scientific data“, it would not have added up. What does add up (remember, the next part is speculative) is that the data links required at the beginning of this article have, to a larger extent, been resolved by the Palantir engineers. In its foundation, what the journal refers to as BI&A 3.0 has been resolved by Palantir (to some extent). If true, we will get a massive market shift. To make a comparison: Google Analytics might be regarded as MS-DOS, and this new solution makes Palantir the new SE-Linux edition; the difference could be that big. And I can tell you that Google Analytics is big. Palantir got the puzzle piece, making its value go up by billions. They could raise their value from 20 billion to 60-80 billion, because IBM has never worked out that part of analytics (whatever they claim to have is utterly inferior), and Google does have a mobile analytics part, but a limited one, aimed at a very different market. There have always been issues with the DCGS-A system (apart from it being as cumbersome as a 1990 SAS mainframe edition), so it seems to me that Palantir could not make the deeper jump into government contracts until it got the proper references; showing that it was intentionally kept out of the loop is also evidence that could help. That part was recently confirmed by US Defense News.

In addition, there is the acceptance of Palantir Gotham, which offered 30% more work with the same staff levels, and Palantir apparently delivered. That is a massive point, as the lack of resources is exactly what the intelligence groups are dealing with. The job has allowed New York City to crack down on illegal Airbnb rentals, a task that requires connecting multiple systems and data that was never designed to link together. This now gets us to the part that matters: the implication is that the Gotham core would allow for dealing with the digital data groups, like tablet, mobile and streaming data from internet sites.

When we combine the information (still making it highly speculative), the fact that one Congressman crossed the bridge (Duncan Hunter, R-CA) means many could follow. That part matters, as Palantir can only grow the solution if it is seen as the serious solution within the US government. The allegedly false statements the army made (as seen in Defense News at http://www.defensenews.com/articles/dcgs-commentary), which I personally believe were made to keep in the shadows that DCGS-A was not the big success some claimed it to be, will impact it all.

And this now links to the mentions I made of the academic paper when we look at page 1174, regarding the emerging research for mobile analytics. The options:

  1. Mobile Pervasive Apps
  2. Mobile Sensing Apps
  3. Mobile Social Networking
  4. Mobile Visualization/HCI
  5. Personalization and Behavioural Modelling

These parts are a given, and the big players have some sort of top-line reporting, but if I am correct and Palantir has indeed figured a few things out, they are now sitting on the mother lode, because there is currently nothing that can do any of this anywhere close to real time. Should this be true, Palantir would end up being the only player in town in that field, an advantage corporations haven’t had to this extent since the late 80’s. Consider the approach SPSS used to have before they decided to cater to the smallest iteration of ‘acceptable’; now, as IBM Statistics, they really haven’t moved forward that much.

Now let’s face it, these are all consumer solutions, yet Palantir has a finance option, which is now interesting as Intelligence Online reported a little over a week ago: “The joint venture between Palantir and Credit Suisse has hired a number of former interception and financial intelligence officials“, meaning that the financial intelligence industry is getting its own hunters to deal with. If any of those greedy jackals have been making their deals via their iPhones, they will be lighting up like a Christmas tree in those data sets. So in 2017, the finance/business section of the newspapers should be fun to watch!

The fact that those other players now face a new threat with actual working solutions should hurt plenty too, especially in the lost-revenue section of their spreadsheets.

As a final part, why did I make the No Man’s Sky reference? You see, that is part of it all. As stated earlier, it used a formula to create planet-sized planets, which is one side of the equation. Yet the algorithm could be reversed. There is nothing stopping the makers from scanning a map and deriving a formula that recreates that map. For the gaming industry it would be worth a fortune. However, that application could go a lot further. What if the geospatial data is not a fictive map, but an actual one? What if one type of tree is not trees but mobile users, and the other type of tree is networking nodes? It would be the first move in setting geospatial data in a framework of personalised behavioural modelling against a predictive framework. Now, there is no way that we know where a person will go, yet this would be a massive first step in answering ‘who not to look for‘ and ‘where not to look‘, diminishing a resource drain to say the least.

It would be a game changer for non-gamers!
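The ‘where not to look’ idea above can be illustrated with a toy exclusion filter over a grid. The data is entirely invented and the filter is a crude stand-in for the predictive framework sketched in this post, not anything any vendor actually ships: cells with too little observed activity are pruned from the search space, so resources are only spent where activity has been seen.

```python
from collections import Counter

# Hypothetical observed sightings of a mobile user, as grid cells
sightings = [(2, 3), (2, 4), (3, 3), (2, 3), (7, 9)]

def exclusion_map(sightings, width, height, min_weight=1):
    """Split a width x height grid into cells worth searching and
    cells to exclude, based on how often activity was observed
    there. Returns (keep, excluded)."""
    counts = Counter(sightings)
    keep = {cell for cell, n in counts.items() if n >= min_weight}
    excluded = [(x, y) for x in range(width) for y in range(height)
                if (x, y) not in keep]
    return keep, excluded
```

Raising `min_weight` shrinks the search space further; the value of such a filter lies in how cheaply it answers ‘where not to look’ before any expensive analysis starts.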




Filed under Finance, IT, Military, Politics, Science