I looked at a few issues 3 days ago and voiced them in my blog ‘The Right Tone’ (at https://lawlordtobe.com/2016/09/21/the-right-tone/). One day later we see ‘MI6 to recruit hundreds more staff in response to digital technology’ (at https://www.theguardian.com/uk-news/2016/sep/21/mi6-recruit-digital-internet-social-media). What is interesting here is the quote: “The information revolution fundamentally changes our operating environment. In five years’ time there will be two sorts of intelligence services: those that understand this fact and have prospered, and those that don’t and haven’t. And I’m determined that MI6 will be in the former category”. Now compare it to the statement I had made one day earlier: “The intelligence community needs a new kind of technological solution that is set on a different premise. Not just who is possibly guilty, but the ability of aggregation of data flags, where not to waste resources”, which is just one of many sides needed. Alex Younger also said: “Our opponents, who are unconstrained by conditions of lawfulness or proportionality, can use these capabilities to gain increasing visibility of our activities which means that we have to completely change the way that we do stuff”. I reckon the American expression ‘He ain’t whistling Dixie’ applies.
You see, the issue goes deeper than mere approach; the issue at hand is technology. The technology needs to change and the way data is handled requires evolution. I have been in the data field since the late 80’s and this field hasn’t changed much. Let’s face it, parsing data is not a field that has seen much evolution, for the mere reason that parsing is parsing and it is all about speed. So, to put it on a different vehicle: we are entering an age where the intelligence community is in the haulage of data, yet in all this, it is the container itself that grows whilst the haulage is en route. So we need to find alternative methods to deal with the container content whilst en route.
Consider the data premise: ‘if the data that needs processing grows by 500 man-years of work on a daily basis’, we have to either process smarter, create more solutions to process, be smarter about what and how we process, or change the premise of time. Now let’s take another look. For this, let’s look at a game: ‘No Man’s Sky’. This is not about gaming, but about the design. For decades games were drawn and loaded: a map, with its data map (quite literally so), usually the largest part of the entire game. Eleven people decided to use a formula to procedurally generate 18 quintillion planets. They created a formula to map a universe of planets, planet-sized. This has never been done before, and that is an important part. They turned it all around and, moreover, they are sitting on a solution that is worth millions, possibly even billions. The reason to use this example is that games are usually the first field where the edge of hardware options is surpassed, broken and redesigned (and there is more at the end of this article). Issues that require addressing in the data field too.
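To make the procedural idea concrete: a toy sketch (every name and number here is my own invention, not Hello Games’ actual formula) of deriving a ‘planet’ purely from its coordinates and a seed, so nothing has to be stored and the whole universe is recomputed on demand:

```python
import hashlib

def planet_at(x, y, z, universe_seed=42):
    """Derive a planet's attributes deterministically from its
    coordinates: nothing is stored, everything is recomputed."""
    key = f"{universe_seed}:{x}:{y}:{z}".encode()
    digest = hashlib.sha256(key).digest()
    # Map bytes of the hash onto illustrative planet properties.
    return {
        "radius_km": 2000 + int.from_bytes(digest[0:2], "big") % 10000,
        "terrain": ["ocean", "desert", "forest", "ice"][digest[2] % 4],
        "has_life": digest[3] % 5 == 0,
    }

# The same coordinates always yield the exact same planet:
assert planet_at(1, 2, 3) == planet_at(1, 2, 3)
```

The point of the design is that the formula replaces the data map: the storage cost of 18 quintillion planets collapses to the cost of the formula plus a seed.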
Yet what approach would work?
That is pretty much the £1 billion question. Consider the following situation: data is being collected non-stop, minute by minute, set into all kinds of data repositories. Now let’s take a fictitious case. The chatter gives that in 72 hours an attack will take place, somewhere in the UK. That gives us the premise.
Now consider the data. We have all the phone records: who has been contacting whom, through what methods and when. You see, it isn’t about the data, it is about linking collections from different sources and finding the right needle, whilst the location, shape and size of the haystack are unknown. Now, let’s say that the terrorist was really stupid and the number is known. We then have to get a list of all the numbers that this phone has dialled. Then comes the task of linking the information on these people (when they are not on pre-paid or burner phones). Next is the task of getting a profile, contacts, places, and other information. The list goes on, and the complexity isn’t just the data: actual terrorists are not dumb and are usually massively paranoid, so there is a limit to the data available.
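The chain I just described, starting from one known number and expanding outward through the call records, is in essence a breadth-first walk over a call graph. A minimal sketch, with records entirely invented for illustration:

```python
from collections import deque

# Hypothetical call records: (caller, callee) pairs.
call_records = [
    ("KNOWN-001", "A-100"), ("KNOWN-001", "B-200"),
    ("A-100", "C-300"), ("B-200", "C-300"), ("C-300", "D-400"),
]

def contacts_within(start, hops, records):
    """Breadth-first expansion: every number reachable from `start`
    within `hops` calls, in either direction."""
    graph = {}
    for a, b in records:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == hops:
            continue
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return seen - {start}

print(contacts_within("KNOWN-001", 2, call_records))
```

The trouble is not the walk itself, it is that each extra hop multiplies the profiling workload, and burner phones simply fall out of the graph.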
Now what if this was not reactive, but proactive?
What if the data from all the sources could be linked? Social media, e-mail, connections, forums, and that is just the directly stored data. When we add mobile devices, smartphones, tablets and laptops, a massive amount of additional data becomes available, and the amount of data from those sources is growing at an alarming rate. The challenge is to correctly link the data from these sources, together with added sources that contain aggregated data. So, how do you connect these different sources? I am not talking about the usage; it is about unpaired data on different foundations, with no way to tell whether pairing leads to anything. For this I need to head towards a 2012 article by Hsinchun Chen (attached at the end). Apart from the clarity that we see in the BI&A overview (Evolution, Application and Emerging Research), the interesting part is that even when we just look at it from a BI point of view, we see two paths missing. That is, they seem to be missing now; looking back to 2010-2011, the fact that Google and Apple grew a market in excess of 100% quarter on quarter was not to be anticipated to that degree. The image on page 1167 has Big Data Analytics and Mobile Analytics, yet Predictive Interactivity and Mobile Predictive Analytics were not part of the map, even though the growth of Predictive Analytics has been part of BI from 2005 onwards. Just in case you were wondering, I did not change the subject: the software that part of the intelligence world uses comes from the business side. A company usually sees a lot more business from 23 million global companies than it gets from 23 intelligence agencies. The BI part is often much easier to see and track whilst both needs are served. We see a shift of it all when we look at the table on page 1169. BI&A 3.0 now gets us the Gartner Hype Cycle with the key characteristics:
- Location-aware analysis
- Person-centred analysis
- Context-relevant analysis
- Mobile visualization & HCI
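To make that pairing problem concrete: a toy sketch of linking two sources that share no common identifier, using invented records and a naive phone/name match. (Real record-linkage systems are far more involved; every field and threshold here is an assumption for illustration.)

```python
import re
from difflib import SequenceMatcher

# Two hypothetical sources with no shared identifier.
phone_records = [{"name": "J. Smith", "number": "+44 7700 900123"}]
social_profiles = [{"handle": "jsmith88", "display": "John Smith",
                    "phone": "07700900123"}]

def norm_phone(p):
    """Reduce a phone number to digits and compare on the trailing
    ten, so country-code prefixes do not block a match."""
    return re.sub(r"\D", "", p)[-10:]

def name_similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def link(phones, profiles, threshold=0.5):
    """Pair records whose phones match exactly and whose names are
    similar enough; everything below the threshold stays unpaired."""
    pairs = []
    for rec in phones:
        for prof in profiles:
            if norm_phone(rec["number"]) == norm_phone(prof["phone"]) \
               and name_similarity(rec["name"], prof["display"]) >= threshold:
                pairs.append((rec["name"], prof["handle"]))
    return pairs

print(link(phone_records, social_profiles))
```

Even this toy shows the core issue: without a shared key, every link is probabilistic, and nothing tells you in advance whether a pairing leads to anything.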
This is where we see the jump when we relate it to players like Palantir, which is now in the weeds prepping for war. Tech Crunch (at https://techcrunch.com/2016/06/24/why-a-palantir-ipo-might-not-be-far-off/) mentioned in June that it had taken certain steps and had been preparing for an IPO. I cannot say how deep that part was, yet when we line up a few parts we see an incomplete story. The headline in July was: ‘Palantir sues investor Marc Abramowitz for allegedly stealing company secrets’; I think the story goes a little further than that. It is my personal belief that Palantir has figured something out. That part was seen 3 days ago (at http://www.defensenews.com/articles/dcgs-commentary); the two quotes that matter are “The Army’s Distributed Common Ground System (DCGS) is proof of this fact. For the better part of the last decade, the Army has struggled to build DCGS from the ground up as the primary intelligence tool for soldiers on the battlefield. As an overarching enterprise, DCGS is a legitimate and worthwhile endeavour, intended to compute and store massive amounts of data and deliver information in real time”, which gives us (actually just you the reader) the background, whilst “What the Army has created, although well-intentioned, is a sluggish system that is difficult to use, layered with complications and unable to sustain the constant demands of intelligence analysts and soldiers in combat. The cost to taxpayers has been approximated at $4 billion”, gives us the realistic scope, and that all links back to the intelligence community. I think that someone at Palantir has worked out a few complications, making their product the one winning solution.
When I started to look into the matter, some parts did not make sense. Even if we take the third statement (which I was already aware of long before this year): “In legal testimony, an Army official acknowledged giving a reporter a “negative” and “not scientific” document about Palantir’s capabilities that was written by a staff member but formatted to appear like a report from the International Security Assistance Force. That same official stated that the document was not based on scientific data”, it would not have added up. What does add up (remember, the next part is speculative): the data links required at the beginning of this article have, to a larger extent, been resolved by the Palantir engineers. In its foundation, what the journal refers to as BI&A 3.0 has been resolved by Palantir (to some extent). If true, we will get a massive market shift. To make a comparison, Google Analytics might be regarded as MS-DOS and this new solution would make Palantir the new SE-Linux edition; the difference on this element could be that big. And I can tell you that Google Analytics is big. Palantir got the puzzle piece, making its value go up by billions. They could raise their value from 20 billion to 60-80 billion, because IBM has never worked out that part of analytics (whatever they claim to have is utterly inferior) and Google does have a mobile analytics part, but a limited one, as it is for a very different market. There have always been issues with the DCGS-A system (apart from it being as cumbersome as a 1990 SAS mainframe edition), so it seems to me that Palantir could not make the deeper jump into government contracts until it got the proper references, and showing it was intentionally kept out of the loop is also evidence that could help. That part was recently confirmed by US Defense News.
In addition there is the acceptance of Palantir Gotham, which offered 30% more work with the same staff levels, and Palantir apparently delivered; that is a massive point, as the lack of resources is exactly what the intelligence groups are dealing with. The solution has allowed New York City to crack down on illegal Airbnb rentals, a task that requires connecting multiple systems and data that were never designed to link together. This now gets us to the part that matters: the implication is that the Gotham core would allow for dealing with the digital data groups like tablet, mobile and streaming data from internet sites.
When we combine the information (still making it highly speculative), the fact that one Congressman crossed the bridge (Duncan Hunter, R-CA) means many could follow. That part matters, as Palantir can only grow the solution if it is seen as the serious solution within the US government. The alleged false statements the Army made (as seen in Defense News at http://www.defensenews.com/articles/dcgs-commentary), which I personally believe were made to keep in the shadows that DCGS-A was not the big success some claimed it to be, will impact it all.
And this now links to the mentions I made of the academic paper when we look at page 1174, regarding the Emerging Research for Mobile Analytics. The options:
- Mobile Pervasive Apps
- Mobile Sensing Apps
- Mobile Social Networking
- Mobile Visualization/HCI
- Personalization and Behavioural Modelling
Parts that are a given, and the big players have some sort of top-line reporting, but if I am correct and Palantir has indeed figured a few things out, they are now sitting on the mother lode, because there is currently nothing that can do any of it anywhere close to real time. Should this be true, Palantir would end up being the only player in town in that field, an advantage corporations haven’t had to this extent since the late 80’s. Consider the approach SPSS used to have before they decided to cater to the smallest iteration of ‘acceptable’; now, as IBM Statistics, they really haven’t moved forward that much.
Now let’s face it, these are all consumer solutions, yet Palantir has a finance option, which is now interesting as Intelligence Online reported a little over a week ago: “The joint venture between Palantir and Credit Suisse has hired a number of former interception and financial intelligence officials”, meaning that the financial intelligence industry is getting its own hunters to deal with. If any of those greedy jackals have been getting their deals via their iPhone, they will be lighting up like a Christmas tree on those data sets. So in 2017, the finance/business section of the newspapers should be fun to watch!
The fact that those other players are now getting a new threat with actual working solutions should hurt plenty too, especially in the lost revenue section of their spreadsheet.
In the final part, why did I make the No Man’s Sky reference? You see, that is part of it all. As stated earlier, it used a formula to create a planet-sized planet, which is one side of the equation. Yet the algorithm could be reversed. There is nothing stopping the makers from scanning a map and getting us a formula that creates that map. For the gaming industry it would be worth a fortune. However, that application could go a lot further. What if the geospatial data is not a fictive map, but an actual one? What if one type of tree is not a tree but a mobile user, and the other type of tree is a networking node? It would be the first move in setting geospatial data in a framework of personalised behavioural modelling against a predictive framework. Now, there is no way that we know where a person would go, yet this would be a massive first step in answering ‘who not to look for’ and ‘where not to look’, diminishing a resource drain to say the least.
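To picture that ‘where not to look’ step: a toy sketch (coordinates and labels entirely invented) that buckets labelled geospatial points into a coarse grid and flags the cells holding network infrastructure but no mobile users as candidates to deprioritise:

```python
# Hypothetical labelled geospatial points: (x, y, kind).
points = [
    (1.0, 1.0, "mobile_user"), (1.2, 0.9, "network_node"),
    (5.0, 5.1, "network_node"), (5.3, 4.8, "network_node"),
    (9.0, 9.2, "mobile_user"),
]

def grid_cell(x, y, size=2.0):
    """Bucket a coordinate into a coarse grid cell."""
    return (int(x // size), int(y // size))

def cells_to_skip(pts, size=2.0):
    """Cells containing points but no mobile users can be
    deprioritised: a first cut at 'where not to look'."""
    users, occupied = set(), set()
    for x, y, kind in pts:
        cell = grid_cell(x, y, size)
        occupied.add(cell)
        if kind == "mobile_user":
            users.add(cell)
    return occupied - users

print(cells_to_skip(points))
```

It answers nothing about where a person will go; it only trims the haystack, which is precisely the resource point made above.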
It would be a game changer for non-gamers!