The Best of Both – A Road to Explainable Accurate AI
Embedding Highly-Accurate Unexplainable AI within Highly-Trustworthy Explainable AI
As the most popular modelling method in machine learning, neural networks provide generic ways to combine the modelling of complex relationships with accurate predictions in static and dynamic data analytical decision problems. The problem with neural networks however is in their inexplicable decision-making: the rules automatically (“machine”) learned from the data cannot be explicated to the human mind. Therefore, where knowing your knowledge (i.e. knowing which rules apply when) has always been the exclusive purview of Man, with neural network AI in some areas this may become the exclusive purview of the Machine. How shall this convince end users that their risks are minimized and objectives will be reached? And when unfortunate events do happen, who is going to do the explaining? From business, ethical and legal perspectives, a reality in which operational knowledge is unknowable to the human mind seems too risky to accept. AI therefore has to be explainable to manage its utilization wisely, and to win the trust required for its acceptance in business and society.
At Massive Analytic, our focus is on Explainable AI (XAI), which can be delivered by combining complementary AI techniques into accurate and trustworthy predictive solutions. In particular, our patented Precognition AI (P-AI) decision tree technology, in development since 2014, is in many instances capable of delivering highly accurate yet explainable predictions, and complements inexplicable AI by providing insight into the nature of the decision-making problems.
Modern AI, Classical Understanding
A recent controversy around the AI-driven Epic Sepsis Model (ESM), in which researchers at the University of Michigan found that the model developed by Epic fell far short in identifying sepsis cases accurately, illustrates a typical problem with the real-life application of Modern AI. The ongoing discussion it has generated will no doubt help identify the crucial questions that arise when more complex prediction tools start presenting insurmountable barriers to human understanding and external validation. One of these questions will surely highlight that for AI to be “reasonable”, notwithstanding that it is “artificial”, it should also be “explainable”.
Modern AI is characterized by the rapid advance of neural networks in AI applications, made possible by the mass availability of cheap computing power roughly since the start of the 3rd millennium. Computation-intensive by nature, supervised or “data-trained” neural networks turn out to provide generic ways to model complex relationships in static as well as dynamic data analysis problems, fields addressed by the Data Science and Reinforcement Learning communities, respectively. Instead of encoding the relationships or “rules” explicitly (“upfront”), as with Classical AI, in machine learning (ML) AI the algorithms “learn” the decision rules automatically from the input data; in the case of dynamic data, the AI algorithm (or Agent) is said to "experience the world" by interacting with its surrounding context in a trial-and-error fashion.
Demis Hassabis, co-founder of DeepMind (the company behind powerful, multi-purpose deep neural networks), phrased it as follows in a 2015 interview: with the "… ability to learn for itself from experience … it can do stuff that maybe we don't know how to program". This is indeed a remarkable achievement, worthy of science's mission to overcome Man's limitations. Yet, unwittingly, Hassabis' phrase also illustrates the dilemma. Knowing your knowledge essentially consists of "knowing which rules to apply when" (which is essentially what models do). With Modern AI, for the first time in history, rule discovery is delegated to machines without knowing "how to program" them, and without ways for Man to learn and understand in retrospect how and what the Machine discovered as its preferred reasoning. This opens the floodgates to "unintended consequences". In experimental (“toy”) environments this is commonly accepted, but in practical applications that touch on our everyday lives, it certainly is not.
Since ancient times, scientific knowledge has been the exclusive purview of Man; with modern AI, in some areas it may become the exclusive purview of the Machine, inaccessible to the human mind. However, where consequences are grounded in accountability (as in most judiciary systems we are familiar with), the absence or lack of Man's understanding of a Machine's behaviour poses fundamental questions about the Man/Machine relationship and accountability at large, for instance: "Who is accountable for what?", and "What cannot be accounted for?" In the absence of explainable AI (XAI), the chain of accountability is broken. With the inexplicable Machine in the middle, even finger pointing between AI user and AI developer seems pointless (as one ESM related article demonstrates).
A 2021 article in Forbes (Artificial Intelligence’s Biggest Stumbling Block: Trust) pointedly describes companies' awareness of this Modern AI dilemma. Consequently, many hesitate to accept and decisively buy into it, despite its demonstrable predictive power (accuracy) in many practical applications. The key issue is the “Risk of Unaccounted and Unaccountable Consequences”: the unaccounted risk of inadvertently introducing adverse scenarios (either through erroneous control and/or management policies, or the lack thereof), and the unaccountability risk of inexplicable cause-to-consequence (i.e. implicit model) relationships. This healthy lack of trust is born out of a refusal to take responsibility and be held accountable for all outcomes and consequences of a Machine's decision making that is insufficiently transparent to Man.
Interestingly, “The more that companies use AI in decision-making, the more confident they become in these technologies’ ability to deliver.” Meanwhile, a vast sceptical majority is holding off: a “... self-reinforcing interplay of three factors is impeding progress: failure to appreciate AI’s full decision-making potential, low levels of trust in AI and limited adoption of these technologies.” Apparently, two contrasting trends are developing simultaneously: a minority of companies is embracing modern AI, reaping its benefits and assuming the accountability issues that come with it, while a majority is avoiding the accountability issues but not reaping AI’s benefits either. Where the adopting companies prioritize market risk over legal risk, the sceptical majority prioritizes the other way around. Indeed, “… building trust in the use of AI to make superior decisions takes time”. The responsibility of building that trust falls predominantly to the inventors and sellers of AI solutions.
When restricted to the cyber-social domain, human Trust is measured by the degree of belief in the predictability of behaviour of Man and Machines. This Trust rests on three complementary sources: (1) Trust assumed from authority, (2) Trust built upon (scientific) understanding (knowledge), and (3) Trust derived from statistical evidence (e.g. demonstration). For instance, our Trust in using complex man-made systems, like cars and aeroplanes, comes primarily from a combination of (1) and (3): (1) trusting the (publicly responsible) regulatory authorities to filter out and penalize unreliable designs and trusting the (educated) builders to know their science (note that delegated understanding bears on authority), and (3) allowing ourselves to be convinced by the publicly observable statistical evidence of the safety of cars and aeroplanes. For most of us, lacking a personal understanding of how these complex systems work internally, our level of Trust in them derives indirectly from the public institutional, educational and informational systems that surround them. Yet this infrastructure only exists because of today’s maturity of the underlying industries; in their early beginnings, they were in the same situation AI is in now: at the pioneering stage, embraced by the daring heroes, mistrusted by the cautious many. On the way to widespread acceptance, in growing from the pioneering to the early adoption stage, winning the Trust of authorities and the public was crucially important, and required a lot of science-based explaining along the way.
The crucial difference between past pioneering stages and today’s AI pioneering stage is the speed of deployment: compared to “slow” material deployment (cars, aeroplanes, …), the cyberspace deployment of new AI technologies is virtually global and instant. The good and the bad will travel equally wide and fast. (Experience with the systemic fragility of the financial system serves as an appropriate example.)
Modern AI, Complexity and Explainability
At Massive Analytic we believe that a responsible deployment of AI technology must be matched with a maximum of transparency regarding its inner workings, so that Man can not only know what he or she is accountable for, but can also learn and mitigate when things go wrong. Responding to industry's call for an Explainable AI (XAI), AI transparency must demystify much of ML’s neural net technology: from the inside, by using reasoning nets more amenable and adapted to the human mind, and from the outside, by complementing neural nets’ inexplicable “reasoning” schemes with explicit, explainable ones.
To identify where and how unexplainable AI should be complemented, we first study some of the assumptions (explicit and implicit) that underlie the ML neural net methodology:
Just as a (classical mathematical) singular value decomposition (SVD) captures the unique information in a data matrix, a neural network (NN) captures the complex correlations between inputs and outputs. Usually, in both cases the underlying latent space holding the essential information is of much lower dimensionality than that of the original data. However, where SVD rests on well-grounded mathematical theory that guarantees a parsimonious underlying space, neural network design is (still) based on art rather than science, and such parsimony cannot be guaranteed. Therefore, an NN over-fitting the data is the rule rather than the exception, and with no scientific handle to measure this, even extensive testing cannot exclude the occurrence of noisy or absurd outcomes at some time.
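The parsimony guarantee that SVD offers can be stated compactly. The following is a sketch in standard textbook notation; the symbols (X for the data matrix, r for its rank, k for the chosen latent dimension) are assumptions of this illustration and do not come from the article.

```latex
% Singular value decomposition of a data matrix X of rank r,
% with singular values ordered from largest to smallest.
X = U \Sigma V^{\top} = \sum_{i=1}^{r} \sigma_i \, u_i v_i^{\top},
\qquad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0

% Keeping only the k largest singular values gives a rank-k approximation
% whose squared Frobenius-norm error is exactly the discarded energy.
X_k = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{\top},
\qquad
\lVert X - X_k \rVert_F^{2} = \sum_{i=k+1}^{r} \sigma_i^{2}
```

By the Eckart-Young theorem, no other matrix of rank at most k approximates X more closely in this norm, so the size of the discarded singular values gives a direct, measurable handle on how much information a k-dimensional latent space loses. Neural network training, as the paragraph above notes, comes with no comparable guarantee.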
Leveraging combinatorial flexibility / complexity, an NN can avoid much of the bias that inherently occurs when modelling complex relationships. Inevitably though, every model constrains the data and will therefore also introduce bias. Important sensitivities in complex decision problems will therefore inevitably be “overlooked”, and since there is no way to inspect the NN’s learned rules (“knowledge”), bias in data and model will surface only after extensive use of the algorithm in practice. Therefore, in many cases (if not in most cases) the machines cannot be trusted to learn the correct rules by themselves. As a matter of fact, there is a lot of discussion about bias in certain AI (neural network based) algorithms.
Characteristic of the implied reductionism in every analytical (i.e. scientific) method, automatic rule learning only works for problems with rather simple value systems. Many real-world problems, however, are complex in the sense that decisions must be balanced against multiple heterogeneous values (numerical and categorical alike). In such cases, severe model reductions (compounding and/or eliminating values and features) must be applied to render the ML problem feasible (i.e. to make it produce a “solution”). Alternatively, an XAI approach would give human insight into alternative reduction schemes and their various trade-offs, and would leverage the human mind’s unique abilities to arrive at good and reliable results. (This touches on the complementarity aspects between Man and Machine and the balance between Decision Support versus Automation.)
An arena of uncertainties underlies every real-world complex decision-making problem. For instance, data/information may be lacking, conflicting (ambiguous), duplicitous or imprecise (non-specific); our knowledge (encoded in models) is biased, reductionist (simplistic in behaviour and/or features) or extravagant (the opposite of reductionist). Sophisticated data/info fusion schemes are required to deal with this compounded complexity (multi-valued info & heterogeneous uncertainty) in order to propagate content and uncertainties faithfully through the information ranks. Neural network based ML methods use a limited set of simple combination schemes, which are unlike any sophisticated form of data/information fusion. In contrast, XAI that incorporates sophisticated fusion schemes based on possibilistic techniques can adapt to a much wider space of complex decision-making problems.
Neural network ML models are learned from real-world data. As with any predictive scheme, the aim always is to predict events based on the identified historical correspondence between (input) features and (output) events. The bulk of this data, however, represents the world's "nominal" behaviour (the "business as usual" events); capturing the rare but equally real freak events requires data volumes orders of magnitude larger than what is usually available, or what can be processed in a reasonable time. Not being able to handle corner cases implies that, in general, singular (“freak”) events will not be predicted because of (necessary training) limitations to the feature value domain (false negatives), and spurious freak events may be predicted when (online input) feature values veer outside their valid (training) domain (false positives). Moreover, singular events often have unique behaviours (pertaining to a different, more catastrophic regime) that deviate strongly from the predominant learned patterns predicting the mainstream behaviour. Conflating these dissimilar regimes in a single NN is bad policy. However, as all regimes are relevant, and especially the catastrophic ones should be anticipated and handled properly, only an XAI methodology will be able to explain which regime a system is operating in at any given time.
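One simple, transparent safeguard against the out-of-domain inputs described above is to record the range each feature took during training and to flag any online input that falls outside it. The following is a minimal sketch under stated assumptions: the function names, feature names and values are hypothetical and purely illustrative, and a per-feature min/max range is only the crudest possible description of a "valid training domain". It is not a description of P-AI or of any particular product.

```python
from typing import Dict, List, Tuple

Row = Dict[str, float]
Domain = Dict[str, Tuple[float, float]]


def fit_domain(training_rows: List[Row]) -> Domain:
    """Record the (min, max) range each feature took in the training data."""
    domain: Domain = {}
    for row in training_rows:
        for name, value in row.items():
            low, high = domain.get(name, (value, value))
            domain[name] = (min(low, value), max(high, value))
    return domain


def out_of_domain(row: Row, domain: Domain) -> List[str]:
    """Return the names of features whose values lie outside the training range.

    Assumes every feature in `row` was also present in the training data.
    """
    flagged = []
    for name, value in row.items():
        low, high = domain[name]
        if value < low or value > high:
            flagged.append(name)
    return flagged


# Hypothetical training data: illustrative values only.
training = [
    {"temperature": 36.5, "heart_rate": 62.0},
    {"temperature": 37.2, "heart_rate": 88.0},
    {"temperature": 38.0, "heart_rate": 101.0},
]
domain = fit_domain(training)

nominal = {"temperature": 37.0, "heart_rate": 75.0}
extreme = {"temperature": 41.3, "heart_rate": 95.0}

print(out_of_domain(nominal, domain))  # no features flagged
print(out_of_domain(extreme, domain))  # temperature is outside the training range
```

A flagged input does not say what the right prediction is; it only tells the human decision maker that the model is being asked about a regime it has never seen, which is exactly the information an opaque predictor does not volunteer.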
Another issue is that a lack of data limits the applicability of supervised learning, but not that of rule-based systems. In many cases, getting access to labelled data is very challenging, either because assembling a good-sized dataset is very demanding (e.g. constructing a carbon-labelled dataset for agriculture would require taking hundreds to thousands of soil samples, each of which must be analysed in a lab), or because the data is heavily encumbered by GDPR issues, or both. A rule-based system does not require large datasets; a good-sized dataset is of course always valuable for validating that the system works, but it is not needed for training and development, which is what matters most.
A chain is only as strong as its weakest link; if such a link is on a critical path, the same applies to much of its surrounding network. As decision-making workflows are built bottom-up by interconnecting decision-making nodes, insufficient attention is usually given to the implied increase in the network’s complexity and fragility. Keeping a top-down handle on such compounding complexity requires the regular evaluation and prediction of the quality and robustness of each decision node and of their sub-clusters. At such a meta-level, only an XAI can provide the actionable intelligence and systemic insight needed to maintain operational integrity and plan for robust strategic innovation.
Towards Explainable AI (XAI)
Well before neural networks became the go-to method in Machine Learning, decision trees were already being studied in the mid-60s. In psychology, they were used to model the human concept of learning and to explore the human mind. When researchers discovered that the algorithm was also useful for programming, it became a proper ML technique, with the first classification tree appearing in 1972 in the THAID project. Distinct from neural networks (which derive from conceptualising the mind’s underlying, more abstract neurological framework), decision trees, with their attention focused on the logical aspects of the human mind, are well-attuned to human reasoning and decision making.
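What makes a decision tree attuned to human reasoning is that every prediction comes with the chain of rules that produced it. The following is a minimal sketch of that idea, assuming a tiny hand-written tree; the class names, features, thresholds and labels are hypothetical and chosen only for illustration, and the sketch shows a plain crisp decision tree, not the possibilistic trees used in P-AI.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    label: str  # the decision returned when this leaf is reached


@dataclass
class Split:
    feature: str
    threshold: float
    below: Union["Split", Leaf]  # branch taken when value <= threshold
    above: Union["Split", Leaf]  # branch taken when value > threshold


def predict_with_trace(
    node: Union[Split, Leaf], row: Dict[str, float]
) -> Tuple[str, List[str]]:
    """Walk the tree and return the decision plus the rules that fired."""
    trace: List[str] = []
    while isinstance(node, Split):
        value = row[node.feature]
        if value <= node.threshold:
            trace.append(f"{node.feature} = {value} <= {node.threshold}")
            node = node.below
        else:
            trace.append(f"{node.feature} = {value} > {node.threshold}")
            node = node.above
    return node.label, trace


# Hypothetical tree: illustrative features and thresholds only.
tree = Split(
    "income", 30000.0,
    Leaf("decline"),
    Split("debt_ratio", 0.4, Leaf("approve"), Leaf("refer to human")),
)

label, trace = predict_with_trace(tree, {"income": 52000.0, "debt_ratio": 0.55})
print(label)
for rule in trace:
    print("  because", rule)
```

The trace is the explanation: a reviewer can check each fired rule against domain knowledge, which is the kind of inspection a trained neural network's weights do not allow.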
With neural networks and decision trees at opposite ends of the explainability spectrum (Figure 1), both make valuable and unique contributions to human cognition (which requires both logical structure and its valuation). By analogy with particle physics, neural nets can be conceived of as modelling the micro-level interactions between elementary (“Gibbs”) particles, with decision trees integrating these micro-behaviours into macro-entropic (thermodynamic) cognizable laws (i.e. the decision rules). This analogy also provides us with a scientific entropic framework to systematically integrate the complementary decision tree and neural network technologies. The quality of the end result then rests on a successful integration of both. Bottom-up, neural networks will provide high-quality evaluation of the problem features (as main constituents representing the underlying logic); top-down, decision trees steer the cognition network layout and attention span (as lowest level data and information processors). Specification, implementation and evaluation of alternatives and trade-offs under this paradigm make for an interesting research programme indeed.
The XAI Developer and Vendor Role
Understandably, building trust in an AI application will almost always be mediated by AI decision support to human decision makers (“human-in-the-loop”), rather than by fully automated AI control. Currently, AI-assisted decision support technologies find more applications because they do not require full trust upfront, and professionals using AI-assisted tools that cannot explain their outputs are found to be much more reluctant and hesitant to follow their advice.
As most real-world decisions have consequences with multifarious values ranging over the business, legal, social/moral and ethical domains, at Massive Analytic we are convinced that to have control over complexity (rather than being controlled by it), AI is required to be explainable and interpretable. By implication, AI will have to be comprehensible to the human mind. As an inventor and developer of AI technology and a seller of AI products and solutions, we are fully committed to the mission of developing and deploying trustworthy and accurate AI. Our vision is to combine the best of both, decision trees and neural nets, according to the celebrated and proven entropic framework of old.
The uncompromising requirement for AI to be logically transparent and accurate not only drives decision tree and neural network technologies towards each other; within each technology, serious work is also under way to overcome its respective weaknesses. Symbolic neural networks and explainability added to deep neural networks are being experimented with; on the side of decision trees, MAL’s patented Precognition AI (P-AI) technology, based on possibilistic decision trees, has been in development since 2014. Our P-AI provides insight into the nature of the decision-making problems and in many instances already delivers highly accurate yet explainable predictions. When it complements unexplainable ML methodology, many results that may appear mysterious at first become transparent. We have also embarked on research aimed at its integration within a comprehensive information-theoretic (entropic) ML framework.
Many companies that are still rightfully cautious, sceptical and averse towards the mystery and magic surrounding much of Modern AI can be won over when they see P-AI clearing much of the fog in competitive problem spaces, reaching explicable prediction results with accuracy comparable to that of traditional neural network models.