Archive for the Risk Category

Remembering the Master of All Trades

Exiting graduate school in ‘91, I interviewed with the Thinking Machines Corporation (TMC). It was a job for which I was woefully unqualified. No surprise, I didn’t get the chance to find out. And anyway, in two years, the company had gone under, victimized by being ahead of its time. But for the ten years it was in existence, TMC burned a spectacular trail through applied computational science. Inarguably its brightest light was the physicist Richard Feynman. (TMC founder Daniel Hillis affectionately recounted their collaboration in Physics Today, and you can read it here.)

Feynman’s legend as a Javert-like pursuer of enlightenment is well-established. (See, for instance, the Feynman tribute site.) Sadly, set next to his spectacular physics work, Feynman’s contributions to analytic decision-making are less well-known. Hillis’ article described some of Feynman’s pioneering work on the numerics of the Connection Machine. But more interesting to me is Feynman’s work on the Challenger disaster commission.

[Image: Feynman dunking an O-ring in ice water]

Alone among the “wise men” on the commission, Feynman went to the source – interviewing the shuttle engineers whose 1 in 100 failure estimate magically improved to 1 in 100,000 in the hands of NASA management! His appendix to the final report is worth reading as a primer on (desirable) structured design and (self-deluding) risk management. For instance, here is Feynman talking about how uniquely shuttle engines were built:

The usual way that such engines are designed (for military or civilian aircraft) may be called the component system, or bottom-up design. First it is necessary to thoroughly understand the properties and limitations of the materials to be used (for turbine blades, for example), and tests are begun in experimental rigs to determine those. With this knowledge larger component parts (such as bearings) are designed and tested individually. As deficiencies and design errors are noted they are corrected and verified with further testing. Since one tests only parts at a time these tests and modifications are not overly expensive. Finally one works up to the final design of the entire engine, to the necessary specifications. There is a good chance, by this time that the engine will generally succeed, or that any failures are easily isolated and analyzed because the failure modes, limitations of materials, etc., are so well understood. There is a very good chance that the modifications to the engine to get around the final difficulties are not very hard to make, for most of the serious problems have already been discovered and dealt with in the earlier, less expensive, stages of the process.

The Space Shuttle Main Engine was handled in a different manner, top down, we might say. The engine was designed and put together all at once with relatively little detailed preliminary study of the material and components. Then when troubles are found in the bearings, turbine blades, coolant pipes, etc., it is more expensive and difficult to discover the causes and make changes. For example, cracks have been found in the turbine blades of the high pressure oxygen turbopump. Are they caused by flaws in the material, the effect of the oxygen atmosphere on the properties of the material, the thermal stresses of startup or shutdown, the vibration and stresses of steady running, or mainly at some resonance at certain speeds, etc.? How long can we run from crack initiation to crack failure, and how does this depend on power level? Using the completed engine as a test bed to resolve such questions is extremely expensive. One does not wish to lose an entire engine in order to find out where and how failure occurs. Yet, an accurate knowledge of this information is essential to acquire a confidence in the engine reliability in use. Without detailed understanding, confidence can not be attained.

A bottom-up approach is equally essential for building robust and sustainable Analytics-driven decision-support systems. Without the benefit of well-tested atomic models (e.g., the distinct demand patterns of fundamentally different types of products), building a large-scale model ab initio (in this instance, a manufacturing company’s demand forecast) invites shelfware-hood. I have seen exactly this sort of failure occur at a multi-billion dollar world-wide ERP roll-out, where the demand planning model template built by HQ analysts was so inapplicable to specific business units that local planners took to overwriting official forecasts produced by the ERP with their own hand-computed numbers!

As I’ve previously described (e.g., here), the practice of Analytics inevitably occurs in the shadow of business imperatives. Business goals give shape to the analysis. But they also have the power to corrupt the process. Feynman pungently captures that corruption at NASA, and its consequences, in his final words:

For a successful technology, reality must take precedence over public relations, for nature cannot be fooled.

One wonders about Feynman’s analysis of the current financial crisis, which appears at least as much a product of self-delusion on high as the Challenger disaster!

Tell Me Something I Already Know (Or Want to be True)!

As a member of the advisory council for the upcoming INFORMS Practice Meeting in Phoenix, I am assembling what I hope will be a boffo slate of speakers for the track on Managing Risk & Uncertainty. Researching recent work in the area, I encountered an excellent 1997 paper by Dick Barr and Tom Siems titled Bank Failure Prediction Using DEA to Measure Management Quality. Early warning indicators (called Key Risk Indicators, KRIs, in the Risk Management community) of bank failure include one that is difficult to directly extract from balance sheets: Management Quality. Barr and Siems used Data Envelopment Analysis, or DEA (a linear programming-based efficiency measure, look here for a tutorial) to identify an analytically meaningful surrogate for Management Quality. The resulting multi-factorial risk model was remarkably predictive: it could correctly label a bank as strong or an incipient failure with 96% accuracy, a year to 18 months out.
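To make the DEA idea concrete, here is a minimal sketch of the simplest possible case: one input and one output under constant returns to scale. In that special case the efficiency score of each unit reduces to its output/input ratio divided by the best ratio in the peer group. The bank names, the figures, and the choice of input and output are all hypothetical, invented purely for illustration. Barr and Siems work with multiple inputs and outputs, which requires solving a linear program per bank rather than the shortcut below.

```python
def ccr_efficiency_single(inputs, outputs):
    """Single-input, single-output DEA efficiency (constant returns to scale).

    Each unit's score is its output/input ratio relative to the best
    ratio observed among its peers, so the best unit scores 1.0.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


# Hypothetical peer group (made-up figures, same currency units throughout).
banks = ["Bank A", "Bank B", "Bank C", "Bank D"]
noninterest_expense = [20.0, 35.0, 50.0, 40.0]   # assumed input
earning_assets = [400.0, 630.0, 750.0, 480.0]    # assumed output

scores = ccr_efficiency_single(noninterest_expense, earning_assets)
for name, score in zip(banks, scores):
    print(f"{name}: efficiency = {score:.2f}")
```

A low score says only that a bank converts its input into output less effectively than the best of its peers. Whether that is a usable surrogate for Management Quality is precisely what the paper sets out to test.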

I wonder whether the early warning system was used. Siems is listed as employed by the Federal Reserve Bank of Dallas, a regulatory body. So there appears to be a prima facie opportunity to apply the model. On the other hand, I would not be surprised to hear that the paper’s publication was its terminal “development milestone”.

Speaking later with Doug Smith, another financial risk estimation guru, I commiserated over how often models such as Barr & Siems’ are left on the shelf. As Doug characterized the unfortunate imperatives facing his technical collaborators employed in finance: “The pressure to create profits meant that the results of risk models were ignored”.

Doug’s lament brought to mind a frequent problem for the analytics practitioner: the unfortunate habit of our “customers” (whether line managers within our own organizations or external consulting clients) to cherry-pick which analysis to use, and which to ignore. The situation becomes especially tricky when analytical findings collide with political winds. The latest such example comes from the British Isles, where a pay-as-you-go congestion charge system pushed by Newcastle University researchers was deep-sixed by Manchester voters. Feasibility, it turns out, is in the eyes of the customer. In this case, the citizenry decided that the traffic smoothing benefits of the congestion charge were trumped by the nefariousness of the new “tax”. Never mind the value of time!

While there are good estimates for project acceptance/failure (e.g., here), I have not come across estimates of how often analytics projects fail, or are shelved, for extraneous reasons. Do you have data, or even stories, to share?

Probability Management

In recent weeks I have been working with Sam Savage, well-known OR personality and a consulting professor at Stanford. We’re focusing on developing a practice framework for Probability Management. Whazzat, you ask? In sum, Probability Management is all about robust decision-making in the presence of uncertainty. (Pretty much the vision for Intechné!)

Since real world decision problems are almost always ill-structured and fuzzy, our tools of choice belong to the worlds of simulation, and statistical visualization. Stochastic optimization plays a role too, but in a very different form than typically understood, say, in Operations Research circles. In general, we are not interested in creating IT systems that generate “best possible” recommendations. Rather, we enable managers to interactively explore the decision space of good solutions, using something similar to a business intelligence (BI) approach. The key difference between BI and Probability Management is that while BI is essentially descriptive (identifying multi-factorial relationships, typically for historical data) Probability Management is prescriptive: our clients learn what to do better.
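As a rough illustration of what exploring the decision space by simulation can look like, here is a minimal Monte Carlo sketch. Everything in it is an assumption made for the example: the price, the unit cost, the normally distributed demand, and the candidate order quantities. It is not one of the models on our site. The sketch compares a plan evaluated at average demand with the same plan evaluated across simulated demand, and reports both the mean profit and the chance of a loss for each candidate.

```python
import random
import statistics

PRICE = 10.0       # assumed selling price per unit
UNIT_COST = 6.0    # assumed cost per unit ordered


def profit(order_qty, demand):
    """Profit when unsold units are worthless and unmet demand is lost."""
    return PRICE * min(order_qty, demand) - UNIT_COST * order_qty


def simulate(order_qty, demands):
    """Summarize one candidate decision over all simulated demand trials."""
    profits = [profit(order_qty, d) for d in demands]
    return {
        "order_qty": order_qty,
        "mean_profit": statistics.mean(profits),
        "p_loss": sum(p < 0 for p in profits) / len(profits),
    }


rng = random.Random(42)  # fixed seed so the trials are reproducible
# Assumed demand model: normal with mean 100 and sd 30, truncated at zero.
demands = [max(0.0, rng.gauss(100, 30)) for _ in range(10_000)]

avg_demand = statistics.mean(demands)
profit_at_avg_demand = profit(100, avg_demand)

results = [simulate(q, demands) for q in (60, 80, 100, 120)]

print(f"Profit if demand always equals its average: {profit_at_avg_demand:.0f}")
for r in results:
    print(f"order {r['order_qty']:>3}: mean profit {r['mean_profit']:.0f}, "
          f"chance of loss {r['p_loss']:.1%}")
```

Because profit is capped by the order quantity, the profit computed at average demand overstates the average profit across trials. The table of candidates, rather than a single “optimal” number, is what a manager would explore interactively.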

I intend to write further on this topic, but for now let me point interested readers to our newly redesigned web site. The organization is a loose consortium of academic and commercial folks involved in the field, as vendors, users, and advisors. Check out the Interact! tab. It contains illustrative Excel models that describe the relevant concepts far better than long-winded descriptions. If you find it interesting and wish to discuss further, contact me.

OR Practice Methodology: Assumptions & Concepts

In a recent article, I examined the need for the practice of Operations Research to be driven by a formal methodology. The primary motivating factor is the increasing mainstreaming of OR (or, more generally, Advanced Analytics) techniques in Enterprise IT. This in turn is forcing system developers and consultants to proactively manage scalability and risk by “industrializing” OR delivery.

The typical use of the word “methodology” in OR-related discussion concerns specific techniques – usually mathematical or statistical in nature – for solving reasonably well-posed problems. (Fairly typical of this usage is the paper titled “A methodology for integrating cell formation and production planning in cellular manufacturing”, published in Annals of Operations Research.) Thus OR is replete with (so-called) methodologies to solve problems fitting into structured classes. However, in most cases, “methodology” is just a nickel cigar encased in a five dollar label. What is under discussion is essentially a “method” or “technique” (or perhaps a class of such methods). When speaking of an OR Practice Methodology, we’re interested not in the tools but in the practical principles and artifacts governing the deployment of OR techniques and methods.

The focus of this series of articles is on embedding OR in Enterprise IT. (While OR value is often delivered through one-off analyses, the role of a practice methodology in that purely consultative model is less clear.) In the Enterprise IT context, software engineering provides a sound basis for exploration. Methodologies such as Agile Programming, eXtreme Programming (XP), CMMI and Rational Unified Process (RUP) provide the IT project manager with a variety of risk control frameworks and usable artifacts for standard software development. However, they do not directly address the needs of the OR practitioner. Neither are they well-understood or used in the OR community.

Attempts at OR-specific methodologies have failed to gain traction in the practice community. The CHIC methodology for constraint programming was extended in the late nineties to large-scale combinatorial optimization problems and named CHIC2. However, I do not believe that CHIC2 was ever taught or used anywhere outside its academic seedbed in Europe.

More recently, French software vendor Ilog has extended its proprietary methodology for rule-based systems – ISIS – to optimization. Because ISIS is proprietary and viewed by Ilog as a competitive advantage, no list of its OR-specific features has been released. I believe that ISIS has not, in any meaningful way, been used to develop a large-scale OR application. Ilog’s view of OR is limited to optimization, and thus ISIS is unlikely to be applicable to OR as a whole. But if opened to public view, it is the most promising methodological effort that I am aware of.

In the next article in this series, I will discuss practice methodology from the perspective of INFORMS, the leading professional society for OR.

Purposing Intechné

At the recently concluded INFORMS 2008 Practice Meeting, multiple colleagues asked about our vision for Intechné. Quite simply, our vision for the company is to reliably deliver smart decision-making capability to our clients.

It goes without saying that smart business decision-making involves advanced analytical techniques from the fields of Operations Research, Statistics, and Artificial Intelligence. These include Predictive Analytics and Data Mining (to detect correlative, possibly causal, relationships in historical data), Monte Carlo Simulation and Decision Analysis (to simulate the impact of such relationships and tease out key sensitivities in anticipative decision-making) and various flavors of Optimization (both mathematically-driven algorithms and less-structured, heuristic approaches). But the application of a wide spectrum of techniques does not necessarily guarantee smart decisions. Intechné differentiates itself by explicitly focusing on an often overlooked issue in applying advanced analytics in the enterprise: Risk.

Viewed in the context of applying advanced analytics to business improvement, risk is like the weather: everybody talks about it, but nobody does anything about it. Or, nothing systematic at any rate. At the purely technical level, approaches such as mathematical optimization produce “brittle” decisions; very small changes in input can produce dramatically different recommendations. Perhaps that’s not of concern in the few situations where human operators are not in the associated decision chain. But in general, analytics are used to support decisions, not execute them. Unexplainable, non-intuitive, or volatile decisions often force operators to work around their decision-support systems, or even completely ignore them. For instance, we found a sophisticated SAP/APO installation essentially ignored by its users (demand planners at a Food & Beverage company) because it couldn’t auto-profile different product types. While the overall MAPE was ok, sales forecasts for individual products diverged from reality in unexpected ways.
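The brittleness of optimization is easy to reproduce on a toy problem. The sketch below is a hypothetical two-product production plan, not a model from any client engagement: the products, margins, and shared capacity are made up. Since a linear program attains its optimum at a vertex of the feasible region, the sketch simply enumerates the three vertices instead of calling a solver.

```python
def best_plan(margin_a, margin_b, capacity=100.0):
    """Maximize margin_a*a + margin_b*b subject to a + b <= capacity, a, b >= 0.

    The feasible region is a triangle, so the optimum lies at one of its
    three vertices; we evaluate each and keep the most profitable.
    """
    vertices = [(0.0, 0.0), (capacity, 0.0), (0.0, capacity)]
    return max(vertices, key=lambda v: margin_a * v[0] + margin_b * v[1])


# A shift of 0.02 in one estimated margin flips the whole recommendation.
plan_before = best_plan(margin_a=10.00, margin_b=9.99)
plan_after = best_plan(margin_a=10.00, margin_b=10.01)

print(plan_before)  # (100.0, 0.0): make only product A
print(plan_after)   # (0.0, 100.0): make only product B
```

Total profit barely moves between the two plans, yet the recommended action is completely different. A planner who sees the system lurch like this from one week to the next has little reason to trust it.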

When it comes to delivering decision-support technology based on advanced analytics, a host of implementation risks arise beyond standard IT development risks. For instance, the response times of constraint programming models can degrade exponentially with input size. (This is quite different from, as an example, rule-based decision engines.) Encountered unexpectedly, such non-responsiveness leads to expensive and disruptive modeling/algorithmic rework at an advanced project stage.
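To give a feel for how quickly search effort can blow up, here is a small sketch that counts the nodes visited by a naive backtracking search. It is a stand-in for a constraint solver with no propagation, not any particular product, and it runs on a deliberately infeasible problem: placing n+1 pigeons into n holes with no hole shared. The function name and the choice of problem are illustrative assumptions. Real solvers use propagation and global constraints that would dispatch this particular case quickly, but similar growth can resurface on less tidy models.

```python
def count_search_nodes(num_holes):
    """Count nodes visited by naive backtracking on an infeasible pigeonhole problem."""
    num_pigeons = num_holes + 1
    nodes = 0

    def search(pigeon, used):
        nonlocal nodes
        nodes += 1
        if pigeon == num_pigeons:
            return True  # every pigeon placed (never happens here)
        for hole in range(num_holes):
            if hole not in used:  # the only constraint: no shared holes
                used.add(hole)
                if search(pigeon + 1, used):
                    return True
                used.remove(hole)  # undo the choice and try the next hole
        return False

    feasible = search(0, set())
    assert not feasible  # n + 1 pigeons can never fit into n holes
    return nodes


for n in range(2, 9):
    print(f"{n} holes: {count_search_nodes(n)} search nodes")
```

Each additional hole multiplies the work rather than adding to it, which is why a model that responds instantly on pilot data can become unusable at production scale.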

All in all, the business of delivering smart business decision-making is characterized by these, and many other risks. In informal feedback from colleagues and clients, we find that project mortality in this area is unacceptably high: about one in three advanced analytics projects fails to perform to expectation.

An active risk management orientation lies at the core of our vision for Intechné. In forthcoming communications we will describe how this orientation is incorporated into our practice culture, and how it has been shown to improve client results.
