Tuesday, June 18, 2013

Post # 84: Forget the WHY – Just tell me WHEN and WHERE !

I've been really busy the past several weeks with the relaunch of Advanced Technology Insights, LLC (www.ATInsightsLLC.com).  But, as a quick follow-up to my last posting (Your Brain on Big Data), I wanted to share a confirming tidbit I ran across in the WSJ last Thursday (June 13 issue)...

Tucked away on page B6 that day was a fascinating "CIO Journal" blog entry by Rachael King, entitled, "How Spies May One Day Predict The Future".  The content of her posting validates my musing from Post # 83 regarding potential displacement of the need to understand "first principles causality" with what I call "big data correality"...

Within the brief article, Ms. King discusses the potential for using "current data to predict the future."  She describes a declassified project, named "Open Source Indicators,"...

"...One declassified project, Open Source Indicators, reviews a range of publicly available sources, such as Twitter messages, Web queries, oil prices, and daily stock market activity, to gauge the likelihood of certain societal events.  The goal is to develop continuously automated systems that can predict when and where a disease outbreak, riot, political crisis or mass violence might occur."

According to Ms. King...

"Already the project has been able to accurately forecast student protests that occurred in Paraguay when the president was impeached, and to predict a Hantavirus outbreak in Argentina last year."

All of this reminds me of the discipline of noise analysis, the use of which allows one to extract meaning (if there's any to be found) from what might otherwise appear to be random, uncorrelated, and irrelevant signals.  It's also closely related to regression analysis, in which one estimates the response of a dependent variable (say, stock price) to a set of (hypothesized) correlated parameters (day of the week, P/E ratio, values of key financial indicators, etc.).  Given some luck, sufficient time, and enough raw data to use in the curve-fitting algorithm, it's possible to develop good predictive capabilities without truly understanding any of the underlying or fundamental cause-and-effect relationships.
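For the curious, here's a rough sketch in Python of what that kind of curve fitting looks like in its simplest form: an ordinary least-squares fit of a straight line with a single predictor. Everything in it is an assumption made for illustration. The data are synthetic (generated from a made-up line plus random noise), and the names `fit_line`, `xs`, and `ys` are hypothetical. The fit is done on one part of the data and then used to predict the rest.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


random.seed(0)

# Synthetic, made-up data: a straight line plus Gaussian noise.
xs = [random.uniform(0, 10) for _ in range(200)]
ys = [3.0 + 2.0 * x + random.gauss(0, 0.5) for x in xs]

# Fit on the first 150 points, then predict the 50 points held back.
a, b = fit_line(xs[:150], ys[:150])
predictions = [a + b * x for x in xs[150:]]

# Root-mean-square error of the predictions on the held-back points.
held_back = ys[150:]
rmse = (sum((p - y) ** 2 for p, y in zip(predictions, held_back)) / len(held_back)) ** 0.5

print(f"intercept={a:.2f}, slope={b:.2f}, rmse={rmse:.2f}")
```

Note that the fit recovers a usable slope and intercept purely from the numbers. Nothing in the code knows, or needs to know, why y tracks x.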

So correlation replaces causality.

It raises the question, "Does one really need to know why, so long as one can predict when and where?"

Some heavy physics, metaphysics, and philosophy knotted up in this one...

Just thinking...

Sherrell