Friday, August 07, 2009

Statisticians in Demand

We live in a world where data is the raw material that analysts use to produce information, also known as knowledge. However, without analysis, data remains data in raw form. This might explain the recent upsurge in hiring for statisticians. Steve Lohr (“For Today’s Graduate, Just One Word: Statistics,” NYT, Aug 5, 2009) argues that statistics may be the password for hiring in the coming years:
The rising stature of statisticians, who can earn $125,000 at top companies in their first year after getting a doctorate, is a byproduct of the recent explosion of digital data. In field after field, computing and the Web are creating new realms of data to explore — sensor signals, surveillance tapes, social network chatter, public records and more. And the digital data surge only promises to accelerate, rising fivefold by 2012, according to a projection by IDC, a research firm.

The demand for statisticians is consistent with a larger trend toward competing on analytics in enterprise. This trend has also given impetus to the need for other experts, especially in computer programming.

Though at the fore, statisticians are only a small part of an army of experts using modern statistical techniques for data analysis. Computing and numerical skills, experts say, matter far more than degrees. So the new data sleuths come from backgrounds like economics, computer science and mathematics.

Over the past several decades, firms have invested heavily in data management technology, including server and data-warehousing systems. These investments have created massive amounts of raw data that are begging to be analyzed by people trained and skilled in descriptive and inferential statistics, stochastic modeling, linear and non-linear forecasting, and so forth. The creation of so much raw data in recent years makes statistical analysis of that data a vital value-adding activity that enables competing on analytics.

“I keep saying that the sexy job in the next 10 years will be statisticians,” said Hal Varian, chief economist at Google. “And I’m not kidding.”


Ron said...

The problem with the increasing availability of more and more data is that it can be selectively chosen by statisticians to produce information, then classified as knowledge, to prove almost any point. It is the move from raw data to knowledge that can be bothersome.

If by knowledge we mean that which has at least, say, an eighty-five percent chance of being true and provably correct, then the non-statistician using those analyzed figures has at least the same level playing field as some corporation that has hired a $125,000-a-year statistician who may have been hired to figure out how to plausibly lie with statistics. Who supplies the data and/or manages the collection thereof?

Two caveats for the non-statistician given to assume and use that knowledge as true: one, beware of experts who lose sight of the side effects of what they do due to a specialized education; and two, consider the ethics and reasons behind the producer of those statistics.

Sergei said...

I think you are too affected by the popular culture myth of “lies, damn lies, and statistics”. In fact, contemporary statistics is a rather rigorous discipline based on sound Math.

The original phrase “… damn lies …” is usually attributed to Benjamin Disraeli, the late-19th-century United Kingdom Prime Minister. Back then, politicians hated sound statistics, as they do now, because sound statistics reveal what is actually going on, while politicians are by their very nature inclined to spin news in ways more suitable for their agendas.

You are right in saying that quality of data affects the quality of statistical inference. I don’t agree though that statisticians are hired to figure out how to plausibly lie with statistics. If you want soft lies, Marketing will generate lots of them for significantly less money :-)

Any statistician worthy of his or her degree will tell you all about the lengths they go to in the “design of experiment” phase to specifically neutralize the biases inherent in noisy and low-quality data. It is revealing the truth that is expensive and takes specialized skills! Lying is easy; there is no need for statisticians in this area.
