Emerging in tandem with the more prevalent big data trend, the data scientist may still have a nebulous job description, shaped by each firm’s unique data set and needs, but the remit is simple: derive business intelligence and drive revenue by making sense of both structured and unstructured data.
IBM defines the role as an evolution of the typical data analyst, with similar training in computer science, statistics and analytics, but with added business acumen and the communicative abilities to relay their findings to CIOs, and in turn CEOs.
That definition itself is evidence of the role’s importance. These wizards of ones and zeroes, where they are employed effectively, are charged with communicating their findings to senior decision makers.
The data scientist role is described as ‘part analyst, part artist’ by Anjul Bhambhri, Vice President of Big Data Products at IBM. “A data scientist is somebody who is inquisitive, who can stare at data and spot trends. It’s almost like a Renaissance individual who really wants to learn and bring change to an organisation.”
Analytics solutions provider Pivotal sees the data scientist as a blend of soft and technical skills. On the technical side there should be a grounding in mathematics, statistics and machine learning, as well as an interest in computer science and an understanding of technology. But a data scientist also needs knowledge of the domain they are working in, be it finance, bioinformatics or digital media.
Data science is now where computer science was a decade ago, says Noelle Sio, Senior Data Scientist for Pivotal – formerly EMC’s Greenplum. The skillset may have seemed superfluous and the demand for such skills was certainly not widely comprehended when universities began offering degree courses, she tells AMEinfo.
The science of making money
As the New York Times noted last month, North Carolina State University have been offering a master’s degree in analytics since 2007. Fully 100% of last year’s graduates had job offers – 84 in total. The average salary offer was $89,100, and surpassed $100,000 for those with prior work experience.
But why do we suddenly require experts just to process business data when existing desktop solutions have been sufficient for so long?
“The limit of [Microsoft] Excel is about a million rows and other statistical software can run between one and ten million rows of data, although a colleague once told me his computer literally caught on fire while trying to do that. I’ve certainly brought down my fair share of big systems trying to calculate too much data,” says Sio.
For most businesses, we’re talking about petabytes of data, or trillions of rows. This is information that can take well over a day just to transfer or back up, never mind to interrogate thoroughly enough to extract relevant business intelligence.
“The end result of a data science project isn’t just a beautiful model, but a business action. The input for a data science team is a business question – how can we sell more? The data science team can turn that from a business problem into a maths problem, though it’s not just about doing it in theory but figuring out how to do it in practice and executing that.”
It may be that companies are initially reluctant to draft in a data scientist because of a lack of case studies and understanding, but it’s more likely that they would simply struggle to find one. Such experts are scarce and Sio herself predicts we will see more university degrees and graduate training programs emerge in the near future to meet the need. EMC, owner of Pivotal, are currently running their own Academic Alliance program in the UAE and Saudi Arabia, training graduates in big data analytics, virtualisation, cloud computing and other IT trends.
Sio says Pivotal want to help customers become data-driven on their own, thus retaining their intellectual property and staying in control of their innovations.
“There are a few ways an organisation can make use of a data scientist. They can try to go out and hire one, which is very hard to do because they’re in such high demand. They can hire one straight from school and train them up. And you can find people in your organisation with the right skillsets and retrain them.”
Sio’s own route to becoming a senior data scientist came via a passion for applied mathematics and an internship with Fox, owner of the formerly de rigueur Myspace, though she admits the emerging role already requires greater numbers of recruits with dedicated training.
Whether or not the data scientist is the CIO of tomorrow depends on whether ROI success stories emerge in the coming years. Any paradigm shift towards data science becoming integral to business transformation will take time and investment, but also a greater sense among those at the top that it generates revenue.