This article is written by Aaron Endré – our Writer in Residence. Follow him at @aaronendre
When we hear of exceptionally smart data scientists analyzing user behavior to tell companies which ads to deliver, there is a question we all secretly want to ask: shouldn’t all these ‘data scientists’ be using this data to cure cancer or something? Klint Finley posed it recently at WIRED: “Tech companies are snapping up scientists with backgrounds in fields like physics, mathematics, and bioscience – people we might expect to be busy curing cancer, saving the environment, or discovering the origin of the universe. It’s easy to be cynical about this,” he writes.
The flight of the scientific community away from arguably more meaningful research, with what looks like reckless abandon, is due, he posits, to a shortage of professorships: the U.S. produced 100,000 PhDs between 2005 and 2009 while creating only 16,000 new professorships.
“The irony,” Finley adds, “is that many believe that tech can help revive sciences…[companies like] Cloudant can provide scientists with better tools so they can spend more time actually doing science and less time fiddling with their software, and others are building platforms that can help in similar ways.”
But, in fact, big data’s brightest minds are doing just that: helping to solve mysteries that were previously out of reach because the data sets involved were too large to analyze before big data tools existed.
As reported in InformationWeek and The Wall Street Journal, the American Society of Clinical Oncology (ASCO) is harnessing big data to build and mine a database of cancer treatment records to help physicians determine the best treatments for particular kinds of patients. A just-completed prototype of the program aggregates data on more than 100,000 breast cancer patients from 27 oncology practices, many of which use different electronic health record (EHR) systems.
Some 1.6 million Americans are diagnosed with cancer every year, but in more than 95% of cases, details of their treatments are “locked up in medical records and file drawers or in electronic systems not connected to each other,” said Allen Lichter, chief executive officer of ASCO. “There is a treasure trove of information inside those cases if we simply bring them together.”
ASCO is joining a Big Data movement that is well under way across medicine, including other initiatives in cancer and in cardiology. The Institute of Medicine, an independent body that advises the U.S. government on medical issues, believes such databases eventually will become a “health-care utility” to generate knowledge for treating a variety of diseases. The ASCO project is “recognition that big data is an imperative for the future of medicine,” said Lynn Etheredge, a consultant with the Rapid Learning Project at George Washington University in Washington, D.C.
Indeed, Cloudant (a UTR ’10 alum – watch their presentation) builds technology based on work its founders did while analyzing data from the Large Hadron Collider, the world’s largest particle collider, which seeks to answer questions about the origins of the universe. Disappointed with the existing tools for handling large volumes of data, they created BigCouch, their own version of the open source CouchDB database.
“More than anything, an education in the physical sciences teaches you how to think,” says Cloudant co-founder and chief technology officer Adam Kocoloski. “Startups are all about solving new problems. A background in science helps you react quickly to new and unknown situations.”
Given the potential of big data to revolutionize medicine and science, perhaps we shouldn’t be too alarmed that big data startups are hiring the world’s best. In fact, maybe it’s high time that scientists got the chance to do something great through technology, as opposed to, as Finley writes, “ending up as low-paid adjunct professors or baristas.”