60 Minutes aired a piece last night about scientific fraud at Duke University, where data was fabricated in order to support alleged discoveries in individualized cancer therapies. As a result of these investigations, a number of previously published scientific articles have been retracted.
Stephen Wolfram is doing it again. I’m a big fan of Wolfram (you can read some of my other posts here, here, and here…), and am always intrigued by what he comes up with. A couple of days ago, Wolfram launched his latest contribution to data science and computational understanding – Wolfram|Alpha Pro.
Here’s an overview of what the new Pro version of Wolfram|Alpha can provide:
With Wolfram|Alpha Pro, you can compute with your own data. Just input numeric or tabular data right in your browser, and Pro will automatically analyze it—effortlessly handling not just pure numbers, but also dates, places, strings, and more.
Upload 60+ types of data, sound, text, and other files to Wolfram|Alpha Pro for automatic analysis and computation. CSV, XLS, TXT, WAV, 3DS, HDF, GXL, XML…
Zoom in to see the details of any output—rendering it at a larger size and higher resolution.
Perform longer computations as a Wolfram|Alpha Pro subscriber by requesting extra time on the Wolfram|Alpha compute servers when you need it.
Licenses for prototyping and analysis software (Matlab, IDL, even Mathematica) go for several thousand dollars. Student versions can be had for a few hundred dollars, but you can’t leverage data science for business purposes on student licenses.
Wolfram|Alpha Pro lets anyone with a computer, an internet connection, and a small budget leverage the power of data science. Right now, you can get a free trial subscription, and from there, the cost is $4.99/month. This price is introductory, but it could be seductive enough to attract a lot of users (I’ve already signed up – all you need for the free trial is an e-mail address…)
One option that I find really interesting is Wolfram’s creation of the Computable Document Format (CDF), whose interactivity lets you get dynamic versions of existing Wolfram|Alpha output as well as access to new content using interactive controls, 3D rotation, and animation. It’s like having Wolfram|Alpha embedded in the document.
I attended a Wolfram Science Conference back in 2006 and saw the potential for such a document format back then. A number of presenters later wrote up their work as papers published in the journal Complex Systems. Since many of the presentations relied on real interactivity with the data, I could see how much of the insight would be lost when people tried to write things down and limit their visualizations to simple, static graphs and figures.
I remember contacting Jean Buck at Wolfram Research, and recommending such a format. Who knows whether that had any impact, but I’m certainly glad to see that this is finally becoming a reality. I actually got the opportunity to meet Wolfram at the conference (he even signed a copy of his Cellular Automata and Complexity for me… – Jean was kind enough to arrange that for me – thanks, Jean!)
If you’re interested in data science and have a spare $5 this month, try out Wolfram|Alpha Pro!
- 1 in 3 scientists admit to using questionable research practices
- 1 in 50 admit to falsifying or fabricating data outright
- Among biomedical researcher trainees at UC-San Diego, 81% said they would modify or fabricate results to win a grant or publish a paper
This is obviously disturbing, and worth highlighting to try to root these things out. Science is about finding the truth – no matter what it is – and as more businesses start using data science to drive business outcomes, we need to make sure that science is about being honest – with the truth and with ourselves.
The scientific method was developed to provide the best way to figure out what the truth is, given the data we’ve got. It doesn’t make perfect decisions (no method can), but it’s the best method available.
Real scientists (the ones not highlighted in Jen’s research) care about what the data is actually saying and about discovering the truth. When someone cares about something other than the truth (money, celebrity, fame, etc.), bad science is what you get. Of course, when there are people involved, sometimes the truth isn’t the top priority.
Great infographic, Jen! You can find it here…