Nerd Pride Friday: ‘Wars’ Trumps ‘Trek’

It appears that Princess Leia and Captain Kirk are in a bit of interstellar combat, albeit 30+ years later.  And regardless of what Kirk has to say about it, I think that Star Wars rules!  So, the big question is, why are we even talking about this?…

Well, back in September, William Shatner, who portrayed Captain Kirk on the original Star Trek series, posted a YouTube video about how Star Trek was really better than Star Wars – how Lucas’ space opera was really “derivative” of Gene Roddenberry’s creation, which came first by 10-20 years.

Well, a couple of weeks ago, Carrie Fisher, who played Princess Leia in the original Star Wars trilogy of movies, had to call Shatner out, posting her own video rebutting the critique.

Below are both videos: Shatner’s original post and Fisher’s reply.  Personally, I’d rather see the original Kirk and Leia duke it out, but I guess we’ll have to settle for their earthly thespians…


Forbes: Can Big Data Fix Healthcare?

This is the very question asked by Colin Hill, CEO and co-founder of GNS Healthcare, a healthcare analytics company.  Hill hopes to make the case that healthcare can benefit from what a recent McKinsey report calls “the next frontier for innovation, competition and productivity.”  

I think Hill is onto something, especially with this insight:

What will healthcare look like in the year 2020?  One thing is certain: we can’t afford its current trajectory.  Left unchecked, our $2.6 trillion in annual spending will grow to $4.6 trillion by 2020, one-fifth of GDP.  With almost 80 million Baby Boomers approaching retirement, economists forecast these trends will likely bankrupt Medicare and Medicaid in the near future.  And while healthcare reform ignites a number of important changes, alone it does not resolve our issues.  It’s critical we fix our system now.

Something’s got to give, and better decisions from better data can yield significant healthcare savings if done right.  Saving lives and reducing costs dramatically in healthcare would qualify as one of those hard problems where disciplined approaches can yield significant results.  Here is Hill’s post on Forbes…

Fast Company: Interview with LinkedIn’s Reid Hoffman

I ran across this Fast Company interview with LinkedIn co-founder Reid Hoffman about his new book The Start-Up of You and the need for companies to have a data strategy, or risk losing “potentially a lot” in the future.  Here’s that brief bit from the Hoffman interview:

What do companies miss out on if they don’t have a data strategy?

Potentially a lot. If you say the way our products and services are constituted, how we determine our strategy and maintain a competitive edge against other folks–if data is a very strong element of each of these, and you’re not doing anything, it’s like trying to run a business without business intelligence. I’m not sure I have a broad enough view that I would say every company needs to have a data strategy. But I would say many companies do. I certainly think that any company that is over 20 people needs to have a technology strategy, and data is essential to where technology is going.

LinkedIn has already been on record as not worrying about Facebook taking over their business.  According to Hoffman, “People with advanced degrees are three times more likely to use LinkedIn.”

You can read the Fast Company interview here

Banks Predicting Your Divorce?

Are banks predicting divorces?  Well, if there’s data to help them predict such things, they may very well use it to optimize their business.

Forbes has a couple of posts that peek into businesses’ use of “big data”.  The first article talks about the race to build new analytics to solve the challenges of large volumes of data.  Here’s a snippet from Tom Groenfeldt’s post, quoting Scott Gnau, head of research and development at Teradata:

Thought leaders in a number of industries are starting to leverage the additional analytic content from big data and combine it with what they have in large volume data stores as well. It is interesting to understand social media and consumer sentiment, but when that information is analyzed in combination with traditional consumer data it provides new, rich intelligence helping companies to identify trends and react to immediate business conditions.

According to another Forbes article, a number of studies show that companies that characterize themselves as “data driven” are the best corporate performers.  Now, when we’re talking “data driven”, we mean in how the company operates, not necessarily in what it produces as technology.  Top performing companies are determined to use the data that they have (especially about themselves) to improve what they do and how they do it.

Also, banks are on the lookout for changes that could affect how they do business with their customers, and of course, their bottom line:

Banks, for example, worry about their customers divorcing, because divorce causes a change in credit-worthiness. No problem. They can now see a divorce coming before the couple does. All from the data.

At Techonomy 2011 in Tucson, AZ this week, the “Computer Science or Data Science” panel explored how data science has taken its place next to computer science as a fundamental element of information technology.  New technologies are coming out seemingly every day, not only to handle big data, but to understand how to extract relevant information from the ocean of data we’re swimming in.

A company in Silicon Valley, ai-one, announced today that they have “a breakthrough method to graphically represent knowledge enables software developers to easily build intelligent agents such as Apple’s SIRI and IBM Watson”.  The technology, ai-Fingerprint, is geared toward natural language programming, allowing developers to create new technologies that use natural language as input data.  

Apple’s Siri and IBM’s Watson are definitely heading in the right direction for this type of technology.  I just bought an iPhone 4S and I’ve tested Siri out a number of times.  While Siri doesn’t get everything right (it keeps thinking my name is “Nick” when I say “Mic”), it does get more right than I expected.  I was able to send texts and e-mails to people without keystrokes, and I took some notes using the voice feature, getting nearly every word correct.  Pretty amazing stuff!…

Watson is the supercomputer that beat two longtime Jeopardy! champions, and it uses a technology approach that looks for the best answer for the questions being asked (or in this case, the best question for the answer being presented – it is Jeopardy! after all…).  These are definitely the models that should be emulated, although ai-one’s announcement is a press release, so until we see the results, let’s chalk this up for the moment as good marketing…

The NYC Data Science Race

You know that data science is truly becoming a recognized scientific discipline when billions of university dollars will be spent on its future.

I wrote previously about Columbia’s effort to expand its Manhattan campus to build a data science and engineering center.  However, Stanford and Cornell are also in the race. 

Much of this comes from a desire by the City of New York to become home to a leading engineering and applied science campus.   NYC is willing to invest $100 million into infrastructure improvements for the winner.

Stanford and Cornell have put in bids for the project to use the land on Roosevelt Island, while Columbia will be expanding its Manhattanville campus.  Other schools are looking to expand in NYC as well – Carnegie Mellon, which is looking at the Brooklyn Navy Yard, and NYU, which wants to move into Downtown Brooklyn.

Movie Review: J. Edgar

J. Edgar entered theaters this weekend, and my wife and I had the opportunity to see it last night.  Unfortunately, the movie only put the exclamation point on a disappointing evening (it was raining, we waited an hour and fifteen minutes for our meal at Macaroni Grill after we ordered – well, you get the picture…).  While we really looked forward to seeing this film, I’d have to rate the movie between 2 and 2.5 stars out of 5.

The movie was directed by Clint Eastwood, who won Academy Awards for Unforgiven and Million Dollar Baby, and J. Edgar Hoover was played quite well by Leonardo DiCaprio, who has been toyed with by the Academy, receiving three acting nominations but not yet the award he deserves.

[Aside:  DiCaprio is a fantastic actor, and I enjoy watching him on the screen.  He’s received Oscar nominations for What’s Eating Gilbert Grape, The Aviator, and Blood Diamond.  However, he starred in Titanic, which tied the record for the most Oscars ever, yet he didn’t receive a nomination?!  He was great in Catch Me If You Can, Inception, Shutter Island, and particularly The Departed, but no Oscar nods.  Letter to the Academy:  Honor the man, pronto – don’t make him wait 25+ years like you made Martin Scorsese wait…  OK, stepping down from soapbox…]

Through Eastwood’s telling of the story, we find that there were three main people in Hoover’s life:  his mother, Annie Hoover; Clyde Tolson, his number two in the Bureau and his life companion; and Helen Gandy, Hoover’s personal secretary.  These three people really did make up the entirety of Hoover’s world: they defined him, nurtured him, and protected him.

What’s clear from the picture is that J. Edgar Hoover had an incredibly logical mind (he even invented a card catalog system when he worked at the Library of Congress) and was a true innovator in the area of criminal and forensic science.  His recognition of the use of fingerprint forensics to solve crimes was genius, and he certainly was constructive in fighting communism as a radical force in the United States, and in fighting organized crime elements in the big cities.

However, what Hoover will be most remembered for is his surveillance (in some cases, illegal) of public figures such as President John F. Kennedy and Dr. Martin Luther King Jr., and the secret “personal and confidential” files that he kept on people in order to coerce high-ranking U.S. officials into giving him his way.

Hoover had elements of genius, but some real shortcomings.  Certainly the times in which he lived didn’t allow him to live his life as transparently as he might have liked.  Yet, other flawed parts of his character showed through quite clearly regardless of the times.

Eastwood did as good a job (I think) as he could with Hoover’s life (and the screenplay), and I personally think that DiCaprio did a great acting job in portraying Hoover as a human being, even though he ran the FBI with an iron fist during his 48-year tenure, intimidating Attorneys General and Presidents in the process.  We do get the sense of the strong personal bonds Hoover had with his mother, Tolson, and Gandy, even though he didn’t always treat them well.  The acting is very solid – Naomi Watts is very good as Helen Gandy, Dame Judi Dench plays Hoover’s mother amazingly well, and Armie Hammer is quite good as Hoover’s companion Clyde Tolson.

However, while I like nonlinear storylines with flashbacks to fill in the timeline, the timeline for this movie goes back and forth a bit too much for my taste – it actually made it hard to figure out where I was in the story.  Plus, the story itself was somewhat slow at times, which made the overall length of the movie seem longer than it really was.

Overall, I enjoyed learning about Hoover and his life (both public and private), but I probably would have enjoyed a one-hour documentary instead of Eastwood’s two-and-a-half-hour drama.  If you want to see good actors, J. Edgar might be good (as a rental), but if you want to know more about J. Edgar Hoover, there’s probably a good documentary out there.

Dumbill Data Science Discussions

Edd Dumbill, the general manager for the Strata Conference, recently wrote a nice post on Google+ titled “Why Do We Need Data Science?”

Here is a really good insight from Dumbill on how data science applies to business:

Why is the scientific method applicable to business and data?

Every company’s business is complex in itself, and they operate in a complex world. The financial, economic and societal structures we live and do business in are complex. Because of this complexity and interactions, businesses can be viewed in the same light as organic, biological systems. They are complex entities within a complex system.

This is where science comes into play. Even assuming you could come up with a top-down mathematical model of your business, there’s too much interaction and randomness with complex systems for your model to be practical. Thus, the exploratory approach of science becomes useful to a business.

Your world and business is a giant laboratory, and ever more so as the world becomes more networked. By employing data scientists you can discover better how your business works, how it can be improved, and find new things you can do that you didn’t know of before. To do this, you must connect up three kinds of people: the business folk, the data scientists and the data engineers.

I do like Dumbill’s take here, and there is real merit in applying the scientific method to business activities.   Peter Wang of Streamitive commented on Dumbill’s post as well, and has some interesting points…

Ultimately, data doesn’t mean anything without trying to answer questions.  To get actionable information, you need data and you need to be asking the right questions. That’s why the scientific method is so important – it’s all about posing a hypothesis or asking a question, and then squeezing the right information out of the data in order to answer it.

Gartner Magic Quadrant Report on Big Data Integration Tools

Based upon their Magic Quadrant analysis of data integration tools, Gartner rates Informatica Corp. and IBM as the top software vendors in the space.

Gartner uses a Magic Quadrant to rate companies as leaders, challengers, niche players and visionaries based on several criteria including “completeness of vision” and “ability to execute.”  From Gartner’s website:

  • Leaders execute well against their current vision and are well positioned for tomorrow.
  • Visionaries understand where the market is going or have a vision for changing market rules, but do not yet execute well.
  • Niche Players focus successfully on a small segment, or are unfocused and do not out-innovate or outperform others.
  • Challengers execute well today or may dominate a large segment, but do not demonstrate an understanding of market direction.

A post by Mark Brunelli, Senior News Editor at SearchDataManagement, has a more detailed analysis of the Gartner report.  Here’s what Brunelli wrote, detailing some of the thoughts of Ted Friedman, a Gartner vice president and information management analyst and co-author of the report:

“You’re hearing a lot about big data and analytics around big data,” Friedman said. “To do that kind of stuff you’ve got to collect the data that you want to analyze and put it somewhere. [That] in effect is a job for data integration tools.”

It does seem that the main focus right now in this space is on data handling and data management.  A lot of work is being done by companies to create data visualization tools to gain insight from the data, but as the problems get much harder, better analytics approaches will need to be brought to bear.  The real key over the next few years will be the smart analysis of all this data, turning the data into reliable, actionable information.

Nerd Pride Friday: The People v. George Lucas

In this week’s Nerd Pride Friday segment, I wanted to highlight a documentary that was released on DVD a few weeks ago called The People v. George Lucas.

Those of you who (like me) are big Star Wars fans will probably appreciate this documentary.  Star Wars has been a solid part of the popular culture since the first movie was released in 1977.  However, as new home movie technologies have come out (VHS, widescreen, DVD, Blu-ray…), the movies themselves have changed, because they’ve been re-edited by Lucas and his team to add, delete, or change some of the content.

People LOVE these films, and changing them feels to some as if a bit of themselves is being changed along with the films.  This has led to discomfort for some and outrage for others about the modification of the films they grew up loving.  The People v. George Lucas is a documentary about this very phenomenon.

Examples of some of the changes include:

    • Changing the Mos Eisley cantina scene where Han Solo met Greedo.  Han kills Greedo in the cantina, but in the original, Han shot Greedo because he was tired of the conversation and the pressure Greedo was putting on him.  In the revised version, Greedo shoots at Han first, giving Han “justification” for killing Greedo.  This slight revision changes the whole nature of Han’s character, where he was originally a “bad guy turned good guy”.  This has led to T-shirts that say “Han Shot First” as a mantra for the original films…


    • In the original Star Wars (which has now been renamed Star Wars Episode IV:  A New Hope to align itself with the other five movies in the series), we never saw Han Solo confront Jabba the Hutt – Jabba was always this character we heard about through the dialogue.  In the re-edited version, we see old footage where they do meet.  Certainly, Lucas wanted this scene in the original film, but the special effects technology didn’t exist to do it well.  I like the included scene, but there’s a weird part where Han walks behind Jabba and has to step on and walk over his tail – would Han really do this?  It’s clunky but necessary, only because of the way the scene was filmed way back when…


    • Revised releases of the films edited two actors out, one of them altogether.  In a scene at the end of Star Wars Episode VI:  Return of the Jedi, Sebastian Shaw, the actor playing Anakin Skywalker/Darth Vader, is removed and replaced with Hayden Christensen, the actor who plays Anakin in the three prequel movies; at least Shaw is still in the film in other scenes.  However, the original actor who played the hologram version of Emperor Palpatine in Star Wars Episode V:  The Empire Strikes Back, voiced by Clive Revill, was completely replaced with Ian McDiarmid, the actor who plays Palpatine in the prequels.  Some like the new continuity that the revisions provide; others feel for the actors that were completely removed from the historic film series…


    • There are even more changes in the Blu-ray releases of the original trilogy:  Darth Vader says “No” several times as he picks up Emperor Palpatine and tosses him into the Death Star reactor core; computer-generated rocks appear in front of R2-D2 while he’s hiding in the canyon, but they’re “magically not there” after he comes out of hiding; and Greedo shoots first – again – but this time with slightly fewer frames than in the previous release.

Here is a post from Wired interviewing the writer/director of The People v. George Lucas, Alexandre O. Philippe.  There is a lot of controversy about changes to the Star Wars films – in fact, out of nearly 2,000 reviews on Amazon, the recently released Blu-ray compilation of the six films has only 2 and 1/2 stars out of 5.  Some of the reviews are scathing (over 1,000 of the reviews are 1-star), which shows the emotions surrounding this issue…

I, for one, like some changes (I like to see the previously deleted scenes, which add to the backdrop of the Star Wars universe), but I also feel for those who have seen the originals changed from what they remember.

In the end, Han shot first, and that’s the way it is!!

Big Data and 1984?

As the data science and big data technology booms start accelerating, it’s worth noting how these technologies will change our lives – both positively and potentially negatively.

I posted previously about the ongoing discussion of privacy, but I’ve found another post on GigaOM about the same topic.  According to the article, the Supreme Court of the United States heard oral arguments on Tuesday in a case that could decide how connected the concept of big data is to constitutional expectations of privacy.

The case, United States v. Jones, is specifically about whether police needed a search warrant to place a GPS device on a suspect’s car and monitor his movements for 28 days.  Several justices, however, seized upon a very important question: How much data is too much before allowable surveillance crosses the line into an invasion of privacy?  This is a really nice post, and if you’re interested in the constitutional issues regarding privacy (for example, an appellate court has found that warrantless GPS tracking is a violation of the Fourth Amendment), I’d recommend that you take time to read the article.

These two posts do highlight interesting differences in privacy and who controls our data.  We sometimes have a knee-jerk reaction to institutions that keep data on us and then use it for other purposes (whether they benefit us or not).  George Orwell’s 1984 and the Big Brother metaphors with which we’re all familiar deal with government controlling the data and what it can do with it – that’s what the US v. Jones case is really all about.

However, in the private world where we interact with companies and people more directly, it’s not really a Big Brother issue, because we give up our privacy all the time – there’s no legal requirement to give up data; we do it by choice.  We willingly give up our privacy in order to benefit from technology – little bit by little bit.  If we want a website to provide us great recommendations (say Netflix), the company is going to have to know more about us – what we like, and what we don’t like.  

It seems a bit “Big Brother”, but even people store data about us all the time – they’re called memories.  Some are good and some are bad; people remember what we enjoy and what we hate.  People who become our friends are the ones that become great matches for us – they enjoy our humor, they know what we like to discuss, and look out for us when we’re not around.

Companies will be trying to do that as well, but of course, it’s all about trust.  Just as we trust our friends with all that they know about us, we hope to trust companies with all the data they store about us.    That’s probably the biggest thing we need to wrestle with in the Age of Big Data – how to establish trust between people and the machines that will be keeping and using the data they have about us…