A Data Science Lesson from Richard Feynman

Richard Feynman


Richard Feynman is one of the greatest scientific minds, and what I love about him, aside from his brilliance, is his perspective on why we perform science.   I’ve been reading the compilation of short works of Feynman titled The Pleasure of Finding Things Out, and I recently came across a section that really hit home with me.

In the world of data science, much is made of the algorithms used to work with data, such as random forests or k-means clustering. However, I believe there is a missing component – one that deals with the fundamentals underlying data science, and that is the real science of data science.

The following paragraphs are taken from The Pleasure of Finding Things Out, which I would encourage you all to read. Feynman’s way of cutting through the scientific and mathematical gobbledygook to get to the essence of what it all represents is remarkable; to my mind, his ability to communicate what he knows to other people is itself a mark of his brilliance. I’ve written on the importance of effective communication, especially in science. Albert Einstein and Stephen Hawking were among the most effective scientific communicators, and I would definitely put Richard Feynman in that class.

One way, that’s kind of a fun analogy in trying to get some idea of what we’re doing in trying to understand nature, is to imagine that the gods are playing some great game like chess, let’s say, and you don’t know the rules of the game, but you’re allowed to look at the board, at least from time to time, in a little corner, perhaps, and from these observations you try to figure out what the rules of the game are, what the rules of the pieces moving are. You might discover after a bit, for example, that when there’s only one bishop around on the board that the bishop maintains its color. Later on you might discover the law for the bishop as it moves on the diagonal which would explain the law that you understood before – that it maintained its color – and that would be analogous to discovering one law and then later finding a deeper understanding of it. Then things can happen, everything’s going good, you’ve got all the laws, it looks very good, and then all of a sudden some strange phenomenon occurs in some corner, so you begin to investigate that – it’s castling, something you didn’t expect. We’re always, by the way in fundamental physics, always trying to investigate those things in which we don’t understand the conclusions. After we’ve checked them enough, we’re okay.

The thing that doesn’t fit is the thing that’s the most interesting, the part that doesn’t go according to what you expected. Also, we could have revolutions in physics: after you’ve noticed that the bishops maintain their color and they go along the diagonal and so on for such a long time and everybody knows that that’s true, then you suddenly discover one day in some chess game that the bishop doesn’t maintain its color, it changes its color. Only later do you discover a new possibility, that a bishop is captured and that a pawn went all the way down to the queen’s end to produce a new bishop – that can happen but you didn’t know it, and so it’s very analogous to the way our laws are: They sometimes look positive, they keep on working and all of a sudden some little gimmick shows that they’re wrong and then we have to investigate the conditions under which this bishop change of color happened and so forth, and gradually learn the new rule that explains it more deeply. Unlike the chess game, though, in [which] the rules become more complicated as you go along, in physics, when you discover new things, it looks more simple. It appears on the whole to be more complicated because we learn about a greater experience – that is, we learn about more particles and new things – and so the laws look complicated again. But if you realize all the time what’s kind of wonderful – that is, if we expand our experience into wilder and wilder regions of experience – every once in a while we have these integrations when everything’s pulled together into a unification, in which it turns out to be simpler than it looked before.

If you are interested in the ultimate character of the physical world, or the complete world, and at the present time our only way to understand that is through a mathematical type of reasoning, then I don’t think a person can fully appreciate, or in fact can appreciate much of, these particular aspects of the world, the great depth of character of the universality of the laws, the relationships of things, without an understanding of mathematics. I don’t know any other way to do it, we don’t know any other way to describe it accurately… or to see the interrelationships without it. So I don’t think a person who hasn’t developed some mathematical sense is capable of fully appreciating this aspect of the world – don’t misunderstand me, there are many, many aspects of the world that mathematics is unnecessary for, such as love, which are very delightful and wonderful to appreciate and to feel awed and mysterious about; and I don’t mean to say that the only thing in the world is physics, but you were talking about physics and if that’s what you’re talking about, then to not know mathematics is a severe limitation in understanding the world.

The connection here to data science is the search for understanding. Research and engineering teams use data science to explain things about the data, so that we can use that information later – maybe to make predictions, maybe for better explanations, maybe to make better products. However, the key part is the understanding, and without that, data science is merely a collection of tools and techniques used to fit observations. Unless we seek to understand – trying to find “the why” – we won’t really know whether our data science models, tools, or techniques are actually working.

If you are interested, these passages are from a television interview Feynman gave as part of the BBC documentary Richard Feynman: No Ordinary Genius.

Question:  Do you have any thoughts on the fundamental science of data science or on Richard Feynman? You can leave a comment below.

I currently serve as Director in the Advanced Risk & Compliance Analytics (ARCA) practice at PricewaterhouseCoopers (PwC). I've served as Director of Data Science & Analytics Engineering at Areté Associates and in leadership positions with Elanix, Inc. (now Agilent Technologies) and Mentor Graphics. I've served the public as Chair of the Thousand Oaks, CA Planning Commission and now work in New York City. I have been married to my wife Stephanie since 1993, and we have a wonderful daughter Monroe.
