Two of the biggest buzzwords in our industry are “big data” and “data science”. Big data gets a lot of attention right now, but data science is fast becoming a very hot topic.

I think there’s room to really define the **science** of data science - what are the fundamentals needed to make data science *truly* a science we can build upon?

Below is my proposed outline for such a set of fundamentals:

**Fundamentals of Data Science**

**Introduction**

- What is Data?

- The Goal of Data Science

- The Scientific Method

**Probability and Statistics**

- The Two Characteristics of Data

- Examples of Statistical Data

- Introduction to Probability

- Probability Distributions

- Connection with Statistical Distributions

- Statistical Properties (Mean, Mode, Median, Moments, Standard Deviation, etc.)

- Common Probability Distributions (Discrete, Binomial, Normal)

- Other Probability Distributions (Chi-Square, Poisson)

- Joint and Conditional Probabilities

- Bayes’ Rule

- Bayesian Inference

**Decision Theory**

- Hypothesis Testing

- Binary Hypothesis Test

- Likelihood Ratio and Log Likelihood Ratio

- Bayes Risk

- Neyman-Pearson Criterion

- Receiver Operating Characteristic (ROC) Curve

- M-ary Hypothesis Test

- Optimal Decision Making

**Estimation Theory**

- Estimation as Extension of M-ary Hypothesis Test

- Unbiased Estimation

- Minimum Mean Square Error (MMSE)

- Maximum Likelihood Estimation (MLE)

- Maximum A Posteriori Estimation (MAP)

- Kalman Filter

**Coordinate Systems**

- Introduction to Coordinate Systems

- Euclidean Spaces

- Orthogonal Coordinate Systems

- Properties of Orthogonal Coordinate Systems (angle, dot product, coordinate transformations, etc.)

- Cartesian Coordinate System

- Polar Coordinate System

- Cylindrical Coordinate System

- Spherical Coordinate System

- Transformations Between Coordinate Systems

**Linear Transformations**

- Introduction to Linear Transformations

- Properties of Linear Transformations

- Matrix Multiplication

- Fourier Transform

- Properties of Fourier Transforms (time-frequency relationship, shift invariance, spectral properties, Parseval’s Theorem, Convolution Theorem, etc.)

- Discrete and Continuous Fourier Transforms

- Uncertainty Principle and Aliasing

- Wavelet and Other Transforms

**Effects of Computation on Data**

- Mathematical Representation of Computation

- Reversible Computations (Bijective Mapping)

- Irreversible Computations

- Impulse Response Functions

- Transformation of Probability Distributions (due to addition, subtraction, multiplication, division, arbitrary computations, etc.)

- Impacts on Decision Making

**Prototype Coding / Programming**

- Introduction to Programming

- Data Types and Variables

- Data Structures (Arrays, etc.)

- Loops, Comparisons, If-Then-Else

- Functions

- Scripting Languages vs. Compiled Languages

- SQL

- SAS

- R

- Python

- C++

**Graph Theory**

- Introduction to Graph Theory

- Undirected Graphs

- Directed Graphs

- Various Graph Data Structures

- Route and Network Problems

**Algorithms**

- Introduction to Algorithms

- Recursive Algorithms

- Serial, Parallel, and Distributed Algorithms

- Exhaustive Search

- Divide-and-Conquer (Binary Search)

- Gradient Search

- Sorting Algorithms

- Linear Programming

- Greedy Algorithms

- Heuristic Algorithms

- Randomized Algorithms

- Shortest Path Algorithms for Graphs

**Machine Learning**

- Introduction to Machine Learning

- Linear Classifiers (Logistic Regression, Naive Bayes Classifier, Support Vector Machines)

- Decision Trees (Random Forests)

- Bayesian Networks

- Hidden Markov Models

- Expectation-Maximization

- Artificial Neural Networks and Deep Learning

- Vector Quantization

- K-Means Clustering

**Question: Do you have any thoughts on the fundamentals of data science? You can leave a comment below.**