Introduction to Probability

probable [Latin probabilis, that may be proved; from probare, to try, test, approve, make good; from probus, good]



The Beginning

The first recorded evidence of probability theory appears as early as 1550 in the work of Cardano. In that year Cardano wrote a manuscript in which he addressed the probability of certain outcomes in rolls of dice and the problem of points, and presented a crude definition of probability. Had this manuscript not been lost, Cardano would almost certainly have been credited with founding probability theory. However, the manuscript was not discovered until 1576 and was not printed until 1663, leaving the door open for independent discovery.

The onset of probability as a useful science is primarily attributed to Blaise Pascal (1623-1662) and Pierre de Fermat (1601-1665). While contemplating a gambling problem posed by the Chevalier de Méré in 1654, Pascal and Fermat laid the fundamental groundwork of probability theory, and are thereby credited as the fathers of probability. The question posed concerned the number of turns required to ensure obtaining a double six in a roll of two dice. The correspondence between Pascal and Fermat concerning this and the problem of points led to the beginning of the new concepts of probability and expectation.

In the seventeenth century, a shopkeeper, John Graunt (1620-1674), set out to predict mortality rates by categorizing births and deaths. In the London Life Table, Graunt made a noteworthy attempt to predict the number of survivors out of one hundred through increments of ten years. This work, along with his earlier paper Natural and Political Observations Made upon the Bills of Mortality, the first known paper to use data in order to draw statistical inferences, gained him admission to the Royal Society of London.

Graunt’s observations and predictions elicited interest in probability from others, such as two brothers, Lodewijk and Christiaan Huygens. Beginning with the interest initially sparked by Graunt’s work and later by the work of Pascal and Fermat, Christiaan Huygens, a Dutch physicist, became the first to publish a text on probability theory, entitled De Ratiociniis in Ludo Aleae (On Reasoning in Games of Chance), in 1657. In this text, Huygens presented the idea of mathematical expectation. This text was unrivaled until James Bernoulli (1654-1705) wrote Ars Conjectandi, which was published eight years after his death.

In Ars Conjectandi, Bernoulli expounded on, and provided alternative proofs for, Huygens’ De Ratiociniis in Ludo Aleae; presented the combinations and permutations that encompass most of the results still used today; included a series of problems on games of chance with explanations; and finally, and most importantly, revealed the famous Bernoulli theorem, later called the law of large numbers.

Probability theory continued to grow with Abraham de Moivre’s The Doctrine of Chances: or, a Method of Calculating the Probability of Events in Play, published in 1718, and Pierre Simon Laplace’s (1749-1827) Théorie Analytique des Probabilités, published in 1812. The Théorie Analytique des Probabilités outlined the evolution of probability theory, providing extensive explanations of the results obtained. In this book Laplace presented the definition of probability which we still use today, and the fundamental theorems of addition and multiplication of probabilities, along with several problems applying the Bernoulli process.


The first major accomplishment in the development of probability theory was the realization that one could actually predict, to a certain degree of accuracy, events which were yet to come. The second accomplishment, which was primarily addressed in the 1800s, was the idea that probability and statistics could converge to form a well-defined, firmly grounded science with seemingly limitless applications and possibilities. It was the initial work of Pascal, Fermat, Graunt, Bernoulli, de Moivre, and Laplace that set probability theory, and then statistics, on its way to becoming the valuable inferential science that it is today.


Probability (Random House): a strong likelihood or chance of something; the relative possibility that an event will occur ... the ratio of the number of actual occurrences to the total number of possible occurrences.

The probability that an event will occur is a number between 0 and 1. In other words, it is a fraction. It is also sometimes written as a percentage, because a percentage is simply a fraction with a denominator of 100. For more about these concepts, see our pages on Fractions and Percentages.

An event that is certain to occur has a probability of 1, or 100%, and one that definitely will not occur has a probability of 0; such an event is said to be impossible.

Probability is easier to understand with an example:

Suppose that you are going to throw a standard die, and you want to know what your chances are of throwing a 6.

In this case, there is only one outcome that leads to that event (i.e. you throw a 6), and 6 possible outcomes altogether (you might throw 1, 2, 3, 4, 5 or 6).

The probability of throwing a six is therefore 1/6.

Now suppose that you want to know what your chances are of throwing 1 or 6. Now there are two favourable outcomes, 1 and 6, but still 6 possible outcomes.

The probability is therefore 2/6, which reduces to 1/3.
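This favourable-over-total counting can be sketched in Python (a minimal illustration; the fractions module is used only so the result prints in reduced form):

```python
from fractions import Fraction

# All possible outcomes of one standard die
outcomes = [1, 2, 3, 4, 5, 6]

# Favourable outcomes for "throw a 1 or a 6"
favourable = [o for o in outcomes if o in (1, 6)]

# Classical probability: favourable / total; Fraction reduces 2/6 to 1/3
probability = Fraction(len(favourable), len(outcomes))
print(probability)  # 1/3
```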

Three Types of Probability

1. Classical

(equally probable outcomes) Let S=sample space (set of all possible distinct outcomes).

Then the probability of an event = \frac{\text{number of ways the event can occur}}{\text{number of outcomes in } S}

2. Relative Frequency Definition

The probability of an event in an experiment is the proportion (or fraction) of times the event occurs in a very long (theoretically infinite) series of (independent) repetitions of the experiment. (e.g. probability of heads = 0.4992)
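A rough simulation illustrates this definition (a sketch, not a proof; the seed and flip count are arbitrary choices, and any long run gives an estimate near 0.5):

```python
import random

random.seed(0)  # arbitrary seed, for a reproducible run

# Estimate P(heads) as the proportion of heads in a long series of fair flips
flips = 100_000
heads = sum(random.random() < 0.5 for _ in range(flips))
estimate = heads / flips
print(estimate)  # close to 0.5 for a long run
```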

3. Subjective Probability

The probability of an event is a "best guess" by a person making the statement of the chances that the event will happen. (e.g. 30% chance of rain)


n distinct objects are to be "drawn" or ordered from left to right (order matters; objects are drawn without replacement).

  1. The number of ways to arrange n distinct objects in a row is n(n-1)(n-2)\cdots (2)(1) = n!

  2. The number of ways to arrange r objects selected from n distinct objects is

n^{(r)} = n(n-1)\cdots (n-r+1) = \frac{n!}{\left ( n-r \right )!}

The number of ways to choose r objects from n (as a set, i.e. order doesn't matter) is denoted by \binom{n}{r}. For n and r both non-negative integers with n\geq r,

\binom{n}{r} = \frac{n!}{r!\left ( n-r \right )!} = \frac{n^{(r)}}{r!}
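Python's standard library exposes these counts directly via math.perm and math.comb (available since Python 3.8); n = 10 and r = 3 below are arbitrary illustrative values:

```python
import math

n, r = 10, 3  # arbitrary example values

# Arrangements of all n distinct objects in a row: n!
arrangements = math.factorial(n)

# Arrangements of r objects drawn from n (order matters): n!/(n-r)!
permutations = math.perm(n, r)

# Choices of r objects from n (order doesn't matter): n!/(r!(n-r)!)
combinations = math.comb(n, r)

print(permutations, combinations)  # 720 120
```

Note that the two quantities differ exactly by the factor r!, since each unordered choice of r objects can be arranged in r! orders.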

Probability of Multiple Events

Probability gets a bit more complicated when you have multiple events, for example, when you’re tossing more than one coin, or throwing several dice.

The reason is that you have more possible outcomes.

For example, when you are tossing two coins, each one could land heads or tails up. So instead of just two possible outcomes (heads or tails), there are now four:

First Coin:    Head    Head    Tail    Tail
Second Coin:   Tail    Head    Tail    Head

More coins will mean more possible outcomes.

As a rule of thumb, the number of possible outcomes is equal to:

The number of outcomes per item to the power of the number of items.

So if you have five coins, each with two possible outcomes, the total number of possible outcomes is 2^5 = 2 x 2 x 2 x 2 x 2 = 32.

If you want to work out the probability of throwing a head and a tail when you throw two coins, there are two outcomes that are favourable (the first coin is heads and the second is tails, or the first is tails and the second is heads), and four possible outcomes in total. The probability is 2/4, or 1/2.
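Enumerating the four outcomes directly confirms this (a sketch using itertools to list every outcome):

```python
from fractions import Fraction
from itertools import product

# All outcomes of tossing two coins: HH, HT, TH, TT
outcomes = list(product("HT", repeat=2))

# Favourable: one head and one tail, in either order
favourable = [o for o in outcomes if set(o) == {"H", "T"}]

print(Fraction(len(favourable), len(outcomes)))  # 1/2
```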

Worked example

If you throw three dice, what is the probability that you do not throw any 4s, 5s, or 6s?

You are throwing three dice, each of which has 6 possible outcomes.

The total number of outcomes is therefore 6^3 = 6 x 6 x 6 = 216

Each die has three favourable outcomes: 1, 2, or 3.

For the first two dice, you need to throw either 1, 2, or 3 for both dice. The favourable outcomes are:

1-1    1-2    1-3    2-1    2-2    2-3    3-1    3-2    3-3

In other words, there are nine favourable outcomes with two dice. Each of these has three favourable outcomes from the third die (i.e. the third die could show 1, 2, or 3).

So the number of favourable outcomes is 9 x 3 = 27.

The probability of not rolling 4, 5, or 6 with three dice is therefore 27/216 = 1/8.
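The worked example can be checked by enumerating all 216 outcomes:

```python
from fractions import Fraction
from itertools import product

# All outcomes of throwing three dice: 6^3 = 216 triples
outcomes = list(product(range(1, 7), repeat=3))

# Favourable: no die shows a 4, 5, or 6 (every die shows 1, 2, or 3)
favourable = [o for o in outcomes if all(d <= 3 for d in o)]

probability = Fraction(len(favourable), len(outcomes))
print(probability)  # 1/8
```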

Wed, 02/24/2021 - 15:46
Shiksha is working as a Data Scientist at iVagus. She has expertise in Data Science and Machine Learning.