But sometimes we, as human beings, relate more readily to pictures and
diagrams than to lots of figures appearing in tabular form.
So if, for example, we had a categorical variable,
such as the continent to which a particular country belongs,
then an appropriate graphical display might be a simple bar chart.
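
To make that concrete, here is a minimal sketch of the counting that sits behind a bar chart, using only the Python standard library. The list of continents is made up purely for illustration, and the helper names `frequency_table` and `text_bar_chart` are hypothetical, not taken from any particular software package.

```python
from collections import Counter

# Hypothetical sample: the continent of each country in a small data set.
continents = ["Africa", "Asia", "Europe", "Africa",
              "Asia", "Africa", "Europe", "Oceania"]

def frequency_table(values):
    """Count how many times each category appears."""
    return Counter(values)

def text_bar_chart(counts):
    """Return one line per category, with a bar of '#' as long as its count."""
    width = max(len(name) for name in counts)
    lines = []
    for name, count in sorted(counts.items()):
        lines.append(f"{name.ljust(width)} | {'#' * count} {count}")
    return "\n".join(lines)

continent_counts = frequency_table(continents)
print(text_bar_chart(continent_counts))
```

Each bar's length is simply the number of countries in that category, which is all a simple bar chart is showing.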
Now, of course, in practice,
we would use computer software to generate diagrams for us.
We'd never really draw these things manually with pen and paper.
But when computer software produces a diagram,
take great care that the diagram it generates is not actually
distorting what you are trying to communicate to the wider audience.
So take great care about any axes which are used and
any scales used on those axes.
And make sure that your diagram is not distorting what you're trying to convey.
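
As a rough illustration of how an axis choice can distort, the sketch below compares how tall two bars look relative to one another when the vertical axis starts at zero and when it starts just below the smaller value. The two values and the function name `apparent_ratio` are assumptions made up for this example.

```python
def apparent_ratio(a, b, axis_start=0.0):
    """Ratio of the drawn bar heights for values a and b
    when the vertical axis begins at axis_start."""
    if axis_start >= min(a, b):
        raise ValueError("axis_start must be below both values")
    return (a - axis_start) / (b - axis_start)

# Hypothetical values that differ by only 4 percent.
value_a, value_b = 52.0, 50.0

honest = apparent_ratio(value_a, value_b)                      # axis starts at 0
truncated = apparent_ratio(value_a, value_b, axis_start=49.0)  # axis starts at 49

print(f"axis from 0:  bar A looks {honest:.2f} times as tall as bar B")
print(f"axis from 49: bar A looks {truncated:.2f} times as tall as bar B")
```

The data are identical in both cases; only the starting point of the axis changes how large the difference appears.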
Equally, if you're looking at a diagram produced by someone else,
particularly by perhaps a politician or an advertising company,
they may be deliberately trying to distort what the data are genuinely showing
to try to get across their particular perspective and point of view.
Now, perhaps a very common type of diagram used when
analyzing a single measurable variable is the histogram.
So, for example, we might consider the GDP per capita
across a set of countries within a sample data set.
And a histogram very clearly brings to life the data, and
you get a very strong sense of the distribution of GDP per capita
across the various countries in our data set.
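
A histogram is built by grouping the values into intervals and counting how many fall in each. The sketch below shows one way this could be done, assuming equal-width intervals and values no smaller than the starting point. The GDP per capita figures are invented for illustration (in thousands of US dollars), and `histogram_counts` is a hypothetical helper.

```python
def histogram_counts(values, bin_width, start=0.0):
    """Count values in equal-width bins [start + k*w, start + (k+1)*w).
    Assumes every value is at least `start`."""
    counts = {}
    for v in values:
        k = int((v - start) // bin_width)
        counts[k] = counts.get(k, 0) + 1
    n_bins = max(counts) + 1
    return [counts.get(k, 0) for k in range(n_bins)]

# Made-up GDP per capita figures, in thousands of US dollars.
gdp_per_capita = [1.2, 0.8, 2.5, 3.1, 4.7, 1.9, 0.6, 6.3,
                  2.2, 9.8, 1.1, 3.6, 48.0, 0.9, 2.8, 61.5]

bin_width = 10.0
bin_counts = histogram_counts(gdp_per_capita, bin_width)

for k, count in enumerate(bin_counts):
    low = int(k * bin_width)
    high = int(low + bin_width)
    print(f"{low:>3} to {high:<3} | {'#' * count}")
```

With these invented figures, most countries land in the lowest interval and a couple sit far out to the right, which is the kind of shape a histogram makes visible at a glance.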
Now that word, distribution, perhaps rings a bell from some work we did in week
two of the course, when we constructed some simple probability distributions.
For example, for the score on a fair die, say.
Or perhaps the chances of getting heads or tails when tossing a fair coin.
But, of course, those probability distributions were derived theoretically.
For example, that fair die has six equally likely outcomes,
and we attached a probability of one over six to each of those.
In contrast, here we're looking not at a
theoretical probability distribution, but rather at a sample distribution,
i.e. the distribution of the variable being considered within our sample data set.
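
The difference between the two can be sketched with the fair die. The theoretical distribution attaches one over six to each face by reasoning alone, whereas a sample distribution comes from the relative frequencies in data we have actually observed. Here the "observed" rolls are simulated with a fixed seed, which is an assumption made purely so the example is repeatable; `sample_distribution` is a hypothetical helper.

```python
import random
from collections import Counter
from fractions import Fraction

# Theoretical distribution: six equally likely outcomes, each with probability 1/6.
theoretical = {face: Fraction(1, 6) for face in range(1, 7)}

def sample_distribution(rolls):
    """Relative frequency of each face within the observed rolls."""
    counts = Counter(rolls)
    n = len(rolls)
    return {face: counts.get(face, 0) / n for face in range(1, 7)}

# Simulated sample of 600 rolls (seeded so that reruns give the same sample).
rng = random.Random(0)
rolls = [rng.randint(1, 6) for _ in range(600)]
observed = sample_distribution(rolls)

for face in range(1, 7):
    print(f"face {face}: theoretical {float(theoretical[face]):.3f}, "
          f"sample {observed[face]:.3f}")
```

The sample proportions will typically be close to, but not exactly, one over six, because they describe one particular data set rather than the theoretical model.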
So looking at this histogram of GDP per capita,
you can see that it varies a great deal across the countries in our data set,
with the vast majority of countries being quite poor on a GDP per capita basis,
and one or two countries performing very well on that basis.
Of course, we need to look at these things more numerically, and
just because a country has a high GDP per capita
does not necessarily mean everyone within that country is wealthy.
GDP per capita is simply an example of an average, which we're going to be looking at
in the next section, whereby we take the total GDP in a country and
imagine, hypothetically, that it was split equally among the population.
Equally, in countries with very low levels of GDP per capita, of course,
those at the very top of society, if there was a lot of corruption, say,
are perhaps doing very well in life,
with the vast majority of the population struggling a great deal.
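
A small sketch can show why a per capita figure on its own hides how the total is shared out. The two five-person "countries" below are entirely made up, as are the helper names `per_capita` and `count_below`. Both have the same total and therefore the same per capita figure.

```python
def per_capita(incomes):
    """Total income split equally, hypothetically, among everyone."""
    return sum(incomes) / len(incomes)

def count_below(incomes, threshold):
    """Number of people whose income is below the threshold."""
    return sum(1 for x in incomes if x < threshold)

# Two made-up populations with the same total income of 100.
evenly_shared = [20, 20, 20, 20, 20]
concentrated = [2, 2, 2, 2, 92]

for name, incomes in [("evenly shared", evenly_shared),
                      ("concentrated", concentrated)]:
    average = per_capita(incomes)
    below = count_below(incomes, average)
    print(f"{name}: per capita = {average}, "
          f"people below that figure = {below} of {len(incomes)}")
```

The per capita figure is identical in both cases, yet in the second population four people out of five are well below it.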
So GDP per capita is perhaps one fairly simplistic metric
to gain some sense of how wealthy a country is.
But, of course, it is not revealing the entire picture.
But nonetheless, it's achieving a simple goal of data reduction.
In a simple histogram we can convey quite a lot of information about how wealth is
spread across different countries around the world.