0:30
So in looking at the higher dimensional drawings or
higher dimensional spaces, you can think of something like a square.
A square consists of two one-dimensional lines,
these two vertical lines, that are separated horizontally.
I've taken a one dimensional line, copied into a second vertical line and
then connected them up with lines between the vertices.
So I've now got a two-dimensional square.
If I want to make this two-dimensional square into a three-dimensional cube,
I just copy the square, and move it in a direction, and then I just link up
the corresponding vertices with lines, and that gives me a three-dimensional cube.
If I want a four-dimensional cube, I just take the three-dimensional cube and
I make another copy of it and offset it in another direction, and
then just draw lines between the corresponding vertices and it gives me
a tesseract, some two-dimensional projection of a four-dimensional cube.
And so you can do this, but it's not very helpful for
visualization, because you've got all these different directions that are being
projected into a two-dimensional image, and after three dimensions, we're just
not used to perceiving the world in four or higher dimensions, and so we just have
difficulty perceiving four-dimensional or higher data being projected that way.
If I have a scatter plot, for example, I can pretty readily see x and
y encoded in the positional Cartesian coordinates of x and y here.
If I had a z coordinate that's being projected,
z is being foreshortened as it's being projected here, and
I've gotta mentally remember that z is some combination of horizontal and
vertical that's different than the horizontal x or the vertical y axis.
So you can draw these hints that are basically similar to a shadow,
where I'm indicating, with these dashed lines,
what the coordinates of this green dot are in three dimensions,
because it lacks other visual cues of where it is in three dimensions.
If you try to do this in four dimensions,
then it becomes a nightmare to manage all those dimensions.
So there's a better technique for doing this called parallel coordinates that
Al Inselberg demonstrated to me back in the early 90s,
when they were first invented, and I was quite blown away by his demonstration, and
I'll try to reproduce it here.
2:55
So on parallel coordinates, we're going to take the Cartesian coordinates.
We have a horizontal x-axis and a vertical y-axis, and
we're going to take these axes and we're going to
make them parallel instead of orthogonal, at right angles, as they usually are.
So I'm going to take the x-axis, I'm going to put it here, and
I'm going to take the y-axis and I'm going to put it here.
And so I'll label this axis x and this axis y.
So now the two axes are not orthogonal.
They're parallel to each other and they don't extend from the same origin.
So the origin here is horizontally at the bottom and
increasing x goes up this axis, increasing y goes up this axis.
So now I've got the data points, and I need to figure out where these data points
occur, so if I take the y-coordinates of each data point,
I can map each point onto its corresponding position on the y-axis.
I'm just basically dragging them across horizontally, because their position in y
in this chart corresponds to their height along this y-axis.
If I want to do the same for
the x-axis, I'm going to basically take the x position of each point, and then I'm
going to drag it to the corresponding x position on this vertical x-axis, so
the horizontal length here corresponds to the vertical length here, and
so blue and red are at the same horizontal x coordinate, and so
they overlap each other on the x-axis.
And in green is a little bit farther to the right, so
it's going to be a little bit higher on the parallel x-axis.
Yellow is a little bit farther to the right so
it's going to be a little bit higher.
And then this blue green color dot is the farthest right so
it'll be the highest on the x-axis.
So now I've got this one set of points that's now appearing as two
sets of points, and I've got a correspondence between color,
and color's a good perceptual indicator of category, but
it's very difficult to perceive what's going on here.
So what we're going to do is, instead of displaying these as points on the axis,
I'm going to connect these points with lines, and
I'm going to delete the original points, and
now we get this nice duality between points in the coordinate system,
here, and lines in the coordinate system, here.
So, this orange point, here, has an x-coordinate and a y-coordinate.
It corresponds to this line connecting its x-coordinate to its y-coordinate.
And so each one of these five points in the Cartesian x y coordinate
system corresponds to a line in the parallel x y coordinate system here.
5:49
Some other features are collinearity, so if these three points lie along
the same line, then you get this nice convergence in parallel coordinates.
And so you can see some co-linearity that happens, because lines in parallel
coordinates, corresponding to points in Cartesian coordinates that are collinear,
will basically converge in a point in the parallel coordinate system.
So you can add extra dimensions.
If I add a z-axis here, I'd have to add all sorts of three-dimensional queues to
these data points to figure out where they are in three dimensions.
In parallel coordinates, I just add a z-axis here.
And then I can connect the z-coordinates to the y-coordinates,
and follow this line from its x-coordinate to its y-coordinate to its z-coordinate.
And if I add a fourth w-axis.
It's actually easy to add a fourth axis here.
And in fact, all of these points may be collinear in the z w coordinate system,
and you can see because these lines are kind of converging to the same point.
The point that they converge to doesn't necessarily have to be between the axes.
But there are some ways of helping a person to see these points.
And so these parallel coordinates are really useful for high-dimensional data.
7:07
And you get this correspondence of being able to follow these lines through
the various coordinates and have them correspond to points here, and
it's easier to see some relationships.
There could be correspondences between the w-axis and the y-axis, and
we wouldn't see those unless we also had lines from these points on
the w-axis to the points on the y-axis, and so you get a bit of a combinatorial
nightmare when you try to connect every axis to every other axis.
So there's some decision making that needs to happen if you're going to show
correspondences between axes,
to choose to have those axes next to each other in the parallel coordinates system.
7:49
So parallel coordinates take the orthogonal axes of the Cartesian
coordinate system and they lay them out in parallel, and you can have any number of
these parallel coordinate axes laid next to each other, and you can start to
perceive higher dimensional data using these parallel coordinates.
Some of the problems of parallel coordinates are the fact that you're only
seeing the pairwise relationship between the axes, but they can be very useful for
finding certain features in your data, for example, collinearity.
[MUSIC]