Next, we'll count the number of vertices and edges, define a min and a max function for Spark's reduce method, compute the min and max degrees, and compute the histogram data of the degree of connectedness. In this hands-on, we will cover the degree distribution of the metros graph from the first hands-on exercise. This will help you practice finding the degrees of connectedness in a graph. These numbers will be used for plotting the visualizations in the next hands-on exercises.

The degree of a vertex is the number of edges, or connections, the vertex has to other vertices in the graph. In directed graphs, each vertex has an in-degree, the number of edges directed to the vertex, and an out-degree, the number of edges directed away from the vertex. The metros graph is an example of a directed graph. Each metropolis vertex has one outgoing edge to a country vertex. Each country vertex has one or more incoming edges from metropolis vertices. This will be a quiz question.

Starting again where we left off in the previous hands-on exercise, first ensure your Cloudera VM is started and that you downloaded the dataset examples of analytics. The link is in the content for this week. Use the numEdges attribute to print the number of edges in metrosGraph. As you can see, the result is 65, which matches the number of lines in metro_country.csv. Now use the numVertices attribute to print the number of vertices in metrosGraph. As you can see, the result is 93, which matches the number of lines in metro.csv, 65, plus the number of lines in country.csv, 28.

Define the max and min functions for the reduce operation to compute the highest- and lowest-degree vertices. Let us find the vertex with the most outgoing edges, or the vertex with the largest out-degree, by passing the max function to a reduce operation on the outDegrees of metrosGraph. The result in this case is vertex ID five with one outgoing edge.
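The steps so far can be sketched in Scala roughly as follows. This is only an illustration, not the exercise's own code: it assumes Spark with GraphX is on the classpath (for example, pasted into spark-shell), and it builds a tiny made-up graph called toyGraph instead of loading the course CSV files. The place names and IDs are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}

// Reuses the running context if there is one (as in spark-shell),
// otherwise starts a local one.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[*]").setAppName("degree-sketch"))

// Hypothetical toy data: three metropolises, two countries.
val vertices = sc.parallelize(Seq[(VertexId, String)](
  (1L, "Metro A"), (2L, "Metro B"), (3L, "Metro C"),
  (101L, "Country X"), (102L, "Country Y")))

// One directed edge from each metropolis to its country.
val edges = sc.parallelize(Seq(
  Edge(1L, 101L, 1), Edge(2L, 101L, 1), Edge(3L, 102L, 1)))

val toyGraph = Graph(vertices, edges)

println(toyGraph.numEdges)    // 3
println(toyGraph.numVertices) // 5

// Reduce functions over (vertexId, degree) pairs: keep the pair
// with the larger (or smaller) degree.
def max(a: (VertexId, Int), b: (VertexId, Int)): (VertexId, Int) =
  if (a._2 > b._2) a else b

def min(a: (VertexId, Int), b: (VertexId, Int)): (VertexId, Int) =
  if (a._2 <= b._2) a else b

// Every metropolis has out-degree 1, so which one is returned
// depends on the order in which the pairs are reduced.
val maxOut = toyGraph.outDegrees.reduce(max)
val minOut = toyGraph.outDegrees.reduce(min)
println(maxOut)
```

Note that outDegrees only contains vertices that have at least one outgoing edge, so the countries do not appear in it at all, and the minimum found here is 1 rather than 0.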
The result could have been any metropolis, because every metropolis in this graph has one outgoing edge to its country. Let us find the vertex with the most incoming edges, or the vertex with the largest in-degree. This is done the same way as the previous example, except you'll run the reduce operation on the inDegrees of metrosGraph. The result is vertex ID 108 with 14 incoming edges. Apply a filter to the metrosGraph vertices to find out which vertex is 108. The answer is the United States. This means that the United States has 14 metropolises in the metro.csv file.

We can also compute how many vertices have one outgoing edge by filtering the out-degrees for a value of one and counting the results. The result is 65, because there are 65 metropolises with one outgoing edge. None of the countries have any outgoing edges.

Let us ignore whether the edge is in or out, and just find which vertex has the most edges. Again, we will run the reduce operation with the max function, but this time we will run it on metrosGraph's degrees attribute. The result is 108 again, with 14 connections. This means that the United States is the most connected vertex in metrosGraph.

Finally, let us calculate the histogram data of the degrees for the countries in metrosGraph. First, create a map that only includes countries: filter the degrees to keep only the vertices whose vertex ID is greater than or equal to 100. Then group the map by degree and sort it from lowest to highest degree. The output shows six pairs in an array. In each pair, the first number is the number of edges, and the second number is the number of vertices that have that number of edges. In other words, the result of the query shows that there are 18 countries with 1 metropolis, 4 countries with 2 metropolises, 2 countries with 3 metropolises, 2 countries with 5 metropolises, 1 country with 9 metropolises, and 1 country with 14 metropolises.
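As a rough sketch of the in-degree, total-degree, and histogram steps described above, the following Scala uses a small made-up graph (toyGraph, with hypothetical names and IDs) rather than the real metrosGraph, so the numbers differ from the ones quoted in the exercise. It assumes Spark with GraphX is available, and it follows the transcript's convention that country vertices have IDs of 100 or more.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}

val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[*]").setAppName("histogram-sketch"))

// Hypothetical toy data: five metropolises, three countries.
val vertices = sc.parallelize(Seq[(VertexId, String)](
  (1L, "Metro A"), (2L, "Metro B"), (3L, "Metro C"),
  (4L, "Metro D"), (5L, "Metro E"),
  (101L, "Country X"), (102L, "Country Y"), (103L, "Country Z")))
val edges = sc.parallelize(Seq(
  Edge(1L, 101L, 1), Edge(2L, 101L, 1), Edge(3L, 101L, 1),
  Edge(4L, 102L, 1), Edge(5L, 103L, 1)))
val toyGraph = Graph(vertices, edges)

// Keep whichever (vertexId, degree) pair has the larger degree.
def max(a: (VertexId, Int), b: (VertexId, Int)): (VertexId, Int) =
  if (a._2 > b._2) a else b

// Vertex with the most incoming edges, then its attribute.
val maxIn = toyGraph.inDegrees.reduce(max)            // (101,3)
val topId = maxIn._1
val topVertex = toyGraph.vertices.filter(_._1 == topId).collect()

// How many vertices have exactly one outgoing edge.
val oneOut = toyGraph.outDegrees.filter(_._2 == 1).count  // 5

// Most connected vertex, ignoring edge direction.
val maxDeg = toyGraph.degrees.reduce(max)             // (101,3)

// Histogram for countries only: (degree, number of countries).
val histogram = toyGraph.degrees
  .filter { case (vid, _) => vid >= 100 }
  .map { case (vid, degree) => (degree, vid) }
  .groupByKey()
  .map { case (degree, vids) => (degree, vids.size) }
  .sortBy(_._1)
  .collect()

println(histogram.mkString(", "))  // (1,2), (3,1)
```

Here the histogram says two toy countries have one metropolis and one has three, which is the same kind of (degree, count) output the exercise reports for the real data.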