With the increasing availability of granular data on the relationships between individual entities - such as persons (social media), countries (international trade) and financial institutions (supervisory reporting) - network analysis offers many possibilities to extract useful information from such data. This post provides an introduction to network analysis in R, using the igraph package to calculate metrics and ggraph for visualisation. It marks the beginning of a more comprehensive treatment of network analysis on r-econometrics.
There are multiple packages for the analysis of networks in R. This page concentrates on the igraph package, which allows for a broad range of applications. But before we get into it in more detail, it is useful to know that there are two possible ways to represent the edges, i.e. the connections, of a network:
Adjacency matrix: This is a square matrix, where each row and column corresponds to an entity.
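As a minimal sketch of the adjacency-matrix representation, the following builds a small undirected graph with igraph. The three nodes A, B and C and their connections are made up for illustration; they do not come from the original post. The last line shows the same edges as a two-column list of node pairs, which is a commonly used alternative way of storing edges.

```r
library(igraph)

# Hypothetical 3-node network: A is connected to B and to C.
nodes <- c("A", "B", "C")
adj <- matrix(c(0, 1, 1,
                1, 0, 0,
                1, 0, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(nodes, nodes))

# A 1 in row i, column j means there is an edge between i and j.
g <- graph_from_adjacency_matrix(adj, mode = "undirected")

# The same edges as a two-column matrix of node pairs.
edges <- as_edgelist(g)
```

Because the matrix is symmetric and the graph is undirected, each connection is counted once, so the graph has three nodes and two edges.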
With the increasing availability of granular data on the relationships between individual entities - such as persons (social media), countries (international trade) and financial institutions (supervisory reporting) - network analysis offers many possibilities to extract useful information from such data. This section provides brief introductions to the analysis of network data in R.
Basics of the igraph package
Summary statistics with the igraph
Network visualisation with ggraph
If a network is small, it can be easily summarised by its graph or a figure. But once a network reaches a certain size, it becomes more meaningful to use more formal summary statistics in order to describe its features. This post covers some basic network summary statistics as presented in Jackson (2008). The metrics are based on the concept of centrality, which describes the importance of a node in a given network of nodes.
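To make the idea of centrality concrete, here is a minimal sketch using igraph's built-in centrality functions on a made-up star network. The graph is an assumption chosen for illustration, and the selection of measures (degree, closeness, betweenness) is mine rather than a reproduction of the post's exact list.

```r
library(igraph)

# Hypothetical star network: node 1 is the hub, nodes 2 to 5 are leaves.
g <- make_star(5, mode = "undirected")

# Degree: number of edges attached to each node.
deg <- degree(g)

# Closeness: inverse of the summed shortest-path distances to all other nodes.
clo <- closeness(g)

# Betweenness: number of shortest paths between other nodes that pass through a node.
btw <- betweenness(g)

centrality <- data.frame(node = seq_len(vcount(g)),
                         degree = deg,
                         closeness = clo,
                         betweenness = btw)
```

In a star network the hub ranks highest on all three measures, which matches the intuition that it is the most important node.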
Besides the calculation of summarising network metrics, the visualisation of a graph can also be a very informative step in network analysis. Since visualisations in R usually involve the ggplot2 package, I focus on the ggraph package, which is based on the ggplot2 architecture. For illustration I use the artificial data set from my post on network analysis, which is an igraph object named graph_df.
When using ggplot2 the main challenge in the visualisation of networks is to find suitable x- and y-coordinates for the nodes of a graph.
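The following sketch shows one way ggraph can handle the coordinate problem: a layout algorithm assigns x- and y-coordinates to the nodes, which are then drawn with ggplot2-style layers. Since graph_df is not available here, a small ring graph serves as a hypothetical stand-in, and the Fruchterman-Reingold layout ("fr") is just one possible choice.

```r
library(igraph)
library(ggraph)

# Hypothetical stand-in for graph_df: a ring of six nodes.
g <- make_ring(6)

# Force-directed layouts start from random positions, so fix the seed.
set.seed(1)

# create_layout() returns a data frame with one row per node,
# including the computed x and y coordinates.
coords <- create_layout(g, layout = "fr")

# Let ggraph compute the layout itself and add edge and node layers.
p <- ggraph(g, layout = "fr") +
  geom_edge_link() +
  geom_node_point(size = 3) +
  theme_graph()
```

Printing p draws the plot. Other layout names can be passed to the layout argument in the same way.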