GERGM

An R package to estimate Generalized Exponential Random Graph Models

NOTE: This package is still under development. PLEASE REPORT ANY BUGS OR ERRORS TO mdenny@psu.edu.

Model Overview

This package implements the Generalized Exponential Random Graph Model (GERGM), with an extension to estimation via Metropolis-Hastings. The relevant papers detailing the model can be found at the links below:

Installation

Requirements for using C++ code with R

See the Requirements for using C++ code with R section in the following tutorial: Using C++ and R code Together with Rcpp. You will likely need to install either Xcode or Rtools depending on whether you are using a Mac or Windows machine before you can use the package.

Installing The Package

The easiest way to install the package is from CRAN, via the standard install.packages command:

install.packages("GERGM")

This will take care of some weird compilation issues that can arise, and is the best option for most people. If you want the most current development version of the package (available here), you will need to start by making sure you have Hadley Wickham's devtools package installed.

install.packages("devtools")

Now we can install from GitHub using the following line:

devtools::install_github("matthewjdenny/GERGM")

I have had success installing this way on most major operating systems with R 3.2.0+ installed. If you do not have the latest version of R, or you run into install errors (please email me if you do), installation should still work as long as you install the dependencies first with the following line of code:

install.packages(pkgs = c("BH", "RcppArmadillo", "ggplot2", "methods", "stringr"), dependencies = TRUE)
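If you would rather see which dependencies are missing before installing anything, the following is a minimal sketch in base R. The helper name find_missing_packages is hypothetical and not part of GERGM. The methods package is left out of the list because it ships with R itself.

```r
# Hypothetical helper (not part of GERGM): return the subset of `pkgs`
# that cannot be loaded from the current library paths.
find_missing_packages <- function(pkgs) {
  # requireNamespace() returns FALSE (quietly) when a package is absent;
  # note that it loads the namespace of any package that is present.
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!is_installed]
}

deps <- c("BH", "RcppArmadillo", "ggplot2", "stringr")
missing_deps <- find_missing_packages(deps)

if (length(missing_deps) > 0) {
  message("Missing packages: ", paste(missing_deps, collapse = ", "))
  # Uncomment to install only what is missing:
  # install.packages(missing_deps, dependencies = TRUE)
} else {
  message("All listed dependencies are already installed.")
}
```

The install call is commented out so the snippet can be run safely as a check.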

Once the GERGM package is installed, you may access its functionality as you would any other package by calling:

library(GERGM)

If all went well, check out the ?GERGM help file to see a full working example with info on how the data should look.

Basic Usage

To use this package, first load in the network you wish to use as a (square) matrix, following the example provided below. You may then use the gergm() function to estimate a model using any combination of the following statistics: out2star(alpha = 1), in2star(alpha = 1), ctriads(alpha = 1), recip(alpha = 1), ttriads(alpha = 1), edges(alpha = 1), absdiff(covariate = "MyCov"), edgecov(covariate = "MyCov"), sender(covariate = "MyCov"), reciever(covariate = "MyCov"), nodefactor(covariate, base = "MyBase"), netcov(network_covariate). To use exponential downweighting for any of the network-level terms, simply specify a value for alpha less than 1. The gergm() function provides all of the estimation and diagnostic functionality, and its parameters can be queried by typing ?gergm into the R console. You may also generate diagnostic plots from a GERGM object returned by the gergm() function using any of the following functions: Estimate_Plot(), GOF(), Trace_Plot().
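As a rough illustration of the two steps described above, here is a sketch that checks a network matrix and then writes a formula with exponential downweighting. The helper check_network_matrix is hypothetical and not part of GERGM. Its requirement that row and column names match is an assumption based on how the examples in this README name their networks. The value alpha = 0.8 is arbitrary. Only base R is used, so nothing is estimated here.

```r
# Hypothetical helper (not part of GERGM): basic sanity checks on a network
# before it is used on the left-hand side of a gergm() formula.
check_network_matrix <- function(net) {
  if (!is.matrix(net) || !is.numeric(net)) {
    stop("The network must be a numeric matrix.")
  }
  if (nrow(net) != ncol(net)) {
    stop("The network must be a square matrix.")
  }
  # Assumption: row and column names exist and match, as in the examples.
  if (is.null(rownames(net)) || !identical(rownames(net), colnames(net))) {
    stop("Row and column names must be present and identical.")
  }
  invisible(TRUE)
}

set.seed(123)
net <- matrix(rnorm(100, 0, 20), 10, 10)
colnames(net) <- rownames(net) <- letters[1:10]
check_network_matrix(net)

# Default weighting: alpha = 1 for every network-level term.
plain_formula <- net ~ edges + recip + ttriads

# Exponential downweighting on one term: alpha < 1 (0.8 is arbitrary).
downweighted_formula <- net ~ edges + recip + ttriads(alpha = 0.8)

# Inspect the terms of the formula; building it does not evaluate them.
print(attr(terms(downweighted_formula), "term.labels"))
```

Either formula object can then be passed as the first argument to gergm().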

Examples

Here are two simple working examples using the gergm() function:

library(GERGM)
########################### 1. No Covariates #############################
# Preparing an unbounded network without covariates for gergm estimation #
net <- matrix(rnorm(100,0,20),10,10)
colnames(net) <- rownames(net) <- letters[1:10]
formula <- net ~ recip + edges  
  
test <- gergm(formula,
              normalization_type = "division",
              network_is_directed = TRUE,
              use_MPLE_only = FALSE,
              estimation_method = "Metropolis",
              maximum_number_of_lambda_updates = 1,
              maximum_number_of_theta_updates = 5,
              number_of_networks_to_simulate = 40000,
              thin = 1/10,
              proposal_variance = 0.5,
              downweight_statistics_together = TRUE,
              MCMC_burnin = 10000,
              seed = 456,
              convergence_tolerance = 0.01,
              MPLE_gain_factor = 0,
              force_x_theta_update = 4)
  
########################### 2. Covariates #############################
# Preparing an unbounded network with covariates for gergm estimation #
net <- matrix(runif(100,0,1),10,10)
colnames(net) <- rownames(net) <- letters[1:10]
node_level_covariates <- data.frame(Age = c(25,30,34,27,36,39,27,28,35,40),
                                    Height = c(70,70,67,58,65,67,64,74,76,80),
                                    Type = c("A","B","B","A","A","A","B","B","C","C"))
rownames(node_level_covariates) <- letters[1:10]
network_covariate <- net + matrix(rnorm(100,0,.5),10,10)
formula <- net ~ recip + edges + sender("Age") + 
netcov("network_covariate") + nodefactor("Type",base = "A")  
   
test <- gergm(formula,
              covariate_data = node_level_covariates,
              network_is_directed = TRUE,
              use_MPLE_only = FALSE,
              estimation_method = "Metropolis",
              maximum_number_of_lambda_updates = 5,
              maximum_number_of_theta_updates = 5,
              number_of_networks_to_simulate = 100000,
              thin = 1/10,
              proposal_variance = 0.5,
              downweight_statistics_together = TRUE,
              MCMC_burnin = 50000,
              seed = 456,
              convergence_tolerance = 0.01,
              MPLE_gain_factor = 0,
              force_x_theta_update = 2)
  
# Generate Estimate Plot
Estimate_Plot(test)
# Generate GOF Plot
GOF(test)
# Generate Trace Plot
Trace_Plot(test)

Finally, if you specified an output_directory and output_name, you will want to check the output_directory, which will contain a number of PDF files that can aid in assessing model fit and in determining the statistical significance of theta parameter estimates.
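One quick way to see what was written is to list the PDF files in that directory from R. The sketch below uses only base R. The directory path is a hypothetical placeholder, so substitute the output_directory you actually passed to gergm().

```r
# Hypothetical location: replace with the output_directory given to gergm().
output_directory <- file.path(tempdir(), "gergm_output")

# Created here only so the sketch runs on its own; after a real gergm() run
# the directory would already exist.
dir.create(output_directory, showWarnings = FALSE, recursive = TRUE)

# Collect every PDF in the directory, ignoring the case of the extension.
pdf_files <- list.files(output_directory,
                        pattern = "\\.pdf$",
                        full.names = TRUE,
                        ignore.case = TRUE)

if (length(pdf_files) == 0) {
  message("No PDF files found in ", output_directory)
} else {
  print(basename(pdf_files))
}
```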

Output

If output_name is specified in the gergm() function, then five files will be automatically generated and saved to the output_directory. The example file names provided below are for output_name = "Test":

Testing

So far, this package has been tested successfully on OS X 10.9.5 and Windows 7. Please email me at mdenny@psu.edu if you have success on another OS or run into any problems.