# Overview

This vignette introduces the GWASinspector package, its general form, and how to run the algorithm on multiple GWAS files.

Check our website for further information and reference data sets.

The manual for this package can also be accessed online.

# Installation

1. The easiest way to get GWASinspector is to install it from CRAN:

        # this will automatically download and install the dependencies
        install.packages("GWASinspector")

2. Alternatively, you can use the installation function and the zipped package from our website:

        # get the installation function from our website
        source('http://GWASinspector.com/references/install_GWASinspector.R')

        # this function will check R packages and install the dependencies from CRAN
        install.GWASinspector(package.path = 'path/to/packageFile.gz')

# Required files

## Allele reference panels

Comparing result files with a standard reference panel is the most important part of the QC process. This reference is used to check the alleles in the datasets and to ensure they are all in the same configuration (same strand, same coded alleles) in the post-QC data.

We have created databases from the most popular reference panels (e.g. HapMap, 1000G, HRC), which are available from our website. Database files are in SQLite format and can be downloaded as a compressed file.

Some reference panels include more than one population; the population to use should be set as a parameter in the configuration file before running the algorithm.

## Header-translation table

The column names used in the input may differ between files (e.g. one file uses EFFECT_ALLELE where another uses CODED_ALLELE). The header-translation file is a table of possible column names and their standard translations.

A sample file including common terms is provided as part of the package and can be used as a template. The file contains a two-column table, with the left column containing the standard column names and the right column the alternatives.

## Configuration file

An INI file is used to configure the quality control. See the manual for details.

Key names and section names should not be edited or renamed; otherwise the algorithm will not work properly.

A sample file is included in the package and should be used as a template. File paths and QC parameters are set according to the comments and examples in the file.

# Step-by-step guide to run a QC

This walkthrough explains how to run QC on a sample result file.

## Step 1: make sure the package is installed correctly

After installation, try loading the package with the following command. You might see some warnings but the package should load successfully without any errors.

    require(GWASinspector)
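If you prefer an explicit check over reading the console output, the following is a minimal sketch that uses only base R. It tests whether the package namespace can be loaded and reports the installed version; the variable name `installed` is just an illustration.

```r
# Check whether GWASinspector can be loaded, without attaching it.
installed <- requireNamespace("GWASinspector", quietly = TRUE)

if (installed) {
  message("GWASinspector is installed, version ",
          as.character(utils::packageVersion("GWASinspector")))
} else {
  message("GWASinspector is not installed; see the Installation section.")
}
```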

## Step 2: check R environment

The local machine and R environment can be checked by running the following function.

    system.check()

This reports whether pandoc, the kableExtra and xlsx packages, and the Java library are available. These are not required for running GWASinspector, but they allow Excel and HTML formatted outputs.
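The same optional components can also be probed by hand with base R. This is only a rough sketch and not how `system.check()` is implemented: it assumes that pandoc and Java, if installed, are discoverable on the system `PATH`, which is not always the case (for example, some IDEs bundle their own pandoc).

```r
# Optional R packages named in this vignette.
optional_pkgs <- c("kableExtra", "xlsx")
pkg_available <- vapply(optional_pkgs, requireNamespace, logical(1),
                        quietly = TRUE)
print(pkg_available)

# External tools: Sys.which() returns an empty string when nothing is found.
pandoc_path <- Sys.which("pandoc")
java_path   <- Sys.which("java")
message("pandoc found on PATH: ", nzchar(pandoc_path))
message("java found on PATH: ", nzchar(java_path))
```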

## Step 3: download the standard allele-frequency reference dataset

Standard allele-frequency reference datasets are available from our website. The downloaded file should be decompressed and copied to the references folder (dir_references parameter of the config file).

## Step 4: get the header-translation table

A copy of this file can be saved to a local folder by running the command below. It is a text file that includes the most common variable/header names and can be edited according to user specifications. The file should be placed in the references folder (dir_references parameter of the config file).

The default name of this file is alt_headers.txt. If you rename it, update the header_translations field in the configuration file accordingly.

    get.headerTranslation.file('c:/path/to/referenceFolder') # copies the file to the selected folder

Note: duplicated entries will stop the algorithm.
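Before running the QC, it can help to screen an edited table for duplicates. The sketch below is illustrative only. It assumes a whitespace-separated, two-column file without a header row, and it assumes that "duplicated" means an alternative name appearing more than once; the actual layout of alt_headers.txt and the package's own check may differ. The table contents, including the standard names `EFFECT_ALL` and `MARKER`, are hypothetical.

```r
# Hypothetical two-column table written to a temporary file for illustration;
# in practice, point read.table() at your own header-translation file.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "EFFECT_ALL EFFECT_ALLELE",
  "EFFECT_ALL CODED_ALLELE",
  "MARKER SNP",
  "MARKER SNP"              # deliberate duplicate for the demonstration
), tmp)

headers <- read.table(tmp, header = FALSE, stringsAsFactors = FALSE,
                      col.names = c("standard", "alternative"))

# Alternative names that occur more than once (assumed meaning of "duplicated").
dup_alt <- headers$alternative[duplicated(headers$alternative)]

if (length(dup_alt) > 0) {
  message("Duplicated alternative names: ", paste(dup_alt, collapse = ", "))
} else {
  message("No duplicated alternative names found.")
}
```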

## Step 5: get the configuration file

It is recommended to set a working directory first.

    setwd('c:/path/to/workingDirectory') # sets the working directory

The configuration file is a text file used to set the parameters and settings for running the algorithm. A template file (config.ini) can be copied to a local folder by running the following command.

The default name of this file is config.ini, which can be changed by the user.

    get.config(getwd()) # copies the file to the working directory

## Step 6: modify the parameters in the configuration file

Please refer to the configuration file or the package manual for full details of the parameters.
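After editing, a quick sanity check is to read back the two path-related keys mentioned in this vignette and confirm that the header-translation file is where the configuration says it should be. The sketch below parses a small hypothetical fragment with base R. The section name `example_section` and both values are placeholders; only the key names `dir_references` and `header_translations` come from this document, and the real config.ini contains many more settings.

```r
# Hypothetical fragment written to a temporary file; replace 'cfg' with the
# path to your own config.ini.
cfg <- tempfile(fileext = ".ini")
writeLines(c(
  "[example_section]",
  "dir_references = c:/path/to/referenceFolder",
  "header_translations = alt_headers.txt"
), cfg)

lines <- trimws(readLines(cfg))

# Keep 'key = value' lines; skip comment lines and section headers.
kv <- lines[grepl("=", lines, fixed = TRUE) & !grepl("^[;#]", lines)]
values <- trimws(sub("^[^=]*=", "", kv))
names(values) <- trimws(sub("=.*$", "", kv))

header_file <- file.path(values[["dir_references"]],
                         values[["header_translations"]])
message("Header-translation file expected at: ", header_file)
message("File exists: ", file.exists(header_file))
```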

## Step 7: run the QC function

The QC starts with the following command. Please refer to the package manual for full details of the parameters.

    inspect('config.ini')

The results will be saved in the output folder specified in config.ini. A log file is also created, and can be used to locate any problems occurring during the QC run.
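A small wrapper can catch the most common setup mistakes before the QC starts. This is a sketch, not part of the package: `run_qc_if_ready` is a hypothetical helper that only confirms the configuration file exists and the package is installed, then calls `inspect()` as shown in this step.

```r
# Hypothetical helper: run the QC only when the basic prerequisites are met.
run_qc_if_ready <- function(config_path = "config.ini") {
  if (!file.exists(config_path)) {
    message("Configuration file not found: ", config_path)
    return(invisible(FALSE))
  }
  if (!requireNamespace("GWASinspector", quietly = TRUE)) {
    message("GWASinspector is not installed.")
    return(invisible(FALSE))
  }
  GWASinspector::inspect(config_path)
  invisible(TRUE)
}

# Looks for config.ini in the current working directory.
run_qc_if_ready("config.ini")
```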

# Test run

You can run the algorithm on a sample GWAS file that is embedded in the package. Reports are generated and saved in the specified folder.

    library(GWASinspector)
    inspect.example('/sample_dir')
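If you would rather not write to a fixed location, one option is to send the test run to a scratch folder under R's session temporary directory. This is a sketch under that assumption; the folder name `GWASinspector_example` is arbitrary, and creating the folder beforehand is a precaution rather than a documented requirement.

```r
# Scratch output folder inside the session's temporary directory.
out_dir <- file.path(tempdir(), "GWASinspector_example")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# Run the embedded example only when the package is available.
if (requireNamespace("GWASinspector", quietly = TRUE)) {
  GWASinspector::inspect.example(out_dir)
  print(list.files(out_dir))   # show the generated files
}
```

Files under `tempdir()` are removed when the R session ends, so copy any reports you want to keep.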