# 1 Overview

powdR is an R implementation of the full pattern summation approach to quantitative mineralogy from X-ray powder diffraction (XRPD) data. Whilst this approach is already available in Excel spreadsheets such as FULLPAT (Chipera and Bish 2002) and RockJock (Eberl 2003), implementation in R allows for faster computation than is currently available, and provides a user-friendly Shiny application to help with the often iterative process of mineral quantification. Furthermore, the afps() function in powdR is designed to automate the full pattern summation procedure, which is particularly advantageous for high-throughput XRPD datasets.

# 2 Full pattern summation

A powerful property of XRPD data is that it can provide quantitative estimates of phase concentrations in mineral mixtures. Though several methods are available for such quantitative analysis, full pattern summation (also referred to as “full pattern fitting of prior measured standards”) is particularly suitable for mixtures containing crystalline mineral phases in combination with disordered and/or X-ray amorphous phases. Soil is a prime example of such mixtures, where crystalline minerals such as quartz and feldspars can be present in combination with clay minerals (i.e. disordered phases), and organic matter (i.e. amorphous phases).

The full pattern summation implemented in powdR is based on the principle that an observed pattern is the sum of the signals from the individual components within it. A key component of this approach is a library containing measured or calculated patterns of the pure phases that may be encountered within the samples. These “reference” patterns would ideally be measured on the same instrument as the sample. To quantify a given sample, suitable phases from the library are selected that together account for all peaks within the data, and their relative contributions to the observed signal are optimised until an appropriate fit is achieved (Chipera and Bish 2002). This fit is usually refined using least squares optimisation of an objective parameter. The scaled intensities of the optimised patterns are then converted to weight % using reference intensity ratios, which are a measure of diffracting power relative to a standard mineral (usually corundum).
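The principle described above can be illustrated with a minimal base R sketch. It is not powdR's implementation: the two reference patterns, their RIRs and the mixture are all made up, ordinary least squares (via qr.solve()) stands in for the optimisation routine, and non-negativity is not enforced. It assumes that reference patterns are scaled to the same maximum intensity, so that weight fractions are proportional to the fitted scaling coefficient divided by the RIR.

```r
# 2theta axis and a helper for a Gaussian peak (illustrative only)
tth <- seq(5, 65, by = 0.02)
gauss <- function(centre, width = 0.1) exp(-0.5 * ((tth - centre) / width)^2)

# Two hypothetical reference patterns, each scaled to roughly 10000 counts
refs <- cbind(PHASE_A = 10000 * gauss(26.6),
              PHASE_B = 10000 * gauss(35.1))

# Hypothetical reference intensity ratios and true weight fractions
rir    <- c(PHASE_A = 4.0, PHASE_B = 1.0)
w_true <- c(PHASE_A = 0.7, PHASE_B = 0.3)

# Forward model: the observed pattern is the sum of the scaled references
obs <- as.vector(refs %*% (w_true * rir))

# Least squares estimate of the scaling coefficient of each reference
x <- qr.solve(refs, obs)
names(x) <- colnames(refs)

# Convert scaling coefficients to weight %, assuming phases sum to 100 %
wt <- (x / rir) / sum(x / rir) * 100
round(wt, 2)
```

Because the synthetic mixture is noise-free, the fitted weights reproduce the 70:30 split used to build it. Real data would also require alignment, background handling and a constrained optimiser.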

# Using powdR

## 2.1 The powdRlib object

A key component of the functions within powdR is the library of reference patterns. These are stored within a powdRlib object created using the powdRlib() constructor function. powdRlib() builds a powdRlib object from two components. The first component, specified via the xrd_table argument of powdRlib(), is a data frame of the count intensities of the reference patterns, with their 2$\theta$ axis as the first column. The column for a given reference pattern must be named using a unique identifier (a phase ID). An example of such a format is provided in the minerals_xrd data:

library(powdR)

data(minerals_xrd)

#Show the first six rows of the first nine columns
head(minerals_xrd[1:9])
#>       tth ALB DOL.1 DOL.2 FEL GOE.1 GOE.2  ILL KAO
#> 1 4.00973 308   268   362 546  3549 10000 3078 525
#> 2 4.04865 294   256   345 524  3511  9592 2960 500
#> 3 4.08757 286   250   343 505  3401  9323 2888 486
#> 4 4.12649 277   247   327 512  3290  9042 2753 474
#> 5 4.16541 275   241   318 478  3194  9248 2718 478
#> 6 4.20433 261   228   314 459  3113  8557 2720 447

The second component required to build a powdRlib object (specified via the phases_table argument of powdRlib()) is a data frame containing three columns. The first is a character vector of unique IDs representing each reference pattern in the data provided to the xrd_table argument. The second column is the name of the phase group that this reference pattern belongs to (e.g. quartz or glass). The third column is the reference intensity ratio (RIR) of that reference pattern (relative to a known standard, usually corundum). An example of the format required for the phases_table argument of powdRlib() is provided in the minerals_phases data:

data(minerals_phases)

minerals_phases[1:8,]
#>   phase_id  phase_name  rir
#> 1      ALB Plagioclase 1.31
#> 2    DOL.1    Dolomite 2.35
#> 3    DOL.2    Dolomite 2.39
#> 4      FEL  K-feldspar 0.75
#> 5    GOE.1    Goethite 0.93
#> 6    GOE.2    Goethite 0.37
#> 7      ILL      Illite 0.22
#> 8      KAO   Kaolinite 0.91
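To make the two-table format concrete, the following sketch builds both tables by hand from made-up data and checks that the phase IDs correspond to the pattern columns. The phase IDs (QUA.x, COR.x), peak positions and RIR values are hypothetical. The powdRlib() call is only attempted when powdR is installed.

```r
# Hypothetical 2theta axis and a helper producing a single-peak pattern
tth <- seq(5, 65, by = 0.05)
peak <- function(centre) round(1000 * exp(-0.5 * ((tth - centre) / 0.1)^2))

# xrd_table: 2theta in the first column, one column of counts per reference
xrd_table <- data.frame(tth = tth,
                        QUA.x = peak(26.65),
                        COR.x = peak(35.15))

# phases_table: phase ID, phase name and RIR (values made up for illustration)
phases_table <- data.frame(phase_id = c("QUA.x", "COR.x"),
                           phase_name = c("Quartz", "Corundum"),
                           rir = c(4.0, 1.0),
                           stringsAsFactors = FALSE)

# The IDs must match the pattern column names (all columns except the first)
ids_match <- identical(names(xrd_table)[-1], phases_table$phase_id)

# Build the library only if the check passes and powdR is available
if (ids_match && requireNamespace("powdR", quietly = TRUE)) {
  toy_lib <- powdR::powdRlib(xrd_table, phases_table)
}
```

Setting stringsAsFactors = FALSE keeps the IDs as character strings on older versions of R, so the identical() comparison behaves the same everywhere.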

Crucially, when building the powdRlib object, all phase IDs in the first column of the phases_table must match the column names of the xrd_table (except the name of the first column, which is the 2$\theta$ scale), for example:

identical(names(minerals_xrd[-1]),
          minerals_phases$phase_id)
#> [1] TRUE

Once created, powdRlib objects can easily be visualised using plot(), which when used for powdRlib objects accepts the arguments wavelength and refs that are used to specify the X-ray wavelength and the reference patterns to plot, respectively. In all cases where plot() is used hereafter, the addition of interactive = TRUE to the function call will produce an interactive html graph.

my_lib <- powdRlib(minerals_xrd, minerals_phases)

plot(my_lib, wavelength = "Cu",
     refs = c("ALB", "DOL.1",
              "QUA.1", "GOE.2"))

## 2.2 RockJock

There are two powdRlib objects provided as part of the powdR package. The first is minerals (accessed via data(minerals)), which is a simple and low-resolution library designed to facilitate fast computation of basic examples. The second is rockjock (accessed via data(rockjock)), which is a comprehensive library of 168 reference patterns covering most phases that might be encountered in geological and soil samples. The rockjock library in powdR uses data from the original RockJock program (Eberl 2003) thanks to the permission of Dennis Eberl. In rockjock, each reference pattern from the original RockJock program has been scaled to a maximum intensity of 10000 counts, and the RIRs normalised relative to corundum. All rockjock data were analysed using Cu K$\alpha$ radiation.

## RockJock synthetic mixtures

To accompany the rockjock reference library, a list of eight synthetic mixtures from the original RockJock program (Eberl 2003) is also included in powdR in the rockjock_mixtures data (accessed via data(rockjock_mixtures)). Their known weights (see ?rockjock_mixtures) can be compared to full pattern summation outputs (i.e. from fps() and afps(), detailed below) to assess accuracy.

## Subsetting a powdRlib object

Occasionally it may be useful to subset a reference library to a smaller selection. This can be achieved using subset(), which for powdRlib objects accepts three arguments: x, refs and mode. The x argument specifies the powdRlib object to be subset, refs specifies the IDs of phases to select, and mode specifies whether these phases are kept (mode = "keep") or removed (mode = "remove").

data(rockjock)

#Have a look at the phase ID's in rockjock
rockjock$phases$phase_id[1:10]
#>  [1] "ACTINOLITE_TREMOLITE" "ALBITE_CLEAVELANDITE" "ALMANDINE_GARNET"
#>  [4] "ALUNITE"              "AMPHIBOLE"            "ANALCIME"
#>  [7] "ANATASE"              "ANDALUCITE"           "ANDESINE"
#> [10] "ANGLESITE"

#Remove three phases from rockjock
rockjock_1 <- subset(rockjock,
                     refs = c("ALUNITE",
                              "AMPHIBOLE",
                              "ANALCIME"),
                     mode = "remove")

#Check number of phases remaining in library
nrow(rockjock_1$phases)
#> [1] 166

#Keep three phases of rockjock
rockjock_2 <- subset(rockjock,
                     refs = c("ALUNITE",
                              "AMPHIBOLE",
                              "ANALCIME"),
                     mode = "keep")

#Check number of phases remaining
nrow(rockjock_2$phases)
#> [1] 3

## 2.3 Full pattern summation

Full pattern summation in powdR is provided via fps(), with an automated version provided via afps(). Here the rockjock and rockjock_mixtures data will be used to demonstrate the use of these functions.

### 2.3.1 Full pattern summation with internal standard

In some cases samples are prepared for XRPD with an internal standard of known concentration. If this is the case, then the std_conc argument of fps() and afps() can be used to define the concentration of the internal standard (in weight %), which is used in combination with the reference intensity ratios to compute phase concentrations. For example, all samples in the rockjock_mixtures data were prepared with 20 % corundum as the internal standard, thus this can be specified using std = "CORUNDUM" and std_conc = 20 in the call to fps() or afps().

data("rockjock_mixtures")

fit_1 <- fps(lib = rockjock,
             smpl = rockjock_mixtures$Mix1,
             refs = c("ORDERED_MICROCLINE",
                      "LABRADORITE",
                      "KAOLINITE_DRY_BRANCH",
                      "MONTMORILLONITE_WYO",
                      "ILLITE_1M_RM30",
                      "CORUNDUM"),
             std = "CORUNDUM",
             std_conc = 20,
             align = 0.3)
#>
#> -Aligning sample to the internal standard
#> -Interpolating library to same 2theta scale as aligned sample
#> -Optimising...
#> -Computing phase concentrations
#> -Using internal standard concentration of 20 % to compute phase concentrations
#> ***Full pattern summation complete***

Notice that when the std_conc is defined, the computed phase concentrations exclude the contribution of the internal standard…

fit_1$phases
#>               phase_id    phase_name   rir phase_percent
#> 1       ILLITE_1M_RM30        Illite 0.277        7.4642
#> 2 KAOLINITE_DRY_BRANCH     Kaolinite 0.581       13.5087
#> 3          LABRADORITE   Plagioclase 0.811       22.4294
#> 4  MONTMORILLONITE_WYO Smectite (Di) 0.320       46.9472
#> 5   ORDERED_MICROCLINE    K-feldspar 0.965        4.2268

…and that the phase concentrations do not sum to 100 %.

sum(fit_1$phases$phase_percent)
#> [1] 94.5763

Unlike other software where only certain phases can be used as an internal standard, any phase can be defined as the internal standard in powdR. For example, the rockjock_mixtures$Mix5 sample contains 20 % quartz (see ?rockjock_mixtures), thus adding "QUARTZ" as the std argument results in this reference pattern becoming the internal standard.

fit_2 <- fps(lib = rockjock,
             smpl = rockjock_mixtures$Mix5,
             refs = c("ORDERED_MICROCLINE",
                      "LABRADORITE",
                      "KAOLINITE_DRY_BRANCH",
                      "MONTMORILLONITE_WYO",
                      "CORUNDUM",
                      "QUARTZ"),
             std = "QUARTZ",
             std_conc = 20,
             align = 0.3)
#>
#> -Aligning sample to the internal standard
#> -Interpolating library to same 2theta scale as aligned sample
#> -Optimising...
#> -Computing phase concentrations
#> -Using internal standard concentration of 20 % to compute phase concentrations
#> ***Full pattern summation complete***

fit_2$phases
#>               phase_id    phase_name   rir phase_percent
#> 1             CORUNDUM      Corundum 0.908       25.1526
#> 2 KAOLINITE_DRY_BRANCH     Kaolinite 0.581        5.1658
#> 3          LABRADORITE   Plagioclase 0.811        9.4053
#> 4  MONTMORILLONITE_WYO Smectite (Di) 0.320       13.1061
#> 5   ORDERED_MICROCLINE    K-feldspar 0.965       39.5580

sum(fit_2$phases$phase_percent)
#> [1] 92.3878
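The arithmetic behind an internal standard calculation can be sketched in base R using the standard RIR relationship, in which the concentration of each phase is obtained relative to the known concentration of the standard. The scaling coefficients and RIRs below are made-up values, and the final rescaling to a standard-free basis is an assumption about how concentrations are expressed once the standard is excluded. This is not powdR's internal code.

```r
# Hypothetical fitted scaling coefficients and reference intensity ratios
x   <- c(CORUNDUM = 0.25, QUARTZ = 1.20, KAOLINITE = 0.10)
rir <- c(CORUNDUM = 1.0,  QUARTZ = 4.0,  KAOLINITE = 0.5)

std      <- "CORUNDUM"  # phase acting as the internal standard
std_conc <- 20          # known concentration of the standard (weight %)

# Weight % of every phase relative to the known amount of the standard
wt <- std_conc * (x / rir) / (x[[std]] / rir[[std]])

# Drop the standard and express the remainder on a standard-free basis
# (assumed convention, see the lead-in)
wt_no_std <- wt[names(wt) != std] / (100 - std_conc) * 100

wt
wt_no_std
sum(wt_no_std)
```

Here the quantified phases account for only half of the standard-free sample. Nothing forces the total to 100 % when an internal standard is used, so any shortfall can indicate unidentified or amorphous material.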

### 2.3.2 Full pattern summation without internal standard

In cases where an internal standard is not added to a sample, phase quantification can be achieved by assuming that all detectable phases can be identified and that they sum to 100 weight %. By setting the std_conc argument of fps() or afps() to NA, or leaving it out of the function call, it will be assumed that the sample has been prepared without an internal standard and the phase concentrations computed accordingly.

fit_3 <- fps(lib = rockjock,
             smpl = rockjock_mixtures$Mix1,
             refs = c("ORDERED_MICROCLINE",
                      "LABRADORITE",
                      "KAOLINITE_DRY_BRANCH",
                      "MONTMORILLONITE_WYO",
                      "ILLITE_1M_RM30",
                      "CORUNDUM"),
             std_conc = NA,
             std = "CORUNDUM",
             align = 0.3)
#>
#> -Aligning sample to the internal standard
#> -Interpolating library to same 2theta scale as aligned sample
#> -Optimising...
#> -Computing phase concentrations
#> -Internal standard concentration unknown. Assuming phases sum to 100 %
#> ***Full pattern summation complete***

In this case the phase specified in the std argument is only used for sample alignment, and is included in the computed phase concentrations.

fit_3$phases
#>               phase_id    phase_name   rir phase_percent
#> 1             CORUNDUM      Corundum 0.908       20.2401
#> 2       ILLITE_1M_RM30        Illite 0.277        6.2949
#> 3 KAOLINITE_DRY_BRANCH     Kaolinite 0.581       11.3924
#> 4          LABRADORITE   Plagioclase 0.811       18.9156
#> 5  MONTMORILLONITE_WYO Smectite (Di) 0.320       39.5923
#> 6   ORDERED_MICROCLINE    K-feldspar 0.965        3.5646

Furthermore, the phase concentrations sum to 100 % (within rounding error).

sum(fit_3$phases$phase_percent)
#> [1] 99.9999

### 2.3.3 Full pattern summation with data harmonisation

It is usually recommended that the reference library used for full pattern summation is measured on the same instrument as the sample using an identical 2$\theta$ range and resolution. In some cases this is not feasible, and the reference library patterns may be from a different instrument to the sample. To allow for seamless use of samples and libraries from different instruments (measured on the same wavelength), fps() and afps() contain a logical harmonise argument (default = TRUE). When the sample and library contain non-identical 2$\theta$ axes, harmonise = TRUE will convert the data onto the same scale by determining the overlapping 2$\theta$ range and interpolating to the coarsest resolution available.

#Create a sample with a shorter 2theta axis than the library
Mix1_short <- subset(rockjock_mixtures$Mix1, tth > 10 & tth < 55)

#Reduce the resolution by selecting only odd rows of the data
Mix1_short <- Mix1_short[seq(1, nrow(Mix1_short), 2),]

fit_4 <- fps(lib = rockjock,
             smpl = Mix1_short,
             refs = c("ORDERED_MICROCLINE",
                      "LABRADORITE",
                      "KAOLINITE_DRY_BRANCH",
                      "MONTMORILLONITE_WYO",
                      "ILLITE_1M_RM30",
                      "CORUNDUM"),
             std = "CORUNDUM",
             align = 0.3)
#>
#> -Harmonising library to the same 2theta resolution as the sample
#> -Aligning sample to the internal standard
#> -Interpolating library to same 2theta scale as aligned sample
#> -Optimising...
#> -Computing phase concentrations
#> -Internal standard concentration unknown. Assuming phases sum to 100 %
#> ***Full pattern summation complete***

fit_4$phases
#>               phase_id    phase_name   rir phase_percent
#> 1             CORUNDUM      Corundum 0.908       20.1242
#> 2       ILLITE_1M_RM30        Illite 0.277        8.4206
#> 3 KAOLINITE_DRY_BRANCH     Kaolinite 0.581       12.7544
#> 4          LABRADORITE   Plagioclase 0.811       19.8052
#> 5  MONTMORILLONITE_WYO Smectite (Di) 0.320       35.0070
#> 6   ORDERED_MICROCLINE    K-feldspar 0.965        3.8886
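The harmonisation step described in this section can be sketched in base R using linear interpolation with approx(). The patterns below are synthetic, and powdR may use a different interpolation method internally. The sketch only illustrates the idea of finding the overlapping 2θ range and moving both patterns onto the coarsest step size.

```r
# Library pattern: wide range, fine step. Sample: narrow range, coarse step.
lib_tth  <- seq(5, 65, by = 0.02)
smpl_tth <- seq(10, 55, by = 0.04)
lib_counts  <- 100 + 1000 * exp(-0.5 * ((lib_tth - 26.6) / 0.1)^2)
smpl_counts <- 120 + 800 * exp(-0.5 * ((smpl_tth - 26.6) / 0.1)^2)

# Overlapping 2theta range of the two patterns
lo <- max(min(lib_tth), min(smpl_tth))
hi <- min(max(lib_tth), max(smpl_tth))

# Coarsest of the two step sizes
step <- max(mean(diff(lib_tth)), mean(diff(smpl_tth)))

# Common axis, and both patterns interpolated onto it
new_tth <- seq(lo, hi, by = step)
lib_h  <- approx(lib_tth, lib_counts, xout = new_tth)$y
smpl_h <- approx(smpl_tth, smpl_counts, xout = new_tth)$y

range(new_tth)
length(new_tth)
```

Interpolating to the coarsest resolution avoids inventing detail that the lower-resolution pattern never contained. The cost is that some detail in the finer pattern is discarded.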

## 2.4 Automated full pattern summation

The selection of suitable reference patterns for full pattern summation can often be challenging and time-consuming. An attempt to automate this process is provided in afps(), which can select appropriate reference patterns from a reference library and subsequently exclude reference patterns based on limit of detection estimates. Such an approach is considered particularly advantageous when quantifying high-throughput XRPD datasets that display considerable mineralogical variation.

Here the rockjock library, containing 168 reference patterns, will be used to quantify one of the samples in the rockjock_mixtures data.

fit_5 <- afps(lib = rockjock,
              smpl = rockjock_mixtures$Mix1,
              std = "CORUNDUM",
              align = 0.3,
              lod = 1)
#>
#> -Aligning sample to the internal standard
#> -Interpolating library to same 2theta scale as aligned sample
#> -Applying non-negative least squares
#> -Optimising...
#> -Removing negative coefficients and reoptimising...
#> -Calculating detection limits
#> -Removing phases below detection limit
#> -Reoptimising after removing crystalline phases below the limit of detection
#> -Removing negative coefficients and reoptimising...
#> -Computing phase concentrations
#> -Internal standard concentration unknown. Assuming phases sum to 100 %
#> ***Automated full pattern summation complete***

fit_5$phases_grouped
#>      phase_name phase_percent
#> 1    Background        0.0001
#> 2      Corundum       19.8285
#> 3        Illite        7.6667
#> 4    K-feldspar        2.7622
#> 5     Kaolinite       10.7962
#> 6   Plagioclase       19.1234
#> 7 Smectite (Di)       39.8229
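The general shape of the automated procedure (fit the whole library, discard negative coefficients, discard phases below a detection limit, then refit) can be sketched in base R. This is a simplified illustration under stated assumptions, not the afps() algorithm: the library, RIRs and sample are synthetic, ordinary least squares replaces powdR's solvers, and a single fixed weight % threshold stands in for the limit of detection estimates.

```r
# Hypothetical library: four reference patterns, each a single Gaussian peak
tth <- seq(5, 65, by = 0.02)
peak <- function(centre) 10000 * exp(-0.5 * ((tth - centre) / 0.15)^2)
lib <- cbind(A = peak(20), B = peak(30), C = peak(40), D = peak(50))
rir <- c(A = 2.0, B = 1.0, C = 0.5, D = 1.5)  # made-up RIRs

# Synthetic sample: 69.5 % A, 30 % B, 0.5 % C and no D
w_true <- c(A = 0.695, B = 0.300, C = 0.005, D = 0)
obs <- as.vector(lib %*% (w_true * rir))

# Helper: least squares fit of the selected reference patterns to the sample
fit_refs <- function(ids) {
  x <- qr.solve(lib[, ids, drop = FALSE], obs)
  names(x) <- ids
  x
}

# Helper: scaling coefficients to weight %, assuming phases sum to 100 %
to_percent <- function(x) (x / rir[names(x)]) / sum(x / rir[names(x)]) * 100

# Step 1: fit the library, dropping negative coefficients until none remain
keep <- colnames(lib)
repeat {
  x <- fit_refs(keep)
  if (all(x >= 0)) break
  keep <- keep[x >= 0]
}

# Step 2: remove phases below a fixed detection limit (weight %), then refit
lod <- 1
keep <- keep[to_percent(x) >= lod]
x <- fit_refs(keep)
wt <- to_percent(x)
round(wt, 2)
```

In this toy case only A and B are retained: D is absent from the sample and C falls below the 1 % threshold. The remaining concentrations are then renormalised to sum to 100 %.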

## 2.5 Plotting

Plotting results from fps() or afps() (powdRfps and powdRafps objects, respectively) is achieved using plot(). Static ggplot() plots can be created using:

plot(fit_5, wavelength = "Cu")

Alternatively, interactive ggplotly() plots can be created by adding interactive = TRUE to the function call, e.g. plot(fit_5, wavelength = "Cu", interactive = TRUE).

## Quantifying multiple samples

The easiest way to quantify multiple samples is with lapply():

multi_fit <- lapply(rockjock_mixtures[1:3], fps,
                    lib = rockjock,
                    std = "CORUNDUM",
                    refs = c("ORDERED_MICROCLINE",
                             "LABRADORITE",
                             "KAOLINITE_DRY_BRANCH",
                             "MONTMORILLONITE_WYO",
                             "ILLITE_1M_RM30",
                             "CORUNDUM",
                             "QUARTZ"),
                    align = 0.3)
#>
#> -Aligning sample to the internal standard
#> -Interpolating library to same 2theta scale as aligned sample
#> -Optimising...
#> -Computing phase concentrations
#> -Internal standard concentration unknown. Assuming phases sum to 100 %
#> ***Full pattern summation complete***
#>
#> -Aligning sample to the internal standard
#> -Interpolating library to same 2theta scale as aligned sample
#> -Optimising...
#> -Computing phase concentrations
#> -Internal standard concentration unknown. Assuming phases sum to 100 %
#> ***Full pattern summation complete***
#>
#> -Aligning sample to the internal standard
#> -Interpolating library to same 2theta scale as aligned sample
#> -Optimising...
#> -Removing negative coefficients and reoptimising...
#> -Computing phase concentrations
#> -Internal standard concentration unknown. Assuming phases sum to 100 %
#> ***Full pattern summation complete***

names(multi_fit)
#> [1] "Mix1" "Mix2" "Mix3"

## Summarising mineralogy

When multiple samples are quantified, it is often useful to report the phase concentrations of all of the samples in a single table. For a given list of powdRfps and/or powdRafps objects, the summarise_mineralogy() function yields such summary tables, for example:

summarise_mineralogy(multi_fit, type = "grouped", order = TRUE)
#>   sample_id Kaolinite Corundum Plagioclase Smectite (Di)  Illite
#> 1      Mix1   11.3595  20.2729     18.8879       39.6687  6.0913
#> 2      Mix2   19.8944  20.7959     34.2119        2.6911  9.9973
#> 3      Mix3   37.8932  20.8328          NA        3.2884 19.2039
#>   K-feldspar Quartz
#> 1     3.5021 0.2177
#> 2     7.9301 4.4794
#> 3    11.0426 7.7391

where type = "grouped" denotes that phases with the same phase_name will be summed together, and order = TRUE specifies that the columns will be ordered from most common to least common (assessed by the sum of each phase across the samples).
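The kind of table described above can be sketched in base R from per-sample results. This is an illustration of the grouping and ordering logic only, using made-up concentrations. It does not reproduce the internals of summarise_mineralogy().

```r
# Hypothetical per-sample results: one row per reference pattern
results <- list(
  Mix1 = data.frame(phase_name = c("Quartz", "Kaolinite", "Kaolinite"),
                    phase_percent = c(60, 25, 15)),
  Mix2 = data.frame(phase_name = c("Quartz", "Illite"),
                    phase_percent = c(80, 20))
)

# Stack the samples into one long table with a sample identifier
long <- do.call(rbind, Map(function(df, id) data.frame(sample_id = id, df),
                           results, names(results)))

# Sum patterns sharing a phase name within each sample. Phases that are
# absent from a sample appear as NA in the resulting matrix.
wide <- tapply(long$phase_percent,
               list(long$sample_id, long$phase_name),
               sum)

# Order columns from most to least abundant across all samples
wide <- wide[, order(colSums(wide, na.rm = TRUE), decreasing = TRUE),
             drop = FALSE]
wide
```

Using NA rather than zero for absent phases keeps "not included in the fit" distinct from "fitted at zero concentration", which matches the NA shown for Plagioclase in Mix3 above.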

## 2.6 The powdR Shiny app

To run powdR via the Shiny app, use run_powdR(). This loads the application in your default web browser. The application has eight tabs:

1. Reference Library Builder: Allows you to create and export a powdRlib reference library from two .csv files: one for the XRPD measurements, and the other for the ID, name and reference intensity ratio of each pattern.
2. Reference Library Viewer: Facilitates quick inspection of the phases within a powdRlib reference library.
3. Reference Library Editor: Allows the user to easily subset a powdRlib reference library.
4. Full Pattern Summation: A user-friendly interface for iterative full pattern summation of single samples.
5. Automated Full Pattern Summation: A user-friendly interface for automated full pattern summation of single samples.
6. Results Viewer: Allows for quick inspection of results derived from full pattern summation.
7. Results Editor: Allows for results from previously saved powdRfps and powdRafps objects to be edited via addition or removal of reference patterns to the fitting process.
8. Help: Provides a series of video tutorials (via YouTube) detailing the use of the powdR Shiny application.

## References

Chipera, Steve J., and David L. Bish. 2002. “FULLPAT: A full-pattern quantitative analysis program for X-ray powder diffraction using measured and calculated patterns.” Journal of Applied Crystallography 35 (6): 744–49. https://doi.org/10.1107/S0021889802017405.

Eberl, D. D. 2003. “User’s guide to ROCKJOCK - A program for determining quantitative mineralogy from powder X-ray diffraction data.” Boulder, CO: USGS.