Customizing PTXQC reports

Reports rely on a small set of parameters, which do two things:

• set optimal values for the PTXQC metrics and scoring functions (usually tighly connected to the data and MS instrument)
• report length: number of metrics shown in the report or partial evaluation of the txt-folder

The first point can partially be addressed via a mqpar.xml file. However, not all thresholds are known or even interesting to MaxQuant, e.g. the number of expected proteins. For this, PTXQC comes with a YAML config file. This file also addresses the report length.

Automatically importing MaxQuant parameters via mqpar.xml

Some settings in MaxQuant, e.g. the MS/MS search tolerance are important for PTXQC in order to correctly score the mass recalibration or show expected error margins.

So how does PTXQC know about the settings used during the MaxQuant run? Copy the mqpar.xml which MaxQuant generated in the analysis directory into your txt folder, e.g. c:\MQ_Diabetes_study\mqpar.xml -> c:\MQ_Diabetes_study\combined\txt\mqpar.xml. This will enable PTXQC to extract some parameter settings and results in more accurate reports. You will see a warning message during report generation if mqpar.xml is missing. The FAQ vignette has additional information on why this manual copying is senseful.

Alternatively, you can set the MaxQuant parameters manually in the YAML file (see next section) to match your MaxQuant configuration.

More customization using the YAML config file

A configuration file allows you to enable/disable certain plots, and to modify most target thresholds used for the scoring metrics and within plots, e.g. the number of proteins you expect. This will differ by platform and instrument; so adapt it to your needs. The defaults are meant for a Thermo Orbitrap Velos, a 4h LC gradient for a complex matrix (e.g. human cells). Disabling certain report sections will decrease report size and hide metrics which you may not need. The time required to create the report will also reduce, but usually this is not a bottleneck.

PTXQC uses YAML format for its parameter configuration. YAML is short for ‘YAML Aint Markup Language’; yes, it’s an endless recursion. There is a nice tutorial for YAML at [http://rhnh.net/2011/01/31/yaml-tutorial] – or just Google to find more. Also, any PTXQC YAML file will start with a comment section which explains the format.

PTXQC will create a report_*.yaml file within your txt folder upon its first invocation. If you did not run PTXQC before on this folder, you can take any other YAML config file or run PTXQC with default settings (see above) to create such a file. Edit the YAML file as you desire using a text editor of your choice (e.g. Notepad++). Each YAML file has a leading comment section which will describe the basics of how YAML looks like and what you should/not change.

Note: PTXQC will only read the YAML file which matches its own version exactly, i.e. report_v0.82.1.yaml will be ignored if PTXQC has version 0.83.1 to avoid compatibility problems. A new YAML file matching the current PTXQC version will automatically be generated. If you want to use your old settings, rename the file manually.

The YAML shown below is shortened and lacks parts of the introductory comment section. But in general, the YAML file will look like this:

# This is a configuration file for PTXQC reporting.
# One such file is generated automatically every time a report PDF is created.
# You can make a copy of this file, then modify its values and use the copy as an input to another round of report generation,
# e.g., to exclude/include certain plots or change some global settings.
#
# << more comment text in original YAML file ... >>
#
PTXQC:
UseLocalMQPar: yes
ReportFilename:
extended: yes
NameLengthMax_num: 10.0
order:
qcMetric_PAR: 1.0
qcMetric_PG_PCA: 3.0
qcMetric_EVD_Top5Cont: 10.0
qcMetric_PG_Ratio: 19.0
qcMetric_EVD_UserContaminant: 20.0
<< more metrics here... >>
File:
Parameters:
enabled: yes
Summary:
enabled: yes
IDRate:
Thresh_great_num: 35.0
ProteinGroups: ...
Evidence:
enabled: yes
ProteinCountThresh_num: 3500.0
PeptideCountThresh_num: 15000.0
MatchBetweenRuns_wA: auto

To disable certain metrics/plots, just change the order of the metric in question to a negative value, e.g. qcMetric_PG_PCA: -3.0, will disable the PCA plot. This is helpful to disable uninteresting metrics, but also a temporary solution if your PC cannot cope with RAM demands of PTXQC or a metric is broken (you should open a bug report in this case).

Rename Raw files in the report

PTXQC will try to shorten Raw file names to make the plot axis as compact as possible, giving more area to show the data rather than overly long Raw file names. It will use prefix and infix removal of strings which are common to all Raw file names. Name your files in a systematic to make this approach as effective as possible. Should this shortening not reach a predefined threshold of 10 characters in length for the longest filename after shortening, PTXQC will simply use consecutive numbers, i.e. file 01, file 02 to name your files on the axis of the plots.

If you do not want that, you can either:

• Increase this length threshold in the YAML config to force usage of the shortened versions. But be advised that using long filenames might result in ugly axis with overlapping text etc.
• Use a manual mapping table: a file named report_vXXX_filename_sort.txt will be created in your txt-folder, which contains the file name mapping that PTXQC computed. You can edit this file with any text editor and upon running PTXQC again, this new mapping will be applied.

Reordering Raw files in the report

By default, the order of Raw files in the report is identical to the order specified within MaxQuant.

You already read about the report_vXXX_filename_sort.txt file above. It can not only be used to assign custom names, but also to re-arrange the order in which Raw files are shown in the plots and heatmap. Open it with a text editor and simply rearrange the rows in the file. Easy!