Europe PMC is a repository of life science literature. Europe PMC ingests all PubMed content and extends its index with other sources, including Agricola, a bibliographic database of citations to the agricultural literature, and Biological Patents.
Index coverage
For more background on Europe PMC, see:
Levchenko, M., Gou, Y., Graef, F., Hamelers, A., Huang, Z., Ide-Smith, M., … McEntyre, J. (2017). Europe PMC in 2017. Nucleic Acids Research, 46(D1), D1254–D1260. https://doi.org/10.1093/nar/gkx1005
This client supports the Europe PMC search syntax. If you are unfamiliar with searching Europe PMC, check out the Europe PMC query builder, a tool that helps you create your queries. To use your Europe PMC queries in R, simply copy and paste the search string into the search functions of this package.
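Because a query is just a character string, it can also be assembled in R before it is handed to a search function. The following is a minimal sketch, not part of the package: epmc_query() is a hypothetical helper that joins clauses with AND, and the clauses shown (disease, OPEN_ACCESS, FIRST_PDATE) are the ones used elsewhere in this document.

```r
# Hypothetical helper (not part of europepmc): join query clauses with AND.
epmc_query <- function(...) {
  clauses <- c(...)
  # Wrap each clause in parentheses so that a clause containing OR
  # keeps its meaning once it is combined with the others.
  paste0("(", clauses, ")", collapse = " AND ")
}

query <- epmc_query("disease:meningitis", "OPEN_ACCESS:y", "FIRST_PDATE:2016")
print(query)

# The string can then be passed on (requires network access):
# europepmc::epmc_search(query, limit = 10)
```

Building the string separately makes it easy to inspect the query, or to paste it into the Europe PMC website, before sending it.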
The following examples show how to search Europe PMC.
epmc_search() is the main function to query Europe PMC. It searches both metadata and full texts.
library(europepmc)
europepmc::epmc_search('malaria')
#> # A tibble: 100 x 27
#> id source pmid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 29322… MED 29322… 10.10… Anion in… Vullo D, Syrj… J Enzyme In… 1
#> 2 29497… MED 29497… 10.71… Motivati… Winn LK, Less… J Glob Heal… 1
#> 3 29412… MED 29412… 10.10… Advances… Ragavan KV, K… Biosens Bio… <NA>
#> 4 29661… MED 29661… 10.11… Socioeco… Were V, Buff … Malar J 1
#> 5 29671… MED 29671… 10.33… Mapping … Ferrao JL, Ni… Int J Envir… 4
#> 6 29619… MED 29619… 10.71… Quantify… Krezanoski PJ… J Glob Heal… 1
#> 7 29636… MED 29636… 10.11… "Scaling… Faye S, Cico … Malar J 1
#> 8 29649… MED 29649… 10.15… Updated … Ballard SB, S… MMWR Morb M… 14
#> 9 29661… MED 29661… 10.11… Factors … Awuah RB, Asa… Malar J 1
#> 10 29652… MED 29652… 10.13… Cost of … Dalaba MA, We… PLoS One 4
#> # ... with 90 more rows, and 19 more variables: journalVolume <chr>,
#> # pubYear <chr>, journalIssn <chr>, pageInfo <chr>, pubType <chr>,
#> # isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>,
#> # hasBook <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>,
#> # hasLabsLinks <chr>, hasTMAccessionNumbers <chr>,
#> # firstPublicationDate <chr>, pmcid <chr>, hasSuppl <chr>
Please note that Europe PMC expands queries with MeSH synonyms by default, a behaviour that can be turned off with the synonym parameter.
europepmc::epmc_search('malaria', synonym = FALSE)
#> # A tibble: 100 x 27
#> id source pmid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 29322… MED 29322… 10.10… Anion in… Vullo D, Syrj… J Enzyme In… 1
#> 2 29497… MED 29497… 10.71… Motivati… Winn LK, Less… J Glob Heal… 1
#> 3 29412… MED 29412… 10.10… Advances… Ragavan KV, K… Biosens Bio… <NA>
#> 4 29661… MED 29661… 10.11… Socioeco… Were V, Buff … Malar J 1
#> 5 29671… MED 29671… 10.33… Mapping … Ferrao JL, Ni… Int J Envir… 4
#> 6 29661… MED 29661… 10.11… Factors … Awuah RB, Asa… Malar J 1
#> 7 29614… MED 29614… 10.13… Transfus… Iheonu FO, Fa… PLoS One 4
#> 8 29625… MED 29625… 10.11… A 17-yea… Tesfa H, Bayi… Malar J 1
#> 9 29652… MED 29652… 10.13… Cost of … Dalaba MA, We… PLoS One 4
#> 10 29615… MED 29615… 10.11… A review… Muchena G, Du… Malar J 1
#> # ... with 90 more rows, and 19 more variables: journalVolume <chr>,
#> # pubYear <chr>, journalIssn <chr>, pageInfo <chr>, pubType <chr>,
#> # isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>,
#> # hasBook <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>,
#> # hasLabsLinks <chr>, hasTMAccessionNumbers <chr>,
#> # firstPublicationDate <chr>, pmcid <chr>, hasSuppl <chr>
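To see how much the synonym expansion changes a result set, the hit counts of both variants can be compared. This is a sketch under assumptions: compare_synonym_hits() is a hypothetical helper, and it relies on europepmc::epmc_hits() accepting the same synonym parameter as epmc_search(). The hits_fun argument exists only so the helper can be tried without network access.

```r
# Hypothetical helper: hit counts with and without MeSH synonym expansion.
# `hits_fun` defaults to europepmc::epmc_hits(); the default is only
# evaluated when no replacement function is supplied.
compare_synonym_hits <- function(query, hits_fun = europepmc::epmc_hits) {
  c(
    with_synonyms    = hits_fun(query, synonym = TRUE),
    without_synonyms = hits_fun(query, synonym = FALSE)
  )
}

# Usage (requires network access):
# compare_synonym_hits("malaria")
```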
To get an exact match, use quotes as in the following example:
europepmc::epmc_search('"Human malaria parasites"')
#> # A tibble: 100 x 27
#> id source pmid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 29109… MED 29109… 10.11… Validat… Uddin T, McFad… Antimicrob … 1
#> 2 28902… MED 28902… 10.11… A genet… Sayers CP, Mol… Cell Microb… 1
#> 3 27894… MED 27894… 10.10… Plasmod… Maeno Y, Culle… Parasitology 4
#> 4 28900… MED 28900… 10.11… Can Mix… Singh US, Siwa… Biomed Res … <NA>
#> 5 29669… MED 29669… 10.11… The bio… Awono-Ambene P… Parasit Vec… 1
#> 6 29370… MED 29370… 10.13… A novel… Komaki-Yasuda … PLoS One 1
#> 7 27748… MED 27748… 10.10… Non-hum… Martinelli A, … Parasitology 1
#> 8 PMC55… PMC <NA> <NA> Can Mix… Singh US, Siwa… Biomed Res … <NA>
#> 9 28525… MED 28525… 10.10… The use… Othman AS, Mar… Expert Rev … 7
#> 10 28531… MED 28531… 10.13… Experim… Singh N, Barne… PLoS One 5
#> # ... with 90 more rows, and 19 more variables: journalVolume <chr>,
#> # pubYear <chr>, journalIssn <chr>, pubType <chr>, isOpenAccess <chr>,
#> # inEPMC <chr>, inPMC <chr>, hasPDF <chr>, hasBook <chr>,
#> # citedByCount <int>, hasReferences <chr>, hasTextMinedTerms <chr>,
#> # hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstPublicationDate <chr>,
#> # pageInfo <chr>, pmcid <chr>, hasSuppl <chr>
By default, 100 records are returned, but the number of results can be expanded or limited with the limit parameter.
europepmc::epmc_search('"Human malaria parasites"', limit = 10)
#> # A tibble: 10 x 27
#> id source pmid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 29109… MED 29109… 10.11… Validat… Uddin T, McFad… Antimicrob … 1
#> 2 28902… MED 28902… 10.11… A genet… Sayers CP, Mol… Cell Microb… 1
#> 3 27894… MED 27894… 10.10… Plasmod… Maeno Y, Culle… Parasitology 4
#> 4 28900… MED 28900… 10.11… Can Mix… Singh US, Siwa… Biomed Res … <NA>
#> 5 29669… MED 29669… 10.11… The bio… Awono-Ambene P… Parasit Vec… 1
#> 6 29370… MED 29370… 10.13… A novel… Komaki-Yasuda … PLoS One 1
#> 7 27748… MED 27748… 10.10… Non-hum… Martinelli A, … Parasitology 1
#> 8 PMC55… PMC <NA> <NA> Can Mix… Singh US, Siwa… Biomed Res … <NA>
#> 9 28525… MED 28525… 10.10… The use… Othman AS, Mar… Expert Rev … 7
#> 10 28531… MED 28531… 10.13… Experim… Singh N, Barne… PLoS One 5
#> # ... with 19 more variables: journalVolume <chr>, pubYear <chr>,
#> # journalIssn <chr>, pubType <chr>, isOpenAccess <chr>, inEPMC <chr>,
#> # inPMC <chr>, hasPDF <chr>, hasBook <chr>, citedByCount <int>,
#> # hasReferences <chr>, hasTextMinedTerms <chr>,
#> # hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstPublicationDate <chr>,
#> # pageInfo <chr>, pmcid <chr>, hasSuppl <chr>
By default, results are sorted by relevance. Other options via the sort parameter are:

- sort = 'cited': by the number of citations, descending from the most cited publication
- sort = 'date': by publication date, starting with the most recent publication

Sometimes, you would like to send more than one search to Europe PMC at once. A simple solution is using plyr::ldply():
my_dois <- c(
"10.1159/000479962",
"10.1002/sctm.17-0081",
"10.1161/strokeaha.117.018077",
"10.1007/s12017-017-8447-9"
)
plyr::ldply(my_dois, function(x) {
europepmc::epmc_search(paste0("DOI:", x))
})
#> id source pmid doi
#> 1 28957815 MED 28957815 10.1159/000479962
#> 2 28941317 MED 28941317 10.1002/sctm.17-0081
#> 3 29018132 MED 29018132 10.1161/strokeaha.117.018077
#> 4 28623611 MED 28623611 10.1007/s12017-017-8447-9
#> title
#> 1 Clinical Relevance of Patent Foramen Ovale and Atrial Septum Aneurysm in Stroke: Findings of a Single-Center Cross-Sectional Study.
#> 2 Concise Review: Extracellular Vesicles Overcoming Limitations of Cell Therapies in Ischemic Stroke.
#> 3 One-Stop Management of Acute Stroke Patients: Minimizing Door-to-Reperfusion Times.
#> 4 Deferiprone Rescues Behavioral Deficits Induced by Mild Iron Exposure in a Mouse Model of Alpha-Synuclein Aggregation.
#> authorString
#> 1 Schnieder M, Siddiqui T, Karch A, Bähr M, Hasenfuss G, Liman J, Schroeter MR.
#> 2 Doeppner TR, Bähr M, Hermann DM, Giebel B.
#> 3 Psychogios MN, Behme D, Schregel K, Tsogkas I, Maier IL, Leyhe JR, Zapf A, Tran J, Bähr M, Liman J, Knauth M.
#> 4 Carboni E, Tatenhorst L, Tönges L, Barski E, Dambeck V, Bähr M, Lingor P.
#> journalTitle issue journalVolume pubYear journalIssn
#> 1 Eur Neurol 5-6 78 2017 0014-3022; 1421-9913;
#> 2 Stem Cells Transl Med 11 6 2017 2157-6564; 2157-6580;
#> 3 Stroke 11 48 2017 0039-2499; 1524-4628;
#> 4 Neuromolecular Med 2-3 19 2017 1535-1084; 1559-1174;
#> pageInfo
#> 1 264-269
#> 2 2044-2052
#> 3 3152-3155
#> 4 309-321
#> pubType
#> 1 journal article
#> 2 review; journal article;
#> 3 clinical trial; research support, non-u.s. gov't; journal article;
#> 4 research-article; journal article;
#> isOpenAccess inEPMC inPMC hasPDF hasBook citedByCount hasReferences
#> 1 N N N N N 0 Y
#> 2 N N N N N 0 Y
#> 3 N N N N N 1 N
#> 4 Y Y N Y N 1 Y
#> hasTextMinedTerms hasDbCrossReferences hasLabsLinks
#> 1 N N Y
#> 2 N N Y
#> 3 N N Y
#> 4 Y N Y
#> hasTMAccessionNumbers firstPublicationDate pmcid hasSuppl
#> 1 N 2017-09-28 <NA> <NA>
#> 2 N 2017-09-23 <NA> <NA>
#> 3 N 2017-10-10 <NA> <NA>
#> 4 Y 2017-06-16 PMC5570801 Y
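An alternative to one request per DOI is to combine the DOIs into a single query. The sketch below assumes the OR operator of the Europe PMC search syntax; build_doi_query() is a hypothetical helper, not a package function.

```r
# Hypothetical helper: combine several DOIs into one OR query.
build_doi_query <- function(dois) {
  paste0("DOI:", dois, collapse = " OR ")
}

my_dois <- c("10.1159/000479962", "10.1002/sctm.17-0081")
doi_query <- build_doi_query(my_dois)
print(doi_query)

# One request instead of one per DOI (requires network access):
# europepmc::epmc_search(doi_query)
```

A single query saves requests, but the rows are not guaranteed to come back in the order of the input vector. The per-DOI approach makes it easier to spot which DOI returned no record.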
By default, a non-nested data frame printed as a tibble is returned. Other formats are output = "id_list", which returns a list of IDs and sources, and output = "raw", which returns the full metadata as a list. Please be aware that these lists can become very large.
Europe PMC parses article metadata for various concepts and terms.
Semantic types | Description/Examples |
---|---|
accession | A unique identifier given to a DNA or protein sequence record |
chemical | e.g. Granzymes, Peptides, Hydrogen |
disease | e.g. dysthymias, gid, icterohemorrhagic |
efo | Experimental Factor Ontology e.g. generation, health, mortality rate, scale, findings, genome etc. |
gene_protein | e.g. atp, cl-43, ecoriir, gng11, ipt1, mlks |
go_term | A Gene Ontology (GO) term e.g. annealing, neuroblasts |
organism | e.g. pneumocystidomycetes, sarus, terebratulide |
Here’s how to search for publications about meningitis:
europepmc::epmc_search('disease:meningitis')
#> # A tibble: 100 x 27
#> id source pmid pmcid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 29304… MED 2930… PMC5… 10.1… Evalu… Mpoza E, Muk… PLoS One 1
#> 2 29495… MED 2949… PMC5… 10.3… Menin… McCarthy PC,… Vaccines (B… 1
#> 3 29253… MED 2925… PMC5… 10.1… Early… Kambiré D, S… J Infect 3
#> 4 29580… MED 2958… PMC5… 10.1… Forei… Nasher F, Fö… BMC Microbi… 1
#> 5 29509… MED 2950… PMC5… 10.3… Genet… Ousmane S, D… Antibiotics… 1
#> 6 29454… MED 2945… PMC5… 10.1… Cereb… Takahashi K,… J Neuroinfl… 1
#> 7 29547… MED 2954… PMC5… 10.1… Bioin… Andreae CA, … PLoS One 3
#> 8 29594… MED 2959… PMC5… 10.3… Loop-… Seki M, Kilg… Front Pedia… <NA>
#> 9 29364… MED 2936… PMC5… 10.1… The c… Yaesoubi R, … PLoS Med 1
#> 10 29593… MED 2959… PMC5… 10.1… Preva… Lee H, Seo Y… Sci Rep 1
#> # ... with 90 more rows, and 18 more variables: journalVolume <chr>,
#> # pubYear <chr>, journalIssn <chr>, pageInfo <chr>, pubType <chr>,
#> # isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>,
#> # hasBook <chr>, hasSuppl <chr>, citedByCount <int>,
#> # hasReferences <chr>, hasTextMinedTerms <chr>,
#> # hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstPublicationDate <chr>
To see which other terms were text-mined at the article level, use the europepmc::epmc_tm() function.
Another nice feature of Europe PMC is the ability to search for cross-references between Europe PMC and other databases. For instance, to get publications from 2016 that are cited by entries in the Protein Data Bank in Europe:
europepmc::epmc_search('(HAS_PDB:y) AND FIRST_PDATE:2016')
#> # A tibble: 100 x 27
#> id source pmid pmcid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 28089… MED 2808… PMC5… 10.1… Struc… Sluchanko NN… Structure 2
#> 2 28035… MED 2803… PMC5… 10.1… Struc… Waz S, Nakam… J Biol Chem 7
#> 3 28030… MED 2803… PMC5… 10.1… Struc… Christensen … PLoS One 12
#> 4 28028… MED 2802… PMC5… 10.1… Struc… Dow GT, Gilb… Protein Sci 3
#> 5 28024… MED 2802… PMC5… 10.1… Cryst… Kuk AC, Mash… Nat Struct … 2
#> 6 28011… MED 2801… PMC5… 10.1… Struc… Levdikov VM,… J Biol Chem 7
#> 7 28009… MED 2800… PMC5… 10.1… Struc… Zhao H, Wei … Sci Rep <NA>
#> 8 28005… MED 2800… <NA> 10.1… Cycli… Coxon CR, An… J Med Chem 5
#> 9 28004… MED 2800… <NA> 10.1… Disco… Cheeseman MD… J Med Chem 1
#> 10 28065… MED 2806… <NA> 10.1… Kobuv… Klima M, Cha… Structure 2
#> # ... with 90 more rows, and 18 more variables: journalVolume <chr>,
#> # pubYear <chr>, journalIssn <chr>, pageInfo <chr>, pubType <chr>,
#> # isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>,
#> # hasBook <chr>, hasSuppl <chr>, citedByCount <int>,
#> # hasReferences <chr>, hasTextMinedTerms <chr>,
#> # hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstPublicationDate <chr>
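The same query pattern can be repeated for several years to see how the number of cross-referenced records develops. This is a sketch, not package functionality: pdb_query() and pdb_hits_by_year() are hypothetical helpers, and the counting step assumes europepmc::epmc_hits() returns the number of hits for a query.

```r
# Hypothetical helper: the cross-reference query from above for a given year.
pdb_query <- function(year) {
  paste0("(HAS_PDB:y) AND FIRST_PDATE:", year)
}

# Hypothetical helper: one hit count per year.
# `hits_fun` defaults to europepmc::epmc_hits() and can be replaced
# by a stand-in function when no network access is available.
pdb_hits_by_year <- function(years, hits_fun = europepmc::epmc_hits) {
  data.frame(
    year = years,
    hits = vapply(years, function(y) hits_fun(pdb_query(y)), numeric(1))
  )
}

# Usage (requires network access, one request per year):
# pdb_hits_by_year(2014:2016)
```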
The following sources are supported
To retrieve metadata about these external database links, use europepmc::epmc_db().
Europe PMC also lets us obtain citation metadata and reference sections. To retrieve citation metadata per article, use:
europepmc::epmc_citations("9338777", limit = 500)
#> # A tibble: 216 x 11
#> id source citationType title authorString journalAbbrevia… pubYear
#> <chr> <chr> <chr> <chr> <chr> <chr> <int>
#> 1 28437… MED "research su… Thre… Colon-Moran… Virology 2017
#> 2 28054… MED journal arti… Anti… Inoue Y, Yo… Ann Biomed Eng 2017
#> 3 27832… MED "research-ar… Tran… Kim N, Choi… PLoS One 2016
#> 4 27649… MED "research-ar… Comp… Nascimento … PLoS One 2016
#> 5 27527… MED "review-arti… How … Denner J. Viruses 2016
#> 6 27466… MED "research su… Exis… Kuse K, Ito… J Virol 2016
#> 7 26991… MED journal arti… Micr… Plotzki E, … Xenotransplanta… 2016
#> 8 26067… MED "brief-repor… Comp… Tang HB, Ou… Genome Announc 2015
#> 9 26043… MED "research su… Tole… Denner J, P… Virus Res 2015
#> 10 25956… MED "research su… Viru… Plotzki E, … Virus Res 2015
#> # ... with 206 more rows, and 4 more variables: volume <chr>,
#> # pageInfo <chr>, citedByCount <int>, issue <chr>
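Since the result is an ordinary data frame, it can be summarised with base R. The following sketch counts citing publications per year; citations_per_year() is a hypothetical helper that only assumes the pubYear column shown in the output above.

```r
# Hypothetical helper: number of citing publications per publication year.
citations_per_year <- function(citations) {
  counts <- table(citations$pubYear)
  data.frame(
    pubYear = as.integer(names(counts)),
    n = as.vector(counts)
  )
}

# Usage (requires network access):
# cites <- europepmc::epmc_citations("9338777", limit = 500)
# citations_per_year(cites)
```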
To retrieve the reference section of an article:
europepmc::epmc_refs("28632490", limit = 200)
#> # A tibble: 169 x 19
#> id source citationType title authorString journalAbbrevia… issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 12002… MED JOURNAL ART… Triclo… Adolfsson-Er… Chemosphere 9-10
#> 2 18795… MED JOURNAL ART… In vit… Ahn KC, Zhao… Environ. Health… 9
#> 3 18556… MED JOURNAL ART… Effect… Aiello AE, C… Am J Public Hea… 8
#> 4 17683… MED JOURNAL ART… Consum… Aiello AE, L… Clin. Infect. D… <NA>
#> 5 15273… MED JOURNAL ART… Relati… Aiello AE, M… Antimicrob. Age… 8
#> 6 18207… MED JOURNAL ART… The in… Allmyr M, Ha… Sci. Total Envi… 1
#> 7 17007… MED JOURNAL ART… Triclo… Allmyr M, Ad… Sci. Total Envi… 1
#> 8 26948… MED JOURNAL ART… Pressu… Alvarez-Rive… J Chromatogr A <NA>
#> 9 23192… MED JOURNAL ART… Exposu… Anderson SE,… Toxicol. Sci. 1
#> 10 25837… MED JOURNAL ART… Observ… Vladar EK, L… Methods Cell Bi… <NA>
#> # ... with 159 more rows, and 12 more variables: pubYear <int>,
#> # volume <chr>, pageInfo <chr>, citedOrder <int>, match <chr>,
#> # essn <chr>, issn <chr>, publicationTitle <chr>, publisherLoc <chr>,
#> # publisherName <chr>, externalLink <chr>, doi <chr>
Europe PMC gives access not only to metadata, but also to full texts. Adding AND (OPEN_ACCESS:y) to your search query returns only those articles for which Europe PMC also has the full text.
Full text as XML can be accessed via the PubMed Central ID (PMCID):
europepmc::epmc_ftxt("PMC3257301")
#> {xml_document}
#> <article article-type="research-article" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML">
#> [1] <front>\n <journal-meta>\n <journal-id journal-id-type="nlm-ta"> ...
#> [2] <body>\n <sec id="s1">\n <title>Introduction</title>\n <p>Atm ...
#> [3] <back>\n <ack>\n <p>We would like to thank Dr. C. Gourlay and Dr ...
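The returned object is an xml_document, so it can be queried with the xml2 package. As a minimal sketch, the hypothetical helper section_titles() below extracts the titles of the top-level sections. It assumes the body/sec/title layout visible in the output above; other articles may be structured differently.

```r
# Hypothetical helper: titles of the top-level sections in the article body.
section_titles <- function(doc) {
  # Only <sec> elements that are direct children of <body> are matched,
  # so titles of nested subsections are left out.
  titles <- xml2::xml_find_all(doc, ".//body/sec/title")
  xml2::xml_text(titles)
}

# Usage (requires network access):
# doc <- europepmc::epmc_ftxt("PMC3257301")
# section_titles(doc)
```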