Version X addresses the following issues:
*More phenotypes
Phecodes were designed to support replication of genotype/phenotype associations in the GWAS catalog. Thus, we focused on creating codes that capture common/complex diseases found in adults. Many specific ICD codes were aggregated into broad phecodes, particularly in chapters relating to pregnancy, congenital anomalies, and neonatology. PhecodeX adds granularity across the coding structure and includes 3612 phecodes, compared with 1851 phecodes in v1.2. These added phecodes are meant to facilitate new research applications, such as the study of Mendelian disease or pregnancy-related conditions.
*New look
Each phecodeX label is prefixed by a two-letter label indicating the category, followed by an underscore and a three-digit root code. In contrast, v1.2 phecodes were labeled with three-digit root codes, similar to ICD-9s. The character prefixes make phecodeX visually distinct from v1.2 and ICD codes, and prevent programs like R and Excel from corrupting codes by interpreting them as integers (e.g., phecode “008” being transformed to “8”). The numeric component of each phecodeX code is unique, even without the prefix.
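The label format and the integer-corruption problem can be illustrated with a short Python sketch. The regular expression and the helper name parse_phecodex are illustrative assumptions, not part of any phecodeX tooling. The optional decimal suffix for sub-codes is also an assumption, and "XX_008" is a made-up label rather than a real phecodeX code.

```python
import re

# Assumed label shape: two uppercase letters, an underscore, a three-digit
# root, and an optional decimal suffix for sub-codes (the suffix is an assumption).
PHECODEX_LABEL = re.compile(r"^([A-Z]{2})_(\d{3})(?:\.(\d+))?$")


def parse_phecodex(label):
    """Split a phecodeX-style label into (prefix, root, suffix); suffix may be None."""
    match = PHECODEX_LABEL.match(label)
    if match is None:
        raise ValueError(f"not a phecodeX-style label: {label!r}")
    return match.groups()


# A purely numeric v1.2-style code loses its leading zeros once read as a number.
print(int("008"))  # 8

# A prefixed label cannot be read as a number, so the root stays intact as text.
print(parse_phecodex("XX_008"))  # ('XX', '008', None)
```

When loading any of the map files, reading every column as text is still the safer choice, since ICD codes themselves can contain leading or trailing zeros.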
*New categories
PhecodeX introduces a new section for genetic conditions that includes 324 phecodes for specific genetic diagnoses and chromosomal anomalies (e.g., Rett syndrome, Trisomy 18, and DiGeorge syndrome). PhecodeX also includes a new neonatal section. The 'injuries/poisonings' section in v1.2 has been removed from phecodeX.
*Alignment with ICD-10
Phecodes were developed before the release of ICD-10, and their structure largely conforms to the ICD-9 coding system. ICD-10 introduced new, more granular concepts that were not captured in the previous system. PhecodeX includes 574 new phecodes that pertain to ICD-10-only codes. These phecodes are marked as “1” in the icd10_only column, and their phecode strings end with a “*”.
*Multi-mapping
The original phecode structure was based on a 1-to-1 mapping (each ICD mapped to a unique phecode). To incorporate ICD-10s into the map, we needed to do away with the 1-to-1 convention. For version X, we created new phecodes that took advantage of this new flexibility, which is particularly helpful for infectious disease phenotypes. For example, v1.2 maps “Streptococcus pneumoniae” to pneumonia, in the respiratory section; version X maps “Streptococcus pneumoniae” to phecodes for pneumonia as well as streptococcus infections (in the ID section).
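One practical consequence of multi-mapping is that a lookup keyed on ICD code alone can no longer hold a single phecode per key. The sketch below shows the difference. J13 is the ICD-10-CM code for pneumonia due to Streptococcus pneumoniae; the labels "RE_xxx" and "ID_xxx" are placeholders, not real phecodeX codes.

```python
from collections import defaultdict

# Hypothetical (ICD, phecode) pairs for one multi-mapped ICD code.
pairs = [
    ("J13", "RE_xxx"),  # placeholder for a pneumonia phecode
    ("J13", "ID_xxx"),  # placeholder for a streptococcus infection phecode
]

# A plain dict assumes 1-to-1 mapping and silently keeps only the last pair.
one_to_one = dict(pairs)
print(one_to_one["J13"])  # ID_xxx

# Collecting phecodes into a set per ICD code preserves every mapping.
icd_to_phecodes = defaultdict(set)
for icd, phecode in pairs:
    icd_to_phecodes[icd].add(phecode)

print(sorted(icd_to_phecodes["J13"]))  # ['ID_xxx', 'RE_xxx']
```

Code written for v1.2 that relies on one phecode per ICD code may need a similar adjustment before it is used with phecodeX.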
PheWAS R integration:
PhecodeX is compatible with the popular R PheWAS package. Instructions for using phecodeX are available on GitHub.
Caveats:
Many of the phenotypes added to phecodeX are for rare conditions and symptoms, and many of the phecodes for common/complex disease are highly similar between v1.2 and phecodeX. Thus, the two phecode versions may produce highly similar results in analyses relating to common diseases of adulthood. A large number of studies have been published using phecodes v1.2, including publicly available catalogs on this site and on PheWeb, a site hosted by the University of Michigan. For this reason, researchers may choose to continue using phecodes v1.2, which has been shown to be suitable for the study of common/complex diseases. We encourage researchers to experiment with both maps to determine which is right for their project.
Projects that use phecodeX:
The added granularity of phecodeX was instrumental in several published studies, including work relating to perinatal risk factors, hereditary cancer syndromes, PheWAS analysis with PheTK [PheTK GitHub], and a knowledgebase designed to interpret PheWAS results.
Future work:
The utility of phecodes has always been grounded in their ability to reproduce known associations. With this mindset, we are working on studies that compare phecodes v1.2 and phecodeX in terms of their ability to replicate known genetic associations and their applicability to rare disease. Stay tuned.
Maps:
The following files contain a map of ICD-9 to phecodes, ICD-10 to phecodes, and phecode strings.
Downloads:
Phecode information file:
phecodeX_info.csv This file includes phecodes, phecode strings, category labels, and information relating to sex-specific codes.
We provide ICD to phecode map files that support both ICD-9 and ICD-10 Clinical Modification (CM), the extended version of ICD used in the United States (US), as well as the WHO version of ICD-10. If you are using ICD codes from the US, use the 'CM' files; otherwise, use the 'WHO' files.
ICD to phecode map, unrolled:
The unrolled map file maps ICD codes to phecodes. The relationships are 'unrolled' such that a child phecode is mapped to all of its parents (e.g., 'Type 2 diabetes' implies 'Diabetes mellitus'). The 'flag' column indicates whether the ICD is an ICD-9 code or an ICD-10 code. This file is useful for translating ICDs to phecodes, which can be done using a join in MySQL or merge in R (with ICD and vocabulary_id as keys).
phecodeX_unrolled_ICD_CM.csv Compatible with the clinical modification (CM) ICD-9 and -10 codes.
phecodeX_unrolled_ICD_WHO.csv Compatible with the WHO's ICD-10.
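The translation step described above amounts to an inner join on the ICD code and its vocabulary. The following Python sketch does this with the standard library on a toy, in-memory stand-in for the unrolled CM file. The column names (phecode, ICD, vocabulary_id), the vocabulary values (ICD9CM, ICD10CM), and the "EM_xxx" labels are assumptions for illustration; check the header of the downloaded file before adapting it.

```python
import csv
import io
from collections import defaultdict

# Toy stand-in for phecodeX_unrolled_ICD_CM.csv. Column names, vocabulary
# values, and phecode labels are assumptions, not copied from the real file.
MAP_CSV = """phecode,ICD,vocabulary_id
EM_xxx,E11.9,ICD10CM
EM_xxx.2,E11.9,ICD10CM
EM_xxx,250.00,ICD9CM
EM_xxx.2,250.00,ICD9CM
"""


def load_map(handle):
    """Index the map by (ICD, vocabulary_id); one key can hold several phecodes."""
    lookup = defaultdict(list)
    # csv reads every field as text, so codes such as '250.00' keep their zeros.
    for row in csv.DictReader(handle):
        lookup[(row["ICD"], row["vocabulary_id"])].append(row["phecode"])
    return lookup


def to_phecodes(records, lookup):
    """Inner join: return (person_id, phecode) for every mapped ICD record."""
    out = []
    for person_id, icd, vocab in records:
        for phecode in lookup.get((icd, vocab), []):
            out.append((person_id, phecode))
    return out


lookup = load_map(io.StringIO(MAP_CSV))
records = [
    ("p1", "E11.9", "ICD10CM"),
    ("p2", "250.00", "ICD9CM"),
    ("p3", "UNMAPPED", "ICD10CM"),  # no match, so it drops out of the join
]
for row in to_phecodes(records, lookup):
    print(row)
```

Because the map is already unrolled, each diabetes record yields both the child and the parent phecode without any extra hierarchy logic. Keying on the vocabulary as well as the code keeps ICD-9 and ICD-10 records from being matched against the wrong coding system.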
ICD to phecode map, descriptive:
This is a highly descriptive mapping file that includes ICDs and phecodes along with their descriptive labels. It is useful for browsing how phecodes are defined. The file is 'flat', meaning the ICD->phecode relationships are not unrolled.
phecodeX_ICD_CM_map_flatv.csv Compatible with the clinical modification (CM) ICD-9 and -10 codes.
phecodeX_ICD_WHO_map_flat.csv Compatible with the WHO's ICD-10.
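To make the difference between the flat and unrolled files concrete, here is a minimal Python sketch of unrolling. It assumes that the hierarchy is encoded in the digits after the decimal point, so that a parent is obtained by trimming trailing digits; this is an assumption for illustration and may not match how the published unrolled files were produced. The labels are placeholders, not real phecodeX codes.

```python
def ancestors(phecode):
    """Return the code followed by each parent, most specific first.

    Assumes digits after the decimal point encode depth,
    e.g. 'AA_123.45' -> 'AA_123.4' -> 'AA_123'.
    """
    chain = [phecode]
    root, _, suffix = phecode.partition(".")
    # Trim one trailing digit at a time while at least one digit remains.
    for i in range(len(suffix) - 1, 0, -1):
        chain.append(f"{root}.{suffix[:i]}")
    if suffix:
        chain.append(root)
    return chain


def unroll(flat_pairs):
    """Expand flat (ICD, phecode) pairs so each ICD also maps to all parents."""
    seen = set()
    unrolled = []
    for icd, phecode in flat_pairs:
        for code in ancestors(phecode):
            if (icd, code) not in seen:
                seen.add((icd, code))
                unrolled.append((icd, code))
    return unrolled


flat = [("E11.9", "EM_xxx.2")]  # placeholder child phecode
print(unroll(flat))  # [('E11.9', 'EM_xxx.2'), ('E11.9', 'EM_xxx')]
```

In practice the provided unrolled files already contain these expanded rows, so a sketch like this is mainly useful for understanding how the two file types relate.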
For more information about phecodeX files and formatting, see our GitHub repositories for phecodeX.
Send your comments and thoughts about phecodeX to lisa.bastarache@vumc.org.
Expert consultation:
In creating the phecodes, we consulted with 21 clinicians, many of whom use phecodes regularly, for advice on how best to structure the phecodes. We are so grateful for their help! Crack team of clinicians who helped with phecodes:
April Barnado, Julie Bastarache, Elly Brokamp, Meredith Campbell, Jeff Goldstein, Beth Ann Malow, Johnathan Mosley, Travis Osterman,
Dolly Padovani-Claudio, Andrea Ramirez, Dan Roden, Bryce Schuler, Eddie Siew, Bill Stead, Jen Sucre, Isaac Thomsen, Rory Tinker, Sara Van Driest,
Colin Walsh, Jeremy Warner, Quinn Wells, Lee Wheless
Informatics expertise:
Megan Shuey, Ida Aka, Adam Lewis
Learn more:
If you'd like to read more about phecodes in general, this paper provides a detailed overview.