(C) PLOS One
This story was originally published by PLOS One and is unaltered.

CHOIRBM: An R package for exploratory data analysis and interactive visualization of pain patient body map data [1]

Eric Cramer, Division of Pain Medicine, Stanford University School of Medicine, Palo Alto, California, United States of America, Maisa Ziadni, Kristen Hymel Scherrer, Department of Cell Biology, Physiology

Date: 2022-11

Body maps are commonly used to capture the location of a patient’s pain and thus reflect the extent of pain throughout the body. With the increasing electronic capture of body map information, there is an emerging need for clinic- and research-ready tools capable of visualizing this data on individual and mass scales. Here we propose CHOIRBM, an extensible and modular R package and companion web application built on the grammar of graphics system. CHOIRBM provides functions that simplify the process of analyzing and plotting patient body map data integrated from the CHOIR Body Map (CBM) at both individual patient and large-dataset levels. CHOIRBM is built on the popular R graphics package, ggplot2, which facilitates further development and addition of functionality by the open-source development community as future requirements arise. The CHOIRBM package is distributed under the terms of the MIT license and is available on CRAN. The development version of the package with the latest functions may be installed from GitHub. Example analysis using CHOIRBM demonstrates the functionality of the modular R package and highlights both the clinical and research utility of efficiently producing CBM visualizations.

The number of patients with chronic pain conditions has steadily and dramatically increased over time, leading to immense individual and societal burden. To better study and improve treatments for these conditions, it is important to develop methods for characterizing the patients’ pain.
Central to this effort is describing the location and distribution of pain throughout each patient’s body. Body maps are visual methods that efficiently and effectively facilitate capturing the location and extent of a patient’s pain and can be readily integrated with electronic data capture systems. As electronic health records have become the cornerstone of patient care, there is an emerging need for clinic- and research-ready tools to visualize body-map data on individual and mass scales. To address this need, Stanford researchers developed and validated the CHOIR Body Map for capturing the locations and distribution of a given patient’s pain, and we developed the CHOIRBM R package for analyzing the data. The CHOIRBM software provides functions for analyzing or visualizing individual body maps and large-scale data sets for comparisons across groups such as demographics or pain conditions. In addition, we built CHOIRBM with the popular R graphics package ggplot2 to facilitate further development or customization as future needs arise.

Funding: MZ was supported by grants from the National Institute on Drug Abuse (NIDA K23DA047473). KHS was supported by grants from the National Institute on Drug Abuse (T32 DA035165). SM was supported by grants from the National Institutes of Health (K24DA029262, R61NS11865, K24NS126781), and the Redlich Pain Endowment. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Data Availability: All data needed to evaluate the conclusions in the paper are present in the paper and contained within the CHOIRBM R package (https://search.r-project.org/CRAN/refmans/CHOIRBM/html/00Index.html, https://github.com/emcramer/CHOIRBM).
As an alternative to providing a de-identified data set to the public domain, we currently allow access for the purpose of re-analyses or appropriate follow-up analyses by any qualified investigator willing to sign a contract with the host institution limiting use of data without direct PHI/PII identifiers, in accordance with HIPAA regulations, and with a 15-day manuscript review for compliance purposes. For access to the data, interested parties can contact the study PI, Dr. Sean Mackey, at choir-support@stanford.edu. Code used to generate figures and statistics may be found at https://github.com/emcramer/CHOIRBM_paper_code.

Copyright: © 2022 Cramer et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

This manuscript introduces CHOIRBM, an R package that provides a collection of functions for data formatting, processing, and visualizing anatomical pain data using the CBM. Novel aspects of the package include a suite of plotting methods that enable efficient and flexible visualization of complex and large body map data sets through an Application Programming Interface (API), and several functions for statistical comparisons and tests. In addition, it is the first tool to generate a colored body map, to provide tools for comparing body maps across groups, and to offer methods for analyzing the effect of continuous variables (such as NIH PROMIS measures) on body map endorsement. The intended users of this R package are researchers, statisticians, and clinicians interested in analyzing individual-patient or large-scale body map data collected for pain characterization. In this paper, we demonstrate the use of this novel R package using data from the original CBM validation study collected through REDCap [6].
These analyses demonstrate the core functionality of the package and highlight both the clinical and research utility of efficiently producing CBM visualizations.

Currently, over 100,000 CBM assessments have been collected and analyzed [7, 8, 12, 13, 13–30] through CHOIR, across institutions and clinical sites worldwide. In addition to the multi-site CHOIR electronic data capture ecosystem, the CBM has also been integrated into research workflows such as Research Electronic Data Capture (REDCap), a cloud-based, secure software application for clinical research [30]. The extensive, multi-site use of the CBM for research and medical purposes since 2013 has led to the creation of large data sets. However, no tool is readily available to generate, analyze, and visualize body map data and the data integrated with it. This makes finding data-driven insights cumbersome and leads to non-standard methods of analysis. Thus, there is a demonstrable need for an informatics tool to analyze body map data that will aid researchers and clinicians seeking to understand the anatomical location, distribution, and comorbidities of their patients’ pain.

The CHOIR platform uses item-response theory-based measures, including the National Institutes of Health’s (NIH) Patient-Reported Outcomes Measurement Information System (PROMIS), which was designed and validated for precise and efficient measurement of health-related symptoms in patients with a wide variety of chronic conditions [9]. Recently, a formal initial validation demonstrated that the CBM possessed validity, reliability, and utility as an instrument to efficiently collect data on self-reported pain location and distribution and is thus a cost-effective diagnostic and prognostic tool [6].
Furthermore, as the CBM is multifunctional, it may be used to address conditions relating to nociceptive pain (caused by inflammation), neuropathic pain (caused by nerve damage), and nociplastic pain (diffuse pain not associated with inflamed tissue or nerve damage) [6, 10, 11].

There is a critical need to better characterize and manage pain in light of chronic pain’s immense individual and societal burden [1–4]. Central to pain characterization is the location and distribution of pain throughout the body [1, 2]. Several dedicated efforts to develop body maps [1–5] face limitations, including low resolution, condition-specific features, anatomical demarcations not corresponding to clinical pain conditions, or paper and pencil requirements. To address the need for a standardized, digital, general-purpose body map to collect self-reported pain location data efficiently, Stanford researchers developed and validated the CHOIR body map (CBM) [6], as part of CHOIR, an open-source electronic learning healthcare system [7, 8].

Methods

CHOIR body map data capture

The CBM is an electronic, visual representation of the human body that enables participants to indicate the location(s) of their pain (Fig 1). Participants use a computer mouse or touchscreen device to select each body area in which they experience pain. The CBM has two body silhouettes of identical segmentation to reflect the female and male anatomy. Each silhouette has 36 anterior and 38 posterior symmetrical body segments that best align with typical distributions of common chronic pain conditions on the body surface and joints. Each of the 74 anatomical locations for pain endorsement is identified by a three-digit ID code for efficient data capture and analysis. Codes that begin with a 1 correspond to locations on the front of the body while codes that begin with a 2 correspond to locations on the back of the body.
Note that the three-digit identification codes differ between the male and female silhouettes; however, the CHOIRBM R package provides functions to match them (convert_bodymap() and convert_bodymaps()).

Fig 1. The (A) male and (B) female CBM with each body map area labeled with its three-digit identification code. https://doi.org/10.1371/journal.pcbi.1010496.g001

Design and implementation

The CHOIRBM package was designed to be open-source and built on top of the application programming interface (API) of the popular R data visualization package, ggplot2. Therefore, CHOIRBM is implemented in an object-oriented manner, with a series of functions that operate on base R objects such as data.frames and lists to produce ggplot2 objects. This approach makes the CHOIRBM API intuitive to users familiar with the R programming language and facilitates efficient and straightforward plot customization. The standard analysis workflow is to import the dataset as an R data.frame, use CHOIRBM helper functions to reformat the data to match relevant values to specific locations on the CBM (if necessary), use built-in analytic tools to compare and derive clinical insights, and use the plotting functions to generate publication-ready figures.

We implemented CHOIRBM to include basic analytic functions: to compare CBMs across groups (e.g., male versus female, two groups with different pain conditions, or two time points), to investigate the impact of continuous variables on body map endorsement (e.g., age, NRS pain scores, or PROMIS measures), and to create plots to derive insights from the dataset, as demonstrated visually herein. Documentation of all functions organized by capability, along with additional details and example workflows, can be found in the package vignettes online (https://www.github.com/emcramer/CHOIRBM).
Data format and processing

CHOIRBM can process CBM data from two different data sources: the CHOIR database, which uses SQL tables, or REDCap. In each case, data is imported into the R programming language and stored in computer memory as an R data.frame (analogous to an Excel spreadsheet). CHOIRBM does not introduce any package-specific data structures or objects. Thus, the primary data class in the CHOIRBM package is a data.frame with a minimum of three columns: [1] a column indicating the three-digit identification number of a CBM location, [2] a grouping column indicating if the location is on the front or back of the CBM, and [3] a column containing the values to use for coloring and filling the CBM locations in the plot. This data.frame-based approach simplifies the process of visualizing information by directly loading data from any spreadsheet, delimited file, R data file, or SQL query, and ensures flexibility by allowing users to easily switch the values used for plotting, for example the percent endorsement, raw count, or any other measure or score. Therefore, plotting functions in the CHOIRBM package are written to operate on data.frame objects and work with R tidyverse pipes.

Working with data extracted from a CHOIR database

The CHOIR interface for the CBM consists of a clickable CBM image. Each anatomical location that the patient selects is recorded by CHOIR as a series of three-digit codes in a delimited string. CBM data extracted from CHOIR databases is obtained as a series of pain location identifiers in a comma-separated string, with one string for each patient in a dataset. The data is exported from CHOIR with an SQL query and is automatically in R tidy format, with each row in the table representing a patient or participant and each column representing a variable, including each patient’s CBM endorsement (Fig 2A).

Fig 2.
(A) Example of the format for an input data.frame for the CHOIRBM package. (B) Example of an output data.frame ready for plotting. Note, only the first two rows and last two rows are shown. (C) Example data from (A) and (B) plotted in a CBM. https://doi.org/10.1371/journal.pcbi.1010496.g002

The data can be transformed from the raw delimited body map strings using the string_to_map() function. string_to_map() will create a single body map data.frame from a patient’s string indicating binary endorsement of different body map segments. These individual body maps can be aggregated with the aggregate_maps() function, which accepts a list of CBMs and sums the endorsement of each anatomical location across all possible locations to produce a single data.frame with the raw count ready to plot as shown in Fig 2B, and the resulting visualization of CBM data in Fig 2C.

Working with data extracted from a REDCap project

The REDCap interface for the CBM also consists of a clickable CBM image, and each anatomical location that the patient selects on the clickable image-map is recorded by the REDCap system. Importantly, however, the data format is determined by how a researcher programs the CBM instrument into their REDCap project. A patient’s CBM may be recorded in REDCap as either a series of three-digit codes in a delimited string (similar to the method of export for CHOIR databases), or a collection of check boxes, which results in 74 one-hot encoded variables in the exported dataset. While REDCap allows the user to choose which method to use, CHOIRBM will only accept data from REDCap that has been formatted in a delimited string, and researchers must program their CBM instrument to use a text-box field as outlined in Fig 3 (which produces a delimited string).
By following this convention, data files exported from REDCap via manual download or its API will be formatted appropriately (Fig 2A) for immediate use with the CHOIRBM string_to_map() function, thereby reducing the need for data quality control.

Fig 3. Example of the CBM instrument format required in REDCap for streamlined use with the CHOIRBM package. Selecting a single text box to collect a patient’s body map data allows the CHOIRBM string_to_map() function to automatically generate plot-ready R data.frames. https://doi.org/10.1371/journal.pcbi.1010496.g003

The data will be exported in R tidy format, with each row representing a patient and each column containing a variable (with one column for CBM endorsement). The string_to_map() function will create a single body map data.frame from a patient’s string indicating binary endorsement of different body map segments. These individual body maps can be aggregated with the aggregate_maps() function, which accepts a list of CBMs and sums the endorsement of each anatomical location across all possible locations to produce a single data.frame with the raw count ready to plot as shown in Fig 2B, and the resulting visualization of CBM data in Fig 2C.

Analysis

There are multiple ways to analyze CBM data depending on the variables of interest or the research question. The CHOIRBM package includes the following quantitative methods for analyzing body map endorsement information: 1) inter-group comparisons with a categorical variable such as gender, pain condition, or time point, 2) measuring the association of a continuous variable such as pain intensity scores or an NIH PROMIS measure with body map location endorsement, and 3) identifying co-occurrence patterns in body map location endorsement.
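As a rough illustration of what the string-to-map and aggregation steps described earlier compute before any of these analyses run, the sketch below parses delimited endorsement strings and tallies raw counts per location. It is written in plain Python rather than R and does not call CHOIRBM. The helper names (parse_endorsement, side_of, aggregate), the comma separator, and the row keys id, group, and value are assumptions made for this example. Unlike the aggregation described in the text, it lists only locations endorsed at least once rather than all 74.

```python
from collections import Counter


def parse_endorsement(raw, sep=","):
    """Split one patient's delimited string into a set of three-digit location codes."""
    return {code.strip() for code in raw.split(sep) if code.strip()}


def side_of(code):
    """Codes starting with 1 are on the front of the body, codes starting with 2 on the back."""
    return {"1": "front", "2": "back"}[code[0]]


def aggregate(raw_strings, sep=","):
    """Count how many patients endorsed each location (binary endorsement per patient)."""
    counts = Counter()
    for raw in raw_strings:
        counts.update(parse_endorsement(raw, sep))
    return counts


# Hypothetical input: one delimited string per patient; the third patient endorsed nothing.
patients = ["101,102,103,104,201,202", "101, 102, 201, 202", ""]

counts = aggregate(patients)

# One row per endorsed location: code, front/back grouping, and the value to plot.
rows = [
    {"id": code, "group": side_of(code), "value": n}
    for code, n in sorted(counts.items())
]

for row in rows:
    print(row)
```

Each entry in rows mirrors the three-column layout described under Data format and processing: a location code, a front/back grouping, and a value used to color the map.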
Inter-group comparisons with a categorical variable

For comparing body map endorsement between groups using a variable with two categories such as gender or time point, CHOIRBM includes the comp_choirbm_ztest() function. This function takes as input two R data.frames, one for each group. The data.frames are in R tidy format, with each row in the table representing a patient or participant, and each column representing a variable, with one of those columns containing that individual’s CBM endorsement as a delimited string. The function then runs a series of z-tests to test whether there are statistically significant differences in endorsement of each location on the body map between groups [30]. To account for multiple hypothesis testing, comp_choirbm_ztest() automatically adjusts the p-values using the Bonferroni correction procedure, or users have the option to supply their own correction method. Users may also choose between left, right, and two-tailed z-tests to investigate the directionality of each relationship. The function returns a data.frame with one row for each anatomical location on the CBM, and columns for the z-test’s z-score and p-value.

Measuring the impact of a continuous variable on CBM location endorsement

For investigating the effect of a continuous variable such as pain intensity score or an NIH PROMIS measure on CBM segment endorsement, CHOIRBM includes the comp_choirbm_glm() function. comp_choirbm_glm() accepts a data.frame with at least one column for the patients’ CBM endorsement in a delimited string, and another column with the variable of interest. The function returns a data.frame object where each row is the result of a logistic regression examining the relationship between the continuous variable and patient endorsement [30]. Similar to comp_choirbm_ztest(), the p-values are adjusted with the Bonferroni correction by default to account for multiple hypothesis testing, but the correction method may be changed at the user’s discretion.
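To make the inter-group comparison described above concrete, the sketch below runs one z-test per location and then applies a Bonferroni adjustment. It is plain Python and not the comp_choirbm_ztest() implementation. The text does not say which z-test variant the package uses, so a pooled two-proportion, two-tailed test is assumed here, and the group sizes and endorsement counts are invented for illustration.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_proportion_ztest(x1, n1, x2, n2):
    """Pooled two-tailed z-test for a difference between two endorsement proportions."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    if se == 0.0:
        # Nobody, or everybody, endorsed this location in both groups.
        return 0.0, 1.0
    z = (x1 / n1 - x2 / n2) / se
    p = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical data: patients per group and endorsement counts per location
# given as (group A count, group B count).
n_a, n_b = 100, 100
endorsed = {"101": (40, 20), "102": (30, 30), "201": (55, 50)}

codes = sorted(endorsed)
results = [two_proportion_ztest(endorsed[c][0], n_a, endorsed[c][1], n_b) for c in codes]
adjusted = bonferroni([p for _, p in results])

for code, (z, p), p_adj in zip(codes, results, adjusted):
    print(f"{code}: z={z:.3f}, p={p:.4f}, p_adj={p_adj:.4f}")
```

With all 74 CBM locations tested, the Bonferroni multiplier would be 74 rather than 3, which is why a per-location difference has to be fairly large to remain significant after adjustment.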
Investigating co-occurrence of CBM location endorsement

CBM co-occurrence is defined as the number of times two anatomical locations on the CBM are endorsed together by patients in a data set. For example, given two patients where one endorses the locations numbered "101, 102, 103, 104, 201, 202" and the other indicates "101, 102, 201, 202," the location coded "101" co-occurs with "103" and "104" once, but with "102", "201", and "202" twice. Co-occurrence plays a role in chronic overlapping pain conditions (COPCs) and may be used to determine whether pain locations are more commonly endorsed together due to a particular etiology or pathology [31]. CHOIRBM supports co-occurrence analysis with the comp_cooccurrence() function. comp_cooccurrence() accepts a data.frame in R tidy format where one of the columns contains the patients’ CBM endorsements as delimited strings. It then calculates the number of times any two CBM segments are observed together in each body map across the entire data set. The function returns a data.frame object where each row is a combination of locations and a column that contains the number of times each combination of CBM locations occurred together (co-occurrence).

[1] Url: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1010496
Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.
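The co-occurrence definition given in the text can be checked against its own two-patient example. The sketch below is plain Python and not the comp_cooccurrence() implementation; the function name cooccurrence and the comma separator are assumptions. It counts, for every unordered pair of locations, how many body maps contain both.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(raw_strings, sep=","):
    """Count how many body maps contain each unordered pair of location codes."""
    pairs = Counter()
    for raw in raw_strings:
        # De-duplicate and sort so each pair is always stored in the same order.
        codes = sorted({c.strip() for c in raw.split(sep) if c.strip()})
        pairs.update(combinations(codes, 2))
    return pairs


# The two patients from the example in the text.
patients = ["101, 102, 103, 104, 201, 202", "101, 102, 201, 202"]
pairs = cooccurrence(patients)

for (a, b), n in sorted(pairs.items()):
    if a == "101":
        print(a, b, n)
```

Location "101" appears with "103" and "104" in one map only and with "102", "201", and "202" in both, matching the counts stated in the text. Storing pairs in sorted order treats ("101", "102") and ("102", "101") as the same combination.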