Wonderful Wednesday May 2025 (62)

Longitudinal data Wonderful Wednesdays

As the current therapy for macrophage activation syndrome has significant side effects, dose reduction is key. Here are visualisations that demonstrate the dose reduction over time, both at the individual level and in summary.

PSI VIS SIG https://www.psiweb.org/sigs-special-interest-groups/visualisation
05-14-2025

Macrophage Activation Syndrome

Macrophage activation syndrome (MAS) is a severe, potentially life-threatening condition involving a massive inflammatory response that overwhelms the whole body. It mainly affects children and symptoms include: fever, tiredness, low energy, headaches, confusion, seizures, enlarged lymph nodes, liver and spleen problems and bleeding disorders.

Current standard therapy includes high-dose glucocorticoid (GC) treatment, although this has significant side effects including reduced growth rate, cataracts, mood changes and weight gain. Long-term exposure to GC causes significant harm, especially in children.

Data set:

The data provided are based on two pooled open-label studies in children (n=39) with a diagnosis of MAS who were receiving GC treatment at enrolment. Enrolled subjects started a new investigational drug on day 1, and one objective of the study was to reduce (taper) the GC dose to a safe level during the 8-week interventional period.

The data include daily GC dose levels for the 56-day interventional period, and also weekly average GC doses (weeks 1-8).
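To make the relationship between the daily and weekly values concrete, here is a minimal sketch of how a weekly average could be derived from daily doses. It uses made-up values for one hypothetical subject, not the challenge data. The column names `SUBJID`, `ASTDY` and `AVAL1` follow those used in the code further down this post; the mapping of days 1-7 to week 1, days 8-14 to week 2 and so on is an assumption, as is the name `weekly_mean`.

```r
# Hypothetical daily doses for one subject over 14 days (illustrative values only)
daily <- data.frame(
  SUBJID = "S1",
  ASTDY  = 1:14,
  AVAL1  = c(rep(2, 7), rep(1, 7))
)

# Assumption: week k covers study days 7 * (k - 1) + 1 to 7 * k
daily$week <- ceiling(daily$ASTDY / 7)

# Mean daily dose per subject and week
weekly <- aggregate(AVAL1 ~ SUBJID + week, data = daily, FUN = mean)
names(weekly)[names(weekly) == "AVAL1"] <- "weekly_mean"

print(weekly)
```

The provided data set already contains the weekly averages, so this step is only needed if you want to check them or use a different time interval.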

The Challenge:

A description of the challenge can also be found here.
A recording of the session can be found here.

Visualisations

The discussion of the various plots covers the advantages of double coding, intelligent sorting, the choice of time intervals, the selection of suitable colour gradients, and the order of the category levels. The level order is particularly important for streamgraphs, stacked bar charts and Sankey diagrams.
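As a small illustration of the level order point, the sketch below bins doses into the threshold categories and then sets the factor level order explicitly. The dose values are made up, and the choice of right-closed intervals (`<= 0.2` rather than `< 0.2`) is an assumption; the examples in this post differ on that detail.

```r
# Illustrative doses in mg/kg/day (not taken from the challenge data)
dose <- c(0.1, 0.2, 0.4, 0.9, 3, 12)

# Bin into the threshold categories; right = TRUE gives intervals of the form (a, b]
dose_cat <- cut(
  dose,
  breaks = c(-Inf, 0.2, 0.5, 1, Inf),
  labels = c("<= 0.2", "> 0.2 and <= 0.5", "> 0.5 and <= 1.0", "> 1.0"),
  right  = TRUE
)

# Reverse the level order so that the highest dose category comes first,
# which controls the stacking order in stacked charts
dose_cat_high_first <- factor(dose_cat, levels = rev(levels(dose_cat)))

print(table(dose_cat_high_first))
```

In ggplot2, stacked geoms follow the factor level order of the fill variable, so fixing the levels once keeps the stack order and the legend consistent across plots.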

Example 1. Line graph

link to code

Example 2. Patient profile graph

The purpose of this plot is to get an initial “feel” for trends and variability in the raw data. With only 39 subjects, a facet plot showing dose vs. time for each individual was feasible. Facets were arranged from left to right and top to bottom in order of decreasing strength of correlation (values of the Somers’ D rank correlation coefficient are shown in the next plot). Because the GC pulses represented extreme outliers, I used a log scale to visualise trends, particularly in the lower value range. Horizontal lines at 0.2, 0.5 and 1 mg/kg/day mark the three GC thresholds mentioned in the challenge. Additionally, doses >10 mg/kg/day define GC pulses.
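The following is a rough sketch of this kind of faceted profile plot, not the author's code (which is linked from the Code section). It uses simulated stand-in data for four hypothetical subjects; the object names `sim` and `p_profiles`, the decay curve and the line styles are all assumptions made for illustration.

```r
library(ggplot2)

# Simulated stand-in data (NOT the challenge data): 4 hypothetical subjects, 56 days
set.seed(1)
sim <- expand.grid(ASTDY = 1:56, SUBJID = paste0("S", 1:4))
sim$AVAL1 <- 2 * exp(-0.05 * sim$ASTDY) * exp(rnorm(nrow(sim), sd = 0.2))
# One hypothetical GC pulse (> 10 mg/kg/day) for subject S1
sim$AVAL1[sim$SUBJID == "S1" & sim$ASTDY == 10] <- 15

p_profiles <- ggplot(sim, aes(x = ASTDY, y = AVAL1)) +
  # Reference lines at the three GC thresholds
  geom_hline(yintercept = c(0.2, 0.5, 1), colour = "grey50", linetype = "dashed") +
  # Reference line at the GC pulse definition
  geom_hline(yintercept = 10, colour = "red", linetype = "dotted") +
  geom_line() +
  # Log scale so that pulses do not compress the lower dose range
  scale_y_log10() +
  # One panel per subject
  facet_wrap(~ SUBJID) +
  labs(x = "Study day", y = "Daily GC dose (mg/kg/day, log scale)")

print(p_profiles)
```

Note that a dose of exactly 0 cannot be drawn on a log axis, so real data with zero doses would need a decision on how to display them. The facet order follows the factor levels of `SUBJID`, so reordering those levels by a per-subject statistic is one way to sort the panels.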

link to code

Example 3. Correlation dot plot

Somers’ D is a non-parametric measure of correlation related to Goodman-Kruskal’s Gamma and Kendall’s Tau-b. In contrast to the latter, the asymmetric Somers’ D coefficient is appropriate where one of the variables is dependent on the other, as in this case.
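For readers unfamiliar with the coefficient, here is a minimal base R sketch of Somers’ D with the dose as the dependent variable and time as the independent one. It uses the textbook definition (concordant minus discordant pairs, divided by the number of pairs not tied on the independent variable). The function name `somers_d` is hypothetical, and this is not necessarily the implementation used for the plot; it also does not compute confidence intervals.

```r
# Somers' D of y given x: (concordant - discordant) / (pairs not tied on x)
somers_d <- function(x, y) {
  dx <- sign(outer(x, x, "-"))   # pairwise ordering in x
  dy <- sign(outer(y, y, "-"))   # pairwise ordering in y
  upper <- upper.tri(dx)         # count each pair once
  # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie
  concordant_minus_discordant <- sum((dx * dy)[upper])
  pairs_untied_on_x <- sum(dx[upper] != 0)
  concordant_minus_discordant / pairs_untied_on_x
}

# Illustrative values: a strictly decreasing dose over time
somers_d(x = 1:5, y = c(5, 4, 3, 2, 1))

# Ties in the dose (the dependent variable) pull the coefficient towards 0
somers_d(x = 1:4, y = c(4, 4, 2, 1))
```

Because ties in the dependent variable stay in the denominator, a subject whose dose is held constant for long stretches gets a weaker coefficient than one whose dose falls every day.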


The correlation coefficients for GC dose vs. time, including their 95% CIs, were negative for all subjects but one. The median Somers’ D across all patients was -0.817 (95% CI: -0.898 to -0.722). Strength of correlation did not appear to be associated with the total length of time patients received GC pulse therapy, probably because the frequency of the high-dose outliers/pulses decreased over the treatment period.

link to code

Example 4. Lasagna/spaghetti plots

The upper of the two panels is a “spaghetti plot”. This format offers a more compact way to visualise dose changes over time than the faceted line plots above, although at the expense of time-dependent resolution (qualitative changes) and/or clarity (colour instead of height). Subjects on the y-axis are sorted by dose level, consecutively from day 1 to day 56 of the treatment (ignoring baseline values at day 0). This makes the spaghetti plot more similar in appearance and comparable to the stacked area plot in the lower panel, showing changes in the proportion of patients with dose ranges between different threshold levels.
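The consecutive sorting described above can be sketched in base R as a lexicographic ordering of the rows of a subjects-by-days matrix: first by day 1, then by day 2 to break ties, and so on. The matrix below holds made-up values for three hypothetical subjects over two days. Whether the original plot sorts on raw doses or on dose categories is not stated here, so treat this as one possible reading.

```r
# Hypothetical doses: one row per subject, one column per study day
dose_wide <- rbind(
  S1 = c(2, 1),
  S2 = c(1, 5),
  S3 = c(2, 3)
)

# Order rows by day 1, breaking ties by day 2, then day 3, ...
row_order <- do.call(order, as.data.frame(dose_wide))
sorted <- dose_wide[row_order, , drop = FALSE]

print(sorted)
```

With real data the same call works unchanged for 56 day columns, provided the baseline column (day 0) is dropped before sorting.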


As an optional feature, the correlation of dose vs. time in individual subjects can be shown by increasing the transparency of the coloured bars with decreasing negative correlation strength. This could be useful for focussing on subjects where changes from higher to lower thresholds are part of a strong general trend for decreasing dose with treatment time. However, finding a range for the transparency scale that provided satisfactory visual discrimination was not easy with this set of values.

link to code

Example 5. Line graph with CI

Instead of looking at changes in patient numbers, this figure tries to quantify the days that an average patient might (no longer) need GC doses above certain thresholds. This transformation smooths the effect of extreme changes, i.e. the GC pulses, while preserving their effect on the variability of the sample. The x-axis represents the accumulated time since the start of the investigational treatment, and the y-axis the average fraction of days that patients were dosed above individual threshold levels. Confidence intervals illustrate the variability of the patient data, and show that the apparent decline in over-the-threshold dosing might not be significant except for the 10 mg/kg/day level (GC pulses). The visualisation could be useful for assessing whether the benefits of treatment are increasing with time, and when the change will become significant for the average patient.
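One way to read this transformation is as a running fraction: for each subject and each day, the share of days so far with a dose above the threshold, averaged over subjects. The sketch below follows that reading with made-up values for two hypothetical subjects. It is an interpretation of the description above, not the author's code, and it leaves out the confidence intervals.

```r
# Hypothetical doses: rows = subjects, columns = study days 1 to 4
dose <- rbind(
  S1 = c(12, 0.8, 0.4, 0.1),
  S2 = c(0.6, 0.6, 0.6, 0.3)
)
threshold <- 0.5  # mg/kg/day

# TRUE where the dose exceeds the threshold
above <- dose > threshold

# Per subject: fraction of days above the threshold, accumulated up to each day
cum_frac <- t(apply(above, 1, function(a) cumsum(a) / seq_along(a)))

# Average across subjects for each day
mean_frac <- colMeans(cum_frac)

print(round(mean_frac, 3))
```

A single pulse day adds only one day to the running count, which is why extreme doses have a damped effect on the curve.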

link to code

Example 6. Stacked bar chart


link to code

Example 7. Streamgraph

link to code

Example 8. Sankey diagram

link to code

Code

Example 1. Line graph

Oops, some code is missing here. We’re working on it.

Back to blog

Example 2 - 5

The code and documentation are provided by Thomas Weissensteiner on his publication page.

Back to blog example 2 Patient profile graph
Back to blog example 3 Correlation dot plot
Back to blog example 4 Lasagna/spaghetti plots
Back to blog example 5 Line graph with CI

Example 6. Stacked bar chart

##################################################################################################################
## Program:   WW_GCdose_APR2025.R                                                                  
##                                                                             
## Study:     None                                                         
##                                                                             
## Purpose:   Wonderful Wednesday PSI challenge 
##                                                                             
## Inputs:    WW_GCdose.csv
##
## Outputs:   StackedBarchartGCdoseGroups.png
##                                                                                                   
## Revision                                                                                                     
## History:      Version     Date        Author                  Description                                    
##                -------     ---------   -------------------     -------------------------------------------   
##                    1.0     6MAY2025   Baerbel Maus           Initial version 
##################################################################################################################


## cleanup
rm(list=ls())

# projectRoot <- "C:/Users/"

library(ggplot2)
library(tidyverse)

# dat <- read.csv(paste0(projectRoot,"WW_GCdose.csv"))
dat <- read.csv("https://raw.githubusercontent.com/VIS-SIG/Wonderful-Wednesdays/refs/heads/master/data/2025/2025-04-09/WW_GCdose.csv")

datWeeks <- unique(dat[,c("SUBJID","AVISITN","AVISIT","AVAL2")])

plot(NA,NA,xlab="Days", ylab="  Daily glucocorticoid dose (mg/kg/day)",xlim=c(0, 56), ylim=c(0, 40))
for (subj in unique(dat$SUBJID)[1:10]){
  lines(dat[dat$SUBJID == subj,"ASTDY"],dat[dat$SUBJID == subj,"AVAL1"])
}

plot(NA,NA,xlab="Weeks", ylab="Weekly glucocorticoid dose (mg/kg/day)",xlim=c(0,8), ylim=c(0, 40))
# use the weekly data (datWeeks) so that the x values are weeks and match the axis
for (subj in unique(datWeeks$SUBJID)[1:10]){
  lines(datWeeks[datWeeks$SUBJID == subj,"AVISITN"],datWeeks[datWeeks$SUBJID == subj,"AVAL2"])
}
# horizontal reference lines at the GC thresholds (drawn once, outside the loop)
abline(h = c(0.2, 0.5, 1), col = "blue")

# Calculate the percentage of subjects per day with GC values between 0.2, 0.5, 1 or above 1
percDay <- matrix(NA, nrow = 57,ncol = 5)
j = 0
for (i in unique(dat$ASTDY)){
  j = j + 1
  datDay <- dat[dat$ASTDY == i,]
  percDay[j,1] <- i
  percDay[j,2] <- sum(datDay$AVAL1 < 0.2)/nrow(datDay) *100
  percDay[j,3] <- sum(0.2 <= datDay$AVAL1 & datDay$AVAL1 < 0.5)/nrow(datDay)*100
  percDay[j,4] <- sum(0.5 <= datDay$AVAL1 & datDay$AVAL1 < 1)/nrow(datDay)*100
  percDay[j,5] <- sum(datDay$AVAL1 >= 1)/nrow(datDay)*100
  
}

# transform data to long format as needed for stacked barchart
percDay2 <- as.data.frame(percDay)
data_long <- percDay2 %>%
  gather(key = "GClevel", value = "Value",-1)

# Create the stacked bar chart
png(filename ="StackedBarchartGCdoseGroups.png")
ggplot(data_long, aes(x = V1, y = Value, fill = GClevel)) +
  geom_bar(stat = "identity") +
  scale_fill_manual(values = c("V5" =  "#08519C", "V4" = "#4292C6","V3" = "#9ECAE1",V2 = "#DEEBF7"),
                    labels = c("V5" = ">= 1", "V4" = "0.5-1","V3" = "0.2-0.5","V2" = "<0.2")) +
  labs(x = "Days", y = "Percentage of Subjects") +
  labs(fill = "GC doses (mg/kg/day)")+
  scale_x_continuous(
    breaks = seq(0,60,10),
    limits = c(0, 60) # Specify the positions of the ticks
  ) +
  scale_y_continuous(
    breaks = seq(0,100,10),limits = c(0, 101)  # Specify the positions of the ticks
  ) +
  theme_minimal() +
  theme(
    # panel.background = element_blank(), # Remove the gray background
    # panel.grid.major = element_blank(), # Remove major grid lines
    panel.grid.minor = element_blank(), # Remove minor grid lines
    
    axis.ticks.y.right = element_blank(),  # Remove ticks on the right
    axis.ticks.x.top = element_blank(),    # Remove ticks on the top
    axis.line.y.right = element_blank(),   # Remove axis line on the right
    axis.line.x.top = element_blank()      # Remove axis line on the top
  )
dev.off() 

Back to blog

Example 7. Streamgraph

library(tidyverse)
library(ggstream)
library(ggtext)

out <- "."

df <- read_csv("WW_GCdose.csv")  # file name must be quoted

daily <- df %>%
  mutate(
    dl = case_when(
      AVAL2 <= 0.2 ~ "4",
      AVAL2 <= 0.5 ~ "3",
      AVAL2 <= 1.0 ~ "2",
      AVAL2 > 1.0 ~ "1"
    )
  ) %>%
  count(ASTDY, dl) %>%
  complete(ASTDY=1:56, dl = c("1", "2", "3", "4"), fill = list(n=0)) %>%
  select(ASTDY, dl, n)

colours <- c("#fef0d9", "#fdcc8a", "#fc8d59", "#d7301f")
daily$dl <- as.factor(daily$dl)

p0 <- ggplot(daily, aes(x = ASTDY, y = n, fill = dl)) +
   scale_fill_manual(values = rev(colours),
                    labels = rev(c("<= 0.2", "> 0.2 and <= 0.5", "> 0.5 and <=1.0", "> 1.0")),
                    name = "GC Dose (mg/kg/day)") +
  scale_x_continuous("Study Day",
                     limits = c(0, 56),
                   expand=c(0,0)) +
  labs(title="<b>Use of <span style='color:#d7301f'>High Dose GC</span> Reduces Steadily Over Time, With Stable Dose Achieved by Week 8</b>") +

  theme_minimal(base_size = 16) +
  theme(
    panel.background = element_rect(fill = "white", color = NA),
    plot.background = element_rect(fill = "white", color = NA),
    plot.title = element_markdown(colour = "black",
                                  size = 16)
  )

p1 <- p0 +  geom_stream(type = "proportional") +
  scale_y_continuous(name="Proportion", expand=c(0,0)) 
ggsave(file.path(out, "GC_Streamgraph_prop.png"), plot = p1, width = 12, height = 8, dpi = 300)

Back to blog

Example 8. Sankey diagram

library(tidyverse)
library(ggalluvial)
library(ggtext)

df <- read_csv("WW_GCdose.csv")  # file name must be quoted

weekly <- df %>%
   mutate(
    dl = case_when(
      AVAL2 <= 0.2 ~ "4",
      AVAL2 <= 0.5 ~ "3",
      AVAL2 <= 1.0 ~ "2",
      AVAL2 > 1.0 ~ "1"
    )
  ) %>%
  group_by(SUBJID, AVISITN) %>%
  filter(!duplicated(AVAL2)) %>%
  ungroup()

colours <- c("#fef0d9", "#fdcc8a", "#fc8d59", "#d7301f")
weekly$dl <- as.factor(weekly$dl)
weekly$AVISITN <- as.factor(weekly$AVISITN)

p <- ggplot(data=weekly, aes(x = AVISITN, stratum = dl, alluvium=SUBJID, fill=dl)) +
  geom_flow(color = "black", aes.flow = "forward") +
  geom_stratum() +
  scale_x_discrete("Week", 
                   limits = c("0", "1", "2", "3", "4","5", "6", "7","8"),
                   labels = c("0" = "BL", "1" = "1", "2" = "2", "3" = "3", "4" = "4", "5" = "5", "6" = "6", "7" = "7","8" = "8"),
                   expand=c(0,0)) +
  scale_y_continuous("Number of Patients", 
                     expand=c(0,0)) +  
  scale_fill_manual(values = rev(colours),
                    labels = rev(c("<= 0.2", "> 0.2 and <= 0.5", "> 0.5 and <=1.0", "> 1.0")),
                    name = "GC Dose (mg/kg/day)") +
  labs(title="<b>Use of <span style='color:#d7301f'>High Dose GC</span> Reduces Steadily After Week 2, With Stable Dose Achieved by Week 8</b>") +
  theme_minimal(base_size = 18) +
  theme(panel.grid.major = element_blank(),
        panel.background = element_rect(fill = "white", color = NA),
        plot.background = element_rect(fill = "white", color = NA),
        plot.title = element_markdown(colour = "black",
                                      size = 16)
        )

out <- "."  # output directory; adjust as needed
ggsave(file.path(out, "GC_Sankey.png"), plot = p, width = 12, height = 8, dpi = 300)

Back to blog

Citation

For attribution, please cite this work as

SIG (2025, May 14). VIS-SIG Blog: Wonderful Wednesday May 2025 (62). Retrieved from https://graphicsprinciples.github.io/posts/2025-05-14-wonderful-wednesday-may-2025/

BibTeX citation

@misc{sig2025wonderful,
  author = {SIG, PSI VIS},
  title = {VIS-SIG Blog: Wonderful Wednesday May 2025 (62)},
  url = {https://graphicsprinciples.github.io/posts/2025-05-14-wonderful-wednesday-may-2025/},
  year = {2025}
}