Feature request/suggestion - point estimate with interval labels to aid in plotting #223

Jimbilben · 2020-06-18T07:16:58Z

I've found it very useful to have easy access to a 'label' providing a chosen point estimate and density interval, which can be added for example to graphs as a means of providing extra, concise information about a parameter or effect that is depicted. E.g., adding label with the median and 95% HDI from the posterior, or whatever other point estimate/interval you choose, in the form: 5.67 [4.33 - 6.01].

The following R code takes a posterior distribution in the form of a vector and returns a summary. You can ignore most of the probably very inelegant coding, but the function inside of this function called 'point.uncertainty' could be a useful place to begin if you think this feature may be useful.

This requires dplyr and hdinterval packages to run.

summarise.me <- function(posterior, name = "summary", interval = .95, p.e = "median", u.e = "hdi") {
  
  hdi.bounds <- hdi(posterior, credMass = interval)
  hdi.lower <- hdi.bounds["lower"]
  hdi.upper <- hdi.bounds["upper"]
  
  eti.upper <- quantile(posterior, probs = interval + ((1 - interval) / 2) )
  eti.lower <- quantile(posterior, probs = (1 - interval) / 2)
    
  mean <- mean(posterior)
  median <- median(posterior)
  mode <- (density(as.numeric(posterior))
           $x[which.max(density(as.numeric(posterior))$y)])
  
  hdi.width <- hdi.upper - hdi.lower
  eti.width <- eti.upper - eti.lower
  
  posterior.summary <- 
    tibble(name = name,
           mean = mean,
           median = median,
           mode = mode,
           hdi.lower = hdi.lower,
           hdi.upper = hdi.upper,
           hdi.width = hdi.width,
           eti.lower = eti.lower,
           eti.upper = eti.upper,
           eti.width = eti.width,
           interval = interval)
  
  point.uncertainty <- function(summary, point.estimate = "mean", uncertainty = "hdi") {
    if (point.estimate == "mean" & uncertainty == "hdi")
    {
      paste(sub("^(-?)0.", "\\1.", sprintf("%.2f", summary$mean)), 
            " [", 
            sub("^(-?)0.", "\\1.", sprintf("%.2f", summary$hdi.lower)), 
            "-", 
            sub("^(-?)0.", "\\1.", sprintf("%.2f", summary$hdi.upper)), 
            "]" , 
            sep = "")
    } else if (point.estimate == "mode" & uncertainty == "hdi")
    {
      paste(sub("^(-?)0.", "\\1.", sprintf("%.2f", summary$mode)), 
            " [", 
            sub("^(-?)0.", "\\1.", sprintf("%.2f", summary$hdi.lower)), 
            "-", 
            sub("^(-?)0.", "\\1.", sprintf("%.2f", summary$hdi.upper)), 
            "]" , 
            sep = "")
    } else if (point.estimate == "median" & uncertainty == "hdi")
    {
      paste(sub("^(-?)0.", "\\1.", sprintf("%.2f", summary$median)), 
            " [", 
            sub("^(-?)0.", "\\1.", sprintf("%.2f", summary$hdi.lower)), 
            "-", 
            sub("^(-?)0.", "\\1.", sprintf("%.2f", summary$hdi.upper)), 
            "]" , 
            sep = "")
    } else if (point.estimate == "mean" & uncertainty == "eti")
    {
      paste(sub("^(-?)0.", "\\1.", sprintf("%.2f", summary$mean)), 
            " [", 
            sub("^(-?)0.", "\\1.", sprintf("%.2f", summary$eti.lower)), 
            "-", 
            sub("^(-?)0.", "\\1.", sprintf("%.2f", summary$eti.upper)), 
            "]" , 
            sep = "")
    } else if (point.estimate == "mode" & uncertainty == "eti")
    {
      paste(sub("^(-?)0.", "\\1.", sprintf("%.2f", summary$mode)), 
            " [", 
            sub("^(-?)0.", "\\1.", sprintf("%.2f", summary$eti.lower)), 
            "-", 
            sub("^(-?)0.", "\\1.", sprintf("%.2f", summary$eti.upper)), 
            "]" , 
            sep = "")
    }else if (point.estimate == "median" & uncertainty == "eti")
  {
    paste(sub("^(-?)0.", "\\1.", sprintf("%.2f", summary$median)), 
          " [", 
          sub("^(-?)0.", "\\1.", sprintf("%.2f", summary$eti.lower)), 
          "-", 
          sub("^(-?)0.", "\\1.", sprintf("%.2f", summary$eti.upper)), 
          "]" , 
          sep = "")
  }

  }
  posterior.summary <- posterior.summary %>%
    mutate(label = point.uncertainty(posterior.summary, point.estimate = p.e, uncertainty = u.e))
  
  return(posterior.summary)
}

jgabry · 2020-06-22T16:44:45Z

Thanks for the feature request!

We have a function in bayesplot called mcmc_intervals_data() which does something similar to the summary that your function creates

library(bayesplot)
x <- example_mcmc_draws()
mcmc_intervals_data(x) # has some optional arguments to control interval width

# A tibble: 4 x 9
  parameter outer_width inner_width point_est      ll       l       m      h     hh
* <fct>           <dbl>       <dbl> <chr>       <dbl>   <dbl>   <dbl>  <dbl>  <dbl>
1 alpha             0.9         0.5 median    -42.7   -28.7   -18.5   -7.46   9.18 
2 sigma             0.9         0.5 median     17.1    17.6    18.0   18.4   19.0  
3 beta[1]           0.9         0.5 median     -0.165   0.136   0.358  0.601  0.916
4 beta[2]           0.9         0.5 median     -0.722  -0.578  -0.469 -0.362 -0.209

but we don't currently provide any way of creating plot labels from this summary table. We do provide other ggplot2 helper functions (https://mc-stan.org/bayesplot/reference/index.html#section-ggplot-helpers) so there's certainly a precedent for bayesplot incorporating something like this. Alternatively, it could be added as a column in the mcmc_intervals_data() data frame instead of as a separate function.

Jimbilben · 2020-06-24T10:00:01Z

One thing that would be nice about having it in the same output as other summary information is that when you wish to make plots, you already have it in the same tibble that you are summarising/visualising. I would at least find that more useful than a separate function, but up to you how/if you would think it best implemented!

jgabry added the feature label Jun 22, 2020

Sep	OCT	Nov
	24
2019	2020	2021

stan-dev / bayesplot

Feature request/suggestion - point estimate with interval labels to aid in plotting #223

Feature request/suggestion - point estimate with interval labels to aid in plotting #223

Jimbilben commented Jun 18, 2020 •

edited by jgabry

jgabry commented Jun 22, 2020

Jimbilben commented Jun 24, 2020

stan-dev / bayesplot

Sponsor stan-dev/bayesplot

Join GitHub today

Feature request/suggestion - point estimate with interval labels to aid in plotting #223

Feature request/suggestion - point estimate with interval labels to aid in plotting #223

Comments

Jimbilben commented Jun 18, 2020 • edited by jgabry

jgabry commented Jun 22, 2020

Jimbilben commented Jun 24, 2020

Essential cookies

Always active

Analytics cookies

Jimbilben commented Jun 18, 2020 •

edited by jgabry