Getting started with {gt} tables | Nicola Rennie (2024)

{gt} is an R package designed to make it easy to make good looking tables. This blog post demonstrates how to add plots as a column in a {gt} table.

April 21, 2022

What is {gt}?

{gt} is an R package designed to make it easy to make good looking tables. My favourite feature of the {gt} package, is the ability to combine plots and tables. I’ve definitely spent time in the past deciding whether data would better presented in a table or in a plot, but {gt} allows me to combine them.

I recently found some time to explore {gt} and make a table which combines textual data with plots, to visualise the most spoken languages in the world. This blog post will demonstrate how to create a basic {gt} tables, adding plots as a column, and editing the look of the table. All code used in this post is available in thisGitHub repository.

Getting started with {gt} tables | Nicola Rennie (1)

This blog post assumes you have some basic knowledge of {tidyverse} packages - mainly {dplyr} and %>%, the pipe operator. Although it will demonstrate the data wrangling process, it is not the focus of the blog post. We’ll be using multiple packages from the {tidyverse}, along with {readxl} to read in the data, {purrr} to create multiple plots, and of course {gt}.

1234
library(readxl)library(tidyverse)library(gt)library(purrr)

Data

The dataset used in this blog post comes fromDiversity in Data and can be downloadedhere.

The date set contains information on the top 100 most spoken languages in the world. To limit the length of the table we’ll be producing, we’ll only look at the top 10. The data set contains information on the rank of each language, the total number of speakers, the number of native speakers and the origin of each language.

The data can be read in using read_xlsl() and converted to a tibble. The data is already sorted in rank order so we can simply using slice_head(n = 10) to look at the top 10 most widely spoken languages.

12
df <- tibble(read_xlsx("100 Most Spoken Languages.xlsx")) %>%  slice_head(n = 10)

Data Wrangling

To create a table using {gt}, we first need to create a tibble (or data frame) where each column in the tibble will become a column of our table. So before we begin wrangling our data, we need to think about what columns we want our final table to contain.

The final table will contain 4 columns: (i) the rank, (ii) the name of the language, (iii) some textual information about the language, and (iv) a bar chart showing the total and native speakers of the language. We should have a tibble with 4 columns and 10 rows.

The first two columns and quite simple - they already exist in the original data set in the format we need them to be in. For the third column, we can use mutate() to add a new character column called "description" to the tibble.

 1 2 3 4 5 6 7 8 9101112
df <- df %>% mutate(description = c("English is the most spoken language in the world in terms of total speakers, although only the second most common in terms of native speakers.", "Mandarin Chinese is the only language of Sino-Tibetan origin to make the top ten, and has the highest number of native speakers.", "Hindi is the third most spoken language in the work, and is an official language of India, and Fiji, amongst others.", "Spanish is an Indo-European language, spoken by just over half a billion people worldwide.", "French is an official language of Belgium, Switzerland, Senegal, alongside France and 25 other countries. It is also of Indo-European origin.", "Standard Arabic is the only language of Afro-Asiatic origin in the top ten, and has no native speakers according to www.ethnologue.com.", "Bengali is the official and national language of Bangladesh, with 98% of Bangladeshis using Bengali as their first language.", "Russian is an official language of only four countries: Belarus, Kazakhstan, Kyrgyzstan and Russia; with over a quarter of a billion total speakers.", "Portuguese is spoken by just under a quarter of a billion people, with almost all (around 95%) being native speakers.", "Indonesian is the only Austronesian language to make the top ten"))

Alternatively, you could store these descriptions in a separate file and then join it to the original data frame.

Before getting started on making the plots, we need to format the two numeric columns we’ll be using. Currently, both "Total Speakers" and "Native Speakers" are stored as character vectors in the form "1,132M". We want this to be a numeric value of 1132. We use mutate() again to add new columns with the correctly formatted numeric columns. This is done using regular expressions.

12345678
df <- df %>%  mutate( Total = as.numeric( unlist(lapply( regmatches(`Total Speakers`, gregexpr("[[:digit:]]+", `Total Speakers`)), function(x) str_flatten(x)))), Native = as.numeric( unlist(lapply( regmatches(`Native Speakers`, gregexpr("[[:digit:]]+", `Native Speakers`)), function(x) str_flatten(x)))))

We also convert and NA values to 0, and remove columns we no longer need.r

1234
df <- df %>%  mutate(Native = replace_na(Native, 0), Nonnative = Total - Native) %>% select(Rank, Language, description, Total, Native, Nonnative)

Adding plots to tables

In order to add a column of plots to the table, we need to:

  • filter the dataframe to use only the relevant row
  • make a bar chart using the data in each row
  • programmatically create the plot for each row of the table

The easiest way to do this, in my opinion, is to create a function which takes two inputs: (i) the dataframe, (ii) a unique identifier for each row - in this case, we’ll use the name of the language to identify each row. Alternatively, we could use the rank, for example.

The function filters the data set to only give the row related to the chosen language. It then converts it into long format, ready to use with {ggplot2}. I initially manually filtered the data set, and prepared one plot to mess around with getting it to look the way I wanted it to, before I made it a function. The output of the function is a single plot, relating to a specific language.

 1 2 3 4 5 6 7 8 910111213141516171819202122232425262728293031
plot_lang <- function(lang, df){ # prep data p_data <- filter(df, Language == lang) %>% select(Native, Nonnative, Total) %>% pivot_longer(1:2) %>% mutate(x = 1, name = factor(name, levels = c("Nonnative", "Native"))) # limits lower <- filter(p_data, name == "Native")$value upper <- unique(p_data$Total) if ((upper - lower) < 100){ upper = upper + 50 lower = lower - 50 } # make plot ggplot(data = p_data, mapping = aes(x = x, y = value, fill = name)) + geom_col() + scale_fill_manual(values = c("lightgrey", "#355C7D")) + labs(x = "", y = "") + scale_y_continuous(limits = c(0, plyr::round_any(max(df2$Total), 100, ceiling)), breaks = c(lower, upper), labels = c(filter(p_data, name == "Native")$value, unique(p_data$Total))) + theme_minimal() + coord_flip() + theme(legend.position = "none", panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.text.y = element_blank(), axis.text.x = element_text(colour = "#355C7D", size = 60, face = "bold"))}

There were a couple of things to be aware of when making this plot:

  • By default, {ggplot2} sets the limits of the plot equal to the limits of the data. If I wanted to display only the proportional difference between native and non-native speakers, this would be fine. Since, I also wanted to display the difference in total speakers across languages, I had to manually specify the limits based on the maximum values in the whole dataframe (not just the selected row).
  • I added labels on the x-axis to show the values. Sometimes the values for total and native speakers were close together and the labels overlapped. I also manually specified the position of the labels if they were too close together.

After the plot function was created, we can use map() from {purrr} to create a list of plots, by applying to a list of all languages.

123
all_lang <- df %>% pull(Language)lang_plots <- purrr::map(.x = all_lang, .f = ~plot_lang(.x, df = df))

Columns of tibbles don’t just store numeric or character vectors, they can also store lists of ggplots. So we can use mutate() again to add our list of plots resulting from applying map() as another column. We also filter out the columns we no longer need.

123
df <- df %>% mutate(plots = lang_plots) %>% select(Rank, Language, description, plots)

Making a {gt} table

We now have out input ready to make a table using {gt}. Unfortunately, because our table contains plots, we can’t simply pipe in the tibble to the gt() function. If you try it, you’ll see that the plot column contains code, rather than a plot.

Instead, we:

  • select the non-plot columns,
  • add a NA column for the plots,
  • pipe this into gt()
1234
tb <- df %>% select(Rank, Language, description) %>% mutate(plots = NA) %>% gt() 

Now, we use the text_transform() function to add in our plots to the table. Again, we use map() from {purrr} to apply ggplot_image().

123456789
tb <- tb %>%  text_transform( locations = cells_body(columns=plots), fn = function(x){ purrr::map( df$plots, ggplot_image, height = px(80), aspect_ratio = 4 ) }) 

This results in the table shown below.

Getting started with {gt} tables | Nicola Rennie (2)

All the key components we need are there, but we may want to add some styling to improve the finished look.

Styling tables

There are a lots of different ways to style {gt} tables, so we’ll just highlight a few here:

  • Adding text: adding in titles, subtitle, and captions is a key part that you’ll likely want to include in most of your tables. The tab_header() function adds text above the main body of the table, and tab_source_note() add text below the table.
12345678
tb <- tb %>%  tab_header( title = "10 Most Spoken Languages", subtitle = "Diversity in Data are an initiative centered around diversity, equity & awareness. In December 2021, they released a dataset obtained from www.ethnologue.com detailing the 100 most spoken languages in the world. This table shows the ten most spoken languages in the world. Although English is the most spoken in terms of total speaks, Mandarin Chinese has the highest number of native speakers." ) %>% tab_source_note( source_note = "N. Rennie | Data: data.world/diversityindata" )
  • Styling text: you may want to change the font colour, font family, sizing etc. of the titles or subtitles, as well as the text contained in the columns of the table. tab_style() is the function used to do all of these.
 1 2 3 4 5 6 7 8 91011121314151617181920212223242526272829303132333435363738394041424344454647484950
tb <- tb %>%  tab_style( locations = cells_title(groups = 'title'), style = list( cell_text( size = "xx-large", weight="bold", color='#355C7D' ))) %>% tab_style( style = list( cell_text( align = "center", size='medium', color='#355C7D', weight="bold")), locations = cells_body(Rank) ) %>% tab_style( style = list( cell_text( align = "center")), locations = cells_column_labels(Rank) ) %>% tab_style( style = list( cell_text( size='large', color='#355C7D', weight="bold")), locations = cells_body(Language) ) %>% tab_style( style = list( cell_text( align = "left")), locations = cells_body(plots) ) %>% tab_style( style = list( cell_text( align = "left")), locations = cells_column_labels(plots) ) %>% tab_style( style = list( cell_text(size = 'small', align = "center")), locations = cells_source_notes() )
  • Edit column names: by default the column headings in the table are the same as the column names in the input tibble. These may not be named in a presentation-ready format, so we can manually rename them using cols_label().
1234567
tb <- tb %>%  cols_label( Rank = "Rank", Language = "Language", description = "Description", plots = "Number of speakers (millions)" )
  • Edit column widths: editing the column widths can be done using cols_width() to make sure that the column is wide enough to make the plot legible.
1234567
tb <- tb %>%  cols_width( Rank ~ px(100), Language ~ px(135), description ~ px(400), plots ~px(400) )
  • Table width: to ensure that the final table will be displayed and saved correctly, it’s useful to set the total width of the table equal to the sum of the column widths we just set using tab_options().
123
tb <- tb %>%  tab_options(table.width = 1035, container.width = 1035)
  • Colouring rows: tables can sometimes be tricky to read, especially if they have lots of rows. tab_style() can also be used to set different colours for alternating rows.
12345
tb <- tb %>%  tab_style( style = list(cell_fill(color = "#e4edf4")), locations = cells_body(rows = seq(1,9,2)) )

The final table now looks a little more polished.

Getting started with {gt} tables | Nicola Rennie (3)

Saving your {gt} table

If you’re using RStudio you’ll notice that {gt} tables are previewed in the Viewer pane, rather than the Plots pane, so you can’t just use the Export button. However, the {gt} package does have the gtsave() function which you can use to save a static version of your plot.

1
gtsave(tb,"languages.png")

Alternatively, you can use Rmarkdown to output to HTML, or PDF documents, for example.

Final thoughts

These are just a couple of the things you can do with {gt}. Hopefully, it’s inspired you to explore what it can do. If you want to recreate this table for yourself, the data can be downloadedhere and the code is available onGitHub.

When I was first exploring {gt}, I found some resources byBenjamin Nowak very helpful. If you’re already a fan of {gt}, it’s also worth checking out{gtExtras} which provides some additional helper functions to assist in creating beautiful tables with {gt}.

For attribution, please cite this work as:

Getting started with {gt} tables.
Nicola Rennie. April 21, 2022.
nrennie.rbind.io/blog/getting-started-with-gt-tables
BibLaTeX Citation
@online{rennie2022, author = {Nicola Rennie}, title = {Getting started with {gt} tables}, date = {2022-04-21}, url = {https://nrennie.rbind.io/blog/getting-started-with-gt-tables}}

Licence: creativecommons.org/licenses/by/4.0

Getting started with {gt} tables | Nicola Rennie (2024)

References

Top Articles
2012 Mercedes-Benz SL-Class for sale - Irvine, CA - craigslist
2017 Hyundai Elantra for sale - Charleston, SC - craigslist
Sugar And Spice 1976 Pdf
Obituaries in South Bend, IN | South Bend Tribune
Edutone Skyward
Saccone Joly Gossip
BEL MOONEY: Should I leave this boorish, bullying layabout?
Gameplay Clarkston
Smoke Terminal Waterbury Photos
Mimissliza01
People Helping Others Property
Shiftwizard Login Wakemed
Cmx Cinemas Gift Card Balance
Pbr Oklahoma Baseball
Red Wing Boots Dartmouth Ma
Top Scorers Transfermarkt
Black Adam Showtimes Near Kerasotes Showplace 14
Finger Lakes 1 Police Beat
Unterschied zwischen ebay und ebay Kleinanzeigen: Tipps, Vor- und Nachteile
Ubreakifix Laptop Repair
Pokemon Fire Red Download Pc
Wsisd Calendar
Cool Math Games Unblocked 76
Sky Park Stl Coupon
Asa Morse Farm Photos
636-730-9503
Runnings Milwaukee Tool Sale
Thothub Alinity
SuperLotto Plus | California State Lottery
Burlington Spectrum Tucson
Thailandcupid
Generac Find My Manual
How to Learn Brazilian Jiu‐Jitsu: 16 Tips for Beginners
Mark Rosen announces his departure from WCCO-TV after 50-year career
Northern Va Bodyrubs
Mcdonald's Near Me Dine In
Devil May Cry 3: Dante's Awakening walkthrough/M16
Strange World Showtimes Near Amc Hoffman Center 22
Actionman23
Alexis Drake Donation Request
Texas Motors Specialty Photos
Franco Loja Net Worth
Understanding Turbidity, TDS, and TSS
Sayuri Pilkey
C And B Processing
Legend Of Krystal Forums
What is Landshark Beer?
How to Set Up Dual Carburetor Linkage (with Images)
Hr Central Luxottica Benefits
Kgtv Tv Listings
Agurahl The Butcher Wow
Hollyday Med Spa Prairie Village
Latest Posts
Article information

Author: Catherine Tremblay

Last Updated:

Views: 6239

Rating: 4.7 / 5 (67 voted)

Reviews: 90% of readers found this page helpful

Author information

Name: Catherine Tremblay

Birthday: 1999-09-23

Address: Suite 461 73643 Sherril Loaf, Dickinsonland, AZ 47941-2379

Phone: +2678139151039

Job: International Administration Supervisor

Hobby: Dowsing, Snowboarding, Rowing, Beekeeping, Calligraphy, Shooting, Air sports

Introduction: My name is Catherine Tremblay, I am a precious, perfect, tasty, enthusiastic, inexpensive, vast, kind person who loves writing and wants to share my knowledge and understanding with you.