Rapidly Format Data in Tables with Custom Conditions Using the formattable Package in R

Understanding the Problem and Requirements

In this article, we explore how to format table columns conditionally in R using the formattable package. The goal is to round “small” variables to two decimal places and to display “big” variables with a thousands separator (big mark) and no decimals.

Introduction to Formattable Package

The formattable package provides an easy-to-use interface for formatting data in tables in R. It lets us apply formatting rules such as fixing the number of decimal places, adding a thousands separator, or displaying values as percentages.
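As a quick illustration of these formatters, the sketch below applies comma(), percent(), and digits() from formattable to a few made-up values. The variable names and numbers are ours, chosen only for the example.

```r
library(formattable)

# Thousands separator with no decimals
big_value <- comma(1234567.891, digits = 0)
format(big_value)    # "1,234,568"

# Percentage with no decimals
share <- percent(0.256, digits = 0)
format(share)        # "26%"

# Fixed number of decimal places
small_value <- formattable::digits(3.14159, 2)
format(small_value)  # "3.14"
```

Calling format() returns the text that formattable would display, which makes it easy to check a rule before applying it to a whole column.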

Conditions for Formatting

We select the columns to format using two conditions:

  1. “Big” columns: the column is numeric and its maximum value is greater than 100.
  2. “Small” columns: the column is numeric and all of its values are less than 1.

No column can satisfy both conditions at once, so each condition selects its own set of columns and gets its own formatting rule. Columns that satisfy neither condition are left unchanged.
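The two conditions can be written as small predicate functions. The sketch below uses base R only; the names is_big and is_small are hypothetical helpers introduced for illustration, and the test vectors are made up.

```r
# Hypothetical helper predicates for the two conditions.
# `&&` short-circuits, so max() and all() only run on numeric input.
is_big <- function(x) is.numeric(x) && max(x) > 100
is_small <- function(x) is.numeric(x) && all(x < 1)

is_big(c(5, 250))        # TRUE: the maximum is above 100
is_big(c(1, 100))        # FALSE: 100 is not greater than 100
is_small(c(0.05, 0.5))   # TRUE: every value is below 1
is_small(c(0.5, 1))      # FALSE: 1 is not less than 1
is_big(c("a", "b"))      # FALSE: not numeric, so max() is never evaluated
```

The boundary cases matter: a column whose maximum is exactly 100 is not treated as “big”, and a column containing a value of exactly 1 is not treated as “small”.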

Solution using R Programming Language

To solve this problem, we can combine mutate(), across(), and where() from the dplyr package with the formatting functions from the formattable package. This combination supersedes the older mutate_if() function and lets us pass a custom predicate for each condition.

We will select two separate groups of columns: one for numeric columns whose maximum is greater than 100 and another for numeric columns whose values are all less than 1.

Grouping Columns Based on Conditions

# Load necessary libraries
library(dplyr)
library(formattable)

# Create a sample data frame
df <- tibble(
    i = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
    j = c(1^4/10, 2^4/10, 3^4/10, 4^4/10, 5^4/10, 6^4/10, 7^4/10, 8^4/10, 9^4/10, 10^4/10),
    k = c(1/20, 2/20, 3/20, 4/20, 5/20, 6/20, 7/20, 8/20, 9/20, 10/20),
    l = c(1^2, 2^2, 3^2, 4^2, 5^2, 6^2, 7^2, 8^2, 9^2, 10^2)
)

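# Format numeric columns whose maximum exceeds 100 ("big" columns)
# with a comma as the thousands separator and two decimal places.
# `&&` short-circuits, so max() is only evaluated for numeric columns.
df %>% 
    mutate(across(where(~ is.numeric(.x) && max(.x) > 100),
                  ~ formattable::comma(.x, digits = 2, big.mark = ",")))

# Format numeric columns whose values are all below 1 ("small" columns)
# as percentages with no decimal places
df %>% 
    mutate(across(where(~ is.numeric(.x) && all(.x < 1)),
                  ~ formattable::percent(.x, digits = 0, big.mark = ",")))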

Understanding the Code

In this code:

  • We first create a sample data frame df with columns i, j, k, and l.
  • The first pipeline uses mutate() with across() and where() to select the numeric columns whose maximum value is greater than 100 (here, only j) and formats them with formattable::comma(), which adds a comma as the thousands separator and shows two decimal places. Column l is not selected because its maximum is exactly 100.
  • The second pipeline selects the numeric columns whose values are all less than 1 (here, only k) and converts them to percentages with no decimal places using formattable::percent().
  • Neither pipeline modifies df itself. Each one returns a new tibble, so assign the result if you want to keep it.
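The two steps above run independently. If we want the formatting described at the start of the article (a big mark with no decimals for “big” columns, two decimals for “small” ones) in a single result, one possible sketch is shown below. The helper names is_big, is_small, big_cols, and small_cols are our own, and using formattable::digits() for the two-decimal rule is an assumption rather than part of the original solution.

```r
library(dplyr)
library(formattable)

# Same sample values as in the article, written more compactly
df <- tibble(
    i = 1:10,
    j = (1:10)^4 / 10,
    k = (1:10) / 20,
    l = (1:10)^2
)

# Hypothetical predicates for the two conditions
is_big <- function(x) is.numeric(x) && max(x) > 100
is_small <- function(x) is.numeric(x) && all(x < 1)

# Decide the column groups on the unformatted data first
big_cols <- names(df)[vapply(df, is_big, logical(1))]
small_cols <- names(df)[vapply(df, is_small, logical(1))]

# Apply one formatting rule per group in a single mutate()
formatted <- df %>%
    mutate(
        across(all_of(big_cols), ~ formattable::comma(.x, digits = 0)),
        across(all_of(small_cols), ~ formattable::digits(.x, 2))
    )

formatted
```

Computing the column names before calling mutate() keeps the second rule from being evaluated against columns that the first rule has already reformatted.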

Final Output

After running the code, we get the following output:

 i       j     k        l
 1     0.1    5%     1.00
 2     1.6   10%     4.00
 3     8.1   15%     9.00
 4    25.6   20%    16.00
 5    62.5   25%    25.00
 6    130.   30%    36.00
 7    240.   35%    49.00
 8    410.   40%    64.00
 9    656.   45%    81.00
10    1000   50%   100.00

The columns have been formatted according to the specified conditions.
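One point worth checking after formatting is that formattable changes how values are displayed, not the numbers themselves. A minimal sketch with made-up values:

```r
library(formattable)

amounts <- comma(c(1234.567, 99.2), digits = 0)

format(amounts)      # displayed text: "1,235" "99"
as.numeric(amounts)  # underlying values: 1234.567 99.2
```

Because the original numbers are kept, a formatted column can still be converted back with as.numeric() for further calculations.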

This article has demonstrated how to apply different formatting rules to the columns of a table based on specific conditions, using dplyr and the formattable package in R. We hope this helps you with your own data analysis tasks!


Last modified on 2024-10-10