Using httr to Fetch Data from Multiple Rows of a DataFrame in R

In this article, we use the httr package in R to send one HTTP request per row of a data frame and collect the responses. The steps are: build a URL for each row, send the GET request, parse the response, and store the results in a data frame.

Background


The httr package is a popular tool for making HTTP requests in R. It provides a convenient interface to send GET, POST, PUT, DELETE, and other types of requests, as well as to access headers, query parameters, and more. In this article, we will focus on using httr to send GET requests.

Preparing the URL


To use httr, you first need a URL for each row of your data frame. Here we add a url column built from the recipient and donor values in that row, so that each row can be sent as a separate GET request.

library(tidyverse)
library(httr)

test <- tribble(~case, ~mDPB11cd.recipient, ~mDPB12cd.recipient, ~mDPB11cd.donor, ~mDPB12cd.donor,
                 101, "04:01", "01:01", "03:01", "04:01",
                 102, "04:01", "02:01", "04:01", "01:01",
                 103, "01:01", "104:01", "03:01", "05:01")

urls <- test %>%
  mutate(url = paste0("https://www.ebi.ac.uk/cgi-bin/ipd/imgt/hla/dpb_v2.cgi?pid=1&patdpb1=", mDPB11cd.recipient, "&patdpb2=", mDPB12cd.recipient, "&did=2&dondpb1=", mDPB11cd.donor, "&dondpb2=", mDPB12cd.donor))

urls
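
As an alternative to pasting the query string together by hand, the sketch below builds one URL with httr's modify_url(), which percent-encodes the query values. The helper name build_dpb_url and its argument names are hypothetical and not part of the original workflow; the parameter names are the ones used in the URL above.

```r
library(httr)

# Hypothetical helper: builds one query URL from a recipient pair and a donor pair.
build_dpb_url <- function(pat1, pat2, don1, don2) {
  modify_url(
    "https://www.ebi.ac.uk/cgi-bin/ipd/imgt/hla/dpb_v2.cgi",
    query = list(
      pid = "1", patdpb1 = pat1, patdpb2 = pat2,
      did = "2", dondpb1 = don1, dondpb2 = don2
    )
  )
}

# Values taken from the first row of the example data (case 101).
example_url <- build_dpb_url("04:01", "01:01", "03:01", "04:01")
example_url
```

Because the helper handles a single row, it would need to be applied row by row (for example with purrr::pmap_chr()) to fill a whole url column.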

Sending the GET Request


Next, we use map() from purrr (loaded with the tidyverse) to send a GET request for each URL in the url column.

urls %>%
  mutate(result = map(url, GET))

This code creates a new list-column called result, which holds the response object from each GET request. The requests are sent one after another, in row order. Keeping the responses in a list-column keeps each one aligned with the row that produced it, which is what the parsing step needs.
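
When many rows are involved, a single failed request would otherwise stop the whole pipeline. The following is a minimal sketch, assuming a 10-second timeout and a one-second pause between requests are acceptable; safe_get and fetch_all are hypothetical names introduced only for illustration.

```r
library(httr)
library(purrr)

# Hypothetical wrapper: GET with a 10-second timeout. Any error
# (timeout, DNS failure, refused connection) yields NULL instead.
safe_get <- possibly(function(u) GET(u, timeout(10)), otherwise = NULL)

# Hypothetical helper: request each URL in turn, pausing between calls
# so the server is not hit with back-to-back requests.
fetch_all <- function(urls, pause = 1) {
  map(urls, function(u) {
    Sys.sleep(pause)
    safe_get(u)
  })
}
```

It could be used in place of the plain map() call, for example mutate(result = fetch_all(url)); rows whose request failed would then hold NULL in result.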

Parsing the Response


GET() returns a response object that contains the status code, HTTP headers, and body. Here we only need the body, which content() extracts as text when called with as = "text".

library(httr)

test %>% 
  mutate(url = paste0("https://www.ebi.ac.uk/cgi-bin/ipd/imgt/hla/dpb_v2.cgi?pid=1&patdpb1=", mDPB11cd.recipient, "&patdpb2=", mDPB12cd.recipient, "&did=2&dondpb1=", mDPB11cd.donor, "&dondpb2=", mDPB12cd.donor),
         result = map(url, GET), 
         # Extract each body as text; stating the encoding avoids httr's guess message
         data = map(result, content, as = "text", encoding = "UTF-8"))
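
Calling content() directly assumes every request succeeded. A minimal sketch of a more defensive parser follows; body_or_na is a hypothetical helper name, and treating both a NULL placeholder (a request that never completed) and any HTTP error status as missing is an assumption about how failures should be handled.

```r
library(httr)

# Hypothetical helper: return the body as text, or NA when there is no
# response at all (NULL) or the server answered with an error status (>= 400).
body_or_na <- function(resp) {
  if (is.null(resp) || http_error(resp)) {
    return(NA_character_)
  }
  content(resp, as = "text", encoding = "UTF-8")
}
```

Applied with purrr::map_chr(result, body_or_na), this would give a plain character column with NA marking the rows that need a retry.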

Storing the Results


Finally, we store the results in a data frame using dplyr.

library(tidyverse)

test %>% 
  mutate(url = paste0("https://www.ebi.ac.uk/cgi-bin/ipd/imgt/hla/dpb_v2.cgi?pid=1&patdpb1=", mDPB11cd.recipient, "&patdpb2=", mDPB12cd.recipient, "&did=2&dondpb1=", mDPB11cd.donor, "&dondpb2=", mDPB12cd.donor),
         result = map(url, GET), 
         # Extract each response body as text
         data = map(result, content, as = "text", encoding = "UTF-8")) %>%
  # case is already a column of test, so it can be used for sorting directly
  arrange(case, mDPB11cd.donor) %>% 
  # Keep the inputs, the URL, and the parsed body; drop the raw response objects
  select(case, mDPB11cd.recipient, mDPB12cd.recipient, mDPB11cd.donor, mDPB12cd.donor, url, data)

This code adds a data column containing each response body as text, sorts the rows with arrange(), and uses select() to drop the raw result column, leaving the original columns plus url and data.
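
A column produced by map() is a list-column, and base write.csv() generally cannot write those. The sketch below shows one way to flatten such a column before saving. The toy tibble stands in for the real results and its contents are made up for illustration.

```r
library(tibble)
library(dplyr)
library(purrr)

# Toy stand-in for the real results: data is a list-column of strings.
toy <- tibble(
  case = c(101, 102),
  data = list("<html>first</html>", "<html>second</html>")
)

# map_chr() requires exactly one string per row and errors otherwise,
# which makes unexpected response shapes visible early.
flat <- toy %>%
  mutate(data = map_chr(data, as.character))

# With only atomic columns left, the table can be written to disk.
out_file <- tempfile(fileext = ".csv")
write.csv(flat, out_file, row.names = FALSE)
```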

Conclusion


In this article, we used httr in R to send one HTTP request per row of a data frame and collect the responses. We went through the steps involved: preparing the URL for each row, sending the GET request, parsing the response, and storing the results in a data frame.

We also covered two important details: how to use map to send one request per row, and how to parse the response body as text with content.


Last modified on 2023-07-16