Edit tables in a relational database

2024-05-21

This document will learn you how to connect to a database and leverage editbl to explore and edit your data.

The package editbl is actually developed with the main goal of making it as easy as possible to work with a relational database. It therefore also uses terminology of the relational model. It is recommended to understand the concepts foreign key, natural key and surrogate key before proceeding.

That being said, let’s load some packages first.

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(shiny)
library(editbl)

Basics of handling a database in R

The first thing you need is a database connection. Here we connect to an sqlite file, which is a portable database format.

tmpFile <- tempfile(fileext = ".sqlite")
file.copy(system.file("extdata", "chinook.sqlite", package = 'editbl'), tmpFile) 
conn <-   DBI::dbConnect(
    dbname = tmpFile,
    drv = RSQLite::SQLite()
)

With the {DBI} package you can list all tables and schemas that are available in the database for exploratory purposes.

DBI::dbListTables(conn)
##  [1] "Album"         "Artist"        "Customer"      "Employee"     
##  [5] "Genre"         "Invoice"       "InvoiceLine"   "MediaType"    
##  [9] "Playlist"      "PlaylistTrack" "Track"

The {dbplyr} package allows us to create tbl objects that reference the database tables without actually pulling all the data in memory. These objects can be manipulated using {dplyr} syntax in a lazy-evaluated way. Meaning you can work with big data that doesn’t even fit on your computer! You can read more about it here.

dplyr::tbl(conn, 'Album')
## # Source:   table<Album> [?? x 3]
## # Database: sqlite 3.45.0 [/tmp/RtmpOGknNx/file24f4bc8e33102.sqlite]
##    AlbumId Title                                 ArtistId
##      <int> <chr>                                    <int>
##  1       1 For Those About To Rock We Salute You        1
##  2       2 Balls to the Wall                            2
##  3       3 Restless and Wild                            2
##  4       4 Let There Be Rock                            1
##  5       5 Big Ones                                     3
##  6       6 Jagged Little Pill                           4
##  7       7 Facelift                                     5
##  8       8 Warner 25 Anos                               6
##  9       9 Plays Metallica By Four Cellos               7
## 10      10 Audioslave                                   8
## # ℹ more rows

A simple table editor

Let’s give a shot at building our first shiny app. This one will modify the Albums table in the database.

First we specify a shiny module. If you are unfamiliar with shiny modules, all you have to know is that they act as normal shiny apps, with ns() wrapped around in/output id’s. Though I do recommend reading more about them, since it will enhance what you can do with shiny.

dbUI <- function(id) {
  ns <- NS(id)
  fluidPage(
      eDTOutput(id = ns('Album'))
  )
}

dbServer <- function(id, conn) {
  moduleServer(
      id,
      function(input, output, session) {        
        Album <- eDT(
            id = "Album",
            key = "AlbumId",
            data = dplyr::tbl(conn, "Album"),
            in_place = TRUE 
        )
          
        invisible()
      }
  )
}

We need to put in_place = TRUE. Reason being that we want to actually modify tables within the database. If this argument is set to FALSE, editbl will return a edited copy of the original data. This is useful for data.frame’s, but not for databases.

We also specify which column(s) are the key of the table. This is the minimal set of columns to uniquely identify a row. You don’t have to specify this argument, but it makes eDT() more efficient. E.g. this way it only has to match on the key columns instead of all columns when doing updates or deletes.

IMPORTANT: Make sure the key you specify actually is unique across the table!

editbl will not itself check for uniqueness of rows because it’s computationally expensive. Ensuring uniqueness is usually the responsibility of the database.

Let’s run the app:

shiny::shinyApp( 
    ui = dbUI('id'),
    server =  function(input, output,session){
      dbServer('id', conn)
    })

Great, you can now modify the Album table in the database!

Tip: if you do not fill in the AlbumId for a new row, the database will automatically do it for you. If you prefer however to handle setting defaults within your application, take a look at the defaults argument of eDT().

Advanced table editor (flat, joined table).

Let’s take in one step further. You don’t really care about AlbumId and ArtistId do you? These are just meaningless surrogate keys to uniquely identify rows. So why not hide them?

This we can do by joining the Artists on ArtistId (foreign key).

dbServer_hidden_keys <- function(id, conn) {
  moduleServer(
      id,
      function(input, output, session) {
        db_album <- dplyr::tbl(conn, "Album")
        db_artist <- dplyr::tbl(conn, "Artist")
        
        Album <- eDT(
            id = "Album",
            data = db_album,
            in_place = TRUE,
            foreignTbls = list(
              foreignTbl(
                x = db_album,
                y = db_artist,
                by = 'ArtistId',
                naturalKey = 'Name'
              )
             ),
             options = list(
               columnDefs = list(
                 list(visible=FALSE, targets=c("AlbumId","ArtistId"))
               )
             )
        )
        invisible()
      }
  )
}

IMPORTANT Make sure the naturalKey you specify actually is unique across the table! (same reason as above)

shiny::shinyApp( 
    ui = dbUI('id'),
    server =  function(input, output,session){
      dbServer_hidden_keys('id', conn)
    })

Way more convenient isn’t it? Make sure to click the ‘edit’ button. Here you can see artists has now become a dropdown. You’ve actually also created your first constraint, ensuring that only artists that exists within the ‘Artist’ table can be filled in. So it is now easier to fill in this information and you ensure data correctness, double win.

Advanced table editor (partial table).

You can use dplyr::filter() to only show a subset of relevant rows to the user. This can be used for improvement of performance/navigation or enforcing row-level security.

dbUI_advanced <- function(id) {
  ns <- NS(id)
  fluidPage(
      shiny::uiOutput(ns("artistSelector_UI")),
      eDTOutput(id = ns('Album'))
  )
}

dbServer_advanced <- function(id, conn) {
  moduleServer(
      id,
      function(input, output, session) {
        ns <- session$ns
                
                
        db_album <- dplyr::tbl(conn, "Album")
        db_artist <- dplyr::tbl(conn, "Artist")
        
        output$artistSelector_UI <- shiny::renderUI(
          shiny::selectInput(ns('artist'),
           label = 'artist',
           choices = db_artist %>% select(Name) %>% collect())
        )

        Album <- eDT(
            id = "Album",
            data = db_album,
            in_place = TRUE,
            foreignTbls = reactive(
            {
             req(input$artist)
             selected <- input$artist     
             list(
              foreignTbl(
                x = db_album,
                y = db_artist %>% filter(Name == selected),
                by = 'ArtistId',
                naturalKey = 'Name'
              )
             )}),
             options = list(
               columnDefs = list(
                 list(visible=FALSE, targets=c("AlbumId","ArtistId"))
               )
             )

        )
        invisible()
      }
  )
}
shiny::shinyApp( 
    ui = dbUI_advanced('id'),
    server =  function(input, output,session){
      dbServer_advanced('id', conn)
    })

Above and beyond

Combining all the above opens up a whole lot of possibilities. Get creative and enjoy!