Chapter 5 Embedding R Scripts into Swift - Part 1

Even with some Swift helpers, using R’s C interface for all R operations can be a bit cumbersome. If your needs are “more R” vs “more Swift” then using R’s C function to parse and evaluate R code might be a better alternative for you.

5.1 “Hello, World!”, Parsed

Using the previously saved project (we’ll be making use of some of the SEXP Swift helpers), make a copy and rename the project to scriptr. You can remove the {daybreak}-related source file.

Add the following to swift-r-bridge.h:

#include <R_ext/Parse.h>

and, replace the contents of main.swift with the following:

import Foundation

initEmbeddedR()

// if we care about the returned SEXP use a variable instead of _
let _ = R_ParseEvalString("""
cat("Hello, World!\n")
""", R_GlobalEnv)

Rf_endEmbeddedR(0)

Build and run that to get the age-old starter greeting.

The """ marks the start of a multi-line string in Swift, which is handy since we don’t have to escape double quotes if we use them. R_ParseEvalString requires a fairly recent version of R, but I’m a fan of folks keeping current, so if this isn’t working for you, upgrade! (Or, more seriously, just use the technique we’re about to cover).
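One subtlety is worth seeing in isolation: Swift still processes escape sequences inside a multi-line literal, so the \n in the example reaches R as a real newline character inside the R string, not as a backslash followed by an n. R copes with either form in this case, but the difference can matter for things like regular expressions. The sketch below (the variable names are mine, and no R is involved) contrasts a regular multi-line literal with a raw one:

```swift
import Foundation

// A regular multi-line literal: Swift turns \n into a newline character
// before R ever sees the code.
let cooked = """
cat("Hello, World!\n")
"""

// A raw multi-line literal (#""" ... """#): the backslash and the n are kept
// as two separate characters, so R's own parser handles the escape.
let raw = #"""
cat("Hello, World!\n")
"""#

print(cooked.contains("\n"))  // is there a newline character in the text?
print(raw.contains("\\n"))    // is there a backslash followed by n?
```

If you find yourself doubling backslashes to get them through to R, the raw form is probably the one you want.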

This is a pretty dangerous function. If there’s a parse error, you’re looking at a fatal command line app error (which may be the desired functionality, but it makes it harder to give the user helpful information on what happened). To see that in action, leave the trailing ) off of the cat call and re-run the product.

5.2 “Hello, World!” Parsed and Safely (Silently) Evaluated

We can do better than having the application crash and burn by using a combination of R_ParseVector() and R_tryEvalSilent(), but we’re going to create some Swift helpers to make working with SEXP character vectors a bit less painful, first.

The C side of R character vectors requires allocating space for a STRSXP SEXP and setting string elements in each of the allotted spaces. In Swift, it looks like:

let out: SEXP = RPROTECT(Rf_allocVector(SEXPTYPE(STRSXP), 5))
SET_STRING_ELT(out, 0, Rf_mkChar("this"))
SET_STRING_ELT(out, 1, Rf_mkChar("is"))
SET_STRING_ELT(out, 2, Rf_mkChar("a"))
SET_STRING_ELT(out, 3, Rf_mkChar("character"))
SET_STRING_ELT(out, 4, Rf_mkChar("vector"))
RUNPROTECT(1)

Ideally, most Swift programmers would prefer to do something like:

[ "this", "is", "a", "character", "vector" ].SEXP

(whether you prefer it to look like a function call or computed property, or be a verb or not is somewhat more a personal/team style call than anything else).

Thankfully, we can wire up some String Array helpers to make getting SEXPs to/from Swift much easier.

First, place this C bridge helper into swift-r-bridge.c:

inline const char *RVecElToCstr(SEXP x, R_xlen_t i) {
  return(CHAR(STRING_ELT(x, i)));
}

(remember to add the associated function declaration to the .h file).

Now, place the following in sexp-utils.swift (the code block is self-annotated to avoid extraneous exposition).

extension Array where Element == String {
  
  // this initializes a new String array from a SEXP character vector.
  // The init() has to be init?() - an Optional - since the passed-in SEXP might
  // not be a character vector.
  //
  // we reserve space for all of the strings and then convert each SEXP 
  // into a String and add them to the array.
  
  init?(_ sexp: SEXP) {
    if (SEXPTYPE(TYPEOF(sexp)) == STRSXP) {
      var val : [String] = [String]()
      val.reserveCapacity(Int(Rf_length(sexp)))
      for idx in 0..<Rf_length(sexp) {
        val.append(String(cString: RVecElToCstr(sexp, R_xlen_t(idx))))
      }
      self = val
    } else {
      return(nil)
    }
  }
  
  // to get a SEXP character vector back out, we create the proper sized STRSXP
  // vector and iterate over the String array and add each element to the 
  // STRSXP vector. 
  
  var SEXP: SEXP? {
    let charVec = RPROTECT(Rf_allocVector(SEXPTYPE(STRSXP), count))
    defer { RUNPROTECT(1) }
    for (idx, elem) in enumerated() { SET_STRING_ELT(charVec, idx, Rf_mkChar(elem)) }
    return(charVec)
  }
  
  // this is just a convenience shortcut to get a PROTECTed version of the ^^ SEXP
  
  var protectedSEXP: SEXP? {
    return(RPROTECT(SEXP))
  }
  
}
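If the pointer juggling in that extension feels opaque, it may help to see the same round trip without R in the picture. In this sketch strdup() stands in for Rf_mkChar() and a plain array of C string pointers stands in for the STRSXP; the function names are hypothetical, and unlike R's garbage-collected CHARSXPs we have to free the copies ourselves:

```swift
import Foundation

// Stand-in for Rf_mkChar(): copy each Swift String into a C string we own.
// (Illustration only; R allocates and garbage-collects its own CHARSXPs.)
func makeCStrings(_ strings: [String]) -> [UnsafeMutablePointer<CChar>] {
  strings.map { strdup($0)! }
}

// Stand-in for the init?(_ sexp:) loop: reserve space, then read each
// C string back into a Swift String (String(cString:) copies the bytes).
func readCStrings(_ cStrings: [UnsafeMutablePointer<CChar>]) -> [String] {
  var out = [String]()
  out.reserveCapacity(cStrings.count)
  for ptr in cStrings { out.append(String(cString: ptr)) }
  return out
}

let original = ["this", "is", "a", "character", "vector"]
let cCopies = makeCStrings(original)
let roundTripped = readCStrings(cCopies)

// The Swift Strings are independent copies, so the C memory can go now.
cCopies.forEach { free($0) }

print(roundTripped == original)
```

The point to take away is that String(cString:) copies the bytes, so the resulting Swift array does not depend on the lifetime of the underlying C (or R) memory.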

We can test that out by placing the following right before the Rf_endEmbeddedR(0) line:

// this is now automagically converted to a PROTECTED STRSXP vector
let vectorToParse = ["""
cat("Hello, again, World!\n")
"""].protectedSEXP
defer { RUNPROTECT(1) }

// same as the example in the previous chapters
var status: ParseStatus = PARSE_OK
let parsed = RPROTECT(R_ParseVector(vectorToParse, -1, &status, R_NilValue))
defer { RUNPROTECT(1)}

// there can be multiple parsed expressions, so we need to iterate over
// them just in case and then evaluate each one.

if (status == PARSE_OK) {
  var rErr: CInt = 0
  for idx in 0..<Rf_length(parsed) {
    let _ = R_tryEvalSilent(VECTOR_ELT(parsed, R_xlen_t(idx)), R_GlobalEnv, &rErr)
    if (rErr != 0) {
      debugPrint("Eval failure: \(String(cString: R_curErrorBuf()))")
    }
  }
} else {
  debugPrint("Parse failure")
}

After building and running this you should see both “Hello” lines.
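A bare “Parse failure” is not much to go on. R_ParseVector() reports more than ok/not-ok through its status argument, and a small lookup can turn that into something readable. The following is only a sketch: ParseOutcome and describeParseStatus are hypothetical names, the enum is a hand-written mirror (not the imported C type), and the raw values assume the declaration order in R_ext/Parse.h starts with PARSE_NULL at zero, so check them against your own R headers.

```swift
// Hypothetical Swift mirror of R's ParseStatus. The case order is assumed to
// follow R_ext/Parse.h (PARSE_NULL first); verify against your R headers.
enum ParseOutcome: UInt32 {
  case null = 0, ok, incomplete, error, eof

  var explanation: String {
    switch self {
    case .null: return "nothing was parsed"
    case .ok: return "parsed cleanly"
    case .incomplete: return "input ended before the expression was complete"
    case .error: return "syntax error"
    case .eof: return "end of input reached"
    }
  }
}

// Turn a raw status code into a message, tolerating unknown values.
func describeParseStatus(_ rawValue: UInt32) -> String {
  guard let outcome = ParseOutcome(rawValue: rawValue) else {
    return "unrecognised parse status \(rawValue)"
  }
  return "Parse status \(rawValue): \(outcome.explanation)"
}

print(describeParseStatus(2))
```

In the real code you would pass status.rawValue instead of a literal.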

5.2.1 Staying Safe and DRY

While that is a safer parse/eval idiom, we really don’t want to type that every time we need to parse/eval some R code. Plus, we likely want to keep the results of the parsing to use back in Swift. Let’s add a function to r-utils.swift which will wrap all this desired functionality:

// take in a String Array of R code and return an Array of parsed/evaluated SEXPs

func safeSilentEvalParse(_ vec: [String]) -> [SEXP] {
  
  let vectorToParse = vec.protectedSEXP
  defer { RUNPROTECT(1)}
  var status: ParseStatus = PARSE_OK
  let parsed = RPROTECT(R_ParseVector(vectorToParse, -1, &status, R_NilValue))
  defer { RUNPROTECT(1)}

  if (status == PARSE_OK) {
    var rErr: CInt = 0
    var res: [SEXP] = [SEXP]()
    res.reserveCapacity(Int(Rf_length(parsed)))
    for idx in 0..<Rf_length(parsed) {
      let evaluated = R_tryEvalSilent(VECTOR_ELT(parsed, R_xlen_t(idx)), R_GlobalEnv, &rErr)
      if (rErr != 0) {
        res.append(R_NilValue)
        R_ShowMessage("Eval failure: \(String(cString: R_curErrorBuf()))") // see Up Next for a TODO for you, here
      } else {
        res.append(RPROTECT(evaluated))
      }
    }
    return(res)
  } else {
    R_ShowMessage("Parse failure \(status.rawValue)") // see Up Next for a TODO for you, here
    return([R_NilValue])
  }
}

Now, the previous parsing code section can be reduced to:

let res = safeSilentEvalParse(["""
cat("Hello, again, World!\n")
"""])

Except, we have one (not so) small problem. We had to protect each entry in the Swift SEXP Array because if we do not, R’s garbage collector could abscond with them. We need a way to UNPROTECT all of the non-NULL elements from that array, so add the following to sexp-utils.swift:

extension Array where Element == SEXP {
  func UNPROTECT() {
    RUNPROTECT(CInt(filter { $0 != R_NilValue }.count))
  }
}

Now we just need to call or defer res.UNPROTECT() when we’re done with the values.
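One thing to keep in mind with this approach: R’s protection mechanism is stack-based, so UNPROTECT(n) releases the n most recently protected objects, not particular ones. The toy model below (plain Swift, no R; ProtectStack and the labels are invented for illustration) makes the ordering visible. It may be worth running before relying on deferred RUNPROTECT calls inside a function that also leaves results protected.

```swift
// A toy model of R's pointer protection stack. Labels stand in for SEXPs.
struct ProtectStack {
  private(set) var items: [String] = []

  // PROTECT pushes onto the top of the stack.
  mutating func protect(_ label: String) { items.append(label) }

  // UNPROTECT(n) pops the n most recently protected items.
  mutating func unprotect(_ n: Int) { items.removeLast(n) }
}

var stack = ProtectStack()
stack.protect("vectorToParse")
stack.protect("parsed")
stack.protect("result1")
stack.protect("result2")
stack.protect("result3")

// Two deferred RUNPROTECT(1) calls firing at function exit:
stack.unprotect(1)
stack.unprotect(1)

print(stack.items)
```

In this model the two pops remove result3 and result2, leaving vectorToParse and parsed on the stack. If the results need to outlive the function regardless of stack order, R_PreserveObject() and R_ReleaseObject() are the usual alternative.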

5.3 Covering All the Base(s) Types

Let’s say we had the following bits of R code to evaluate with our new, spiffy function:

let vecToParse = [
  "'Hello, yet again, World!'",
  "2020 + 1", "2020L + 1L",
  "FALSE",
  "head(letters, 10)",
  "ChickWeight$weight[1:10]",
  "as.integer(ChickWeight$weight[1:10])",
  "head(letters) == 'b'",
  "mtcars",
  "'Goodbye, World!'"
]

We’ll have ten results from that if we pass vecToParse to safeSilentEvalParse, and not all of them will be length 1 core types (which we have SEXP helpers for already).

To handle them, we’ll need to make similar Array extensions for the core types (like we did for String<->STRSXP). Add these to sexp-utils.swift:

extension Array where Element == Bool {
  init?(_ sexp: SEXP) {
    if (SEXPTYPE(TYPEOF(sexp)) == LGLSXP) {
      var val : [Bool] = [Bool]()
      val.reserveCapacity(Int(Rf_length(sexp)))
      for idx in 0..<Rf_length(sexp) {
        val.append(LOGICAL(sexp)[Int(idx)] == 1)
      }
      self = val
    } else {
      return(nil)
    }
  }
  var SEXP: SEXP? {
    let logicalVec = RPROTECT(Rf_allocVector(SEXPTYPE(LGLSXP), count))
    defer { RUNPROTECT(1) }
    for (idx, elem) in enumerated() { LOGICAL(logicalVec)[idx] = (elem ? 1 : 0) }
    return(logicalVec)
  }
  var protectedSEXP: SEXP? {
    return(RPROTECT(SEXP))
  }
}

extension Array where Element == Double {
  init?(_ sexp: SEXP) {
    if (SEXPTYPE(TYPEOF(sexp)) == REALSXP) {
      var val : [Double] = [Double]()
      val.reserveCapacity(Int(Rf_length(sexp)))
      for idx in 0..<Rf_length(sexp) {
        val.append(Double(REAL(sexp)[Int(idx)]))
      }
      self = val
    } else {
      return(nil)
    }
  }
  var SEXP: SEXP? {
    let realVec = RPROTECT(Rf_allocVector(SEXPTYPE(REALSXP), count))
    defer { RUNPROTECT(1) }
    for (idx, elem) in enumerated() { REAL(realVec)[idx] = elem }
    return(realVec)
  }
  var protectedSEXP: SEXP? {
    return(RPROTECT(SEXP))
  }
}

extension Array where Element == Int {
  init?(_ sexp: SEXP) {
    if (SEXPTYPE(TYPEOF(sexp)) == INTSXP) {
      var val : [Int] = [Int]()
      val.reserveCapacity(Int(Rf_length(sexp)))
      for idx in 0..<Rf_length(sexp) {
        val.append(Int(INTEGER(sexp)[Int(idx)]))
      }
      self = val
    } else {
      return(nil)
    }
  }
  var SEXP: SEXP? {
    let intVec = RPROTECT(Rf_allocVector(SEXPTYPE(INTSXP), count))
    defer { RUNPROTECT(1) }
    for (idx, elem) in enumerated() { INTEGER(intVec)[idx] = CInt(elem) }
    return(intVec)
  }
  var protectedSEXP: SEXP? {
    return(RPROTECT(SEXP))
  }
}
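These conversions do not distinguish R’s missing values. In R’s C API, NA for both logical and integer vectors is stored as the smallest 32-bit integer (INT_MIN), so the Bool initializer above quietly maps NA to false and the Int initializer yields a large negative number. If you need to preserve missingness, one option is to convert to optionals. Here is a sketch that works on plain CInt arrays (the function names are hypothetical, and no R is required to run it):

```swift
// R stores NA_LOGICAL and NA_INTEGER as INT_MIN; CInt.min is the Swift spelling.
let naInt: CInt = CInt.min

// Map raw LGLSXP payload values to Bool?, with nil standing in for NA.
func optionalBools(_ raw: [CInt]) -> [Bool?] {
  raw.map { $0 == naInt ? nil : ($0 != 0) }
}

// Map raw INTSXP payload values to Int?, with nil standing in for NA.
func optionalInts(_ raw: [CInt]) -> [Int?] {
  raw.map { $0 == naInt ? nil : Int($0) }
}

let rawLogicals: [CInt] = [1, 0, naInt, 1]
print(optionalBools(rawLogicals))
```

Doubles are a different story (NA there is a special NaN payload), so they would need their own check.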

Now we can test those out with the above example array of mixed R expressions. Put the following before the Rf_endEmbeddedR(0) call:

let vecToParse = [
  "'Hello, yet again, World!'",
  "2020 + 1", "2020L + 1L",
  "FALSE",
  "head(letters, 10)",
  "ChickWeight$weight[1:10]",
  "as.integer(ChickWeight$weight[1:10])",
  "head(letters) == 'b'",
  "mtcars",
  "'Goodbye, World!'"
]

let parsedProtectedSEXPs = safeSilentEvalParse(vecToParse)

parsedProtectedSEXPs.forEach { sexp in
  if (sexp != R_NilValue) {
    if (SEXPTYPE(TYPEOF(sexp)) == STRSXP) {
      if (Rf_length(sexp) == 1) {
        print("Evaluated result: <chr> \(String(cString: CHAR_Rf_asChar(sexp)))")
      } else {
        print("Evaluated result: [<chr>] \([String](sexp)!)")
      }
    } else if (SEXPTYPE(TYPEOF(sexp)) == REALSXP) {
      if (Rf_length(sexp) == 1) {
        print("Evaluated result: <dbl> \(Double(sexp)!)")
      } else {
        print("Evaluated result: [<dbl>] \([Double](sexp)!)")
      }
    } else if (SEXPTYPE(TYPEOF(sexp)) == INTSXP) {
      if (Rf_length(sexp) == 1) {
        print("Evaluated result: <int> \(Int(sexp)!)")
      } else {
        print("Evaluated result: [<int>] \([Int](sexp)!)")
      }
    } else if (SEXPTYPE(TYPEOF(sexp)) == LGLSXP) {
      if (Rf_length(sexp) == 1) {
        print("Evaluated result: <lgl> \(Bool(sexp)!)")
      } else {
        print("Evaluated result: [<lgl>] \([Bool](sexp)!)")
      }
    } else {
      print("Evaluted result is of type <\(String(cString: Rf_type2char(SEXPTYPE(TYPEOF(sexp)))))> which we do not handle yet")
    }
  }
}

// remember, we need to unprotect all of ^^
parsedProtectedSEXPs.UNPROTECT()

If you build and run the product now, it should produce:

Hello, again, World!
Hello, again, World!
Evaluated result: <chr> Hello, yet again, World!
Evaluated result: <dbl> 2021.0
Evaluated result: <int> 2021
Evaluated result: <lgl> false
Evaluated result: [<chr>] ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]
Evaluated result: [<dbl>] [42.0, 51.0, 59.0, 64.0, 76.0, 93.0, 106.0, 125.0, 149.0, 171.0]
Evaluated result: [<int>] [42, 51, 59, 64, 76, 93, 106, 125, 149, 171]
Evaluated result: [<lgl>] [false, true, false, false, false, false]
Evaluted result is of type <list> which we do not handle yet
Evaluated result: <chr> Goodbye, World!

5.4 Who You Calling Ugly?

Well, I am calling some of the above still ugly. Lines like } else if (SEXPTYPE(TYPEOF(sexp)) == INTSXP) { are un-Swift-like and needlessly verbose.

While SEXP may be the foundation of all things C+R, that does not mean it can’t be semantically improved in Swift. After all, a SEXP is just a pointer to a C struct and we are fully able to extend it in Swift. We’ll do that now (in sexp-utils.swift) to make some convenience helpers:

extension SEXP {
  var length: R_len_t {
    Rf_length(self)
  }
  var count: R_len_t { // "count" is more Swift-like
    Rf_length(self)
  }
  var typeOf: SEXPTYPE {
    SEXPTYPE(TYPEOF(self))
  }
  var type: String {
    String(cString: Rf_type2char(SEXPTYPE(TYPEOF(self))))
  }
  var isLOGICAL: Bool {
    self.typeOf == LGLSXP
  }
  var isREAL: Bool {
    self.typeOf == REALSXP
  }
  var isINTEGER: Bool {
    self.typeOf == INTSXP
  }
  var isSTRING: Bool {
    self.typeOf == STRSXP
  }
  var isLIST: Bool {
    self.typeOf == VECSXP
  }
  var protectedSEXP: SEXP {
    RPROTECT(self)
  }
}

Now, we can replace the previous forEach block with the following:

parsedProtectedSEXPs.forEach { sexp in
  if (sexp != R_NilValue) {
    if (sexp.isSTRING) {
      if (sexp.count == 1) {
        print("Evaluated result: <chr> \(String(cString: CHAR_Rf_asChar(sexp)))")
      } else {
        print("Evaluated result: [<chr>] \([String](sexp)!)")
      }
    } else if (sexp.isREAL) {
      if (sexp.count == 1) {
        print("Evaluated result: <dbl> \(Double(sexp)!)")
      } else {
        print("Evaluated result: [<dbl>] \([Double](sexp)!)")
      }
    } else if (sexp.isINTEGER) {
      if (sexp.count == 1) {
        print("Evaluated result: <int> \(Int(sexp)!)")
      } else {
        print("Evaluated result: [<int>] \([Int](sexp)!)")
      }
    } else if (sexp.isLOGICAL) {
      if (sexp.count == 1) {
        print("Evaluated result: <lgl> \(Bool(sexp)!)")
      } else {
        print("Evaluated result: [<lgl>] \([Bool](sexp)!)")
      }
    } else {
      print("Evaluted result is of type <\(sexp.type)> which we do not handle yet")
    }
  }
}
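If you want to go a step further, the if/else ladder could give way to a Swift enum with associated values and a switch. This is only a sketch of the idea: RValue and its cases are hypothetical, nothing here touches R, and in real code you would build the enum from a SEXP using the helpers above.

```swift
// Hypothetical Swift-side representation of an evaluated R result.
enum RValue {
  case character([String])
  case double([Double])
  case integer([Int])
  case logical([Bool])
  case unsupported(typeName: String)

  // Build the same style of "Evaluated result" text as the forEach block.
  var summary: String {
    switch self {
    case .character(let v): return Self.format("chr", v)
    case .double(let v): return Self.format("dbl", v)
    case .integer(let v): return Self.format("int", v)
    case .logical(let v): return Self.format("lgl", v)
    case .unsupported(let typeName): return "Unhandled R type <\(typeName)>"
    }
  }

  // Length-1 vectors print as scalars, everything else as an array.
  private static func format<T>(_ tag: String, _ values: [T]) -> String {
    values.count == 1
      ? "Evaluated result: <\(tag)> \(values[0])"
      : "Evaluated result: [<\(tag)>] \(values)"
  }
}

let samples: [RValue] = [.double([2021.0]), .character(["a", "b"]), .unsupported(typeName: "list")]
samples.forEach { print($0.summary) }
```

The payoff is that the compiler will complain if a new case is added and the switch is not updated, which the ladder of ifs cannot do.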

5.5 Up Next

We’ll wrap a legit R script in the next chapter.

Before diving in, review the Swift code we used to go to/from Swift data structures to R data structures just to make sure you grok what’s going on.

Once you’ve done that, read back up and find where we marked code with “see Up Next for a TODO for you, here”. Showing a message (an R message(), no less) is hardly silent. Consider your needs and, perhaps, add a parameter to toggle messaging.
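One possible shape for that toggle is a small reporting helper with a verbose flag that defaults to off. Everything here is hypothetical (the report name, the verbose and sink parameters); in the real function the sink could forward to R_ShowMessage instead of print.

```swift
// Hypothetical reporting helper: stays silent unless verbose is true.
// The sink is injectable so callers (and tests) can capture messages.
func report(_ message: String,
            verbose: Bool = false,
            sink: (String) -> Void = { print($0) }) {
  guard verbose else { return }
  sink(message)
}

report("Eval failure: something went wrong")                 // silent by default
report("Eval failure: something went wrong", verbose: true)  // goes to the sink
```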

Since we’re still doing something dangerous (parsing and evaluating R code that may not actually work), also consider making the function throw a new RError (go back a few chapters to see how we added the first RError and figure out how to add one for general parse/eval errors).
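To get you started, here is a rough, R-free sketch of what the throwing side might look like. ScriptError and checkOutcome are stand-ins I made up for this illustration (your RError from the earlier chapter will look different), the parameters model the values R_ParseVector and R_curErrorBuf() would supply, and the value 1 is assumed to correspond to PARSE_OK.

```swift
// Hypothetical error type; adapt the names to match your existing RError.
enum ScriptError: Error, Equatable {
  case parseFailure(status: UInt32)
  case evalFailure(message: String)
}

// Stand-in for the parse/eval step. `parseStatus` models the ParseStatus raw
// value and `evalMessage` models a non-empty R error buffer (nil = no error).
func checkOutcome(parseStatus: UInt32, evalMessage: String?) throws {
  // 1 is assumed to correspond to PARSE_OK.
  guard parseStatus == 1 else {
    throw ScriptError.parseFailure(status: parseStatus)
  }
  if let message = evalMessage {
    throw ScriptError.evalFailure(message: message)
  }
}

do {
  try checkOutcome(parseStatus: 2, evalMessage: nil)
} catch let error as ScriptError {
  print("Caught: \(error)")
} catch {
  print("Unexpected error: \(error)")
}
```

Callers then decide what to do with a failure instead of having it printed (or swallowed) deep inside the helper.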

Remember, code examples can be found on GitHub.