Using R: tibbles and the t.test function

A participant in the R course I’m teaching showed me a case where a tbl_df (the new flavour of data frame provided by the tibble package; standard in new RStudio versions) interacts badly with the t.test function. I had not seen this happen before. The reason is this:

Interacting with legacy code
A handful of functions are don’t work with tibbles because they expect df[, 1] to return a vector, not a data frame. If you encounter one of these functions, use as.data.frame() to turn a tibble back to a data frame (tibble announcement on RStudio blog)

Here is code that reproduces the situation (tibble version 1.2):

data(chickwts)
chick_tibble <- as_tibble(chickwts)
casein <- subset(chickwts, feed == "casein")
sunflower <- subset(chick_tibble, feed == "sunflower")
t.test(sunflower$weight, casein$weight) ## this works
t.test(as.data.frame(sunflower[, 1]), as.data.frame(casein[, 1])) ## this works too
t.test(sunflower[, 1], casein[, 1]) ## this doesn't

Error: Unsupported use of matrix or array for column indexing

I did not know that. The solution, which they found themselves, is to use as.data.frame.

I can see why not dropping to a vector makes sense. I’m sure you’ve at some point expected a data frame and got an ”$ operator is invalid for atomic vectors”. But it’s an unfortunate fact that number of weird little thingamajigs to remember is always strictly increasing as the language evolves. And it’s a bit annoying that the standard RStudio setup breaks code that uses an old stats function, even if it’s in somewhat non-obvious way.