mercredi 28 janvier 2015

When filtering an SQLite database with dplyr, should `|` be avoided?

I ran into some unexpected behavior when filtering an SQLite database with dplyr in R. Here is an example of what I see:



library(dplyr)

# Some test data
df <- tbl_df(data.frame(
v1 = c('a', 'b', 'a', 'b'),
v2 = c('b', 'a', 'a', 'b'),
v3 = month.abb[1:4]))

db <- copy_to(src_sqlite('example.sqlite', create = TRUE), df)

filter(df, v1 == 'a' | v2 == 'a', v3 == 'Jan')
# Source: local data frame [1 x 3]
#
# v1 v2 v3
# 1 a b Jan

filter(db, v1 == 'a' | v2 == 'a', v3 == 'Jan')
# Source: sqlite 3.8.6 [example.sqlite]
# From: df [2 x 3]
# Filter: v1 == "a" | v2 == "a", v3 == "Jan"
#
# v1 v2 v3
# 1 a b Jan
# 2 a a Mar


I am very new to working with databases, so I feel like I'm missing something obvious here having to do with using | in a filter since this works as expected:



filter(db, v1 == 'a', v3 == 'Jan')
# Source: sqlite 3.8.6 [example.sqlite]
# From: df [1 x 3]
# Filter: v1 == "a", v3 == "Jan"
#
# v1 v2 v3
# 1 a b Jan


Should | be avoided when working in databases with dplyr?


Aucun commentaire:

Enregistrer un commentaire