I ran into some unexpected behavior when filtering an SQLite database with dplyr in R. Here is an example of what I see:
library(dplyr)
# Some test data
df <- tbl_df(data.frame(
v1 = c('a', 'b', 'a', 'b'),
v2 = c('b', 'a', 'a', 'b'),
v3 = month.abb[1:4]))
db <- copy_to(src_sqlite('example.sqlite', create = TRUE), df)
filter(df, v1 == 'a' | v2 == 'a', v3 == 'Jan')
# Source: local data frame [1 x 3]
#
# v1 v2 v3
# 1 a b Jan
filter(db, v1 == 'a' | v2 == 'a', v3 == 'Jan')
# Source: sqlite 3.8.6 [example.sqlite]
# From: df [2 x 3]
# Filter: v1 == "a" | v2 == "a", v3 == "Jan"
#
# v1 v2 v3
# 1 a b Jan
# 2 a a Mar
I am very new to working with databases, so I feel like I'm missing something obvious here having to do with using |
in a filter since this works as expected:
filter(db, v1 == 'a', v3 == 'Jan')
# Source: sqlite 3.8.6 [example.sqlite]
# From: df [1 x 3]
# Filter: v1 == "a", v3 == "Jan"
#
# v1 v2 v3
# 1 a b Jan
Should |
be avoided when working in databases with dplyr?
Aucun commentaire:
Enregistrer un commentaire