Thursday, 29 January 2015

Problem with parallelizing an SQL query in R

I am new to R and am working on an academic project in which I need to extract some data from a database. The code works when run normally (without parallelizing), but when I parallelize it, it shows an error. For each ID I want to get the result of the query and check whether the value is above 100; if so, I must store the result in a data frame, otherwise I skip to the next ID.



while (nflights < number_of_flights) {
  # Build the query for the current flight ID
  query <- paste("SELECT longitude AS lon, latitude AS lat,",
                 "CONVERT(serial, UNSIGNED) AS serial,",
                 nflights, "AS flight",
                 "FROM ***** m, ***** p",
                 "WHERE m.id=p.id",
                 "AND serial=******",
                 "AND flight=", startID,
                 "ORDER BY m.time")
  positions <- select(query)  # select() is a function I wrote for extracting data
  startID <- startID + 1

  # not enough positions, try next
  if (nrow(positions) < positions_per_flight) next

  # ... (rest of the loop body not shown)
}


Parallelized version of the query:



for (i in seq(nflights, number_of_flights, by=increment)) {

  max_i <- if (i+increment-1 < number_of_flights) i+increment-1 else number_of_flights

  # One task per flight ID, from startID through startID + 10
  foreach(f=startID:(startID + 10), .combine='rbind', .inorder=FALSE,
          .multicombine=TRUE, .errorhandling="stop") %dopar% {

    # Each task opens its own database connection
    con <- dbConnect(MySQL(), user="****", password="********", dbname="******", host="**********")
    query <- paste("SELECT longitude AS lon, latitude AS lat,",
                   "CONVERT(serial, UNSIGNED) AS serial,",
                   nflights, "AS flight",
                   "FROM ****** m,****** p",
                   "WHERE m.id=p.id",
                   "AND serial=*****",
                   "AND flight=", f,
                   "ORDER BY m.time")

    positions <- dbGetQuery(con, query)
    dbDisconnect(con)

    # Keep only flights with enough positions
    if (nrow(positions) > positions_per_flight) {
      positions <- positions[idc,]  # idc is defined elsewhere (not shown)
      positions <- cbind(positions, data.frame(i=1:positions_per_flight))
    }
  }

}
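
One R detail that matters for the ID range passed to foreach: the `:` operator binds more tightly than `+`, so a range written as `a:a + 10` is a single value, and parentheses are needed to get the eleven values from `a` to `a + 10`. A small illustration, where `start_id` is a made-up value rather than anything from the post:

```r
# ':' has higher precedence than '+', so the addition happens after the range is built.
start_id <- 5L                       # hypothetical value, for illustration only

narrow <- start_id:start_id + 10     # parsed as (5:5) + 10, a single value
wide   <- start_id:(start_id + 10)   # the intended range 5, 6, ..., 15

length(narrow)
length(wide)
```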


I am getting the error "Error in { : task 1 failed - "could not find function "dbConnect""", and the same happens with dbGetQuery too.


I installed and loaded all the packages, and it works when I run it serially!
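
A likely explanation, offered as a sketch rather than a confirmed diagnosis: when %dopar% runs on a cluster of separate R processes, each worker starts as a fresh session, so packages loaded in the main session are not attached there. foreach has a `.packages` argument that loads the named packages on every worker before the loop body runs. The outline below assumes the foreach, doParallel, DBI and RMySQL packages are installed; `fetch_flights_parallel`, `make_query`, `db_args` and `min_rows` are hypothetical names introduced for illustration and do not come from the post.

```r
# Sketch only: run one query per flight ID on parallel workers.
# make_query: function(id) returning the SQL string for that flight ID (hypothetical).
# db_args:    named list of connection arguments, e.g. user, password, dbname, host.
# min_rows:   flights with no more than this many rows are skipped.
fetch_flights_parallel <- function(ids, make_query, db_args, min_rows, workers = 2) {
  # Bring %dopar% into scope without attaching foreach in the main session.
  `%dopar%` <- foreach::`%dopar%`

  # Start the workers; each one is a separate R process with no extra packages attached.
  cl <- parallel::makeCluster(workers)
  on.exit(parallel::stopCluster(cl), add = TRUE)
  doParallel::registerDoParallel(cl)

  foreach::foreach(
    f = ids,
    .combine = rbind,
    .inorder = FALSE,
    .errorhandling = "stop",
    .packages = c("DBI", "RMySQL")  # loaded on every worker before the body runs
  ) %dopar% {
    # A connection cannot be shared across processes, so each task opens its own.
    con <- do.call(dbConnect, c(list(MySQL()), db_args))
    tryCatch({
      positions <- dbGetQuery(con, make_query(f))
      # Return the rows for flights with enough positions; NULL is dropped by rbind.
      if (nrow(positions) > min_rows) positions else NULL
    }, finally = dbDisconnect(con))  # always close the connection, even on error
  }
}
```

Under these assumptions it would be called with the flight IDs, a function that builds the query for one ID, and the connection settings, and it returns the combined rows of all flights that passed the row-count check. Opening one connection per task keeps the sketch simple; with many short queries, one connection per worker might be preferable.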


Any tips?


Thanks in advance.

