Consider there is a table of job runs history with the following schema:
job_runs
(
run_id integer not null, -- identifier of the run
job_id integer not null, -- identifier of the job
run_number integer not null, -- job run number, run numbers increment for each job
status text not null, -- status of the run (running, completed, killed, ...)
primary key (run_id)
-- ...
)
and it is required to get the last 10 runs with status != 'running'
for each job (jobs differ by job_id
). To do that I wrote the following query:
SELECT
*
FROM
job_runs AS JR1
WHERE
JR1.run_number IN
(
SELECT
JR2.run_number
FROM
job_runs AS JR2
WHERE
JR2.job_id = JR1.job_id
AND
JR2.status != 'running'
ORDER BY
JR2.run_number
DESC
LIMIT
10
)
It do what I need, but even though there is a multifield index on the job_id
and run_num
fields of the job_runs
table the query is slow, because it scans job_runs table and for each its row runs subquery. The index helps subqueries to run fast each time, but the fact that the nester query scans entire table kills performance. So how can I tune performance of the query?
some thoughts:
Number of jobs (different job_id
s) is small and if there were a FOR loop in SQLite it would be easy to loop over all distinct job_id
s and run the subquery passing the job id instead of JR1.job_id
then UNION all results.
important:
Please don't suggest to run the loop inside the source code of my application. I need pure SQL solution.
Aucun commentaire:
Enregistrer un commentaire