lundi 2 novembre 2015

How to run loops in sqlite

Consider there is a table of job runs history with the following schema:

job_runs
(
    run_id integer not null, -- identifier of the run
    job_id integer not null, -- identifier of the job
    run_number integer not null, -- job run number, run numbers increment for each job
    status text not null, -- status of the run (running, completed, killed, ...)
    primary key (run_id)
    -- ...
)

and it is required to get the last 10 runs with status != 'running' for each job (jobs differ by job_id). To do that I wrote the following query:

SELECT
    *
FROM
    job_runs AS JR1
WHERE
    JR1.run_number IN
    (
        SELECT
            JR2.run_number
        FROM
            job_runs AS JR2
        WHERE
            JR2.job_id = JR1.job_id
            AND
            JR2.status != 'running'
        ORDER BY
            JR2.run_number
        DESC
        LIMIT
            10
    )

It do what I need, but even though there is a multifield index on the job_id and run_num fields of the job_runs table the query is slow, because it scans job_runs table and for each its row runs subquery. The index helps subqueries to run fast each time, but the fact that the nester query scans entire table kills performance. So how can I tune performance of the query?

some thoughts:

Number of jobs (different job_ids) is small and if there were a FOR loop in SQLite it would be easy to loop over all distinct job_ids and run the subquery passing the job id instead of JR1.job_id then UNION all results.

important:

Please don't suggest to run the loop inside the source code of my application. I need pure SQL solution.

Aucun commentaire:

Enregistrer un commentaire