I have a table with ~500 million rows and 49 columns of various data types in a SQLite database, with a composite index on the non-nullable columns user_id and order_id.
Multiple rows appear for the same user_id and order_id, one for each unique item within an order. For instance, an order might look like this (irrelevant columns excluded):
USER_ID    ORDER_ID      ORDER_TOTAL  ORDER_DATE  CATEGORY  ITEM_TOTAL
---------  ------------  -----------  ----------  --------  ----------
094f077b2  47f12c149808       330.41  2015-03-01  BEER           20.30
094f077b2  47f12c149808       330.41  2015-03-01  BEER           36.23
094f077b2  47f12c149808       330.41  2015-03-01  BEER           55.03
As you can see, items are missing from the order: the item_total values sum to 111.56, which does not equal the order_total of 330.41.
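To make the gap concrete, here is a minimal sketch using Python's sqlite3 module against an in-memory copy of the three rows above. The table name order_history comes from my queries; the in-memory database, the column types and the 0.005 tolerance are assumptions for illustration only.

```python
import sqlite3

# In-memory stand-in for the real table (column types are assumed).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE order_history (
        user_id     TEXT NOT NULL,
        order_id    TEXT NOT NULL,
        order_total REAL,
        order_date  TEXT,
        category    TEXT,
        item_total  REAL
    )
""")
rows = [
    ("094f077b2", "47f12c149808", 330.41, "2015-03-01", "BEER", 20.30),
    ("094f077b2", "47f12c149808", 330.41, "2015-03-01", "BEER", 36.23),
    ("094f077b2", "47f12c149808", 330.41, "2015-03-01", "BEER", 55.03),
]
conn.executemany("INSERT INTO order_history VALUES (?, ?, ?, ?, ?, ?)", rows)

# Orders whose item totals do not add up to the recorded order total.
# The tolerance avoids false positives from floating-point rounding.
gap_rows = conn.execute("""
    SELECT order_id, order_total, ROUND(SUM(item_total), 2) AS items_sum
    FROM order_history
    GROUP BY user_id, order_id, order_total
    HAVING ABS(SUM(item_total) - order_total) > 0.005
""").fetchall()

for order_id, order_total, items_sum in gap_rows:
    print(order_id, order_total, items_sum)
```

This is why summing item_total is not an option here and the totals have to come from order_total instead.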
I would like to extract the unique values of order_date, order_total and category for each order_id so that I can compute a valid category total by date.
I have come up with the two most obvious solutions:
SELECT category, order_date, SUM(order_total)
FROM (
    SELECT DISTINCT category, order_id, order_date, order_total
    FROM order_history
)
GROUP BY category, order_date
;
and
SELECT category, order_date, SUM(order_total)
FROM (
    SELECT category, order_id, order_date, AVG(order_total) AS order_total
    FROM order_history
    GROUP BY category, order_id, order_date
)
GROUP BY category, order_date
;
Here AVG() could equally be MIN() or MAX(), since order_total is the same on every row of an order. Is either of these preferred over the other, or is there a better way in general to achieve this?
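For anyone who wants to try the two approaches side by side, here is a minimal sketch in Python's sqlite3 module that runs both against an in-memory copy of the sample order and prints SQLite's query plan for each. The in-memory table and its column types are assumptions, the inner aggregate is aliased as order_total so the outer SUM() can refer to it, and a plan on three rows without the real composite index says nothing about behaviour at 500 million rows.

```python
import sqlite3

# In-memory stand-in for the real table (column types are assumed).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE order_history (
        user_id     TEXT NOT NULL,
        order_id    TEXT NOT NULL,
        order_total REAL,
        order_date  TEXT,
        category    TEXT,
        item_total  REAL
    )
""")
conn.executemany(
    "INSERT INTO order_history VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("094f077b2", "47f12c149808", 330.41, "2015-03-01", "BEER", 20.30),
        ("094f077b2", "47f12c149808", 330.41, "2015-03-01", "BEER", 36.23),
        ("094f077b2", "47f12c149808", 330.41, "2015-03-01", "BEER", 55.03),
    ],
)

DISTINCT_SQL = """
SELECT category, order_date, SUM(order_total)
FROM (
    SELECT DISTINCT category, order_id, order_date, order_total
    FROM order_history
)
GROUP BY category, order_date
"""

GROUPED_SQL = """
SELECT category, order_date, SUM(order_total)
FROM (
    SELECT category, order_id, order_date, AVG(order_total) AS order_total
    FROM order_history
    GROUP BY category, order_id, order_date
)
GROUP BY category, order_date
"""

distinct_result = conn.execute(DISTINCT_SQL).fetchall()
grouped_result = conn.execute(GROUPED_SQL).fetchall()
print("DISTINCT:", distinct_result)
print("GROUP BY:", grouped_result)

# The plan text varies between SQLite versions, so it is only printed here.
for label, sql in (("DISTINCT", DISTINCT_SQL), ("GROUP BY", GROUPED_SQL)):
    print(label, "plan:")
    for step in conn.execute("EXPLAIN QUERY PLAN " + sql):
        print("   ", step)
```

Both queries count the order once rather than three times. One difference I am aware of is that AVG() goes through floating-point arithmetic, whereas MIN() and MAX() return a stored value unchanged, so the AVG() variant may differ from the DISTINCT one in the last digits.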