jeudi 12 février 2015

Forcing an Entity Framework 6 query to use the correct index

I have a C# application that uses SQLite as the database and the SQLite Entity Framework 6 provider to generate queries based on user input.


The database contains the following tables and indexes:



CREATE TABLE Lists (
ListRowId INTEGER NOT NULL PRIMARY KEY,
ListId GUID NOT NULL,
ListName TEXT NOT NULL
);

CREATE UNIQUE INDEX [IX_Lists_ListId] ON [Lists] ( [ListId] );

-- One to many relationship: Lists => ListDetails
CREATE TABLE ListDetails (
ListDetailRowId INTEGER NOT NULL PRIMARY KEY,
ListDetailId GUID NOT NULL,
ListId GUID NOT NULL,
Plate TEXT
);

CREATE INDEX [IX_ListDetails_Plate] ON [ListDetails] ( [Plate] ASC );

CREATE TABLE Reads (
ReadRowId INTEGER NOT NULL PPRIMARY KEY,
ReadId GUID NOT NULL,
Plate TEXT
);

-- 1 To many relationship: Reads => Alarms.
-- There may be rows in Reads that have no related rows in Alarms.
CREATE TABLE Alarms (
AlarmRowId INTEGER NOT NULL PPRIMARY KEY,
AlarmId GUID NOT NULL,
ListId GUID NOT NULL,
ListDetailId GUID NOT NULL,
ReadRowId INTEGER NOT NULL
);

CREATE INDEX [IX_Alarms_ListId_ListDetailId] ON [Alarms] ([ListId], [ListDetailId]);

CREATE INDEX [IX_Alarms_ReadId] ON [Alarms] ([ReadRowId]);


Please note that the DDL above only includes the relevant columns and indexes. For reasons of speed and the large number of rows in the ListDetails table, there is no index on the ListDetailId GUID column; nor can I create one. In fact, I cannot change the database's schema at all.


The database does not have any foreign key relationships defined between any of these tables. The reason is internal to our system. I repeat, I cannot change the schema.


Using the SQLite EF6 provider, I've built an entity model from the database. It is a database first model as the application was originally written using a different database and EF 4. We upgraded it to EF 6 and replaced the database with SQLite.


While processing user input, I have to put together a query that joins these tables. Here's the basic EF expression I've built.



from read in context.Reads
join alrm in context.Alarms on read.ReadRowId equals alrm.ReadRowId into alarmJoin
from alarm in alarmJoin.DefaultIfEmpty()
join e in context.ListDetails on alarm.ListPlate equals e.Plate into entryJoin
from entry in entryJoin.DefaultIfEmpty()
join l in context.Lists on alarm.ListId equals l.ListId into listJoin
from list in listJoin.DefaultIfEmpty()
where alarm.ListDetailId = entry.ListDetailId
select new {
alarm,
list.ListName,
read
};


I've used the debugger to take that expression and generate the SQL. I've reduced the output for brevity, as the only part I'm interested in are the join on the ListDetails table:



SELECT *
FROM [Reads] AS [Extent1]
LEFT OUTER JOIN [Alarms] AS [Extent2] ON [Extent1].[ReadRowId] = [Extent2].[ReadRowId]
LEFT OUTER JOIN [ListDetails] AS [Extent3] ON ([Extent2].[ListPlate] = [Extent3].[Plate]) OR (([Extent2].[ListPlate] IS NULL) AND ([Extent3].[Plate] IS NULL))
LEFT OUTER JOIN [Lists] AS [Extent4] ON [Extent2].[ListId] = [Extent4].[ListId]
WHERE ([Extent2].[ListDetailId] = [Extent3].[ListDetailId]) OR (([Extent2].[ListDetailId] IS NULL) AND ([Extent3].[ListDetailId] IS NULL))


Executing EXPLAIN QUERY PLAN on this shows that the query will perform a table scan of the ListDetails table. I do not want that to happen; I want the query to use the index on the Plate column.


If I remove the where clause, the SQL that's generated is different:



SELECT *
FROM [Reads] AS [Extent1]
LEFT OUTER JOIN [Alarms] AS [Extent2] ON [Extent1].[ReadRowId] = [Extent2].[ReadRowId]
LEFT OUTER JOIN [ListDetails] AS [Extent3] ON ([Extent2].[ListPlate] = [Extent3].[Plate]) OR (([Extent2].[ListPlate] IS NULL) AND ([Extent3].[Plate] IS NULL))
LEFT OUTER JOIN [Lists] AS [Extent4] ON [Extent2].[ListId] = [Extent4].[ListId]


EXPLAIN QUERY PLAN on this query shows that the database does indeed use the index on the ListDetails table's Plate column. This is what I want to happen. But, there may be multiple rows in the ListDetails table that have the same Plate; it is not a unique field. I need to return the one and only row that matches the information available to me in the Alarms table.


How do I make my query use the index on the Plate column?


Aucun commentaire:

Enregistrer un commentaire