samedi 26 mars 2016

(Entity framework) Group by - low performance

I have very peculiar problem with performance of Entity Framework. I use version 7 of the framework with SQLite provider (both from nuget). Database have around 10 millions of records but in the future there will be around 100 millions. The construction of db is very simple:

public class Sample
{
    public int SampleID { get; set; }
    public long Time { get; set; }
    public short Channel { get; set; } /* values from 0 to 8191, in the presented test 0-15 */
    public byte Events { get; set; } /* 1-255 */
}

public class Channel
{
    public int ChannelID { get; set; }
    public short Ch { get; set; }
    public int Es { get; set; }
}

public class MyContext : DbContext
{
    // This property defines the table
    public DbSet<Sample> Samples { get; set; }
    public DbSet<Channel> Spectrum { get; set; }

    // This method connects the context with the database
    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        var connectionStringBuilder = new SqliteConnectionStringBuilder { DataSource = "E://database.db" };
        var connectionString = connectionStringBuilder.ToString();
        var connection = new SqliteConnection(connectionString);

        optionsBuilder.UseSqlite(connection);
    }
}

I try to group events by channel and sum them up into something like spectrum. When I use linq2sql I have very low performance. For 10m of records the query takes about 15 minutes and get around 1 GB of RAM and then throws an OutOfMemoryException - I think that Entity Framework is loading all records as objects into memory - but why? On the other hand, simple SQL needs about 3 seconds and takes no significant amount of RAM.

        using (var db = new MyContext())
        {
            var res1 = from sample in db.Samples
                       group sample by sample.Channel into g
                       select new { Channel=g.Key, Events = g.Sum(s => s.Events) };
            res1.ToArray();

            var res2 = db.Natas.FromSql("SELECT Channel as ChannelID, Channel as Ch, SUM(Events) as Es FROM Sample GROUP BY Channel");
            var data = res2.ToArray();
        }

Any suggestions? Thank for help ;)

Aucun commentaire:

Enregistrer un commentaire