Categories
.NET

Entity Framework Core 3

I recently upgraded an .NET Core 2 project to .NET Core 3 and with that I also updated other packages, including EF Core. Now EF Core 3 has quite a list of breaking changes you can se them in the list below.

https://docs.microsoft.com/en-us/ef/core/what-is-new/ef-core-3.0/breaking-changes

The worst change for my application was one that is quit a bit down on the list and one that I did not see until I found the actual issue in my code.

Before 3.0, eagerly loading collection navigations via Include operators caused multiple queries to be generated on relational database, one for each related entity type.

Eager loading of related entities now happens in a single query

This does not sound to bad, nothing will stop working, but it will effect performance. If it effects performance depends on your query, some will go faster and some will go slower.

In my case I had an query that did not like this change at all. the query created an list of items. These items then had a bunch of data from other tables, created an 3d dimensional object. I Mocked up an example below on how the Linq Query looked like.

return context.Item
	.Select(i => new ResultItem()
	{
		ItemId   = i.ItemId,
		Siblings = i.parentNavigation.Items.Count(),
		Position = i.parentNavigation.Items.Count(p => p.Order < I.Order),
		tags = i.Tag.Select(t => new ResultTag {
			tag = t.tag
		})
	}).ToList();

In EF Core 2, this worked quite well. EF Core divided the code in two parts. First it did one query to fetch the items, count the siblings and position, and then it would do multiple queries to fetch the tags. So if I fetched 12 items. it would be 1 query to fetch the items and then 12 queries to fetch all the tags.

EF Core 3 however does one query, ONE BIG QUERY. In this case it resulted in some performance drop. In EF Core 2 it had a stable 250ms execution time, independent on how many tags each item had. In EF Core 3 it hovered around 650ms but could be slower if one item had more tags than usual.

Improvement 1

var items = context.Item
	.Select(i => new ResultItem()
	{
		ItemId   = i.ItemId,
                ParentId = i.ParentId,
                Order = i.Order
	}).ToList();

foreach (var item in items)
{
	item.Siblings = context.Item.Count(i => i.ParentId == item.ParentId);
	item.Position = context.Item.Count(i => i.ParentId == item.ParentId && i.Order < item.Order);
	item.tags = context.Tag.Where(t => t.ItemId = item.ItemId).Select(t => new ResultTag {
			tag = t.tag
		}).ToList();
}

First step was to separate the different queries. We simply fetched the item first and then looped over them to fetch the extra data needed. This resulted in quite some improvement, we decreased the time from around 650ms to 350ms. But it was still not fast enough.

The reason it still not as fast as before is because this code results in a lot of queries. If we fetch 12 items, it will result 1 query to fetch the items, then it will loop over all the items and do 3 separate queries for each item. Resulting in a total of 1+36 queries against the database.

Each query means another roundtrip to the database. And this is partly the reason why EF Core decided to make the change in EF core 3, to try to avoid as many roundtrips as possible.

Improvement 2

var items = context.Item
	.Select(i => new ResultItem()
	{
		ItemId   = i.ItemId,
                ParentId = i.ParentId,
                Order = i.Order
	}).ToList();

var itemIds = items.Select(c => c.ItemId);
var tags = context.Tag.Where(t => itemIds.Contains(t.ItemId)).Select(t => new ResultTag {
	ItemId = t.ItemId,
	Tag = t.Tag
}).ToList();

var parentIds = items.Select(i => i.ParentId);
var siblings = context.Item.Where(i => parentIds.Contains(i.parentId)).Select(i => new SiblingItem {
	ParentId = i.ParentId,
	Order = i.Order
}).ToList();

foreach (var item in items)
{
	item.Siblings = siblings.Count(i => i.ParentId == item.ParentItem);
	item.Position = siblings.Count(i => i.ParentId == item.ParentItem && i.Order < item.Order);
	item.tags = tags.Where(t => t.ItemId = item.ItemId);
}

Now we have complicated our code somewhat, but this was the fastest way of doing it. As previous example we first fetch all the items. Our next step is for all items we fetched, select their siblings in on query and select all their tags in another query. We can then loop over the items and filter out the result for each item. This resulted in a total of 3 queries being sent to the database and the execution time dropped to under 100ms.