Posts

Return Upserted Data

Image
There was an interesting issue recently opened. The user made several interesting decisions that to my view can be useful for all ( click to see ): Spatial types . Not only Postgres has a rich GIS/Spatial support, the driver offers dedicated types, to make the experience as smooth as possible. Batch insert with ON_CONFLICT clause - making the insertion pipeline very efficient. RETURNING clause. Since the ON_CONFLICT clause's action is DO_NOTHING , the clause will return all the INSERTed rows. I believe that it filters out a substantial part of objects, so it effectively reduces the amount of work that needs to be performed with the output (instead of the entire set).

Hierarchical Query in SQL with C# and Entity Framework Core

Image
Hierarchies are very common in our lives and therefore in the problems we need to solve. The most familiar cases are organizations, products built from parts and other products, file systems etc. When an hierarchy is stored in a database table, it's usually represented by rows referencing parent rows, recursively. Suppose we want to query all the employees (directly or indirectly) reporting to a mid-level manager. Can we do that in SQL? Of course there is a naive solution to select employees directly reporting to that manager, take their ids, perform the query again, etc. But clearly it's not efficient. Fortunately SQL has a standard support for hierarchical queries in a single and efficient query. The following example demonstrates an hierarchical query using C# and EF Core using ELINQ : int ? managerId = null ; //external parameter - "highest" manager id DbContext.Staffs.Query(() =...

SQL: SELECT DISTINCT on Some Columns

Image
In some cases we want to ensure uniqueness only on a subset of row columns. There are many different solutions proposed, some of them are simple and others are less. I didn't find a generic one. So let's state the problem first, then analyze it and define a generic solution. Problem DISTINCT is not a function, we cannot "apply" it on a subset of columns. Some folks try to substitute it with GROUP BY . But GROUP BY requires that all the SELECT ed columns will be either grouped by or aggregated. Let's assume we use a FIRST / MIN / MAX aggregate. In this case all the columns that are not grouped by will have same values. I can hardly believe this is desired. Analysis Let's understand better why the original DISTINCT approach does not work. We get too many rows, since all those not important columns are returned with their multiple values, therefore we have duplicates on the im...

MsSQL Table-Valued Functions - the forgotten $1M Feature

Image
One day I came by the following StackOverflow question: "When would you use a table-valued function?" Adding to what is replied there, I would say that this is a great platform to collaborate with DBAs, yet keep the responsibility boundaries: the developer "sees" a function that returns an entity. From the developer perspective it's absolutely "normal" entity, which can be joined or filtered. But since this is a function, it can receive parameters! The DBA may use them to SELECT from different tables or load data from file system. There is no limits to what can be done, keeping the integration simple and transparent. Of course ELINQ fully supports them.

Efficient Data Modification With Entity Framework Core

Image
The rule of dumb tells that for optimal performance we need to make as few database requests as possible. This is especially relevant for insert and update scenarios, where we sometimes need to work with thousands of objects. Sending those objects to the database one by one is usually significantly slower than in a batch. (Spoiler - interactive examples of what's discussed below). The number of possible cases is pretty small, but our options and best approach might differ significantly. So, the cases: INSERT UPDATE ( DELETE ) Upsert ( UPDATE or INSERT ) INSERT For this case we have 2 possible optimizations Insert multiple rows in a single statement ( batch insert ) INSERT INTO ... SELECT ... construct, which inserts data from another table ( bulk insert ) UPDATE UPDATE ... WHERE / UPDATE JOIN - so called bulk update . Pay attention, this is not the conventional update where we want one value to be repla...

A Window into Your Data

Image
Today I came across a very nice article explaining the benefits of SQL Window functions. When I was ... younger, I used to work days to implement features that can be coded in a single query with a Window function. They are doomed to oblivion by developers since ORMs massively do not support them. So folks pull data from the database and perform the calculation in memory. What a loss... But say no more to that with ELINQ Window functions support !

Entity Framework - High Performance Querying Using Temp Tables

Image
Today I came across a nice article explaining how we can optimize a query with multiple Include() of related entities. Usage of a temporary is nice, but the way it's created is not. There is absolutely no need to bring the Ids to the app server and then push them back to the database. The whole operation can be performed in the database using SELECT ... INTO construct: static void PrepareStagingCategories ( Product product ) { var temp = ToTable<Product>(TEMP_PRODUCT_IDS); // create temporary table with product ids SELECT<Product>(product.ProductId.@ as ()).INTO(temp); FROM(product); WHERE(product.Name.Contains( "Chocolate" )); }