Wednesday, February 5, 2020

EntityFramework Core and Cosmos DB


Today I want to show you how EntityFramework Core 3.1 works with Cosmos DB. You have probably used EntityFramework Core or EF 6 with relational databases, and now Microsoft has finally added support for NoSQL databases.
Azure Cosmos DB is Microsoft's globally distributed, multi-model database service. With a click of a button, Cosmos DB enables you to elastically and independently scale throughput and storage across any number of Azure regions worldwide. You can also take advantage of fast, single-digit-millisecond data access using your favorite API including SQL, MongoDB, Cassandra, Tables, or Gremlin. Cosmos DB provides comprehensive service level agreements (SLAs) for throughput, latency, availability, and consistency guarantees, something no other database service offers.
So, let’s create a new Console App (.NET Core) project in Visual Studio 2019 and name the solution “EntityFrameworkCosmosSample”.
Click the “Create” button and update the Main function as shown below.

class Program
{
    static async Task Main(string[] args)
    {
       
    }
}

We use async Task instead of void because we are going to use EF Core asynchronous methods.
Once this is done, open https://portal.azure.com/, sign in with your credentials, and type “cosmos” in the search field.
The first item in the results is what you need.
After that, click the “Add” button to create a new Azure Cosmos DB account.
You can take a look at how the data was set up for the test account.
When you have filled in all the required fields, press “Review + create”. After a few minutes of deployment, you will see a message that it finished successfully.
Now open the “EntityFrameworkCosmosSample” C# solution and add the Student and StudentDetail classes to it.

public class Student
{
    public int Id { get; set; }
    public int? TrackingNumber { get; set; }
    public string PartitionKey { get; set; }
    public string Name { get; set; }
    public DateTime DateOfBirth { get; set; }
    public StudentDetail StudentDetail { get; set; }
}

public class StudentDetail
{
    public decimal Height { get; set; }
    public float Weight { get; set; }
}

Before adding the StudentContext class with DbSet<Student>, we need to install the Microsoft.EntityFrameworkCore.Cosmos package. Open the NuGet package manager, enter the package name in the search field, and click the “Install” button.
Once the Microsoft.EntityFrameworkCore.Cosmos package is installed, we can use the DbContext class from EF Core. So, as the next step, we will add a new class called StudentContext as shown below.

public class StudentContext : DbContext
{
    public DbSet<Student> Students { get; set; }
}

After that, we need to override the OnModelCreating method in StudentContext to configure how the model is mapped.

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    #region Container
    modelBuilder.Entity<Student>()
        .ToContainer("Students");
    #endregion

    #region PartitionKey
    modelBuilder.Entity<Student>()
        .HasPartitionKey(o => o.PartitionKey);
    #endregion

    #region PropertyNames
    modelBuilder.Entity<Student>().OwnsOne(
        o => o.StudentDetail,
        sa =>
        {
            sa.ToJsonProperty("Detail");
        });
    #endregion

    modelBuilder.Entity<Student>()
        .HasNoDiscriminator();
}

In this example, Student is a simple entity with a reference to the owned type StudentDetail. Let’s look more closely at what happens in the OnModelCreating method.

Cosmos-specific model customization

Because we are working with Cosmos DB, we must always keep the existing limitations in mind:
  • Even if there is only one entity type without inheritance mapped to a container it still has a discriminator property
  • Entity types with partition keys don't work correctly in some scenarios
  • Include calls are not supported
  • Join calls are not supported

These limitations described above are temporary and should be fixed soon. Many of them are the result of limitations in the underlying Cosmos database engine and are not specific to EF. Another issue which we have right now is that a lot of functionality has not been implemented yet. Azure Cosmos DB SDK supports only async methods, that’s the reason why only async methods are provided in EF Core Cosmos.

Since there are no sync versions of the low-level methods EF Core relies on, the corresponding functionality is currently implemented by calling .Wait() on the returned Task. This means that using methods like SaveChanges or ToList instead of their async counterparts could lead to a deadlock in your application.
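
To make the risk more concrete, here is a minimal sketch of the sync-over-async pattern described above. It does not use EF Core at all: SaveAsync and Save are hypothetical stand-ins for an async-only low-level call and its blocking wrapper, and the delay only simulates network latency.

```csharp
using System;
using System.Threading.Tasks;

public static class SyncOverAsyncSketch
{
    // Hypothetical stand-in for an async-only low-level call, such as a write to the store.
    public static async Task<int> SaveAsync()
    {
        await Task.Delay(50); // simulates network latency
        return 1;             // pretend one item was written
    }

    // Blocking wrapper: the calling thread waits until the task finishes.
    // Under a single-threaded synchronization context the continuation
    // cannot resume on the blocked thread, which is how a deadlock can arise.
    public static int Save()
    {
        Task<int> task = SaveAsync();
        task.Wait();
        return task.Result;
    }

    public static async Task Main()
    {
        // Preferred: awaiting does not block any thread.
        int written = await SaveAsync();
        Console.WriteLine($"Awaited write count: {written}");

        // Completes in a console app (no synchronization context), but is risky elsewhere.
        Console.WriteLine($"Blocking write count: {Save()}");
    }
}
```

In a console application like ours the blocking call completes normally, so the safest habit is simply to use the async EF Core methods everywhere, as the rest of this post does.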

Azure Cosmos DB limitations
The Azure Cosmos DB documentation gives a full overview of the supported features. These are the most notable differences compared to a relational database:
  • Client-initiated transactions are not supported
  • Some cross-partition queries are either not supported or much slower depending on the operators involved

So, let’s proceed with model customization. By default, all entity types are mapped to the same container, named after the derived context ("StudentContext" in this case). To change the default container name, use HasDefaultContainer.

modelBuilder.HasDefaultContainer("Store");

To map an entity type to a different container, use ToContainer:

modelBuilder.Entity<Student>()
    .ToContainer("Students");

To identify the entity type that a given item represents, EF Core adds a discriminator value even if there are no derived entity types. The name and value of the discriminator can be changed.
If no other entity type will ever be stored in the same container, the discriminator can be removed by calling HasNoDiscriminator:

modelBuilder.Entity<Student>()
    .HasNoDiscriminator();

If we do not remove the discriminator, every item stored in Cosmos DB carries an extra “Discriminator” field. For a class named Order, for example, its value would be “Order”.


Partition keys
By default, EF Core will create containers with the partition key set to "__partitionKey" without supplying any value for it when inserting items. But to fully leverage the performance capabilities of Azure Cosmos, a carefully selected partition key should be used. It can be configured by calling HasPartitionKey:

modelBuilder.Entity<Student>()
    .HasPartitionKey(o => o.PartitionKey);

Once configured, the partition key property should always have a non-null value. When issuing a query, a condition on the partition key property can be added to make it a single-partition query.
Azure Cosmos DB uses partitioning to scale individual containers in a database to meet the performance needs of your application. In partitioning, the items in a container are divided into distinct subsets called logical partitions. Logical partitions are formed based on the value of a partition key that is associated with each item in a container. All items in a logical partition have the same partition key value.
Azure Cosmos DB transparently and automatically manages the placement of logical partitions on physical partitions to efficiently satisfy the scalability and performance needs of the container. As the throughput and storage requirements of an application increase, Azure Cosmos DB moves logical partitions to automatically spread the load across a greater number of servers.
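
The idea of logical partitions can be illustrated with a small in-memory sketch. It only mimics the grouping rule (same partition key value, same logical partition) with LINQ and says nothing about how Cosmos DB places data physically. The PartitionSketch class and the sample names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PartitionSketch
{
    // Groups items by partition key value, mirroring how logical partitions are formed:
    // all items in one group share the same partition key value.
    public static Dictionary<string, List<string>> GroupByPartitionKey(
        IEnumerable<(string PartitionKey, string Name)> items)
    {
        return items
            .GroupBy(item => item.PartitionKey)
            .ToDictionary(
                group => group.Key,
                group => group.Select(item => item.Name).ToList());
    }

    public static void Main()
    {
        // Illustrative sample data, not taken from the real container.
        var students = new[]
        {
            (PartitionKey: "1", Name: "Alex"),
            (PartitionKey: "1", Name: "Maria"),
            (PartitionKey: "2", Name: "John"),
        };

        foreach (var partition in GroupByPartitionKey(students))
        {
            Console.WriteLine($"Partition {partition.Key}: {string.Join(", ", partition.Value)}");
        }
    }
}
```

A query that filters on one partition key value only has to look inside one such group, which is why single-partition queries are cheaper than cross-partition ones.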
One last step remains: configuring StudentContext to use our Azure Cosmos DB endpoint and account key. Open https://portal.azure.com/ one more time and open the recently created xaero account.
After that, find “Keys” in the account menu. There you will see your primary and secondary keys and connection strings.
So, let’s navigate to our StudentContext in the .NET Core application and override the OnConfiguring method as shown below.

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder.UseCosmos(
            "https://xaero.documents.azure.com:443/",
            "5WeHhw7MsVt08gzIGj5cjxs2Uhm5g3N44n5On3PeCw7WKXQHIaNlVlPo0dGPZOQTjTBW1eIk4Johx9qdixluzA==",
            databaseName: "StudentsDB");

Method UseCosmos has three required parameters:
  • Account endpoint
  • Account key
  •  Database name
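
Hardcoding the account key is fine for a throwaway sample, but you may prefer to keep it out of source code. Below is a minimal sketch that reads the three values from environment variables. The CosmosSettings class and the variable names COSMOS_ENDPOINT, COSMOS_KEY and COSMOS_DATABASE are assumptions made for this example, not something EF Core defines.

```csharp
using System;

public sealed class CosmosSettings
{
    public string Endpoint { get; }
    public string AccountKey { get; }
    public string DatabaseName { get; }

    public CosmosSettings(string endpoint, string accountKey, string databaseName)
    {
        Endpoint = endpoint;
        AccountKey = accountKey;
        DatabaseName = databaseName;
    }

    // Reads the three UseCosmos arguments from environment variables.
    // The variable names are hypothetical; pick whatever fits your setup.
    public static CosmosSettings FromEnvironment()
    {
        return new CosmosSettings(
            Require("COSMOS_ENDPOINT"),
            Require("COSMOS_KEY"),
            Require("COSMOS_DATABASE"));
    }

    // Fails fast with a clear message when a variable is missing or empty.
    private static string Require(string name)
    {
        string value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException($"Environment variable '{name}' is not set.");
        }
        return value;
    }
}
```

With such a helper, OnConfiguring could call optionsBuilder.UseCosmos(settings.Endpoint, settings.AccountKey, databaseName: settings.DatabaseName), where settings comes from CosmosSettings.FromEnvironment().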

We have now added all the configuration for Student, so we can update our Main method and use StudentContext to add data to Azure Cosmos DB.

static async Task Main(string[] args)
{
    using (var context = new StudentContext())
    {
        await context.Database.EnsureDeletedAsync();
        await context.Database.EnsureCreatedAsync();

        context.Add(new Student
        {
            Id = 1,
            StudentDetail = new StudentDetail { Height = 1.2m, Weight = 70 },
            PartitionKey = "1"
        });

        await context.SaveChangesAsync();
    }

    using (var context = new StudentContext())
    {
        var student = await context.Students.FirstAsync();
        Console.WriteLine($"First student has height: {student.StudentDetail.Height}, and weight: {student.StudentDetail.Weight}");
        Console.WriteLine();
    }
}

In this sample, we added a single item to the Students container and then read the first item back.
So, let’s open the Azure portal one more time and take a look at how our data was stored. To do this, refresh the portal page and find the item called “Data Explorer”.
We have only one item there, and we can open it.
As the next step, I suggest adding code that updates an existing value. In Main, replace all the code that follows the block inserting our student with this:

using (var context = new StudentContext())
{
    var student = await context.Students.FirstAsync();
    var orderEntry = context.Entry(student);
    student.Name = "Alex";

    orderEntry.State = EntityState.Modified;

    await context.SaveChangesAsync();
}

using (var context = new StudentContext())
{
    var student = await context.Students.FirstAsync();
    Console.WriteLine($"First student {student.Name} has height: {student.StudentDetail.Height}, and weight: {student.StudentDetail.Weight}");
    Console.WriteLine();
}

In this code, we changed our student’s name from null to Alex. The stored item now looks like this:

{
    "Id": 1,
    "DateOfBirth": "0001-01-01T00:00:00",
    "Name": "Alex",
    "PartitionKey": "1",
    "TrackingNumber": null,
    "id": "1",
    "Detail": {
        "Height": 1.2,
        "Weight": 70
    },
    "_rid": "RIcPAIf4CNwBAAAAAAAAAA==",
    "_self": "dbs/RIcPAA==/colls/RIcPAIf4CNw=/docs/RIcPAIf4CNwBAAAAAAAAAA==/",
    "_etag": "\"fb00abce-0000-0700-0000-5e0e0b300000\"",
    "_attachments": "attachments/",
    "_ts": 1577978672
}

Sometimes you cannot perform a particular operation on Azure Cosmos DB through EntityFramework. In that case you can get the CosmosClient from your DB context and use it directly. In the example below, I remove “TrackingNumber” from the existing student item using CosmosClient. I add this code right after the update code described above.

#region CosmosClient
using (var context = new StudentContext())
{
    var cosmosClient = context.Database.GetCosmosClient();
    var database = cosmosClient.GetDatabase("StudentsDB");
    var container = database.GetContainer("Students");

    var resultSet = container.GetItemQueryIterator<JObject>(new QueryDefinition("select * from s"));
    var student = (await resultSet.ReadNextAsync()).First();

    Console.WriteLine($"First student JSON: {student}");

    student.Remove("TrackingNumber");

    await container.ReplaceItemAsync(student, student["id"].ToString());
}
#endregion

After running our solution, we can open the Azure portal and verify that the TrackingNumber field is no longer there.

{
    "Id": 1,
    "DateOfBirth": "0001-01-01T00:00:00",
    "Name": "Alex",
    "PartitionKey": "1",
    "id": "1",
    "Detail": {
        "Height": 1.2,
        "Weight": 70
    },
    "_rid": "ebQvAOfIVm8BAAAAAAAAAA==",
    "_self": "dbs/ebQvAA==/colls/ebQvAOfIVm8=/docs/ebQvAOfIVm8BAAAAAAAAAA==/",
    "_etag": "\"590065f6-0000-0700-0000-5e0e0d800000\"",
    "_attachments": "attachments/",
    "_ts": 1577979264
}

Great. It is exactly what we expected.
You may wonder whether it is possible to detect this kind of change in unstructured data. For example, if someone removed a particular field from an item in Cosmos DB, can we find out about it? The answer is yes: it is possible to track missing properties in Azure Cosmos DB. The example below shows how.

#region Missing Properties
using (var context = new StudentContext())
{
    var students = await context.Students.ToListAsync();
    var sortedStudents = await context.Students.OrderBy(o => o.TrackingNumber).ToListAsync();

    Console.WriteLine($"Number of students: {students.Count}");
    Console.WriteLine($"Number of sorted students: {sortedStudents.Count}");
}
#endregion
When we run our solution, we will get the result:

Number of students: 1

Number of sorted students: 0

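That is the result we expected. The sorted query returns nothing because the item no longer has a TrackingNumber property, so take care to always populate the properties mapped by EF Core when you modify the store directly. Here is the whole Main function so that you can check your code against it.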

static async Task Main(string[] args)
{
    using (var context = new StudentContext())
    {
        await context.Database.EnsureDeletedAsync();
        await context.Database.EnsureCreatedAsync();

        context.Add(new Student
        {
            Id = 1,
            StudentDetail = new StudentDetail { Height = 1.2m, Weight = 70 },
            PartitionKey = "1"
        });

        await context.SaveChangesAsync();
    }

    using (var context = new StudentContext())
    {
        var student = await context.Students.FirstAsync();
        var orderEntry = context.Entry(student);
        student.Name = "Alex";

        orderEntry.State = EntityState.Modified;

        await context.SaveChangesAsync();
    }

    using (var context = new StudentContext())
    {
        var student = await context.Students.FirstAsync();
        Console.WriteLine($"First student {student.Name} has height: {student.StudentDetail.Height}, and weight: {student.StudentDetail.Weight}");
        Console.WriteLine();
    }

    #region CosmosClient
    using (var context = new StudentContext())
    {
        var cosmosClient = context.Database.GetCosmosClient();
        var database = cosmosClient.GetDatabase("StudentsDB");
        var container = database.GetContainer("Students");

        var resultSet = container.GetItemQueryIterator<JObject>(new QueryDefinition("select * from s"));
        var student = (await resultSet.ReadNextAsync()).First();

        Console.WriteLine($"First student JSON: {student}");

        student.Remove("TrackingNumber");

        await container.ReplaceItemAsync(student, student["id"].ToString());
    }
    #endregion

    #region Missing Properties
    using (var context = new StudentContext())
    {
        var students = await context.Students.ToListAsync();
        var sortedStudents = await context.Students.OrderBy(o => o.TrackingNumber).ToListAsync();

        Console.WriteLine($"Number of students: {students.Count}");
        Console.WriteLine($"Number of sorted students: {sortedStudents.Count}");
    }
    #endregion
}

Conclusions

Microsoft has finally updated EntityFramework Core to support NoSQL databases such as Azure Cosmos DB. As you can see, it is still quite a raw product, with many limitations in both EF Core and Azure Cosmos DB. At the same time, you can combine EF Core with CosmosClient to run the complex queries or updates that EF Core alone cannot handle.

Bonus

Cosmos DB has limitations on data type parsing. For example, Cosmos DB does not support the byte[] type. The table below lists the data types supported by Cosmos DB.
Data Type in Schema               Allowed Run-time Data Type
Integer                           Short, Integer
BigInt or Long                    Short, Integer, Long (maximum precision 19)
Float                             Short, Integer, Long, Float
Double                            Short, Integer, Long, Float, Double
Decimal (maximum precision 28)    Short, Integer, Long, Float, Double, Long
String                            String

