Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Calling BulkInsertOrUpdate with SetOutputIdentity=true, PreserveInsertOrder=true, returns data that contains data from other rows. #1629

Open
MiladNL opened this issue Dec 5, 2024 · 15 comments
Labels

Comments

@MiladNL
Copy link

MiladNL commented Dec 5, 2024

I have a table of instances. Example:

id:1, someValue: A
id:0, someValue: B
id:0, someValue: C

Instances contains database records with some of them Id = 0 (to be inserted) and some records id > 0 (to be updated).
Column id is an Identity Column:

image

var bulkConfig = new BulkConfig { SetOutputIdentity = true, PreserveInsertOrder = true, SRID = Constants.DefaultSrid }; _context.BulkInsertOrUpdate(instances, bulkConfig);

After running the code, instances will be populated with data from other rows:
id:1, someValue: C
id:2, someValue: A
id:3, someValue: B

This happens in EFCore.BulkExtensions 8.1.2 (8.1.1 is working correctly).

@streamBASETdS
Copy link

Hi, I can confirm that the IDs returned by the BulkInsert call when PreserveInsertOrder and SetOutputIdentity is true are arbitrarily sorted and NOT correct. Entity instances get different ID values than is written to the DB. If you use those returned IDs to set FKs, it results in absolute chaos :-(

@buituansonHE172207
Copy link

I have the same issue and i have to change to use add range async

@borisdj
Copy link
Owner

borisdj commented Dec 7, 2024

Can you make a Test for this issue.
I have added this one and it passed:

[Theory]
[InlineData(SqlType.SqlServer)]
public void BulkOrderedInsertTest(SqlType sqlType)
{
  ContextUtil.DatabaseType = sqlType;
  using var context = new TestContext(ContextUtil.GetOptions());

  context.Items.Add(new Item { ItemId = 0, Name = "Name 0", TimeUpdated = DateTime.Now });
  context.SaveChanges();

  var entities = new List<Item>();
  entities.Add(new Item { ItemId = 1, Name = "Name A", TimeUpdated = DateTime.Now,});
  entities.Add(new Item { ItemId = 0, Name = "Name B", TimeUpdated = DateTime.Now, });
  entities.Add(new Item { ItemId = 0, Name = "Name C", TimeUpdated = DateTime.Now, });

  var bulkConfig = new BulkConfig
  {
    SetOutputIdentity = true,
    PreserveInsertOrder = true
  };
  context.BulkInsertOrUpdate(entities, bulkConfig);

  Assert.Equal(1, entities[0].ItemId);
  Assert.Equal(2, entities[1].ItemId);
  Assert.Equal(3, entities[2].ItemId);

  Assert.Equal("Name A", entities[0].Name);
  Assert.Equal("Name B", entities[1].Name);
  Assert.Equal("Name C", entities[2].Name);
}

@buituansonHE172207
Copy link

I think you should test with auto identity with guid and using bulk insert, i have problem when using this
image
image

@streamBASETdS
Copy link

Can you make a Test for this issue. I have added this one and it passed:

For me, the issue doesn't occur for small sized bulk inserts (a few entries). When bulkinserting larger numbers of entities (1000+), the issue shows. I'm just inserting new entities (not upserting); and I'm using the technique to assign incrementing "pre-IDs" to define the sort order. I'm using negative numbers for this to differentiate between new and existing entities when debugging. Up to 8.1.1 this works perfectly.

@borisdj
Copy link
Owner

borisdj commented Dec 9, 2024

@streamBASETdS you say you are not Upserting but then menition that need to differentiate between new and existing entities.
Which one is it, if you have some for update then you should use InsertOrUpdate.

@streamBASETdS
Copy link

No, I meant I use negative "pre-IDs" for new entries, and I use them also to differentiate during debugging (so if I encounter entities with negative IDs while stepping through code, I know those are new ones that will be bulk-inserted later; positive IDs are existing entities that are not bulked whatsoever but rather updated in a completely different codepath that allows for dynamic diffing/historising of data etc). Everything that is fed to EFCore.BulkExtensions is "new" with negative IDs.

@borisdj
Copy link
Owner

borisdj commented Dec 9, 2024

@streamBASETdS are you using SqlServer, here is a new sample and it also passes:

[Theory]
[InlineData(SqlType.SqlServer)]
public void BulkOrderedInsertTest(SqlType sqlType)
{
  ContextUtil.DatabaseType = sqlType;
  using var context = new TestContext(ContextUtil.GetOptions());

  context.Items.RemoveRange(context.Items.ToList());
  context.SaveChanges();

  int counter0 = 1_000;
  var entities0 = new List<Item>();
  for (int i = 1; i <= counter0; i++)
  {
    entities0.Add(new Item { Name = "Name Init" + i, TimeUpdated = DateTime.Now });
  }
  context.Items.AddRange(entities0);
  context.SaveChanges();

  int startingIdNumber = entities0[counter0 - 1].ItemId;

  var entities = new List<Item>();
  int counter = - 10_000;
  int number = 0;
  for (int i = counter; i < 0 ; i++)
  {
    number++;
    entities.Add(new Item { ItemId = i, Name = "Name Bulk " + number, TimeUpdated = DateTime.Now });
  }

  var bulkConfig = new BulkConfig
  {
    SetOutputIdentity = true,
    PreserveInsertOrder = true
  };
  context.BulkInsert(entities, bulkConfig);

  Assert.Equal(startingIdNumber + 0 + 1, entities[0].ItemId);
  Assert.Equal(startingIdNumber + 1000 + 1, entities[1000].ItemId);
  Assert.Equal(startingIdNumber + 2500 + 1, entities[2500].ItemId);
  Assert.Equal(startingIdNumber + 4678 + 1, entities[4678].ItemId);
  Assert.Equal(startingIdNumber + 8888 + 1, entities[8888].ItemId);

  Assert.Equal("Name Bulk " + (0 + 1), entities[0].Name);
  Assert.Equal("Name Bulk " + (1000 + 1), entities[1000].Name);
  Assert.Equal("Name Bulk " + (2500 + 1), entities[2500].Name);
  Assert.Equal("Name Bulk " + (4678 + 1), entities[4678].Name);
  Assert.Equal("Name Bulk " + (8888 + 1), entities[8888].Name);
}

@streamBASETdS
Copy link

Yes, I'm using SqlServer; I'll try to create a minimal reproduction-example project, but I didn't succeed just yet (a new minimal project with 8.1.2 showed the same correct behaviour as your test, but bumping my full solution to 8.1.2 shows the errorneous behaviour; I can't share the code of the full solution obviously).

@streamBASETdS
Copy link

streamBASETdS commented Dec 11, 2024

Unfortunately, I didn't succeed in reproducing the issue in a fresh project. This is bad, as the behaviour in 8.1.2 causes data corruption (wrong FKs) - I don't find the difference in the bulk calls / preparation between my test project and the real app solution. I'll continue investigating

@MiladNL
Copy link
Author

MiladNL commented Dec 11, 2024

I am also unable to replicate the issue in a new project.

For our current code, we have reverted to version 8.1.1.

I plan to test it again soon with version 8.1.2 (or newer).

@borisdj
Copy link
Owner

borisdj commented Dec 11, 2024

Might be the case that you have some peculiar settings or model.
Also have you installed the same nugget (8.1.2) in test project, or you have taken the latest source?

@streamBASETdS
Copy link

Same nuget 8.1.2 for my test-app

@borisdj
Copy link
Owner

borisdj commented Dec 12, 2024

One thing that comes to mind is to load the entire source of Lib into your real project and try to debug it.
But since this is not some explicit exception, not sure if it would be easy to find the cause.
Might help to try pin-point the commit when the change was made.
Fastest way would be to look at commit history between 8.1.1. and 8.1.2., list those repos in time, but skip ones where only change was ReadMe file (there are a lot of those). Then take the remaining ones and use Interval Splitting approach, and in few steps and tests you should be able on find when the bug was introduced. Then we could focus analysis on changes in that commit.

@streamBASETdS
Copy link

Will do, already checked out EF.BulkExtensions from github. I'll need a bit of time (not sure if I'll manage this in the rest of 2024 or in January). I'll come back to this ticket when I have results.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants