Ciddan (2) [Avatar] Offline
#1
Hey everyone!

I'm trying to make an application that imports product data from an XML-file and then storing the data in an MSSQL-database.

The way I'm trying to do it is:

Extract relevant data from the importfile using LINQ to XML and then loop through all the XElements extracted and put them in my database using a DataContext. The biggest problem is that the data needs to be regulary updated (say once a day - on the products that need to be updated with new data). How do I go about updating entries in an efficient way? If I just add the products using the .InsertOnSubmit-method the import is lightning-fast, it runs through 200+ XElements in less than a second - however it is bound to fail due to the fact that the products are most likley already in the database, giving me a DuplicateKeyException. I've written a method that checks the database to see if the product is already there and if there is a need to update it. However, using the method slows the import down to a crawl, since it has to send a ton of select-statements.

What I would like to do is one of two things - either have the application do an Update instead of an Insert as soon as it finds a duplicate key (which I guess would be slow, since I'd have to Catch the DuplicateKeyException and then do something to the transaction) or to replace the insert-statement with a stored procedure on the SQL server and deal with the updating there. I have created a Stored Procedure that does exactly what I need and changed the Insert statement on the table from Runtime to the SP. The DataContext still crashes on a DuplicateKeyException even though my SP is fully capable of handling duplicate keys.

How do I get around this?

P.S.
The book is phenomenal. Great stuff.
fabrice.marguerie (224) [Avatar] Offline
#2
Re: DuplicateKeyException - Switching from Insert to Update
Hello Mikael,

I'm glad to see that you enjoy the book! Feel free to spread the word and let people know about it smilie

Regarding your question, do you have a way to identify the records by key from the information you have in the XML import file? Would the following approach work for you?

1) You select all the keys you have in the database and keep them in memory. Here is sample code if the keys are integers: List<int> existingKeys = db.YourTable.Select(record => record.TheKey).ToList()

2) You test if a a record exists in your loop to decide whether to insert or update. Even better, you can separate the XElements to process using LINQ queries and then process each sets separately:
var toInsert = allElements.Where(element => existingKeys.Contains((int)element.Attribute("theKey"smilie);
var toUpdae = allElements.Where(element => !existingKeys.Contains((int)element.Attribute("theKey"smilie);


This may not be perfect code, but just an idea.

Fabrice
Ciddan (2) [Avatar] Offline
#3
Re: DuplicateKeyException - Switching from Insert to Update
Thank you for your suggestion, Fabrice. That's an angle I hadn't explored. I do have a way of identifying the keys in the XML file since every product in the file has a unique ID number.

I'll go ahead and try this out and see what kind of performance I can get out of it. I went back to the traditional DataSet and TableAdapter approach for inserting data and speeds went up like crazy - however I would like to avoid doing it that way. It feels a bit more unreliable.

I'll get back with the results as soon as I've had a chance to implement the new stuff smilie

Thanks for your help!