4
\$\begingroup\$

I have below code which is going to copy records from one container in cosmos db to another container in cosmos db. I am just worried on not following the best practices and below code might time out when querying the whole database to get only result with "ClassID". I would appreciate any suggestion as what should I improve in below code. make it async/sync and how can I enumerate in order to overcome any issues in future. As this is an Azure Function which will run first time from beginning but going forward it will be triggered only when there is any changes in Cosmos Db

namespace MigrateDbWithChangeFeed
{
    public static class Function1
    {
        [FunctionName("Function1")]
        public static async Task Run([CosmosDBTrigger(
            databaseName: "databaseName",
            collectionName: "collectionName",
            StartFromBeginning = true,
            ConnectionStringSetting = "",
            LeaseCollectionName = "leases",
            CreateLeaseCollectionIfNotExists = true
    )] IReadOnlyList<Document> source, DocumentClient client,
    ILogger log  )
        {
            string continuationToken = null;
            string collectionUrl = UriFactory.CreateDocumentCollectionUri( "databaseName", "SourceCollection" ).ToString();
            do
            {
                var feed = client.CreateDocumentQuery(
                           collectionUrl,
                           new SqlQuerySpec( "select * FROM c WHERE IS_DEFINED(c.ClassId)" ),
                           new FeedOptions { MaxItemCount = 10, EnableCrossPartitionQuery = true, RequestContinuation = continuationToken } ).AsDocumentQuery();
                var result = feed.ExecuteNextAsync().Result;
                var itemCount = result.Count;
                continuationToken = result.ResponseContinuation;
                foreach( var doc in result )
                {
                    var data = (JObject)JsonConvert.DeserializeObject( doc.ToString() );
                    string partitionkeyValue = string.Concat( data["ClassId"].Value<string>(), "|", data["Project"].Value<string>() );
                    data.Add( new JProperty( "PartitionKey", partitionkeyValue ) );
                    await client.CreateDocumentAsync( UriFactory.CreateDocumentCollectionUri(
                   "SourceDatabase", "TargetCollection" ),
                   data );

                }
            }
            while( continuationToken != null );
            
        }
    }
}  
\$\endgroup\$
2
  • 1
    \$\begingroup\$ Welcome to Stack Review, if you think it can be applied to your question you can add the tag azure-cosmosdb or another tag related to databases. \$\endgroup\$ Commented Aug 23, 2021 at 7:16
  • \$\begingroup\$ You are welcome. \$\endgroup\$ Commented Aug 24, 2021 at 6:59

1 Answer 1

6
\$\begingroup\$

I'm not familiar with Azure Functions and CosmosDb, so I won't review your code from API consumption perspective.

Here are my observations:

  • FunctionNameAttribute: Please prefer nameof operator over hard-coded values
    • If you rename your class (from Function1) then the hard-coded value will not change with it accordingly
    • On the other hand if you need to refer Function1 elsewhere in your cloud application then create a static class where you store your constants
  • collectionUrl: I think you don't need to create a separate variable for this, you can simply inline it
client.CreateDocumentQuery(
    UriFactory.CreateDocumentCollectionUri("databaseName", "SourceCollection"),
    new SqlQuerySpec("SELECT * FROM c WHERE IS_DEFINED(c.ClassId)"),
    new FeedOptions { ... });
  • feed.ExecuteNextAsync().Result: Please prefer async calls over blocking sync calls
  • result.Count: As I said I'm not familiar with this API but I would suggest to do some checks on the FeedResponse object to make sure that your issued operation has succeeded before you start to consume it
  • itemCount: As I can see it is unused so you can freely get rid of it
  • (JObject)JsonConvert.DeserializeObject: Please prefer JObject.Parse to semi parse json strings
    • Please also bear in mind that this might throw exception (JsonReaderException) if not valid json is retrieved for whatever reason
  • data["ClassId"]: Please prefer TryGetValue method over index operator in production code
    • You can examine data existence before you start to using it
  • data["ClassId"].Value<string>(): You can convert the JToken to string with simple casting: (string)data["ClassId"]
  • string.Concat: Please prefer string interpolation over explicit call of concat.
  • data.Add(new JProperty ...): I think you don't need to create a JProperty explicitly, there is an other overload, which accepts the same as the JProperty constructor
  • var data: Please try to use better naming than this, data is meaningless
  • continuationToken != null: Please prefer string.IsNullOrEmpty over direct null check against string variable

UPDATE: How to use TryGetValue

ZZZSharePoint has pointed out that the provided link for TryGetValue is insufficient to put theory into practice. So, here is my suggestion how to apply that function onto this problem:

const string classIdFieldName = "ClassId", projectFieldName = "Project";
const string collectionName = "SourceCollection", partitionKeyFieldName = "PartitionKey";
            
var semiParsedFetchedRecord = JObject.Parse(doc);
if(!semiParsedFetchedRecord.TryGetValue(classIdFieldName, out var classIdField))
    throw new MissingFieldException(collectionName, classIdFieldName);

if (!semiParsedFetchedRecord.TryGetValue(projectFieldName, out var projectField))
    throw new MissingFieldException(collectionName, projectFieldName);

semiParsedFetchedRecord.Add(partitionKeyFieldName, $"{classIdField} | {projectField}");
  • I've introduced several constants to make maintenance (and reuse) easier
  • I've used JObject.Parse to get a semi-parsed json (I've renamed your data variable to express what's stored inside it)
  • I've used TryGetValue to make data retrieval operation safe
    • I've used here the out var feature of C# to make the code more concise
    • otherwise we need to declare a JToken variable before the TryGetValue call
  • I've used MissingFieldException to indicate that something expected is missing
    • ArgumentException might be a good choice as well
    • Or if we can ignore those records where the expected fields are missing then a simple continue might be okay as well
  • Last but not least I've inlined the partitionKey calculation
\$\endgroup\$
2
  • 1
    \$\begingroup\$ wow!! people like you are reason why Stack-exchange is so successful. THANKS!!! I did all changes but cant figure out TryGetValue method. How do I do that? I didnt get any idea from the link. \$\endgroup\$ Commented Aug 24, 2021 at 6:34
  • 1
    \$\begingroup\$ @ZZZSharePoint I've extended my post with that, please check it again. \$\endgroup\$ Commented Aug 24, 2021 at 7:17

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.