2
\$\begingroup\$

In an Azure Function, I have a c# program that processes tab delimited data into JSON entries. I'd like to improve it in terms of performance and the connection management.

I want to remove personal data from each line.* As each line gets prepped for turning into JSON I perform this function for two values.** I need to work with the depersonalised data but the operational systems will need to request data from the system. They have access to the PII but the analytical platform (downstream of this function) will not have it. The Key Value pairs are therefore constructed so that the operational system and the log ingestion processes can extract guids. Each key is like AccountID:PIItype:PII and then each value is a guid.***

If I've previously depersonalised a value, I want to retrieve the previously provided value from the Azure Redis Cache. If it doesn't exist, add it to the cache.

using StackExchange.Redis;
using System.Text;
using System.Threading;

private static Lazy<ConnectionMultiplexer> lazyConnection = new Lazy<ConnectionMultiplexer>(() =>
        {
            ThreadPool.SetMinThreads(50, 50);
            string redisCacheName = System.Environment.GetEnvironmentVariable("rediscache_name", EnvironmentVariableTarget.Process).ToString();
            string redisCachePassword = System.Environment.GetEnvironmentVariable("rediscache_password", EnvironmentVariableTarget.Process).ToString();
            return ConnectionMultiplexer.Connect(redisCacheName + ",abortConnect=false,ssl=true,password=" + redisCachePassword);
        });

public static ConnectionMultiplexer Connection
{
    get
    {
        return lazyConnection.Value;
    }
}

static string depersonalise_value(string input, string field, int account_id)
{
    IDatabase cache = Connection.GetDatabase();
    string depersvalue = $"{account_id}:{field}:{input}";
    string existingguid = (string)cache.StringGet(depersvalue);
    if (String.IsNullOrEmpty(existingguid))
    {
        string value = $"{account_id}{Guid.NewGuid()}";
        cache.StringSet(depersvalue, value);
        return (value);
    }
    else return (existingguid);
}

* Hashing is something considered but discarded

** The two values are independent of each other

*** There is a redis cache per region so can't rely on uniqueness of the guid by host. This is just a simple preventative as an account can only come from a single region.

\$\endgroup\$
3
  • \$\begingroup\$ Can you briefly clarify the cache schema? It looks like the personal value forms the key and the value is a GUID? I feel like I'm missing the context of what you're trying to do. \$\endgroup\$ Commented Sep 14, 2016 at 13:55
  • \$\begingroup\$ Updated with paragraph re: the KVs \$\endgroup\$ Commented Sep 14, 2016 at 14:44
  • \$\begingroup\$ Added a small update to my answer - the scheme makes perfect sense and seems like a sensible approach. The word "depersonalised" is what threw me. \$\endgroup\$ Commented Sep 14, 2016 at 14:59

1 Answer 1

5
\$\begingroup\$

Good use of Lazy on the connection multiplexer but consider making it a readonly field so you can't accidentally replace it later.

Doing this in the Lazy's factory function is a bit odd in my opinion ThreadPool.SetMinThreads(50, 50);. I'm not sure what options you have for initialization in a Azure Function but you could do worse than use a static constructor on your type.

As for the rest of the function, did you know that there is a ConfigurationOptions type in StackExchange.Redis?

new Lazy<ConnectionMultiplexer>(() =>
    {
        string redisCacheName = System.Environment.GetEnvironmentVariable("rediscache_name", EnvironmentVariableTarget.Process).ToString();
        string redisCachePassword = System.Environment.GetEnvironmentVariable("rediscache_password", EnvironmentVariableTarget.Process).ToString();
        return ConnectionMultiplexer.Connect(new ConfigurationOptions
            {
                AbortOnConnectFail = false,
                Ssl = true,
                // not 100% sure this is exactly right but should be rightish
                EndPoints = { { redisCacheName, 0 } }
                Password = redisCachePassword
            }
        });
};

It's a bit more code but it means you don't have to build the string yourself.

Expression bodied members save some boilerplate:

public static ConnectionMultiplexer Connection => lazyConnection.Value;

C# should never snake_case anything. All methods, regardless of accessibility, should be PascalCase: depersonalise_value should be DepersonaliseValue. See the capitalization conventions on msdn.

I notice that you're not using var at all. Just wanted to double check that you knew it existed. e.g

var depersvalue = $"{account_id}:{field}:{input}";

Naming again: depersvalue should be depersonalisedValue although not clear to me whether it's already depersonalised here or whether it's about to be...

Your final else is redundant, it can be removed.

Remove the parentheses around the returned value - they just add noise.

The connection management looks fine to me - ConnectionMultiplexer is incredibly clever and it's designed to be a long-lived singleton which I think makes it fine to be in a static readonly field.

IDatabase is intentionally very lightweight and you can make as many as you like.

Edit

In response to your edit, I would suggest renaming this from "depersonalised" data to something like a PersonalDataAlias (naming is hard).

\$\endgroup\$

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.