C# performance tips & tricks

Nowadays it is common to have distributed systems or microservice-oriented applications.
These have many positive points: for example, they are easy to scale up, they allow partial deploys, and they isolate behavior and responsibilities.

And yes, it isn’t perfect. There are some concerns, like the challenge of performing integration tests and the cost of communication (for instance, when we have round-trips between APIs over HTTP).

But if you are designing and implementing something like a microservice API, some lean adjustments can be made to improve performance. None of them is unusual, but when you think about an API with a lot of users accessing it simultaneously, they can make a significant difference.

Let’s start with a classic: strings.

string.Empty

First of all, each time you transform a string you create a new one. So if you need an empty string, prefer string.Empty over the "" literal. It is a static readonly field that already holds an empty string instance, and it makes your intent explicit.
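As a minimal sketch of the idea (the variable names are illustrative):

```csharp
using System;

class StringEmptyExample
{
    static void Main()
    {
        var input = "";

        // string.Empty is a static readonly field that always refers to
        // the same empty-string instance, and it makes the intent explicit.
        var isEmpty = input == string.Empty;

        // string.IsNullOrEmpty covers the null case as well.
        var isNullOrEmpty = string.IsNullOrEmpty(input);

        Console.WriteLine($"{isEmpty} {isNullOrEmpty}"); // True True
    }
}
```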

StringBuilder

You’ve probably heard about StringBuilder. How about seeing some results on how it can improve your string concatenations?

Using a Console Application project, I created a benchmark structure like this:

static void Main(string[] args)
{
    Benchmark(ConcatStrings, nameof(ConcatStrings), 1000000);
    Benchmark(ConcatStringsOptimized, nameof(ConcatStringsOptimized), 1000000);

    Console.WriteLine();
    Console.WriteLine("end...");
    Console.ReadLine();
}

static void Benchmark(Action action, string actionName, int maxIterations)
{
    var stopwatch = new Stopwatch();

    // stabilizing system
    Thread.Sleep(1000);

    stopwatch.Start();

    // call action N times
    for (int i = 0; i < maxIterations; i++)
        action();

    stopwatch.Stop();

    // log result
    Console.WriteLine($"{actionName} took {stopwatch.ElapsedMilliseconds} milliseconds");
}

And to concatenate strings I created these two methods:

static void ConcatStrings()
{
    var result = "a";

    for (int i = 0; i < 10; i++)
        result += "a";
}

static void ConcatStringsOptimized()
{
    var builder = new StringBuilder("a");

    for (int i = 0; i < 10; i++)
        builder.Append("a");

    var result = builder.ToString();
}

And here is the result:

ConcatStrings took 257 milliseconds
ConcatStringsOptimized took 90 milliseconds

Comparing Strings Case-Insensitively

Using the same benchmark structure, let’s see three ways to compare strings while ignoring case. First we have a common pattern, widely found in the wild. Then two optimized ways to do the same operation:

static void CompareStrings()
{
    var result = "aaa".ToLower() == "AAA".ToLower();
}

static void CompareStringsOptimized1()
{
    // note: the default string.Compare overload is case-sensitive,
    // so we must pass a StringComparison to ignore case
    var result = string.Compare("aaa", "AAA", StringComparison.CurrentCultureIgnoreCase) == 0;
}

static void CompareStringsOptimized2()
{
    var result = "aaa".Equals("AAA", StringComparison.CurrentCultureIgnoreCase);
}

With those, I got the following results:

CompareStrings took 273 milliseconds
CompareStringsOptimized1 took 114 milliseconds
CompareStringsOptimized2 took 113 milliseconds

So far, if we pay attention to the improvement, in some cases we got twice the performance!

Exception Handlers

Exceptions are good, believe me! They are a proper way to handle undesired behavior in an app. You can validate an input and throw a specific exception for each case, to be handled properly further up; for instance, to show a custom validation message to the end user.

But, for some reason, there are developers who use try/catch as if it were an if/validation statement.

And here we have a serious performance problem. For example:

static void ExceptionHandler()
{
    var collection = new []{ "a", "a", "a", "a", "a", "a", "a", "a", "a", "a" };

    foreach (var item in collection)
    {
        try
        {
            var value = Convert.ToInt32(item);
        }
        catch { }
    }
}

static void ExceptionHandlerOptimized()
{
    var collection = new[] { "a", "a", "a", "a", "a", "a", "a", "a", "a", "a" };

    foreach (var item in collection)
        int.TryParse(item, out int value);
}

ExceptionHandler took 1350 milliseconds
ExceptionHandlerOptimized took 0 milliseconds

You may think that is unbelievable, but here I ran only 10 iterations:

Benchmark(ExceptionHandler, nameof(ExceptionHandler), 10);
Benchmark(ExceptionHandlerOptimized, nameof(ExceptionHandlerOptimized), 10);

So we can see how serious this can be, even in a small piece of software.

Memory locality matters

I learned this one during my master’s degree. Think of a matrix (a two-dimensional array) and two nested loops iterating over it. The CPU fetches data from main memory into its caches, but caches are small and are filled one cache line at a time. A 2D array is laid out in memory row by row, so if the inner loop walks along a row, each access hits data that is already in the cache; if it walks down a column instead, almost every access misses the cache and causes a round-trip to memory.

Let’s see it in a practical example. Which one do you think will perform better?

static void ReadMatrix1()
{
    var size = 10000;
    var matrix = new int[size, size];
    var result = 0;

    for (int i = 0; i < size; i++)
        for (int j = 0; j < size; j++)
            if (matrix[j, i] > 0)
                result++;
}

static void ReadMatrix2()
{
    var size = 10000;
    var matrix = new int[size, size];
    var result = 0;

    for (int i = 0; i < size; i++)
        for (int j = 0; j < size; j++)
            if (matrix[i, j] > 0)
                result++;
}

The only difference is the order in which the indices “i” and “j” are used.

Here is the answer (running 10 times each one):

ReadMatrix1 took 17688 milliseconds
ReadMatrix2 took 6012 milliseconds

It sounds silly, but it is another point to pay attention to.

Using AddRange rather than Add in collections

When possible, prefer the AddRange method of collections: Add checks the internal capacity on every call, while AddRange can grow the internal array once for the whole batch.

static void AddToList()
{
    var items = new int[100000000];
    var list = new List<int>();

    foreach (var item in items)
        list.Add(item);
}

static void AddToListOptimized()
{
    var items = new int[100000000];
    var list = new List<int>();

    list.AddRange(items);
}

AddToList took 10683 milliseconds
AddToListOptimized took 6228 milliseconds

Buffer Size

When you are designing applications that handle buffers, for instance reading images or other files, try to keep the IO buffer size between 4 KB and 8 KB. That is because of the operating system (OS) architecture.

Files are already buffered by the file system cache. You just need to pick a buffer size that doesn’t force FileStream to make the native Windows ReadFile() API call to fill the buffer too often. Don’t go below a kilobyte; more than 16 KB is a waste of memory and unfriendly to the CPU’s L1 cache (typically 16 or 32 KB of data).

For nearly every application, a buffer between 4KB and 8KB will give you the maximum performance. For very specific instances, you may be able to get an improvement from a larger buffer (loading large images of a predictable size, for example), but in 99.99% of cases it will only waste memory.

Streams such as FileStream and BufferedStream let you set the buffer size to anything you want, but in most cases 4 KB or 8 KB will give you the best performance.
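A minimal sketch of setting the buffer size explicitly on a FileStream (the temporary file and the 4096 value are illustrative; bufferSize is a real FileStream constructor parameter):

```csharp
using System;
using System.IO;

class BufferSizeExample
{
    static void Main()
    {
        // create a 64 KB sample file just for the demonstration
        var path = Path.GetTempFileName();
        File.WriteAllBytes(path, new byte[64 * 1024]);

        // bufferSize (4096 here) controls how much FileStream reads from
        // the OS per native call; 4-8 KB is usually the sweet spot.
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, bufferSize: 4096))
        {
            var chunk = new byte[4096];
            int read, total = 0;

            while ((read = stream.Read(chunk, 0, chunk.Length)) > 0)
                total += read;

            Console.WriteLine($"Read {total} bytes"); // Read 65536 bytes
        }

        File.Delete(path);
    }
}
```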

The code

Here is the whole code, if you want to run the tests yourself:

class Program
{
    static void Main(string[] args)
    {
        Benchmark(ConcatStrings, nameof(ConcatStrings), 1000000);
        Benchmark(ConcatStringsOptimized, nameof(ConcatStringsOptimized), 1000000);

        Benchmark(CompareStrings, nameof(CompareStrings), 1000000);
        Benchmark(CompareStringsOptimized1, nameof(CompareStringsOptimized1), 1000000);
        Benchmark(CompareStringsOptimized2, nameof(CompareStringsOptimized2), 1000000);

        Benchmark(ExceptionHandler, nameof(ExceptionHandler), 10);
        Benchmark(ExceptionHandlerOptimized, nameof(ExceptionHandlerOptimized), 10);

        Benchmark(ReadMatrix1, nameof(ReadMatrix1), 10);
        Benchmark(ReadMatrix2, nameof(ReadMatrix2), 10);

        Benchmark(AddToList, nameof(AddToList), 10);
        Benchmark(AddToListOptimized, nameof(AddToListOptimized), 10);

        Console.WriteLine();
        Console.WriteLine("end...");
        Console.ReadLine();
    }

    static void Benchmark(Action action, string actionName, int maxIterations)
    {
        var stopwatch = new Stopwatch();

        // stabilizing system
        Thread.Sleep(1000);

        stopwatch.Start();

        // call action N times
        for (int i = 0; i < maxIterations; i++)
            action();

        stopwatch.Stop();

        // log result
        Console.WriteLine($"{actionName} took {stopwatch.ElapsedMilliseconds} milliseconds");
    }


    static void CompareStrings()
    {
        var result = "aaa".ToLower() == "AAA".ToLower();
    }

    static void CompareStringsOptimized1()
    {
        // note: the default string.Compare overload is case-sensitive,
        // so we must pass a StringComparison to ignore case
        var result = string.Compare("aaa", "AAA", StringComparison.CurrentCultureIgnoreCase) == 0;
    }

    static void CompareStringsOptimized2()
    {
        var result = "aaa".Equals("AAA", StringComparison.CurrentCultureIgnoreCase);
    }


    static void ConcatStrings()
    {
        var result = "a";

        for (int i = 0; i < 10; i++)
            result += "a";
    }

    static void ConcatStringsOptimized()
    {
        var builder = new StringBuilder("a");

        for (int i = 0; i < 10; i++)
            builder.Append("a");

        var result = builder.ToString();
    }


    static void ExceptionHandler()
    {
        var collection = new []{ "a", "a", "a", "a", "a", "a", "a", "a", "a", "a" };

        foreach (var item in collection)
        {
            try
            {
                var value = Convert.ToInt32(item);
            }
            catch { }
        }
    }

    static void ExceptionHandlerOptimized()
    {
        var collection = new[] { "a", "a", "a", "a", "a", "a", "a", "a", "a", "a" };

        foreach (var item in collection)
            int.TryParse(item, out int value);
    }


    static void ReadMatrix1()
    {
        var size = 10000;
        var matrix = new int[size, size];
        var result = 0;

        for (int i = 0; i < size; i++)
            for (int j = 0; j < size; j++)
                if (matrix[j, i] > 0)
                    result++;
    }

    static void ReadMatrix2()
    {
        var size = 10000;
        var matrix = new int[size, size];
        var result = 0;

        for (int i = 0; i < size; i++)
            for (int j = 0; j < size; j++)
                if (matrix[i, j] > 0)
                    result++;
    }

    static void AddToList()
    {
        var items = new int[100000000];
        var list = new List<int>();

        foreach (var item in items)
            list.Add(item);
    }

    static void AddToListOptimized()
    {
        var items = new int[100000000];
        var list = new List<int>();

        list.AddRange(items);
    }
}
Wrap up

“Just some milliseconds”, you may say, but when you think of a system receiving thousands of requests at the same time, this can make the difference in shaving those precious milliseconds off the response time 😉

One last detail: the benchmark ran in debug mode, so think about how much more the performance will improve in release mode!

Spaki

With more than 15 years of experience developing software and technologies, talking about startups, trends and innovation, today my work is focused on being a CTO, Software Architect, Technical Speaker, Technical Consultant and Entrepreneur.

From Brazil, currently living in Portugal and working at https://www.farfetch.com as a Software Architect, while keeping projects in Brazil, like http://www.almocando.com.br/
