Where Should Documentation Go? Or “Is DateTime Broken?”

“Where should my documentation go?” may seem like an odd question to ask, but where you put your documentation defines how it is consumed and where you train your Developers and Users to look for it when troubleshooting problems.

In the Visual Studio and .NET world, I find that more often than not our team goes to IntelliSense first for documentation (and I suspect that is the most common practice on other teams as well).

We’ll put summary tags explaining parameters, expected behavior, etc. over commonly used key functions for the convenience of new developers and maintainers, and later generate XML documentation from the tags for storage and distribution.

Here’s an example of one of the more detailed summary tags we have for our extension method, DistinctBy:

/// <summary>
/// Returns distinct elements from a sequence using the provided function to compare values.
/// </summary>
/// <typeparam name="TSource">The type of the elements of source.</typeparam>
/// <typeparam name="TKey">The type of the key used to compare elements.</typeparam>
/// <param name="source">The sequence to remove duplicate elements from.</param>
/// <param name="keySelector">A function to select the key for determining equality between elements.</param>
/// <returns>An IEnumerable&lt;T&gt; that contains distinct elements from
/// the source sequence.</returns>
public static IEnumerable<TSource> DistinctBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)

Now that we’ve established where our team (and I suspect other developers) go first for documentation, I’d like to present a small program I wrote that caused the team a lot of grief two weeks ago:

static void Main(string[] args)
{
    var now = DateTime.MinValue;
    var nowPlusASecond = now.AddSeconds(1);
    var nowPlusOneSecondAndABit = now.AddSeconds(1.0001);
          
    if (nowPlusASecond.Ticks == nowPlusOneSecondAndABit.Ticks)
    {
        Console.WriteLine("Do you think we will get here?");
        Console.WriteLine("Apparently these DateTimes refer to the same instant");
    }
    Console.WriteLine("Press 'Enter' to exit");
    Console.ReadLine();
}

The output of this program surprised us:

Do you think we will get here?
Apparently these DateTimes refer to the same instant
Press 'Enter' to exit

Wait… what?

Those two times should be different. Granted, they won’t be different by much, but they should be demonstrably different at the Ticks level (a Tick represents one ten-millionth of a second, so 0.0001 seconds is 1,000 Ticks). For applications that assume time resolution at the Tick level, this is a big deal.

Confronted with this problem, we checked the IntelliSense for AddSeconds and found the following:
[Image: IntelliSense tooltip for DateTime.AddSeconds]

This seems fairly reasonable, and our team was scratching its head trying to determine whether we had discovered some bug in .NET (spoiler… we hadn’t) or whether we were just missing some key piece of information from the documentation. At this point, we turned to our second-tier source of documentation, MSDN. Its reference for AddSeconds says the following (emphasis mine):

Remarks
This method does not change the value of this DateTime. Instead, it returns a new DateTime whose value is the result of this operation.
The fractional part of value is the fractional part of a second. For example, 4.5 is equivalent to 4 seconds, 500 milliseconds, and 0 ticks.
The value parameter is rounded to the nearest millisecond.

This is a key piece of information that, in my opinion, belongs in the IntelliSense. Typing the parameter as double implies full double precision to the developer, when in fact the value is rounded to three decimal places.
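If your code really does need Tick-level resolution, one possible workaround is to do the conversion yourself and call AddTicks, which takes a whole number of ticks. The sketch below is only an illustration: AddSecondsToNearestTick is a hypothetical helper name, not part of .NET, and it assumes the offset is small enough to fit in a long once converted to ticks. Rounding behavior of AddSeconds can also differ between runtime versions, so check the documentation for the one you target.

```csharp
using System;

public static class DateTimePrecisionExtensions
{
    /// <summary>
    /// Hypothetical helper: adds a fractional number of seconds to a DateTime,
    /// rounding to the nearest tick instead of the nearest millisecond.
    /// </summary>
    public static DateTime AddSecondsToNearestTick(this DateTime start, double seconds)
    {
        // One second is TimeSpan.TicksPerSecond (10,000,000) ticks.
        long ticks = (long)Math.Round(seconds * TimeSpan.TicksPerSecond);

        // AddTicks takes a long, so no further rounding happens here.
        return start.AddTicks(ticks);
    }
}
```

With this helper, DateTime.MinValue.AddSecondsToNearestTick(1.0001) lands 1,000 ticks after DateTime.MinValue.AddSecondsToNearestTick(1), which is the difference we originally expected to see.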


The Takeaway
If you think your API violates the Principle of Least Astonishment, not only should you document it, but you should ensure it is documented in the place your Developers are most likely to look first in addition to your primary source of documentation.
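As a rough sketch of what that could look like in our own code, here is a method that rounds its input the same way, with the surprise stated in the summary tag, where the tooltip is most likely to show it. The class and method names are hypothetical and exist only for this illustration.

```csharp
using System;

public static class DocumentedTimeHelpers
{
    /// <summary>
    /// Returns a new DateTime that adds the given number of seconds to <paramref name="start"/>.
    /// NOTE: <paramref name="seconds"/> is rounded to the nearest millisecond before it is added,
    /// so 1.0001 and 1.0 produce the same result.
    /// </summary>
    /// <param name="start">The DateTime to add to. It is not modified.</param>
    /// <param name="seconds">Whole and fractional seconds; precision beyond three decimal places is discarded.</param>
    /// <returns>A new DateTime offset from <paramref name="start"/> by the rounded value.</returns>
    public static DateTime AddSecondsRoundedToMillisecond(DateTime start, double seconds)
    {
        // Round to whole milliseconds first; this is the astonishing part we are documenting.
        double roundedMilliseconds = Math.Round(seconds * 1000.0);

        // Convert the whole milliseconds to ticks and add them.
        return start.AddTicks((long)roundedMilliseconds * TimeSpan.TicksPerMillisecond);
    }
}
```

The behavior is the same as before; the difference is that a developer hovering over the method sees the rounding rule without having to leave the editor.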

Counter-intuitive LINQ

When someone asks me to describe LINQ, depending on their familiarity I might say something along the lines of:

It’s magic!

or

A way of writing SQL-like statements in C#

or most specifically

A set of tools using extension methods and generics to perform queries on sets of data

At the end of the day, however, I do caution them that LINQ is easy to learn, hard to master.

When I first started using LINQ my mentor said “At the end of your query, just send it .ToList(). You’ll thank me later.”

He and I had a few more discussions on why you should be sending your LINQ queries .ToList(), and he couldn’t give a reason himself beyond “Performance and Delayed Execution.”

When working with other C# developers, I find that the Delayed Execution feature of LINQ is the concept they struggle with most. They remember it, work with it, but inevitably write code that forgets that feature, and ultimately create bugs.

Consider the following classes:

Master:

class Master
{
    public Guid MasterID { get; set; }
    public string SomeString { get; set; }
    public Master()
    {
        MasterID = Guid.NewGuid();
        SomeString = "Some Master";
    }
}

And Detail:

class Detail
{
    public Guid MasterFK { get; set; }
    public string SomeDetails { get; set; }
    public Detail(Guid masterFK, string someDetails)
    {
        MasterFK = masterFK;
        SomeDetails = someDetails;
    }
}

Using those two classes, read the following lines of code and think about what the output will be.

static void Main(string[] args)
{
    var mast = new Master();
    var deta = new Detail(mast.MasterID, "");
    var masters = new List<Master>() { mast };
    var details = new List<Detail>() { deta };

    int iterations = 0;
    var joinedValues = masters.Join(details,
                                    x => x.MasterID,
                                    x => x.MasterFK,
                                    (x, y) =>
                                    {
                                      iterations++;
                                      return new { Mas = x, Det = y };
                                    });

    Console.WriteLine("The number of times we returned a value is: " + iterations);
    Console.WriteLine("The number of values is: " + joinedValues.Count());
    Console.ReadLine();
}

Got it? Okay, here’s the output:


The number of times we returned a value is: 0
The number of values is: 1

When some of my coworkers saw this result, they immediately wanted me to open up GitHub and submit a bug report to the .NET team. They thought they had found a critical bug in the LINQ library.

The thing to realize is that, at the point where we print out “iterations”, we have only created the query; we haven’t executed it yet, so the value of iterations is still 0. Adding one more line after the call to Count() gets results closer to what you expect:

Console.WriteLine("The number of times we returned a value is: " + iterations);
Console.WriteLine("The number of values is: " + joinedValues.Count());
Console.WriteLine("The number of times we returned a value now is: " + iterations);
Console.ReadLine();

Output:

The number of times we returned a value is: 0
The number of values is: 1
The number of times we returned a value now is: 1

Since the query was executed when we called joinedValues.Count(), the result selector ran and incremented the iterations variable, giving the result we initially expected.

A final word of warning on this, however: consider the following code modification. What do you think will be the output?

Console.WriteLine("The number of times we returned a value is: " + iterations);
while (true)
{
    Console.WriteLine("The number of values is " + joinedValues.Count());
    Console.WriteLine("The number of times we returned a value now is: " + iterations);
    Thread.Sleep(1000);
}
Console.ReadLine();

You can probably see where this is going:

The number of times we returned a value is: 0
The number of values is 1
The number of times we returned a value now is: 1
The number of values is 1
The number of times we returned a value now is: 2
The number of values is 1
The number of times we returned a value now is: 3
...

And so on and so on

Every time we call .Count() on our IEnumerable (joinedValues), we re-evaluate the query. Think about what that might mean if you wrote expensive code in your join like so:

var joinedValues = masters.Join(details,
                                x => x.MasterID,
                                x => x.MasterFK,
                                (x, y) =>
                                {
                                  iterations++;
                                  //Do some expensive work
                                  Thread.Sleep(10000);
                                  return new { Mas = x, Det = y };
                                });

Then every time you perform an operation on that query, you redo that expensive work.

So remember: if you want the code in your join to be executed immediately, or you are doing expensive work you don’t want to repeat, it is safest to send your LINQ queries .ToList() or otherwise materialize them into a persistent collection.
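To make the difference concrete, here is a small sketch that counts how often the join’s result selector runs with and without .ToList(). It uses plain ints instead of the Master and Detail classes to stay short, and the class and method names are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializeDemo
{
    // Returns how many times the join's result selector ran after calling
    // Count() 'timesCounted' times, with or without materializing first.
    public static int CountSelectorCalls(bool materialize, int timesCounted)
    {
        var masters = new List<int> { 1 };
        var details = new List<int> { 1 };
        int iterations = 0;

        // Nothing runs yet: this only builds the query.
        IEnumerable<int> joined = masters.Join(details,
                                               m => m,
                                               d => d,
                                               (m, d) =>
                                               {
                                                   iterations++;
                                                   return m + d;
                                               });

        if (materialize)
        {
            // Executes the query exactly once and stores the results in a List.
            joined = joined.ToList();
        }

        for (int i = 0; i < timesCounted; i++)
        {
            // On the raw query this re-runs the join every time;
            // on the List it just reads the stored count.
            joined.Count();
        }

        return iterations;
    }
}
```

Calling CountSelectorCalls(false, 3) returns 3, because each Count() re-runs the join, while CountSelectorCalls(true, 3) returns 1, because the selector only ran during the single ToList() call.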