David Shifflet's Snippets

Mindset + Skillset + Toolkit = Success





C# Yield and the Hidden Dangers

C# has a yield keyword; the best description I could find is available here. Basically, it allows you to write a custom iterator.

Given this code:

using System;
using System.Diagnostics;

public static class GalaxyClass
{
    public static void ShowGalaxies()
    {
        var theGalaxies = new Galaxies();
        foreach (Galaxy theGalaxy in theGalaxies.NextGalaxy)
        {
            Debug.WriteLine(theGalaxy.Name + " " + theGalaxy.MegaLightYears.ToString());
        }
    }

    public class Galaxies
    {

        public System.Collections.Generic.IEnumerable<Galaxy> NextGalaxy
        {
            get
            {
                yield return new Galaxy { Name = "Tadpole", MegaLightYears = 400 };
                yield return new Galaxy { Name = "Pinwheel", MegaLightYears = 25 };
                yield return new Galaxy { Name = "Milky Way", MegaLightYears = 0 };
                yield return new Galaxy { Name = "Andromeda", MegaLightYears = 3 };
            }
        }

    }

    public class Galaxy
    {
        public String Name { get; set; }
        public int MegaLightYears { get; set; }
    }
}

If we run this we will see:

Tadpole 400
Pinwheel 25
Milky Way 0
Andromeda 3

Basically, the iterator returns one galaxy at a time: each time the foreach calls MoveNext, the getter runs until it reaches the next yield return and then pauses.
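
To make the MoveNext behavior visible, here is a minimal sketch that drives an iterator by hand, roughly the way foreach does. The MoveNextDemo class, the Names method, and the log list are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class MoveNextDemo
{
    // Hypothetical iterator: records in the log when each value is produced.
    public static IEnumerable<string> Names(List<string> log)
    {
        log.Add("producing Tadpole");
        yield return "Tadpole";
        log.Add("producing Pinwheel");
        yield return "Pinwheel";
    }

    public static List<string> Run()
    {
        var log = new List<string>();

        // Calling the iterator method runs none of its body yet.
        IEnumerable<string> names = Names(log);
        log.Add("iterator created");

        // Roughly what foreach does for us behind the scenes.
        using (IEnumerator<string> e = names.GetEnumerator())
        {
            while (e.MoveNext())
            {
                log.Add("consumed " + e.Current);
            }
        }
        return log;
    }
}
```

The log returned by Run() starts with "iterator created" and then alternates "producing ..." and "consumed ..." entries, which shows that each value is only produced when MoveNext asks for it.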

Something to keep in mind: once you use yield in a method, you don't need to special-case an empty collection. If nothing is yielded, the caller just gets an empty sequence.

        static IEnumerable<DateTime> GetDates(DateTime[] dates)
        {
            foreach (var date in dates)
            {
                yield return date;
            }
        }
You don't need to check dates.Any(); an empty array simply produces an empty sequence. This looks odd because the method has a return type but no return statement. The yield keyword allows this: the compiler generates the iterator object that gets returned.

So how is this useful?

Null Safe Iterator

If the collection in our foreach is null, it will throw a NullReferenceException. Can we work around it?

This can be done like so:

        static IEnumerable<DateTime> GetDates(DateTime[] dates)
        {
            // dates.Any() would itself throw on null, so test for null directly
            if (dates == null)
            {
                // This basically stops execution like an empty collection
                yield break;
            }

            // So we don't get an exception in the foreach
            foreach (var date in dates)
            {
                yield return date;
            }
        }
Notice the yield break;. It acts like the end of the collection in the iterator. For the foreach calling this method, a null array behaves just like an empty new DateTime[] { }.
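
As a quick check of the idea, here is a small self-contained sketch that loops over a null array through such an iterator. It assumes the guard tests for null directly; the NullSafeDemo class and the CountDates helper are hypothetical names for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class NullSafeDemo
{
    // Sketch of a null-safe iterator: a null array is treated as empty.
    public static IEnumerable<DateTime> GetDates(DateTime[] dates)
    {
        if (dates == null)
        {
            yield break; // ends the sequence without yielding anything
        }

        foreach (var date in dates)
        {
            yield return date;
        }
    }

    // Hypothetical caller: counts the items with a plain foreach.
    public static int CountDates(DateTime[] dates)
    {
        int count = 0;
        foreach (var date in GetDates(dates))
        {
            count++;
        }
        return count;
    }
}
```

Calling NullSafeDemo.CountDates(null) returns 0 instead of throwing, and a populated array is counted as usual.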

Default Value on an Empty Collection

Let's say that, given some dates, we want the minimum date, but if there are no dates we want to use 1/1/1980. If we have no dates, like so:

            var dates = new DateTime[] { };
            var min = dates.Min();
The call to Min() would throw an InvalidOperationException, because DateTime is a non-nullable value type and the sequence contains no elements. What we can do is:
            var dates = new DateTime[] { };
            var min = GetDates(dates).Min();

            ...

        static IEnumerable<DateTime> GetDates(DateTime[] dates)
        {
            if (!dates.Any())
            {
                yield return new DateTime(1980, 1, 1);
                yield break; // Read below, this is IMPORTANT!
            }

            // So we don't get an exception in the foreach
            foreach (var date in dates)
            {
                yield return date;
            }
        }
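That yield break; is very important if you want to do this. A yield return does not exit the method the way a plain return does: it hands one value to the caller and pauses. When the caller asks for the next item, execution resumes on the line after the yield return and carries on through the rest of the method body. This is a pitfall because you see return and you'd think execution would jump out; it does not. It continues! In this particular example the foreach that follows would run over an empty array and add nothing, but as soon as the code after the if can produce values or has side effects, it will run. The yield break; makes the iteration stop and prevents the rest of the method body from executing. The same rule holds when the yield return is inside a loop: execution resumes right after it, which simply means the loop moves on to its next iteration.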
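
To see the pitfall in isolation, here is a minimal sketch with integers instead of dates. The YieldBreakDemo class, both method names, and the values -1 and 99 are made up for this illustration; the only difference between the two methods is the yield break.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class YieldBreakDemo
{
    // Without yield break, execution resumes after the yield return
    // and falls through to the code below the if.
    public static IEnumerable<int> WithoutBreak(int[] values)
    {
        if (values.Length == 0)
        {
            yield return -1; // intended default value
        }

        yield return 99;     // still runs for an empty input
    }

    // With yield break, the sequence ends right after the default value.
    public static IEnumerable<int> WithBreak(int[] values)
    {
        if (values.Length == 0)
        {
            yield return -1;
            yield break;
        }

        yield return 99;
    }
}
```

For an empty array, WithoutBreak yields -1 followed by 99, while WithBreak yields only -1.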

Lazy Loading

The textbook example for yield is lazy loading. Here is some sample reader code:

            var result = new List<Dog>();
            // cmd is a DbCommand
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    result.Add(LoadDog(reader));
                }
            }
            return result;
If we call this, it will return a collection of all the dogs. What if we were using LINQ and doing something like:
var chompers = result.FirstOrDefault(o => o.Name.Equals("Chompers"));
If Chompers were the third row in the result set from the database, we would still have rolled through the entire result set, loading every single dog, just to find Chompers.

Let's change it to use yield and see if we can read them one at a time until we find what we are looking for.

            // cmd is a DbCommand
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    yield return LoadDog(reader);
                }
            }
Now the FirstOrDefault, called on this iterator instead of on a filled list, will not load every single dog. Once it finds Chompers it stops enumerating, the using block disposes the reader, and the remaining rows are never read.
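
The effect can be checked without a database. The sketch below replaces the reader with a hard-coded array and counts how many rows were loaded; the LazyDogs class, the RowsLoaded counter, the ReadDogs method, and the row data are all hypothetical stand-ins for the reader code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Dog
{
    public string Name { get; set; }
}

public static class LazyDogs
{
    // Counts how many rows have been turned into Dog objects.
    public static int RowsLoaded;

    // Stand-in for the reader loop: one Dog is created per row,
    // and only when the caller asks for it.
    public static IEnumerable<Dog> ReadDogs()
    {
        string[] rows = { "Rex", "Fido", "Chompers", "Spot", "Bella" };
        foreach (var row in rows)
        {
            RowsLoaded++;
            yield return new Dog { Name = row };
        }
    }

    public static Dog FindChompers()
    {
        return ReadDogs().FirstOrDefault(o => o.Name.Equals("Chompers"));
    }
}
```

After a single call to LazyDogs.FindChompers(), RowsLoaded is 3 rather than 5, because enumeration stops at the first match.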

This is especially useful for lazy loading of children on an object. If we had an owner class with a collection of dogs, we could make the get on the Dogs property lazy by using the yield keyword. We wouldn't need a backing variable that has to be filled up front; the database is called only when the Dogs property is actually enumerated, and we deal with one dog at a time. The flip side is that nothing is cached: every enumeration of the property runs the query again.
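
Here is a rough sketch of such a property, with an in-memory method standing in for the database call. The Owner class, the DogNames property, the Queries counter, and QueryDogNames are hypothetical names, and dog names are plain strings to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Owner
{
    // Counts how often the (simulated) database was queried.
    public int Queries { get; private set; }

    // Hypothetical stand-in for a database call.
    private string[] QueryDogNames()
    {
        Queries++;
        return new[] { "Rex", "Chompers" };
    }

    // No backing field: the query runs when the property is enumerated,
    // not when the getter is called.
    public IEnumerable<string> DogNames
    {
        get
        {
            foreach (var name in QueryDogNames())
            {
                yield return name;
            }
        }
    }
}
```

Reading owner.DogNames into a variable leaves Queries at 0. Each full enumeration, for example a ToList() call, increases it by one, so code that needs the dogs more than once may want to keep the materialized list around.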

Summary

Yield allows you to write custom iterators. There are pitfalls, and you should make yourself familiar with how yield behaves before leaning on it. Ask yourself: if you are going to use the entire collection anyway, should you use yield?