Object Equality – Hash Codes & LINQ Extensions

Reference equality is the default comparison for reference types in .NET, but it is seldom what I want when coding. More often I want to know whether objects are the same based on the state they contain. Changing how equality is determined is as easy as overriding the Equals method. Yet when you use LINQ extension methods such as Distinct() or Contains(), there are a few things to be aware of.

In Domain Driven Design (DDD) the value object pattern is frequently used. Value objects in DDD don't need their own identity. They are identified by the combination of all the information they contain. For example, consider an Address object that has several fields (Street, City, State, Zip). Two addresses are the same only if all of their field values are equal.

Imagine that we have a list of customer addresses and we want to return only the unique addresses.

var addresses = new List<Address> {
      new Address("102 Birch", "Spring", "TX", 77777),
      new Address("304 Elm", "Newport", "MA", 33234),
      new Address("102 Birch", "Spring", "TX", 77777)
};

var uniqueAddresses = addresses.Distinct();

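When no comparer is passed in, Distinct uses the default equality comparer, EqualityComparer<T>.Default (internally a class such as GenericEqualityComparer). This in turn uses the Equals and GetHashCode methods on the objects in the collection. GetHashCode is called first to determine possible equality: only objects with matching hash codes can be equal. Equals is then called on those candidates to determine absolute equality. Overriding GetHashCode is therefore essential to making Distinct work with the default comparer. If you override Equals but not GetHashCode, the inherited hash code is based on object identity, so two addresses with identical fields usually get different hashes, Equals is never called, and Distinct keeps the duplicates. If LINQ collection extensions aren't working as expected, this is likely the cause.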
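To make the point above concrete, here is a minimal sketch of an Address value object that overrides both methods. The property names and the hash-combining constants (17 and 31) are my own assumptions for illustration, not something prescribed by the framework; only the constructor shape follows the list example earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Address value object; property names are assumed for illustration.
public class Address
{
    public string Street { get; }
    public string City { get; }
    public string State { get; }
    public int Zip { get; }

    public Address(string street, string city, string state, int zip)
    {
        Street = street;
        City = city;
        State = state;
        Zip = zip;
    }

    // Absolute equality: every field must match.
    public override bool Equals(object obj)
    {
        return obj is Address other
            && Street == other.Street
            && City == other.City
            && State == other.State
            && Zip == other.Zip;
    }

    // Possible equality: built from the same fields Equals compares,
    // so equal addresses always produce the same hash.
    public override int GetHashCode()
    {
        unchecked // let the arithmetic wrap instead of throwing on overflow
        {
            int hash = 17;
            hash = hash * 31 + (Street?.GetHashCode() ?? 0);
            hash = hash * 31 + (City?.GetHashCode() ?? 0);
            hash = hash * 31 + (State?.GetHashCode() ?? 0);
            hash = hash * 31 + Zip;
            return hash;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var addresses = new List<Address>
        {
            new Address("102 Birch", "Spring", "TX", 77777),
            new Address("304 Elm", "Newport", "MA", 33234),
            new Address("102 Birch", "Spring", "TX", 77777)
        };

        var uniqueAddresses = addresses.Distinct().ToList();

        // The second "102 Birch" entry is treated as a duplicate.
        Console.WriteLine(uniqueAddresses.Count); // 2
    }
}
```

If you comment out the GetHashCode override in this sketch, the same call will normally return all three addresses, which is the symptom described above.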

Guidelines for Object Equality & Hash Codes

– If two objects are considered equal, they must return the same hash code.
– Hash codes on mutable objects should be calculated from immutable fields. This keeps the hash the same throughout the object's lifetime.
– If you are using LINQ methods, be sure that your objects aren't mutating and causing a hash code change. If for whatever reason you need to mutate objects in a collection, consider returning a constant such as 1 as the hash. Caution: this will have a performance impact, because every object then lands in the same hash bucket and each lookup degrades to a linear series of Equals calls.
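The mutation guideline is easier to see in a small experiment. The sketch below uses a hypothetical MutablePoint type whose hash is derived from a mutable field, and a HashSet<T> as a stand-in for any hash-based lookup. Once the field changes, the set can no longer find the very object it contains.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type that breaks the guideline: its hash depends on a mutable field.
public class MutablePoint
{
    public int X { get; set; }

    public override bool Equals(object obj)
    {
        return obj is MutablePoint other && X == other.X;
    }

    public override int GetHashCode()
    {
        return X;
    }
}

public static class Program
{
    public static void Main()
    {
        var point = new MutablePoint { X = 1 };
        var set = new HashSet<MutablePoint> { point };

        Console.WriteLine(set.Contains(point)); // True

        // The set stored the point under the hash for X = 1.
        point.X = 2;

        // The lookup now uses the hash for X = 2 and misses the stored entry.
        Console.WriteLine(set.Contains(point)); // False
    }
}
```

Returning a constant from GetHashCode would make the second lookup succeed again, at the cost of the performance hit noted in the guidelines.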

For a more in-depth look at all the things to consider, check out this post by Eric Lippert.

A few more things to think about: LINQ collection extensions that do comparisons let you pass in an IEqualityComparer<T>. This is great as long as you control all the places in code where you need comparison. Beware of third-party APIs whose calls may end up using the default GenericEqualityComparer. Also, if you override Equals, you may want to seal your class. This prevents derived classes from creating an incorrect Equals implementation. Lastly, overload the equality (==) and inequality (!=) operators to avoid accidental comparison bugs.
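Here is a rough sketch of those last three ideas together. It uses a simplified, hypothetical two-field Address (Street and Zip only) that is sealed and overloads == and !=, plus a hypothetical AddressIgnoreCaseComparer passed to Distinct for a call site that wants a looser rule than the class's own Equals. The names and the case-insensitive rule are assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified, hypothetical Address: sealed so no subclass can alter equality.
public sealed class Address
{
    public string Street { get; }
    public int Zip { get; }

    public Address(string street, int zip)
    {
        Street = street;
        Zip = zip;
    }

    public override bool Equals(object obj)
    {
        return obj is Address other && Street == other.Street && Zip == other.Zip;
    }

    public override int GetHashCode()
    {
        unchecked
        {
            return (Street?.GetHashCode() ?? 0) * 31 + Zip;
        }
    }

    // Operators delegate to Equals so == and Equals never disagree.
    // object.Equals(a, b) handles nulls before calling a.Equals(b).
    public static bool operator ==(Address left, Address right)
    {
        return object.Equals(left, right);
    }

    public static bool operator !=(Address left, Address right)
    {
        return !object.Equals(left, right);
    }
}

// Hypothetical comparer for call sites that want to ignore casing in Street.
public sealed class AddressIgnoreCaseComparer : IEqualityComparer<Address>
{
    public bool Equals(Address x, Address y)
    {
        if (object.ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Zip == y.Zip
            && string.Equals(x.Street, y.Street, StringComparison.OrdinalIgnoreCase);
    }

    // Must agree with Equals above: hash the street case-insensitively.
    public int GetHashCode(Address obj)
    {
        if (obj is null || obj.Street is null) return 0;
        unchecked
        {
            return StringComparer.OrdinalIgnoreCase.GetHashCode(obj.Street) * 31 + obj.Zip;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var addresses = new List<Address>
        {
            new Address("102 Birch", 77777),
            new Address("102 BIRCH", 77777),
            new Address("304 Elm", 33234)
        };

        // Default comparer uses Address.Equals, which is case-sensitive.
        Console.WriteLine(addresses.Distinct().Count()); // 3

        // Explicit comparer applies the looser rule at this call site only.
        Console.WriteLine(addresses.Distinct(new AddressIgnoreCaseComparer()).Count()); // 2

        var a = new Address("304 Elm", 33234);
        var b = new Address("304 Elm", 33234);
        Console.WriteLine(a == b);                       // True
        Console.WriteLine(object.ReferenceEquals(a, b)); // False
    }
}
```

Note that the comparer only applies where you pass it. Any code path you don't control still falls back to the Equals and GetHashCode defined on the class itself.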