Four flavors for object equality

12 min readApr 6, 2020

An in-depth analysis of equality concept in programming languages.

Four flavors for object equality seems too much: objects have a unique identity. Very often, however, when we say that two entities are equal we do not mean they are physically the same object, but we are referring to a set of physical objects that have properties that make them equivalent.

E.g. to replace a light bulb with one equal bulb we mean: same color temperature, same brightness, same power consumption.

In programming languages it is important to distinguish between four different meanings of equality.
In order to define them clearly, some basic concepts are outlined below.

Relations

Relations (binary relations) are sets of ordered pairs of entities.

If a pair belong to a relation it is possible to represent it by drawing the two elements of the pair connected by an arrow (and an indication of the relation on top of it):

A more formal writing is the following:

<284,220> ∈ “greater than”
{<284, 220>, <220, 284>} ∈ “sum of proper divisor”
<square, circle> ∈ “homeomorphic”

The set of words we use to indicate a relation evokes an operative procedure to decide if a given ordered pair belongs to the relation.

Continuous deformations of entities is a decision procedure for ‘homeomorphic’ relation

‘Greater than’ possible decision procedure

Equivalent relations

A relation between entities is an equivalence relation if fulfill three simple conditions

Reflexive, symmetric, transitive properties

Reflexivity: Entities are equivalent to themselves

Symmetry: if a is equivalent to b, then b is equivalent to a

Transitivity: if a is equivalent to b and b is equivalent to c then a is equivalent to c

In the previous definitions, replacing ‘ is equivalent to’ with a given relation, it is possible to test if it is an equivalence relation.

Some examples:

a is greater than b is not reflexive (a is not greater than a), not symmetric or antisymmetric (if a is greater than b, b is not grater than a), transitive
a is greater or equal to b is reflexive, not symmetric (it is symmetric when a is equal to b, but in general it is antisymmetric), transitive
a divided by two gives the same remainder as b divided by two is reflexive, symmetric and transitive.

The last example is an equivalence relation.

The first two are examples of another important class of relations: the order relations. For them, the antisymmetry sorts the set of entities and imposes the order.

Relations can be interesting even if they own none of the previous properties. E.g. the relation sum of proper divisors is irreflexive, antisymmetric (in general) and intransitive, but when it is symmetric, it defines an evanescent set of integer pairs: the amicable numbers (the smallest pair is 284, 220 and obviously 5564, 5020 is the fourth).

Equivalence classes

Relations (subset of ordered pairs of entities) can be used to define subset of entities. E.g. given the relation sum of proper divisors, the set of elements that are related to 1 is the set of prime numbers.

This characteristic acquires particular relevance in the case of equivalence relations.

Because of symmetry and transitivity if an entity a is equivalent to an entity b

it is equivalent to all entities that are equivalent to b (transitivity)
and b is equivalent to all entities equivalent to a (symmetry).

If a is not equivalent to b

all entities equivalent to a are not equivalent to b
and are not equivalent to any other entity equivalent to b.

The sets of equivalent entities are called equivalence classes. These classes cannot overlap: every entity belongs to only one equivalence class.

E.g. the relation a divided by two gives the same remainder as b divided by two defines the two equivalence classes of even and odd numbers.

Equivalence classes of ‘*a divided by two gives the same remainder as b divided by two’* for integers between 1 and 8

Equality

When speaking about even, 2 is the same of 4, but what we really mean is that they belong to the same equivalence class. The equivalence class defines a new and more abstract concept (even numbers) that is different from the concept of the basic entities (natural numbers).

So seems not correct to say that 2 is equal to 4 because the concept of even numbers is very different from the concept of natural numbers.

But we often say π is equal to 3,14 implying that for our context two decimals is a sufficient approximation. In this case 3,14 represents the equivalence class of all real numbers between 3,14 and 3,15.

Depending on the context, we can (pragmatically) replace the concept represented by an entity with the concept represented by its equivalence class.
We can say that

Equality is an equivalence relation that is used to decide, in a given context, if two entities are the same entity.

The fact that the meaning of equality depend on the context is stigmatized by the saying:

topologists can’t tell the difference between a coffee cup and a doughnut

Because homeomorphic is an equivalence relation and a coffee cup is homeomorphic to a doughnut, topologists say they are equal.

Continuous deformations between a coffee cup and a doughnut. Homeomorphism, https://en.wikipedia.org/w/index.php?title=Homeomorphism&oldid=935290673.

Coffee cups, doughnuts, rings are solids with an hole. A bottle does not belong to this class: it is equivalent to a sphere.

Names

Names are the powerful part of languages. A name evokes mental representations that constitute our knowledge about that name. More complex are the evoked representations and more deep is our knowledge.

Giving a name to things allow us to reflect them in our speeches. Naming things is the first step for the knowledge.

How can we not remember here the famous aphorism of Stanislaw Lec:

Reflect before you think.

The reason I listed the names of peaks in the caption of the featured picture of this story is that without knowing the name of the peaks, any hiker can’t completely appreciate the view he sees from the summit he have reached.

The following suggestions can reinforce the feeling of control of that picture.

Resegone means big saw, but we need to look at it from the right side

Legnone derives from lin a Celtic word for water. The winter snow stored in hidden ravines provides a large amount of water during the warmer summer months.

There is a word in the Lombard dialect ‘grigna’ which means subtly mischievous smile. It seems that the name of the two remaining mountains derives from this word: Grignetta means little grigna and Grignone big grigna.

Names have these relevant characteristics:

Each name (at a given point in time) refers to a single entity. This is true even when names seems related to multiple entities: they really refer to a (single) collection of sub entities.
Names of sub entities have a hierarchical structure.
E.g. the houseNumber of myHouse is 5 and it is part of the address of myHouse. Its full name is myHouse.address.houseNumber.
Sub entities of ordered collections are identified by their position inside the collection. The position becomes part of the name.
E.g. fooStreet.houses[10] is another name of myHouse: it is the tenth house of the street.
The referred entity can change during the time: e.g. myHouse can change due to a moving.
This characteristic is so important for programming languages that lot of names used inside a program are called variables.
Every name own an underlay concept that defines the universal set of entities that can be (potentially) referred with that name.
In programming languages this universal set is called type.
Even if a given language is not statically typed (the language does not have constructs to explicitly declare the type of names) the type is inherently defined by the concept the name is trying to capture.
Every name belongs to a scope. Languages are used to create stories and these stories for programming languages are called programs. Scopes are different sections of a program. Names (variables) exist inside a scope: they are created, modified while the program runs and destroyed when the section (the scope) terminates.

The essential aspect of object equality is that an entity can have different names. E.g. myHouse and fooStreet.houses[10] both identify my house and I can write myHouse is equal to fooStreet.houses[10]

This chapter on names cannot end without citing the powerful incipit of Moby Dick with its profound biblical evocations:

Call me Ishmael.

Programming languages and entities

Until now I have used the term entities to indicate the elements of a language (natural or formal) that represent values that are referred by names. The universal sets of homogeneous values define types, and types can be used to classify names.

With programming languages we distinguish between primitive types and object types.

Primitive types

Primitive types are the bricks that can be used to build object. It is possible to define them using the equality concept, because for them there is only one flavor of equality.

A primitive type is the set of all values that can be tested for equality (and diversity) using the simple equal (not-equal) operator of the language.

Incidentally in javascript there are four equal algorithms for primitive type (four is central in this story): abstractEqual (==), strictEqual (===), sameValue (Object.is) and sameValueZero.

The operator abstractEqual (==) compares operands of different primitive type (e.g. 1 == '1’ but !( 1 === '1’)).
The other three algorithm differ only when they are applied to two edge values of number type: -0 and NaN (not a number).

NaN comes from the evaluation of some floating expressions. E.g. Math.pow(-1, 0.5) evaluate to NaN: it is the imaginary unit!

The following relations are all true and highlight the differences between the three algorithms

-0 === 0; 0 === +0; NaN !== NaN
!sameValue(-0, 0); sameValue(NaN, NaN)
sameValueZero(-0, 0); sameValueZero(NaN, NaN)

Considering that 1/-0 !== 1/+0 // -Infinity !== +Infinity the snippets for sameValue and sameValueZero are

const sameValue = (x, y) =>
  x === y && (x !== 0 || 1 / x === 1 / y)) || (x !== x && y !== y)const sameValueZero = (x, y) => x === y || (x !== x && y !== y)

Objects

Objects are made of three parts:

A physical representation inside the memory space of a program.
Objects need a physical representation that shall be explicitly created, while primitive values do not have a recognizable physical representation: they simply exist.
A set of names that refer to the physical representation (references).
A name associated to an object value is a reference to that object value.
Names are fundamental for the objects. Without names that refer to an object value (without references), the object does not exist and the physical representation, depending on programming languages, becomes garbage or memory leak (if not correctly destroyed).
A structure: a set of sub-names that refer to sub-entities (primitive types or objects).

Because of this complexity, assigning an object value to a new name can involve three distinct semantic operations. The new name can be

a new reference to the same physical representation
a swallow copy of the source object (sub objects are shared between the source and new object)
a deep copy of the source object (all physical representations are duplicated)

Visual illustration of the three flavors of ‘assignment’ operator and the corresponding javascript snippets that implement them.

Four flavors of object equality

Now we have all elements for clearly define the four different flavors for object equality. They are:

Equal physical representation (reference equality)
Equal physical representation of sub entities (shallow equality)
Equal structure (deep equality)
Equal key property (key equality)

Equal physical representation (reference equality)

In this flavor, two names refer to equal objects when they are referring to the same physical representation.

This is the most stringent criteria because the associated equivalence classes contain a single (physical) object.

In all programming languages this equality flavor can be tested using the equal operator of the language.

x === y     // object reference equality

This means that, for javascript language, all four equal algorithms used for primitive types can be used also for objects.

A new created value is always different (not reference equal) from all other existing values, independently from its content.

{} !== {} // two different empty objects

Equal physical representation of sub entities (shallow equality)

In this flavor, two names refer to equal objects when they have the same structure (same set of sub entity names) and the sub entities with the same sub name refer to the same value of a primitive type or refer to the same physical representation of an object.

This flavor of equality is interesting because all shallow copies belongs to the same equivalent class: shallow copies are shallow equal but not reference equal.

In general languages do not have a native operator to test this flavor of equality. A simple javascript implementation can be this one:

const isObject = o => o !== null && typeof o === 'object'const shallowEqual = (a, b) => 
  a === b 
    ||
  isObject(a) && isObject(b) &&
    Object.keys({...a, ...b}).every(k => a[k] === b[k])
    ||
  Array.isArray(a) && Array.isArray(b) && a.length === b.length &&
    a.every((e, i) => e === b[i])

In the previous snippet the fact that a and b shall have the same structure is captured by both Object.keys({...a, ...b}) that retrieves the set of sub names of a and b and the fact that if b do not have the sub name k b[k] === null

In case of vectors the snippet supposes that vector values represent ordered tuples. But there is a slightly different interpretation: vectors values can represent sets. In this case vector are equal if they have the same length and all values present in the first vector are present in the second vector:

Array.isArray(a) && Array.isArray(b) && a.length === b.length &&
    a.every(e => b.include(e))

the include check b.include(e)is performed using the sameValueZero algorithm.

In some cases is useful to know the object subentities that are not reference equal. This for example can help to understand why a react component is re-rendered. A simple snippet is:

const shallowDiff = (a, b) =>
  Object.keys({...a, ...b}).filter(k => a[k] !== b[k])

Equal structure (deep equality)

In this case two names refer to equal objects if all sub entities, at any level, share the same structure and have the same primitive values.

This is the most generic flavor of equality: if two entities are reference equal they are also shallow equal and deep equal.

It represent the equality flavor more near to the common intuition of object equality but it is rarely used because it have the most expensive decision procedure (this is particularly true for names that refer to large collections of objects).

const isObject = o => o !== null && typeof o === 'object'const deepEqual = (a, b) => 
  a === b 
    ||
  isObject(a) && isObject(b) && 
    Object.keys({...a, ...b}).every(k => deepEqual(a[k],b[k]))
    ||
  Array.isArray(a) && Array.isArray(b) && a.length === b.length &&
    a.every((e, i) => deepEqual(e, b[i]))

The procedure is expensive because of the recursive step it requires.

Equal key property (key equal)

This equal flavor is the most used criteria to identify entities. It is so precise that it is adopted by all revenue agencies. Actually, it’s also widely used in databases. Objects have a sub entity (usually a primitive type) that unequivocally identify the object. This property is called key (tax code for the revenue agency case).

Two names refer to the same object if their key is the same.

More generally objects can have a sub set of sub entities, the key attributes, that unequivocally identify the object. E.g. in Italy the tax code is derived from the set of key attributes composed by name, surname, bird date, gender, municipality of birth.

This flavor can be considered a simple version (computational effective) of the deep equal flavor when this statement is assumed:

If the key attributes are equal then all other attributes are equal.

But this flavor is very important because it owns a feature that makes it different from the previous three: it is possible to assume a slightly different statement:

If the key attributes are equal then object are equal independently from the values of other attributes

In this case it is possible to manage the identity of objects that mutate over time.
E.g. A taxpayer identified by his tax code has a gross income that changes every year {taxCode: xxx, grossIncome: nnn, tax: nn}