Galois connections


This article is part of my migration effort, moving some of my articles over from the excellent Functor Network.


Some theory

Let XX and YY be two partially ordered sets (posets). Two functions f:XYf : X \to Y and g:YXg : Y \to X are said to form a Galois connection when the following is true for all aXa \in X and all bYb \in Y: f(a)bag(b).\begin{equation*}f(a) \leq b \iff a \leq g(b).\tag{C}\end{equation*} We say that ff is the lower (or left) adjoint, and gg is the upper (or right) adjoint. Oftentimes the lower adjoint is marked with a lower-star f*f_* and the upper adjoint with an upper-star f*f^*. We will see for instance that the direct and inverse image form such a connection, and this hopefully explains the usual notation for these concepts. The shorthand I’m going to use for saying two functions are in Galois connection is f*f*f_* \dashv f^*.

In some sense, this says f(a)f(a) is the best approximation of aa from above using objects of YY with respect to gg, and g(b)g(b) is the best approximation of bb from below using objects of XX with respect to ff. Don’t worry if this is unclear (it is I’m sure).

Some examples

Three-way connection: direct image, inverse image, image in fibers

Let f:XYf : X \to Y be any set function. The direct image is the function f*:2X2Yf_* : 2^X \to 2^Y defined as f*(A)={yYxA,f(x)=y}.f_*(A) = \{ y \in Y \mid \exists x \in A, f(x) = y \}. The inverse image is the function f*:2Y2Xf^* : 2^Y \to 2^X defined as f*(B)={xXf(x)B}.f^*(B) = \{ x \in X \mid f(x) \in B \}. These functions form a Galois connection f*f*f_* \dashv f^*. The direct image is left adjoint to the inverse image, and reciprocally the inverse image is the right adjoint to the direct image. The use of “the” in the previous sentence will be explained below, where I show that the left or right adjoint, if it exists, is uniquely determined.

Here’s a more interesting example! We now construct a right adjoint to the inverse image. That’s not something you see often! Given a set function f:XYf : X \to Y, we define the image in fibers to be the function f!:2X2Yf_! : 2^X \to 2^Y with the equation f!(A)={yYf1(y)A}.f_!(A) = \{ y \in Y \mid f^{-1}(y) \subseteq A \}. It’s the set of points such that their fiber is contained in AA. It’s an interesting construction. Notice for instance that any point that is not in the image of ff has an empty fiber, hence is an element of f!(A)f_!(A). Let’s prove that there is a connection f*f!f^* \dashv f_!. First, suppose f*(B)Af^*(B) \subseteq A. In this context we want to show that Bf!(A)B \subseteq f_!(A), so pick any element bBb \in B. We just have to show f1(b)Af^{-1}(b) \subseteq A, which is the case because our hypothesis says every element of BB has its preimage contained in AA, so we’re done. Second, suppose Bf!(A)B \subseteq f_!(A), and let’s show that as a consequence f*(B)Af^*(B) \subseteq A. Pick any element af*(B)a \in f^*(B), so f(a)Bf(a) \in B. Hence f(a)f!(A)f(a) \in f_!(A) by hypothesis. This means f1(f(a))Af^{-1}(f(a)) \subseteq A, and since aa is an element of f1(f(a))f^{-1}(f(a)) we are done.

Linear algebra: span

Let VV be some vector space. Given a collection 𝒱\mathcal{V} of vectors of VV, their span is the smallest vector space which contains all vectors in 𝒱\mathcal{V}. It is denoted by 𝒱\langle \mathcal{V} \rangle. That’s a well-defined operation that produces a function :2VSub(V)\langle{-}\rangle : 2^V \to \text{Sub}(V), where Sub(V)\text{Sub}(V) is the set of subspaces of VV, partially ordered by inclusion. There’s also a “forgetful” function UU that goes the other way and produces the set of vectors underlying a given subspace of VV. These two functions are in Galois connection, with the span being left adjoint to the forgetful function.

Algebraic geometry: the vanishing set

Here’s another example which is close to my heart. Consider [x,y]\mathbb{C}[x,y] the ring of polynomials in the variables xx and yy, with coefficients in \mathbb{C}. Of course we could generalize the base field and the number of variables, but I bet you’ll be thankful I’m to lazy to do it. Anyways, for S[x,y]S \subseteq \mathbb{C}[x,y] a set of polynomial, we define their vanishing set V(S)V(S) to be the subset of points of 2\mathbb{C}^2 where they all vanish: V(S)={(x,y)2pS,p(x,y)=0}.V(S) = \{ (x,y) \in \mathbb{C}^2 \mid \forall p \in S, p(x,y) = 0\}. Reciprocally, given a subset V2V \subseteq \mathbb{C}^2, we define the set I(V)I(V) consisting of the polynomials that vanish at all points of VV: I(V)={p[x,y](x,y)V,p(x,y)=0}.I(V) = \{ p \in \mathbb{C}[x,y] \mid \forall (x,y) \in V, p(x,y) = 0\}. If we order the subsets of [x,y]\mathbb{C}[x,y] by “reverse containement”, we obtain a connection IVI \dashv V, which formally expresses the idea that being “algebraically small” is the same as being “geometrically big”, and vice-versa. In other words, the smaller a geometric place is, the more constraints (that’s the algebra part) are needed to specify it.

Closure in topology

Now for a purely topological example. Consider XX a topological space, and partially order its closed subsets by containement. Name the set of closed subsets 𝒞\mathcal{C}. The closure is the function that sends any subset AA of XX to A¯\overline{A}, the smallest closed subset that contains AA. There’s also a “forgetful” function U:𝒞2XU : \mathcal{C} \to 2^X that sends a closed subset to itself. The claim here is that the closure is left adjoint to the forgetful function.

Do you see the similarity between this example and the one about the span in vector spaces? It’s this kind of abstract correspondence that a Galois connection expresses.

More theory

The composite gf:XXg\circ f : X \to X is called the closure operator, and composition the other way around fg:YYf \circ g : Y \to Y is called the kernel operator. A connection is called a Galois insertion if the kernel operator is the identity map. Consider the last example where we used the closure of a set in a topological space. Here the closure operator is literally the closure, while the kernel operator is the identity, so we have a Galois insertion. The span example is also a Galois insertion.

Because we always have g(b)g(b)g(b) \leq g(b), the condition (C) says we always have f(g(b))bf(g(b)) \leq b (just set a=g(b)a = g(b)). Similarily, we always have ag(f(a))a \leq g(f(a)). Also notice that if ff and gg are in connection, then ff and gg are necessarily monotone. For instance, take a1a_1 and a2a_2 two elements of XX with a1a2a_1 \leq a_2. Then a1a2g(f(a2))a_1 \leq a_2 \leq g(f(a_2)), hence f(a1)f(a2)f(a_1) \leq f(a_2). For instance, this gives us for free that the direct and inverse image constructions are monotone, and we always have:

  • Af*(f*(A))A \subseteq f^*(f_*(A)) for any AXA \subseteq X
  • f*(f*(B))Bf_*(f^*(B)) \subseteq B for any BYB \subseteq Y

But also:

  • Bf!(f*(B))B \subseteq f_!(f^*(B)) for any BYB \subseteq Y
  • Af*(f!(A))A \subseteq f^*(f_!(A)) for any AXA \subseteq X
  • 𝒱𝒱\mathcal{V} \subseteq \langle \mathcal{V} \rangle for any 𝒱V\mathcal{V} \subseteq V (that’s actually part of the definition of the span)
  • YV(I(Y))Y \subseteq V(I(Y)) for any Y2Y \subseteq \mathbb{C}^2
  • I(V(S))SI(V(S)) \subseteq S for any S[x,y]S \subseteq \mathbb{C}[x,y]

While we’re at it, notice what happens if we replace aa by f*(b)f^*(b) in the equation af*(f*(a))a \leq f^*(f_*(a)). (Here f*f_* and f*f^* are two general functions in connection, not necessarily the direct/inverse image pair). We get f*(b)f*(f*(f*(b)))f^*(b) \leq f^*(f_*(f^*(b))), and that’s true for every bYb \in Y. On the other hand, it’s also true for every bb that f*(f*(b))bf_*(f^*(b)) \leq b, by what we said above. Because f*f^* is monotone, we obtain f*(f*(f*(b)))f*(b)f^*(f_*(f^*(b))) \leq f^*(b). From antisymmetry of our order relation, we conclude f*(b)=f*(f*(f*(b))).f^*(b) = f^*(f_*(f^*(b))). In other words, for every bYb \in Y the element f*(b)Xf^*(b) \in X is a fixed point for the closure operator. Similarily, for every aXa \in X the element f*(a)Yf_*(a) \in Y is a fixed point for the kernel operator.

Even more theory

If you stare long enough at condition (C) above, after a while you’ll understand suddenly and all at once that it is precisely the data of two equations. First, it gives a definition of the upper adjoint in terms of the lower adjoint: g(b)=max{aXf(a)b}.g(b) = \max\{ a \in X \mid f(a) \leq b \}. Second, it gives a definition of the lower adjoint in terms of the upper adjoint: f(a)=min{bYag(b)}.f(a) = \min\{ b \in Y \mid a \leq g(b) \}.

A striking consequence of the previous equations is that two functions in connection mutually define each other. Hence an adjoint to some given function, if it exists, is unique. Another sizeable consequence: if a function f:XYf : X \to Y is such that, for some bYb \in Y, the set {aXf(a)b}\{ a \in X \mid f(a) \leq b\} does not have a maximum element, then ff cannot have a right adjoint. A similar remark applies to a function g:YXg : Y \to X, which would then have no left adjoint if no minimum exists for a given aXa \in X.

When we considered the direct image f*f_* of a set function f:XYf : X \to Y, we explicitely built a right adjoint for it: it was the inverse image f*f^*. Now we can give a new characterization of the inverse image in terms of the direct image, using the previous formulas: for any BYB \subseteq Y, f*(B)=max{AXf*(A)B}.f^*(B) = \max\{ A \subseteq X \mid f_*(A) \subseteq B\}. In other words, the inverse image of a set BB is the biggest subset of XX such that its image is contained in BB. That’s really nice. We can also play the same game and characterize the direct image in terms of the inverse image: the direct image of a set AA is the smallest subset of YY such that its inverse image contains AA. Hence it seems lower adjoints can be thought of as supremums, approximating from above, while upper adjoints behave as infimums, approximating from below.

Since we have a three-way connection f*f*f!f_* \dashv f^* \dashv f_!, one could reasonably ask if it’s possible to extend it further. Let’s ask the first obvious question: is there a left adjoint to f*f_*? It would necessarily be given, for each BYB \subseteq Y, by the minimum of the set {AXBf*(A)}\{ A \subseteq X \mid B \subseteq f_*(A) \}. Suppose ff is not surjective. Then there is some point y0Yy_0 \in Y which is not contained in the image of ff. For a left adjoint to f*f_* to be defined, it would have to spit out, when evaluated at B={y0}B=\{y_0\}, the minimum of {AXy0f*(A)}\{A \subseteq X \mid y_0 \in f_*(A)\}, but this set is empty since y0y_0 is not in the image of ff. Therefore the left adjoint cannot exist when ff is not surjective. Well, what if ff is surjective? After thinking about this for a while, I realized we need another disjunction here. Suppose ff is not injective. Take two distincts elements x1x_1 and x2x_2 in XX with y=f(x1)=f(x2)y = f(x_1) = f(x_2). Set B={y}B = \{y\}. If a left adjoint to f*f_* were to exist, it would have to be defined at BB as the minimum of the set {AXyf*(A)}\{A \subseteq X \mid y \in f_*(A) \}. Clearly both {x1}\{x_1\} and {x2}\{x_2\} are elements of this set; hence, if there was a minimum element, it would have to be contained in both {x1}\{x_1\} and {x2}\{x_2\}, which would make it the empty set. However, the empty set fails to verify Bf*()B \subseteq f_*(\varnothing). So the left adjoint cannot exist in this case either. Since it trivially exists when ff is bijective (it’s f*f^*), we conclude: the direct image f*f_* admits a left adjoint if and only if ff is bijective.

The next obvious question on this matter is: does the image in fibers f!f_! admit a right adjoint? From the previous paragraph our expectations are: if and only if ff is bijective. Let’s see if that’s really the case. In fact, when ff is bijective we have f!=f*f_! = f_*, so by unicity of right adjoints we already know that the right adjoint of f!f_!, if it exists, has got to be f*f^*, which it is. So imagine ff is not bijective. First case: suppose it is not surjective. Then f!(A)f_!(A) contains all yYy \in Y which are not contained in the image of ff, since their fibers are all empty. Hence by taking BB to be any subset of YY completely contained in the image of ff, we find that the set {AXf!(A)B}\{A \subseteq X \mid f_!(A) \subseteq B\} is empty; in particular it has no maximum element, so the right adjoint does not exist. Second case: suppose ff is not injective. Without loss of generality we may suppose ff is surjective, so that the fiber of any point in YY is never empty. Pick some point yYy \in Y with at least two distinct preimages and partition f1(y)f^{-1}(y) into two disjoint non-empty subsets: f1(y)=A1A2f^{-1}(y) = A_1 \cup A_2. If the right adjoint were to exist, its value at the empty set would have to be the maximum element of the set {AXf!(A)=}\{A \subseteq X \mid f_!(A) = \varnothing\}. Quick scribling on napkin paper is able to show that both A1A_1 and A2A_2 are contained in this set, but their union is not. Hence it is necessary that ff be bijective for f!f_! to admit a right adjoint.

We have seen that the inverse image is quite special because it is the only one having both a left and a right adjoint in general. This is reflected in the nice properties the inverse image admits, for instance it preserves unions and intersections of subsets. We will see momentarily that this fact is an instance of a general fact about Galois connections (and even more generally about adjunctions).

When will this theory ever end?

I want to end on the general note that left adjoints preserve joins and right adjoints preserve meets. This is a special case of a more general theorem that left adjoints (in a more general sense) preserve colimits while right adjoints preserve limits.

Suppose f*f*f_* \dashv f^* are functions in Galois connection, with f*:XYf_* : X \to Y. Suppose also that XX and YY are not only posets but also lattices (i.e. every pair of elements has an infimum and a supremum). Fix a1a_1 and a2a_2 two elements of XX. Because f*f_* is necessarily monotone, we must have f*(a1)f*(a1a2)f_*(a_1) \leq f_*(a_1 \vee a_2) and f*(a2)f*(a1a2)f_*(a_2) \leq f_*(a_1 \vee a_2). Hence f*(a1a2)f_*(a_1 \vee a_2) is an upper bound for the pair f*(a1)f_*(a_1), f*(a2)f_*(a_2). Now we show that this is not only an upper bound, it’s actually the least upper bound. Let bb be any other upper bound, that is, f*(a1)bf_*(a_1) \leq b and f*(a2)bf_*(a_2) \leq b. By the Galois connection condition we must have a1f*(b)a_1 \leq f^*(b) and a2f*(b)a_2 \leq f^*(b). Then f*(b)f^*(b) is an upper bound for the pair a1a_1, a2a_2 whence a1a2f*(b)a_1 \vee a_2 \leq f^*(b). But now by the connection condition again, we must have f*(a1a2)bf_*(a_1 \vee a_2) \leq b, which is what we wanted. Therefore f*(a1a2)=f*(a1)f*(a2)f_*(a_1 \vee a_2) = f_*(a_1) \vee f_*(a_2). A dual argument shows that f*(b1b2)=f*(b1)f*(b2)f^*(b_1 \wedge b_2) = f^*(b_1) \wedge f^*(b_2) for any pair of elements b1b_1, b2b_2 in YY.

Let’s return to our example of three-way connection between the direct image, the inverse image and the image in fibers for any set function f:XYf : X \to Y. What we can conclude from the previous paragraph is: the inverse image is very special, for it preserves both unions and intersections, being a right and left adjoint at the same time. The direct image only preserves unions in general, while the image in fibers only preserves intersections.