
Firstly, sorry for the blog silence! Christmas, New Year, birthday and exams all got in the way… but now we’re finally onto this Zariski site post I’ve been going on about!

If you remember some of the stuff from this post about sheaves on sites, and this post about the functor of points of a scheme, then you’re good to go here. We introduced the functor of points of a scheme X, namely the representable presheaf h_X = \text{Mor}(-,X), and looked at how it relates to the solutions of Diophantine equations in the affine case X = \text{Spec}(A). We also saw here how if X is a K-scheme then the set of “rational points” X(K) of X identifies with h_X (\text{Spec}(K)) and with the sections of the morphism X\rightarrow \text{Spec}(K). And then I mentioned somewhere that it is possible to “identify” a scheme with its functor of points.

What this really means is that we can “embed” the category of schemes into the larger category of presheaves on the category of schemes (this is just the Yoneda embedding), but we can also completely characterise when a presheaf on the category of schemes is the functor of points of some scheme. To do this, we had to introduce the notion of what a sheaf on a category was, so we needed Grothendieck (pre)topologies. It will turn out that a presheaf on the category of (affine) schemes is the functor of points of a scheme if and only if it is a sheaf on the Zariski site and a certain “covering condition” is satisfied.

To explain why I have put the word “affine” in brackets above, we’ll prove the following nice lemma:

Lemma: Let R be a ring, and \textbf{Sch}/R the category of schemes over R. The functor

h: \textbf{Sch}/R \rightarrow [R\textbf{-Alg}, \textbf{Set}], \quad X\mapsto h_X = \text{Mor}_R (-,X)

is full and faithful, where \text{Mor}_R (-,X) is the representable presheaf on the category of R-schemes, here restricted to affine R-schemes.

This looks very like the usual Yoneda embedding except that the functor category here is slightly different. We’re sending each R-scheme X to its functor of points h_X, but this time not considered as a presheaf on the category of all R-schemes but now just as a presheaf on the subcategory of affine R-schemes; in other words, as a covariant functor from the category of R-algebras to sets. The lemma says that each functor of points is completely determined by its action on affine R-schemes, which in some way just reflects the fact that all schemes are just glued together from affine schemes.

Proof of the lemma: Let S = \text{Spec}(R) and let h_X denote the restriction of the presheaf \text{Mor}_S (-, X) to the category of affine R-schemes. We want to show that every natural transformation \phi: h_X \rightarrow h_{X'} comes from a unique morphism of R-schemes f: X\rightarrow X'. To do this, let X = \cup_{i} X_i be a cover of X by affine schemes, and let

j_i : X_i \hookrightarrow X

denote the inclusions (which are morphisms of R-schemes). Then f_i:=\phi_{X_i} (j_i)\in h_{X'} (X_i) is a morphism X_i \rightarrow X'. By naturality of \phi, the morphisms f_i and f_j agree on every affine open contained in X_i \cap X_j, and hence on the whole overlap X_i \cap X_j. Then using the “glueing” of the topological spaces, continuous maps and structure sheaves, we get a unique morphism of schemes f: X\rightarrow X' such that the restriction of f to X_i is f_i.

You can check that the natural transformation \phi: h_X \rightarrow h_{X'} is the image of f under h, so h is a full functor. Now again using that schemes are glued up from affine schemes, we see that if we have two morphisms f,g: X\rightarrow X' that differ then they must differ on one of the affine opens X_i\subseteq X of the cover. Then by the construction above, the induced natural transformations h_f, h_g: h_X \rightarrow h_{X'} will not be equal because they differ on their component at X_i. Thus h is a faithful functor. This completes the proof.

So a (relative) scheme is determined by the values its functor of points takes on the category of affine (relative) schemes, which really is just the dual of the category of rings. This gets around the awkward circular attempt to define a scheme as a certain type of presheaf on the category of schemes, because we already know what the category of rings is. When we eventually characterise schemes as certain sheaves on the category of affine schemes (the dual of the category of rings) we won’t need to already know what a scheme is because everything goes through purely categorically using that the category of affine schemes is dual to rings, and we can define topologies on this dual category.

The Zariski Topology

Let \mathcal{C} denote the category of affine schemes (i.e. the dual of the category of rings). To specify what the Zariski pretopology is on \mathcal{C} we need to say what the covering families are:

Suppose \left\{\text{Spec}(A_i)\rightarrow \text{Spec}(S)\right\} is a family of morphisms in \mathcal{C}. This family is a covering family for the Zariski pretopology iff

  1. Each ring A_i is the localisation of S at a single element s_i \in S;
  2. The morphism \text{Spec}(A_i)\rightarrow\text{Spec}(S) is the functorial inclusion induced by the localisation map S\rightarrow A_i = S[s_i^{-1}];
  3. There exist finitely many elements f_1, \dots, f_n \in S such that s_1 f_1 + \dots + s_n f_n = 1.

The first condition means we want to think of each s_i as a function on \text{Spec}(S); the second condition means we want the inclusion \text{Spec}(A_i)\hookrightarrow \text{Spec}(S) to be the subset of the space on which the function s_i is nonzero; the third condition expresses that these functions should form a “partition of unity“, a useful geometric condition.
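To make condition 3 a bit more tangible, here is a minimal sketch in Python for the special case S = \mathbb{Z}, where the s_i generate the unit ideal exactly when their greatest common divisor is 1. The function names `extended_gcd` and `partition_of_unity` are my own hypothetical helpers, not part of any library, and the choice of \mathbb{Z} is only an illustration.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g, where g is a gcd of a and b."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def partition_of_unity(elements):
    """Given integers s_1..s_n, return f_1..f_n with sum(s_i * f_i) == 1,
    or None if the s_i do not generate the unit ideal of Z."""
    g, coeffs = 0, []
    for s in elements:
        g, x, y = extended_gcd(g, s)
        # new g = (old g) * x + s * y, so rescale the earlier coefficients by x
        coeffs = [c * x for c in coeffs] + [y]
    if abs(g) != 1:
        return None
    # g is +1 or -1, so multiplying through by g normalises the sum to 1
    return [c * g for c in coeffs]


s = [6, 10, 15]
f = partition_of_unity(s)
print(f, sum(si * fi for si, fi in zip(s, f)))
```

So the three localisations of \mathbb{Z} at 6, 10 and 15 form a Zariski cover of \text{Spec}(\mathbb{Z}), even though no two of them do, whereas localising at 4 and 6 does not give a cover (the prime 2 is missed).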

You can check that these satisfy the axioms for a pretopology on \mathcal{C} given in this post – clearly, the family \left\{id_{\text{Spec}(S)}\right\} is a covering family induced by localisation at 1\in S; the pullback condition holds by looking at the images of these localised elements under homomorphisms; the final condition holds via composition of localisations.

This defines a pretopology on \mathcal{C} denoted J_{\text{Zar}}, the Zariski pretopology. This induces a unique Grothendieck topology on the category, and we denote the site (\mathcal{C}, J_{\text{Zar}}) by \textbf{Aff}_{\text{Zar}}, the affine Zariski site. A sheaf on this site is then a functor

F: \textbf{Aff}_{\text{Zar}}^{\text{op}} \rightarrow \textbf{Set}

such that for every covering family

\left\{\text{Spec}(A_i) = X_i \rightarrow\text{Spec}(S) = X\right\}

the set F(X) is the equaliser of the two maps \prod F(X_i) \rightrightarrows \prod F(X_i \times_X X_j) induced by restrictions, where the pullback in the category of affine schemes is defined as

X_i \times_X X_j = \text{Spec}\left(A_i \otimes_S A_j \right)

i.e. the spectrum of the tensor product of the S-algebras A_i and A_j.
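As a quick sanity check of this formula, here is a worked instance, assuming (as in the definition of covering families) that A_i = S[s_i^{-1}] and A_j = S[s_j^{-1}]. The particular choices S = \mathbb{Z}, s_1 = 2, s_2 = 3 and the forgetful functor F(\text{Spec}(R)) = R are just my illustration.

```latex
% Overlaps of basic opens are again basic opens:
S[s_i^{-1}] \otimes_S S[s_j^{-1}] \;\cong\; S[(s_i s_j)^{-1}]

% Illustration: S = \mathbb{Z},\ s_1 = 2,\ s_2 = 3, a cover since 2\cdot(-1) + 3\cdot 1 = 1:
\mathbb{Z}[\tfrac{1}{2}] \otimes_{\mathbb{Z}} \mathbb{Z}[\tfrac{1}{3}] \;\cong\; \mathbb{Z}[\tfrac{1}{6}]

% Sheaf condition for F(\text{Spec}(R)) = R on this cover:
\mathbb{Z} \;=\; \left\{ (a_1, a_2) \in \mathbb{Z}[\tfrac{1}{2}] \times \mathbb{Z}[\tfrac{1}{3}]
  \;:\; a_1 = a_2 \text{ in } \mathbb{Z}[\tfrac{1}{6}] \right\}
```

In words: a rational number whose denominator is both a power of 2 and a power of 3 is an integer, which is exactly the equaliser condition for this cover.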

Open subfunctors

We’re nearly ready to state the classification theorem for schemes as functors; we just need one final notion:

Suppose that \alpha: G\hookrightarrow F is a subfunctor, where both are functors from rings to sets. We say that G is an open subfunctor if whenever X = \text{Spec}(R) is an affine scheme and \psi: h_X \rightarrow F is a natural transformation, the morphism

h_X \times_F G \rightarrow h_X

is isomorphic to the natural transformation h_Y \rightarrow h_X induced by the inclusion Y\hookrightarrow X of an open subscheme.

One can show that the open subfunctors of a functor of points h_X are precisely the functors of points of open subschemes of X.

Characterisation of schemes amongst all functors

We’ve seen that the whole category of schemes embeds inside the category of presheaves on the category of affine schemes. Let’s now see what that embedded category looks like:

Functor classification theorem: Let F: \textbf{Aff}^{\text{op}} \rightarrow\textbf{Set} be a presheaf on the Zariski site (i.e. a functor from rings to sets). Then F is the functor of points of a scheme X if and only if

  1. F is a sheaf on the Zariski site \textbf{Aff}_{\text{Zar}};
  2. there exist rings R_i and open subfunctors \alpha_i : h_{R_i} \rightarrow F such that for each field K, F(\text{Spec}(K)) is the union of the images of the sets h_{R_i} (K) under the \text{Spec}(K)-components of the natural transformations \alpha_i.

This is actually not hard to prove – just using that schemes are glued up from affine schemes. Although it’s abstract it’s somehow closer to home than viewing schemes as geometric spaces, because we understand algebra (rings) well in the abstract sense, and we understand categorical algebra well too – the geometry of schemes then emerges as categorical algebra on the category of rings. I’ll try to emphasise the two points of view throughout this blog, although the “ringed space” viewpoint will undoubtedly serve us better until we start doing more advanced stuff, where it is possible to change the Zariski site to “finer” sites like the étale site having better properties. It’s also necessary to use the functor of points viewpoint for anything involving stacks and other generalisations of schemes, and this particular arm of algebraic geometry makes heavy use of this “geometry-through-algebra” idea.

Next time I’m leaving all this abstract nonsense and going back to doing some real geometry with schemes, starting from the basics.

Right, it’s nearly Christmas and I have been so busy working on my undergrad thesis that I haven’t had any time to fire off another proper post. So I just wanted to record a few more basic facts about Grothendieck topologies further to my last post about them!

Let \mathcal{C} be any small category. Just as for an arbitrary topological space, there are two obvious (and not always so interesting) Grothendieck topologies we can put on \mathcal{C}:

The Indiscrete Topology: for each U\in\mathcal{C}, let J_{\text{ind}}(U) consist of just the maximal sieve \bigcup_{A\in\mathcal{C}} \mathcal{C}(A,U). This is analogous to the indiscrete topology on a topological space where the only open sets are the empty set and the whole set. The Grothendieck pretopology associated to this topology (if \mathcal{C} has pullbacks) has only isomorphisms as covering families, so it’s a strange topology in which you can’t cover “open sets” with any smaller or larger “open sets”.

Another noticeable thing about the indiscrete topology J_{\text{ind}} is that every presheaf F on the site (\mathcal{C}, J_{\text{ind}}) is a sheaf; this is because the only covering sieve on an object U is the maximal sieve, which can be identified with the representable presheaf h_U = \mathcal{C}(-,U) itself, and every morphism h_U \rightarrow F in [\mathcal{C}^{op}, \text{Set}] has exactly one extension to a morphism h_U \rightarrow F; namely, itself (we are only extending along the identity natural transformation of h_U).

This is analogous to equipping a space with the usual indiscrete topology; the sheaf axioms are automatically satisfied for any presheaf because the only open sets to worry about are the whole space and the empty set. There is not enough “separation” between points to mess up a presheaf becoming a sheaf.

The Discrete Topology: Conversely, the “finest” possible topology on a category is the discrete topology J_{\text{dis}} in which every sieve is a covering sieve. Here every family of morphisms into an object U, even the empty family, generates a covering sieve on U. In contrast to the indiscrete topology, the only presheaf on (\mathcal{C}, J_{\text{dis}}) that is a sheaf is (up to isomorphism) the terminal object in the category [\mathcal{C}^{op}, \text{Set}] i.e. the presheaf sending every object to a one-point set. There is just “too much separation” going on for sheaves to behave nicely.


Apart from these two extremes, there is also another checkpoint in the lattice of Grothendieck topologies on a category; this is called the canonical topology. As we have seen, the more covering sieves are introduced into a topology, the fewer sheaves are left on the category. There are certain presheaves that, often, we would like to be sheaves; these are the representable presheaves h_U = \mathcal{C}(-,U). As we’ll later see, schemes turn out to be certain sheaves on the “Zariski site”, and these are closely linked to the concept of representability.

The canonical topology is the largest Grothendieck topology on a category on which every representable presheaf h_U is a sheaf. Any topology where every representable presheaf is a sheaf is called subcanonical, and in practice most useful topologies are subcanonical – the Zariski topology, étale topology, fppf and fpqc topologies, …

One day in the future we’ll get to work with these exotic things but at the moment we’ll stick with the first. Anyway, I’ll get back to this sometime soon; Merry Christmas and a happy New Year!

This is going to be one of the most abstract and technical posts yet; it’s inessential for reading this blog, but I wanted to include it because I would like to cultivate as many different possible viewpoints of scheme theory from the very start. There will be one more post running in this vein and then we will go back to doing actual geometry. However, this post will provide some more structure to the “functor of points” approach we have seen.

Essentially, when learning about the functor of points we saw that every scheme can be viewed as a set-valued presheaf on the opposite of the category of rings (i.e. just a functor from rings to sets, but this is a better viewpoint). But in fact this does not characterise schemes – not every presheaf on the opposite category of rings is the functor of points of a scheme (i.e. not every presheaf on the opposite category of rings is representable). So in order to characterise schemes in this way we will need to find some further structure present. It turns out that with a suitable definition of a “topology” on the (opposite) category of rings – turning this category into something like a space – we will see that schemes are actually certain sheaves on this category, which obey the “identity” and “glueing” axioms we looked at in one of the very first posts on this blog.

This post will consist of establishing the vocabulary of these so-called “Grothendieck (pre)topologies” on categories, and we will use these to define sheaves on categories in two ways. In an upcoming post we will use this to see the applicability to schemes, but this is going to be a purely category-theoretic post (and mostly taken from Johnstone’s lovely book Topos Theory). Feel free to skip!


Let \mathcal{C} be a (small – objects form a set) category with pullbacks. A Grothendieck pretopology P on \mathcal{C} is an assignment of a set P(U) to every object U\in\mathcal{C} where the elements of P(U) are sets of \mathcal{C}-morphisms

\left\{ U_i \xrightarrow{\alpha_i} U: i\in I\right\}

into U (called covering families) satisfying the following three axioms:

  1. For every U\in\mathcal{C}, the family \left\{\text{id}_U : U\rightarrow U\right\} \in P(U);
  2. If f: V\rightarrow U is a morphism in \mathcal{C} and \left\{ U_i \rightarrow U: i\in I\right\}\in P(U) then \left\{V\times_U U_i \xrightarrow{\pi_V} V: i\in I\right\}\in P(V);
  3. If \left\{ U_i \xrightarrow{\alpha_i} U : i\in I\right\} is an element of P(U) and \left\{V_{i j} \xrightarrow{\beta_{i j}} U_i : j\in J_i \right\} is an element of P(U_i) for each i\in I then the composition \left\{ V_{i j} \xrightarrow{\alpha_i \circ \beta_{i j}} U : i\in I, j\in J_i\right\} is an element of P(U).

Morally, these should be interpreted as follows: if U is any topological space then (1) U\xrightarrow{\text{id}} U is an open covering of U; (2) if you’re given an open covering U_i \hookrightarrow U of a space U and V is an open subset of U then V \cap U_i \hookrightarrow V is an open covering of V; and (3) if you’re given an open covering U_i \hookrightarrow U and each of the open sets U_i has an open covering V_{i j} \hookrightarrow U_i then the collection of all the V_{i j} also covers U.

As a concrete example, let X be a topological space and let \text{Op}(X) be the poset category of all its open subsets, ordered by inclusion (we saw this in this post). The pretopology on this category is just the set of open covers of each open set i.e. a covering family for U is just a set of inclusions U_i \hookrightarrow U such that the union of all the U_i equals U.
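For a very small instance of this example, here is a sketch in Python on a made-up three-point space. The space `X`, its list of open sets `OPENS` and the helper names are all hypothetical choices of mine; the only thing taken from the text is that a covering family of U is a set of open subsets whose union is U, and that pullbacks in \text{Op}(X) are intersections.

```python
# A hypothetical finite topological space on the points {1, 2, 3}.
X = frozenset({1, 2, 3})
OPENS = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2}), X]


def is_covering_family(family, U):
    """A family of open subsets of U covers U when its union is all of U."""
    members_ok = all(V in OPENS and V <= U for V in family)
    return members_ok and frozenset().union(*family) == U


def pull_back(family, V):
    """In Op(X) the pullback of U_i -> U along V -> U is the intersection of V and U_i."""
    return [V & U_i for U_i in family]


U = frozenset({1, 2})
cover = [frozenset({1}), frozenset({2})]
print(is_covering_family(cover, U))                                # axiom 2 input
print(is_covering_family(pull_back(cover, frozenset({1})), frozenset({1})))
```

The first check confirms that the two singletons cover their union, and the second that the cover stays a cover after pulling back along the inclusion of the open set {1}, which is axiom 2 in this tiny case.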

Grothendieck pretopologies already give us enough “local” information in our categories to define a suitable notion of a sheaf on the category. Recall we could define a presheaf on any category as simply a contravariant functor, but in order to define a sheaf we needed to be able to talk about covers, and this is exactly what a pretopology lets us do. With this in mind we say that a presheaf F on a category \mathcal{C} (taking values in sets) is a sheaf if for every covering family \left\{ U_i \rightarrow U: i\in I\right\}\in P(U) the diagram

F(U) \rightarrow \prod_{i\in I} F(U_i) \rightrightarrows \prod_{i,j\in I} F(U_i \times_U U_j)

is an equaliser. Here the first map is the universal map into the product induced by the restrictions F(U)\rightarrow F(U_i) for every morphism U_i \rightarrow U in the covering family. The top and bottom second maps are constructed as follows: let

\pi_i : \prod_{i\in I} F(U_i) \rightarrow F(U_i)

be the canonical projections. For each j\in I we have a restriction map

\rho^i_{i j } : F(U_i)\rightarrow F(U_i \times_U U_j)

Therefore the compositions \rho^i_{i j} \circ \pi_i induce a universal morphism \prod_{i\in I} F(U_i) \rightarrow \prod_{i,j} F(U_i \times_U U_j). This is the first of the two maps. The second is built in the same way, but by first projecting to F(U_j) and then restricting along the other projection U_i \times_U U_j \rightarrow U_j. In general these give different morphisms, but they always agree when precomposed with the morphism from F(U), because F is a functor; F is a sheaf precisely when F(U) is universal with this property, i.e. is the equaliser.
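To see the equaliser condition in action, here is a minimal sketch in Python for the presheaf of all functions on subsets of a finite set, where U_i \times_U U_j is just the intersection U_i \cap U_j. Representing sections as dictionaries and the helper names `restrict` and `glue` are my own assumptions for illustration.

```python
def restrict(section, V):
    """Restrict a section (a dict point -> value) to the subset V."""
    return {x: section[x] for x in V}


def glue(local_sections):
    """local_sections is a list of pairs (U_i, s_i) with s_i a section over U_i.

    Return the unique section over the union of the U_i restricting to each s_i,
    or None if two of the s_i disagree on an overlap U_i & U_j.
    """
    for U_i, s_i in local_sections:
        for U_j, s_j in local_sections:
            overlap = U_i & U_j
            if restrict(s_i, overlap) != restrict(s_j, overlap):
                return None  # the two maps into the double product differ here
    glued = {}
    for _, s in local_sections:
        glued.update(s)
    return glued


U1, U2 = frozenset({1, 2}), frozenset({2, 3})
s1 = {1: "a", 2: "b"}
s2 = {2: "b", 3: "c"}
print(glue([(U1, s1), (U2, s2)]))
```

A tuple of local sections lies in the equaliser exactly when `glue` does not return `None`, and the glued dictionary is then the corresponding element of F(U).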

We will use this definition of a sheaf sometimes because it is more closely related to what we have already seen. However it is a slightly vague definition; to see what I mean, suppose that F satisfies this condition for a covering family R = \left\{U_i \rightarrow U: i\in I\right\}\in P(U). Then if S is another family of covers containing R, F will also satisfy the sheaf condition with respect to S. This means that two possibly different pretopologies can give rise to exactly the same sheaves.

This is a far-reaching quasi-generalisation of the fact that two different metrics on a space can induce exactly the same topology (the metrics then being called equivalent) and therefore the sheaves on these spaces (in the usual sense we defined here) will be exactly the same. Given that sheaves are more of  a “topological” object and not a “metric” object, it might make sense to get to the root of this and define actual “topologies” on our categories rather than “metrics” (bear in mind this is a very loose analogy!). So we might want to consider covering families which are somehow “maximal”, and this is what the notion of a “sieve” encapsulates. Following this, we can also dispense with the assumption that our category \mathcal{C} has pullbacks; the drawback is that the notion of a sheaf becomes significantly more abstract.


Let \mathcal{C} be a category and let U\in\mathcal{C} be an object. We say a set R of morphisms with codomain U is a sieve on U if

\left(V\xrightarrow{\alpha} U\right)\in R \implies \left(W\xrightarrow{\beta} V\xrightarrow{\alpha} U\right)\in R

for every morphism \beta: W\rightarrow V.

Definition: Let \mathcal{C} be a small category. A Grothendieck topology J on \mathcal{C} is a set J(U) of sieves on every object U\in\mathcal{C} (called covering sieves) such that

  1. The maximal sieve \bigcup_{A\in\mathcal{C}} \mathcal{C}(A,U) is an element of J(U);
  2. If R\in J(U) and f: V\rightarrow U is a morphism in \mathcal{C} then the pullback sieve f^* (R) = \left\{W\xrightarrow{\alpha} V : f\circ \alpha \in R \right\} is an element of J(V);
  3. If R\in J(U) and S is some sieve on U such that for every morphism (V\xrightarrow{f} U)\in R we have f^* (S)\in J(V) then S\in J(U).

A category equipped with a (Grothendieck) topology is called a site. This definition implies some important things:

Let R\in J(U) and let S be a sieve on U containing R i.e. every morphism in R is in S. Pick a morphism f: V\rightarrow U in R; then if \alpha: W\rightarrow V is any morphism in f^* (S), the fact that R is a sieve implies that (W\xrightarrow{\alpha} V \xrightarrow{f} U)\in R, and therefore (W\xrightarrow{\alpha} V)\in f^* (R). But this shows that f^* (S)\subseteq f^* (R). Conversely it is clear that f^* (R)\subseteq f^* (S) from the definitions; therefore f^* (S) = f^* (R)\in J(V). By axiom 3, it follows that S\in J(U).

Note that every Grothendieck pretopology J induces a Grothendieck topology J' as follows: for each object U, let J'(U) be the collection of sieves S on U that contain some covering family R\in J(U). Then J' is a Grothendieck topology (check the axioms!). This is in some way (maybe) like how a metric induces a topology on a space, and in some sense this makes topologies more general.
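Sieves are easiest to picture in a poset category, where a morphism into U is just an object sitting below U. Here is a rough sketch in Python on a made-up three-point space, with sieves stored as sets of open sets; the space and the helper names (`generated_sieve`, `pullback_sieve`, and so on) are hypothetical and only meant to illustrate the definitions above.

```python
# Open sets of a hypothetical three-point space; a morphism W -> V is an inclusion.
X = frozenset({1, 2, 3})
OPENS = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2}), X]


def maximal_sieve(U):
    """All morphisms into U, i.e. all opens contained in U."""
    return {W for W in OPENS if W <= U}


def generated_sieve(family):
    """Smallest sieve containing the family: opens contained in some member."""
    return {W for W in OPENS if any(W <= U_i for U_i in family)}


def is_sieve(R, U):
    """R consists of opens inside U and is closed under precomposition."""
    inside = all(V <= U for V in R)
    closed = all(W in R for V in R for W in OPENS if W <= V)
    return inside and closed


def pullback_sieve(R, V):
    """f^*(R) for the inclusion f: V -> U: the W inside V whose composite lies in R."""
    return {W for W in OPENS if W <= V and W in R}


U = frozenset({1, 2})
R = generated_sieve([frozenset({1}), frozenset({2})])
print(is_sieve(R, U), pullback_sieve(R, frozenset({1})) == maximal_sieve(frozenset({1})))
```

The last line illustrates the observation used in the argument above: pulling a sieve back along one of its own morphisms gives the maximal sieve.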

The real benefit of using sieves instead of covering families is that we can drop the assumption about the base category having pullbacks. The reason is that we can always pull a sieve back along a morphism even if we cannot do this for the individual morphisms in the sieve. To see why this works, let R be a sieve on an object U\in\mathcal{C} and let

h_U = \mathcal{C}(-, U)

be the representable presheaf on \mathcal{C} induced by U. Let

\hat{R}: \mathcal{C}^\text{op} \rightarrow \text{Set}

be the presheaf V\mapsto \left\{\alpha \in R : \text{domain}(\alpha) = V\right\}. Then \hat{R} is a presheaf on \mathcal{C} and is actually a subpresheaf of the representable presheaf h_U. This means that in the functor category [\mathcal{C}^\text{op}, \text{Set}] \hat{R} is a subobject of h_U, but concretely it just means that for every V\in\mathcal{C}, \hat{R}(V) is a subset of h_U (V). Now the presheaf category [\mathcal{C}^\text{op}, \text{Set}] does have pullbacks (indeed, it has all limits and colimits and these are computed “pointwise”). By the Yoneda lemma we can think of \mathcal{C} as being fully and faithfully embedded in this presheaf category, so we can identify objects U with their representable presheaves h_U and sieves R with the presheaves \hat{R}. This explains why we can dispense with \mathcal{C} having pullbacks, because they are present in this larger category in which \mathcal{C} sits.

Sheaves on a site

A sheaf F on a site (\mathcal{C}, J) is then a presheaf on \mathcal{C} such that for every U\in\mathcal{C} and every covering sieve R\in J(U), every morphism

\hat{R}\rightarrow F

in [\mathcal{C}^\text{op}, \text{Set}] has exactly one extension to a morphism

h_U \rightarrow F

Ouch! How abstract can you get???

But there’s a nice way to decode this to see it really isn’t too different to what we already have. Recall that by the Yoneda Lemma, natural transformations h_U \rightarrow F correspond bijectively to elements of the set F(U). Now, a natural transformation \eta:\hat{R}\rightarrow F consists of a function \eta_V: \hat{R}(V)\rightarrow F(V) for every object V\in\mathcal{C} that is compatible with morphisms f: W\rightarrow V i.e. we get the usual naturality squares for every morphism.

If V is not the domain of any of the morphisms in the sieve R then \hat{R}(V)=\varnothing. In this case the function \eta_V is the unique “empty function”. If not, \eta_V produces an element s_\alpha of F(V) for every element \alpha\in\hat{R}(V). Furthermore, if f: W\rightarrow V is a morphism then since R is a sieve, \hat{R}(W) is nonempty and we obtain elements

\hat{f}(\alpha) = \alpha\circ f\in \hat{R}(W)

for each \alpha\in\hat{R}(V). Mapping these into F(W) via \eta_W we see that F(f) (s_\alpha) equals s_{\hat{f}(\alpha)}.

Therefore a morphism \eta: \hat{R}\rightarrow F amounts to specifying a collection of “sections” s_\alpha in every F(V) for each V arising as a domain of one of the elements in R which are “compatible” in the sense that the “restriction maps” F(V)\rightarrow F(W) induced by morphisms W\rightarrow V take sections to sections. By the definition above, F being a sheaf means there is a unique natural transformation h_U \rightarrow F extending \eta. But since this corresponds bijectively to an element of F(U), this means there is a unique section s\in F(U) such that the “restriction” of s to each V (by which I mean the map F(U)\rightarrow F(V) induced by some \alpha\in \hat{R}(V)) equals s_\alpha. And this looks a lot more like our definition of a sheaf we originally gave!

Here are some final notes for this post:

  1. In the last paragraph we saw how unique “sections” s\in F(U) can be glued up uniquely from smaller sections s_\alpha when F is a sheaf. The crucial difference (which confused me for a while) is that whereas in the topological case there was always at most one map V\hookrightarrow U (corresponding to the inclusion of an open subset), this time there may be many different ways to “restrict a section”; in fact we can do this along each of the different morphisms \alpha\in\hat{R}(V). So in F(V) there may be many different sections s_\alpha, s_\beta, \dots all being “restrictions” of the same “global section” s; this is fine, just noting that they have been “restricted” in different ways.
  2. Sometimes it is more useful to work with a topology and other times a pretopology may be better suited. I get the feeling from reading the experts that pretopologies may be more suitable for actually computing things, whereas topologies and sites are more elegant, but really I’m not sure. I’m only beginning to get my head around these things.

Finally, in one of the next few posts I will attempt to use these gadgets to show how every scheme, as we’ve defined them, can also be defined as a sheaf on the Zariski site.


Schemes really are mysterious objects; in one form they manifest themselves as locally ringed spaces as we’ve defined them. But they also appear naturally as set-valued functors on the category of rings. There is actually an intuitive explanation of this fact, and in Diophantine geometry this method of viewing schemes is sometimes very helpful. This might motivate what’s going on:

Let A be a ring, consider the polynomial f=x^2 + 1\in A[x], and let X = \text{Spec}(A[x]/(f)); for a ring B (more precisely, an A-algebra) let X(B) denote the set of roots \alpha\in B of f. We call X(B) the set of “B-points of X”. For example, say A = \mathbb Z. Then X(\mathbb {R}) = \varnothing but X(\mathbb{C}) = \left\{i, -i\right\}, X(\mathbb{F}_2) = \left\{1\right\} and if p \equiv 1 \pmod{4} then X(\mathbb{F}_p) has two points. So applying X to a ring takes the ring to the set of points of a variety over that ring.
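The finite-field values are easy to check by brute force. Here is a small sketch in Python that lists the roots of x^2 + 1 in \mathbb{Z}/n\mathbb{Z}; the function name `roots_of_x2_plus_1` is just a hypothetical helper for this post.

```python
def roots_of_x2_plus_1(n):
    """Return the elements a of Z/nZ (as integers 0..n-1) with a^2 + 1 == 0."""
    return [a for a in range(n) if (a * a + 1) % n == 0]


for p in (2, 3, 5, 7, 13):
    print(p, roots_of_x2_plus_1(p))
```

For the primes listed, this finds the single point 1 over \mathbb{F}_2, two points over \mathbb{F}_5 and \mathbb{F}_{13} (both congruent to 1 mod 4), and no points over \mathbb{F}_3 or \mathbb{F}_7.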

The “functor of points” approach says that we should think of each of these varieties X(R) as being “part of the scheme X“, and the scheme X itself is built from the collections of all its R-points X(R) as R ranges over every ring. Alternatively, a scheme really is a “protocol” assigning to each ring the set of solutions to an equation. This is intuitively why we have to define a variety over a field but schemes in general are not dependent on some ground field or ring (although there is a notion of an S-scheme which is the relative version analogous to varieties over a field which I need to introduce sometime), and this explains why we can chop, change and glue schemes easily, unlike for varieties.

Now viewing a scheme as the variety cut out by a polynomial over every possible ring gives us a lot more information about a polynomial f than just considering its roots in any one ring. For example, there are many curves over the complex numbers all having only one “rational point” i.e. a point in X(\mathbb{Q}); but these curves may be fundamentally different – for example, they may have different point structures over number fields i.e. the sets X(K) and X(L) may be different for different number fields K and L. So the overall schemes associated to the curve differ in their points coming from certain rings but not others. Considering just their R-points which agree for a certain R loses lots of information about the overall schemes.

As a concrete example, let us consider the two curves C_1 and C_2 given respectively by the equations

C_1: y^2 = x^3 + 7

C_2: y^2 = x^3 - 2

It is a theorem of Lebesgue that there are no integer points on C_1 (i.e. points (x,y), both integers, satisfying the equation for C_1), while Fermat proved that C_2 has precisely two integer points, namely (x,y) = (3,5), (3,-5). The curves C_1 and C_2 are both examples of “elliptic curves” (which we will see lots more of), and over \mathbb C they are both topologically just tori. So there is a bijection C_1 (\mathbb{C}) \cong C_2 (\mathbb{C}), while C_1 (\mathbb{Z}) = \varnothing and C_2 (\mathbb{Z}) has two points. If we only took the \mathbb{C}-points of C_1 and C_2 into account, we might think they were isomorphic, but considering their collections of all their points shows us they are not.
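A brute-force search is of course no substitute for these theorems, but it is a reassuring sanity check. Here is a rough sketch in Python that looks for integer points on y^2 = x^3 + k with |x| up to some bound; the helper name `integer_points` and the bound of 1000 are arbitrary choices of mine.

```python
from math import isqrt


def integer_points(k, bound):
    """Integer solutions (x, y) of y^2 = x^3 + k with |x| <= bound."""
    points = []
    for x in range(-bound, bound + 1):
        rhs = x**3 + k
        if rhs < 0:
            continue  # a negative number is never a square
        y = isqrt(rhs)
        if y * y == rhs:
            points.append((x, y))
            if y != 0:
                points.append((x, -y))
    return points


print(integer_points(7, 1000))    # candidates on C_1
print(integer_points(-2, 1000))   # candidates on C_2
```

Within this range the search finds nothing on C_1 and only the two points with x = 3 on C_2, in agreement with the statements above; it says nothing about larger x on its own.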

But there’s more than this: let f now be a polynomial in n variables (we can naturally also consider collections of polynomials). If \phi: R\rightarrow S is a ring homomorphism then we get a natural function

\phi_* : X(R)\rightarrow X(S)

given by

(a_1, \dots, a_n) \mapsto (\phi(a_1), \dots, \phi(a_n))
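Here is a tiny sketch of this map in Python for the one-variable polynomial x^2 + 1 and the reduction homomorphism \phi: \mathbb{Z}/25 \rightarrow \mathbb{Z}/5. Both the choice of rings and the helper names `points` and `pushforward` are my own assumptions for illustration; with more variables one would apply \phi to each coordinate.

```python
def points(n):
    """X(Z/nZ) for f = x^2 + 1: the residues a with a^2 + 1 == 0 mod n."""
    return {a for a in range(n) if (a * a + 1) % n == 0}


def pushforward(point, m):
    """phi_* for the reduction map phi: Z/nZ -> Z/mZ, where m divides n."""
    return point % m


source = points(25)
image = {pushforward(a, 5) for a in source}
print(sorted(source), sorted(image), image <= points(5))
```

The point of the check is that \phi_* really lands in X(\mathbb{Z}/5): a root of f modulo 25 reduces to a root of f modulo 5, simply because \phi is a ring homomorphism.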

In this case we see that X is a functor from the category of commutative rings to sets. In fact, we will now see that X is representable i.e. it is naturally isomorphic to a functor of the form \text{Hom}(C,-) where C is some ring. Let’s set everything up in generality:

Theorem: Let A be a ring and let g_1, \dots, g_m be polynomials in A[x_1, \dots , x_n]. Let I be the ideal generated by g_1, \dots, g_m and set

X = \text{Spec}\left(A[x_1, \dots, x_n]/I\right)

Let h_X be the functor from the category of commutative rings to sets

h_X(R) = \left\{P \in R^n: h(P) = 0 \text{ for all } h\in I\right\}

Then h_X is naturally isomorphic to the functor

\text{Hom}(A[x_1, \dots, x_n]/I, -)\cong\text{Mor}(\text{Spec}(-), X)

Proof: We will prove naturality first and then that this is an isomorphism. Let R be a ring and take P\in h_X (R). Then the homomorphism “evaluate at P”

e_P : A[x_1, \dots, x_n]/I \rightarrow R, h+I \mapsto h(P)

is well-defined. This gives a function

f_R : X(R)\rightarrow \text{Hom}(A[x_1, \dots, x_n]/I, R), P\mapsto e_P

If \phi: R\rightarrow S is a ring homomorphism, it is easy to see that

\phi \circ e_P = e_{\phi_* (P)}

This establishes naturality. Now suppose that we are given a ring homomorphism \psi: A[x_1, \dots, x_n]/I\rightarrow R; then this ring homomorphism is determined by where it sends A and each variable x_i. Setting a_i = \psi(x_i + I) gives a point P = (a_1, \dots, a_n)\in R^n, and since \psi kills I we have g_j (P) = \psi(g_j + I) = 0 for each j i.e. P\in X(R). So let us define

q_R : \text{Hom}(A[x_1, \dots, x_n]/I, R)\rightarrow X(R)

\psi \mapsto P = (\psi(x_1 + I), \dots, \psi(x_n +I))

Then it is easy to see that q_R = f_R^{-1}, so every component of the natural transformation is a bijection and therefore it is a natural isomorphism. The isomorphism with \text{Mor}(-,X) follows from the big theorem in this post. Therefore we see that the functor assigning to an affine scheme X its collections of R-points for each ring R is the same as the (contravariant) “codomain functor” assigning to X the collections of morphisms from affine schemes into X.
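To see the two maps f_R and q_R doing their job, here is the smallest worked instance I could think of, with A = \mathbb{Z}, n = 1, g_1 = x^2 + 1 and R = \mathbb{F}_5. These choices are mine and purely illustrative.

```latex
% Setup: A = \mathbb{Z},\quad n = 1,\quad g_1 = x^2 + 1,\quad I = (x^2 + 1),\quad R = \mathbb{F}_5
h_X(\mathbb{F}_5) = \{ a \in \mathbb{F}_5 : a^2 + 1 = 0 \} = \{2, 3\}

% f_R sends each point to its evaluation homomorphism:
f_R(2) = e_2 : h + I \mapsto h(2), \qquad f_R(3) = e_3 : h + I \mapsto h(3)

% q_R reads the point back off the image of x + I:
q_R(e_2) = e_2(x + I) = 2, \qquad q_R(e_3) = e_3(x + I) = 3

% Hence
\text{Hom}(\mathbb{Z}[x]/(x^2 + 1), \mathbb{F}_5) = \{ e_2, e_3 \}
```

Any homomorphism out of \mathbb{Z}[x]/(x^2+1) must send x + I to a square root of -1, and in \mathbb{F}_5 the only ones are 2 and 3, so the two sets match up exactly as the theorem says.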

Here’s how this connects to general schemes; let X now be an arbitrary scheme. Then we know the functor

\text{Mor}(\text{Spec}(-), X): \text{CRing}\rightarrow \text{Set}

is isomorphic to the functor h_X sending each ring R to the set of R-points of X. The Yoneda embedding is full and faithful; this means that the functor X\mapsto h_X on the category of schemes “embeds” the category of schemes into the larger category of functors \text{CRing}\rightarrow\text{Set} i.e. the category of presheaves on the category of affine schemes. What this really tells us is that two schemes X and Y can be distinguished by knowing all their different R-points X(R) and Y(R) as R varies over all rings i.e. the morphisms from all affine schemes into an arbitrary scheme determine that scheme up to isomorphism.

Therefore we can “redefine” what a scheme is: it is (roughly) a presheaf on the opposite category of rings i.e. a functor from the category of rings to sets. More specifically we need to equip the opposite of the category of commutative rings (i.e. the category of affine schemes) with something called a Grothendieck topology to turn it into a site (the Zariski site) which is basically a way of turning a category into a “space”; then every scheme is actually a sheaf on the Zariski site. I will explore this idea more next time.

You might want to look at this fantastic post on Neverendingbooks for an example of viewing \text{Spec}(\mathbb{Z}[x]) through its functor of points. For now, I will just give a couple of quick examples of the functor of points:

Let X = \text{Spec}(\mathbb{Z}[x]) be the “arithmetic plane” scheme; then for every ring R, the set of R-points of X is

h_X (R) \cong \text{Mor}(\text{Spec}(R), X) \cong \text{Hom}(\mathbb{Z}[x], R)

Now a ring morphism \mathbb{Z}[x]\rightarrow R is determined by what happens to \mathbb{Z} and what happens to x. Since \mathbb Z is the initial object in the category of rings, it follows that every morphism \mathbb{Z}[x]\rightarrow R is determined by where x is sent alone. Since we can “evaluate” a polynomial f\in \mathbb{Z}[x] at any element of R, it follows that h_X (R) is isomorphic to the underlying point set of the ring R i.e. h_X is just the forgetful functor from the category of rings to sets. It is interesting to note that the forgetful functor is representable, and the linked blog post above tells us that this does give us interesting information about X.
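To see the identification h_X (R) \cong R in the smallest possible case, take R = \mathbb{F}_2 (my choice, purely for illustration). The notation \text{ev}_r for "evaluate at r" is also mine.

```latex
% Illustrative choice: R = \mathbb{F}_2, X = \text{Spec}(\mathbb{Z}[x]).
\begin{align*}
h_X(\mathbb{F}_2) \cong \text{Hom}(\mathbb{Z}[x], \mathbb{F}_2) &= \{\text{ev}_0, \text{ev}_1\},
  \qquad \text{ev}_r(f) = f(r) \bmod 2, \\
\ker(\text{ev}_0) &= (2, x), \qquad \ker(\text{ev}_1) = (2, x+1).
\end{align*}
```

So the two elements of the underlying set of \mathbb{F}_2 correspond to two homomorphisms, and their kernels are two distinct closed points of \text{Spec}(\mathbb{Z}[x]) with residue field \mathbb{F}_2.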

What about when X = \text{Spec}(\mathbb{Z})? Then h_X (R) \cong \text{Hom}(\mathbb{Z}, R). Since \mathbb{Z} is the initial object, each of these is a one-element set and all such sets are therefore isomorphic. Hence h_X is the functor mapping every ring to a terminal object in the category of sets i.e. a one-point set. So this functor is also representable! Translating this to schemes, it means every affine scheme Y has exactly one morphism to X i.e. there is exactly one Y-point of X for every affine scheme Y.

Finally, a note about the name “functor of points”; in arithmetic geometry, we’re often interested in the rational points on a variety/scheme X i.e. the K-points X(K), where K is some (not necessarily algebraically closed) field. We’ve seen these correspond to the scheme morphisms

\text{Spec}(K)\rightarrow X

and since \text{Spec}(K) is just a single closed point, we can think of these morphisms as the actual points on the scheme X (as a locally ringed space) that “look like” \text{Spec}(K) i.e. points where the restriction of the structure sheaf becomes isomorphic to K. These we can think of as points having coordinates valued in the field K, and recover the usual definition of K-points this way. The kicker is that we can do this for other rings R that are not fields, which correspond to “points” of X that “look like” \text{Spec}(R), and structurally are subsets of the scheme. Doing this for every single ring R splits up the scheme into a (not necessarily disjoint) union of all its generalised “points”, along with the functorial data of how these fit together in terms of the ring morphisms. I’ve drawn a vague picture of how this works for the scheme

E = \text{Spec}\left(\mathbb{Z}[x,y]/(y^2 - x^3 + 2)\right)

corresponding to the elliptic curve C_2: y^2 = x^3 -2 above.
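To make the R-points of E concrete, one can simply list them for a small ring. The choice R = \mathbb{F}_5 below is mine; the computation is just a brute-force check of y^2 = x^3 - 2 modulo 5, using that the squares in \mathbb{F}_5 are 0, 1 and 4.

```latex
% Illustrative choice: R = \mathbb{F}_5. Squares mod 5: 0^2=0, 1^2=4^2=1, 2^2=3^2=4.
\begin{align*}
x = 0 &: x^3 - 2 \equiv 3 && \text{not a square, no points} \\
x = 1 &: x^3 - 2 \equiv 4 && y \in \{2, 3\} \\
x = 2 &: x^3 - 2 \equiv 1 && y \in \{1, 4\} \\
x = 3 &: x^3 - 2 \equiv 0 && y = 0 \\
x = 4 &: x^3 - 2 \equiv 2 && \text{not a square, no points} \\
E(\mathbb{F}_5) &= \{(1,2), (1,3), (2,1), (2,4), (3,0)\}
\end{align*}
```

Over \mathbb{Z} itself the points (3, \pm 5) lie in E(\mathbb{Z}), since 25 = 27 - 2, and reduction modulo 5 sends both of them to (3,0) \in E(\mathbb{F}_5). This is the functoriality in R at work.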


In this post I’d like to establish some basic properties of schemes and prove a really important theorem – the category of affine schemes is the categorical dual of the category of commutative rings!

Basic Technical Results

So, let’s begin:

Theorem: Let R be a commutative ring and X=\text{Spec}(R) the affine scheme obtained from R (here I use X to mean the ringed space with the sheaf of rings \mathcal{O}_X hiding somewhere in the context!). Let P\in X be any point (i.e. any prime ideal of R). Then the stalk \mathcal{O}_{X, P} is isomorphic to the localisation R_P by a unique isomorphism.

Proof: Recall that the stalk at P is the colimit \mathcal{O}_{X,P} = \text{colim}_{P\in U} \mathcal{O}_X (U) where U runs over all open sets of X containing P. Since we have a base for our topology consisting of the basic open sets D(f), we may as well take the colimit over this instead:

\mathcal{O}_{X,P} = \text{colim}_{P\in D(f)} \mathcal{O}_X(D(f))

= \text{colim}_{f\notin P} R_f

Now for any f\notin P, there is a natural ring homomorphism R_f \rightarrow R_P sending each element of the form a/f^n \in R_f to itself in R_P; this is well-defined because every element of R\backslash P is a unit in R_P. Clearly this is also compatible with the restriction maps, so by the universal property of the colimit there is a unique ring homomorphism \mathcal{O}_{X,P} \rightarrow R_P.


Let’s call this morphism \phi. Now every element of R_P is of the form a/f^n for some a\in R, f\in R\backslash P and n\in \mathbb{N}, and therefore \phi is a surjection (because we can find this element in the localisation R_f and then include it into the stalk). Now suppose that a/f^n \in R_f is sent to 0\in R_P; by definition of being zero in a localisation this means there exists g\in R\backslash P such that a g = 0\in R. But then a/f^n = 0 \in R_{f g}, and thus its image in the stalk is zero. Hence \ker \phi = \left\{0\right\}, so \phi is injective. Thus

\mathcal{O}_{X,P} \cong R_P

This establishes that the affine scheme (X,\mathcal{O}_X) is not only a ringed topological space but a locally ringed space, which means that the stalks at each of its points are local rings.
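For a concrete picture of this colimit, take R = \mathbb{Z} and P = (5); both choices are mine and only serve as an illustration of the theorem.

```latex
% Illustrative choice: R = \mathbb{Z}, P = (5), X = \text{Spec}(\mathbb{Z}).
\begin{align*}
\mathcal{O}_{X,(5)} &= \text{colim}_{5 \nmid f}\, \mathbb{Z}[1/f] \;\cong\; \mathbb{Z}_{(5)}
   = \{ a/b \in \mathbb{Q} : 5 \nmid b \}, \\
\tfrac{7}{12} = \tfrac{21}{6^2} &\in \mathbb{Z}[1/6] = \mathcal{O}_X(D(6))
   \;\longmapsto\; \tfrac{7}{12} \in \mathbb{Z}_{(5)}, \\
\mathfrak{m}_{(5)} &= 5\,\mathbb{Z}_{(5)}, \qquad \mathbb{Z}_{(5)} / \mathfrak{m}_{(5)} \cong \mathbb{F}_5 .
\end{align*}
```

Every fraction whose denominator is prime to 5 already lives in some \mathbb{Z}[1/f] with 5 \nmid f, which is the surjectivity step of the proof, and the stalk is visibly a local ring.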

Here’s a useful consequence of the previous result:

Lemma: Let R be an integral domain with field of fractions k = \text{Frac}(R), and let x\in X = \text{Spec}(R) be the point of X corresponding to the prime ideal (0). Then \mathcal{O}_{X,x} \cong k. Furthermore, every nonempty open subset U\subseteq X contains x and the universal inclusions into the colimit \mathcal{O}_X (U)\rightarrow \mathcal{O}_{X,x} are injective. Hence if V\subseteq U are nonempty open subsets then the restriction \mathcal{O}_X(U) \rightarrow \mathcal{O}_X (V) is injective.

Proof: We have \mathcal{O}_{X,x} \cong R_{(0)} \cong k. For a principal open subset U = D(f) we have \mathcal{O}_X (U) = R_f \subseteq k, so there is a natural injection \mathcal{O}_X (U) \rightarrow k. Now suppose that U = \bigcup_{i\in I} D(f_i) is a general open subset of X and suppose a section s\in \mathcal{O}_X (U) is mapped to zero in k. Then using the diagram

(the commutative triangle formed by the restriction \mathcal{O}_X (U)\rightarrow \mathcal{O}_X (D(f_i)) and the two maps into k) we know the restriction of s to each D(f_i) is zero. Since \mathcal{O}_X is a sheaf on X, the restriction \mathcal{O}_X \vert_{U} is a sheaf on U and hence the identity sheaf axiom implies s = 0 in \mathcal{O}_X (U). Thus \mathcal{O}_X (U)\rightarrow k is injective. The commutativity of the restrictions and inclusions into the stalk implies that each restriction map \mathcal{O}_X (U)\rightarrow \mathcal{O}_X (V) is injective.

Now I want to tell you two more magical properties of my (contravariant) functor \text{Spec}: \text{CRing}^{\text{op}} \rightarrow \text{Top} (we can forget about the ringed space structure for a moment).

Lemma: Let \phi: A\rightarrow B be a ring homomorphism and f: \text{Spec}(B) \rightarrow \text{Spec}(A) the corresponding continuous map (which I wrote about in this post). Then:

  1. If \phi is surjective then f induces a homeomorphism \text{Spec}(B)\cong V(\ker \phi)\subseteq \text{Spec}(A).
  2. If \phi: A\rightarrow S^{-1} A is a localisation morphism then f induces a homeomorphism from \text{Spec}(S^{-1} A) onto the subspace

Y =\left\{ P\in\text{Spec}(A): P\cap S = \varnothing\right\}.


(1): Since B \cong A/\ker\phi as \phi is surjective, the prime ideals of B are in bijective correspondence with the prime ideals of A/\ker\phi, which are themselves in bijective correspondence with the prime ideals of A containing \ker\phi. This establishes a bijection of sets \text{Spec}(B) \cong V(\ker\phi)\subseteq \text{Spec}(A); since f is continuous, this is a continuous bijection. Furthermore for an ideal J\subseteq B we have

f(V(J)) = \left\{P \in\text{Spec}(A): P\supseteq \ker\phi \text{ and } \phi(P)\supseteq J\right\}

= \left\{P\in\text{Spec}(A): P\supseteq \phi^{-1} (J)\right\}

= V(\phi^{-1} (J))

This tells us that the image of a closed set under f is closed, so f is closed. Therefore f is a homeomorphism.

(2): The prime ideals of S^{-1} A are in bijection with the prime ideals of A containing no elements of S because all these elements become units in the localisation S^{-1} A. Thus f establishes a continuous bijection from \text{Spec}(S^{-1} A) to Y. Now let J\subseteq S^{-1} A be an ideal; then

f(V(J)) = V(\phi^{-1} (J))\cap Y

Again this proves that f is closed and so it is a homeomorphism.
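Both parts of the lemma are easy to see for A = \mathbb{Z}. The specific quotient \mathbb{Z}/6\mathbb{Z} and the multiplicative set S = \{1, 2, 4, \dots\} below are my own illustrative choices.

```latex
% Illustrative choices: A = \mathbb{Z}; part (1) with B = \mathbb{Z}/6\mathbb{Z},
% part (2) with S = \{1, 2, 4, \dots\}, so S^{-1}A = \mathbb{Z}[1/2].
\begin{align*}
\text{(1)}\quad & \text{Spec}(\mathbb{Z}/6\mathbb{Z}) = \{(\bar 2), (\bar 3)\}
   \;\cong\; V((6)) = \{(2), (3)\} \subseteq \text{Spec}(\mathbb{Z}), \\
\text{(2)}\quad & \text{Spec}(\mathbb{Z}[1/2]) \;\cong\; \{ P \in \text{Spec}(\mathbb{Z}) : P \cap S = \varnothing \}
   = \{(0)\} \cup \{ (p) : p \text{ an odd prime} \} = D(2).
\end{align*}
```

In (1) the image is a closed subset, while in (2) the image happens to be the principal open set D(2), because S is generated by a single element.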

We can use this to prove a useful lemma I really ought to have proved last time:

Lemma: Let X = \text{Spec}(R) be an affine scheme and let f\in R. Then the open subset Y = D(f)\subseteq X with the induced ringed space structure from X is an affine scheme isomorphic to \text{Spec}(R_f).

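Proof: In this proof write Y for \text{Spec}(R_f). By part 2 of the previous lemma, applied to the localisation morphism R\rightarrow R_f, there is an open immersion of topological spaces i: Y\hookrightarrow X whose image is D(f). Suppose D(g)\subseteq D(f), and let g' be the image of g in R_f. Then we have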

\mathcal{O}_X (D(g)) = R_g \cong (R_f)_{g'} = \mathcal{O}_Y (D(g')) = i_* \mathcal{O}_Y (D(g))

Now the D(g)‘s form a base of open subsets for D(f); because we have well-defined restriction maps between these basic open sets, we see we obtain an isomorphism of sheaves

\mathcal{O}_X \vert_{D(f)} \cong i_* \mathcal{O}_Y

It’s easy to check that the resulting isomorphism of ringed spaces (Y, \mathcal{O}_Y) \cong (D(f), \mathcal{O}_X \vert_{D(f)} ) is local, giving an isomorphism of schemes.

The next lemma generalises this:

Lemma: Let X be a scheme (not necessarily affine). Then for any open subset U\subseteq X the ringed space (U, \mathcal{O}_X \vert_U ) is also a scheme.

Proof: Since X is a scheme there exist open sets U_i, each isomorphic to an affine scheme, such that X = \bigcup_i U_i. Since the principal open sets of each affine U_i form a basis for its topology, each intersection U\cap U_i is a union of principal open sets of U_i. We’ve just seen each of these is isomorphic to an affine scheme, so U is covered by open affine schemes and it follows that U is a scheme.

If X is a scheme and U\subseteq X is an open subset as above we call U an open subscheme of X; if it turns out that U is an affine scheme then (duh!) we call U an affine open subscheme.

Generalised basic open sets

Now let’s generalise the notion of principal open subset; in the affine case \text{Spec}(R), we took a ring element f\in R and looked at the open set D(f) of prime ideals not containing it. Now let’s apply our “reverse geometry” that I’ve alluded to before: we want to consider f as a function on \text{Spec}(R), and the open set D(f) as the set of points on which its value is nonzero. To formalise this, for any prime ideal \mathfrak{p}\in\text{Spec}(R) we set

f(\mathfrak{p}) = f \pmod{ \mathfrak p}

So the value of the “function” f at the point \mathfrak{p} is the value of f in the quotient ring R/\mathfrak{p}. Clearly this isn’t a function in the usual sense because the rings it takes values in vary from point to point! But what makes sense – just about the only thing that does – is that we can talk about whether f is zero at a point or not. And so on D(f), we see that the “function” f is never zero, so we consider this the subset on which the function is defined. Indeed, for any prime ideal \mathfrak{p}\in D(f) we saw above that \mathcal{O}_{X,\mathfrak{p}} \cong R_\mathfrak{p}, so if \mathfrak{p} \in D(f) then f\notin \mathfrak{p} and hence the image of f in R_\mathfrak{p} is a unit. It is this fact we’ll use to generalise principal open sets.

We’ve now made a suitable definition of “scheme” built up from affine schemes. In the affine case we picked global sections f \in R = \mathcal{O}_{\text{Spec}(R)} (\text{Spec}(R)) to define our principal open subsets D(f), but there’s no reason in the general nonaffine case that we can’t do the same for global sections f\in \mathcal{O}_X (X). With this in mind, let’s make the following definition:

Definition: Let X be a scheme and let f\in \mathcal{O}_X (X) be a global section of the structure sheaf. We define a generalised principal open set X_f to be

X_f = \left\{x\in X: f_x \in \mathcal{O}_{X,x}^\times\right\}

where \mathcal{O}_{X,x}^\times is the group of units of the stalk \mathcal{O}_{X,x}.

We see that for an affine scheme X = \text{Spec}(R) the ring of global sections \mathcal{O}_X (X) is just R. Choosing f\in R and defining X_f as above just gives X_f = D(f) because f is a unit in all the stalks of elements in D(f), and if f is a unit in a stalk of the form \mathcal{O}_{X,\mathfrak{p}} then f\notin \mathfrak{p}. So this definition really does generalise principal open sets.

Proposition: Let X be a scheme and f\in \mathcal{O}_X (X). Then X_f is an open subset of X.

Proof: We’ll show that any point in X_f is contained within an open neighbourhood contained in X_f. Take x\in X_f; then f_x\in\mathcal{O}_{X,x}^\times, so there exists an open neighbourhood U \ni x and a section g\in \mathcal{O}_X (U) with f_x g_x = 1 \in\mathcal{O}_{X,x}. But equality in the stalk means there is an open subset V with x\in V\subseteq U such that (f g)\vert_{V} = 1\in\mathcal{O}_X (V). Hence f\vert_V is a unit in \mathcal{O}_X (V) and hence f_y is a unit in the stalk \mathcal{O}_{X,y} for any point y\in V. Thus V\subseteq X_f. Therefore every point of X_f is an interior point, so X_f is open.

We’ll now construct the inverse for (the image of) f in \mathcal{O}_X (X_f). By the construction above, for each x\in X_f we have an open set V_x \subseteq X_f and a section g(x)\in \mathcal{O}_X (V_x) with (f g(x))\vert_{V_x} = 1\in\mathcal{O}_X (V_x). Suppose that the two open sets V_x and V_y have nonempty intersection. Write \rho^x : \mathcal{O}_X (V_x) \rightarrow \mathcal{O}_X (V_x \cap V_y) and \rho^y : \mathcal{O}_X (V_y) \rightarrow \mathcal{O}_X (V_x \cap V_y) for the corresponding restriction maps. Then in \mathcal{O}_X (V_x \cap V_y) we have equalities

1 = \rho^x (1) = \rho^x ((f g(x))\vert_{V_x}) = (f g(x))\vert_{V_x \cap V_y} = f\vert_{V_x \cap V_y} g(x)\vert_{V_x\cap V_y}

1 = \rho^y (1) = \rho^y ((f g(y))\vert_{V_y}) = (f g(y))\vert_{V_x \cap V_y} = f\vert_{V_x \cap V_y} g(y)\vert_{V_x\cap V_y}

Hence we have

f\vert_{V_x \cap V_y} g(x) \vert_{V_x \cap V_y} = f\vert_{V_x \cap V_y} g(y) \vert_{V_x \cap V_y}

Since f\vert_{V_x \cap V_y} is a unit, we can multiply through by its inverse to obtain \rho^x (g(x)) = \rho^y (g(y)). All this was just a long-winded procedure to prove that the restrictions of the g(x) sections agree on the overlaps between open sets V_x and V_y. Therefore by the sheaf axiom all these sections glue together to give a unique section g\in \mathcal{O}_X (X_f) such that g\vert_{V_x} = g(x). We see that f g = 1 \in\mathcal{O}_X (X_f), so we’ve constructed the inverse of f in \mathcal{O}_X (X_f).

Finally, note that since X_f is open there is a restriction map \rho^X_{X_f} :\mathcal{O}_X (X)\rightarrow \mathcal{O}_X (X_f) and we’ve seen this sends f to a unit in \mathcal{O}_X (X_f). Therefore it sends the multiplicative set

S = \left\{1,f,f^2,\dots\right\}

to units in \mathcal{O}_X (X_f). By the universal property of localisation there is a unique ring homomorphism

\phi: \mathcal{O}_X (X)_f = S^{-1} \mathcal{O}_X (X)\rightarrow \mathcal{O}_X (X_f)

such that \phi \circ L_f = \rho^{X}_{X_f}, where L_f : \mathcal{O}_X (X)\rightarrow \mathcal{O}_X (X)_f is the canonical localisation homomorphism.

We’ll see later that under certain “covering” conditions this homomorphism \phi : \mathcal{O}_X (X)_f \rightarrow \mathcal{O}_X (X_f) is an isomorphism. For now, let’s press on with something more fundamental.
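In the affine case the homomorphism \phi can be written down by hand. The sketch below takes X = \text{Spec}(\mathbb{Z}) and f = 6, both of which are my own choices, and says nothing about what happens for non-affine X.

```latex
% Illustrative choice: X = \text{Spec}(\mathbb{Z}), f = 6 \in \mathcal{O}_X(X) = \mathbb{Z}.
\begin{align*}
X_f &= \{ x \in X : 6 \text{ is a unit in } \mathcal{O}_{X,x} \}
     = \{(0)\} \cup \{ (p) : p \neq 2, 3 \} = D(6), \\
\phi &: \mathcal{O}_X(X)_f = \mathbb{Z}[1/6] \longrightarrow \mathcal{O}_X(X_f) = \mathcal{O}_X(D(6)) = \mathbb{Z}[1/6],
   \qquad a/6^n \longmapsto a \cdot (1/6)^n .
\end{align*}
```

Here \phi is visibly an isomorphism, which matches the definition of the structure sheaf on principal open sets of an affine scheme.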

Morphisms of Schemes

A morphism of schemes is just a morphism of locally ringed spaces; isomorphisms are defined similarly. A closed/open immersion of schemes is a closed/open immersion of ringed topological spaces. We’ll now show how we can construct scheme morphisms from our continuous maps \text{Spec}(\phi) defined previously – i.e. how to make them work nicely with the ringed space structures to give morphisms of locally ringed spaces.

Theorem A: Let \phi: A\rightarrow B be a ring homomorphism and write f= \text{Spec}(\phi) : \text{Spec}(B)\rightarrow \text{Spec} (A) for the corresponding continuous map of spectra (this isn’t currently a morphism of locally ringed spaces). Then there exists a morphism of schemes

(f, f^\#): \text{Spec}(B)\rightarrow \text{Spec}(A)

such that f^\# _{\text{Spec}(A)} = \phi.

Proof: Let X = \text{Spec}(B) and Y = \text{Spec}(A). It is easy to check that for any g\in A we have

f^{-1} (D(g)) = D(\phi (g))

Therefore this shows that f_* \mathcal{O}_X (D(g)) = B_{\phi (g)}. Define the ring homomorphism

\psi: \mathcal{O}_Y (Y) \rightarrow f_* \mathcal{O}_X (D(g)) = B_{\phi(g)}

s \mapsto \phi(s)/1 \in B_{\phi (g)}

Then \psi sends every element of the multiplicative set S = \left\{1,g,g^2,\dots\right\} to a unit in B_{\phi(g)}, and hence by the universal property of localisation there exists a unique ring homomorphism

f^{\#}_{D(g)} : \mathcal{O}_Y (D(g))\rightarrow f_* \mathcal{O}_X (D(g))

such that f^{\#}_{D(g)} \circ L_g = \psi, where L_g : \mathcal{O}_Y (Y)\rightarrow \mathcal{O}_Y(D(g))=A_g is the canonical localisation map. This is compatible with restrictions of basic open sets so it gives a morphism of sheaves on the base of open sets which extends to a unique morphism of sheaves on Y:

f^{\#} : \mathcal{O}_Y \rightarrow f_* \mathcal{O}_X

Moreover let x = \mathfrak{q}\in X be a prime ideal of B. Then the morphism of stalks induced by \phi

\mathcal{O}_{Y,f (x)} = A_{\phi^{-1}(\mathfrak{q})}\rightarrow B_\mathfrak{q} = \mathcal{O}_{X,x}

is a local homomorphism equal to f^{\#}_x. Hence (f,f^{\#}) is a morphism of locally ringed topological spaces, and hence of schemes. Finally, we clearly have f^{\#}_Y = \phi. This concludes the proof.

So any ring homomorphism \phi induces a morphism of the affine schemes (f,f^{\#}) in the opposite direction, and the original ring homomorphism can be recovered as the component of f^{\#} on global sections.
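Here is Theorem A unwound for one familiar ring map. The inclusion \phi: \mathbb{Z} \hookrightarrow \mathbb{Z}[i] of the integers into the Gaussian integers is my own choice of example; the facts used about primes of \mathbb{Z}[i] are the standard ones (2 ramifies, 5 splits, 3 stays prime).

```latex
% Illustrative choice: \phi : \mathbb{Z} \hookrightarrow \mathbb{Z}[i],
% f = \text{Spec}(\phi) : \text{Spec}(\mathbb{Z}[i]) \to \text{Spec}(\mathbb{Z}), \quad f(\mathfrak{q}) = \mathfrak{q} \cap \mathbb{Z}.
\begin{align*}
f((1+i)) &= (2), \qquad f((2+i)) = f((2-i)) = (5), \qquad f((3)) = (3), \\
f^{-1}(D(5)) &= D(\phi(5)) = \text{Spec}(\mathbb{Z}[i]) \setminus \{ (2+i), (2-i) \}, \\
f^{\#}_{D(5)} &: \mathbb{Z}[1/5] \longrightarrow \mathbb{Z}[i][1/5], \qquad a/5^n \longmapsto a/5^n, \\
f^{\#}_{(2+i)} &: \mathbb{Z}_{(5)} \longrightarrow \mathbb{Z}[i]_{(2+i)}, \qquad 5 \longmapsto (2+i)(2-i) \in \mathfrak{m}_{(2+i)} .
\end{align*}
```

The last line shows the stalk map sending the maximal ideal into the maximal ideal, which is the locality condition, and on global sections f^{\#} is just the inclusion \phi.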

Therefore we can redefine our functor \text{Spec} to be

\text{Spec} : \text{CRing}^{\text{op}} \rightarrow \text{Sch}\subseteq \text{LocRingTop}

where \text{Sch} is the category of schemes and \text{LocRingTop} is the category of locally ringed topological spaces. I’ll now prove the big theorem!

Lemma: Let X, Y be two schemes and write \text{Hom}_{\text{Sch}} (X,Y) = \text{Mor}(X,Y) for the set of scheme morphisms from X to Y. We write \text{Hom}_{\text{CRing}} (A,B) for the set of ring homomorphisms from A to B. Then for all schemes Y there is a natural transformation

\eta : \text{Mor}(-,Y) \implies \text{Hom}_{\text{CRing}} (\mathcal{O}_Y (Y), \mathcal{O}_{(-)} (-))

where \text{Mor}(-,Y) and \text{Hom}_{\text{CRing}} (\mathcal{O}_Y (Y), \mathcal{O}_{(-)}(-)) are functors \text{Sch}^{\text{op}} \rightarrow \text{Set}.

Proof: The functor \text{Mor}(-,Y) is self-explanatory, but the functor \text{Hom}_{\text{CRing}} (\mathcal{O}_Y (Y), \mathcal{O}_{(-)}(-)) needs some clarification. It sends every scheme X to the set of ring homomorphisms from \mathcal{O}_Y (Y) to \mathcal{O}_X (X) on the rings of global sections, and every morphism of schemes

(f,f^{\#}): X\rightarrow Z

to the function \text{Hom}(\mathcal{O}_Y (Y), \mathcal{O}_Z (Z)) \rightarrow \text{Hom}(\mathcal{O}_Y (Y), \mathcal{O}_X (X)) given by

(\phi: \mathcal{O}_Y (Y) \rightarrow \mathcal{O}_Z (Z))\mapsto (f^{\#}_Z \circ \phi : \mathcal{O}_Y (Y) \rightarrow f_* \mathcal{O}_X (Z) = \mathcal{O}_X (X))

The component of \eta at X is

\eta_X : \text{Mor}(X,Y)\rightarrow \text{Hom}_{\text{CRing}} (\mathcal{O}_Y (Y), \mathcal{O}_X(X))

(f:X\rightarrow Y) \mapsto (f^{\#}_Y : \mathcal{O}_Y (Y) \rightarrow \mathcal{O}_X(X))

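But then for a morphism of schemes (f, f^{\#}): X\rightarrow Z and any g\in\text{Mor}(Z,Y) we have (g\circ f)^{\#}_Y = f^{\#}_Z \circ g^{\#}_Y, which says exactly that

\eta_X \circ \text{Mor}((f,f^{\#}),Y) = \text{Hom}_{\text{CRing}} (\mathcal{O}_Y (Y), (f,f^{\#})) \circ \eta_Z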

Hence \eta is indeed natural.

BIG THEOREM: For any ring R the natural transformation

\eta : \text{Mor}(-,\text{Spec}(R)) \implies \text{Hom}_{\text{CRing}} (R, \mathcal{O}_{(-)} (-))

is a natural isomorphism (here I’ve implicitly used that R = \mathcal{O}_{\text{Spec}(R)} (\text{Spec}(R))).

Proof: Proving that \eta is a natural isomorphism amounts to proving that for any scheme X the component \eta_X is a bijection of sets (seeing as we’ve already proved it’s natural). To do this, we’ll construct an explicit inverse function \epsilon_X. To ease notation let Y = \text{Spec}(R) and let \phi : R\rightarrow \mathcal{O}_X (X) be a ring homomorphism. For any point P \in X we let \mathfrak{m}_P denote the unique maximal ideal in the local ring \mathcal{O}_{X,P}. Then the preimage of \mathfrak{m}_P under the composition

R\xrightarrow{\phi} \mathcal{O}_X (X) \xrightarrow{\rho_P} \mathcal{O}_{X,P}

is a prime ideal of R, where \rho_P is the restriction to the stalk. Hence we obtain a map of the underlying point sets

\psi: \lvert X \rvert \rightarrow \lvert Y \rvert

where \lvert X \rvert is just the point set of X, considered without topology or ringed space structure. But in fact this map is also continuous with respect to the Zariski topology. To see this, take f\in \mathcal{O}_Y (Y) = R and consider the principal basic open set D(f)\subseteq Y. We want to show that \psi^{-1} (D(f))\subseteq X is open. We have

\psi^{-1} (D(f)) = \left\{ x\in X: \psi (x) \in D(f)\right\}

= \left\{x\in X: f\notin \phi^{-1} (\rho_x^{-1}(\mathfrak{m}_x))\right\}

= \left\{x\in X: \phi(f)_x \notin \mathfrak{m}_x \right\}

= \left\{x\in X: \phi(f)_x \in \mathcal{O}_{X,x}^\times \right\}

= X_{\phi(f)}

where X_{\phi (f)} is one of our generalised basic open sets defined above, and is indeed open. Hence \psi is a continuous map on the underlying topological spaces. We now need to define a morphism of sheaves.

For each basic open set D(f)\subseteq Y define

\psi^{\#}_{D(f)} : R_f = \mathcal{O}_Y (D(f)) \rightarrow \psi_* \mathcal{O}_X (D(f))

as the composition

R_f \rightarrow \mathcal{O}_X (X)_{\phi (f)} \rightarrow \mathcal{O}_X (\psi^{-1} (D(f)))

where the first morphism is the obvious map obtained from \phi : R\rightarrow \mathcal{O}_X (X) by localising and the second map is the canonical morphism \mathcal{O}_X (X)_{\phi (f)} \rightarrow \mathcal{O}_X (X_{\phi (f)}) which we proved the existence of above with regards to generalised basic open sets. We have a commutative diagram


from which it is clear that this defines a morphism of sheaves on the base (as we can do our “localise again” trick to deal with restrictions D(g)\subseteq D(f)) and thus by glueing these all together a uniquely defined morphism of sheaves

\psi^\# : \mathcal{O}_Y \rightarrow \psi_* \mathcal{O}_X

It is also clear from the above diagram that the induced homomorphism \psi^{\#}_Y on global sections is equal to \phi. Now suppose that \psi (P) = Q. The diagram below commutes because it expresses that \psi^\# is a sheaf morphism:


We want to show that \psi^{\#}_Q (\mathfrak{m}_Q) \subseteq \mathfrak{m}_P which will prove that \psi^{\#}_Q is a local homomorphism. Recall that Q = \psi(P) is by definition the preimage of \mathfrak{m}_P under \rho_P\circ\phi, and that commutativity gives \psi^{\#}_Q (r/1) = \rho_P(\phi(r)) for every r\in R. Every element of \mathfrak{m}_Q = Q R_Q can be written as r/s with r\in Q and s\in R\backslash Q. Since s/1 is a unit in R_Q, its image \psi^{\#}_Q (s/1) is a unit in \mathcal{O}_{X,P}, and so \psi^{\#}_Q (r/s) = \rho_P (\phi(r))\, \psi^{\#}_Q(s/1)^{-1}. But \rho_P(\phi (r))\in\mathfrak{m}_P because r\in Q, and \mathfrak{m}_P is an ideal, so \psi^{\#}_Q(r/s)\in\mathfrak{m}_P. Hence \psi^{\#}_Q (\mathfrak{m}_Q)\subseteq \mathfrak{m}_P. Thus \psi^{\#}_Q is a local homomorphism.

Hence (\psi, \psi^\#): X\rightarrow Y is a morphism of locally ringed spaces and hence of schemes. We therefore define the inverse \epsilon_X to \eta_X to be the function

\phi \mapsto (\psi, \psi^\#)

where \left(\psi, \psi^{\#}\right) is constructed as above.

Now we just need to prove that \eta_X and \epsilon_X are mutually inverse. We saw above that the morphism (\psi, \psi^\#) constructed from \phi satisfies \psi^{\#}_Y = \phi, which gives that

\eta_X \circ \epsilon_X = \text{id}_{\text{Hom}(R,\mathcal{O}_X (X))}

Starting with a morphism of schemes (f,f^{\#}) : X\rightarrow \text{Spec}(R) = Y we can apply \eta_X to get a ring homomorphism f^{\#}_Y : R\rightarrow \mathcal{O}_X (X) on global sections. Applying \epsilon_X then sends f^{\#}_Y to the scheme morphism (\psi, \psi^\#) : X\rightarrow Y with \psi^\# = f^\# and

\psi (x) = \left\{ r\in R: f^{\#}_{f(x)} (r_{f(x)}) \in \mathfrak{m}_x\right\}

[This notation will freak you out if you don’t stop and think what’s what – remember, f(x)\in \text{Spec}(R) so f(x) is a prime ideal of R.] Clearly if r \in f(x) then its image r_{f(x)} in the localisation R_{f(x)} is contained within \mathfrak{m}_{f(x)} and so applying f^{\#}_{f(x)} and recalling that it is a local homomorphism we obtain an element of \mathfrak{m}_x; this proves that f(x)\subseteq \psi(x). Now take r\in \psi(x); then f^{\#}_{f(x)} (r_{f(x)}) \in \mathfrak{m}_x so r_{f(x)}\in\mathfrak{m}_{f(x)} by locality. But then r\in f(x), so f(x) = \psi(x). This proves that

\epsilon_X \circ \eta_X = \text{id}_{\text{Mor}(X,\text{Spec}(R))}

Thus this gives us a natural isomorphism. This concludes the proof.

Translation: Restricting X to an affine scheme, we see that the morphisms between two ring spectra are in natural bijection with the ring homomorphisms between the rings, with the direction of the arrows reversed. Therefore the category of affine schemes \text{Sch}_\text{aff} is the categorical dual of the category of commutative rings \text{CRing}!!!
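The bijection can be used to count morphisms between affine schemes by counting ring homomorphisms instead. The rings \mathbb{Z}[i], \mathbb{F}_5 and \mathbb{F}_3 below are my own illustrative choices; a homomorphism out of \mathbb{Z}[i] is determined by an image of i squaring to -1.

```latex
% Illustrative choices: source rings \mathbb{F}_5, \mathbb{F}_3; target scheme \text{Spec}(\mathbb{Z}[i]).
\begin{align*}
\text{Mor}(\text{Spec}(\mathbb{F}_5), \text{Spec}(\mathbb{Z}[i]))
  &\cong \text{Hom}_{\text{CRing}}(\mathbb{Z}[i], \mathbb{F}_5)
  = \{\, i \mapsto 2,\; i \mapsto 3 \,\}
  && (2^2 \equiv 3^2 \equiv -1 \bmod 5), \\
\text{Mor}(\text{Spec}(\mathbb{F}_3), \text{Spec}(\mathbb{Z}[i]))
  &\cong \text{Hom}_{\text{CRing}}(\mathbb{Z}[i], \mathbb{F}_3)
  = \varnothing
  && (-1 \text{ is not a square mod } 3).
\end{align*}
```

The two \mathbb{F}_5-points have kernels (2-i) and (2+i) respectively, so they land on the two points of \text{Spec}(\mathbb{Z}[i]) lying over (5), while there are no \mathbb{F}_3-points at all even though \text{Spec}(\mathbb{Z}[i]) has a point (3) (its residue field is \mathbb{F}_9, not \mathbb{F}_3).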

I think I’ll stop here. Next time I’ll derive some more technical properties of schemes and then try to draw some pretty pictures. After that, I’d like to look at an alternative – and more abstract – way of defining schemes via their functors of points.

Sheaves in mathematics get their name from the agricultural term “sheaf” –  a collection of stalks of grain or cereal all bundled together. This is like how properties of sheaves are determined by their stalks from the gluing axioms.

This time, as promised, we will discuss sheafification and show it is the left-adjoint of the inclusion functor \text{Sh}_\mathcal{C} (X) \hookrightarrow \text{PSh}_\mathcal{C} (X).

Definition 1: Let F be a presheaf (of sets, say) on a topological space X. The sheafification of F is the presheaf F^+ where F^+ (U) is the set of functions f: U\rightarrow \coprod_{x\in U} F_x such that for each x\in U, (i) f(x)\in F_x and (ii) there exists an open neighbourhood V_x of x and a section s\in F(V_x) such that for all y \in V_x we have f(y) = s_y. If V\subseteq U then the restriction map F^+ (U) \rightarrow F^+ (V) is the restriction of each of these functions to V.

So the sheafification of a presheaf basically bundles all the sections of the presheaf into “compatible stalks” in the sense that a section of the sheafification F^+ is the collection of germs of a particular section of F in all its stalks. As you’d expect, the sheafification of a presheaf is actually a sheaf:

Theorem 1: The sheafification F^+ is a sheaf (of sets, but this will generalise to other categories).

Proof: We know that F^+ is a presheaf so we need to just check the identity and gluability axioms. Let U be an open subset of X and let U = \bigcup_i U_i be an open cover.

First, let’s check the identity axiom. Let s,t\in F^+ (U) be sections over U such that their restrictions s_i, t_i \in F^+ (U_i) are equal for all open sets in the cover. Then for any x\in U there exists some U_i containing x and on which s_i = t_i, so s(x) = s_i (x) = t_i (x) = t(x). This holds for all x \in U, so s = t.

Now let’s check the gluing axiom. Let s_i \in F^+ (U_i) be sections such that for each i, j the restrictions of s_i and s_j to U_i \cap U_j are equal. Now define a function s: U\rightarrow \coprod_{x\in U} F_x by s(x) = s_i (x) for any U_i containing x. This is well-defined because the sections s_i agree on overlaps. So now we just need to check that s\in F^+(U) i.e. that it satisfies the two conditions in the definition. The fact that the s_i all satisfy s_i (x)\in F_x gives that s(x)\in F_x. Now for each x\in U pick some U_i containing x; since s_i\in F^+ (U_i) there exists an open neighbourhood V\subseteq U_i of x and a section t\in F(V) such that for all y \in V, s_i(y) = t_y. Since s = s_i on U_i, this means that the second condition holds for s too. So we actually have s\in F^+ (U). This completes the proof. \square
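To see Definition 1 and Theorem 1 in the smallest interesting case, consider a constant presheaf on a two-point discrete space. The space X = \{a, b\}, the set A (assumed to have at least two elements) and the presheaf F below are my own illustrative choices.

```latex
% Illustrative setup: X = \{a, b\} with the discrete topology, A a set with |A| \geq 2,
% F(U) = A for every open U (including U = \varnothing), all restriction maps the identity.
\begin{align*}
F_a &= F_b = A, \\
F^+(X) &= \{ f : X \to F_a \sqcup F_b \;:\; f(a) \in F_a,\ f(b) \in F_b \} \;\cong\; A \times A, \\
F^+(\{a\}) &\cong A, \qquad F^+(\{b\}) \cong A, \qquad F^+(\varnothing) = \{ \ast \}, \\
F(X) = A &\longrightarrow F^+(X) \cong A \times A, \qquad s \longmapsto (s, s).
\end{align*}
```

Condition (ii) is automatic here because \{a\} and \{b\} are open neighbourhoods on which any value is the germ of a section. The presheaf F is not a sheaf (for instance F(\varnothing) is not a one-point set), whereas F^+ glues any pair of values on \{a\} and \{b\} to a section over X, exactly as Theorem 1 predicts.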

So given any presheaf we can construct a sheaf from its stalks. There is a natural morphism of presheaves \phi: F\rightarrow F^+ whose components \phi_U : F(U)\rightarrow F^+ (U) are given by

\phi_U(s) = (x\mapsto s_x\in F_x)

For any morphism F\rightarrow G of presheaves, the universal property of the colimit means we get a unique morphism of stalks F_x \rightarrow G_x for each stalk (which means taking the stalk is a functor!). It follows that the morphism F \rightarrow F^+ induces a unique morphism on stalks F_x \rightarrow F_x^+. But we also obtain a unique morphism F_x^+ \rightarrow F_x for each x\in X as follows:

Take s\in F^+(U) and for each x\in U let s^{(x)}\in F(V_x) be the section (defined above) such that s^{(x)}_y = s(y) for all points y\in V_x. Then we define a collection of morphisms \alpha_x^U: F^+(U)\rightarrow F_x in our target category (one for each open set U containing x) by s\mapsto s^{(x)}_x = s(x). These are clearly compatible with restriction and so we obtain a cocone (F_x, (\alpha_x^U)_{U\ni x}) with vertex F_x under the diagram U\mapsto F^+ (U). By the universal property of the colimit F^+_x = \text{colim}_{x\in U} F^+ (U), we obtain a unique morphism \alpha_x: F_x^+ \rightarrow F_x such that the maps \alpha_x^U factor as the compositions of \alpha_x and the inclusions into the colimit.

So now we have unique morphisms F_x \rightleftarrows F_x^+ in our target category for each point x\in X. One checks that these morphisms compose to give the identity morphisms, and therefore F and its sheafification F^+ have isomorphic stalks. This is a hugely useful fact which we will use again. For now, note that if F is already a sheaf then F and its sheafification F^+ are isomorphic as sheaves, because a morphism of sheaves which induces an isomorphism on every stalk is an isomorphism, and F\rightarrow F^+ is such a morphism!

Now let G be another sheaf on X and let \psi : F\rightarrow G be a morphism (here F is a presheaf). Then there is a morphism of sheaves \eta: F^+ \rightarrow G whose components \eta_U : F^+(U)\rightarrow G(U) are defined as follows: for s\in F^+(U) the open neighbourhoods V_x containing each point x\in U each have a section s^{(x)} \in F(V_x) such that s^{(x)}_y = s(y) for all y \in V_x. This condition ensures that s^{(x)} and s^{(y)} have the same germ at every point of V_x \cap V_y (namely the value of s there), and therefore their images \psi_{V_x} (s^{(x)}) and \psi_{V_y} (s^{(y)}) also have the same germ at every point of V_x \cap V_y; since G is a sheaf, this forces the two images to agree on the overlap V_x \cap V_y. Therefore since G is a sheaf and U = \bigcup_{x\in U} V_x is an open cover, the sections \psi_{V_x} (s^{(x)}) \in G(V_x) all glue together to give a section t\in G(U). We then set \eta_U (s) = t. This clearly commutes with restrictions so it is actually a sheaf morphism \eta : F^+ \rightarrow G. Furthermore you can check that it also satisfies \eta \phi = \psi, so the diagram below commutes:


Now suppose that \eta': F^+ \rightarrow G is another sheaf morphism that satisfies \eta' \phi = \psi. We have induced maps on stalks: \psi_x :F_x \rightarrow G_x, \eta'_x:F_x^+\rightarrow G_x. But since sheaf morphisms are determined by their stalks and F_x \cong F_x^+ as we saw above, it follows that \eta and \eta' induce the same map on every stalk and therefore they are equal. So the map \eta is uniquely determined by \psi.

Now if \psi:F\rightarrow G is a morphism of presheaves on X then the composition \phi_G \circ \psi : F \rightarrow G^+ (where \phi_G : G\rightarrow G^+ is the natural morphism for G) is a morphism from F to a sheaf, so by the discussion above it factors uniquely through \phi as a morphism of sheaves \psi^+ : F^+ \rightarrow G^+. In this way, we see that sheafification is a functor (-)^+: \text{PSh}_\mathcal{C} (X) \rightarrow \text{Sh}_\mathcal{C} (X). Furthermore, the discussion above shows that in fact (-)^+ is the left-adjoint of the inclusion functor from sheaves to presheaves on X.
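Written out as a formula, the adjunction is the following natural bijection. The symbol \iota for the inclusion functor is notation I am introducing here; \phi: F\rightarrow F^+ is the natural morphism from above.

```latex
% Notation assumed here: \iota : \text{Sh}_\mathcal{C}(X) \hookrightarrow \text{PSh}_\mathcal{C}(X) is the inclusion,
% F is a presheaf, G is a sheaf, \phi : F \to F^+ is the natural morphism.
\begin{align*}
\text{Hom}_{\text{Sh}_\mathcal{C}(X)}(F^+, G) &\;\cong\; \text{Hom}_{\text{PSh}_\mathcal{C}(X)}(F, \iota G), \\
\eta &\;\longmapsto\; \eta \circ \phi, \\
\text{inverse:}\quad \psi &\;\longmapsto\; \text{the unique } \eta \text{ with } \eta \circ \phi = \psi .
\end{align*}
```

Existence of \eta is the gluing construction and uniqueness is the stalk argument, so together they say precisely that \phi is the unit of the adjunction (-)^+ \dashv \iota.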

A full subcategory \mathcal{A} of a category \mathcal B is called reflective if its inclusion functor has a left adjoint, which we call the reflection. One immediate thing to note is that since (-)^+ is a left adjoint, it preserves colimits. Some general theorems, which you can look up in the Handbook of Categorical Algebra (section 3.5 on reflective subcategories) or on the nLab pages for reflective subcategories and for presheaves, tell us that if \mathcal{A} is a reflective subcategory of a (co)complete category \mathcal{B} (i.e. every small diagram in \mathcal B has a (co)limit) then \mathcal{A} is itself (co)complete. The nLab page says the category of presheaves on any space with values in the category of sets (or I would imagine in any complete and cocomplete category) is (co)complete, and so the category of sheaves valued in that category is (co)complete because of the above.

This means we can compute any limits we like in most of the frequently-used categories of sheaves (e.g. with values in \text{Set} or \text{Ab}). Moreover, since the inclusion functor from sheaves to presheaves is the right adjoint of sheafification (as we have just seen) and right adjoints preserve limits, it follows that a limit of sheaves can be computed in the category of presheaves. And by section 2.15 of the Handbook, since the presheaf category is just a functor category, we compute limits pointwise, in the sense that (\lim_i F_i)(U) = \lim_i F_i(U).

Thus if we have a morphism \phi: F\rightarrow G of sheaves of abelian groups (or of sheaves with values in any abelian category to which the above applies) we can compute the kernel \ker \phi (which exists as a sheaf since the category of sheaves is complete and a kernel is a limit: it’s the equaliser of \phi and the zero morphism), and we know by the above argument that it is computed “pointwise”, so (\ker \phi)(U) = \ker \phi_U. This is a nice application of how the abstract functorial machinery can be used to deduce (relatively, since we’re still dealing with sheaves!!!) concrete things about how to calculate natural constructions relating to sheaves.
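
If you prefer to see this by hand, here is a sketch of the direct check that the pointwise kernel satisfies the gluing condition, for an open cover U = \bigcup_i U_i. The name K for the presheaf U \mapsto \ker \phi_U is chosen here just for the sketch, and the block assumes amsmath.

```latex
% Assumes \usepackage{amsmath}.
% K denotes the presheaf U |-> ker(\phi_U); the name K is introduced for this sketch.
\begin{align*}
&\text{Given } s_i \in K(U_i) = \ker \phi_{U_i}
   \text{ with } s_i|_{U_i \cap U_j} = s_j|_{U_i \cap U_j} \text{ for all } i, j, \\
&\text{glue in } F: \quad \text{there is a unique } s \in F(U)
   \text{ with } s|_{U_i} = s_i \text{ for all } i. \\
&\text{Then } \phi_U(s)|_{U_i} = \phi_{U_i}(s|_{U_i}) = \phi_{U_i}(s_i) = 0
   \quad \text{for all } i, \\
&\text{so } \phi_U(s) = 0 \text{ because } G \text{ is a sheaf, i.e. } s \in K(U).
\end{align*}
```

Uniqueness of the glued section in K(U) is inherited from uniqueness in F(U), so K is a sheaf without any sheafification.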

The same result does not hold for cokernels, which are colimits (coequalisers) and are generally not preserved by the inclusion functor. To define the cokernel of a sheaf morphism we need to sheafify, so \text{coker}(\phi) = (U\mapsto \text{coker} (\phi_U) )^+ is the sheaf cokernel. This is constructed in two steps: first take the pointwise cokernel, which is a colimit in presheaves. Then sheafify and note that since sheafification is a left adjoint it preserves colimits, and it leaves the sheaves F and G unchanged up to isomorphism, so this really is a colimit in sheaves (and has the universal mapping property of the cokernel).
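
A standard example (brought in here for illustration, it is not needed for anything else) shows that the sheafification step really matters. Take X = \mathbb{C} \setminus \{0\}, let \mathcal{O} be the sheaf of holomorphic functions under addition and \mathcal{O}^* the sheaf of nowhere-vanishing holomorphic functions under multiplication, and consider \exp : \mathcal{O} \rightarrow \mathcal{O}^*. The block below is a sketch assuming amsmath and amssymb.

```latex
% Assumes \usepackage{amsmath, amssymb}.
% X = C \ {0}; \mathcal{O} and \mathcal{O}^* are as described in the lead-in.
\begin{align*}
\exp_x &: \mathcal{O}_x \to \mathcal{O}^*_x \ \text{is surjective for every } x \in X
  && \text{(a logarithm exists on any small disc)}, \\
\exp_X &: \mathcal{O}(X) \to \mathcal{O}^*(X) \ \text{is not surjective}
  && \text{(the function } z \text{ has no logarithm on all of } X\text{)}, \\
\operatorname{coker}(\exp_X) &\neq 0
  && \text{(the presheaf cokernel, evaluated at } U = X\text{)}, \\
\bigl(U \mapsto \operatorname{coker}(\exp_U)\bigr)^+ &= 0
  && \text{(every stalk of the presheaf cokernel vanishes)}.
\end{align*}
```

So the pointwise cokernel is a nonzero presheaf all of whose stalks are zero, and since sheafification does not change stalks, the sheaf cokernel is zero.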