The Open Mapping Theorem

The open mapping theorem (or the Banach-Schauder theorem, if you prefer) is an incredibly important, relatively straightforward and digestible result in functional analysis which plays a crucial role in a large variety of other interesting theorems.  As an exercise, we’ll prove the open mapping theorem here in the standard fashion.  This will hopefully serve as a useful reference for later posts about metric regularity and linear openness, which serve as a measuring device to quantify the degree to which a map is open.

Stating the Open Mapping Theorem

The open mapping theorem may be stated as follows:

Theorem (Open Mapping)

Let $\mathfrak{X}$ and $\mathfrak{Y}$ be Banach spaces and $T : \mathfrak{X} \rightarrow \mathfrak{Y}$ a bounded linear operator. Then, if $T$ is surjective, $T$ is an open map.

Restating this another way (with $\mathbb{B}$ denoting the open unit ball), the theorem reads:

Theorem (Open Mapping)

Let $\mathfrak{X}$ and $\mathfrak{Y}$ be Banach spaces and $T : \mathfrak{X} \rightarrow \mathfrak{Y}$ a bounded linear operator. Then, if $T$ is surjective, $0 \in \operatorname{int} T(\mathbb{B}_\mathfrak{X})$.

At the heart of the open mapping theorem is the notion that there is a link between precision and isomorphism among complete normed spaces (though the theorem also holds in more general settings, such as Fréchet spaces): if the equation $Tx=y$ has at least one solution for any $y \in \mathfrak{Y}$, then either $T$ is an isomorphism and $\mathfrak{Y}$ is isomorphic to $\mathfrak{X}$, or $T$ is a quotient map and $\mathfrak{Y}$ is isomorphic to a quotient of $\mathfrak{X}$. Taken another way, the open mapping theorem links precision and estimation, at least in the sense that if $Tx=y$ has a solution for any $y \in \mathfrak{Y}$, then there exists a constant $k>0$ such that every $y \in \mathfrak{Y}$ admits a solution $x$ with $\|x\|_\mathfrak{X} \leq k\|y\|_\mathfrak{Y}$. More specifically (and this keys us in to precisely how open mappings relate to metric regularity), the constant may be taken arbitrarily close to the value $k$ determined by $k^{-1} = \sup\{ r > 0 : r\mathbb{B}_\mathfrak{Y} \subseteq T(\mathbb{B}_\mathfrak{X}) \}.$ We call the value $k^{-1}$, as defined here, the Banach constant of $T$; it will prove useful in the future.
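As a quick sanity check on the definition of the Banach constant, here is a small worked example. The map $T(x_1,x_2) = x_1 + x_2$ from $\mathbb{R}^2$ to $\mathbb{R}$, with the Euclidean norm on both spaces, is an illustrative choice of mine and not something taken from the discussion above.

```latex
% Assumed example: T : \mathbb{R}^2 \to \mathbb{R}, T(x_1, x_2) = x_1 + x_2, Euclidean norms.
\begin{align*}
  % Cauchy-Schwarz gives |x_1 + x_2| \le \sqrt{2}\,\|x\|_2, with equality when x_1 = x_2.
  T(\mathbb{B}_{\mathbb{R}^2}) &= \{ x_1 + x_2 : x_1^2 + x_2^2 < 1 \} = (-\sqrt{2}, \sqrt{2}), \\
  k^{-1} &= \sup\{ r > 0 : (-r, r) \subseteq (-\sqrt{2}, \sqrt{2}) \} = \sqrt{2}, \\
  % The minimum-norm solution of x_1 + x_2 = y is x = (y/2, y/2).
  \|(y/2, y/2)\|_2 &= \frac{|y|}{\sqrt{2}} = k\,|y|.
\end{align*}
```

In this finite-dimensional case the estimate $\|x\| \leq k\|y\|$ is actually attained by the minimum-norm solution; in general one should only expect solutions whose norm is arbitrarily close to $k\|y\|$.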

We will prove the open mapping theorem in detail to highlight precisely the epsilon-delta argument used, which will hopefully allow us to see how weakening of linear openness may be better understood.

Proving the Open Mapping Theorem

Our proof of the open mapping theorem will rely on the Baire category theorem. Because of this, we will also prove the Baire category theorem. However, let’s first define a Baire space.

The modern definition of a Baire space is often given in one of the following four equivalent forms:

Definition (Baire Space)

Let $\mathcal{X}$ be a topological space. We say $\mathcal{X}$ is a Baire space if every intersection of a countable collection of dense open subsets of $\mathcal{X}$ is also dense.

That is, Baire spaces preserve density of open sets under countable intersections (notice that openness must be specified: in general, two dense sets can even be disjoint, whereas a dense closed set is necessarily the entire space).
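To see why the sets in this definition are taken to be open, it may help to look at a standard example in $\mathbb{R}$ (my choice of illustration): two dense sets whose intersection is empty.

```latex
% Standard example in \mathbb{R} with its usual topology.
\[
  \overline{\mathbb{Q}} = \mathbb{R}, \qquad
  \overline{\mathbb{R} \setminus \mathbb{Q}} = \mathbb{R}, \qquad
  \mathbb{Q} \cap (\mathbb{R} \setminus \mathbb{Q}) = \emptyset .
\]
% Neither set is open, so this does not contradict \mathbb{R} being a Baire space.
```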

Definition (Baire Space)

Let $\mathcal{X}$ be a topological space. We say $\mathcal{X}$ is a Baire space if every union of a countable collection of closed subsets of $\mathcal{X}$ with empty interior has empty interior.

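That is, Baire spaces preserve closed boundary sets under countable unions (notice that closedness must be specified: in general, a set with empty interior can have a complement with empty interior as well, whereas an open set with empty interior is necessarily the empty set).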

Definition (Baire Space)

Let $\mathcal{X}$ be a topological space. We say $\mathcal{X}$ is a Baire space if the interior of every union of a countable collection of closed, nowhere dense subsets of $\mathcal{X}$ is empty.

Definition (Baire Space)

Let $\mathcal{X}$ be a topological space. We say $\mathcal{X}$ is a Baire space if, whenever a union of countably many closed subsets of $\mathcal{X}$ has an interior point, one of the closed subsets has an interior point.

The historical definition of a Baire space involves Baire’s notion of categories (not to be confused with the categories of category theory), but is also equivalent.

Definition (Sets of First and Second Category)

Let $\mathcal{X}$ be a topological space. We say a subset $\mathcal{W}$ of $\mathcal{X}$ is:

1. of first category (or, often, meagre) in $\mathcal{X}$ if there exists a sequence $\left \{N_i\right \}_{i=1}^\infty$ of nowhere dense subsets of $\mathcal{X}$ such that $\mathcal{W}=\bigcup_{i=1}^\infty N_i$;
2. of second category in $\mathcal{X}$ if $\mathcal{W}$ is not of first category in $\mathcal{X}$.

This leads to Baire’s original definition, which is as follows:

Definition (Baire Space)

We say $\mathcal{X}$ is a Baire space if every non-empty open set in $\mathcal{X}$ is of second category in $\mathcal{X}$.

That is to say, open sets in Baire spaces are, in a sense, suitably ‘substantial’ or ‘large’, at least insofar as Baire spaces have no non-empty open sets which are meagre.
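For a concrete feel for these notions, consider the rationals, a standard example not discussed above. Enumerating $\mathbb{Q} = \{q_1, q_2, \ldots\}$, the following sketch suggests why $\mathbb{Q}$ is of first category in $\mathbb{R}$ and why $\mathbb{Q}$, viewed as a space in its own right, fails Baire's definition.

```latex
% Each singleton \{q_i\} is closed with empty interior, hence nowhere dense,
% both in \mathbb{R} and in \mathbb{Q} with the subspace topology
% (no point of \mathbb{Q} is isolated).
\[
  \mathbb{Q} = \bigcup_{i=1}^{\infty} \{ q_i \}
  \quad \Longrightarrow \quad
  \mathbb{Q} \text{ is of first category in } \mathbb{R} \text{ and in } \mathbb{Q}.
\]
% \mathbb{Q} is a non-empty open subset of itself, so the space \mathbb{Q}
% has a non-empty open set of first category and is not a Baire space.
```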

Correspondingly, we also can find one further historical definition of a Baire space using this notion:

Definition (Comeagre set)

Let $\mathcal{X}$ be a topological space. We say that a subset $\mathcal{W}$ of $\mathcal{X}$ is comeagre if its complement $\mathcal{W}^C$ is meagre.

Definition (Baire Space)

We say $\mathcal{X}$ is a Baire space if every comeagre subset of $\mathcal{X}$ is dense in $\mathcal{X}$.

Seeing the straight-up crazy number of different definitions of a Baire space, one might wonder why these spaces deserve so much fuss. To this end, let us observe that Baire spaces enjoy a combinatorial property akin to the pigeonhole principle, which is (very obviously) equivalent to their definition.

Theorem (Interior Pigeonhole Property)

Let $\mathcal{X}$ be a topological space. We say $\mathcal{X}$ has the interior pigeonhole property if, for every at most countable sequence $E_1,E_2,\ldots$ of closed subsets of $\mathcal{X}$, whenever $\bigcup_i E_i$ has nonempty interior, at least one $E_i$ has nonempty interior.

Then $\mathcal{X}$ has the interior pigeonhole property if and only if $\mathcal{X}$ is a Baire space.

So, Baire spaces are useful in some contexts because they allow us to — in a somewhat combinatorial fashion — extract data about the existence of a subset satisfying a certain property among a collection of subsets by looking at the properties of a larger set which contains them. Determining if a topological space is a Baire space allows us to utilize arguments of this form, so we frequently are interested in conditions dictating whether a topological space does indeed have this desirable property. In particular, we could consider this question in the context of a complete metric space, which leads us to the so-called Baire Category Theorem.

Theorem (Baire Category)

Every complete metric space is a Baire space.

Proof

Let $\mathcal{X}$ be a complete metric space. Using the first definition of a Baire space, we seek to show that a countable intersection of dense open subsets is dense. To that end, let $\left \{E_n \right \}_{n=1}^\infty$ be a countable collection of dense open subsets of $\mathcal{X}$. As a subset is dense if and only if every nonempty open subset intersects it, it is then sufficient to show that any nonempty open subset $W \subset \mathcal{X}$ has a point $x$ which lies in the intersection of $W$ with each $E_n$.

Proceeding in this fashion, observe that since $E_1$ is dense, $E_1\bigcap W \neq \emptyset$. Thus, as $E_1\bigcap W$ is open, there exists $x_1 \in E_1\bigcap W$ and real constant $0 < r_1 < 1$ such that $\overline{\mathbb{B}(x_1,r_1)} \subseteq E_1\bigcap W$.

Now, observe that, as $E_2$ is dense and open, $E_2 \bigcap \mathbb{B}(x_1,r_1)$ is nonempty and open, and we may find a point $x_2$ in the intersection and positive radius $0 < r_2<\frac{1}{2}$ such that $\overline{\mathbb{B}(x_2,r_2)} \subset \mathbb{B}(x_1,r_1) \bigcap E_2$. Continuing recursively, we find a pair of sequences $\left \{x_n\right \}_{n=1}^\infty$ and $\left \{r_n\right \}_{n=1}^\infty$ such that $0 < r_n < \frac{1}{n}$ and $\overline{\mathbb{B}(x_n,r_n)} \subset \mathbb{B}(x_{n-1},r_{n-1})\bigcap E_{n}$. Thus, we have a nested sequence of nonempty closed subsets $\left \{\overline{\mathbb{B}(x_n,r_n)}\right \}_{n=1}^\infty$ whose diameters tend to zero. Applying Cantor’s intersection theorem for complete metric spaces, this then yields a point

$$x \in \bigcap_{n=1}^\infty \overline{\mathbb{B}(x_n,r_n)} \subseteq W \cap \bigcap_{n=1}^\infty E_n.$$

Moreover, as the sequence $\left \{x_n\right \}_{n=1}^\infty$ is Cauchy and $\mathcal{X}$ is complete, $x_n \rightarrow x \in \mathcal{X}$.

Therefore, we may conclude that the intersection of a countable number of dense open subsets of a complete metric space is dense, and that every complete metric space is correspondingly a Baire space.  $\blacksquare$
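As a small illustration of how the theorem gets used (a classical application, added here as an aside), one can recover the uncountability of $[0,1]$. The sketch below assumes the closed-set formulation of a Baire space given earlier, and the enumeration $a_1, a_2, \ldots$ is hypothetical.

```latex
% Suppose, for contradiction, that [0,1] = \{ a_1, a_2, \ldots \} is countable.
% [0,1] is a complete metric space, each \{a_i\} is closed, and no point of
% [0,1] is isolated, so each \{a_i\} has empty interior in [0,1].
\[
  [0,1] = \bigcup_{i=1}^{\infty} \{ a_i \}
  \quad \Longrightarrow \quad
  \operatorname{int} [0,1] = \emptyset \ \text{ in } [0,1],
\]
% which is absurd, since [0,1] is open in itself and non-empty.
```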

We will use this result in a central way to prove the open mapping theorem. However, we first need three lemmas.

Lemma
A normed space $\mathfrak{X}$ is a Banach space if and only if every absolutely convergent series in $\mathfrak{X}$ converges in $\mathfrak{X}$.

Proof

$\mathbf{[\Rightarrow]}$

Let $\mathfrak{X}$ be a Banach space and $\left \{x_n \right \}$ an arbitrary sequence in $\mathfrak{X}$ such that $\sum_{n=1}^\infty\| x_n \|$ converges. By the triangle inequality, $\left \| \sum_{n=m+1}^{p} x_n \right \| \leq \sum_{n=m+1}^{p} \| x_n \|$ for all $p > m$, so the partial sums of the series $\sum_{n=1}^\infty x_n$ form a Cauchy sequence, and by the completeness of $\mathfrak{X}$, the series $\sum_{n=1}^\infty x_n$ converges to an element of $\mathfrak{X}$.

$\mathbf{[\Leftarrow]}$

Let $\mathfrak{X}$ be a normed space and suppose that every absolutely convergent series in $\mathfrak{X}$ converges in $\mathfrak{X}$. We must now show that every Cauchy sequence in $\mathfrak{X}$ converges. To that end, let $\left \{x_n \right \}$ be an arbitrary Cauchy sequence in $\mathfrak{X}$, and let $\left \{x_{n_k} \right \}$ be a subsequence of $\left \{ x_n \right \}$ such that $\| x_{n_{k+1}} - x_{n_k} \| < 2^{-k}$. It then follows that $\sum_{k=1}^\infty \| x_{n_{k+1}} - x_{n_k} \|$ converges, and by our assumption, that $\sum_{k=1}^\infty \left ( x_{n_{k+1}} - x_{n_k} \right )$ converges in $\mathfrak{X}$ as well. Observe that the partial sums of this series telescope:

$$\sum_{k=1}^{K} \left ( x_{n_{k+1}} - x_{n_k} \right ) = x_{n_{K+1}} - x_{n_1}.$$

As this series converges, it then follows that $x_{n_{K+1}} - x_{n_1} \rightarrow x - x_{n_1}$ as $K \rightarrow \infty$ for some $x \in \mathfrak{X}$. Thus, we have shown that $\left \{ x_{n_k} \right \}$ is a convergent subsequence of the Cauchy sequence $\left \{ x_n \right \}$ and, as a Cauchy sequence with a convergent subsequence converges to the same limit, we may conclude that $\left \{ x_n \right \}$ converges in $\mathfrak{X}$. Being that $\left \{ x_n \right \}$ was arbitrary, it then follows that $\mathfrak{X}$ is complete and, thus, a Banach space.  $\blacksquare$
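The completeness hypothesis in this lemma genuinely matters. A minimal sketch, assuming the space $c_{00}$ of finitely supported real sequences with the supremum norm (a standard example, not one used elsewhere in this post), where $e_n$ denotes the $n$-th standard unit vector:

```latex
% x_n = 2^{-n} e_n lies in c_{00}, and the series is absolutely convergent:
\[
  \sum_{n=1}^{\infty} \| 2^{-n} e_n \|_\infty = \sum_{n=1}^{\infty} 2^{-n} = 1 .
\]
% The partial sums converge in the supremum norm to a sequence with
% infinitely many nonzero entries, which is not an element of c_{00}:
\[
  \Bigl\| (2^{-1}, 2^{-2}, 2^{-3}, \ldots) - \sum_{n=1}^{N} 2^{-n} e_n \Bigr\|_\infty
  = 2^{-(N+1)} \longrightarrow 0 .
\]
```

So $c_{00}$ carries an absolutely convergent series that does not converge in $c_{00}$, which is consistent with $c_{00}$ not being a Banach space.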

Lemma

Let $\mathfrak{X}$ be a Banach space, $\mathcal{Y}$ a normed space, and $T \in \mathscr{B}(\mathfrak{X},\mathcal{Y})$. If $r,s > 0$ are constants such that $s\mathbb{B}_\mathcal{Y} \subset \overline{T(r\mathbb{B}_\mathfrak{X})}^o$, then $s\mathbb{B}_\mathcal{Y}^o \subset T(r\mathbb{B}_\mathfrak{X})^o$.

Proof

Let $\mathfrak{X}$ be a Banach space, $\mathcal{Y}$ a normed space, and $T \in \mathscr{B}(\mathfrak{X},\mathcal{Y})$. Further, suppose we have constants $r,s > 0$ such that $s\mathbb{B}_\mathcal{Y} \subset \overline{T(r\mathbb{B}_\mathfrak{X})}^o$. Noting that scaling is a homeomorphism, without loss of generality we take $r=s=1$ by alternatively considering $\frac{r}{s}T$.

Having done this, let us first choose an arbitrary $z \in \mathbb{B}_\mathcal{Y}$, and choose $\delta>0$ such that $\|z\|_\mathcal{Y} < 1-\delta< 1$ (that is, $z \in (1-\delta)\mathbb{B}_\mathcal{Y} \subseteq \mathbb{B}_\mathcal{Y}^o$). Then, define $y \in \mathcal{Y}$ by $y = (1-\delta)^{-1}z$, observing that $\|y\|_\mathcal{Y} = \|(1-\delta)^{-1}z\|_\mathcal{Y} < 1$. We will demonstrate that $y \in (1-\delta)^{-1}T(\mathbb{B}_\mathfrak{X})^o$, which correspondingly gives that $z \in T(\mathbb{B}_\mathfrak{X})^o$.

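To do so, we find a sequence $\left \{ y_n \right \}_{n=0}^\infty \subset \mathcal{Y}$ such that

$$y_n \rightarrow y \quad \text{and} \quad y_n - y_{n-1} \in \delta^{n-1}T(\mathbb{B}_\mathfrak{X})^o \ \text{ for all } n \geq 1.$$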

That is, a sequence which converges to $y$ and has the difference of successive terms in successively smaller contractions of $T(\mathbb{B}_\mathfrak{X})^o$.

As $z \in \overline{T(\mathbb{B}_\mathfrak{X})}^o$, $z$ is the limit of a sequence $\left \{ T(w_n) \right \}_{n=1}^\infty \subset T(\mathbb{B}_\mathfrak{X})^o$ where $w_n \in \mathbb{B}_\mathfrak{X}$. Then, it follows that for all $\epsilon > 0$, there exists $N(\epsilon) \in \mathbb{N}$ such that $\|z-T(w_m)\|_\mathcal{Y} < \epsilon$ for all $m \geq N(\epsilon)$. Following this, set $\epsilon_n = (1-\delta)\delta^n$. Now, let $y_n = (1-\delta)^{-1}T(w_{N(\epsilon_n)})$ for $n \geq 1$ and $y_0=0$ (which does indeed work, as $\|z\|_\mathcal{Y}<(1-\delta)$). Notice that this yields the following estimate for $n \geq 1$:

$$\|y - y_n\|_\mathcal{Y} = (1-\delta)^{-1}\|z - T(w_{N(\epsilon_n)})\|_\mathcal{Y} < (1-\delta)^{-1}\epsilon_n = \delta^n,$$

so that $y_n \rightarrow y$.

Thus, as $y_n \in (1-\delta)^{-1}T(\mathbb{B}_\mathfrak{X})^o$, we then have that $y_n-y_{n-1} \in \delta^{n-1}T(\mathbb{B}_\mathfrak{X})^o$, and our desired properties have been fulfilled.

Following this, we find a convergent series in $\mathfrak{X}$ whose image under $T$ converges to $y$, in order to demonstrate that $y \in (1-\delta)^{-1}T(\mathbb{B}_\mathfrak{X})$. That is, we seek a sequence $\left \{ x_n \right \}_{n=1}^\infty \subset \mathfrak{X}$ such that $\|x_n\|_\mathfrak{X} < \delta^{n-1}$ and $T(x_n)=y_n-y_{n-1}$. To do this, let us first notice that, for $x \notin \ker(T)$, we have $1 \leq \|T(\frac{x}{\|x\|_\mathfrak{X}})\|_\mathcal{Y}$, as $\mathbb{B}_\mathcal{Y}^o \subseteq \overline{T(\mathbb{B}_\mathfrak{X})}^o$. Setting $x_n = (1-\delta)^{-1}\left (w_{N(\epsilon_n)}-w_{N(\epsilon_{n-1})} \right )$, clearly $T(x_n) = y_n - y_{n-1}$ by linearity, and the following then holds for $n \geq 1$:

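Thus, $\|x_n\|_\mathfrak{X} < \delta^{n-1}$. Following this, we may now also notice that

$$\sum_{n=1}^\infty \|x_n\|_\mathfrak{X} < \sum_{n=1}^\infty \delta^{n-1} = \frac{1}{1-\delta} < \infty,$$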

so the series $\sum_{n=1}^\infty x_n$ is absolutely convergent, and correspondingly by the previous lemma, we then have that $\sum_{n=1}^\infty x_n = x^* \in \mathfrak{X}$ by the completeness of the space.

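Moreover, note $\|x^*\|_\mathfrak{X} < (1-\delta)^{-1}$, so $x^* \in (1-\delta)^{-1}\mathbb{B}_\mathfrak{X}$. Thus, by the linearity and continuity of $T$, we then have that

$$T(x^*) = \sum_{n=1}^\infty T(x_n) = \sum_{n=1}^\infty \left ( y_n - y_{n-1} \right ) = \lim_{N \rightarrow \infty} \left ( y_N - y_0 \right ) = y,$$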

and, correspondingly, $y \in (1-\delta)^{-1}T(\mathbb{B}_\mathfrak{X})^o$. Hence, it follows that

$$z = (1-\delta)y \in T(\mathbb{B}_\mathfrak{X})^o.$$

As such, we have demonstrated that, if $z \in \mathbb{B}_\mathcal{Y}^o \subset \overline{T(\mathbb{B}_\mathfrak{X})}^o$, then $z \in T(\mathbb{B}_\mathfrak{X})^o$ as well. As $z$ was chosen arbitrarily, this yields that $\mathbb{B}_\mathcal{Y}^o \subset T(\mathbb{B}_\mathfrak{X})^o$ as desired.  $\blacksquare$

Lemma

Let $V$ and $W$ be $\mathbb{R}$-vector spaces and $T: V \rightarrow W$ a linear map. If $C \subset V$ is convex in $V$, then $T(C)$ is convex in $W$.

Proof

Let $V$ and $W$ be $\mathbb{R}$-vector spaces, $T: V \rightarrow W$ be a linear map, and $C \subset V$ be convex in $V$. We seek to show that, for all $x,y \in T(C)$, $(1-t)x+ty \in T(C)$ for all $t \in [0,1]$.

To that end, let $x,y \in T(C)$ and $t \in [0,1]$ be arbitrary. Then, $x = T(v)$ and $y = T(u)$ for some $v,u \in C$. Moreover, by the convexity of $C$, we then have that $(1-t)v+tu \in C$. But, by the linearity of $T$, we obtain the following inclusion:

$$(1-t)x + ty = (1-t)T(v) + tT(u) = T\left ( (1-t)v + tu \right ) \in T(C).$$

Thus, $T(C)$ is convex as well.  $\blacksquare$

Now, we use these results to prove the open mapping theorem, which we will recall is stated as follows:

Theorem (Open Mapping)

Let $\mathfrak{X}$ and $\mathfrak{Y}$ be Banach spaces and $T : \mathfrak{X} \rightarrow \mathfrak{Y}$ a bounded linear operator. Then, if $T$ is surjective, $0 \in \operatorname{int} T(\mathbb{B}_\mathfrak{X})$.

Proof

Let $\mathfrak{X}$ and $\mathfrak{Y}$ be Banach spaces and $T : \mathfrak{X} \rightarrow \mathfrak{Y}$ a surjective bounded linear operator. If $\mathfrak{Y}$ is the trivial space, then we are done. Suppose $\mathfrak{Y}$ is not the trivial space. By the linearity of $T$, it is sufficient to show that $T$ maps $\mathbb{B}_\mathfrak{X}$ to a neighborhood of the origin of $\mathfrak{Y}$.

First, let us note that $\mathfrak{X} = \bigcup_{n=1}^\infty n\mathbb{B}_\mathfrak{X}$. Correspondingly, by the surjectivity and linearity of $T$, we then have that $\mathfrak{Y} = T(\mathfrak{X}) = \bigcup_{n=1}^\infty nT(\mathbb{B}_\mathfrak{X}).$
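The identity for $\mathfrak{Y}$ can be checked pointwise. A short sketch, where $x$ and $n$ are auxiliary choices made only for this verification:

```latex
% Given y in \mathfrak{Y}, surjectivity provides x with T(x) = y.
% Choose n \in \mathbb{N} with n > \|x\|_\mathfrak{X}, so that x/n \in \mathbb{B}_\mathfrak{X}.
\[
  y = T(x) = T\Bigl( n \cdot \frac{x}{n} \Bigr) = n\, T\Bigl( \frac{x}{n} \Bigr)
  \in n\, T(\mathbb{B}_\mathfrak{X}) .
\]
```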

As $\mathfrak{X}$ and $\mathfrak{Y}$ are Banach spaces, they are also Baire spaces. Correspondingly, as the whole space $\mathfrak{Y}$ has nonempty interior, there exists $n \in \mathbb{N}$ such that $\overline{nT(\mathbb{B}_\mathfrak{X})}^o \neq \emptyset$. Thus, there then must exist $y_0 \in \mathfrak{Y}$ and $r > 0$ such that $(y_0+r\mathbb{B}_\mathfrak{Y})^o \subset \overline{nT(\mathbb{B}_\mathfrak{X})}^o$.

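Moreover, observe that if $y \in (y_0 + r\mathbb{B}_\mathfrak{Y})^o \subset \overline{nT(\mathbb{B}_\mathfrak{X})}^o$, then $-y \in \overline{nT(\mathbb{B}_\mathfrak{X})}^o$ as well by the linearity of $T$ and the symmetry of $\mathbb{B}_\mathfrak{X}$. Thus, $(-y_0+r\mathbb{B}_\mathfrak{Y})^o \subseteq \overline{nT(\mathbb{B}_\mathfrak{X})}^o$, and as the image of a convex set under a bounded linear map is convex by the third lemma above (and the closure and interior of a convex set are again convex), we then have that

$$r\mathbb{B}_\mathfrak{Y}^o \subseteq \tfrac{1}{2}(y_0 + r\mathbb{B}_\mathfrak{Y})^o + \tfrac{1}{2}(-y_0 + r\mathbb{B}_\mathfrak{Y})^o \subseteq \overline{nT(\mathbb{B}_\mathfrak{X})}^o.$$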

By the second lemma, we then may conclude that $r\mathbb{B}_\mathfrak{Y}^o \subset nT(\mathbb{B}_\mathfrak{X})^o$. Therefore, as we have shown that $T$ maps $\mathbb{B}_\mathfrak{X}$ to a neighborhood of the origin of $\mathfrak{Y}$, it follows that $T$ is then an open map.  $\blacksquare$
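The final step, passing from a ball inside $T(\mathbb{B}_\mathfrak{X})$ to openness of $T$, can be spelled out as follows. This is a sketch under the assumption that $\rho\mathbb{B}_\mathfrak{Y} \subseteq T(\mathbb{B}_\mathfrak{X})$ for some $\rho > 0$ (for instance $\rho = r/n$ from the proof); the symbols $U$, $x$, $\varepsilon$ and $\rho$ are introduced here purely for illustration.

```latex
% Let U \subseteq \mathfrak{X} be open and x \in U; pick \varepsilon > 0 with
% x + \varepsilon \mathbb{B}_\mathfrak{X} \subseteq U. Linearity of T then gives
\[
  T(x) + \varepsilon\rho\, \mathbb{B}_\mathfrak{Y}
  \subseteq T(x) + \varepsilon\, T(\mathbb{B}_\mathfrak{X})
  = T\bigl( x + \varepsilon \mathbb{B}_\mathfrak{X} \bigr)
  \subseteq T(U) .
\]
% Every point T(x) of T(U) is therefore an interior point, so T(U) is open.
```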

NOTE

This post draws on a large number of resources that I’ve encountered at various points over the last few years, most of which I didn’t write down at the time.  If you recognize any of the proofs given above, I’d love to know where you’ve seen them so I can properly cite the source.