A tour of functional analysis 1 – Locally convex vector spaces and the Hahn-Banach theorem(s)

One of the pillars of functional analysis is the Hahn-Banach theorem, so it makes sense to dedicate a post to this theorem. On normed spaces, the theorem has a plethora of interesting corollaries, some of which will be stated here. Locally convex spaces are of interest since they are the most rudimentary topological vector spaces on which the Hahn-Banach theorem can be used to extend continuous linear functionals, and they encompass a sizable chunk of the topological vector spaces one might meet in the wild.

– Locally Convex Topological Vector Spaces –

Locally convex topological vector spaces (LCTVS) can be characterized either as topological vector spaces $X$ where the origin has a neighborhood basis consisting of convex open sets $U$, each of which satisfies

\begin{array}{lll}
\lambda x \in U & \text{for all }|\lambda| \leq 1 \text{ and all }x\in U & \textbf{(Balanced)}\\
\bigcup^{\infty}_{t=1}tU = X & & \textbf{(Absorbent)}.
\end{array}

(Recall that on a topological vector space translation is a homeomorphism, since addition is continuous, so we only need to specify a neighborhood basis at one point to determine the topology.) Note also that any neighborhood of 0 in a topological vector space must be absorbent; this follows from the continuity of scalar multiplication.
Alternatively, one can define them as topological vector spaces whose topology is “induced” by a family of seminorms (this will be made precise shortly). Here I will follow the latter route.

Let $X$ be a complex vector space. We say that a map $\sigma : X\to \mathbb{R}$ is a seminorm if for all $x, y \in X$ and $\lambda \in \mathbb{C}$, it satisfies
\begin{array}{ll}
& \sigma(x) \geq 0 \\
& \sigma(x + y) \leq \sigma(x) + \sigma(y) \\
& \sigma(\lambda x) = |\lambda|\sigma(x)
\end{array}
That is, a seminorm is a norm that might not be “positive definite”, i.e. $\sigma(x) = 0 \not\Rightarrow x= 0$.

Example
With $p \geq 1$, let $\mathcal{L}^p(X, \mu)$ be the space of complex valued measurable functions $f$ on a measure space $(X, \mu)$ for which $|f|^p$ is $\mu$-integrable. Check that

$f \mapsto \left( \int_X |f|^p \, d\mu \right)^{1/p}$

is a seminorm on $\mathcal{L}^p(X, \mu)$.
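
As a rough sketch of the check (the shorthand $||f||_p$ for the map above and the set $N$ are notation introduced here for illustration, not part of the original example): homogeneity is immediate, the triangle inequality is Minkowski's inequality, and definiteness fails as soon as there is a non-empty $\mu$-null set $N$.

```latex
\begin{aligned}
||\lambda f||_p &= \left( \int_X |\lambda|^p |f|^p \, d\mu \right)^{1/p} = |\lambda| \, ||f||_p, \\
||f + g||_p &\leq ||f||_p + ||g||_p \qquad \text{(Minkowski's inequality)}, \\
||\mathbf{1}_N||_p &= \mu(N)^{1/p} = 0 \qquad \text{although } \mathbf{1}_N \neq 0 .
\end{aligned}
```

The last line is exactly the failure of positive definiteness described above: the indicator function of a null set is a non-zero function with seminorm zero.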

The topology induced by an arbitrary family of seminorms $\{ \sigma_\alpha \}$ is the weak topology induced by the seminorm maps $\sigma_\alpha: X\to \mathbb{R}$ which is the weakest (or coarsest) topology for which all the seminorms are continuous. Explicitly, it is the topology induced by the neighborhood subbasis elements of the origin given by

$V_{\epsilon, \alpha} = \{ x\in X ~ |~ \sigma_\alpha(x) < \epsilon \}.$

Norms are of course also seminorms, and Banach spaces are LCTVS’s since their topologies are weak topologies induced by a single “seminorm” (an actual norm this time).

Let $C$ be a convex set containing $0$. The point 0 is said to be an internal point of $C$ if the intersection of $C$ with every line through 0 contains an open segment around 0, that is, if for every $x \in X$ there is an $\epsilon > 0$ such that $tx \in C$ whenever $|t| < \epsilon$. Note that this is satisfied if $0$ is an interior point of $C$, or if $C$ is balanced and absorbent. This is the minimum structure we need to define the following (non-linear) functional, as it ensures the infimum is taken over a non-empty set:

Definition 1 (Minkowski Functional)
Let $C$ be a convex set containing $0$ as an internal point. The Minkowski functional $\rho_C: X\to \mathbb{R}$ is defined as the map

$\rho_C(x) = \inf\{ r ~ | ~ r> 0, ~ r^{-1}x \in C \}.$
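
To get a feel for the definition, here is a small worked example under the assumption that $X$ carries a norm $||\cdot||$ and that $C$ is its open unit ball (this choice of $C$ is mine, purely for illustration). In that case the Minkowski functional simply recovers the norm.

```latex
\begin{aligned}
C &= \{ x \in X ~|~ ||x|| < 1 \}, \\
r^{-1}x \in C &\iff ||x|| < r \qquad (r > 0), \\
\rho_C(x) &= \inf\{ r ~|~ r > 0, ~ r > ||x|| \} = ||x|| .
\end{aligned}
```

The same computation with the closed unit ball gives the same functional, so different convex sets can share a Minkowski functional.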

The Minkowski functionals (sometimes called gauge functionals) bridge the gap between the two definitions of LCTVS’s by assigning to each convex, absorbent and balanced neighborhood basis element $U$ of $0$ a functional $\rho_U$, and employing the following fact, which I state here without proof:

Proposition 1
If $\{U_\alpha\}$ is a neighborhood basis of convex absorbent and balanced sets, the associated family of Minkowski functionals $\{ \rho_\alpha \}$ is a family of seminorms which induce the same topology as $\{ U_\alpha\}$ .

The following lemma, in addition to its intrinsic value, will be of use later on in the proofs of the separation results.

Lemma 2
A linear functional $l$ on any complex topological vector space which is bounded on an open neighborhood of 0 is continuous.

Proof

Let $l$ be a linear functional that is bounded on a neighborhood $U$ of $0$. After rescaling $l$ we may assume without loss of generality that $|l(u)| < 1$ for all $u\in U$. By linearity $|l(x)| < \epsilon$ for all $x \in \epsilon U$, so $l(\epsilon U) \subset B_\epsilon(0)$ (the ball of radius $\epsilon$ centered at $0$). Since $\epsilon > 0$ was arbitrary, this shows continuity of $l$ at 0. Continuity at an arbitrary point $x$ then follows by linearity, since $l(x + \epsilon U) \subset B_\epsilon(l(x))$.

It is an easy exercise to check that the same conclusion holds if only $Re(l)$ or $|l|$ is bounded in the above lemma (using the formula $l(v) = Re(l(v)) - i Re(l(iv))$ ), and the lemma holds for real vector spaces as well.

– A word on metrizability –

Since every metrizable space must be (at the very least) first countable and Hausdorff, Proposition 1 has the consequence that if a LCTVS is metrizable it must be possible to induce its topology by a countable family of seminorms that separate points, namely the Minkowski functionals associated with a (countable) neighborhood basis of the origin. The first condition comes from the requirement that the space be first countable (i.e. has a countable neighborhood basis at 0), the second from the fact that it must be Hausdorff. Recall that a family $\mathcal{F}$ of seminorms on $V$ is said to separate points if for each pair of distinct $x,y \in V$ there exists a seminorm $\sigma \in \mathcal{F}$ such that $\sigma(x -y) > 0$. The important fact worth remembering is that this is actually a sufficient requirement:

Lemma 1 (metrizability)
A locally convex topological vector space is metrizable if and only if its topology can be induced by a countable family of separating seminorms.

Proof

We have argued that the requirement is necessary. To see it is sufficient, let $\{\sigma_n \}_{n=1}^\infty$ be a separating family of seminorms which induce the topology. Then the metric

$d(x, y) = \sum_{n= 1}^\infty\frac{ \sigma_n(x -y)}{2^n (1 + \sigma_n(x - y))}$

can be shown to induce the same topology.
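
As a partial sketch of what has to be checked, here is why $d$ satisfies the triangle inequality. The auxiliary function $h$ is notation introduced here for the argument; it is not part of the statement above.

```latex
\begin{aligned}
h(t) &:= \frac{t}{1+t}, \qquad h \text{ is increasing on } [0,\infty), \\
h(s+t) &= \frac{s}{1+s+t} + \frac{t}{1+s+t} \leq h(s) + h(t) \qquad (s, t \geq 0), \\
h(\sigma_n(x-z)) &\leq h\big(\sigma_n(x-y) + \sigma_n(y-z)\big) \leq h(\sigma_n(x-y)) + h(\sigma_n(y-z)), \\
d(x,z) &= \sum_{n=1}^\infty 2^{-n} h(\sigma_n(x-z)) \leq d(x,y) + d(y,z).
\end{aligned}
```

The series converges because each term is bounded by $2^{-n}$, and $d(x,y) = 0$ forces $\sigma_n(x-y) = 0$ for every $n$, hence $x = y$ because the family separates points. That $d$ induces the same topology still needs a separate argument.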

So for a LCTVS, metrizability is equivalent to being Hausdorff and first countable, which is equivalent to the existence of a countable family of separating seminorms which induce the topology. Be warned, some authors define topological vector spaces to be Hausdorff out of the box, so they might not even mention the “separates points” requirement, as that is baked into the definition of topological vector spaces.

Example (metrizability)
Let $X$ be a locally compact, $\sigma$-compact topological space and let $Y$ be a normed vector space. The compact open topology on $C(X,Y)$ (all continuous functions from $X$ to $Y$ ) is metrizable, since it is induced by a countable separating family of seminorms, namely

$\sigma_K(f) = \max\{ ||f(x)|| ~ :~ x\in K \}$

for each compact $K$ in a (countable) collection of compact sets $K_1 \subset K_2 \subset \dots$ which cover $X$, chosen such that every compact subset of $X$ is contained in some $K_n$ (local compactness is what makes such a choice possible).

– The Hahn-Banach Theorem –

Now that we got some of the basic theory of LCTVS’s out of the way, let’s proceed to the main topic of this post. Here is the original Hahn-Banach theorem for real vector spaces. A function $p: V\to \mathbb{R}$ on a real vector space $V$ is said to be sublinear if it satisfies

\begin{array}{ll}
p(u + v) \leq p(u) + p(v) & \text{for all } u, v\in V\\
p(r v) = rp(v) & \text{for all } r\geq 0.
\end{array}

Theorem 1 (Hahn-Banach) [real vector spaces version 1]
If $V_0 \subset V$ is a subspace of a real vector space $V$ , and $l_0 :V_0 \to \mathbb{R}$ is a linear functional, such that $l_0(x) \leq p(x)$ for all $x\in V_0$ and for some sublinear functional $p$ on $V$ , then $l_0$ can be extended to a linear functional $l$ on $V$ in such a way that

$l(x) \leq p(x) \qquad \text{for all }x\in V$

Proof

If $a \not\in V_0$ we may define a linear functional $l_a$ on $V_0 \oplus span\{a\}$ by

$l_a(v + ta) = l_0(v) + kt$

for $v \in V_0$, $t \in \mathbb{R}$ and some constant $k \in \mathbb{R}$. Choosing $k$ cleverly one can ensure that $l_a \leq p$ on $V_0 \oplus span\{a\}$.
Now form the collection of all linear extensions of $l_0$ dominated by $p$ and order them by “extension”, that is, $l_\alpha \prec l_\beta$ if $l_\beta$ is an extension of $l_\alpha$. The collection is non-empty (it contains $l_0$ itself) and can be shown to be inductively ordered, hence by Zorn’s lemma it has a maximal element, which must be defined on the whole of $V$, or else one could extend it further (by the above argument), which contradicts the maximality assumption.
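
For the record, here is one way to spell out the “clever” choice of $k$ in the one-step extension. This is the standard argument, written with the symbols of the proof above; $u$ and $w$ are auxiliary vectors introduced here.

```latex
\begin{aligned}
l_0(u) + l_0(w) = l_0(u + w) &\leq p(u + w) \leq p(u - a) + p(w + a) \qquad (u, w \in V_0), \\
\sup_{u \in V_0}\big( l_0(u) - p(u - a) \big) &\leq k \leq \inf_{w \in V_0}\big( p(w + a) - l_0(w) \big), \\
t > 0: \quad l_0(v) + kt &\leq l_0(v) + t\big( p(t^{-1}v + a) - l_0(t^{-1}v) \big) = p(v + ta), \\
t < 0: \quad l_0(v) + kt &\leq l_0(v) + t\big( l_0(-t^{-1}v) - p(-t^{-1}v - a) \big) = p(v + ta).
\end{aligned}
```

The first line shows that the interval in the second line is non-empty, so a suitable $k$ exists. In general the interval contains more than one point, so the extension is far from unique.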

If the vector space is equipped with a countable (Hamel) basis, it may be tempting to rewrite the above proof as an induction argument on the basis, and hope to get away without relying on any of the choice axioms (in particular the axiom of dependent choices). But unfortunately there is no canonical way to choose the value of $k$ at each step in the iteration, so we would still need the axiom. Be warned that the reliance on the axiom of dependent choices is deemed so innocuous that many authors understate or completely miss it.

The Hahn-Banach theorem was first discovered by Hahn and later independently by Banach. At this stage we have not had any use for the condition that the extension be dominated by a sublinear functional, and it may seem strange that both authors came up with this seemingly arbitrary condition, but there is a method to the madness. First off, note that norms, seminorms and the Minkowski functionals are examples of sublinear functionals, so one can think of the sublinear functionals as distilling the essential features of these maps, which ensures that an extension of a continuous linear functional on a normed space is also continuous (see the next corollary). Secondly, Hahn never actually formulated the theorem by means of sublinear functionals at all, but rather something to the effect of the next corollary. The above theorem is due to Banach. Thirdly, both authors likely knew about a previous result of Helly which showed that continuous linear functionals on normed spaces could be extended to continuous linear functionals on the space obtained by adjoining a single basis vector to the domain. So the theorem should be seen as an attempt to generalize Helly’s result to arbitrary continuous extensions of continuous linear functionals on normed spaces.

The above theorem can be extended to complex vector spaces. First we define a complex sublinear functional on $V$ to be a functional satisfying

$\begin{array}{ll} p(u + v) \leq p(u ) + p(v) & \text{for all } u, v \in V \\ p(\alpha u) = |\alpha| p(u) & \text{for all } \alpha \in \mathbb{C} \text{ and all } u\in V \end{array}$

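that is, $p$ is just a seminorm on $V$: nonnegativity comes for free, since $0 = p(0) \leq p(u) + p(-u) = 2p(u)$. (In contrast, a real sublinear functional as in Theorem 1 can attain negative values.)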

Theorem 2 (Hahn-Banach) [complex vector spaces version 1]
If $V_0 \subset V$ is a subspace of a (complex) vector space $V$, and $l^0 :V_0 \to \mathbb{C}$ is a linear map such that $|l^0(x)| \leq p(x)$ for all $x\in V_0$ and for some complex sublinear functional $p$ on $V$, then $l^0$ can be extended to a linear functional $l$ on $V$ in such a way that

$|l(x)| \leq p(x) \qquad \text{for all }x\in V$

Proof

There is a 1-1 correspondence between real linear functionals on $V$ (regarding $V$ for a second as a real vector space) and complex linear functionals on $V$. Applied to $l^0$ on $V_0$, the correspondence is given by

$l^0(v) = l^0_r(v) - il^0_r(iv)$

where $l^0_r = Re(l^0)$ is a real linear functional on $V_0$ satisfying $l^0_r \leq |l^0| \leq p$. We may extend $l^0_r$ to a real linear functional $l_r$ on $V$ such that $l_r(v) \leq p(v)$ (by Theorem 1). The corresponding complex linear functional $l(v) = l_r(v) - il_r(iv)$ does the trick, since for any $v\in V$, with

$\alpha = \begin{cases} \frac{\overline{l(v)}}{|l(v)|} & \text{if } l(v) \neq 0\\ 1 & \text{otherwise} \end{cases}$

we have that

$|l(v)| = l(\alpha v) = l_r(\alpha v) \leq p(\alpha v) = |\alpha |p(v) = p(v).$
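
Two details are implicit in the argument and can be checked by hand, assuming the correspondence is written as $l(v) = l_r(v) - i l_r(iv)$ (the sign is forced by $l(iv) = i l(v)$ ): the functional $l$ is complex linear, and on $V_0$ it agrees with $l^0$. A short verification, offered as a sketch:

```latex
\begin{aligned}
l(iv) &= l_r(iv) - i\,l_r(i \cdot iv) = l_r(iv) - i\,l_r(-v) = l_r(iv) + i\,l_r(v) = i\big( l_r(v) - i\,l_r(iv) \big) = i\,l(v), \\
Im(l^0(v)) &= -Re(i\,l^0(v)) = -Re(l^0(iv)) = -l^0_r(iv) \qquad (v \in V_0).
\end{aligned}
```

The first line, together with real linearity, gives complex linearity. The second line shows that $l^0$ is determined by its real part, so $l$ restricted to $V_0$ is $l^0$.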

The next theorem is sometimes called the analytic version of the Hahn-Banach theorem, or simply just the Hahn-Banach theorem. All vector spaces will henceforth be assumed to be complex.

Corollary 2.1
Let $V_0$ be a subspace of a normed vector space $V$, and let $l_0$ be a continuous linear functional on $V_0$. Then there exists a continuous linear functional $l$ on $V$ that extends $l_0$, such that $||l_0|| = ||l||$.

Proof

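The map $p(x)=||l_0|| \, ||x||$ is the sublinear functional we need. Employing Theorem 2 yields an extension $l$ with $|l(x)| \leq ||l_0|| \, ||x||$, so $||l|| \leq ||l_0||$, and the reverse inequality holds since $l$ extends $l_0$.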

Corollary 2.2
If $Y\subset V$ is a closed subspace of a normed vector space $V$ , and $x\not \in Y$ , then there is a continuous linear functional $l$ on $V$ such that

$||l|| = 1$
$l(x) = d(x, Y):= \inf\{ ||x - y|| ~:~ y \in Y \}$
$l\in Y^0$ (the annihilator of $Y$ )

Proof

On $Y + span\{ x\}$ define the linear functional

$l_0(y + \alpha x) = \alpha d(x, Y)$

for $y \in Y$ and scalars $\alpha$, and check that it has norm 1, then employ Corollary 2.1. The resulting extension does the trick.
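
The norm computation can be sketched as follows, taking the functional on $Y + span\{x\}$ to be $l_0(y + \alpha x) = \alpha d(x, Y)$ for $y \in Y$. The sequence $y_n$ is an auxiliary choice introduced here; it exists by the definition of the infimum.

```latex
\begin{aligned}
|l_0(y + \alpha x)| &= |\alpha| \, d(x, Y) \leq |\alpha| \, || x - (-\alpha^{-1}y) || = || y + \alpha x || \qquad (\alpha \neq 0, ~ y \in Y), \\
\frac{|l_0(x - y_n)|}{||x - y_n||} &= \frac{d(x, Y)}{||x - y_n||} \longrightarrow 1 \qquad \text{for } y_n \in Y \text{ with } ||x - y_n|| \to d(x, Y).
\end{aligned}
```

The first line gives $||l_0|| \leq 1$ and the second gives $||l_0|| \geq 1$. Note that $d(x, Y) > 0$ because $Y$ is closed and $x \not\in Y$, which is where closedness enters.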

Corollary 2.3
If $V$ is a normed vector space, $x_0\in V$ an arbitrary point in $V$ and $l_0$ an arbitrary continuous linear functional on $V$, then

$||x_0|| = \sup\{ |l(x_0)|~ : ~ l \text{ is a continuous linear functional on } V, \text{ with } ||l||\leq 1 \}$
$||l_0|| = \sup\{ |l_0(x)| ~:~ x \in V, ~ ||x||\leq 1 \}$

The first supremum is always attained.

Proof

The second supremum is just the definition of the dual norm, so that clearly holds. In the first supremum, $||x_0||$ clearly dominates the values over which we take the supremum. Conversely, if $x_0 \neq 0$ (the case $x_0 = 0$ being trivial), using Corollary 2.2 with $Y = \{0\}$, we know there exists a functional $l$ with $||l|| = 1$ such that

$l(x_0) = d(x_0, \{ 0\}) = ||x_0||$

which concludes the proof.

In the above corollary, if $V$ happens to be a Banach space (that is, it is complete with respect to its norm), the second supremum is attained for every continuous linear functional $l_0$ if and only if the space is reflexive, which just means it is isomorphic to its second dual under the canonical imbedding. This is called James’ theorem, and is a deep result of functional analysis which will not be covered here, but it should definitely be mentioned.

– Separation Theorems –

Now that some of the corollaries to the Hahn-Banach theorem for normed spaces have been covered we turn to the case of LCTVS. Here we will need the following analogue to Theorem 1,

Theorem 3 (Hahn-Banach) [Vector space version 2]
Let $V$ be a real vector space, and $A_1, A_2 \subset V$ nonempty disjoint convex subsets of $V$ . Assume that $A_1$ contains an internal point. Then there is a (non-constant) real linear functional $l$ on $V$ such that

$\sup\{ l(v)~| ~ v\in A_1\} \leq \inf\{ l(u) ~ |~ u\in A_2 \}.$

Proof

Let $v_1 \in A_1$ be the internal point, let $v_2\in A_2$ be arbitrary, and let $v_0 = v_2 - v_1$. Then $A := v_0 + A_1 - A_2$ is a convex set with 0 as an internal point, and $v_0 \not\in A$ since $A_1$ and $A_2$ are disjoint. Let $p_A$ be the Minkowski functional associated with $A$. Since $A$ is convex, contains $0$ and does not contain $v_0$, we have $p_A(v_0) \geq 1$, so the functional $l_0$ defined on the span of $v_0$ by $l_0(\alpha v_0) = \alpha$ is dominated by $p_A$: for $\alpha \geq 0$,

$l_0(\alpha v_0) = \alpha \leq \alpha p_A(v_0) = p_A(\alpha v_0),$

while for $\alpha < 0$ we have $l_0(\alpha v_0) = \alpha < 0 \leq p_A(\alpha v_0)$.

Using Theorem 1 we may extend $l_0$ to a (real) linear functional $l$ on $V$ such that $l(x) \leq p_A(x)$ , for all $x\in V$ . Check that this functional has the desired properties.
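
The check can be sketched in a few lines, using only that $l(v_0) = l_0(v_0) = 1$, that $l \leq p_A$, and that $p_A \leq 1$ on $A = v_0 + A_1 - A_2$ (the names $a_1, a_2$ are introduced here for arbitrary points of $A_1$ and $A_2$ ):

```latex
\begin{aligned}
l(a_1) - l(a_2) + 1 &= l(a_1 - a_2 + v_0) \leq p_A(a_1 - a_2 + v_0) \leq 1 \qquad (a_1 \in A_1, ~ a_2 \in A_2), \\
\sup\{ l(a_1) ~|~ a_1 \in A_1 \} &\leq \inf\{ l(a_2) ~|~ a_2 \in A_2 \}, \\
l(v_0) &= l_0(v_0) = 1 \neq 0 = l(0).
\end{aligned}
```

The first line gives $l(a_1) \leq l(a_2)$ for all pairs, which is the second line, and the third line shows that $l$ is non-constant.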

The preceding “separation” theorem, sometimes called the Hahn-Banach separation theorem, though to the best of my knowledge neither Hahn nor Banach actually stated it, has the following interesting consequence on locally convex spaces

Corollary 3.1 (Hahn-Banach) [LCTVS]
Let $F$ be a nonempty closed and convex subset of a LCTVS $V$ , and $x_0 \not\in F$ . Then there is a continuous linear functional $l$ on $V$ such that

$Re(l(x_0)) < \inf\{ Re(l(x)) ~ |~ x\in F \}.$

Proof

Since $0$ does not belong to the closed set $F - x_0$ and $V$ is a LCTVS, there is an open convex neighborhood $U$ of $0$ disjoint from $F - x_0$. Applying Theorem 3 to $U$ and $F - x_0$ (and passing from the real linear functional to the corresponding complex one), there exists a non-trivial linear functional $l$ such that

$\sup \{ Re(l(x)) ~ |~ x\in U \} \leq \inf \{ Re(l(x)) ~ |~ x \in F - x_0 \} .$

By Lemma 2, $l$ is continuous, since $Re(l)$ is bounded on the neighborhood $U \cap (-U)$ of $0$. Now $\sup\{ Re(l(x))~ |~ x \in U\} \geq 0$ (since $l(0) = 0$ ). This inequality is strict. To see this, suppose the supremum were $0$, and recall that earlier we noted that any neighborhood of $0$ of a topological vector space is absorbent. Given any $z \in V$, pick a scalar $\alpha$ with $|\alpha| = 1$ such that $|l(z)| = l(\alpha z)$ and an $\epsilon > 0$ such that $\epsilon \alpha z \in U$. Then $\epsilon |l(z)| = Re(l(\epsilon \alpha z)) \leq 0$, which implies $l(z) = 0$ and, since $z$ was arbitrary, that $l = 0$. This contradicts that $l$ was non-trivial so we conclude that

$\inf \{ Re(l(v)) ~|~ v \in F-x_0 \} > 0$

which concludes the proof.

Corollary 3.2 Separating sets
Let $C$ and $K$ be disjoint convex sets in a LCTVS $V$ , with $C$ closed and $K$ compact. Then there is a continuous linear functional $l$ on $V$ such that

$\max\{ Re(l(x)) ~ |~ x\in K\} < \inf\{ Re(l(x))~|~ x\in C \}$

Proof

Check that $C-K$ is closed (using nets) and convex. Since $C$ and $K$ are disjoint, $0\not\in C-K$, so Corollary 3.1 (applied to $F = C - K$ and $x_0 = 0$ ) yields a continuous linear functional $l$ on $V$ such that $\inf\{ Re(l(x)) ~|~ x\in C- K \} > 0$, which (since $Re(l)$ attains its maximum on the compact set $K$ ) in turn yields

$0 < \inf\{Re(l(x)) ~|~ x\in C \} - \max\{ Re(l(y)) ~|~ y\in K \}$

which concludes the proof.

Corollary 3.3
If $V$ is a LCTVS, $K\subset V$ is a closed linear subspace, and $x \not\in K$, then there is a continuous linear functional $l_0 \in K^0$ (the annihilator of $K$ ) such that

$l_0(x) > 0.$

Proof

From Corollary 3.1 choose a non-trivial continuous linear functional $l$ such that

$Re(l(x)) < \inf\{ Re(l(y)) ~|~ y\in K \}.$

Assume there is a $y\in K$ such that $Re(l(y)) \neq 0$. Since $K$ is a subspace, $by \in K$ for any $b\in \mathbb{R}$, so $Re(l(x)) < b Re(l(y))$ for all $b\in \mathbb{R}$, which is impossible since $Re(l(x))$ is finite. Hence $Re(l(y)) = 0$ for all $y \in K$, and since $iy \in K$ as well, also $Im(l(y)) = -Re(l(iy)) = 0$. Hence $l|_K = 0$, and in particular $Re(l(x)) < 0$, so $l(x) \neq 0$. Setting $l_0 = \frac{\overline{l(x)}}{|l(x)|} l$ concludes the proof.