On the Convolution of Distributions with Compactly Supported Smooth Functions

Apr 30

Theorem: Let $\Lambda\in\mathcal{D}'(\mathbb{R}^n),$ $\phi\in\mathcal{D}(\mathbb{R}^n),$ and $\psi\in\mathcal{D}(\mathbb{R}^n).$ Then the following hold:

(a) $\tau_x(\Lambda*\phi)=(\tau_x\Lambda)*\phi=\Lambda*(\tau_x\phi)$ for all $x\in\mathbb{R}^n$

(b) $\Lambda*\phi\in\mathcal{C}^\infty(\mathbb{R}^n)$ and for every multi-index $\alpha,$ $$D^\alpha(\Lambda*\phi)=(D^\alpha\Lambda)*\phi=\Lambda*(D^\alpha\phi)$$

(c) $\Lambda*(\phi*\psi)=(\Lambda*\phi)*\psi$

Motivation: The goal of this post is to prove part (c) of the above theorem (theorem 6.30 of Rudin’s “Functional Analysis”) without using any vector-valued integration (specifically, $\mathcal{D}_K$-valued integration) as Rudin uses. We will use Folland’s technique of Riemann approximation in his proof of the same theorem. I will explain every nitty-gritty detail required for the said approximation technique (through Lemma 1 and Lemma 2 below). Those who have worked a lot with dyadic cubes will find Lemma 1 pretty straightforward to prove. However, I would still recommend that the reader skim through the proof of Lemma 1, to appreciate exactly why compactness is needed. Lemma 2 uses Lemma 1 to construct the said Riemann approximation. Folland doesn’t explicitly provide any information about these two lemmas and leaves the construction as an exercise for the reader. I urge every reader to not underestimate this exercise since the “uniform convergence” argument is not, to me at least, trivial and uses some pretty elegant (though not difficult) steps that come together, almost magically, to produce the final proof (see how the proof of Lemma 2 uses part (iii) of Lemma 1). So for those who are not reading the chapters of Grandpa Rudin in order (like me) and don’t yet know vector-valued integration, fear not! This post is for you.

Definition: A dyadic cube $Q$ of side length $2^{-k}$ in $\mathbb{R}^n$ for any $k\in \mathbb{Z}$ is of the form $$Q=\prod_{j=1}^n[z_j2^{-k},(z_j+1)2^{-k})\text{ where }z_1,z_2,\dots,z_n\in \mathbb{Z}$$

We define $\mathcal{D}_k$ to be the collection of all such dyadic cubes of side length $2^{-k}.$
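To make the definition concrete, here is a small Python sketch (my own illustration, not something from Rudin or Folland) that finds the unique cube of $\mathcal{D}_k$ containing a given point. The function names `dyadic_cube_index` and `dyadic_cube_bounds` are made up for this post, and exact rational arithmetic is used so that the half-open endpoints are handled without rounding surprises.

```python
import math
from fractions import Fraction

def dyadic_cube_index(x, k):
    """Return (z_1, ..., z_n) such that z_j 2^-k <= x_j < (z_j + 1) 2^-k."""
    scale = Fraction(2) ** k  # exact, also for negative k
    return tuple(math.floor(Fraction(xj) * scale) for xj in x)

def dyadic_cube_bounds(z, k):
    """Return the half-open sides [z_j 2^-k, (z_j + 1) 2^-k) as (left, right) pairs."""
    side = Fraction(2) ** (-k)
    return [(zj * side, (zj + 1) * side) for zj in z]

z = dyadic_cube_index((0.3, -0.1), 2)
print(z)                         # index of the cube of side 1/4 containing the point
print(dyadic_cube_bounds(z, 2))  # its sides, as exact fractions
```

Since every coordinate has exactly one admissible $z_j$, the cubes of $\mathcal{D}_k$ partition $\mathbb{R}^n$, which is the fact used at the start of the proof of Lemma 1 (i).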

Lemma 1: Let $K$ be a compact subset of $\mathbb{R}^n.$ For each $m\in\mathbb{N},$ define $$\mathcal{C}^K_{m}=\{Q\in\mathcal{D}_m:Q\cap K\neq \varnothing\}$$

Let $\mu$ denote the Lebesgue measure. Then, the following three properties hold:

(i) $K\subset \cup_{Q\in \mathcal{C}^K_{m}}Q\subset K+\overline{B}(0,2^{-m}\sqrt{n})$ for all $m\in\mathbb{N}.$

(ii) $\lim_{m\to\infty}\mu\left( \cup_{Q\in \mathcal{C}^K_{m}}Q\right)=\mu(K).$

(iii) the number of cubes in $\mathcal{C}^K_{m}$ is bounded by $2^nL_K^n2^{mn}$ for some $L_K<\infty.$

Note: We, technically, don’t need (ii) for the final proof. However, I have included it for completeness.
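Before the proof, a quick numerical sanity check of (ii) and (iii) may help. The sketch below is only an illustration under my own assumptions: I take $K=[-1,1]^2,$ which sits inside the closed ball of radius $r_K=\sqrt{2},$ and I take $L_K=2^{-1}\sqrt{n}+r_K.$ The helper names are hypothetical.

```python
import math
from itertools import product

def cubes_meeting_box(m, n, r=1):
    """Index tuples z of the cubes Q in D_m with Q meeting K = [-r, r]^n (r an integer).

    In one coordinate, [z 2^-m, (z+1) 2^-m) meets [-r, r] exactly when
    z 2^-m <= r and (z+1) 2^-m > -r, i.e. -r 2^m <= z <= r 2^m for integer z.
    """
    lo, hi = -r * 2**m, r * 2**m
    return list(product(range(lo, hi + 1), repeat=n))

def lemma1_bound(m, n, r_K):
    """The bound 2^n L_K^n 2^(mn) of Lemma 1 (iii), with L_K = sqrt(n)/2 + r_K."""
    L_K = math.sqrt(n) / 2 + r_K
    return (2 * L_K * 2**m) ** n

n = 2
r_K = math.sqrt(n)  # assumption: K = [-1, 1]^2 lies in the closed ball of radius sqrt(2)
for m in range(1, 6):
    count = len(cubes_meeting_box(m, n))
    measure = count * 2.0 ** (-m * n)  # measure of the union of the cubes
    print(m, count, lemma1_bound(m, n, r_K), measure)
```

In this example the count stays below the bound for every $m,$ and the measure of the union decreases towards $\mu(K)=4.$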

Proof of Lemma 1:

Proof of (i) $K\subset \cup_{Q\in \mathcal{C}^K_{m}}Q$ is obvious from the definition of $\mathcal{C}^K_{m}$ and the fact that $\cup_{Q\in \mathcal{D}_{m}}Q=\mathbb{R}^n.$ Now, consider any $x\in Q\in\mathcal{C}^K_{m}.$ Consider any $y\in Q\cap K,$ which is guaranteed to exist by definition. Now, $||x-y||\le\text{diam}(Q)=2^{-m}\sqrt{n}$ by the Pythagorean theorem. Since $x=y+(x-y),$ we conclude that $x\in K+\overline{B}(0,2^{-m}\sqrt{n}).$ Hence, $\cup_{Q\in \mathcal{C}^K_{m}}Q\subset K+\overline{B}(0,2^{-m}\sqrt{n}).$

Proof of (ii) Since $\mathbb{R}^n$ has the Heine-Borel property, $K$ must be bounded. Say, $K\subset \overline{B}(0,r_K)$ for some $r_K>0.$ Then, $K+\overline{B}(0,2^{-1}\sqrt{n})\subset \overline{B}(0,2^{-1}\sqrt{n}+r_K).$ Since the latter has finite measure, so does $K+\overline{B}(0,2^{-1}\sqrt{n}).$ Hence, using continuity of measures from above, $$\lim_{m\to\infty}\mu\left(K+\overline{B}(0,2^{-m}\sqrt{n})\right)=\mu\left(\cap_{m=1}^\infty \big(K+\overline{B}(0,2^{-m}\sqrt{n})\big)\right)$$

Obviously $K\subset \cap_{m=1}^\infty \big(K+\overline{B}(0,2^{-m}\sqrt{n})\big).$ Now, let $x\in \cap_{m=1}^\infty \big(K+\overline{B}(0,2^{-m}\sqrt{n})\big)$. For every $m\in \mathbb{N},$ since $x\in K+\overline{B}(0,2^{-m}\sqrt{n}),$ we have $\text{dist}(K,x)=\inf_{y\in K}||x-y||\le 2^{-m}\sqrt{n}.$ Thus, $\text{dist}(K,x)=0$ and since $K$ is closed (Heine-Borel property) it must contain its limit point $x.$ Hence, $K=\cap_{m=1}^\infty \big(K+\overline{B}(0,2^{-m}\sqrt{n})\big).$

So $\lim_{m\to\infty}\mu\left(K+\overline{B}(0,2^{-m}\sqrt{n})\right)=\mu(K).$ Now, a simple application of the Squeeze theorem, along with (i), shows (ii).

Proof of (iii) Let $r_K$ be as in the proof of (ii), so that $K\subset\overline{B}(0,r_K)\subset[-r_K,r_K]^n,$ and set $L_K=2^{-1}\sqrt{n}+r_K.$ Every cube in $\mathcal{C}^K_m$ intersects $K$ and hence also the big cube $[-r_K,r_K]^n.$ A cube $\prod_{j=1}^n[z_j2^{-m},(z_j+1)2^{-m})\in\mathcal{D}_m$ intersects $[-r_K,r_K]^n$ exactly when $-r_K2^m-1<z_j\le r_K2^m$ for every $j.$ This half-open interval has length $2r_K2^m+1,$ so it contains at most $2r_K2^m+2$ integers. Hence, the number of cubes in $\mathcal{C}^K_m$ is bounded by $$\left(2r_K2^m+2\right)^n=\left(2(r_K+2^{-m})2^m\right)^n\le \left(2L_K2^m\right)^n= 2^nL_K^n2^{mn}$$

Note that the “$+1$” in the length of the interval comes from the fact that the sides of the cubes in $\mathcal{D}_m$ are “half-open and half-closed” while the big cube has closed sides. The inequality uses $2^{-m}\le 2^{-1}\le 2^{-1}\sqrt{n}$ for $m\ge 1.$

Lemma 2: Let $f,g\in\mathcal{C}^1(\mathbb{R}^n)$ such that $\text{supp}(f)\subset K$ where $K$ is compact (equivalently, $f$ is compactly supported). For every $m\in \mathbb{N}$ and every $Q\in\mathcal{C}^{K}_{m}$ pick an $a_Q\in Q\cap K.$ Construct the sequence of functions $(S^m)_{m=1}^\infty$ defined as $$S^m(x)=\sum_{Q\in\mathcal{C}^{K}_{m}}\mu(Q)f(a_Q)\tau_{a_Q}g(x)\text{ for all }x\in\mathbb{R}^n$$

(i) $(S^m)_{m=1}^\infty$ converges pointwise to $f*g.$

(ii) If $g$ is also compactly supported then $(S^m)_{m=1}^\infty$ converges uniformly to $f*g.$

Proof of Lemma 2: Consider any $x\in\mathbb{R}^n.$ Define $h^x:\mathbb{R}^n\to\mathbb{C}$ as $h^x(y)= f(y)\tau_y g(x)$ for all $y\in\mathbb{R}^n.$

Consider any $m\in\mathbb{N}.$ Denote $\cup_{Q\in \mathcal{C}^K_{m}}Q=U_m.$ Note that $$f*g(x)=\int f(y)g(x-y)dy=\int f(y)\tau_y g(x)dy=\int h^x(y)dy=\int_{U_m}h^x(y)dy$$

The justification for the above steps is that $\text{supp}(h^x)\subset\text{supp}(f)\subset K\subset U_m$ from Lemma 1 (i).

Hence, using triangle inequality, $$\left|f*g(x)-S^m(x)\right|=\left|\int_{U_m} h^x(y)dy-\sum_{Q\in\mathcal{C}^{K}_{m}}\mu(Q)h^x(a_Q)\right|$$

$$=\left|\sum_{Q\in\mathcal{C}^{K}_{m}}\left(\int_{Q} h^x(y)dy-\mu(Q)h^x(a_Q)\right)\right|\le\sum_{Q\in\mathcal{C}^{K}_{m}}\left|\int_{Q} h^x(y)dy-\mu(Q)h^x(a_Q)\right|$$

The equality in the last line follows from the fact that the dyadic cubes in $\mathcal{C}^{K}_{m}$ are pairwise disjoint and their union is $U_m.$ Now,

$$\sum_{Q\in\mathcal{C}^{K}_{m}}\left|\int_{Q} h^x(y)dy-\mu(Q)h^x(a_Q)\right|=\sum_{Q\in\mathcal{C}^{K}_{m}}\left|\int_{Q} \big(h^x(y)-h^x(a_Q)\big)dy\right|$$$$\le\sum_{Q\in\mathcal{C}^{K}_{m}}\int_{Q} |h^x(y)-h^x(a_Q)|dy\le \sum_{Q\in\mathcal{C}^{K}_{m}}\int_{Q}\left[\left(\sup_{\lambda\in [0,1]}\big|\big|Dh^x|_{y+\lambda(a_Q-y)}\big|\big|_{\text{op}}\right)||y-a_Q||\right]dy \text{ }(\star)$$

The last inequality follows from the Mean-Value Theorem. Here, $Dh^x|_y$ denotes the Fréchet derivative of $h^x$ and $\big|\big|Dh^x|_y\big|\big|_{\text{op}}$ denotes its operator norm at $y\in\mathbb{R}^n.$ If $f=f_1+if_2$ and $g=g_1+ig_2$ where $f_1,f_2,g_1,$ and $g_2$ are real functions, then we know that $\partial_kf_j$ and $\partial_kg_j$ exist and are continuous everywhere for all $k\le n,j\le 2$. If $h^x=h^x_1+ih^x_2$ where $h^x_1$ and $h^x_2$ are real functions then $h^x_1(y)=f_1(y)g_1(x-y)-f_2(y)g_2(x-y)$ and $h^x_2(y)=f_1(y)g_2(x-y)+f_2(y)g_1(x-y).$ So $\partial_kh^x_j$ exists and is continuous everywhere for all $k\le n,j\le 2.$ Hence, it is easily verified that $h^x$ is Fréchet-differentiable (view $h^x$ as an $\mathbb{R}^2$-valued function and apply the same reasoning as in Theorem 9.21 of Baby Rudin).

We know that the operator norm of any linear operator is bounded by its Frobenius norm (see sec. 9.9 of Baby Rudin). From the above reasoning (using Theorem 9.21), we conclude that $$\big|\big|Dh^x|_y\big|\big|_{\text{op}}\le \sqrt{\sum_{j=1}^2\sum_{k=1}^n |\partial_kh^x_j(y)|^2}\text{ for all }y\in\mathbb{R}^n$$

Fix a $Q\in \mathcal{C}^{K}_{m}.$ For any $y'\in Q,$ note that $$\partial_k h^x_1(y')=-f_1(y')\partial_k g_1(x-y')+g_1(x-y')\partial_k f_1(y')+f_2(y')\partial_k g_2(x-y')-g_2(x-y')\partial_k f_2(y')$$ Define $M_f$ and $M^x_g$ as $$M_f=\sup_{k\in\{1,2\}}\sup_{|\alpha|\le 1}\sup_{y\in K}||\partial^\alpha f_k(y)||\text{ }\text{ and }\text{ }M^x_g=\sup_{k\in\{1,2\}}\sup_{|\alpha|\le 1}\sup_{y\in x-K}||\partial^\alpha g_k(y)||$$

Both are finite since $f,g\in\mathcal{C}^1(\mathbb{R}^n)$ and both $K$ and $x-K$ are compact. Since $\text{supp}(h^x)\subset K,$ we have $\partial_k h^x_1(y')=0$ if $y'\notin K.$ On the other hand, if $y'\in K$ then $|\partial_k h^x_1(y')|\le 4M_fM^x_g.$ So in either case, the inequality holds. Similarly, $|\partial_k h^x_2(y')|\le 4M_fM^x_g.$ Hence, there exists a constant $M_x$ dependent only on $x$ (when $f$ and $g$ are fixed) such that $\big|\big|Dh^x|_{y'}\big|\big|_{\text{op}}\le M_x.$

If, however, $g$ has a compact support, we can do better. Define $M_f$ and $M^x_g$ as $$M_f=\sup_{k\in\{1,2\}}\sup_{|\alpha|\le 1}\sup_{y\in \mathbb{R}^n}||\partial^\alpha f_k(y)||\text{ }\text{ and }\text{ }M^x_g=\sup_{k\in\{1,2\}}\sup_{|\alpha|\le 1}\sup_{y\in \mathbb{R}^n}||\partial^\alpha g_k(y)||$$

Again, both terms are finite since both $f$ and $g$ are compactly supported $\mathcal{C}^1$ functions (implying that the suprema are effectively taken over compact sets in both definitions). We compute $M_x$ as before. However, here our constant $M_x$ is the same (say $M$) for all values of $x.$

For any $y\in Q$ and $\lambda\in [0,1],$ since $y,a_Q\in Q,$ the convexity of $Q$ implies that $y'=y+\lambda(a_Q-y)\in Q.$ So the above result holds for $y'.$ Hence, we conclude that $$\big|\big|Dh^x|_{y+\lambda(a_Q-y)}\big|\big|_{\text{op}}\le M_x\text{ for all }\lambda\in [0,1]\text{ for all }y\in Q\text{ for all }Q\in \mathcal{C}^{K}_{m}$$

Also note that for any $Q\in\mathcal{C}^{K}_{m}$ and $y\in Q,$ we have $||y-a_Q||\le\text{diam}(Q)=2^{-m}\sqrt{n}.$

From Lemma 1 (iii), $\mathcal{C}^{K}_{m}$ has at most $2^nL_K^n2^{mn}$ cubes for some $L_K<\infty.$ Hence, continuing from $(\star)$, we get $$\sum_{Q\in\mathcal{C}^{K}_{m}}\left|\int_{Q} h^x(y)dy-\mu(Q)h^x(a_Q)\right|\le M_x\cdot(2^{-m}\sqrt{n})\sum_{Q\in\mathcal{C}^{K}_{m}}\int_{Q}dy$$$$\le M_x\cdot(2^{-m}\sqrt{n})\cdot (2^nL_K^n2^{mn})\cdot 2^{-mn}=(M_xL_K^n\sqrt{n})2^{n-m}$$

Combining all our results, we have $$\left|f*g(x)-S^m(x)\right|\le (M_xL_K^n\sqrt{n})2^{n-m}$$

Hence, $(S^m)_{m=1}^\infty$ converges pointwise to $f*g.$ If $g$ is compactly supported, then $M_x=M$ for all $x$ meaning $(S^m)_{m=1}^\infty$ converges uniformly to $f*g.$
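To see Lemma 2 in action, here is a one-dimensional Python sketch. Everything in it is my own choice for illustration: $f(y)=(1-y^2)^2$ on $K=[-1,1]$ (extended by zero, which makes it $\mathcal{C}^1$), $g=\cos$ (not compactly supported, but it and its derivative are bounded, so the constant $M_x$ in the proof can still be chosen independently of $x$), and $a_Q$ taken as the left endpoint of each cube, which lies in $Q\cap K$ here. The closed form used for $f*g$ comes from integrating by parts by hand.

```python
import math

def f(y):
    """C^1 function supported in K = [-1, 1]."""
    return (1 - y * y) ** 2 if abs(y) <= 1 else 0.0

g = math.cos  # bounded with bounded derivative (an assumption of this sketch)

def S(m, x):
    """Riemann-type sum S^m(x) = sum over Q of mu(Q) f(a_Q) g(x - a_Q).

    The cubes of D_m meeting [-1, 1] are [z h, (z + 1) h) for z = -2^m, ..., 2^m
    with h = 2^-m, and a_Q = z h is a point of Q intersected with K.
    """
    h = 2.0 ** -m
    return sum(h * f(z * h) * g(x - z * h) for z in range(-2**m, 2**m + 1))

def conv_exact(x):
    """(f * g)(x) = (32 sin 1 - 48 cos 1) cos x for this particular f and g."""
    return (32 * math.sin(1) - 48 * math.cos(1)) * math.cos(x)

def sup_error(m, xs):
    """Largest gap |f*g(x) - S^m(x)| over the sample points xs."""
    return max(abs(conv_exact(x) - S(m, x)) for x in xs)

xs = [-3 + 0.1 * k for k in range(61)]
for m in range(1, 9):
    print(m, sup_error(m, xs))
```

The printed gaps shrink as $m$ grows, uniformly over the sample points. For this smooth example they shrink much faster than the $2^{-m}$ rate guaranteed by the lemma, which is only an upper bound.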

Lemma 3: If $\Lambda\in\mathcal{D}'(\mathbb{R}^n)$ and $\phi\in\mathcal{D}(\mathbb{R}^n)$ then $y\mapsto \langle \Lambda,\tau_{y}\phi\rangle$ is a $\mathcal{C}^\infty$ function on $\mathbb{R}^n.$

Note: This follows easily from the fact that $(\tau_{-h\mathbf{e}_j}\psi-\psi)/h\rightarrow\partial_j\psi$ in the topology of $\mathcal{D}(\mathbb{R}^n)$ as $h\rightarrow 0$ for any $\psi\in\mathcal{D}(\mathbb{R}^n)$ (recall the convention $\tau_y\psi(x)=\psi(x-y)$). The reader may see the appendix for the proof of this fact. Using induction, one can easily find all the partial derivatives of the given function in the lemma using the above fact with $\psi=\tau_y\phi$. Hence, we skip the details of the proof.

At this point, we are finally ready to prove the theorem.

Proof of Theorem:

Fix an $x\in\mathbb{R}^n.$ Define $\zeta,\xi\in\mathcal{D}(\mathbb{R}^n)$ as $\zeta=\tau_x\tilde{\psi}$ and $\xi=\tilde{\phi}.$ (I’m using these symbols because they look sexy; also I have never been able to write zeta on paper neatly, so I’m making up for said inadequacy via an opportunity to be able to write it on print at least).

Consider any multi-index $\alpha.$ Then, both $\zeta$ and $\partial^\alpha \xi$ are obviously $\mathcal{C}^1$ and compactly supported. Construct the sequence $(S^m_\alpha)_{m=1}^\infty$ as in Lemma 2 with $f=\zeta,g=\partial^\alpha \xi,$ and $K=\text{supp}(\zeta).$ By Lemma 2 (ii), it converges uniformly to $\zeta*\partial^\alpha \xi.$ Now, notice two points:

(a) For all $m\in\mathbb{N}$ we have, by definition of the sequences, that $\partial^\alpha S^m=S^m_\alpha$ where $S^m=S^m_{(0,0,\dots,0)}$

(b) Using Proposition 8.10 (Folland), $\partial^\alpha(\zeta*\xi)=\zeta*\partial^\alpha \xi$

Hence, $(\partial^\alpha S^m)_{m=1}^\infty$ converges uniformly to $\partial^\alpha(\zeta*\xi)$ for every multi-index $\alpha.$ Note that $(S^m)_{m=1}^\infty$ and $\zeta*\xi$ are all supported in one compact set $K_0.$ To see this, note that each $a_Q$ lies in $\text{supp}(\zeta),$ so $\text{supp}(S^m)\subset\text{supp}(\zeta)+\text{supp}(\xi),$ and also $\text{supp}(\zeta*\xi)\subset \text{supp}(\zeta)+\text{supp}(\xi),$ so simply taking $K_0=\text{supp}(\zeta)+\text{supp}(\xi)$ works. Hence, $(S^m)_{m=1}^\infty$ converges to $\zeta*\xi$ in the topology of $\mathcal{D}_{K_0}$ and, thereby, also in the topology of $\mathcal{D}(\mathbb{R}^n)$ by Theorem 6.5(b) (Grandpa Rudin). So,

$$\langle \Lambda,\zeta*\xi\rangle=\lim_{m\to\infty}\langle \Lambda,S^m\rangle=\lim_{m\to\infty}\sum_{Q\in\mathcal{C}^{\text{supp}(\zeta)}_{m}}\mu(Q)\zeta(a_Q)\langle \Lambda,\tau_{a_Q}\xi\rangle$$

Now, using Lemma 3, $y\mapsto \langle \Lambda,\tau_y\xi\rangle$ is smooth. Hence, $\eta_1:y\mapsto \zeta(y)\langle \Lambda,\tau_y\xi\rangle$ is $\mathcal{C}^1.$ Define $\eta_2:y\mapsto 1.$ Then, $\eta_1*\eta_2$ is the constant function with value $\int \eta_1(y)dy.$ Again, we construct a sequence as in Lemma 2 using $f=\eta_1,g=\eta_2$ and $K=\text{supp}(\zeta).$ We can do this because $\text{supp}(\eta_1)\subset\text{supp}(\zeta).$ Since $\eta_2$ is not compactly supported, only Lemma 2 (i) applies: this sequence converges pointwise to $\eta_1*\eta_2$ (consider any point, say $0$) and we get

$$\lim_{m\to\infty}\sum_{Q\in\mathcal{C}^{\text{supp}(\zeta)}_{m}}\mu(Q)\zeta(a_Q)\langle \Lambda,\tau_{a_Q}\xi\rangle=\lim_{m\to\infty}\sum_{Q\in\mathcal{C}^{\text{supp}(\zeta)}_{m}}\mu(Q)\eta_1(a_Q)\tau_{a_Q}\eta_2(0)=\eta_1*\eta_2(0)=\int \eta_1(y)dy=\int \zeta(y)\langle \Lambda,\tau_y\xi\rangle dy$$

We have, hence, shown that $\langle \Lambda,\zeta*\xi\rangle=\int \zeta(y)\langle \Lambda,\tau_y\xi\rangle dy.$ $(\star)$ At this point, we just need to put back our original $\phi$ and $\psi$ in the equation. Firstly, $$\zeta*\xi(t)=\int \zeta(y)\xi(t-y)dy=\int \psi(x-y)\phi(y-t)dy=\int \psi(x-t-(y-t))\phi(y-t)dy$$

$$=\int \psi(x-t-z)\phi(z)dz=\phi*\psi(x-t)=\widetilde{\phi*\psi}(t-x)=\tau_x(\widetilde{\phi*\psi})(t)$$

Hence, $\zeta*\xi=\tau_x(\widetilde{\phi*\psi}).$ So $\langle \Lambda,\zeta*\xi\rangle=\langle \Lambda,\tau_x(\widetilde{\phi*\psi})\rangle=\big(\Lambda*(\phi*\psi)\big)(x).$ Secondly, $$\int \zeta(y)\langle \Lambda,\tau_y\xi\rangle dy=\int \psi(x-y)\langle \Lambda,\tau_y\tilde{\phi}\rangle dy=\int \psi(x-y)(\Lambda*\phi)(y)dy=\big((\Lambda*\phi)*\psi\big)(x)$$

Hence, from $(\star)$, we have $\big(\Lambda*(\phi*\psi)\big)(x)=\big((\Lambda*\phi)*\psi\big)(x).$ Since $x$ is arbitrary, we conclude $\Lambda*(\phi*\psi)=(\Lambda*\phi)*\psi.$ Hence, proved.
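The change of variables behind $\zeta*\xi=\tau_x(\widetilde{\phi*\psi})$ is easy to get wrong by a sign, so here is a rough numerical check in one dimension. This is only a sketch under my own assumptions: $\phi$ and $\psi$ are $\mathcal{C}^1$ bumps (not genuine test functions), the convolution is approximated by a plain Riemann sum on $[-4,4],$ and the helper names are hypothetical.

```python
def phi(y):
    """C^1 bump supported in [-1, 1] (stand-in for a test function)."""
    return (1 - y * y) ** 2 if abs(y) <= 1 else 0.0

def psi(y):
    """C^1 bump supported in [-1/2, 1/2] (stand-in for a test function)."""
    return (1 - 4 * y * y) ** 2 if abs(y) <= 0.5 else 0.0

def conv(u, v, t, h=2.0 ** -10, R=4.0):
    """Riemann-sum approximation of (u * v)(t) = integral of u(y) v(t - y) dy over [-R, R]."""
    n = int(R / h)
    return h * sum(u(k * h) * v(t - k * h) for k in range(-n, n + 1))

def zeta_conv_xi(x, t):
    """(zeta * xi)(t) with zeta = tau_x psi~ and xi = phi~, where u~(y) = u(-y)."""
    zeta = lambda y: psi(x - y)  # (tau_x psi~)(y) = psi~(y - x) = psi(x - y)
    xi = lambda y: phi(-y)
    return conv(zeta, xi, t)

for x, t in [(0.3, 0.1), (1.0, -0.5), (-0.7, 0.2)]:
    # The two printed values should agree up to the discretization error.
    print(x, t, zeta_conv_xi(x, t), conv(phi, psi, x - t))
```

The two columns agree to several decimal places, consistent with $\zeta*\xi(t)=\phi*\psi(x-t).$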

Appendix:

Here, we prove a simple fact about test functions, concerning the notion of differentiation in $\mathcal{D}(\mathbb{R}^n)$ and $\mathcal{C}^\infty(\mathbb{R}^n).$ To avoid cluttered notation I will consider test functions on $\mathbb{R}$ (since partial derivatives involve only one dimension at a time anyway). Formally, we will show the following:

$$\lim_{h\to 0}\frac{\tau_{-h} f-f}{h}=f'\text{ in both }\mathcal{C}^\infty(\mathbb{R})\text{ and }\mathcal{D}(\mathbb{R})$$

Note that I also show convergence in $\mathcal{C}^\infty(\mathbb{R})$ (even though it’s unrelated to this article) because the proofs for both topologies are quite similar (even though the topologies are different).

(A) Convergence in $\mathcal{C}^\infty(\mathbb{R})$

The topology on $\mathcal{C}^\infty(\mathbb{R})$ is defined by the countable collection of semi-norms $$p_N(f)=\sup\Big(\{|f^{(n)}(x)|:n\in\mathbb{N}_0,n\le N,|x|\le n\}\Big)\text{ }\text{ for $N\in\mathbb{N}$}\text{ }\text{ (Grandpa Rudin, sec. 1.46)}$$

Let $A_hg=\frac{\tau_{-h}g-g}{h}$ for any nonzero $h\in \mathbb{R}$ and $g:\mathbb{R}\to\mathbb{R},$ so that $A_hg(x)=\frac{g(x+h)-g(x)}{h}$ (recall that $\tau_yg(x)=g(x-y)$). Now, consider any $N\in\mathbb{N}$ and an $n\in\mathbb{N}_0, n\le N.$ For any $|x|\le n,$ since $f^{(n)}$ is differentiable by definition of $C^\infty(\mathbb{R}),$ we can apply the mean value theorem to conclude $$A_hf^{(n)}(x)=\frac{f^{(n)}(x+h)-f^{(n)}(x)}{h}=f^{(n+1)}(x')\text{ for some }x'\text{ strictly between }x\text{ and }x+h$$

Pick an arbitrary $\epsilon>0.$ Now, $f^{(n+1)}$ is also continuous everywhere. Since $K_n=\{x\in\mathbb{R}:|x|\le n+1\}$ is compact, it must be uniformly continuous on this set. Hence, there exists some $\delta_n>0$ such that $$|f^{(n+1)}(y)-f^{(n+1)}(z)|<\epsilon\text{ for all }y,z\in K_n\text{ such that }|y-z|<\delta_n$$

Now, as $|x|\le n,$ we have $|x'|< |x|+|h|\le |x|+1\le n+1$ if $|h|\le 1.$ Hence, $x,x'\in K_n.$ So, when $|h|\le \min\{1,\delta_n\},$ we have $$|A_hf^{(n)}(x)-f^{(n+1)}(x)|=|f^{(n+1)}(x')-f^{(n+1)}(x)|<\epsilon\text{ as }|x-x'|<|h|\le \delta_n$$

This holds for arbitrary $|x|\le n,$ and hence, $$\sup\Big(\left\{\left|A_hf^{(n)}(x)-f^{(n+1)}(x)\right|:|x|\le n\right\}\Big)<\epsilon\text{ for all }|h|\le\min\{1,\delta_n\}$$

Now, if we pick any $h$ such that $|h|\le\min\{1,\delta_0,\delta_1,\dots,\delta_N\}$ then, $$p_N(A_hf-f')=\sup\Big(\left\{\left|A_hf^{(n)}(x)-f^{(n+1)}(x)\right|:n\in\mathbb{N}_0, n\le N,|x|\le n\right\}\Big)<\epsilon$$

Here, we used the fact that $(A_hf)^{(n)}=A_h(f^{(n)}).$ Hence, $\lim_{h\to 0}p_N(A_hf-f')=0.$ This holds for all $N\in\mathbb{N},$ so $$A_hf\longrightarrow f'\text{ in }\mathcal{C}^\infty\text{ as }h\longrightarrow 0$$
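As a numerical illustration of part (A), the sketch below measures how far the difference quotient $\big(f^{(n)}(x+h)-f^{(n)}(x)\big)/h$ is from $f^{(n+1)}(x),$ uniformly over a compact interval. The choices are mine and only for illustration: $f(x)=e^{-x^2}$ (smooth, though not compactly supported), the interval $[-1,1],$ and a finite sample grid standing in for the supremum.

```python
import math

def f0(x):
    return math.exp(-x * x)

def f1(x):  # first derivative of f0
    return -2 * x * math.exp(-x * x)

def f2(x):  # second derivative of f0
    return (4 * x * x - 2) * math.exp(-x * x)

derivs = [f0, f1, f2]

def sup_gap(h, n, radius=1.0, samples=2001):
    """Grid approximation of sup over |x| <= radius of |(f^(n)(x+h) - f^(n)(x))/h - f^(n+1)(x)|."""
    xs = [-radius + 2 * radius * i / (samples - 1) for i in range(samples)]
    fn, fn1 = derivs[n], derivs[n + 1]
    return max(abs((fn(x + h) - fn(x)) / h - fn1(x)) for x in xs)

for h in [1e-1, 1e-2, 1e-3]:
    print(h, sup_gap(h, 0), sup_gap(h, 1))
```

Both columns shrink roughly in proportion to $h,$ which is what the mean value theorem argument predicts when the next derivative is bounded on a slightly larger interval.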

(B) Convergence in $\mathcal{D}(\mathbb{R})$

Since $f\in\mathcal{D}(\mathbb{R}),$ there exists some compact set $K_0$ such that $f\in \mathcal{D}_{K_0}.$ Let $K_0\subset [-R,R]$ for some $R\in\mathbb{N}$ (this exists because compact sets in $\mathbb{R}$ are bounded). Let $K=[-R-1,R+1]$ which is compact by the Heine-Borel theorem. Since $\text{supp}(f)\subset [-R,R],$ we have $\text{supp}(A_hf^{(n)})\subset K$ for all $n\in\mathbb{N}_0$ whenever $|h|<1.$

We wish to show that $A_hf$ (with $|h|<1$) converges to $f'$ in $\mathcal{D}_K$ as $h$ approaches $0.$ Consider any $\epsilon>0.$ For a fixed $N\in\mathbb{N},$ and any $n\in\mathbb{N}_0$ with $n\le N,$ by the continuity of $f^{(n+1)}$ (and uniform continuity in $[-R-2,R+2]$), we have using the same steps (involving the mean-value theorem) as before, $$|A_hf^{(n)}(x)-f^{(n+1)}(x)|<\epsilon\text{ for all }x\in K\text{ whenever }|h|<\min\{1,\delta_n\}\text{ for some }\delta_n>0$$

Now, the topology on $\mathcal{D}_K$ is defined by the norms $$||f||_N=\sup\Big(\{|f^{(n)}(x)|:x\in K,n\in\mathbb{N}_0,n\le N\}\Big)\text{ for }N\in\mathbb{N}$$

So, if we take $|h|<\min\{1,\delta_0,\delta_1,\dots,\delta_N\},$ we have $$||A_hf-f'||_N=\sup\Big(\{|A_hf^{(n)}(x)-f^{(n+1)}(x)|:x\in K,n\in\mathbb{N}_0,n\le N\}\Big)<\epsilon$$

Hence, we have $\lim_{h\to 0}||A_hf-f'||_N=0$ for all $N\in\mathbb{N}.$ This shows convergence in $\mathcal{D}_K.$ Using Theorem 6.5(b) (Rudin), we have convergence in $\mathcal{D}(\mathbb{R}).$

Pritam Kayal
