
Saturday, April 5, 2014

Theoretical Remarks #5 - the derivative of the inverse function

   In today's post (after quite a long time) I am going to discuss the differentiation of an inverse function $f^{-1}$, given that we know the derivative of $f$. I will start by stating the theorem that is the basic ingredient for working with the derivatives of inverse functions:
Theorem: Let $f:\Delta \rightarrow \mathbb{R}$ be a function defined on an interval $\Delta \subseteq \mathbb{R}$. If
  • $f$ is one-to-one ("1-1") and continuous on $\Delta$
  • $f$ is differentiable at a point $\xi \in \Delta$
  • $f'(\xi) \neq 0$
then the inverse function $f^{-1}:f(\Delta) \rightarrow \Delta \subseteq \mathbb{R}$ is also differentiable at the point $\zeta = f(\xi) \in f(\Delta)$ and we have 
\begin{equation} \label{invder1}(f^{-1})'( \zeta ) = \frac{1}{f'( \xi )}\end{equation}
which can be equivalently written (in Leibniz's notation) 
\begin{equation} \label{invder2}  \frac{dx}{dy} |_{\zeta=f(\xi)} = \frac{1}{\frac{dy}{dx} |_{\xi}}\end{equation}
with the understanding (in the notation) that:  $ \ y=f(x) \Leftrightarrow x=f^{-1}(y)$. 
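   The statement can be checked numerically. The following is only a minimal sketch, not part of the theorem: the sample function $f(x)=x^{3}+x$, the point $\xi=1$, the bisection routine and the step size are all my own assumptions, chosen because $f$ is strictly increasing and differentiable on $\mathbb{R}$ with $f'(1)=4 \neq 0$. The inverse is computed by bisection and its derivative at $\zeta=f(1)=2$ is estimated by a central difference, then compared with $1/f'(\xi)$.

```python
def f(x):
    """Sample function (an assumption): strictly increasing on all of R."""
    return x**3 + x


def f_prime(x):
    """Exact derivative of the sample function."""
    return 3 * x**2 + 1


def f_inverse(y, lo=-10.0, hi=10.0, tol=1e-13):
    """Invert f by bisection; valid because f is increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def inverse_derivative_numeric(y, h=1e-4):
    """Central-difference estimate of (f^{-1})'(y)."""
    return (f_inverse(y + h) - f_inverse(y - h)) / (2 * h)


xi = 1.0
zeta = f(xi)                                  # zeta = f(xi) = 2.0
numeric = inverse_derivative_numeric(zeta)    # estimate of (f^{-1})'(zeta)
predicted = 1.0 / f_prime(xi)                 # 1 / f'(xi) = 0.25

print(numeric, predicted)
```

   The two printed values should agree up to the error of the finite-difference approximation.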

Remark: If the conditions of the above theorem hold at every point $x_{0} \in \Delta$, then we can write (in a somewhat simplified notation):
\begin{equation} \label{invder3}  \frac{dx}{dy} |_{y_{0}} = \frac{1}{\frac{dy}{dx} |_{x_{0}}} \end{equation}
for every $x_{0} \in \Delta$ and $y_{0}=f(x_{0})$. In that case it is customary to write $$\frac{dx}{dy} = \frac{1}{\frac{dy}{dx}}$$ in $\Delta$. Again, we have to keep in mind that $ \ y=f(x) \Leftrightarrow x=f^{-1}(y)$.
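   As a quick illustration (the choice of function is mine), take $y=f(x)=e^{x}$ on $\Delta=\mathbb{R}$. It is one-to-one, continuous and differentiable everywhere, with $\frac{dy}{dx}=e^{x} \neq 0$, and its inverse is $x=f^{-1}(y)=\ln y$ on $f(\Delta)=(0,+\infty)$. The relation above then gives
$$\frac{dx}{dy} = \frac{1}{\frac{dy}{dx}} = \frac{1}{e^{x}} = \frac{1}{y}, \qquad y>0$$
which is the familiar derivative of the logarithm.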

Let us now proceed to a couple of proofs of the above result:

(I). A geometrical proof:
   The following figure provides a geometrical interpretation of the previous relation between the derivative of a function and the derivative of its inverse function: 
   It is well known that the graphs $C_{f}$ and $C_{f^{-1}}$ are symmetric with respect to the line $y=x$, the bisector of the first and third quadrants. 
   One can easily convince oneself that this symmetry implies that 
\begin{equation} \label{geominterpr}\theta+\varphi=\pi/2 \end{equation}
where $\theta \neq 0$ is the angle between the tangent line of $C_{f}$ at $A(\xi,\zeta)$ and the horizontal axis and $\varphi \neq \pi/2$ is the angle between the tangent line of $C_{f^{-1}}$ at $B(\zeta,\xi)$ and the horizontal axis. 
   But since $f'(\xi)=\tan\theta$ and $(f^{-1})'(\zeta)=\tan\varphi$, it suffices to invoke \eqref{geominterpr} together with the well known trigonometrical relation 
$$\tan\left(\frac{\pi}{2}-\theta\right)=\cot\theta=\frac{1}{\tan\theta}$$
to conclude that $(f^{-1})'(\zeta)=\frac{1}{f'(\xi)}$. 
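   The angle relation can also be checked on a concrete case. This is only a sketch under my own assumptions: I take the increasing function $f(x)=e^{x}$ with inverse $\ln$, and the point $\xi=0.5$, and I use the known slopes $e^{\xi}$ and $1/\zeta$ of the two tangent lines to compute the angles $\theta$ and $\varphi$.

```python
import math

xi = 0.5                 # sample point (an assumption)
zeta = math.exp(xi)      # A = (xi, zeta) lies on C_f, B = (zeta, xi) on C_{f^{-1}}

# Angle of the tangent to y = exp(x) at A: its slope is exp(xi).
theta = math.atan(math.exp(xi))

# Angle of the tangent to y = ln(x) at B: its slope is 1/zeta.
phi = math.atan(1.0 / zeta)

angle_sum = theta + phi  # should be close to pi/2
print(angle_sum, math.pi / 2)
```

   Here both slopes are positive, so both angles lie in $(0, \pi/2)$, which is the situation the geometrical argument describes.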

   The proof presented above is based on the geometrical interpretation of the derivative as the slope of the tangent line to the graph of the function. However, we could have taken a completely different route, using the chain rule of differentiation:

(II). A "proof" through the chain rule of differentiation: 
   Under the conditions of the theorem, it is clear that the function $f:\Delta \rightarrow \mathbb{R}$ has an inverse function $f^{-1}:f(\Delta) \rightarrow \Delta \subseteq \mathbb{R}$.
   We can combine the inverse function and the initial function into a composite function in the following sense:
\begin{equation} \label{inverderthrcomp}
x=f^{-1}(y)=f^{-1}(f(x))=(f^{-1} \circ f)(x)
\end{equation}
   Now we can straightforwardly apply the chain rule for differentiating composite functions: we differentiate \eqref{inverderthrcomp} with respect to $x$ at a point $x_{0} \in \Delta$, with $y_{0}=f(x_{0})$:
$$1 = \frac{dx}{dx}|_{x_{0}} = [(f^{-1})(y_{0})]' =  (f^{-1})'(y_{0})\cdot f'(x_{0}) \Rightarrow  (f^{-1})'(y_{0})  = \frac{1}{f'(x_{0}) }$$
which concludes the proof.
   1. The reader should pay particular attention to the notation at this point: the symbol $[(f^{-1})(y_{0})]' \equiv [f^{-1}(f(x_{0}))]' \equiv \frac{dx}{dx}|_{x_{0}}$ denotes the derivative of $f^{-1} \circ f$ with respect to $x$ (computed at $x_{0}$), while $(f^{-1})'(y_{0})$ denotes the derivative of $x=f^{-1}(y)$ with respect to $y$ (computed at $y_{0}$) and $f'(x_{0})$ denotes the derivative of $y=f(x)$ with respect to $x$ (as usual), computed at $x_{0}$.
   2. Using the Leibniz notation, we could have alternatively written:
$$1 = \frac{dx}{dx}|_{x_{0}} = \frac{dx}{dy}|_{y_{0}} \cdot \frac{dy}{dx}|_{x_{0}} \Rightarrow \frac{dx}{dy} |_{y_{0}} = \frac{1}{\frac{dy}{dx} |_{x_{0}}}$$
where $y_{0}=f(x_{0}) \Leftrightarrow x_{0}=f^{-1}(y_{0})$.
   3. The discerning reader should notice the following fact in this last "proof": we have only derived the formulas \eqref{invder1} and \eqref{invder2}. We have not shown (in fact we took it for granted) that under the conditions of the theorem the inverse function is actually differentiable, or in other words that $(f^{-1})'(y_{0}) = \frac{dx}{dy}|_{y_{0}} $ exists (in the sense that it is a real number). This is the reason for the quotation marks around the word proof.
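   A simple example (my own choice, added for illustration) shows that the differentiability of the inverse cannot be taken for granted in general. The function $f(x)=x^{3}$ is one-to-one, continuous and differentiable on $\mathbb{R}$, but $f'(0)=0$, so the third condition of the theorem fails at $\xi=0$. Its inverse is the real cube root $f^{-1}(y)=\sqrt[3]{y}$, and at $\zeta=f(0)=0$ the difference quotient is
$$\frac{f^{-1}(h)-f^{-1}(0)}{h} = \frac{\sqrt[3]{h}}{h} = \frac{1}{\sqrt[3]{h^{2}}} \rightarrow +\infty \quad \text{as } h \rightarrow 0$$
so $(f^{-1})'(0)$ does not exist as a real number, and the chain rule computation above cannot even be started at that point.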

   Finally, we have to mention that we can supply another proof of the above theorem by straightforward use of the definition of the derivative as the limit of the rate of change. We will discuss this proof in a subsequent post.
