{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Maximum Likelihood  #\n",
    "\n",
    "It is perfectly valid to calculate the fit of the Phat distribution to a univariate dataset using Maximum Likelihood Estimation (MLE) via negative log-likelihood. This process is available via the `fit` method (which inherits from `statsmodels` `GenericLikelihoodModel`.\n",
    "\n",
    "BUT there is one major issue as it pertains to the tails that must be considered.\n",
    "\n",
    "First, let's attempt to fit the Phat distribution to our familiar distribution of S&P 500 index level returns."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "nbsphinx": "hidden"
   },
   "outputs": [],
   "source": [
    "%load_ext autoreload\n",
    "%autoreload 2\n",
    "\n",
    "import seaborn as sns; sns.set(style = 'whitegrid')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[*********************100%***********************]  1 of 1 completed\n",
      "Optimization terminated successfully.\n",
      "         Current function value: -3.184565\n",
      "         Iterations: 160\n",
      "         Function evaluations: 275\n"
     ]
    }
   ],
   "source": [
    "import yfinance as yf\n",
    "import phat as ph\n",
    "\n",
    "sp = yf.download('^GSPC')\n",
    "sp_ret = sp.Close.pct_change()[1:]\n",
    "\n",
    "res = ph.Phat.fit(sp_ret)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0.0005961 , 0.00354794, 0.07451353, 0.06369988])"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "res.params"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table class=\"simpletable\">\n",
       "<caption>PhatFit Results</caption>\n",
       "<tr>\n",
       "  <th>Dep. Variable:</th>           <td>Close</td>       <th>  Log-Likelihood:    </th>  <td>  62558.</td> \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>Model:</th>                  <td>PhatFit</td>      <th>  AIC:               </th> <td>-1.251e+05</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <th>Method:</th>           <td>Maximum Likelihood</td> <th>  BIC:               </th> <td>-1.251e+05</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <th>Date:</th>              <td>Fri, 23 Jul 2021</td>  <th>                     </th>      <td> </td>    \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>Time:</th>                  <td>08:17:39</td>      <th>                     </th>      <td> </td>    \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>No. Observations:</th>       <td> 19644</td>       <th>                     </th>      <td> </td>    \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>Df Residuals:</th>           <td> 19643</td>       <th>                     </th>      <td> </td>    \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>Df Model:</th>               <td>     0</td>       <th>                     </th>      <td> </td>    \n",
       "</tr>\n",
       "</table>\n",
       "<table class=\"simpletable\">\n",
       "<tr>\n",
       "    <td></td>       <th>coef</th>     <th>std err</th>      <th>z</th>      <th>P>|z|</th>  <th>[0.025</th>    <th>0.975]</th>  \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>const</th> <td>    0.0006</td> <td> 5.14e-05</td> <td>   11.589</td> <td> 0.000</td> <td>    0.000</td> <td>    0.001</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <th>x1</th>    <td>    0.0035</td> <td> 3.42e-05</td> <td>  103.760</td> <td> 0.000</td> <td>    0.003</td> <td>    0.004</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <th>xi_l</th>  <td>    0.0745</td> <td>    0.009</td> <td>    8.373</td> <td> 0.000</td> <td>    0.057</td> <td>    0.092</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <th>xi_r</th>  <td>    0.0637</td> <td>    0.009</td> <td>    7.464</td> <td> 0.000</td> <td>    0.047</td> <td>    0.080</td>\n",
       "</tr>\n",
       "</table>"
      ],
      "text/plain": [
       "<class 'statsmodels.iolib.summary.Summary'>\n",
       "\"\"\"\n",
       "                               PhatFit Results                                \n",
       "==============================================================================\n",
       "Dep. Variable:                  Close   Log-Likelihood:                 62558.\n",
       "Model:                        PhatFit   AIC:                        -1.251e+05\n",
       "Method:            Maximum Likelihood   BIC:                        -1.251e+05\n",
       "Date:                Fri, 23 Jul 2021                                         \n",
       "Time:                        08:17:39                                         \n",
       "No. Observations:               19644                                         \n",
       "Df Residuals:                   19643                                         \n",
       "Df Model:                           0                                         \n",
       "==============================================================================\n",
       "                 coef    std err          z      P>|z|      [0.025      0.975]\n",
       "------------------------------------------------------------------------------\n",
       "const          0.0006   5.14e-05     11.589      0.000       0.000       0.001\n",
       "x1             0.0035   3.42e-05    103.760      0.000       0.003       0.004\n",
       "xi_l           0.0745      0.009      8.373      0.000       0.057       0.092\n",
       "xi_r           0.0637      0.009      7.464      0.000       0.047       0.080\n",
       "==============================================================================\n",
       "\"\"\""
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "res.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can see that both the left and right tail indices are much smaller than we have estimated using the [POT](pot.ipynb) and [Hill Double Bootstrap techniques](dblbs.ipynb). This phenomenon of underfitting in the tails results because the impact of extreme events on the dataset is not large enough to offset the gains from optimization in the body. Hence, we end up with thinner tails masking greater risk.\n",
    "\n",
    "Instead, we can estimate the tails separately and pass them as fixed values to our fit method. This results in just two free parameters, $\\mu$ and $\\sigma$, in the Gaussian body."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "xi_left, xi_right = ph.two_tailed_hill_double_bootstrap(sp_ret)\n",
    "res = ph.Phat.fit(sp_ret, xi_left, xi_right)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table class=\"simpletable\">\n",
       "<caption>PhatFit Results</caption>\n",
       "<tr>\n",
       "  <th>Dep. Variable:</th>           <td>Close</td>       <th>  Log-Likelihood:    </th>  <td>  61926.</td> \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>Model:</th>                  <td>PhatFit</td>      <th>  AIC:               </th> <td>-1.238e+05</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <th>Method:</th>           <td>Maximum Likelihood</td> <th>  BIC:               </th> <td>-1.238e+05</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <th>Date:</th>              <td>Fri, 23 Jul 2021</td>  <th>                     </th>      <td> </td>    \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>Time:</th>                  <td>08:18:30</td>      <th>                     </th>      <td> </td>    \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>No. Observations:</th>       <td> 19644</td>       <th>                     </th>      <td> </td>    \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>Df Residuals:</th>           <td> 19643</td>       <th>                     </th>      <td> </td>    \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>Df Model:</th>               <td>     0</td>       <th>                     </th>      <td> </td>    \n",
       "</tr>\n",
       "</table>\n",
       "<table class=\"simpletable\">\n",
       "<tr>\n",
       "    <td></td>       <th>coef</th>     <th>std err</th>      <th>z</th>      <th>P>|z|</th>  <th>[0.025</th>    <th>0.975]</th>  \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>const</th> <td>    0.0006</td> <td>  4.8e-05</td> <td>   12.858</td> <td> 0.000</td> <td>    0.001</td> <td>    0.001</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <th>x1</th>    <td>    0.0032</td> <td>  3.2e-05</td> <td>   98.791</td> <td> 0.000</td> <td>    0.003</td> <td>    0.003</td>\n",
       "</tr>\n",
       "</table>"
      ],
      "text/plain": [
       "<class 'statsmodels.iolib.summary.Summary'>\n",
       "\"\"\"\n",
       "                               PhatFit Results                                \n",
       "==============================================================================\n",
       "Dep. Variable:                  Close   Log-Likelihood:                 61926.\n",
       "Model:                        PhatFit   AIC:                        -1.238e+05\n",
       "Method:            Maximum Likelihood   BIC:                        -1.238e+05\n",
       "Date:                Fri, 23 Jul 2021                                         \n",
       "Time:                        08:18:30                                         \n",
       "No. Observations:               19644                                         \n",
       "Df Residuals:                   19643                                         \n",
       "Df Model:                           0                                         \n",
       "==============================================================================\n",
       "                 coef    std err          z      P>|z|      [0.025      0.975]\n",
       "------------------------------------------------------------------------------\n",
       "const          0.0006    4.8e-05     12.858      0.000       0.001       0.001\n",
       "x1             0.0032    3.2e-05     98.791      0.000       0.003       0.003\n",
       "==============================================================================\n",
       "\"\"\""
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "res.summary()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0.00061775, 0.00316159])"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "res.params"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The difference may not appear too meaningful but we do get a greater mean and lesser volatility at the first decimal place of the result."
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Edit Metadata",
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}