Tuesday, 17 September 2013

Options Strategies - Excel Work Book

Here is an Excel workbook that comes in handy when you want to try out various option strategies.
You can download this at




Also, visit http://optionstrategies.nsecentral.com/strategist/ for some live strategies based on the Nifty Index (NSE India)

Another very good reference guide can be downloaded at http://www.nseindia.com/content/ncfm/sm_otsm.pdf

Monday, 18 June 2012

Heteroscedasticity - Applying correction

Procedure to apply a correction for heteroscedasticity:


Step 1: Primary regression
Run the primary regression: yi = const + beta1*x1i + beta2*x2i + ... + ui
where var(ui) = σi**2 (the error variance may differ across observations).

Step 2: Auxiliary regression to estimate σi
Run a regression of the absolute value of the error term |ui| (the 'resid' of the primary regression) on the regressors: |ui| = alpha1 + alpha2*x1i + alpha3*x2i + ...
The fitted values of this regression are the estimates of σi.

Step 3: Run the regression with the weighted values:
yi* = yi/σi
x1i* = x1i/σi
x2i* = x2i/σi
constant* = 1/σi
...

and estimate

yi* = beta0*constant* + beta1*x1i* + beta2*x2i* + ... + ui*

The resulting regression is BLUE (Best Linear Unbiased Estimator).
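As an aside, the same weighting can be done in a single step with statsmodels' WLS class. Here is a minimal sketch, assuming y and X are the arrays from Step 1 and sigma_hat holds the fitted values from Step 2 (the names are illustrative):

import statsmodels.api as sm

# WLS minimizes sum(w_i * e_i**2); choosing w_i = 1/sigma_i**2 is equivalent
# to dividing y, the regressors and the constant by sigma_i as in Step 3
wls_results = sm.WLS(y, X, weights=1.0/sigma_hat**2).fit()
print wls_results.summary()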

Here is the sample code:


#! /usr/bin/env python

#--------------------------------------------------------------
# Heteroscedasticity - Applying Correction
#
# Cheryan M. T
# Data File : Salary-Year.csv (Salary vs. Years of experience, 222 samples)
#
# Use your data file instead of "Salary-Year.csv"

# Be sure to adjust the regression for the independent
# variables in your .csv file
#----------------------------------------------------------------

import numpy as np
import pandas
import statsmodels.api as sm
from scipy.stats import chi2
import matplotlib.pyplot as plt

salYears = pandas.read_csv("Salary-Year.csv")

'''
Data columns:
SALARY    222  non-null values
YEARS     222  non-null values
'''
# Prepare the dependent and independent variables
logSalary = np.log(salYears['SALARY'].values)
years = salYears['YEARS'].values
yearsSquared = years**2
X1 = np.c_[years, yearsSquared]
X = sm.add_constant(X1)
# Run the Regression
# "yt = beta1 + beta2*years + beta3*yearsSquared"
results = sm.OLS(logSalary, X).fit()
print
print " Result of Primary REGRESSION results = sm.OLS(logSalary, X)"
print "------------------------------------------------------------"
print results.summary()
print
print ("x1=years x2=years**2,  const")
print
#------------------------------------------------------------------------------------------------------
# White's Test - To test if the relation is heteroscedastic
'''
Run the auxiliary regression on the residuals.
The generic formula is:
ut**2 = alpha1 + alpha2*x1t + alpha3*x2t + alpha4*x1t**2 + alpha5*x2t**2 + alpha6*x1t*x2t + vt
'''
ut = results.resid
utsquared = ut**2
# Regressors for White's test: levels, squares and the cross-product.
# years**2 is omitted since it duplicates yearsSquared (already included as a level).
X_temp = np.c_[years, yearsSquared, yearsSquared**2, years*yearsSquared]
X_aux = sm.add_constant(X_temp)
results_white = sm.OLS(utsquared, X_aux).fit()

# Homoscedasticity means the variance of ut IS constant,
# i.e., in ut**2 = alpha1 + alpha2*x1t + alpha3*x2t + alpha4*x1t**2 + alpha5*x2t**2 + alpha6*x1t*x2t + vt,
# if alpha2 to alpha6 are all ZERO, then ut**2 = alpha1, which is constant.
# So the null hypothesis:
#       H0: alpha2 = alpha3 = ... = alpha6 = 0 (Homoscedasticity)
#       H1: at least one of alpha2 ... alpha6 is non-zero
# Running the LM test,
#  if the chi 2 -test statistic is
#  greater than the corresponding value from the statistical table then reject the null
#  hypothesis that the errors are homoscedastic

print
print " Result of Auxilliary ut**2 REGRESSION for White's Test prior correction"
print "-------------------------------------------------------------------------"
print results_white.summary()
T = results_white.nobs # Number of observations
rsquared = results_white.rsquared
degreesOfFreedom = 4 # Number of regressors in the auxiliary regression, excluding the constant term
chi_value = chi2.isf(0.05, degreesOfFreedom)
# If T * rsquared > chi_value, then reject the null hypothesis

print "-----------------------------------------------------------------------------"
print " Result of HETEROSCEDASTICITY TEST - WHITE's Test"
print "--------------------------------------------------"
if T * rsquared > chi_value :
        print " - Rejecting the Null Hypothesis that it is Homoscedastic.\n    There is evidence of Heteroscedasticity "
        print " - T*rsquare = %0.2f >  chi2 Value=%0.2f"%(T * rsquared,chi_value)
else:
        print " - T*R-Squared = %0.2f <  chi2 Value=%0.2f"%(T * rsquared,chi_value)
        print " So:"
        print " - Null hypothesis (H0: Homoscedastic) is NOT rejected.\n   Means There is no evidence of Heteroscedasticity"
        print " "
print "-----------------------------------------------------------------------------"
print

# Plot the residuals against each regressor to eyeball the heteroscedasticity
fig1 = plt.figure()
p1 = fig1.add_subplot(211)
p1.scatter(yearsSquared, ut)
p2 = fig1.add_subplot(212)
p2.scatter(years, ut)
#------------(end of White's test)---------------

#----------Correction for heteroscedasticity-----------------------
# Regress |ut| = alpha1 + alpha2*years + alpha3*yearsSquared

# absolute value of ut from the primary regression
uhat_absolute = np.abs(ut)
X_temp = np.c_[years, yearsSquared]
X_correction = sm.add_constant(X_temp)
results_correction = sm.OLS(uhat_absolute, X_correction).fit()

# the fitted values of the regression |ut| = alpha1 + alpha2*years + alpha3*yearsSquared
# give an estimate of sigma_i
sigmaCap = results_correction.fittedvalues

logSalary_weighted = logSalary/sigmaCap
constant_weighted = 1/sigmaCap
years_weighted = years/sigmaCap
yearsSquared_weighted = yearsSquared/sigmaCap
X_weighted = np.c_[years_weighted, yearsSquared_weighted, constant_weighted]
results_weighted = sm.OLS(logSalary_weighted, X_weighted ).fit()

print
print " Result of regression after applying the correction: results_weighted = sm.OLS(logSalary_weighted, X_weighted )"
print "----------------------------------------------------------------------------------------------------------------"
print results_weighted.summary()
ut_weighted = results_weighted.resid
utsquared_weighted = ut_weighted**2
fig2 = plt.figure()
p21 = fig2.add_subplot(211)
p21.scatter(yearsSquared_weighted, ut_weighted)
p22 = fig2.add_subplot(212)
p22.scatter(years_weighted, ut_weighted)
plt.show()




RESULT:


I. Result of Primary REGRESSION results = sm.OLS(logSalary, X)
------------------------------------------------------------
                            OLS Regression Results                          
==============================================================================
Dep. Variable:                      y   R-squared:                       0.536
Model:                            OLS   Adj. R-squared:                  0.532
Method:                 Least Squares   F-statistic:                     126.6
Date:                Tue, 19 Jun 2012   Prob (F-statistic):           2.92e-37
Time:                        12:14:21   Log-Likelihood:                 36.205
No. Observations:                 222   AIC:                            -66.41
Df Residuals:                     219   BIC:                            -56.20
Df Model:                           2                                       
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1             0.0439      0.005      9.082      0.000         0.034     0.053
x2            -0.0006      0.000     -5.191      0.000        -0.001    -0.000
const          3.8094      0.041     92.151      0.000         3.728     3.891
==============================================================================
Omnibus:                        5.324   Durbin-Watson:                   1.434
Prob(Omnibus):                  0.070   Jarque-Bera (JB):                6.164
Skew:                          -0.195   Prob(JB):                       0.0459
Kurtosis:                       3.717   Cond. No.                     1.82e+03
==============================================================================

The condition number is large, 1.82e+03. This might indicate that there are
strong multicollinearity or other numerical problems.

x1=years x2=years**2,  const





 II. Result of Auxiliary ut**2 REGRESSION for White's Test prior to correction
-------------------------------------------------------------------------
                            OLS Regression Results                          
==============================================================================
Dep. Variable:                      y   R-squared:                       0.090
Model:                            OLS   Adj. R-squared:                  0.073
Method:                 Least Squares   F-statistic:                     5.370
Date:                Tue, 19 Jun 2012   Prob (F-statistic):           0.000384
Time:                        12:14:21   Log-Likelihood:                 286.95
No. Observations:                 222   AIC:                            -563.9
Df Residuals:                     217   BIC:                            -546.9
Df Model:                           4                                       
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1             0.0002      0.008      0.021      0.983        -0.016     0.016
x2             0.0006      0.001      0.886      0.377        -0.001     0.002
x3          4.222e-07   2.88e-07      1.467      0.144     -1.45e-07  9.89e-07
x4         -3.264e-05   2.52e-05     -1.297      0.196     -8.22e-05   1.7e-05
const         -0.0018      0.026     -0.068      0.946        -0.053     0.049
==============================================================================
Omnibus:                      232.804   Durbin-Watson:                   1.669
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             8114.202
Skew:                           4.142   Prob(JB):                         0.00
Kurtosis:                      31.436   Cond. No.                     4.32e+06
==============================================================================

The condition number is large, 4.32e+06. This might indicate that there are
strong multicollinearity or other numerical problems.
-----------------------------------------------------------------------------
 Result of HETEROSCEDASTICITY TEST - WHITE's Test
--------------------------------------------------
 - Rejecting the Null Hypothesis that it is Homoscedastic.
   There is evidence of Heteroscedasticity
 - T*rsquare = 20.00 >  chi2 Value=9.49
-----------------------------------------------------------------------------

 


III. Result of regression after applying the correction: results_weighted = sm.OLS(logSalary_weighted, X_weighted )
----------------------------------------------------------------------------------------------------------------
                            OLS Regression Results                          
==============================================================================
Dep. Variable:                      y   R-squared:                       0.991
Model:                            OLS   Adj. R-squared:                  0.991
Method:                 Least Squares   F-statistic:                 1.222e+04
Date:                Tue, 19 Jun 2012   Prob (F-statistic):          2.29e-225
Time:                        12:14:21   Log-Likelihood:                -357.37
No. Observations:                 222   AIC:                             720.7
Df Residuals:                     219   BIC:                             730.9
Df Model:                           2                                       
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1             0.0366      0.003     11.872      0.000         0.030     0.043
x2            -0.0004   7.95e-05     -5.118      0.000        -0.001    -0.000
x3             3.8416      0.019    205.476      0.000         3.805     3.878
==============================================================================
Omnibus:                        0.068   Durbin-Watson:                   1.349
Prob(Omnibus):                  0.967   Jarque-Bera (JB):                0.066
Skew:                           0.038   Prob(JB):                        0.967
Kurtosis:                       2.962   Cond. No.                         939.
==============================================================================





Friday, 15 June 2012

Testing for Heteroscedasticity


Testing for heteroscedasticity by conducting White's test.
Data file: http://goo.gl/ghtdr
Refer to the comments in the code.


#! /usr/bin/env python

#--------------------------------------------------------------
# Heteroscedasticity - White's Test

#
# Cheryan M. T
# Data file : macro.xls
# Download the file macro.xls. Save the content as macro.csv file
#----------------------------------------------------------------

import numpy as np
import pandas
import statsmodels.api as sm
from scipy.stats import chi2

macro = pandas.read_csv("macro.csv")

'''
#---- Header from the panda data frame
Data columns:
Date                     254  non-null values
Microsoft                254  non-null values
SANDP                    254  non-null values
CPI                      254  non-null values
Industrial production    254  non-null values
USTB3M                   254  non-null values
USTB6M                   254  non-null values
USTB1Y                   254  non-null values
USTB3Y                   254  non-null values
USTB5Y                   254  non-null values
USTB10Y                  254  non-null values
M1MONEY SUPPLY           254  non-null values
CONSUMER CREDIT          254  non-null values
BAA-AAA SPREAD           254  non-null values
-----------------------------------------
Define the Regression Model
The variables are transformed as follows, where x(-1) denotes the one-period lag:
dspread = baa_aaa_spread - baa_aaa_spread(-1)
dcredit = consumer_credit - consumer_credit(-1)
dprod = industrial production - industrial production(-1)
rmsoft = 100*dlog(microsoft)
rsandp = 100*dlog(sandp)
dmoney = m1money supply - m1money supply(-1)
inflation = 100*dlog(cpi)
term = ustb10y - ustb3m
dinflation = inflation - inflation(-1)
mustb3m = ustb3m/12
rterm = term - term(-1)
ermsoft = rmsoft - mustb3m
ersandp = rsandp - mustb3m
'''


# Prepare the variables for the REGRESSION
dspread = np.diff(macro['BAA-AAA SPREAD'].values)
dcredit = np.diff(macro['CONSUMER CREDIT'].values)
dprod = np.diff(macro['Industrial production'].values)
returns_microsoft = 100*np.diff(np.log(macro['Microsoft'].values)) # the term rmsoft in the book
returns_sandp = 100*np.diff(np.log(macro['SANDP'].values)) # the term rsandp in the book
dmoney = np.diff(macro['M1MONEY SUPPLY'].values)
inflation = 100*np.diff(np.log(macro['CPI'].values))
term = macro['USTB10Y'].values - macro['USTB3M'].values
# The book defines dinflation = inflation - inflation(-1); np.diff(inflation)
# would drop one more observation and misalign with the other differenced
# series, so inflation is used directly here
dinflation = inflation
mustb3m_temp = macro['USTB3M'].values/12
mustb3m = mustb3m_temp[1:]  # drop the first value to align with the differenced series
rterm = np.diff(term)
e_returns_microsoft = returns_microsoft - mustb3m  # the term ermsoft in book
e_returns_sandp = returns_sandp - mustb3m # the term ersandp in the book

X1 = np.c_[e_returns_sandp, dprod, dcredit, dinflation, dmoney, dspread, rterm]
X = sm.add_constant(X1)

# Run the Regression
# "yt = beta1*x1t + beta2*x2t + beta3*x3t + beta4*x4t + beta5*x5t + beta6*x6t + beta7*x7t + constant + ut"

results = sm.OLS(e_returns_microsoft, X).fit()
print results.summary()
print
print ("x1=e_returns_sandp, x2=dprod, x3=dcredit, x4=dinflation,\nx5=dmoney, x6=dspread, x7=rterm, const")
print

# White's Test
'''
Run the auxiliary regression on the residuals.
The generic formula is:
ut**2 = alpha1 + alpha2*x1t + alpha3*x2t + alpha4*x1t**2 + alpha5*x2t**2 + alpha6*x1t*x2t + vt
We ignore the cross-product terms since we have 7 independent variables,
and use only the squares of the independent variables in the ut**2 regression
'''
ut = results.resid
utsquared = ut**2
X1 = np.c_[e_returns_sandp**2, dprod**2, dcredit**2, dinflation**2, dmoney**2, dspread**2, rterm**2]
X = sm.add_constant(X1)
results_white = sm.OLS(utsquared, X).fit()

# Homoscedasticity means the variance of ut IS constant,
# i.e., in ut**2 = alpha1 + alpha2*x1t**2 + ... + alpha8*x7t**2 + vt,
# if alpha2 to alpha8 are all ZERO, then ut**2 = alpha1, which is constant.
# So the null hypothesis:
#       H0: alpha2 = alpha3 = ... = alpha8 = 0 (Homoscedasticity)
#       H1: at least one of alpha2 ... alpha8 is non-zero
# Running the LM test:
#  if the chi2 test statistic is greater than the corresponding value from
#  the statistical table, then reject the null hypothesis that the errors
#  are homoscedastic

print " Result of ut**2 REGRESSION"
print "---------------------------"
print results_white.summary()
T = ut.size
rsquared = results_white.rsquared
# 5% significance with 7 degrees of freedom
# (7 regressors in the auxiliary regression, excluding the constant)
chi_value = chi2.isf(0.05, 7)

# If T * rsquared > chi_value, then reject the null hypothesis

print "-----------------------------------------------------------------------------"
print " Result of HETEROSCEDASTICITY TEST"
print "----------------------------------"
if T * rsquared > chi_value :
        print " - Rejecting the Null Hypothesis that it is Homoscedastic.\n   There is evidence of Heteroscedasticity "
else:
        print " - Null hypothesis (H0: Homoscedastic) is NOT rejected.\n   Means There is no evidence of Heteroscedasticity"
print "-----------------------------------------------------------------------------"
print
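As a cross-check on the manual LM computation above, recent versions of statsmodels ship a built-in White test. A minimal sketch, assuming ut holds the primary-regression residuals and X_primary the primary design matrix including the constant (note the script above reuses the name X for the auxiliary regressors):

from statsmodels.stats.diagnostic import het_white

# het_white includes the squares AND cross-products of the regressors, so its
# LM statistic will differ somewhat from the squares-only version computed above
lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(ut, X_primary)
print "LM statistic = %.2f, p-value = %.4f" % (lm_stat, lm_pvalue)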




Monday, 11 June 2012

Another Regression Example

 "Introductory Econometrics for Finance" - which uses 'Eviews' for regression. Translated it to python (numpy + statsmodel).
The example considers the situation where an investor wishes to hedge a long position in the S&P500 (or its constituent stocks) using a short position in futures contracts.  The hedge ratio (the number of units of the futures asset to sell per unit of the spot asset held) will be the slope estimate (i.e. β) in a regression where the dependent variable is a time series of spot returns and the independent variable is a time series of futures returns.
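To illustrate how the estimate is used, here is a hypothetical sketch (assuming results is the fitted returns regression from the code below, where the slope is the first parameter):

# Hypothetical usage: turn the estimated slope into a futures position
hedge_ratio = results.params[0]     # beta from the returns regression
units_spot = 100.0                  # assumed long position in the spot asset
units_futures_short = hedge_ratio * units_spot
print "Short %.2f units of futures per %.0f units of spot" % (units_futures_short, units_spot)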

Note: Data File Used : http://goo.gl/IDYKN
#! /usr/bin/env python

#--------------------------------------------------------------
# Read : Introductory Econometrics for Finance - CHRIS BROOKS
#
# Data File Used : http://goo.gl/IDYKN 
# (Download and save the data file as 'SP500hedge.csv')
#
# Cheryan M.T
#
#----------------------------------------------------------------


import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt
f = open('SP500hedge.csv', 'r')

nestedList = [line.split(',') for line in f]

f.close()

data = nestedList[1:]

np_array =  np.array(data, dtype=object)

spot = np_array[:,1].astype(float)
futures = np_array[:,2].astype(float)

returns_dfutures = 100*np.diff(np.log(futures))
returns_dspot = 100*np.diff(np.log(spot))

# Dependent variable - dspot
# Independent variable - dfutures
# dspot = alpha + beta * dfutures + ut
returns_dfutures1 = sm.add_constant( returns_dfutures ) 
results = sm.OLS(returns_dspot, returns_dfutures1).fit()
print " I. Summary for the regression on the 'Returns' of Spot and Futures"
print "------------------------------------------------------------------"
print results.summary()

print "\n\n"
# Let's run the regression on the raw futures and spot data
print " Summary for the regression on the raw Spot and Futures"
print "------------------------------------------------------------------"
futures1 = sm.add_constant( futures )
results_series = sm.OLS(spot, futures1).fit()
print results_series.summary()

fig = plt.figure()
p1 = fig.add_subplot(211)
p2 = fig.add_subplot(212)

p1.scatter(returns_dfutures, returns_dspot)
p1.plot(returns_dfutures,results.fittedvalues)
p1.set_xlabel('diff(log(futures))')
p1.set_ylabel('diff(log(spot))')


p2.scatter(futures, spot)
p2.plot(futures,results_series.fittedvalues)
p2.set_xlabel('futures')
p2.set_ylabel('spot')

plt.show()

RESULT:
 I. Summary for the regression on the 'Returns' of Spot and Futures
------------------------------------------------------------------
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.013
Model:                            OLS   Adj. R-squared:                 -0.002
Method:                 Least Squares   F-statistic:                    0.8571
Date:                Tue, 12 Jun 2012   Prob (F-statistic):              0.358
Time:                        09:01:14   Log-Likelihood:                -173.51
No. Observations:                  65   AIC:                             351.0
Df Residuals:                      63   BIC:                             355.4
Df Model:                           1
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1             0.1239      0.134      0.926      0.358        -0.143     0.391
const          0.3633      0.444      0.818      0.417        -0.525     1.251
==============================================================================
Omnibus:                       11.985   Durbin-Watson:                   2.117
Prob(Omnibus):                  0.002   Jarque-Bera (JB):               15.302
Skew:                          -0.742   Prob(JB):                     0.000476
Kurtosis:                       4.857   Cond. No.                         3.36
==============================================================================


 II. Summary for the regression on the raw Spot and Futures
------------------------------------------------------------------
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.964
Model:                            OLS   Adj. R-squared:                  0.964
Method:                 Least Squares   F-statistic:                     1726.
Date:                Tue, 12 Jun 2012   Prob (F-statistic):           5.16e-48
Time:                        09:01:14   Log-Likelihood:                -325.20
No. Observations:                  66   AIC:                             654.4
Df Residuals:                      64   BIC:                             658.8
Df Model:                           1
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1             0.9822      0.024     41.543      0.000         0.935     1.029
const         21.1107     27.711      0.762      0.449       -34.249    76.470
==============================================================================
Omnibus:                        5.056   Durbin-Watson:                   1.902
Prob(Omnibus):                  0.080   Jarque-Bera (JB):                4.696
Skew:                          -0.653   Prob(JB):                       0.0956
Kurtosis:                       3.017   Cond. No.                     7.78e+03
==============================================================================

The Scatter Plot:



Histogram of the error series (ut)

(To reproduce this, add to the above Python code:
 ut = returns_dspot - results.fittedvalues)

The following are true for ut:
(1) E(ut) = 0
     The errors have zero mean
(2) var(ut) = σ² < ∞
     The variance of the errors is constant and finite over all values of xt
(3) cov(ui, uj) = 0
     The errors are linearly independent of one another
(4) cov(ut, xt) = 0
     There is no relationship between the error and the corresponding x variate
(5) ut ∼ N(0, σ²), i.e. ut is normally distributed
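A quick numeric check of assumption (1), assuming ut has been computed as in the note above:

# With a constant in the regression, the residuals average to ~0 by construction
print "mean of residuals = %.2e" % ut.mean()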



< If there are copyright issues related to the data (SP500hedge.csv) file used, please let me know. I shall remove it from this example>



OLS Estimation

Simple Regression with ORDINARY LEAST SQUARES:

Assume the following Data:

Year   Excess Return on Funds   Excess Return on Markets
 1            17.8                       13.7
 2            39.0                       23.2
 3            12.8                        6.9
 4            24.2                       16.8
 5            17.2                       12.3
* Excess return over the risk-free rate


Independent variable (xt)= Excess Return on Markets
Dependent variable (yt) = Excess return on Funds

Let's model 
yt = α + βxt + u t

Fitting the line by OLS regression:
Slope:     β = cov(x,y)/var(x)
Intercept: α = mean(y) - β*mean(x)
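These formulas can be verified directly with NumPy before reaching for statsmodels; a minimal sketch using the data above:

import numpy as np

x = np.array([13.7, 23.2, 6.9, 16.8, 12.3])   # excess return on markets
y = np.array([17.8, 39.0, 12.8, 24.2, 17.2])  # excess return on funds

# np.cov(x, y)[0, 1] is the sample covariance; ddof=1 matches its normalization
beta = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
alpha = y.mean() - beta * x.mean()
print "beta = %.4f, alpha = %.4f" % (beta, alpha)   # beta = 1.6417, alpha = -1.7366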


> import numpy as np
> import statsmodels.api as sm
> x1 = np.array([13.7, 23.2, 6.9, 16.8, 12.3])
> y1 = np.array([17.8, 39.0, 12.8, 24.2, 17.2])
> x1 = sm.add_constant(x1)

# run the regression
> results = sm.OLS(y1, x1).fit()

# look at the results
> print results.summary()
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.928
Model:                            OLS   Adj. R-squared:                  0.903
Method:                 Least Squares   F-statistic:                     38.45
Date:                Mon, 11 Jun 2012   Prob (F-statistic):            0.00845
Time:                        13:41:31   Log-Likelihood:                -11.601
No. Observations:                   5   AIC:                             27.20
Df Residuals:                       3   BIC:                             26.42
Df Model:                           1                                         
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1             1.6417      0.265      6.200      0.008         0.799     2.484
const         -1.7366      4.114     -0.422      0.701       -14.829    11.356
==============================================================================
Omnibus:                          nan   Durbin-Watson:                   1.827
Prob(Omnibus):                    nan   Jarque-Bera (JB):                0.650
Skew:                           0.259   Prob(JB):                        0.723
Kurtosis:                       1.312   Cond. No.                         45.1
==============================================================================

 


From the above summary,

==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1             1.6417      0.265      6.200      0.008         0.799     2.484
const         -1.7366      4.114     -0.422      0.701       -14.829    11.356
==============================================================================
α = -1.7366,  β= 1.6417

The fitted line is

y = -1.7366 + 1.6417 x


Sum of Squared Residuals:
> print results.ssr
30.326005492828795

[ Refer Python Stats Model : http://statsmodels.sourceforge.net/devel/index.html for details ]


Wednesday, 6 June 2012

MODELING STOCK PRICES

-  A random walk is a nonstationary time series.
-  Nonstationary time series are usually transformed to stationary time series using differencing.
-  The logarithm of the stock price series is usually modeled as a random walk. If you take the difference (yt - yt-1) of the logarithm of the stock price, the resulting series will be white noise.
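As a quick sanity check before touching real data, one can simulate a random walk and verify that its first difference behaves like white noise. A minimal sketch (the seed, starting price and sample size are arbitrary):

import numpy as np

np.random.seed(0)
eps = np.random.normal(0, 0.01, 1000)        # white-noise innovations
log_price = np.log(100.0) + np.cumsum(eps)   # random walk in the log price
returns = np.diff(log_price)                 # recovers the white-noise series

# mean ~ 0 and std ~ 0.01, regardless of the sample window
print "mean = %.5f, std = %.5f" % (returns.mean(), returns.std())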

Checking with the following piece of python code :

import  urllib2
from datetime import date
from StringIO import StringIO
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import pylab


plt.figure(3)
plt.subplot(311)
plt.subplot(312)
plt.subplot(313)

logClose = []
meanArray = [np.array([]), np.array([])]
varArray = [np.array([]),np.array([])]

def calculateVolatility(YahooCode, pairIndex):
       # To download the data from Yahoo
        today = date.today()
        url_part1 = "http://ichart.finance.yahoo.com/table.csv?s="+ YahooCode +\
        "&a=01&b=01&c=2011&"
        url_part2 = "d="+str(today.month) + "&e="+str(today.month) +\
        "&f="+ str(today.year) + "&g=d&ignore=.csv"
        url = url_part1 + url_part2
        f  = StringIO(urllib2.urlopen(url).read())
        nestedList = [line.split(',') for line in f]
        f.close()
        nestedList.reverse() # We want it in ascending order of dates
        data = nestedList[:-1] # Remove the header column at the end
        np_array =  np.array(data, dtype=object) # Get the data to numpy array

        # Get the column 'Close' from the np_array
        close = np_array[:,4].astype(float)
        nifty_dates = np_array[:,0].astype(object)

        logClose.append(np.diff(np.log(close)))


 
        size = np.size(logClose[pairIndex])
        global meanArray
        global varArray
        for i in range (1,size, 1):
                mean_temp = np.mean(logClose[pairIndex][0:i])
                meanArray[pairIndex] = np.append(meanArray[pairIndex], mean_temp)
                var_temp = np.var(logClose[pairIndex][0:i])
                varArray[pairIndex] = np.append(varArray[pairIndex], var_temp)

        s = np.arange(0,size,1)
        #pylab.plot(s,close)

        #plt.figure(3)
        plt.subplot(311)
        plt.title(YahooCode)
        plt.plot(s,logClose[pairIndex])
        plt.ylabel("logClose")

        plt.subplot(312)
        plt.plot(s[1:size],varArray[pairIndex])
        plt.ylabel("Variance-Log(Close)")

        plt.subplot(313)
        plt.plot(s[1:size],meanArray[pairIndex])
        plt.ylabel("Mean-Log(Close)")


 
if __name__ == '__main__':
        # Compute the historical volatility for S&P 500
        #print "S&P 500"
        #calculateVolatility("^GSPC")
        calculateVolatility("IBM", 0)
        global logClose
        print np.mean(logClose)   
        print np.cov(logClose)
        plt.show()
        pylab.show()






Friday, 11 May 2012

Historical Volatility

Investopedia's definition of volatility: "A statistical measure of the dispersion of returns for a given security or market index. Volatility can either be measured by using the standard deviation or variance between returns from that same security or market index. Commonly, the higher the standard deviation, the higher the risk."

A Python code sample is given below:
  1. Install Python and NumPy (see http://enthought.com/products/epd.php for installation of Python and associated packages)
  2. Get the daily quotes from Yahoo for, say, the S&P 500 - code ^GSPC
  3. Calculate the standard deviation of the log of daily returns (i.e., log(close_today/close_previous_day)).
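For example, a daily volatility of 1% annualizes to 0.01 × √252 ≈ 15.9% using trading days, or 0.01 × √365 ≈ 19.1% using calendar days.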
#---------------------------------------------------------
# Historical Volatility
# Author: Cheryan M.T
# Feel free to use/modify/redistribute the following code
#-----------------------------------------------------------
import  urllib2
from datetime import date
from StringIO import StringIO
import numpy as np


def calculateVolatility(YahooCode):
    # To download the data from Yahoo

    today = date.today()
    url_part1 = "http://ichart.finance.yahoo.com/table.csv?s=" + YahooCode + \
                "&a=01&b=01&c=2011&"
    url_part2 = "d=" + str(today.month) + "&e=" + str(today.month) + \
                "&f=" + str(today.year) + "&g=d&ignore=.csv"
    url = url_part1 + url_part2


    # Get data from Yahoo
    f  = StringIO(urllib2.urlopen(url).read())
    nestedList = [line.split(',') for line in f]
    f.close()

    nestedList.reverse() # We want it in ascending order of dates
    data = nestedList[:-1] # Remove the header column at the end (after sorting)
    np_array =  np.array(data, dtype=object) # Get the data to numpy array

    
    # Get the column 'Close' from the np_array
    close = np_array[:,4].astype(float)
    nifty_dates = np_array[:,0].astype(object)

    #Let us build subset of data to calculate volatility
    # 1. Volatility based on the Close prices for the last 30 days
    close30 = close[-30::1] # Based on last 30 days

    # Daily Volatility based on last 30 days data
    # It's standard deviation of log of daily returns (w.r.t previous day)
    Daily_volat_30d = np.std(np.diff(np.log(close30))).round(4)

    # Annualized volatility based on daily volatility
    # If you are assuming only trading days in the year (252 days),
    # replace 365 with 252 in the following line
    Annualised_volat_30d = (Daily_volat_30d*np.sqrt(365)).round(4)

    print "Historical Daily Volatility = %.05f" % (Daily_volat_30d)
    print "Historical Annualized Volatility = %.05f" % (Annualised_volat_30d)


if __name__ == '__main__':

     # Compute the historical volatility for S&P 500
     print "S&P 500"
     calculateVolatility("^GSPC")
     print "IBM"
     calculateVolatility("IBM")


(Copy the above code to a .py file, say 'volatility.py', and run 'python volatility.py'.)

Wednesday, 9 May 2012

Rapid Quant

Rapid Quant Release in July - Read on

http://www.lambdafoundry.com/rapidquant.html

(If you're interested in trying out RapidQuant, get in touch by emailing josh@lambdafoundry.com.)

Saturday, 17 March 2012

Pricing of European Options

Python Library for pricing of European Options
http://code.mibian.net/

Using the following pricing models:
  • Garman-Kohlhagen
  • Black-Scholes
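A minimal usage sketch based on mibian's documented interface (the numbers are arbitrary, and the exact signature may vary by version): mibian.BS takes [underlying price, strike price, interest rate in %, days to expiration] plus a volatility in %.

import mibian

# Black-Scholes pricing of a hypothetical 30-day option
c = mibian.BS([100.0, 105.0, 1.0, 30], volatility=20.0)
print c.callPrice, c.putPrice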