Série Scientifique Scientific Series 95s-49 Stochastic Volatility Eric Ghysels, Andrew Harvey, Eric Renault Montréal novembre 1995


Série Scientifique / Scientific Series

95s-49

Stochastic Volatility

Eric Ghysels, Andrew Harvey, Eric Renault

Montréal, novembre 1995


Ce document est publié dans l’intention de rendre accessibles les résultats préliminaires de la recherche effectuée au CIRANO, afin de susciter des échanges et des suggestions. Les idées et les opinions émises sont sous l’unique responsabilité des auteurs, et ne représentent pas nécessairement les positions du CIRANO ou de ses partenaires.

This paper presents preliminary research carried out at CIRANO and aims to encourage discussion and comment. The observations and viewpoints expressed are the sole responsibility of the authors. They do not necessarily represent positions of CIRANO or its partners.

CIRANO

Le CIRANO est une corporation privée à but non lucratif constituée en vertu de la Loi des compagnies du Québec. Le financement de son infrastructure et de ses activités de recherche provient des cotisations de ses organisations-membres, d’une subvention d’infrastructure du ministère de l’Industrie, du Commerce, de la Science et de la Technologie, de même que des subventions et mandats obtenus par ses équipes de recherche. La Série Scientifique est la réalisation d’une des missions que s’est données le CIRANO, soit de développer l’analyse scientifique des organisations et des comportements stratégiques.

CIRANO is a private non-profit organization incorporated under the Québec Companies Act. Its infrastructure and research activities are funded through fees paid by member organizations, an infrastructure grant from the Ministère de l’Industrie, du Commerce, de la Science et de la Technologie, and grants and research mandates obtained by its research teams. The Scientific Series fulfils one of the missions of CIRANO: to develop the scientific analysis of organizations and strategic behaviour.

Les organisations-partenaires / The Partner Organizations

• Ministère de l’Industrie, du Commerce, de la Science et de la Technologie.
• École des Hautes Études Commerciales.
• École Polytechnique.
• Université de Montréal.
• Université Laval.
• McGill University.
• Université du Québec à Montréal.
• Bell Québec.
• La Caisse de dépôt et de placement du Québec.
• Hydro-Québec.
• Fédération des caisses populaires de Montréal et de l’Ouest-du-Québec.
• Téléglobe Canada.
• Société d’électrolyse et de chimie Alcan Ltée.
• Avenor.
• Service de développement économique de la ville de Montréal.
• Raymond, Chabot, Martin, Paré

ISSN 1198-8177


Stochastic Volatility*

Eric Ghysels†, Andrew Harvey‡, Eric Renault§

* We benefitted from helpful comments from Frank Diebold, René Garcia, Eric Jacquier and Neil Shephard on a preliminary draft of the paper. The first author would like to acknowledge the financial support of FCAR (Québec), SSHRC (Canada) as well as the hospitality and support of CORE (Louvain-la-Neuve, Belgium). The second author wishes to thank the ESRC for financial support. The third author would like to thank the Institut Universitaire de France, the Fédération Française des Sociétés d'Assurance as well as CIRANO and C.R.D.E. for financial support. This text was prepared for the Handbook of Statistics, Vol. 14: Statistical Methods in Finance.

† C.R.D.E., Université de Montréal & CIRANO

‡ London School of Economics

§ GREMAQ et IDEI, Université des Sciences Sociales, Toulouse and Institut Universitaire de France

Résumé / Abstract

Cet article, préparé pour le « Handbook of Statistics », vol. 14, Statistical Methods in Finance, passe en revue les modèles de volatilité stochastique. On traite les sujets suivants : volatilité des actifs financiers (volatilité instantanée des rendements d'actifs, volatilités implicites dans les prix d'options et régularités empiriques), modélisation statistique en temps discret et continu et enfin inférence statistique (méthodes de moments, pseudo-maximum de vraisemblance, méthodes bayesiennes et autres fondées sur la vraisemblance, inférence indirecte).

This paper, prepared for the "Handbook of Statistics", vol. 14, Statistical Methods in Finance, surveys the subject of Stochastic Volatility. The following subjects are covered: volatility in financial markets (instantaneous volatility of asset returns, implied volatilities in option prices and related stylized facts), statistical modelling in discrete and continuous time and finally statistical inference (methods of moments, Quasi-Maximum-Likelihood, Likelihood-based and Bayesian Methods and Indirect Inference).

Mots Clés : rendements d'actifs financiers, hétéroscédasticité conditionnelle, prix d'option, modèle espace-état, processus de diffusion.

Keywords: Asset returns, Conditional heteroskedasticity, Option prices, State-space models, Diffusion processes.


TABLE OF CONTENTS

1. Introduction

2. Volatility in Financial Markets

2.1 The Black-Scholes Model and Implied Volatilities

2.2 Some Stylized Facts

2.3 Information Sets

2.4 Statistical Modelling of Stochastic Volatility

3. Discrete Time Models

3.1 The Discrete Time SV Model

3.2 Statistical Properties

3.3 Comparison with ARCH models

3.4 Filtering, Smoothing and Prediction

3.5 Extensions of the Model

4. Continuous Time Models

4.1 From Discrete to Continuous Time Models

4.2 Option Pricing and Hedging

4.3 Filtering and Discrete Time Approximations

4.4 Extensions of the Model

5. Statistical Inference

5.1 Generalized Method of Moments

5.2 Quasi Maximum Likelihood

5.3 Continuous Time GMM

5.4 Simulated Method of Moments

5.5 Indirect Inference and Moment Matching

5.6 Likelihood-based and Bayesian Methods

5.7 Inference and Option Price Data

5.8 Regression Models with Stochastic Volatility

6. Conclusions


1 Introduction

The class of stochastic volatility (SV) models has its roots both in mathematical finance and financial econometrics. In fact, several variations of SV models originated from research looking at very different issues. Clark (1973), for instance, suggested modelling asset returns as a function of a random process of information arrival. This so-called time deformation approach yielded a time-varying volatility model of asset returns. Later, Tauchen and Pitts (1983) refined this work, proposing a mixture of distributions model of asset returns with temporal dependence in information arrivals. Hull and White (1987) were not directly concerned with linking asset returns to information arrival but rather were interested in pricing European options assuming continuous time SV models for the underlying asset. They suggested a diffusion for asset prices with volatility following a positive diffusion process. Yet another approach emerged from the work of Taylor (1986), who formulated a discrete time SV model as an alternative to Autoregressive Conditional Heteroskedasticity (ARCH) models. Until recently, estimating Taylor's model, or any other SV model, remained almost infeasible. Recent advances in econometric theory have made estimation of SV models much easier. As a result, they have become an attractive class of models and an alternative to other classes such as ARCH.

Contributions to the literature on SV models can be found both in mathematical finance and econometrics. Hence, we face quite a diverse set of topics. We say very little about ARCH models because several excellent surveys on the subject have appeared recently, including those by Bera and Higgins (1995), Bollerslev, Chou and Kroner (1992), Bollerslev, Engle and Nelson (1994) and Diebold and Lopez (1995). Furthermore, since this chapter is written for the Handbook of Statistics, we keep the coverage of the mathematical finance literature to a minimum. Nevertheless, the subject of option pricing figures prominently out of necessity. Indeed, section 2, which deals with definitions of volatility, has extensive coverage of Black-Scholes implied volatilities. It also summarizes empirical stylized facts and concludes with statistical modelling of volatility. The reader with a greater interest in statistical concepts may want to skip the first three subsections of section 2, which are more finance oriented, and start with section 2.4. Section 3 discusses discrete time models, while section 4 reviews continuous time models. Statistical inference of SV models is the subject of section 5. Section 6 concludes.

2 Volatility in Financial Markets

Volatility plays a central role in the pricing of derivative securities. The Black-Scholes model for the pricing of a European option is by far the most widely used formula, even when the underlying assumptions are known to be violated. Section 2.1 will therefore take the Black-Scholes model as a reference point from which to discuss several notions of volatility. A discussion of stylized facts regarding volatility and option prices will appear next in section 2.2. Both sections set the scene for a formal framework defining stochastic volatility, which is treated in section 2.3. Finally, section 2.4 introduces the statistical models of stochastic volatility.


2.1 The Black-Scholes Model and Implied Volatilities

More than half a century after the seminal work of Louis Bachelier (1900), continuous time stochastic processes have become a standard tool to describe the behavior of asset prices. The work of Black and Scholes (1973) and Merton (1990) has been extremely influential in that regard. In section 2.1.1 we review some of the assumptions that are made when modelling asset prices by diffusions, in particular to present the concept of instantaneous volatility. In section 2.1.2 we turn to option pricing models and the various concepts of implied volatility.

2.1.1 An Instantaneous Volatility Concept

We consider a financial asset, say a stock, with today's (time $t$) market price denoted by $S_t$.¹ Let the information available at time $t$ be described by $I_t$ and consider the conditional distribution of the return $S_{t+h}/S_t$ of holding the asset over the period $[t, t+h]$ given $I_t$.² A maintained assumption throughout this chapter will be that asset returns have finite conditional expectation given $I_t$, or:

$$E_t\left(S_{t+h}/S_t\right) = S_t^{-1} E_t S_{t+h} < +\infty \tag{2.1.1}$$

and likewise finite conditional variance given $I_t$, namely

$$V_t\left(S_{t+h}/S_t\right) = S_t^{-2} V_t S_{t+h} < +\infty \tag{2.1.2}$$

The continuously compounded expected rate of return will be characterized by $h^{-1} \log E_t\left(S_{t+h}/S_t\right)$. Then a first assumption can be stated as follows:

Assumption 2.1.1.A: The continuously compounded expected rate of return converges almost surely towards a finite value $\mu_S(I_t)$ when $h > 0$ goes to zero.

From this assumption one has $E_t S_{t+h} - S_t \sim h\,\mu_S(I_t)\,S_t$ or, in terms of its differential representation:

$$\left.\frac{d}{d\tau} E_t\left(S_\tau\right)\right|_{\tau=t} = \mu_S(I_t)\,S_t \quad \text{almost surely} \tag{2.1.3}$$

where the derivatives are taken from the right. Equation (2.1.3) is sometimes loosely defined as: $E_t(dS_t) = \mu_S(I_t)\,S_t\,dt$. The next assumption pertains to the conditional variance and can be stated as:

Assumption 2.1.1.B: The conditional variance of the return $h^{-1} V_t\left(S_{t+h}/S_t\right)$ converges almost surely towards a finite value $\sigma_S^2(I_t)$ when $h > 0$ goes to zero.

Again, in terms of its differential representation this amounts to:

$$\left.\frac{d}{d\tau} Var_t\left(S_\tau\right)\right|_{\tau=t} = \sigma_S^2(I_t)\,S_t^2 \quad \text{almost surely} \tag{2.1.4}$$

and one loosely associates with this the expression $V_t(dS_t) = \sigma_S^2(I_t)\,S_t^2\,dt$.

¹ Here and in the remainder of the paper we will focus on options written on stocks or exchange rates. The large literature on the term structure of interest rates and related derivative securities will not be covered.

² Section 2.3 will provide a more rigorous discussion of information sets. It should also be noted that we will indifferently be using conditional distributions of asset prices $S_{t+h}$ and of returns $S_{t+h}/S_t$ since $S_t$ belongs to $I_t$.


Both assumptions 2.1.1.A and B lead to a representation of the asset price dynamics by an equation of the following form:

$$dS_t = \mu_S(I_t)\,S_t\,dt + \sigma_S(I_t)\,S_t\,dW_t \tag{2.1.5}$$

where $W_t$ is a standard Brownian Motion. Hence, every time a diffusion equation is written for an asset price process we have automatically defined the so-called instantaneous volatility process $\sigma_S(I_t)$ which, from the above representation, can also be written as:

$$\sigma_S(I_t) = \left[\lim_{h \downarrow 0} h^{-1} V_t\left(S_{t+h}/S_t\right)\right]^{1/2} \tag{2.1.6}$$

Before turning to the next section we would like to provide a brief discussion of some of the foundations for Assumptions 2.1.1.A and B. It was noted that Bachelier (1900) proposed the Brownian motion process as a model of stock price movements. In modern terminology this amounts to the random walk theory of asset pricing, which claims that asset returns ought not to be predictable because of the informational efficiency of financial markets. Hence, it assumes that returns on consecutive regularly sampled periods $[t+k, t+k+1]$, $k = 0, 1, \ldots, h-1$, are independently (identically) distributed. With such a benchmark in mind, it is natural to view the expectation and the variance of the continuously compounded rate of return $\log\left(S_{t+h}/S_t\right)$ as proportional to the maturity $h$ of the investment.

Obviously we no longer use Brownian motions as a process for asset prices, but it is nevertheless worth noting that Assumptions 2.1.1.A and B also imply that the expected rate of return and the associated squared risk (in terms of variance of the rate of return) of an investment over an infinitely short interval $[t, t+h]$ are proportional to $h$. Sims (1984) provided some rationale for both assumptions through the concept of "local unpredictability".

To conclude, let us briefly discuss a particular special case of (2.1.5) predominantly used in theoretical developments and also highlight an implicit restriction we made. When $\mu_S(I_t) = \mu_S$ and $\sigma_S(I_t) = \sigma_S$ are constants for all $t$, the asset price is a geometric Brownian motion. This process was used by Black and Scholes (1973) to derive their well-known pricing formula for European options. Obviously, since $\sigma_S(I_t)$ is a constant we no longer have an instantaneous volatility process but rather a single parameter $\sigma_S$, a situation which undoubtedly greatly simplifies many things, including the pricing of options. A second point which needs to be stressed is that Assumptions 2.1.1.A and B allow for the possibility of discrete jumps in the asset price process. Such jumps are typically represented by a Poisson process and have been prominent in the option pricing literature since the work of Merton (1976). Yet, while the assumptions allow in principle for jumps, they do not appear in (2.1.5). Indeed, throughout this chapter we will maintain the assumption of sample path continuity and exclude the possibility of jumps as we focus exclusively on SV models.
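As a rough numerical illustration of definition (2.1.6) in the geometric Brownian motion special case, the sketch below draws gross returns $S_{t+h}/S_t$ over a short horizon and rescales their sample variance by $h^{-1}$. The function names, the parameter values (a drift of 0.05, a volatility of 0.20, a daily horizon of 1/252) and the sample size are illustrative assumptions, not quantities taken from the text.

```python
import math
import random

def simulate_gbm_ratios(mu, sigma, h, n, seed=0):
    """Draw n independent gross returns S_{t+h}/S_t under geometric Brownian motion."""
    rng = random.Random(seed)
    drift = (mu - 0.5 * sigma ** 2) * h      # mean of log(S_{t+h}/S_t)
    scale = sigma * math.sqrt(h)             # standard deviation of log(S_{t+h}/S_t)
    return [math.exp(drift + scale * rng.gauss(0.0, 1.0)) for _ in range(n)]

def instantaneous_vol_estimate(ratios, h):
    """Sample analogue of (2.1.6): square root of h^{-1} Var(S_{t+h}/S_t)."""
    n = len(ratios)
    mean = sum(ratios) / n
    var = sum((r - mean) ** 2 for r in ratios) / (n - 1)
    return math.sqrt(var / h)

# Hypothetical parameter values chosen only for illustration.
mu, sigma = 0.05, 0.20
h = 1.0 / 252.0
ratios = simulate_gbm_ratios(mu, sigma, h, n=200_000)
sigma_hat = instantaneous_vol_estimate(ratios, h)
print(f"true sigma = {sigma:.4f}, rescaled sample volatility = {sigma_hat:.4f}")
```

For small $h$ the rescaled sample standard deviation should be close to the constant $\sigma_S$, which is what the limit in (2.1.6) suggests when volatility does not depend on $I_t$.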

2.1.2 Option Prices and Implied Volatilities

It was noted in the introduction that SV models originated in part from the literature on the pricing of options. We have witnessed over the past two decades a spectacular growth in options and other derivative security markets. Such markets are sometimes characterized as places where "volatilities are traded". In this section we will provide the rationale for such statements and study the relationship between so-called options implied volatilities and the concepts of instantaneous and averaged volatilities of the underlying asset return process.


The Black-Scholes option pricing model is based on a Log-Normal or Geometric Brownian Motion model for the underlying asset price:

$$dS_t = \mu_S\,S_t\,dt + \sigma_S\,S_t\,dW_t \tag{2.1.7}$$

where $\mu_S$ and $\sigma_S$ are fixed parameters. A European call option with strike price $K$ and maturity $t+h$ has a payoff:

$$\left[S_{t+h} - K\right]^+ = \begin{cases} S_{t+h} - K & \text{if } S_{t+h} \geq K \\ 0 & \text{otherwise} \end{cases} \tag{2.1.8}$$

Since the seminal Black and Scholes (1973) paper, there is now a well established literature proposing various ways to derive the pricing formula of such a contract. Obviously, it is beyond the scope of this paper to cover this literature in detail.³ Instead, the bare minimum will be presented here, allowing us to discuss the concepts of interest regarding volatility.

With continuous costless trading assumed to be feasible, it is possible to form a portfolio using one call and a short-sale strategy for the underlying stock to eliminate all risk. This is why the option price can be characterized without ambiguity, using only arbitrage arguments, by equating the market rate of return of the riskless portfolio containing the call option with the risk-free rate. Moreover, such arbitrage-based option pricing does not depend on individual preferences.⁴

This is the reason why the easiest way to derive the Black-Scholes option pricing formula is via a "risk-neutral world", where asset price processes are specified through a modified probability measure, referred to as the risk neutral probability measure and denoted $Q$ (as discussed more explicitly in section 4.2). This fictitious world, where probabilities in general do not coincide with those of the Data Generating Process (DGP), is only used to derive the option price, which remains valid in the objective probability setup. In the risk neutral world we have:

$$dS_t/S_t = r_t\,dt + \sigma_S\,dW_t \tag{2.1.9}$$

$$C_t = C\left(S_t, K, h, t\right) = B(t, t+h)\,E_t^Q\left(S_{t+h} - K\right)^+ \tag{2.1.10}$$

where $E_t^Q$ is the expectation under $Q$, $B(t, t+h)$ is the price at time $t$ of a pure discount bond with payoff one unit at time $t+h$ and

$$r_t = -\lim_{h \to 0} \frac{1}{h} \log B(t, t+h) \tag{2.1.11}$$

is the riskless instantaneous interest rate.⁵ We have implicitly assumed that in this market interest rates are nonstochastic ($W_t$ is the only source of risk) so that:

$$B(t, t+h) = \exp\left[-\int_t^{t+h} r_\tau\,d\tau\right]. \tag{2.1.12}$$

³ See however Jarrow and Rudd (1983), Cox and Rubinstein (1985), Duffie (1989), Duffie (1992), Hull (1993) or Hull (1995) among others for more elaborate coverage of options and other derivative securities.

⁴ This is sometimes referred to as preference free option pricing. This terminology may be somewhat misleading since individual preferences are implicitly taken into account in the market price of the stock and of the riskless bond. However, the option price only depends on individual preferences through the stock and bond market prices.

⁵ For notational convenience we denote by the same symbol $W_t$ a Brownian motion under $P$ (in 2.1.7) and under $Q$ (in 2.1.9). Indeed, Girsanov's theorem establishes the link between these two processes (see e.g. Duffie (1992) and section 4.2.1).


By definition, there are no risk premia in a risk neutral context. Therefore $r_t$ coincides with the instantaneous expected rate of return of the stock and hence the call option price $C_t$ is the discounted expected value of its terminal payoff $(S_{t+h} - K)^+$ as stated in (2.1.10).

The log-normality of $S_{t+h}$ given $S_t$ allows one to compute the expectation in (2.1.10), yielding the call price formula at time $t$:

$$C_t = S_t\,\Phi(d) - K\,B(t, t+h)\,\Phi\left(d - \sigma\sqrt{h}\right) \tag{2.1.13}$$

where $\Phi$ is the cumulative standard normal distribution function while $d$ will be defined shortly. Formula (2.1.13) is the so-called Black-Scholes option pricing formula. Thus, the option price $C_t$ depends on the stock price $S_t$, the strike price $K$ and the discount factor $B(t, t+h)$. Let us now define:

$$x_t = \log\left[S_t / \left(K\,B(t, t+h)\right)\right] \tag{2.1.14}$$

Then we have:

$$C_t/S_t = \Phi(d) - e^{-x_t}\,\Phi\left(d - \sigma\sqrt{h}\right) \tag{2.1.15}$$

with $d = \left(x_t/\sigma\sqrt{h}\right) + \sigma\sqrt{h}/2$. It is easy to see the critical role played by the quantity $x_t$, called the moneyness of the option.

• If $x_t = 0$, the current stock price $S_t$ coincides with the present value of the strike price $K$. In other words, the contract may appear to be fair to somebody who would not take into account the stochastic changes of the stock price between $t$ and $t+h$. We shall say that we have in this case an at the money option.

• If $x_t > 0$ (respectively $x_t < 0$) we shall say that the option is in the money (respectively out of the money).⁶

It was noted before that the Black-Scholes formula is widely used among practitioners, even when its assumptions are known to be violated. In particular, the assumption of a constant volatility $\sigma_S$ is unrealistic (see section 2.2 for empirical evidence). This motivated Hull and White (1987) to introduce an option pricing model with stochastic volatility, assuming that the volatility itself is a state variable independent of $W_t$:⁷

$$\begin{cases} dS_t/S_t = r_t\,dt + \sigma_{St}\,dW_t \\ \left(\sigma_{St}\right)_{t \in [0,T]},\ \left(W_t\right)_{t \in [0,T]} \ \text{independent Markovian} \end{cases} \tag{2.1.16}$$

It should be noted that (2.1.16) is still written in a risk neutral context since $r_t$ coincides with the instantaneous expected return of the stock. On the other hand, the exogenous volatility risk is not directly traded, which prevents us from defining unambiguously a risk neutral probability measure, as discussed in more detail in section 4.2. Nevertheless, the option pricing formula (2.1.10) remains valid provided the expectation is computed with respect to the joint probability distribution of the Markovian process $(S, \sigma_S)$, given $(S_t, \sigma_{St})$.⁸ We can then rewrite (2.1.10) as follows:

⁶ We use here a slightly modified terminology with respect to the usual one. Indeed, it is more common to speak of at the money / in the money / out of the money options when $S_t = K$ / $S_t > K$ / $S_t < K$ respectively. From an economic point of view, it is more appealing to compare $S_t$ with the present value of the strike price $K$.

⁷ Other stochastic volatility models similar to Hull and White (1987) appear in Johnson and Shanno (1987), Scott (1987), Wiggins (1987), Chesney and Scott (1989), Stein and Stein (1991) and Heston (1993) among others.

⁸ We implicitly assume here that the available information $I_t$ contains the past values $(S_\tau, \sigma_\tau)_{\tau \leq t}$. This assumption will be discussed in section 4.2.


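$$C_t = B(t, t+h)\,E_t\left(S_{t+h} - K\right)^+ = B(t, t+h)\,E_t\left\{E\left[\left(S_{t+h} - K\right)^+ \,\middle|\, \left(\sigma_{S\tau}\right)_{t \leq \tau \leq t+h}\right]\right\} \tag{2.1.17}$$

where the expectation inside the brackets is taken with respect to the conditional probability distribution of $S_{t+h}$ given $I_t$ and a volatility path $\sigma_{S\tau}$, $t \leq \tau \leq t+h$. However, since the volatility process $\sigma_{S\tau}$ is independent from $W_t$, we obtain using (2.1.15) that:

$$B(t, t+h)\,E_t\left[\left(S_{t+h} - K\right)^+ \,\middle|\, \left(\sigma_{S\tau}\right)_{t \leq \tau \leq t+h}\right] = S_t\,E_t\left[\Phi(d_1) - e^{-x_t}\,\Phi(d_2)\right] \tag{2.1.18}$$

where $d_1$ and $d_2$ are defined as follows:

$$\begin{cases} d_1 = \left(x_t/\gamma(t, t+h)\sqrt{h}\right) + \gamma(t, t+h)\sqrt{h}/2 \\ d_2 = d_1 - \gamma(t, t+h)\sqrt{h} \end{cases}$$

where $\gamma(t, t+h) > 0$ and:

$$\gamma^2(t, t+h) = \frac{1}{h}\int_t^{t+h} \sigma_{S\tau}^2\,d\tau. \tag{2.1.19}$$

This yields the so-called Hull and White option pricing formula:

$$C_t = S_t\,E_t\left[\Phi(d_1) - e^{-x_t}\,\Phi(d_2)\right], \tag{2.1.20}$$

where the expectation is taken with respect to the conditional probability distribution (for the risk neutral probability measure) of $\gamma(t, t+h)$ given $\sigma_t$.⁹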
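Formula (2.1.20) lends itself to a simple Monte Carlo sketch: simulate volatility paths independently of $W_t$, compute the averaged volatility $\gamma(t, t+h)$ of (2.1.19) on each path, and average the Black-Scholes ratio over paths. The volatility dynamics used below (an Euler-discretized Ornstein-Uhlenbeck process for the logarithm of volatility), together with the names `hull_white_ratio`, `kappa`, `log_sigma_bar`, `vol_of_vol` and all numerical values, are assumptions made purely for illustration. The text only requires a positive Markovian volatility process independent of $W_t$.

```python
import math
import random
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal cumulative distribution function

def bs_ratio(x, gamma, h):
    """Phi(d1) - exp(-x) Phi(d2), with d1 and d2 as defined below (2.1.18)."""
    s = gamma * math.sqrt(h)
    d1 = x / s + s / 2.0
    return _PHI(d1) - math.exp(-x) * _PHI(d1 - s)

def hull_white_ratio(x, sigma0, h, kappa, log_sigma_bar, vol_of_vol,
                     n_paths=2000, n_steps=50, seed=0):
    """Monte Carlo estimate of C_t / S_t in (2.1.20).

    Assumption: log-volatility follows an Euler-discretized
    Ornstein-Uhlenbeck process, simulated independently of W_t.
    """
    rng = random.Random(seed)
    dt = h / n_steps
    total = 0.0
    for _path in range(n_paths):
        log_sigma = math.log(sigma0)
        integral = 0.0
        for _step in range(n_steps):
            # Left-point rule for the integral of sigma^2 in (2.1.19).
            integral += math.exp(2.0 * log_sigma) * dt
            log_sigma += (kappa * (log_sigma_bar - log_sigma) * dt
                          + vol_of_vol * math.sqrt(dt) * rng.gauss(0.0, 1.0))
        gamma = math.sqrt(integral / h)
        total += bs_ratio(x, gamma, h)
    return total / n_paths

# Hypothetical at the money example with a six-month maturity.
sv_ratio = hull_white_ratio(x=0.0, sigma0=0.20, h=0.5, kappa=2.0,
                            log_sigma_bar=math.log(0.20), vol_of_vol=1.0)
flat_ratio = bs_ratio(0.0, 0.20, 0.5)
print(f"stochastic volatility C/S = {sv_ratio:.5f}, constant volatility C/S = {flat_ratio:.5f}")
```

Setting `vol_of_vol` to zero with `sigma0` equal to its long-run level makes every path constant, so the estimate collapses to the Black-Scholes value, which gives a convenient sanity check on the sketch.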

In the remainder of this section we will assume that observed option prices obey Hull and White's formula (2.1.20). Then option prices would yield two types of implied volatility concepts: (1) an instantaneous implied volatility and (2) an averaged implied volatility. To make this more precise, let us assume that the risk neutral probability distribution belongs to a parametric family, $P_\theta$, $\theta \in \Theta$. Then, the Hull and White option pricing formula yields an expression for the option price as a function:

$$C_t = S_t\,F\left[\sigma_{St}, x_t, \theta_o\right] \tag{2.1.21}$$

where $\theta_o$ is the true unknown value of the parameters. Formula (2.1.21) reveals why it is often claimed that "option markets can be thought of as markets trading volatility" (see e.g. Stein (1989)). As a matter of fact, if for any given $(x_t, \theta)$, $F(\cdot, x_t, \theta)$ is one-to-one, then equation (2.1.21) can be inverted to yield an implied instantaneous volatility:¹⁰

$$\sigma_t^{imp}(\theta) = G\left[S_t, C_t, x_t, \theta\right] \tag{2.1.22}$$

Bajeux and Rochet (1992), by showing that this one-to-one relationship between option prices and instantaneous volatility holds, in fact formalize the use of option markets as an appropriate instrument to hedge volatility risk. Obviously, implied instantaneous volatilities (2.1.22) could only be useful in practice for pricing or hedging derivative instruments when we know the true unknown value $\theta_o$ or, at least, are able to compute a sufficiently accurate estimate of it.

⁹ The conditioning is with respect to $\sigma_t$ since it summarizes the relevant information taken from $I_t$ (the process $\sigma$ is assumed to be Markovian and independent from $W$).

¹⁰ The fact that $F(\cdot, x_t, \theta)$ is one-to-one is shown to be the case for any diffusion model on $\sigma_t$ under certain regularity conditions, see Bajeux and Rochet (1992).


However, the difficulties involved in estimating SV models have long prevented their widespread use in empirical applications. This is the reason why practitioners often prefer another concept of implied volatility, namely the so-called Black-Scholes implied volatility introduced by Latané and Rendleman (1976). It is a process $\omega^{imp}(t, t+h)$ defined by:

$$\begin{cases} C_t = S_t\left[\Phi(d_1) - e^{-x_t}\,\Phi(d_2)\right] \\ d_1 = \left(x_t/\omega^{imp}(t, t+h)\sqrt{h}\right) + \omega^{imp}(t, t+h)\sqrt{h}/2 \\ d_2 = d_1 - \omega^{imp}(t, t+h)\sqrt{h} \end{cases} \tag{2.1.23}$$

where $C_t$ is the observed option price.¹¹
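System (2.1.23) has no closed-form solution for $\omega^{imp}(t, t+h)$, but since the Black-Scholes ratio is increasing in volatility it can be solved numerically. The sketch below uses plain bisection. The function names, the bracketing interval `[1e-6, 5.0]`, the tolerance and the example inputs are assumptions chosen for illustration, and the "observed" price is generated from the formula itself rather than taken from market data.

```python
import math
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal cumulative distribution function

def bs_ratio(x, omega, h):
    """Right-hand side of the first line of (2.1.23) divided by S_t."""
    s = omega * math.sqrt(h)
    d1 = x / s + s / 2.0
    return _PHI(d1) - math.exp(-x) * _PHI(d1 - s)

def implied_vol(call_over_spot, x, h, lo=1e-6, hi=5.0, tol=1e-10):
    """Solve (2.1.23) for omega by bisection on the bracket [lo, hi]."""
    if bs_ratio(x, lo, h) > call_over_spot or bs_ratio(x, hi, h) < call_over_spot:
        raise ValueError("option price is outside the range spanned by the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_ratio(x, mid, h) < call_over_spot:
            lo = mid   # price too low: the implied volatility is larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip on a synthetic price: moneyness 0.05, six-month maturity, volatility 0.25.
observed_ratio = bs_ratio(0.05, 0.25, 0.5)
omega_imp = implied_vol(observed_ratio, x=0.05, h=0.5)
print(f"recovered implied volatility = {omega_imp:.6f}")
```

Bisection is used here only because it needs nothing beyond monotonicity. Any bracketing root finder would serve the same purpose.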

The Hull and White option pricing model can indeed be seen as a theoretical foundation for this practice; the comparison between (2.1.23) and (2.1.20) allows us to interpret the Black-Scholes implied volatility $\omega^{imp}(t, t+h)$ as an implied averaged volatility, since $\omega^{imp}(t, t+h)$ is something like a conditional expectation of $\gamma(t, t+h)$ (assuming observed option prices coincide with the Hull and White pricing formula). To be more precise, let us consider the simplest case of at the money options (the general case will be studied in section 4.2). Since $x_t = 0$, it follows that $d_2 = -d_1$ and therefore: $\Phi(d_1) - e^{-x_t}\,\Phi(d_2) = 2\Phi(d_1) - 1$. Hence, $\omega_o^{imp}(t, t+h)$ (the index $o$ is added to underline that we consider at the money options) is defined by:

$$\Phi\left(\frac{\omega_o^{imp}(t, t+h)\sqrt{h}}{2}\right) = E_t\,\Phi\left(\frac{\gamma(t, t+h)\sqrt{h}}{2}\right) \tag{2.1.24}$$

Since the cumulative standard normal distribution function is roughly linear in the neighborhood of zero, it follows that (for small maturities $h$):

$$\omega_o^{imp}(t, t+h) \approx E_t\,\gamma(t, t+h)$$

This yields an interpretation of the Black-Scholes implied volatility $\omega_o^{imp}(t, t+h)$ as an implied average volatility:

$$\omega_o^{imp}(t, t+h) \approx E_t\left[\frac{1}{h}\int_t^{t+h} \sigma_\tau^2\,d\tau\right]^{1/2} \tag{2.1.25}$$
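The quality of the linear approximation behind (2.1.25) can be checked numerically by solving (2.1.24) exactly. The sketch below assumes, purely for illustration, that $\gamma(t, t+h)$ takes the two values 0.10 and 0.30 with equal probability; this discrete distribution, the function name `atm_implied_vol` and the two maturities are hypothetical choices and do not come from the text.

```python
import math
from statistics import NormalDist

_N = NormalDist()  # standard normal distribution

def atm_implied_vol(gammas, probs, h):
    """Solve (2.1.24): Phi(omega sqrt(h) / 2) = E[Phi(gamma sqrt(h) / 2)]."""
    root_h = math.sqrt(h)
    mean_phi = sum(p * _N.cdf(g * root_h / 2.0) for g, p in zip(gammas, probs))
    return 2.0 * _N.inv_cdf(mean_phi) / root_h

# Hypothetical two-point distribution for the averaged volatility gamma(t, t+h).
gammas, probs = [0.10, 0.30], [0.5, 0.5]
expected_gamma = sum(p * g for g, p in zip(gammas, probs))

omega_short = atm_implied_vol(gammas, probs, h=1.0 / 12.0)  # one month
omega_long = atm_implied_vol(gammas, probs, h=5.0)          # five years

print(f"E[gamma]               = {expected_gamma:.6f}")
print(f"omega_o, one month     = {omega_short:.6f}")
print(f"omega_o, five years    = {omega_long:.6f}")
```

Because $\Phi$ is concave on the positive half-line, the exact at the money implied volatility lies slightly below $E_t\,\gamma(t, t+h)$, and the gap widens as the maturity grows, which is consistent with the approximation being stated for small $h$.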

2.2 Some Stylized Facts

The search for model specification and selection is always guided by empirical stylized facts. A model's ability to reproduce such stylized facts is a desirable feature and failure to do so is most often a criterion to dismiss a specification, although one typically does not try to fit or explain all possible empirical regularities at once with a single model. Stylized facts about volatility have been well documented in the ARCH literature, see for instance Bollerslev, Engle and Nelson (1994). Empirical regularities regarding derivative securities and implied volatilities are also well covered, for instance by Bates (1995a). In this section we will summarize empirical stylized facts, complementing and updating some of the material covered in the aforementioned references.

¹¹ We do not explicitly study here the dependence between $\omega^{imp}(t, t+h)$ and the various related processes: $C_t$, $S_t$, $x_t$. This is the reason why, for sake of simplicity, this dependence is not apparent in the notation $\omega^{imp}(t, t+h)$.


(a) Thick tails

Since the early sixties it has been observed, notably by Mandelbrot (1963) and Fama (1963, 1965), among others, that asset returns have leptokurtic distributions. As a result, numerous papers have proposed modelling asset returns as i.i.d. draws from fat-tailed distributions such as Paretian or Lévy.

(b) Volatility clustering

Any casual observation of financial time series reveals bunching of high and low volatility episodes. In fact, volatility clustering and thick tails of asset returns are intimately related. Indeed, the latter is a static explanation whereas a key insight provided by ARCH models is a formal link between dynamic (conditional) volatility behavior and (unconditional) heavy tails. ARCH models, introduced by Engle (1982), and the numerous extensions thereafter, as well as SV models, are essentially built to mimic volatility clustering. It is also widely documented that ARCH effects disappear with temporal aggregation, see e.g. Diebold (1988) and Drost and Nijman (1993).

(c) Leverage effects

A phenomenon coined by Black (1976) as the leverage effect suggests that stock price movements are negatively correlated with volatility. Because falling stock prices imply an increased leverage of firms, it is believed that this entails more uncertainty and hence volatility. Empirical evidence reported by Black (1976), Christie (1982) and Schwert (1989) suggests, however, that leverage alone is too small to explain the empirical asymmetries one observes in stock prices. Others reporting empirical evidence regarding leverage effects include Nelson (1991), Gallant, Rossi and Tauchen (1992, 1993), Campbell and Kyle (1993) and Engle and Ng (1993).

(d) Information arrivals

Asset returns are typically measured and modeled with observations sampled at fixed frequencies such as daily, weekly or monthly observations. Several authors, including Mandelbrot and Taylor (1967) and Clark (1973), suggested linking asset returns explicitly to the flow of information arrivals. In fact, it was already noted that Clark proposed one of the early examples of SV models. Information arrivals are nonuniform through time and quite often not directly observable. Conceptually, one can think of asset price movements as the realization of a process $Y_t = Y^*_{Z_t}$, where $Z_t$ is a so-called directing process. This positive nondecreasing stochastic process $Z_t$ can be thought of as being related to the arrival of information. This idea of time deformation or subordinated stochastic processes was used by Mandelbrot and Taylor (1967) to explain fat tailed returns, by Clark (1973) to explain volatility and was recently refined and further explored by Ghysels, Gouriéroux and Jasiak (1995a). Moreover, Easley and O'Hara (1992) provide a microstructure model involving time deformation. In practice, it suggests a direct link between market volatility and (1) trading volume, (2) quote arrivals, (3) forecastable events such as dividend announcements or macroeconomic data releases, (4) market closures, among many other phenomena linked to information arrivals.
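As a purely illustrative sketch of the time deformation idea, the following Python fragment draws returns from a Brownian motion evaluated on a random clock. The exponential law chosen for the increments of the directing process $Z_t$, the function names and the parameter values are assumptions made here for illustration and are not taken from the papers cited above. Conditionally on the clock increment each return is Gaussian, but unconditionally the returns are a scale mixture of normals and hence fat tailed.

```python
import math
import random


def simulate_subordinated_returns(n, mean_clock=1.0, seed=0):
    """Return n increments of Y*_{Z_t}, with Y* a standard Brownian motion.

    Assumption (not from the text): the increments of the directing
    process Z_t are i.i.d. exponential with mean `mean_clock`.
    """
    rng = random.Random(seed)
    returns = []
    for _ in range(n):
        dz = rng.expovariate(1.0 / mean_clock)  # positive clock increment
        returns.append(rng.gauss(0.0, math.sqrt(dz)))  # N(0, dz) given dz
    return returns


def sample_kurtosis(xs):
    """Fourth central moment divided by the squared variance."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2


if __name__ == "__main__":
    r = simulate_subordinated_returns(100_000, seed=42)
    print("sample kurtosis:", round(sample_kurtosis(r), 2))
```

Under this particular (assumed) exponential clock the population kurtosis of the returns is 6, compared with 3 for Gaussian returns observed in calendar time.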

Regarding trading volume and volatility, there are several papers documenting stylized facts, notably linking high trading volume with market volatility, see for example Karpoff (1987) or Gallant, Rossi and Tauchen (1992).12 The intraday patterns of volatility and market activity, measured for instance by quote arrivals, are also well known and documented. Wood, McInish and Ord (1985) and Harris (1986) studied this phenomenon for securities markets and found a U-shaped pattern with volatility typically high at the open and close of the market. The around-the-clock trading in foreign exchange markets also yields a distinct volatility pattern which is tied to the intensity of market activity and produces strong seasonal patterns. The intradaily patterns for FX markets are analyzed for instance by Müller et al. (1990), Baillie and Bollerslev (1991), Harvey and Huang (1991), Dacorogna et al. (1993), Andersen and Bollerslev (1995), Bollerslev and Ghysels (1994), Ghysels, Gouriéroux and Jasiak (1995b) among others. Another related empirical stylized fact is that of overnight and weekend market closures and their effect on volatility. Fama (1965) and French and Roll (1986) have found that information accumulates more slowly when the NYSE and AMEX are closed, resulting in higher volatility on those markets after weekends and holidays. Similar evidence for FX markets has been reported by Baillie and Bollerslev (1989). Finally, numerous papers documented increased volatility of financial markets around dividend announcements (Cornell (1978), Patell and Wolfson (1979, 1981)) and macroeconomic data releases (Harvey and Huang (1991, 1992), Ederington and Lee (1993)).

12There are numerous models, theoretical and empirical, linking trading volume and asset returns which we cannot discuss in detail. A partial list includes Foster and Viswanathan (1993a,b), Ghysels and Jasiak (1994a,b), Hausman and Lo (1991), Huffman (1987), Lamoureux and Lastrapes (1990, 1993), Wang (1993).

(e) Long memory and persistence

Generally speaking volatility is highly persistent. Particularly for high frequency data one finds evidence of near unit root behavior of the conditional variance process. In the ARCH literature numerous estimates of GARCH models for stock market, commodities, foreign exchange and other asset price series are consistent with an IGARCH specification. Likewise, estimation of stochastic volatility models shows similar patterns of persistence (see for instance Jacquier, Polson and Rossi (1994)). These findings have led to a debate regarding modelling persistence in the conditional variance process either via a unit root or a long memory process. The latter approach has been suggested both for ARCH and SV models, see Baillie, Bollerslev and Mikkelsen (1993), Breidt et al. (1993), Harvey (1993) and Comte and Renault (1995). Ding, Granger and Engle (1993) studied the serial correlations of $|r(t,t+1)|^c$ for positive values of $c$, where $r(t,t+1)$ is a one-period return on a speculative asset. They found $|r(t,t+1)|^c$ to have quite high autocorrelations for long lags, while the strongest temporal dependence was for $c$ close to one. This result, initially found for the daily S&P500 return series, was also shown to hold for other stock market indices, commodity markets and foreign exchange series (see Granger and Ding (1994)).
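The statistic studied by Ding, Granger and Engle (1993) is simple to compute. The sketch below is a minimal pure Python version of the sample autocorrelation of $|r(t,t+1)|^c$ at a given lag; the function name and the toy return series are hypothetical and serve only to show the calculation.

```python
def abs_power_autocorr(returns, c, lag):
    """Sample autocorrelation at `lag` of |r|^c for a list of returns."""
    x = [abs(r) ** c for r in returns]
    n = len(x)
    mean = sum(x) / n
    # Sum of squared deviations (n times the sample variance).
    var = sum((v - mean) ** 2 for v in x)
    # Sum of lagged cross products of deviations.
    cov = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return cov / var


if __name__ == "__main__":
    # Made-up returns whose absolute values alternate in blocks of two.
    toy = [1.0, -1.0, 2.0, -2.0, 1.0, -1.0, 2.0, -2.0]
    for lag in (2, 4):
        print(lag, abs_power_autocorr(toy, c=1.0, lag=lag))
```

Applied to an actual return series, one would evaluate the function over a grid of values of $c$ and a range of lags to see where the dependence is strongest.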

(f) Volatility comovements

There is an extensive literature on international comovements of speculative markets. Whether globalization of equity markets increases price volatility and correlations of stock returns has been the subject of many recent studies including von Fustenberg and Jean (1989), Hamao, Masulis and Ng (1990), King, Sentana and Wadhwani (1994), Harvey, Ruiz and Sentana (1992), Lin, Engle and Ito (1994). Typically one uses factor models to model the commonality of international volatility, as in Diebold and Nerlove (1989), Harvey, Ruiz and Sentana (1992), Harvey, Ruiz and Shephard (1994), or explores so-called common features, see e.g. Engle and Kozicki (1993), and common trends as studied by Bollerslev and Engle (1993).

(g) Implied volatility correlations

Stylized facts are typically reported as model-free empirical observations.13 Implied volatilities are obviously model-based as they are calculated from a pricing equation of a specific model, namely the Black and Scholes model as noted in section 2.1.3. Since they are computed on a daily basis there is obviously an internal inconsistency since the model presumes constant volatility. Yet, since many option prices are in fact quoted through their implied volatilities it is natural to study the time series behavior of the latter. Often one computes a composite measure since synchronous option prices with different strike prices and maturities for the same underlying asset yield different implied volatilities. The composite measure is usually obtained from a weighting scheme putting more weight on the near-the-money options which are the most heavily traded on organized markets.14

13This is in some part fictitious even for macroeconomic data, for instance when they are detrended or seasonally adjusted. Both detrending and seasonal adjustment are model-based. For the potentially severe impact of detrending on stylized facts see Canova (1992) and Harvey and Jaeger (1993) and for the effect of seasonal adjustment on empirical regularities see Ghysels et al. (1993).

The time series properties of implied volatilities obtained from stock, stock index and currency options

are quite similar. They appear stationary and are well described by a first order autoregressive model

(see Merville and Pieptea (1989) and Sheikh (1993) for stock options, Poterba and Summers (1986), Stein

(1989), Harvey and Whaley (1992) and Diz and Finucane (1993) for the S&P 100 contract and Taylor

and Xu (1994), Campa and Chang (1995) and Jorion (1995) for currency options). It was noted from

equation (2.1.25) that implied (average) volatilities are expected to contain information regarding future

volatility and therefore should predict the latter. One typically tests such hypotheses by regressing realized

volatilities on past implied ones.
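A minimal sketch of such a predictive regression is given below, using ordinary least squares of realized volatility on the previously observed implied volatility. The function name and the numbers are made up for illustration; they are not estimates from any of the studies cited. In this kind of regression an intercept of zero and a slope of one would correspond to an unbiased forecast.

```python
def ols_line(x, y):
    """Return (intercept, slope) of the least squares fit y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


if __name__ == "__main__":
    # Hypothetical data: implied volatility quoted at t, and the volatility
    # subsequently realized over the remaining life of the option.
    implied = [0.10, 0.15, 0.20, 0.25]
    realized = [0.02 + 0.8 * v for v in implied]
    a, b = ols_line(implied, realized)
    print("intercept:", round(a, 4), "slope:", round(b, 4))
```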

The empirical evidence regarding the predictive content of implied volatilities is mixed. The time series study of Lamoureux and Lastrapes (1993) considered options on non-dividend paying stocks and compared the forecasting performance of GARCH, implied volatility and historical volatility estimates and found that implied volatility forecasts, though they are biased as one would expect from (2.1.25), outperform the others. In sharp contrast, Canina and Figlewski (1993) studied S&P 100 index call options for which there is an extremely active market. They found that implied volatilities were virtually useless in forecasting future realized volatilities of the S&P 100 index. In a different setting, using weekly sampling intervals for S&P 100 option contracts and a different sample, Day and Lewis (1992) not only found that implied volatilities had a predictive content but also that they were unbiased. Studies examining options on foreign currencies, such as Jorion (1995), also found that implied volatilities predicted future realizations and that GARCH as well as historical volatilities did not outperform the implied measures of volatility.

(h) The term structure of implied volatilities

The Black-Scholes model predicts a flat term structure of volatilities. In reality, the term structure of at-the-money implied volatilities is typically upward sloping when short term volatilities are low and the reverse when they are high (see Stein (1989)). Taylor and Xu (1994) found that the term structure of implied volatilities from foreign currency options reverses slope every few months. Stein (1989) also found that the actual sensitivity of medium to short term implied volatilities was greater than the estimated sensitivity from the forecast term structure and concluded that medium term implied volatilities overreacted to information. Diz and Finucane (1993) used different estimation techniques and rejected the overreaction hypothesis, and even reported evidence suggesting underreaction.

(i) Smiles

If option prices in the market were conformable with the Black-Scholes formula, all the Black-Scholes implied volatilities corresponding to various options written on the same asset would coincide with the volatility parameter $\sigma$ of the underlying asset. In reality this is not the case, and the Black-Scholes implied volatility $w^{imp}(t,t+h)$ defined by (2.1.23) heavily depends on the calendar time $t$, the time to maturity $h$ and the moneyness $x_t = \log\left[S_t / K B(t,t+h)\right]$ of the option. This may produce various biases in option pricing or hedging when BS implied volatilities are used to evaluate new options with different strike prices $K$ and maturities $h$. These price distortions, well-known to practitioners, are usually documented in the empirical literature under the terminology of the smile effect, where the so-called "smile" refers to the U-shaped pattern of implied volatilities across different strike prices. More precisely, the following stylized facts are extensively documented (see for instance Rubinstein (1985), Clewlow and Xu (1993), Taylor and Xu (1993)):

• The U-shaped pattern of $w^{imp}(t,t+h)$ as a function of $K$ (or $\log K$) has its minimum centered at near-the-money options (discounted $K$ close to $S_t$, i.e. $x_t$ close to zero).

• The volatility smile is often but not always symmetric as a function of $\log K$ (or of $x_t$).

• The amplitude of the smile increases quickly when time to maturity decreases. Indeed, for short maturities the smile effect is very pronounced (BS implied volatilities for synchronous option prices may vary between 15% and 25%) while it almost completely disappears for longer maturities.

• The smile can be asymmetric. This skewness effect can often be described as the addition of a monotonic curve to the standard symmetric smile: if a decreasing curve is added, implied volatilities tend to rise more for decreasing than for increasing strike prices and the implied volatility curve has its minimum out of the money. In the reverse case (addition of an increasing curve), implied volatilities tend to rise more with increasing strike prices and their minimum is in the money.

14Different weighting schemes have been suggested, see for instance Latane and Rendleman (1976), Chiras and Manaster (1978), Beckers (1981), Whaley (1982), Day and Lewis (1988), Engle and Mustafa (1992) and Bates (1995b).
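To make the notion of a BS implied volatility concrete, the following sketch inverts a call price numerically. It assumes the standard Black-Scholes call formula written in terms of the moneyness $x_t$ and the price $B(t,t+h)$ of a zero-coupon bond maturing at $t+h$; the function names, the bisection bounds and the numerical inputs are hypothetical choices made for illustration. Repeating the inversion across observed prices at different strikes $K$ would trace out the smile.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def moneyness(S, K, B):
    """x_t = log(S_t / (K * B(t, t+h)))."""
    return math.log(S / (K * B))


def bs_call(S, K, B, sigma, h):
    """Black-Scholes call price with discount bond price B and maturity h."""
    total_vol = sigma * math.sqrt(h)
    d1 = moneyness(S, K, B) / total_vol + 0.5 * total_vol
    d2 = d1 - total_vol
    return S * norm_cdf(d1) - K * B * norm_cdf(d2)


def implied_vol(price, S, K, B, h, lo=1e-6, hi=5.0, tol=1e-10):
    """Solve bs_call(sigma) = price by bisection (price increases in sigma)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, B, mid, h) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Round trip on made-up inputs: price an option at 20% volatility,
    # then recover that volatility from the price.
    p = bs_call(100.0, 105.0, 0.98, 0.20, 0.25)
    print(round(implied_vol(p, 100.0, 105.0, 0.98, 0.25), 6))
```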

It is widely believed that volatility smiles have to be explained by a model of stochastic volatility. This is natural for several reasons. First, it is tempting to propose a model of stochastically time varying volatility to account for stochastically time varying BS implied volatilities. Moreover, the decreasing amplitude of the smile as a function of time to maturity is conformable with a formula like (2.1.25). Indeed, it shows that, when time to maturity is increased, temporal aggregation of volatilities erases conditional heteroskedasticity, which decreases the smile phenomenon. Finally, the skewness itself may also be attributed to the stochastic feature of the volatility process and overall to the correlation of this process with the price process (the so-called leverage effect). Indeed, this effect, while noticeable for stock price data, is small for interest rate and exchange rate series, which is why the skewness of the smile is more often observed for options written on stocks.

Nevertheless, it is important to be cautious about tempting associations: stochastic implied volatility and stochastic volatility; asymmetry in stocks and skewness in the smile. As will be discussed in section 4, such analogies are not always rigorously proven. Moreover, other arguments to explain the smile and its skewness (jumps, transaction costs, bid-ask spreads, non-synchronous trading, liquidity problems, ...) also have to be taken into account for both theoretical and empirical reasons. For instance, there exists empirical evidence suggesting that the most expensive options (the upper parts of the smile curve) are also the least liquid; skewness may therefore be attributed to specific configurations of liquidity in option markets.

2.3 Information sets

So far we left the specification of information sets vague. This was done on purpose to focus on one issue at a time. In this section we need to be more formal regarding the definition of information since it will allow us to clarify several missing links between the various SV models introduced in the literature and also between SV and ARCH models. We know that SV models emerged from research looking at a very diverse set of issues. In this section we will try to define a common thread and a general unifying framework. We will accomplish this through a careful analysis of information sets and associated notions of non-causality in the Granger sense. These causality conditions will allow us to characterise in section 2.4 the distinct features of ARCH and SV models.15

2.3.1 State variables and information sets

The Hull and White (1987) model is a simple example of a derivative asset pricing model where the stock price dynamics are governed by some unobservable state variables, such as random volatility. More generally, it is convenient to assume that a multivariate diffusion process $U_t$ summarizes the relevant state variables in the sense that:

$$
\begin{cases}
dS_t/S_t = \mu_t\,dt + \sigma_t\,dW_t \\
dU_t = \gamma_t\,dt + \delta_t\,dW^U_t \\
\mathrm{Cov}\left(dW_t, dW^U_t\right) = \rho_t\,dt
\end{cases}
\tag{2.3.1}
$$

where the stochastic processes $\mu_t$, $\sigma_t$, $\gamma_t$, $\delta_t$ and $\rho_t$ are $I^U_t = [U_\tau, \tau \le t]$ adapted (Assumption 2.3.1).

This means that the process $U$ summarizes the whole dynamics of the stock price process $S$ (which justifies the terminology "state" variable) since, for a given sample path $(U_\tau)_{0 \le \tau \le T}$ of state variables, consecutive returns $S_{t_{k+1}}/S_{t_k}$, $0 \le t_1 < t_2 < \dots < t_k \le T$, are stochastically independent and log-normal (as in the benchmark BS model).

The arguments of section 2.1.2 can be extended to the state variables framework (see Garcia and Renault (1995)) discussed here. Indeed, such an extension provides a theoretical justification for the common use of the Black and Scholes model as a standard method of quoting option prices via their implied volatilities.16 In fact, it is a way of introducing neglected heterogeneity in the BS option pricing models (see Renault (1995) who draws attention to the similarities with introducing heterogeneity in microeconometric models of labor markets, etc.).

In continuous time models, available information at time $t$ for traders (whose information determines option prices) is characterized by continuous time observations of both the state variable sample path and stock price process sample path; namely:

$$I_t = \sigma\left[U_\tau, S_\tau;\ \tau \le t\right] \tag{2.3.2}$$

2.3.2 Discrete sampling and Granger noncausality

In the next section we will treat explicitly discrete time models. It will necessitate formulating discrete time analogues of equation (2.3.1). The discrete sampling and Granger noncausality conditions discussed here will bring us a step closer to building a formal framework for statistical modelling using discrete time data.

Clearly, a discrete time analogue of equation (2.3.1) is:

$$\log S_{t+1}/S_t = \mu(U_t) + \sigma(U_t)\,\varepsilon_{t+1} \tag{2.3.3}$$

provided we impose some restrictions on the process $\varepsilon_t$. The restrictions we want to impose must be flexible enough to accommodate phenomena such as leverage effects for instance. A setup that does this is the following:

Assumption 2.3.2.A : The process $\varepsilon_t$ in (2.3.3) is i.i.d. and not Granger caused by the state variable process $U_t$.

Assumption 2.3.2.B : The process $\varepsilon_t$ in (2.3.3) does not Granger cause $U_t$.

Assumption 2.3.2.B is useful for the practical use of BS implied volatilities as it is the discrete time analogue of Assumption 2.3.1 where it is stated that the coefficients of the process $U$ are $I^U_t$ adapted (for further details see Garcia and Renault (1995)). Assumption 2.3.2.A is important for the statistical interpretation of the functions $\mu(U_t)$ and $\sigma(U_t)$ respectively as trend and volatility coefficients. Namely,

$$
\begin{aligned}
E\left[\log S_{t+1}/S_t \mid (S_\tau/S_{\tau-1};\ \tau \le t)\right]
&= E\left[E\left[\log S_{t+1}/S_t \mid (U_\tau, \varepsilon_\tau;\ \tau \le t)\right] \mid (S_\tau/S_{\tau-1};\ \tau \le t)\right] \\
&= E\left[\mu(U_t) \mid (S_\tau/S_{\tau-1};\ \tau \le t)\right]
\end{aligned}
\tag{2.3.4}
$$

since $E\left[\varepsilon_{t+1} \mid (U_\tau, \varepsilon_\tau;\ \tau \le t)\right] = E\left[\varepsilon_{t+1} \mid \varepsilon_\tau;\ \tau \le t\right] = 0$ due to the Granger noncausality from $U_t$ to $\varepsilon_t$ of Assumption 2.3.2.A. Likewise, one can easily show that

$$
Var\left[\log S_{t+1}/S_t - \mu(U_t) \mid (S_\tau/S_{\tau-1};\ \tau \le t)\right]
= E\left[\sigma^2(U_t) \mid (S_\tau/S_{\tau-1};\ \tau \le t)\right]
\tag{2.3.5}
$$

Implicitly we have introduced a new information set in (2.3.4) and (2.3.5) which, besides $I_t$ defined in (2.3.2), will be useful as well for further analysis. Indeed, one often confines (statistical) analysis to information conveyed by a discrete time sampling of stock return series which will be denoted by the information set

$$I^R_t \equiv \sigma\left[S_\tau/S_{\tau-1} : \tau = 0, 1, \dots, t-1, t\right] \tag{2.3.6}$$

where the superscript $R$ stands for returns. By extending Andersen (1994), we shall adopt as the most general framework for univariate volatility modelling the setup given by Assumptions 2.3.2.A, 2.3.2.B and:

Assumption 2.3.2.C : $\mu(U_t)$ is $I^R_t$ measurable.

Therefore in (2.3.4) and (2.3.5) we have essentially shown that:

$$E\left[\log S_{t+1}/S_t \mid I^R_t\right] = \mu(U_t) \tag{2.3.7}$$

$$Var\left[\left(\log S_{t+1}/S_t\right) \mid I^R_t\right] = E\left[\sigma^2(U_t) \mid I^R_t\right] \tag{2.3.8}$$

15The analysis in this section has some features in common with Andersen (1992) regarding the use of information sets to clarify the difference between SV and ARCH type models.

16Garcia and Renault (1995) argued that Assumption 2.3.1 is essential to ensure the homogeneity of option prices with respect to the pair (stock price, strike price) which in turn ensures that BS implied volatilities do not depend on monetary units. This homogeneity property was first emphasized by Merton (1973).

2.4 Statistical Modelling of Stochastic Volatility

Financial time series are observed at discrete time intervals while a majority of theoretical models are formulated in continuous time. Generally speaking there are two statistical methodologies to resolve this tension. One may consider, for the purpose of estimation, statistical discrete time models of the continuous time processes. Alternatively, the statistical model may be specified in continuous time and inference is done via a discrete time approximation. In this section we will discuss in detail the former approach while the latter will be introduced in section 4. The class of discrete time statistical models discussed here is general. In section 2.4.1 we introduce some notation and terminology. The next section discusses the so-called stochastic autoregressive volatility model introduced by Andersen (1994) as a rather general and flexible semi-parametric framework to encompass various representations of stochastic volatility already available in the literature. Identification of parameters and the additional restrictions required are discussed in section 2.4.3.

2.4.1 Notation and Terminology

In section 2.3, we left unspecified the functional forms which the trend $\mu(\cdot)$ and volatility $\sigma(\cdot)$ take. Indeed, in some sense we built a nonparametric framework recently proposed by Lezan, Renault and de Vitry (1995) which they introduced to discuss a notion of stochastic volatility of unknown form.17 This nonparametric framework encompasses standard parametric models (see section 2.4.2 for more formal discussion). For the purpose of illustration let us consider two extreme cases, assuming for simplicity that $\mu(U_t) = 0$: (i) the discrete time analogue of the Hull and White model (2.1.16) is obtained when $\sigma(U_t) = \sigma_t$ is a stochastic process independent from the stock return standardized innovation process $\varepsilon$ and (ii) $\sigma_t$ may be a deterministic function $h(\varepsilon_\tau, \tau \le t)$ of past innovations. The latter is the complete opposite of (i) and leads to a large variety of choices of parametrized functions for $h$ yielding X-ARCH models (GARCH, EGARCH, QTARCH, Periodic GARCH, etc.).

Besides these two polar cases where Assumption 2.3.2.A is fulfilled in a trivial degenerate way, one can also accommodate leverage effects.18 In particular the contemporaneous correlation structure between innovations in $U$ and the return process can be nonzero, since the Granger non-causality assumptions deal with temporal causal links rather than contemporaneous ones. For instance, we may have $\sigma(U_t) = \sigma_t$ with:

$$\log S_{t+1}/S_t = \sigma_t\varepsilon_{t+1} \tag{2.4.1}$$

$$Cov\left(\sigma_{t+1}, \varepsilon_{t+1} \mid I^R_t\right) \ne 0 \tag{2.4.2}$$

A negative covariance in (2.4.2) is a standard case of leverage effect, without violating the non-causality Assumptions 2.3.2.A and B.

A few concluding observations are worth making to deal with the burgeoning variety of terminologies in the literature. First, we have not considered the distinction due to Taylor (1994) between "lagged autoregressive random variance models" given by (2.4.1) and "contemporaneous autoregressive random variance models" defined by:

$$\log S_{t+1}/S_t = \sigma_{t+1}\varepsilon_{t+1} \tag{2.4.3}$$

Indeed, since the volatility process $\sigma_t$ is unobservable, the settings (2.4.1) and (2.4.3) are observationally equivalent as long as they are not completed by precise (non)-causality assumptions. For instance: (i) (2.4.1) and Assumption 2.3.2.A together appear to be a correct and very general definition of a SV model, possibly completed by Assumption 2.3.2.B for option pricing and (2.4.2) to introduce leverage effects, (ii) (2.4.3) associated with (2.4.2) would not be a correct definition of a SV model since in this case in general $E\left[\log S_{t+1}/S_t \mid I^R_t\right] \ne 0$, and the model would introduce via the process $\sigma$ a forecast which is related not only to volatility but also to the expected return.

For notational simplicity, the framework (2.4.3) will be used in section 3 with the leverage effect captured by $Cov(\sigma_{t+1}, \varepsilon_t) \ne 0$ instead of $Cov(\sigma_{t+1}, \varepsilon_{t+1}) \ne 0$. Another terminology was introduced by Amin and Ng (1993) for option pricing. Their distinction between "predictable" and "unpredictable" volatility is very close to the leverage effect concept and can also be analyzed through causality concepts as discussed in Garcia and Renault (1995). Finally, it will not be necessary to make a distinction between weak, semi-strong and strong definitions of SV models in analogy with their ARCH counterparts (see Drost and Nijman (1993)). Indeed, the class of SV models as defined here can accommodate parametrizations which are closed under temporal aggregation (see also section 4.1 on the subject of temporal aggregation).

17Lezan, Renault and de Vitry (1995) discuss in detail how to recover phenomena such as volatility clustering in this framework. As a nonparametric framework it also has certain advantages regarding (robust) estimation. They develop for instance methods that can be useful as a first estimation step for efficient algorithms assuming a specific parametric model (see Section 5).

18Assumption 2.3.2.B is fulfilled in the case (i) but may fail in the GARCH case (ii). When it fails to hold in the latter case it makes the GARCH framework not very well-suited for option pricing.

2.4.2 Stochastic Autoregressive Volatility

For simplicity, let us consider the following univariate volatility process:

$$y_{t+1} = \mu_t + \sigma_t\varepsilon_{t+1} \tag{2.4.4}$$

where $\mu_t$ is a measurable function of observables $y_\tau \in I^R_t$, $\tau \le t$. While our discussion will revolve around (2.4.4), we will discuss several issues which are general and not confined to that specific model; extensions will be covered more explicitly in section 3.5. Following the result in (2.3.8) we know that:

$$Var\left[y_{t+1} \mid I^R_t\right] = E\left[\sigma_t^2 \mid I^R_t\right] \tag{2.4.5}$$

suggesting (1) that volatility clustering can be captured via autoregressive dynamics in the conditional expectation (2.4.5) and (2) that thick tails can be obtained in any one of three ways, namely (a) via heavy tails of the white noise $\varepsilon_t$ distribution, (b) via the stochastic features of $E\left[\sigma_t^2 \mid I^R_t\right]$ and (c) via specific randomness of the volatility process $\sigma_t$ which makes it latent, i.e. $\sigma_t \notin I^R_t$.19 The volatility dynamics that follow from (1) and (2) are usually an AR(1) model for some nonlinear function of $\sigma_t$. Hence, the volatility process is assumed to be stationary and Markovian of order one but not necessarily linear AR(1) in $\sigma_t$ itself. This is precisely what motivated Andersen (1994) to introduce the Stochastic Autoregressive Variance or SARV class of models where $\sigma_t$ (or $\sigma_t^2$) is a polynomial function $g(K_t)$ of a Markov process $K_t$ with the following dynamic specification:

$$K_t = w + \beta K_{t-1} + \left[\gamma + \alpha K_{t-1}\right]u_t \tag{2.4.6}$$

where $\tilde{u}_t = u_t - 1$ is zero-mean white noise with unit variance. Andersen (1994) discusses sufficient regularity conditions which ensure stationarity and ergodicity for $K_t$. Without entering into the details, let us note that the fundamental non-causality Assumption 2.3.2.A implies that the $u_t$ process in (2.4.6) does not Granger cause $\varepsilon_t$ in (2.4.4). In fact, the non-causality condition suggests a slight modification of Andersen's (1994) definition. Namely, it suggests assuming $\varepsilon_{t+1}$ independent of $u_{t-j}$, $j \ge 0$, for the conditional probability distribution given $\varepsilon_{t-j}$, $j \ge 0$, rather than for the unconditional distribution. This modification does not invalidate Andersen's SARV class of models as the most general parametric statistical model studied so far in the volatility literature. The GARCH(1,1) model is straightforwardly obtained from (2.4.6) by letting $K_t = \sigma_t^2$, $\gamma = 0$ and $u_t = \varepsilon_t^2$. Note that the deterministic relationship $u_t = \varepsilon_t^2$ between the stochastic components of (2.4.4) and (2.4.6) emphasizes that, in GARCH models, there is no randomness specific to the volatility process. The Autoregressive Random Variance model popularized by Taylor (1986) also belongs to the SARV class. Here:

$$\log\sigma_{t+1} = \xi + \phi\log\sigma_t + \eta_{t+1} \tag{2.4.7}$$

where $\eta_{t+1}$ is a white noise disturbance such that $Cov(\eta_{t+1}, \varepsilon_{t+1}) \ne 0$ to accommodate leverage effects. This is a SARV model with $K_t = \log\sigma_t$, $\alpha = 0$ and $\eta_{t+1} = \gamma u_{t+1}$.20

19Kim and Shephard (1994), using data on weekly returns on the S&P 500 Index, found that a t-GARCH model has an almost identical likelihood as the normal based SV model. This example shows that a specific randomness in $\sigma_t$ may produce the same level of marginal kurtosis as a heavy tailed Student distribution of the white noise $\varepsilon$.
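The nesting of GARCH(1,1) in the SARV recursion (2.4.6) can be checked numerically. The sketch below is only an illustration under stated assumptions: the coefficient names follow the reading of (2.4.6) in which `beta` multiplies the lagged state, `alpha` multiplies the lagged state times the shock and `gamma` multiplies the shock alone; the function names, parameter values and Gaussian innovations are hypothetical. It runs the SARV step with `gamma = 0` and the shock set to the squared return innovation, next to the textbook GARCH(1,1) update driven by the squared return, and the two variance paths coincide up to rounding.

```python
import math
import random


def sarv_step(k_prev, u, w, beta, gamma, alpha):
    """One step of the SARV recursion K_t = w + beta*K_{t-1} + (gamma + alpha*K_{t-1})*u_t."""
    return w + beta * k_prev + (gamma + alpha * k_prev) * u


def garch_as_sarv(n, w, alpha, beta, seed=0):
    """Return (SARV path, GARCH path) of the conditional variance."""
    rng = random.Random(seed)
    k = w / (1.0 - alpha - beta)  # start both at the unconditional variance
    sigma2 = k
    sarv_path, garch_path = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        y = math.sqrt(sigma2) * eps  # return: y_t = sigma_{t-1} * eps_t
        sigma2 = w + beta * sigma2 + alpha * y * y  # textbook GARCH(1,1)
        k = sarv_step(k, eps * eps, w, beta, 0.0, alpha)  # gamma = 0, u = eps^2
        sarv_path.append(k)
        garch_path.append(sigma2)
    return sarv_path, garch_path


if __name__ == "__main__":
    sarv, garch = garch_as_sarv(1000, w=0.05, alpha=0.10, beta=0.85, seed=7)
    print(max(abs(a - b) for a, b in zip(sarv, garch)))
```

Replacing `eps * eps` by a shock drawn independently of `eps` would give a process with randomness specific to volatility, which is the feature that separates SV from GARCH in this framework.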

2.4.3 Identification of parameters

Introducing a general class of processes for volatility, like the SARV class discussed in the previous section, prompts questions regarding identification. Suppose again that

$$
\begin{cases}
y_{t+1} = \sigma_t\varepsilon_{t+1} \\
\sigma_t^q = g(K_t), \quad q \in \{1, 2\} \\
K_t = w + \beta K_{t-1} + \left[\gamma + \alpha K_{t-1}\right]u_t .
\end{cases}
\tag{2.4.8}
$$

Andersen (1994) noted that the model is better interpreted by considering the zero-mean white noise process $\tilde{u}_t = u_t - 1$:

$$K_t = (w + \gamma) + (\alpha + \beta)K_{t-1} + (\gamma + \alpha K_{t-1})\,\tilde{u}_t . \tag{2.4.9}$$

It is clear from the latter that it may be difficult to distinguish empirically the constant $w$ from the "stochastic" constant $\gamma u_t$. Similarly, the identification of the $\alpha$ and $\beta$ parameters separately is also problematic as $(\alpha + \beta)$ governs the persistence of shocks to volatility. These identification problems are usually resolved by imposing (arbitrary) restrictions on the pairs of parameters $(w, \gamma)$ and $(\alpha, \beta)$.

The GARCH(1,1) and Autoregressive Random Variance specifications assume that $\gamma = 0$ and $\alpha = 0$ respectively. Identification of all parameters without such restrictions generally requires additional constraints, for instance via some distributional assumptions on $\varepsilon_{t+1}$ and $u_t$, which restrict the semi-parametric framework of (2.4.6) into a parametric statistical model.

To address more rigorously the issue of identification, it is useful to consider, according to Andersen (1994), the following reparametrisation (assuming for notational convenience that $\alpha \ne 0$):

$$
\begin{cases}
K = (w + \gamma)/(1 - \alpha - \beta) \\
\rho = \alpha + \beta \\
\delta = \gamma/\alpha
\end{cases}
\tag{2.4.10}
$$

Hence equation (2.4.9) can be rewritten as:

$$K_t = K + \rho\,(K_{t-1} - K) + (\delta + K_{t-1})\,\tilde{U}_t$$

where $\tilde{U}_t = \alpha\tilde{u}_t$.

It is clear from (2.4.10) that only three functions of the original parameters $\alpha, \beta, \gamma, w$ may be identified and that the three parameters $K, \rho, \delta$ are identified from the first three unconditional moments of the process $K_t$ for instance.

20Andersen (1994) also shows that the SARV framework encompasses another type of random variance model that we have considered as ill-specified since it combines (2.4.2) and (2.4.3).
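The rewriting of (2.4.9) under the reparametrisation (2.4.10) can be verified by direct substitution. The short derivation below is a sketch which writes $\alpha$ for the coefficient of $K_{t-1}u_t$, $\beta$ for the coefficient of $K_{t-1}$ and $\gamma$ for the coefficient of $u_t$ in (2.4.6), and which uses only the definitions of $K$, $\rho$, $\delta$ and $\tilde{U}_t = \alpha\tilde{u}_t$:

```latex
\begin{aligned}
K + \rho\,(K_{t-1} - K)
  &= (1 - \rho)\,K + \rho\,K_{t-1}
   = (1 - \alpha - \beta)\,\frac{w + \gamma}{1 - \alpha - \beta} + (\alpha + \beta)\,K_{t-1} \\
  &= (w + \gamma) + (\alpha + \beta)\,K_{t-1}, \\[4pt]
(\delta + K_{t-1})\,\tilde{U}_t
  &= \left(\frac{\gamma}{\alpha} + K_{t-1}\right)\alpha\,\tilde{u}_t
   = (\gamma + \alpha K_{t-1})\,\tilde{u}_t .
\end{aligned}
```

Adding the two lines reproduces the right-hand side of (2.4.9). The four original parameters enter only through $K$, $\rho$ and $\delta$, together with the scale of $\tilde{U}_t$, which is why restrictions on the pairs $(w, \gamma)$ and $(\alpha, \beta)$ are needed.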

To give these identification results an empirical content, it is essential to know: (1) how to go from the moments of the observable process $y_t$ to the moments of the volatility process $\sigma_t$ and (2) how to go from the moments of the volatility process $\sigma_t$ to the moments of the latent process $K_t$. The first point is easily solved by specifying the corresponding moments of the standardised innovation process $\varepsilon$. If we assume for instance a Gaussian probability distribution, we obtain that:

$$
\begin{cases}
E\,|y_t| = \sqrt{2/\pi}\;E\,\sigma_t \\
E\,|y_t|\,|y_{t-j}| = (2/\pi)\;E\,(\sigma_t\sigma_{t-j}) \\
E\,|y_t^2|\,|y_{t-j}| = \sqrt{2/\pi}\;E\,(\sigma_t^2\sigma_{t-j})
\end{cases}
\tag{2.4.11}
$$

The solution of the second point requires in general the specification of the mapping $g$ and of the probability distribution of $u_t$ in (2.4.6). For the so-called Log-normal SARV model, it is assumed that $\alpha = 0$ and $K_t = \log\sigma_t$ (Taylor's autoregressive random variance model) and that $u_t$ is normally distributed (Log-normality of the volatility process). In this case, it is easy to show that:

$$
\begin{cases}
E\,\sigma_t^n = \exp\left[n\,E K_t + n^2\,Var K_t/2\right] \\
E\left(\sigma_t^m\sigma_{t-j}^n\right) = E\,\sigma_t^m\;E\,\sigma_{t-j}^n\,\exp\left[mn\,Cov(K_t, K_{t-j})\right] \\
Cov(K_t, K_{t-j}) = \beta^j\,Var K_t
\end{cases}
\tag{2.4.12}
$$
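The first lines of (2.4.11) and (2.4.12) can be combined and checked by simulation. The sketch below assumes a Gaussian innovation independent of the volatility process and a Gaussian AR(1) for $K_t = \log\sigma_t$; the names `xi`, `phi` and `sd_eta` (intercept, autoregressive coefficient and innovation standard deviation of the log-volatility equation) and their values are hypothetical. It compares the sample mean of $|y_t|$ with $\sqrt{2/\pi}\,\exp[E K_t + Var K_t/2]$.

```python
import math
import random


def lognormal_sarv_abs_moment(n=100_000, xi=-0.05, phi=0.9, sd_eta=0.2, seed=1):
    """Return (sample mean of |y|, value implied by the log-normal formulas)."""
    rng = random.Random(seed)
    mean_k = xi / (1.0 - phi)             # stationary mean of K_t
    var_k = sd_eta ** 2 / (1.0 - phi ** 2)  # stationary variance of K_t
    k = rng.gauss(mean_k, math.sqrt(var_k))  # start in the stationary law
    total = 0.0
    for _ in range(n):
        y = math.exp(k) * rng.gauss(0.0, 1.0)  # y_{t+1} = sigma_t * eps_{t+1}
        total += abs(y)
        k = xi + phi * k + rng.gauss(0.0, sd_eta)  # AR(1) for log volatility
    sample = total / n
    # E|y| = sqrt(2/pi) * E(sigma), with E(sigma) = exp(E K + Var K / 2).
    theory = math.sqrt(2.0 / math.pi) * math.exp(mean_k + 0.5 * var_k)
    return sample, theory


if __name__ == "__main__":
    sample, theory = lognormal_sarv_abs_moment()
    print("sample:", round(sample, 4), "theory:", round(theory, 4))
```

In a method of moments exercise the mapping would be used in the other direction, going from sample moments of $|y_t|$ to the mean, variance and autocovariances of $K_t$.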

Without the normality assumption (i.e. QML, mixture of normals, Student distribution, ...) this model will be studied in much more detail in sections 3 and 5 from both probabilistic and statistical points of view. Moreover, this is a template for studying other specifications of the SARV class of models. In addition, various specifications will be considered in section 4 as proxies of continuous time models.

3 Discrete Time Models

The purpose of this section will be to discuss the statistical handling of discrete time SV models, using simple univariate cases. We start by defining the most basic SV model, corresponding to the autoregressive random variance model discussed earlier in (2.4.7). We study its statistical properties in section 3.2 and provide a comparison with ARCH models in section 3.3. Section 3.4 is devoted to filtering, prediction and smoothing. Various extensions, including multivariate models, are covered in the last section. Estimation of the parameters governing the volatility process is discussed later in section 5.

3.1 The Discrete Time SV Model

The discrete time SV model may be written as

$$y_t = \sigma_t \varepsilon_t, \qquad t = 1, \ldots, T, \tag{3.1.1}$$


where $y_t$ denotes the demeaned return process $y_t = \log(S_t/S_{t-1}) - \mu$ and $\log \sigma_t^2$ follows an AR(1) process. It will be assumed that $\varepsilon_t$ is a series of independent, identically distributed random disturbances. Usually $\varepsilon_t$ is specified to have a standard distribution, so its variance, $\sigma_\varepsilon^2$, is known. Thus for a normal distribution $\sigma_\varepsilon^2$ is unity, while for a t-distribution with $\nu$ degrees of freedom it will be $\nu/(\nu - 2)$. Following a convention often adopted in the literature we write:

$$y_t = \sigma \varepsilon_t e^{0.5 h_t} \tag{3.1.2}$$

where $\sigma$ is a scale parameter, which removes the need for a constant term in the stationary first-order autoregressive process

$$h_{t+1} = \phi h_t + \eta_t, \qquad \eta_t \sim IID(0, \sigma_\eta^2), \qquad |\phi| < 1. \tag{3.1.3}$$

It was noted before that if $\varepsilon_t$ and $\eta_t$ are allowed to be correlated with each other, the model can pick up the kind of asymmetric behavior which is often found in stock prices. Indeed a negative correlation between $\varepsilon_t$ and $\eta_t$ induces a leverage effect. As in section 2.4, the timing of the disturbance in (3.1.3) ensures that the observations are still a martingale difference, the equation being written in this way so as to tie in with the state space literature.

It should be stressed that the above model is only an approximation to the continuous time model of section 2 observed at discrete intervals. The accuracy of the approximation is examined in Dassios (1995) using Edgeworth expansions.
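To make the timing in (3.1.2) and (3.1.3) concrete, the following sketch simulates the model with Gaussian and mutually independent $\varepsilon_t$ and $\eta_t$. The default parameter values, the function name and the choice to start $h_1$ from its stationary distribution are assumptions made for illustration only.

```python
import math
import random

def simulate_sv(T, sigma=1.0, phi=0.95, sigma_eta=0.2, seed=0):
    """Simulate y_t = sigma * eps_t * exp(0.5 * h_t), h_{t+1} = phi * h_t + eta_t."""
    rng = random.Random(seed)
    # Start h_1 from the stationary distribution N(0, sigma_eta^2 / (1 - phi^2)).
    h = rng.gauss(0.0, sigma_eta / math.sqrt(1.0 - phi * phi))
    y, hs = [], []
    for _ in range(T):
        eps = rng.gauss(0.0, 1.0)
        y.append(sigma * eps * math.exp(0.5 * h))
        hs.append(h)
        # eta_t only enters h_{t+1}, so h_t is known before eps_t is drawn.
        h = phi * h + rng.gauss(0.0, sigma_eta)
    return y, hs
```

Because $h_t$ depends only on disturbances dated $t-1$ and earlier, $y_t$ remains a martingale difference in this scheme.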

3.2 Statistical Properties

The following properties of the SV model hold even if $\varepsilon_t$ and $\eta_t$ are contemporaneously correlated. Firstly, as noted, $y_t$ is a martingale difference. Secondly, stationarity of $h_t$ implies stationarity of $y_t$. Thirdly, if $\eta_t$ is normally distributed, it follows from the properties of the lognormal distribution that $E[\exp(a h_t)] = \exp(a^2 \sigma_h^2/2)$, where $a$ is a constant and $\sigma_h^2$ is the variance of $h_t$. Hence, if $\varepsilon_t$ has a finite variance, the variance of $y_t$ is given by

$$\mathrm{Var}(y_t) = \sigma^2 \sigma_\varepsilon^2 \exp(\sigma_h^2/2). \tag{3.2.1}$$

Similarly, if the fourth moment of $\varepsilon_t$ exists, the kurtosis of $y_t$ is $\kappa \exp(\sigma_h^2)$, where $\kappa$ is the kurtosis of $\varepsilon_t$, so $y_t$ exhibits more kurtosis than $\varepsilon_t$. Finally, all the odd moments are zero.

For many purposes we need to consider the moments of powers of absolute values. Again, $\eta_t$ is assumed to be normally distributed. Then, for $\varepsilon_t$ having a standard normal distribution, the following expressions are derived in Harvey (1993):

$$E|y_t|^c = \sigma^c 2^{c/2}\, \frac{\Gamma\left(\frac{c}{2} + \frac{1}{2}\right)}{\Gamma\left(\frac{1}{2}\right)} \exp\left(\frac{c^2}{8}\sigma_h^2\right), \qquad c > -1,\ c \neq 0 \tag{3.2.2}$$

and

$$\mathrm{Var}|y_t|^c = \sigma^{2c} 2^{c} \exp\left(\frac{c^2}{2}\sigma_h^2\right) \left\{ \frac{\Gamma\left(c + \frac{1}{2}\right)}{\Gamma\left(\frac{1}{2}\right)} - \left[ \frac{\Gamma\left(\frac{c}{2} + \frac{1}{2}\right)}{\Gamma\left(\frac{1}{2}\right)} \right]^2 \exp\left(-\frac{c^2}{4}\sigma_h^2\right) \right\}, \qquad c > -0.5,\ c \neq 0.$$

Note that $\Gamma(1/2) = \sqrt{\pi}$ and $\Gamma(1) = 1$. Corresponding expressions may be computed for other distributions of $\varepsilon_t$, including Student's t and the General Error Distribution.


Finally, the square of the coefficient of variation of $\sigma_t^2$ is often used as a measure of the relative strength of the SV process. This is $\mathrm{Var}(\sigma_t^2)/[E(\sigma_t^2)]^2 = \exp(\sigma_h^2) - 1$. Jacquier, Polson and Rossi (1994) argue that this is more easily interpretable than $\sigma_\eta^2$. In the empirical studies they quote it is rarely less than 0.1 or greater than 2.
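The closed-form moments of this section can be evaluated directly. The sketch below assumes Gaussian $\varepsilon_t$ and $\eta_t$; the function names are illustrative, and `sigma_h2` stands for $\sigma_h^2$.

```python
import math

def sv_variance(sigma, sigma_h2, var_eps=1.0):
    """Var(y_t) from (3.2.1)."""
    return sigma ** 2 * var_eps * math.exp(sigma_h2 / 2.0)

def sv_kurtosis(sigma_h2, kappa_eps=3.0):
    """Kurtosis of y_t: kurtosis of eps_t times exp(sigma_h^2)."""
    return kappa_eps * math.exp(sigma_h2)

def abs_moment(c, sigma, sigma_h2):
    """E|y_t|^c from (3.2.2); requires c > -1 and c != 0."""
    return (sigma ** c * 2.0 ** (c / 2.0)
            * math.gamma(c / 2.0 + 0.5) / math.gamma(0.5)
            * math.exp(c * c * sigma_h2 / 8.0))

def cv_squared(sigma_h2):
    """Squared coefficient of variation of sigma_t^2."""
    return math.exp(sigma_h2) - 1.0
```

Setting $c = 2$ in `abs_moment` reproduces (3.2.1) with $\sigma_\varepsilon^2 = 1$, and the ratio of the fourth absolute moment to the squared second moment reproduces the kurtosis $3\exp(\sigma_h^2)$.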

3.2.1 Autocorrelation Functions

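If we assume that the disturbances $\varepsilon_t$ and $\eta_t$ are mutually independent, and $\eta_t$ is normal, the ACF of the absolute values of the observations raised to the power $c$ is given by

$$\rho_\tau^{(c)} = \frac{E(|y_t|^c |y_{t-\tau}|^c) - \{E(|y_t|^c)\}^2}{E(|y_t|^{2c}) - \{E(|y_t|^c)\}^2} = \frac{\exp\left(\frac{c^2}{4}\sigma_h^2 \rho_{h,\tau}\right) - 1}{\kappa_c \exp\left(\frac{c^2}{4}\sigma_h^2\right) - 1}, \qquad \tau \geq 1,\ c > -0.5,\ c \neq 0, \tag{3.2.3}$$

where $\kappa_c$ is

$$\kappa_c = E(|\varepsilon_t|^{2c})/\{E(|\varepsilon_t|^c)\}^2, \tag{3.2.4}$$

and $\rho_{h,\tau}$, $\tau = 0, 1, 2, \ldots$, denotes the ACF of $h_t$. Taylor (1986) gives this expression for $c$ equal to one and two and $\varepsilon_t$ normally distributed. When $c = 2$, $\kappa_c$ is the kurtosis and this is three for a normal distribution. More generally,

$$\kappa_c = \Gamma\left(c + \tfrac{1}{2}\right)\Gamma\left(\tfrac{1}{2}\right) \Big/ \left\{\Gamma\left(\tfrac{c}{2} + \tfrac{1}{2}\right)\right\}^2, \qquad c \neq 0.$$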

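For Student's t-distribution with $\nu$ degrees of freedom:

$$\kappa_c = \frac{\Gamma\left(c + \frac{1}{2}\right)\Gamma\left(-c + \frac{\nu}{2}\right)\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{\nu}{2}\right)}{\left\{\Gamma\left(\frac{c}{2} + \frac{1}{2}\right)\Gamma\left(-\frac{c}{2} + \frac{\nu}{2}\right)\right\}^2}, \qquad |c| < \nu/2,\ c \neq 0. \tag{3.2.5}$$

Note that $\nu$ must be at least five if $c$ is two.

The ACF, $\rho_\tau^{(c)}$, has the following features. First, if $\sigma_h^2$ is small and/or $\rho_{h,\tau}$ is close to one,

$$\rho_\tau^{(c)} \simeq \rho_{h,\tau}\, \frac{\exp\left(\frac{c^2}{4}\sigma_h^2\right) - 1}{\kappa_c \exp\left(\frac{c^2}{4}\sigma_h^2\right) - 1}, \qquad \tau \geq 1; \tag{3.2.6}$$

compare Taylor (1986, p. 74-5). Thus the shape of the ACF of $h_t$ is approximately carried over to $\rho_\tau^{(c)}$, except that it is multiplied by a factor of proportionality, which must be less than one for $c$ positive as $\kappa_c$ is greater than one. Secondly, for the t-distribution, $\kappa_c$ declines as $\nu$ goes to infinity. Thus $\rho_\tau^{(c)}$ is a maximum for a normal distribution. On the other hand, a distribution with less kurtosis than the normal will give rise to higher values of $\rho_\tau^{(c)}$.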

Although (3.2.3) gives an explicit relationship between $\rho_\tau^{(c)}$ and $c$, it does not appear possible to make any general statements regarding $\rho_\tau^{(c)}$ being maximized for certain values of $c$. Indeed, different values of $\sigma_h^2$ lead to different values of $c$ maximizing $\rho_\tau^{(c)}$. If $\sigma_h^2$ is chosen so as to give values of $\rho_\tau^{(c)}$ of a similar size to those reported in Ding, Granger and Engle (1993), then the maximum appears to be attained for $c$ slightly less than one. The shape of the curve relating $\rho_\tau^{(c)}$ to $c$ is similar to the empirical relationships reported in Ding, Granger and Engle, as noted by Harvey (1993).
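The dependence of the ACF on $c$ and on the distribution of $\varepsilon_t$ can be explored numerically. The sketch below assumes that $h_t$ is the AR(1) process of (3.1.3), so that $\rho_{h,\tau} = \phi^\tau$; the function names are illustrative.

```python
import math

def kappa_normal(c):
    """kappa_c for standard normal eps_t (c != 0, c > -0.5)."""
    return math.gamma(c + 0.5) * math.gamma(0.5) / math.gamma(c / 2.0 + 0.5) ** 2

def kappa_student(c, nu):
    """kappa_c from (3.2.5) for Student's t with nu degrees of freedom; needs |c| < nu/2."""
    num = (math.gamma(c + 0.5) * math.gamma(-c + nu / 2.0)
           * math.gamma(0.5) * math.gamma(nu / 2.0))
    den = (math.gamma(c / 2.0 + 0.5) * math.gamma(-c / 2.0 + nu / 2.0)) ** 2
    return num / den

def acf_abs_power(c, tau, phi, sigma_h2, kappa_c):
    """rho_tau^(c) from (3.2.3), assuming rho_{h,tau} = phi**tau."""
    rho_h = phi ** tau
    num = math.exp(c * c * sigma_h2 * rho_h / 4.0) - 1.0
    den = kappa_c * math.exp(c * c * sigma_h2 / 4.0) - 1.0
    return num / den
```

For $c = 2$ the two $\kappa_c$ functions return the familiar kurtosis values, namely three for the normal and $3(\nu-2)/(\nu-4)$ for the t-distribution, and evaluating `acf_abs_power` over a grid of $c$ traces out the curve discussed above.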


3.2.2 Logarithmic Transformation

Squaring the observations in (3.1.2) and taking logarithms gives

$$\log y_t^2 = \log \sigma^2 + h_t + \log \varepsilon_t^2. \tag{3.2.7}$$

Alternatively,

$$\log y_t^2 = \omega + h_t + \xi_t, \tag{3.2.8}$$

where $\omega = \log \sigma^2 + E \log \varepsilon_t^2$, so that the disturbance $\xi_t$ has zero mean by construction.

The mean and variance of $\log \varepsilon_t^2$ are known to be $-1.27$ and $\pi^2/2 = 4.93$ when $\varepsilon_t$ has a standard normal distribution; see Abramowitz and Stegun (1970). However, the distribution of $\log \varepsilon_t^2$ is far from being normal, being heavily skewed with a long tail.

More generally, if $\varepsilon_t$ has a t-distribution with $\nu$ degrees of freedom, it can be expressed as:

$$\varepsilon_t = \zeta_t \omega_t^{-0.5},$$

where $\zeta_t$ is a standard normal variate and $\omega_t$ is independently distributed such that $\nu\omega_t$ is chi-square with $\nu$ degrees of freedom. Thus

$$\log \varepsilon_t^2 = \log \zeta_t^2 - \log \omega_t$$

and, again using results in Abramowitz and Stegun (1970), it follows that the mean and variance of $\log \varepsilon_t^2$ are $-1.27 - \psi(\nu/2) + \log(\nu/2)$ and $4.93 + \psi'(\nu/2)$ respectively, where $\psi(\cdot)$ is the digamma function. Note that the moments of $\xi_t$ exist even if the model is formulated in such a way that the distribution of $\varepsilon_t$ is Cauchy, that is $\nu = 1$. In fact in this case $\xi_t$ is symmetric with excess kurtosis two, compared with excess kurtosis four when $\varepsilon_t$ is Gaussian.

Since $\log \varepsilon_t^2$ is serially independent, it is straightforward to work out the ACF of $\log y_t^2$ for $h_t$ following any stationary process:

$$\rho_\tau^{(0)} = \rho_{h,\tau}/\{1 + \sigma_\xi^2/\sigma_h^2\}, \qquad \tau \geq 1. \tag{3.2.9}$$

The notation $\rho_\tau^{(0)}$ reflects the fact that the ACF of a power of an absolute value of the observation is the same as that of the Box-Cox transform, that is $\{|y_t|^c - 1\}/c$, and hence the logarithmic transform of an absolute value, raised to any (non-zero) power, corresponds to $c = 0$. (But note that one cannot simply set $c = 0$ in (3.2.3).)

Note that even if $\eta_t$ and $\varepsilon_t$ are not mutually independent, the $\eta_t$ and $\xi_t$ disturbances are uncorrelated if the joint distribution of $\varepsilon_t$ and $\eta_t$ is symmetric, that is $f(\varepsilon_t, \eta_t) = f(-\varepsilon_t, -\eta_t)$; see Harvey, Ruiz and Shephard (1994). Hence the expression for the ACF in (3.2.9) remains valid.
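The constants quoted for the Gaussian case, and the ACF in (3.2.9), can be reproduced with a few lines of code. This sketch uses the standard identity $\psi(1/2) = -\gamma_E - 2\log 2$, with $\gamma_E$ Euler's constant, which is not stated in the text, and it assumes an AR(1) process for $h_t$; the function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def mean_log_eps2():
    """E[log eps_t^2] for standard normal eps_t: psi(1/2) + log 2 = -gamma_E - log 2."""
    return -EULER_GAMMA - math.log(2.0)

def var_log_eps2():
    """Var[log eps_t^2] for standard normal eps_t: pi^2 / 2."""
    return math.pi ** 2 / 2.0

def acf_log_squares(tau, phi, sigma_eta2, sigma_xi2=math.pi ** 2 / 2.0):
    """rho_tau^(0) from (3.2.9), with rho_{h,tau} = phi**tau for an AR(1) h_t."""
    sigma_h2 = sigma_eta2 / (1.0 - phi * phi)
    return phi ** tau / (1.0 + sigma_xi2 / sigma_h2)
```

Because $\sigma_\xi^2$ is large relative to typical values of $\sigma_h^2$, the autocorrelations of $\log y_t^2$ are small in magnitude even when $\phi$ is close to one, although they decay at the same rate $\phi$.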

3.3 Comparison with ARCH models

The GARCH(1,1) model has been applied extensively to financial time series. The variance in (3.1.1) is assumed to depend on the variance and squared observation in the previous time period. Thus

$$\sigma_t^2 = \gamma + \alpha y_{t-1}^2 + \beta \sigma_{t-1}^2, \qquad t = 1, \ldots, T. \tag{3.3.1}$$


The GARCH model was proposed by Bollerslev (1986) and Taylor (1986), and is a generalization of the ARCH model formulated by Engle (1982). The ARCH(1) model is a special case of GARCH(1,1) with $\beta = 0$. The motivation comes from forecasting; in an AR(1) model with independent disturbances, the optimal prediction of the next observation is a fraction of the current observation, and in ARCH(1) it is a fraction of the current squared observation (plus a constant). The reason is that the optimal forecast is constructed conditional on the current information and in an ARCH model the variance in the next period is assumed to be known. This construction leads directly to a likelihood function for the model once a distribution is assumed for $\varepsilon_t$. Thus estimation of the parameters upon which $\sigma_t^2$ depends is straightforward in principle. The GARCH formulation introduces terms analogous to moving average terms in an ARMA model, thereby making forecasts a function of a distributed lag of past squared observations.

It is straightforward to show that $y_t$ is a martingale difference with (unconditional) variance $\gamma/(1 - \alpha - \beta)$. Thus $\alpha + \beta < 1$ is the condition for covariance stationarity. As shown in Bollerslev (1986), the condition under which the fourth moment exists in a Gaussian model is $2\alpha^2 + (\alpha + \beta)^2 < 1$. The model then exhibits excess kurtosis. However, the fourth moment condition may not always be satisfied in practice. Somewhat paradoxically, the conditions for strict stationarity are much weaker and, as shown by Nelson (1990), even include the case $\alpha + \beta = 1$.

The specification of GARCH(1,1) means that we can write

$$y_t^2 = \gamma + \alpha y_{t-1}^2 + \beta \sigma_{t-1}^2 + \nu_t = \gamma + (\alpha + \beta) y_{t-1}^2 + \nu_t - \beta \nu_{t-1},$$

where $\nu_t = y_t^2 - \sigma_t^2$ is a martingale difference. Thus $y_t^2$ has the form of an ARMA(1,1) process and so its ACF can be evaluated in the same way. The ACF of the corresponding ARMA model seems to be indicative of the type of patterns likely to be observed in practice in correlograms of $y_t^2$.

The GARCH model extends by adding more lags of $\sigma_t^2$ and $y_t^2$. However, GARCH(1,1) seems to be the most widely used. It displays similar properties to the SV model, particularly if $\phi$ is close to one. This should be clear from (3.2.6), which has the pattern of an ARMA(1,1) process. Clearly $\phi$ plays a role similar to that of $\alpha + \beta$. The main difference in the ACFs seems to show up most at lag one. Jacquier et al. (1994, p. 373) present a graph of the correlogram of the squared weekly returns of a portfolio on the New York Stock Exchange together with the ACFs implied by fitting SV and GARCH(1,1) models. In this case the ACF implied by the SV model is closer to the sample values.

The SV model displays excess kurtosis even if $\phi$ is zero, since $y_t$ is a mixture of distributions. The $\sigma_\eta^2$ parameter governs the degree of mixing independently of the degree of smoothness of the variance evolution. This is not the case with a GARCH model, where the degree of kurtosis is tied to the roots of the variance equation, $\alpha$ and $\beta$ in the case of GARCH(1,1). Hence, it is very often necessary to use a non-Gaussian GARCH model to capture the high kurtosis typically found in a financial time series.

The basic GARCH model does not allow for the kind of asymmetry captured by an SV model with contemporaneously correlated disturbances, though it can be modified as suggested in Engle and Ng (1993). The EGARCH model, proposed by Nelson (1991), handles asymmetry by taking $\log \sigma_t^2$ to be a function of past squares and absolute values of the observations.
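The way kurtosis is tied to $\alpha$ and $\beta$ in GARCH(1,1) can be seen from a small calculation. The sketch below uses the stationarity and fourth-moment conditions quoted above, together with the standard Gaussian GARCH(1,1) kurtosis expression $3[1-(\alpha+\beta)^2]/[1-(\alpha+\beta)^2-2\alpha^2]$, which is not stated in the text and is included here as an assumption; the function name is illustrative.

```python
def garch11_summary(gamma, alpha, beta):
    """Unconditional moments of Gaussian GARCH(1,1), where they exist."""
    persistence = alpha + beta
    stationary = persistence < 1.0
    # Fourth moment exists when 2*alpha^2 + (alpha + beta)^2 < 1.
    fourth = 2.0 * alpha ** 2 + persistence ** 2 < 1.0
    variance = gamma / (1.0 - persistence) if stationary else None
    kurtosis = None
    if fourth:
        kurtosis = (3.0 * (1.0 - persistence ** 2)
                    / (1.0 - persistence ** 2 - 2.0 * alpha ** 2))
    return {
        "covariance_stationary": stationary,
        "variance": variance,
        "fourth_moment_exists": fourth,
        "kurtosis": kurtosis,
    }
```

For example, `garch11_summary(1.0, 0.5, 0.0)` describes an ARCH(1) model; in contrast, the SV kurtosis $3\exp(\sigma_h^2)$ can be varied through $\sigma_\eta^2$ without changing the persistence $\phi$.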

3.4 Filtering, Smoothing and Prediction

For the purposes of pricing options, we need to be able to estimate and predict the variance, $\sigma_t^2$, which, of course, is proportional to the exponential of $h_t$. An estimate based on all the observations up to, and possibly including, the one at time $t$ is called a filtered estimate. On the other hand, an estimate based on all the observations in the sample, including those which came after time $t$, is called a smoothed estimate. Predictions are estimates of future values. As a matter of historical interest we may wish to examine the evolution of the variance over time by looking at the smoothed estimates. These might be compared with the volatilities implied by the corresponding options prices, as discussed in section 2.1.3. For pricing `at the money' options we may be able to simply use the filtered estimate at the end of the sample and the predictions of future values of the variance, as in the method suggested for ARCH models by Noh, Engle and Kane (1994). More generally, it may be necessary to base prices on the full distribution of future values of the variance, perhaps obtained by simulation techniques; for further discussion see section 4.2.

One can think of constructing filtered and smoothed estimates in a very simple, but arbitrary, way by taking functions (involving estimated parameters) of moving averages of transformed observations. Thus:

$$\hat{\sigma}_t^2 = g\left\{ \sum_{j=r}^{t-1} w_{tj} f(y_{t-j}) \right\}, \qquad t = 1, \ldots, T, \tag{3.4.1}$$

where $r = 0$ or $1$ for a filtered estimate and $r = t - T$ for a smoothed estimate.

Since we have formulated a stochastic volatility model, the natural course of action is to use this as the basis for filtering, smoothing and prediction. For a linear and Gaussian time series model, the state space form can be used as the basis for optimal filtering and smoothing algorithms. Unfortunately, the SV model is nonlinear. This leaves us with three possibilities:

a. compute inefficient estimates based on a linear state space model;

b. use computer intensive techniques to estimate the optimal filter to a desired level of accuracy;

c. use an (unspecified) ARCH model to approximate the optimal filter.

We now turn to examine each of these in some detail.

3.4.1 Linear State Space Form

The transformed observations, the $\log y_t^2$'s, can be used to construct a linear state space model as suggested by Nelson (1988) and Harvey, Ruiz and Shephard (1994). The measurement equation is (3.2.8) while (3.1.3) is the transition equation. The initial conditions for the state, $h_t$, are given by its unconditional mean and variance, that is zero and $\sigma_\eta^2/(1 - \phi^2)$ respectively.

While it may be reasonable to assume that $\eta_t$ is normal, $\xi_t$ would only be normal if the absolute value of $\varepsilon_t$ were lognormal. This is unlikely. Thus application of the Kalman filter and the associated smoothers yields estimators of the state, $h_t$, which are only optimal within the class of estimators based on linear combinations of the $\log y_t^2$'s. Furthermore, it is not the $h_t$'s which are required, but rather their exponentials. Suppose $h_{t|T}$ denotes the smoothed estimator obtained from the linear state space form. Then $\exp(h_{t|T})$ is of the form (3.4.1), multiplied by an estimate of the scaling constant, $\sigma^2$. It can be written as a weighted geometric mean. This makes the estimates vulnerable to very small observations and is an indication of the limitations of this approach.
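The filtering recursions implied by this linear state space form can be sketched as follows. This is a minimal scalar Kalman filter, assuming the measurement equation (3.2.8), the transition equation (3.1.3), uncorrelated disturbances and known parameters; the function name and the returned Gaussian quasi-log-likelihood are illustrative choices rather than anything prescribed in the text.

```python
import math

def kalman_filter_log_squares(log_y2, omega, phi, sigma_eta2,
                              sigma_xi2=math.pi ** 2 / 2.0):
    """Return one-step-ahead predictions h_{t|t-1} and a Gaussian quasi-log-likelihood."""
    h_pred = 0.0                                  # unconditional mean of h_t
    p_pred = sigma_eta2 / (1.0 - phi * phi)       # unconditional variance of h_t
    preds, loglik = [], 0.0
    for x in log_y2:
        preds.append(h_pred)
        v = x - omega - h_pred                    # one-step prediction error
        f = p_pred + sigma_xi2                    # its variance
        loglik += -0.5 * (math.log(2.0 * math.pi * f) + v * v / f)
        k = phi * p_pred / f                      # Kalman gain
        h_pred = phi * h_pred + k * v
        p_pred = phi * phi * p_pred - k * k * f + sigma_eta2
    return preds, loglik
```

Since $\xi_t$ is not Gaussian, the output is optimal only among estimators that are linear in the $\log y_t^2$'s, and the likelihood is a quasi-likelihood; an estimate of the variance would then follow from exponentiating the filtered state and rescaling.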

Working with the logarithmic transformation raises an important practical issue, namely how to handle observations which are zero. This is a reflection of the point raised in the previous paragraph, since obviously any weighted geometric mean involving a zero observation will be zero. More generally we wish to avoid very small observations. One possible solution is to remove the sample mean. A somewhat more satisfactory alternative, suggested by Fuller, and studied by Breidt and Carriquiry (1995), is to make the following transformation based on a Taylor series expansion:


$$\log y_t^2 \cong \log(y_t^2 + c s_y^2) - c s_y^2/(y_t^2 + c s_y^2), \qquad t = 1, \ldots, T, \tag{3.4.2}$$

where $s_y^2$ is the sample variance of the $y_t$'s and $c$ is a small number, the suggested value being 0.02. The effect of this transformation is to reduce the kurtosis in the transformed observations by cutting down the long tail made up of the negative values obtained by taking the logarithms of the `inliers'. In other words, it is a form of trimming. It might be more satisfactory to carry out this procedure after correcting the observations for heteroskedasticity by dividing by preliminary estimates, the $\hat{\sigma}_t^2$'s. The $\log \hat{\sigma}_t^2$'s are then added to the transformed observations. The $\hat{\sigma}_t^2$'s could be constructed from a first round or by using a totally different procedure, perhaps a nonparametric one.
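The transformation (3.4.2) is simple to apply in practice. In the sketch below the sample variance is computed with divisor $T$, which is an assumption (the text does not specify the divisor), and the function name is illustrative.

```python
import math

def modified_log_squares(y, c=0.02):
    """Apply (3.4.2): log(y^2 + c*s2) - c*s2 / (y^2 + c*s2), with s2 the sample variance."""
    T = len(y)
    mean = sum(y) / T
    s2 = sum((v - mean) ** 2 for v in y) / T   # assumed divisor T
    out = []
    for v in y:
        d = v * v + c * s2
        out.append(math.log(d) - c * s2 / d)
    return out
```

Zero observations now map to a finite value, while observations that are large relative to $c s_y^2$ are left almost unchanged because the correction term cancels the first-order effect of the offset. The function still fails if every observation is identical and equal to zero, since then $s_y^2 = 0$.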

The linear state space form can be modified so as to deal with asymmetric models. It was noted earlier that even if $\eta_t$ and $\varepsilon_t$ are not mutually independent, the disturbances in the state space form are uncorrelated if the joint distribution of $\varepsilon_t$ and $\eta_t$ is symmetric. Thus the above filtering and smoothing operations are still valid, but there is a loss of information stemming from the squaring of the observations. Harvey and Shephard (1993) show that this information may be recovered by conditioning on the signs of the observations. These signs are, of course, the same as the signs of the $\varepsilon_t$'s. Let $E_+$ ($E_-$) denote the expectation conditional on $\varepsilon_t$ being positive (negative), and assign a similar interpretation to variance and covariance operators. The distribution of $\xi_t$ is not affected by conditioning on the signs of the $\varepsilon_t$'s, but, remembering that $E(\eta_t \mid \varepsilon_t)$ is an odd function of $\varepsilon_t$,

$$\mu^* = E_+(\eta_t) = E_+[E(\eta_t \mid \varepsilon_t)] = -E_-(\eta_t),$$

and

$$\gamma^* = \mathrm{Cov}_+(\eta_t, \xi_t) = E_+(\eta_t \xi_t) - E_+(\eta_t)E(\xi_t) = E_+(\eta_t \xi_t) = -\mathrm{Cov}_-(\eta_t, \xi_t),$$

because the expectation of $\xi_t$ is zero and

$$E_+(\eta_t \xi_t) = E_+[E(\eta_t \mid \varepsilon_t) \log \varepsilon_t^2] - \mu^* E(\log \varepsilon_t^2) = -E_-(\eta_t \xi_t).$$

Finally,

$$\mathrm{Var}_+ \eta_t = E_+(\eta_t^2) - [E_+(\eta_t)]^2 = \sigma_\eta^2 - \mu^{*2}.$$

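The linear state space form is now

$$
\begin{aligned}
\log y_t^2 &= \omega + h_t + \xi_t, \\
h_{t+1} &= \phi h_t + s_t \mu^* + \eta_t^*,
\end{aligned}
\qquad
\begin{pmatrix} \xi_t \\ \eta_t^* \end{pmatrix} \Big|\, s_t \sim ID\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_\xi^2 & \gamma^* s_t \\ \gamma^* s_t & \sigma_\eta^2 - \mu^{*2} \end{pmatrix} \right),
\tag{3.4.3}
$$

where $s_t$ denotes the sign of $y_t$ ($+1$ or $-1$).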

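The Kalman filter may still be initialized by taking $h_0$ to have mean zero and variance $\sigma_\eta^2/(1 - \phi^2)$.

The parameterization in (3.4.3) does not directly involve a parameter representing the correlation between $\varepsilon_t$ and $\eta_t$. The relationship between $\mu^*$ and $\gamma^*$ and the original parameters in the model can only be obtained by making a distributional assumption about $\varepsilon_t$ as well as $\eta_t$. When $\varepsilon_t$ and $\eta_t$ are bivariate normal with $\mathrm{Corr}(\varepsilon_t, \eta_t) = \rho$, $E(\eta_t \mid \varepsilon_t) = \rho \sigma_\eta \varepsilon_t$, and so

$$\mu^* = E_+(\eta_t) = \rho \sigma_\eta E_+(\varepsilon_t) = \rho \sigma_\eta \sqrt{2/\pi} = 0.7979\, \rho \sigma_\eta. \tag{3.4.4}$$

Furthermore,

$$\gamma^* = \rho \sigma_\eta E(|\varepsilon_t| \log \varepsilon_t^2) - 0.7979\, \rho \sigma_\eta E(\log \varepsilon_t^2) = 1.1061\, \rho \sigma_\eta. \tag{3.4.5}$$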


When $\varepsilon_t$ has a t-distribution, it can be written as $\zeta_t \omega_t^{-0.5}$, and $\zeta_t$ and $\eta_t$ can be regarded as having a bivariate normal distribution with correlation $\rho$, while $\omega_t$ is independent of both. To evaluate $\mu^*$ and $\gamma^*$ one proceeds as before, except that the initial conditioning is on $\zeta_t$ rather than on $\varepsilon_t$, and the required expressions are found to be exactly as in the Gaussian case.
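The numerical constants in (3.4.4) and (3.4.5) can be verified in closed form. The sketch below relies on two standard results for a standard normal variate that are not stated in the text and are therefore assumptions of the sketch: $E\log\varepsilon_t^2 = -\gamma_E - \log 2$ and $E(|\varepsilon_t| \log\varepsilon_t^2) = (2/\sqrt{2\pi})(\log 2 - \gamma_E)$, where $\gamma_E$ is Euler's constant. The function name is illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def leverage_constants():
    """Return the multipliers of rho * sigma_eta in mu* (3.4.4) and gamma* (3.4.5)."""
    mu_coef = math.sqrt(2.0 / math.pi)                     # E_+(eps_t) = E|eps_t|
    e_abs_log = (2.0 / math.sqrt(2.0 * math.pi)) * (math.log(2.0) - EULER_GAMMA)
    e_log = -EULER_GAMMA - math.log(2.0)                   # E log eps_t^2
    gamma_coef = e_abs_log - mu_coef * e_log
    return mu_coef, gamma_coef
```

Multiplying the two returned values by $\rho\sigma_\eta$ gives $\mu^*$ and $\gamma^*$ for a given correlation and volatility-of-volatility.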

The filtered estimate of the log volatility $h_t$, written as $h_{t+1|t}$, takes the form:

$$h_{t+1|t} = \phi h_{t|t-1} + \frac{\phi\,(p_{t|t-1} + \gamma^* s_t)}{p_{t|t-1} + 2\gamma^* s_t + \sigma_\xi^2}\left(\log y_t^2 - \omega - h_{t|t-1}\right) + s_t \mu^*,$$

where $p_{t|t-1}$ is the corresponding mean square error of $h_{t|t-1}$. If $\rho < 0$, then $\gamma^* < 0$, and the filtered estimator will behave in a similar way to the EGARCH model estimated by Nelson (1991), with negative observations causing bigger increases in the estimated log volatility than corresponding positive values.

3.4.2 Nonlinear Filters

In principle, an exact filter may be written down for the original (3.1.2) and (3.1.3), with the former taken as the measurement equation. Evaluating such a filter requires approximating a series of integrals by numerical methods. Kitagawa (1987) has proposed a general method for implementing such a filter and Watanabe (1993) has applied it to the SV model. Unfortunately, it appears to be so time consuming as to render it impractical with current computer technology.

As part of their Bayesian treatment of the model as a whole, Jacquier, Polson and Rossi (1994) show how it is possible to obtain smoothed estimates of the volatilities by simulation. What is required is the mean vector of the joint distribution of the volatilities conditional on the observations. However, because simulating this joint distribution is not a practical proposition, they decompose it into a set of univariate distributions in which each volatility is conditional on all the others. These distributions may be denoted $p(\sigma_t \mid \sigma_{-t}, y)$, where $\sigma_{-t}$ denotes all the volatilities apart from $\sigma_t$. What one would like to do is to sample from each of these distributions in turn, with the elements of $\sigma_{-t}$ set equal to their latest estimates, and repeat several thousand times. As such this is a Gibbs sampler. Unfortunately, there are difficulties. The Markov structure of the SV model may be exploited to write

$$p(\sigma_t \mid \sigma_{-t}, y) = p(\sigma_t \mid \sigma_{t-1}, \sigma_{t+1}, y_t) \propto p(y_t \mid h_t)\, p(h_t \mid h_{t-1})\, p(h_{t+1} \mid h_t),$$

but although the right hand side of the above expression can be written down explicitly, the density is not of a standard form and there is no analytic expression for the normalising constant. The solution adopted by Jacquier, Polson and Rossi is to employ a series of Metropolis accept/reject independence chains.

Kim and Shephard (1994) argue that the single mover algorithm employed by Jacquier, Polson and Rossi will be slow if $\phi$ is close to one and/or $\sigma_\eta^2$ is small. This is because $\sigma_t$ changes slowly; in fact when it is constant, the algorithm will not converge at all. Another approach, based on the linear state space form, is to capture the non-normal disturbance term in the measurement equation, $\xi_t$, by a mixture of normals. Watanabe (1993) suggested an approximate method based on a mixture of two moments. Kim and Shephard (1994) propose a multimove sampler based on the linear state space form. Blocks of the $h_t$'s are sampled, rather than taking them one at a time. The technique they use is based on mixing an appropriate number of normal distributions to get the required level of accuracy in approximating the disturbance in (3.2.7). Mahieu and Schotman (1994a) extend this approach by introducing more degrees of freedom in the mixture of normals, where the parameters are estimated rather than fixed a priori. Note that the distribution of the $\sigma_t$'s can be obtained from the simulated distribution of the $h_t$'s.

Jacquier, Polson and Rossi (1994, p. 416) argue that no matter how many mixture components are used in the Kim and Shephard method, the tail behavior of $\log \varepsilon_t^2$ can never be satisfactorily approximated. Indeed, they note that given the discreteness of the Kim and Shephard state space, not all states can have been visited enough in the small number of draws mentioned, i.e. the so-called inlier problem (see also section 3.4.1 and Nelson (1994)) is still present.

As a final point, it should be noted that when the hyperparameters are unknown, the simulated distribution of the state produced by the Bayesian approach allows for their sampling variability.

3.4.3 ARCH Models as Approximate Filters

The purpose here is to draw attention to a subject that will be discussed in greater detail in section 4.3. In an ARCH model the conditional variance is assumed to be an exact function of past observations. As pointed out by Nelson and Foster (1994, p. 32) this assumption is ad hoc on both economic and statistical grounds. However, because ARCH models are relatively easy to estimate, Nelson (1992) and Nelson and Foster (1994) have argued that a useful strategy is to regard them as filters which produce estimates of the conditional variance. Thus even if we believe we have a continuous time or discrete time SV model, we may decide to estimate a GARCH(1,1) model and treat the $\sigma_t^2$'s as an approximate filter, as in (3.4.1). Thus the estimate is a weighted average of past squared observations. It delivers an estimate of the mean of the distribution of $\sigma_t^2$, conditional on the observations at time $t-1$. As an alternative, the model suggested by Taylor (1986) and Schwert (1989), in which the conditional standard deviation is set up as a linear combination of the previous conditional standard deviation and the previous absolute value, could be used. This may be more robust to outliers as it is a linear combination of past absolute values.

Nelson and Foster derive an ARCH model which will give the closest approximation to the continuous time SV formulation (see section 4.3 for more details). This does not correspond to one of the standard models, though it is fairly close to EGARCH. For discrete time SV models the filtering theory is not as extensively developed. Indeed, Nelson and Foster point out that a change from stochastic differential equations to difference equations makes a considerable difference in the limit theorems and optimality theory. They study the case of near diffusions as an example to illustrate these differences.

3.5 Extensions of the Model

3.5.1 Persistence and Seasonality

The simplest nonstationary SV model has $h_t$ following a random walk. The dynamic properties of this model are easily obtained if we work in terms of the logarithmically transformed observations, $\log y_t^2$. All we have to do is first difference to give a stationary process. The untransformed observations are nonstationary, but the dynamic structure of the model will appear in the ACF of $|y_t/y_{t-1}|^c$, provided that $c < 0.5$.

The model is an alternative to IGARCH, that is (3.3.1) with $\alpha + \beta = 1$. The IGARCH model is such that the squared observations have some of the features of an integrated ARMA process and it is said to exhibit persistence; see Bollerslev and Engle (1993). However, its properties are not straightforward. For example, it must contain a constant, $\gamma$; otherwise, as Nelson (1990) has shown, $\sigma_t^2$ converges almost surely to zero and the model has the peculiar feature of being strictly stationary but not weakly stationary. The nonstationary SV model, on the other hand, can be analyzed on the basis that $h_t$ is a standard integrated of order one process.


Filtering and smoothing can be carried out within the linear state space framework, since $\log y_t^2$ is just a random walk plus noise. The initial conditions are handled in the same way as is normally done with nonstationary structural time series models, with a proper prior for the state being effectively formed from the first observation; see Harvey (1989). The optimal filtered estimate of $h_t$ within the class of estimates which are linear in past $\log y_t^2$'s, that is $h_{t|t-1}$, is a constant plus an exponentially weighted moving average (EWMA) of past $\log y_t^2$'s. In IGARCH, $\sigma_t^2$ is given exactly by a constant plus an EWMA of past squared observations.
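The EWMA interpretation follows from the steady state of the Kalman filter for a random walk plus noise. The sketch below assumes that the filter has reached its steady state and that the constant $\omega$ is absorbed into the level being tracked; the closed form for the gain is the standard local level result and the function names are illustrative.

```python
import math

def steady_state_gain(sigma_eta2, sigma_xi2):
    """Steady-state Kalman gain for x_t = level_t + xi_t, level_{t+1} = level_t + eta_t."""
    # p solves p = p - p^2 / (p + sigma_xi2) + sigma_eta2.
    p = 0.5 * (sigma_eta2 + math.sqrt(sigma_eta2 ** 2 + 4.0 * sigma_eta2 * sigma_xi2))
    return p / (p + sigma_xi2)

def ewma_filter(log_y2, gain, start):
    """One-step-ahead predictions of the level (omega + h_t) as an EWMA of past log y_t^2."""
    level = start
    preds = []
    for x in log_y2:
        preds.append(level)
        level = level + gain * (x - level)   # weights on past x decline geometrically
    return preds
```

With $\sigma_\xi^2 = 4.93$ and a small $\sigma_\eta^2$ the gain is small, so the filter averages over a long stretch of past observations.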

The random walk volatility can be replaced by other nonstationary specifications. One possibility is the doubly integrated random walk, in which $\Delta^2 h_t$ is white noise. When formulated in continuous time, this model is equivalent to a cubic spline and is known to give a relatively smooth trend when applied in levels models. It is attractive in the SV context if the aim is to find a weighting function which fits a smoothly evolving variance. However, it may be less stable for prediction.

Other nonstationary components can easily be brought into $h_t$. For example, a seasonal or intra-daily component can be included; the specification is exactly as in the corresponding levels models discussed in Harvey (1989) and Harvey and Koopman (1993). Again the dynamic properties are given straightforwardly by the usual transformation applied to $\log y_t^2$, and it is not difficult to transform the absolute values suitably. Thus if the volatility consists of a random walk plus a slowly changing, nonstationary seasonal as in Harvey (1989, p. 40-3), the appropriate transformations are $\Delta_s \log y_t^2$ and $|y_t/y_{t-s}|^c$, where $s$ is the number of seasons. The state space formulation follows along the lines of the corresponding structural time series models for levels. Handling such effects is not so easy within the GARCH framework.

Different approaches to seasonality can also be incorporated in SV models using ideas of time deformation, as discussed in a later sub-section. Such approaches may be particularly relevant when dealing with the kind of abrupt changes in seasonality which seem to occur in high frequency, like five minute or tick-by-tick, foreign exchange data.

3.5.2 Interventions and other deterministic effects

Intervention variables are easily incorporated into SV models. For example, a sudden structural change in the volatility process can be captured by assuming that

$$\log \sigma_t^2 = \log \sigma^2 + h_t + \lambda w_t,$$

where $w_t$ is zero before the break and one after, and $\lambda$ is an unknown parameter. The logarithmic transformation gives (3.2.8) but with $\lambda w_t$ added to the right hand side. Care needs to be taken when incorporating such effects into ARCH models. For example, in the GARCH(1,1) a sudden break has to be modelled as

$$\sigma_t^2 = \gamma + \lambda w_t - (\alpha + \beta)\lambda w_{t-1} + \alpha y_{t-1}^2 + \beta \sigma_{t-1}^2,$$

with $\lambda$ constrained so that $\sigma_t^2$ is always positive.

More generally, observable explanatory variables, as opposed to intervention dummies, may enter into the model for the variance.

3.5.3 Multivariate Models

The multivariate model corresponding to (3.1.2) assumes that each series is generated by a model of the form

$$y_{it} = \sigma_i \varepsilon_{it} e^{0.5 h_{it}}, \qquad t = 1, \ldots, T, \tag{3.5.1}$$

with the covariance (correlation) matrix of the vector $\varepsilon_t = (\varepsilon_{1t}, \ldots, \varepsilon_{Nt})'$ being denoted by $\Sigma_\varepsilon$. The vector of volatilities, $h_t$, follows a VAR(1) process, that is

$$h_{t+1} = \Phi h_t + \eta_t,$$

where $\eta_t \sim IID(0, \Sigma_\eta)$. This specification allows the movements in volatility to be correlated across different series via $\Sigma_\eta$. Interactions can be picked up by the off-diagonal elements of $\Phi$.

The logarithmic transformation of squared observations leads to a multivariate linear state space model from which estimates of the volatilities can be computed as in section 3.4.1.

A simple nonstationary model is obtained by assuming that the volatilities follow a multivariate random walk, that is $\Phi = I$. If $\Sigma_\eta$ is singular, of rank $K < N$, there are only $K$ components in volatility, that is each $h_{it}$ in (3.5.1) is a linear combination of $K < N$ common trends, that is

$$h_t = \Theta h_t^{\dagger} + \bar{h} \tag{3.5.2}$$

where $h_t^{\dagger}$ is the $K \times 1$ vector of common random walk volatilities, $\bar{h}$ is a vector of constants and $\Theta$ is an $N \times K$ matrix of factor loadings. Certain restrictions are needed on $\Theta$ and $\bar{h}$ to ensure identifiability; see Harvey, Ruiz and Shephard (1994). The logarithms of the squared observations are `co-integrated' in the sense of Engle and Granger (1987), since there are $N - K$ linear combinations of them which are white noise and hence stationary. This implies, for example, that if two series of returns exhibit stochastic volatility, but this volatility is the same with $\Theta' = (1, 1)$, then the ratio of the series will have no stochastic volatility. The application of the related concept of `co-persistence' can be found in Bollerslev and Engle (1993). However, as in the univariate case there is some ambiguity about what actually constitutes persistence.

There is no reason why the idea of common components in volatility should not extend to stationary models. The formulation of (3.5.2) would apply, without the need for $\bar{h}$, and with $h_t^{\dagger}$ modelled, for example, by a VAR(1).

Bollerslev, Engle and Wooldridge (1988) show that a multivariate GARCH model can, in principle, be estimated by maximum likelihood, but because of the large number of parameters involved computational problems are often encountered unless restrictions are made. The multivariate SV model is much simpler than the general formulation of a multivariate GARCH. However, it is limited in that it does not model changing covariances. In this sense it is analogous to the restricted multivariate GARCH model of Bollerslev (1986) in which the conditional correlations are assumed to be constant.

Harvey, Ruiz and Shephard (1994) apply the nonstationary model to four exchange rates and find just two common factors driving volatility. Another application is in Mahieu and Schotman (1994b). A completely different way of modelling exchange rate volatility is to be found in the latent factor ARCH model of Diebold and Nerlove (1989).

3.5.4 Observation intervals, aggregation and time deformation

Suppose that an SV model is observed every $\delta$ time periods. In this case, $h_\tau$, where $\tau$ denotes the new observation (sampling) interval, is still AR(1) but with parameter $\phi^\delta$. The variance of the disturbance, $\eta_t$, increases, but $\sigma_h^2$ remains the same. This property of the SV model makes it easy to make comparisons across different sampling intervals; for example it makes it clear why if $\phi$ is around 0.98 for daily observations, a value of around 0.9 can be expected if an observation is made every week (assuming a week has 5 days).
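The daily-to-weekly comparison can be checked with a few lines of arithmetic. The sketch below assumes the volatility equation has the scalar AR(1) form $h_{t+1} = \phi h_t + \eta_t$; the function name and the value chosen for the disturbance variance are illustrative and do not come from the text.

```python
def skip_sampled_ar1(phi, sigma2_eta, delta):
    """AR(1) parameters of h when it is observed only every `delta` periods."""
    phi_delta = phi ** delta
    # Disturbances accumulate over delta periods:
    # sigma2_eta * (1 + phi^2 + ... + phi^(2 * (delta - 1)))
    sigma2_eta_delta = sigma2_eta * (1 - phi ** (2 * delta)) / (1 - phi ** 2)
    # Marginal variance of h implied by the skip-sampled parameters
    sigma2_h = sigma2_eta_delta / (1 - phi_delta ** 2)
    return phi_delta, sigma2_eta_delta, sigma2_h


# Daily phi = 0.98, weekly sampling with 5 trading days (sigma2_eta is hypothetical)
phi_w, s2_eta_w, s2_h_w = skip_sampled_ar1(phi=0.98, sigma2_eta=0.02, delta=5)
print(round(phi_w, 3), s2_eta_w > 0.02, round(s2_h_w, 4))
```

The weekly autoregressive parameter comes out close to 0.9, the disturbance variance is larger than its daily value, and the implied $\sigma_h^2$ equals the daily one, $\sigma_\eta^2/(1 - \phi^2)$.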

If averages of observations are observed over the longer period, the comparison is more complicated, as $h_\tau$ will now follow an ARMA(1,1) process. However, the AR parameter is still $\phi^\delta$. Note that it is difficult to change the observation interval of ARCH processes unless the structure is weakened as in Drost and Nijman (1993); see also section 4.4.1.

Since, as noted in section 2.4, one typically uses a discrete time approximation to the continuous time model, it is quite straightforward to handle irregularly spaced observations by using the linear state space form as described, for example, in Harvey (1989). Indeed the approach originally proposed by Clark (1973) based on subordinated processes to describe asset prices and their volatility fits quite well into this framework. The techniques for handling irregularly spaced observations can be used as the basis for dealing with time deformed observations, as noted by Stock (1988). Ghysels and Jasiak (1994a, b) suggest an SV model in which the operational time for the continuous time volatility equation is determined by the flow of information. Such time deformed processes may be particularly suited to dealing with high frequency data. If $\tau = g(t)$ is the mapping from calendar time $t$ to operational time $\tau$, then

$$dS_t = \mu S_t\,dt + \sigma(g(t))\,S_t\,dW_{1t}$$

and

$$d\log\sigma(\tau) = a\,(b - \log\sigma(\tau))\,d\tau + c\,dW_{2\tau}$$

where $W_{1t}$ and $W_{2\tau}$ are standard, independent Wiener processes. The discrete time approximation generalizing (??), but including a term which in (??) is incorporated in the constant scale factor $\sigma$, is then

$$h_{t+1} = [1 - e^{-a\Delta g(t)}]\,b + e^{-a\Delta g(t)} h_t + \eta_t$$

where $\Delta g(t)$ is the change in operational time between two consecutive calendar time observations and $\eta_t$ is normally distributed with mean zero and variance $c^2(1 - e^{-2a\Delta g(t)})/2a$. Clearly if $\Delta g(t) = 1$, $\phi = e^{-a}$ in (??). Since the flow of information, and hence $\Delta g(t)$, is not directly observable, a mapping to calendar time must be specified to make the model operational. Ghysels and Jasiak (1994a) discuss several specifications revolving around a scaled exponential function relating $g(t)$ to observables such as past volume of trade and past price changes with asymmetric leverage effects. This approach was also used by Ghysels and Jasiak (1994b) to model return-volume co-movements and by Ghysels, Gouriéroux and Jasiak (1995b) for modeling intra-daily high frequency data which exhibit strong seasonal patterns (cf. section 3.5.1).
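To make the role of the operational-time increment concrete, the following sketch evaluates the coefficients of the discrete time approximation above for a given $\Delta g(t)$. The function name and the numerical values of $a$, $b$ and $c$ are hypothetical; only the formulas are taken from the text.

```python
import math


def time_deformed_step(a, b, c, dg):
    """Coefficients of h_{t+1} = intercept + phi * h_t + eta_t for an
    operational-time increment dg = Delta g(t)."""
    phi = math.exp(-a * dg)                    # autoregressive coefficient
    intercept = (1.0 - phi) * b                # [1 - exp(-a * dg)] * b
    var_eta = c ** 2 * (1.0 - math.exp(-2.0 * a * dg)) / (2.0 * a)
    return intercept, phi, var_eta


# Illustrative parameter values (assumptions, not estimates from the paper)
a, b, c = 0.05, -1.0, 0.2
for dg in (0.5, 1.0, 2.0):                     # slow, normal and fast information flow
    print(dg, time_deformed_step(a, b, c, dg))
```

A larger increment in operational time lowers the persistence seen in calendar time and raises the disturbance variance, which is how bursts of information flow translate into faster mean reversion of volatility.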

3.5.5 Long Memory

Baillie, Bollerslev and Mikkelsen (1993) propose a way of extending the GARCH class to account for long memory. They call their models Fractionally Integrated GARCH (FIGARCH), and the key feature is the inclusion of the fractional difference operator, $(1 - L)^d$, where $L$ is the lag operator, in the lag structure of past squared observations in the conditional variance equation. However, this model can only be stationary when $d = 0$ and it reduces to GARCH. In a later paper, Bollerslev and Mikkelsen (1995) consider a generalization of the EGARCH model of Nelson (1991) in which $\log\sigma_t^2$ is modelled as a distributed lag of past $\varepsilon_t$'s involving the fractional difference operator. This FIEGARCH model is stationary and invertible if $|d| < 0.5$.


Breidt, Crato and de Lima (1993) and Harvey (1993) propose an SV model with $h_t$ generated by fractional noise

$$h_t = \eta_t/(1 - L)^d, \qquad \eta_t \sim NID(0, \sigma_\eta^2), \qquad 0 \le d \le 1. \tag{3.5.1}$$

Like the AR(1) model in (??), this process reduces to white noise and a random walk at the boundary of the parameter space, that is $d = 0$ and $1$ respectively. However, it is only stationary if $d < 0.5$. Thus the transition from stationarity to nonstationarity proceeds in a different way to the AR(1) model. As in the AR(1) case it is reasonable to constrain the autocorrelations in (??) to be positive. However, a negative value of $d$ is quite legitimate and indeed differencing $h_t$ when it is nonstationary gives a stationary 'intermediate memory' process in which $-0.5 \le d \le 0$.

The properties of the long memory SV model can be obtained from the formulae in sub-section 3.2. A comparison of the ACF for $h_t$ following a long memory process with $d = 0.45$ and $\sigma_h^2 = 2$ with the corresponding ACF when $h_t$ is AR(1) with $\phi = 0.99$ can be found in Harvey (1993). Recall that a characteristic property of long memory is a hyperbolic rate of decay for the autocorrelations instead of an exponential rate, a feature observed in the data (see section 2.2(e)). The slower decline in the long memory model is very clear and, in fact, for $\tau = 1000$, the long memory autocorrelation is still 0.14, whereas in the AR case it is only 0.000013. The long memory shape closely matches that in Ding, Granger and Engle (1993, p86-8).
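The contrast between hyperbolic and exponential decay can be reproduced numerically. The sketch below uses the standard autocorrelation function of fractional noise and, as an assumption not stated in this passage, maps the autocorrelations of $h_t$ into those of $|y_t|$ for $y_t = \sigma\varepsilon_t e^{0.5 h_t}$ with Gaussian $\varepsilon_t$ independent of a Gaussian $h_t$. Under that assumption the values at lag 1000 are of the same order as the figures quoted above. The function names are illustrative.

```python
import math


def acf_fractional_noise(d, lag):
    """Autocorrelation at `lag` of h_t = eta_t / (1 - L)^d, for 0 < d < 0.5."""
    log_rho = (math.lgamma(1 - d) - math.lgamma(d)
               + math.lgamma(lag + d) - math.lgamma(lag + 1 - d))
    return math.exp(log_rho)


def acf_ar1(phi, lag):
    """Autocorrelation at `lag` of a stationary AR(1) process."""
    return phi ** lag


def acf_abs_returns(rho_h, sigma2_h):
    """ACF of |y_t| implied by rho_h (assumes Gaussian eps_t and h_t, independent)."""
    kappa = math.pi / 2  # E[eps^2] / (E|eps|)^2 for a standard normal eps
    return math.expm1(sigma2_h * rho_h / 4) / (kappa * math.exp(sigma2_h / 4) - 1)


lag, sigma2_h = 1000, 2.0
rho_lm = acf_fractional_noise(0.45, lag)   # hyperbolic decay
rho_ar = acf_ar1(0.99, lag)                # exponential decay
print(acf_abs_returns(rho_lm, sigma2_h))   # roughly 0.14
print(acf_abs_returns(rho_ar, sigma2_h))   # of order 1e-5
```

The log-gamma form avoids overflow in the gamma function at large lags; at lag 1 the long memory formula reduces to $d/(1-d)$, which gives a quick sanity check.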

The model may be extended by letting $\eta_t$ be an ARMA process and/or by adding more components to the volatility equation.

As regards smoothing and filtering, it has already been noted that the state space approach is approximate because of the truncation involved and is relatively cumbersome because of the length of the state vector. Exact smoothing and filtering, which is optimal within the class of estimators linear in the $\log y_t^2$'s, can be carried out by a direct approach if one is prepared to construct and invert the $T \times T$ covariance matrix of the $\log y_t^2$'s.

4 Continuous Time Models

At the end of section 2 we presented a framework for statistical modelling of SV in discrete time and devoted the entire section 3 to specific discrete time SV models. To motivate the continuous time models we study first of all the exact relationship (i.e. without approximation error) between differential equations and SV models in discrete time. We examine this relationship in section 4.1 via a class of statistical models which are closed under temporal aggregation and proceed (1) from high frequency discrete time to lower frequencies and (2) from continuous time to discrete time. Next, in section 4.2, we study option pricing and hedging with continuous time models and elaborate on features such as the smile effect. The practical implementation of option pricing formulae with SV often requires discrete time SV and/or ARCH models as filters and forecasters of the continuous time volatility processes. Such filters, covered in section 4.3, are in general discrete time approximations (and not exact discretizations as in section 4.1) of continuous time SV models. Section 4.4 concludes with extensions of the basic model.

4.1 From discrete to continuous time

The purpose of this section is to provide a rigorous discussion of the relationship between discrete and continuous time SV models. The presentation will proceed first with a discussion of temporal aggregation in the context of the SARV class of models and focus on specific cases including GARCH models. This material is covered in section 4.1.1. Next we turn our attention to the aggregation of continuous time SV models to yield discrete time representations. This is the subject matter of section 4.1.2.

4.1.1 Temporal Aggregation of Discrete Time Models

Andersen's SARV class of models was presented in section 2.4 as a general discrete time parametric SV statistical model. Let us consider the zero-mean case, namely:

$$y_{t+1} = \sigma_t\varepsilon_{t+1} \tag{4.1.1}$$

and $\sigma_t^q$ for $q = 1$ or $2$ is a polynomial function $g(K_t)$ of the Markov process $K_t$ with stationary autoregressive representation:

$$K_t = \omega + \beta K_{t-1} + \upsilon_t \tag{4.1.2}$$

where $|\beta| < 1$ and

$$\begin{aligned} E[\varepsilon_{t+1} \mid \varepsilon_\tau, \upsilon_\tau, \tau \le t] &= 0 \\ E[\varepsilon_{t+1}^2 \mid \varepsilon_\tau, \upsilon_\tau, \tau \le t] &= 1 \\ E[\upsilon_{t+1} \mid \varepsilon_\tau, \upsilon_\tau, \tau \le t] &= 0 \end{aligned} \tag{4.1.3}$$

The restrictions (4.1.3a-c) imply that $\upsilon$ is a martingale difference sequence with respect to the filtration $\mathcal{F}_t = \sigma[\varepsilon_\tau, \upsilon_\tau, \tau \le t]$. [21] Moreover, the conditional moment conditions in (4.1.3a-c) also imply that $\varepsilon$ in (4.1.1) is a white noise process in a semi-strong sense, i.e. $E[\varepsilon_{t+1} \mid \varepsilon_\tau, \tau \le t] = 0$ and $E[\varepsilon_{t+1}^2 \mid \varepsilon_\tau, \tau \le t] = 1$, and is not Granger caused by $\upsilon$. [22] From the very beginning of section 2 we chose the continuously compounded rate of return over a particular time horizon as the starting point for continuous time processes. Therefore, let $y_{t+1}$ in (4.1.1) be the continuously compounded rate of return for $[t, t+1]$ of the asset price process $S_t$; consequently:

$$y_{t+1} = \log S_{t+1}/S_t \tag{4.1.4}$$

Since the unit of time of the sampling interval is to a large extent arbitrary, we would surely want the SV model defined by equations (4.1.1) through (4.1.3) (for given $q$ and function $g$) to be closed under temporal aggregation. As rates of return are flow variables, closedness under temporal aggregation means that for any integer $m$:

$$y_{tm}^{(m)} \equiv \log S_{tm}/S_{tm-m} = \sum_{k=0}^{m-1} y_{tm-k}$$

is again conformable to a model of the type (4.1.1) through (4.1.3) for the same choice of $q$ and $g$ involving suitably adapted parameter values. The analysis in this section follows Meddahi and Renault (1995) who study temporal aggregation of SV models in detail, particularly the cases (1) $\sigma_t^2 = K_t$, i.e. $q = 2$ and $g$ is the identity function and (2) $\sigma_t^2 = \exp(K_t)$ which is the leading discrete time SV model. We will focus here on the former as it is related to the so-called continuous time GARCH approach of Drost and Werker (1994). Hence, we have (4.1.1) with:

$$\sigma_t^2 = \omega + \beta\sigma_{t-1}^2 + \upsilon_t \tag{4.1.5}$$

[21] Note that we do not use here the decomposition appearing in (2.4.9) namely, $\upsilon_t = [\gamma + a K_{t-1}]\,\tilde{u}_t$.

[22] The Granger noncausality considered here for $\varepsilon_t$ is weaker than Assumption 2.3.2.A as it applies only to the first two conditional moments.

With conditional moment restrictions (4.1.3a-c) this model is closed under aggregation. For instance, for $m = 2$:

$$y_{t+1}^{(2)} = y_{t+1} + y_t = \sigma_{t-1}^{(2)}\varepsilon_{t+1}^{(2)}$$

with:

$$\left(\sigma_{t-1}^{(2)}\right)^2 = w^{(2)} + \beta^{(2)}\left(\sigma_{t-3}^{(2)}\right)^2 + \upsilon_{t-1}^{(2)}$$

where:

$$\begin{cases} w^{(2)} = 2\omega(1 + \beta) \\ \beta^{(2)} = \beta^2 \\ \upsilon_{t-1}^{(2)} = (\beta + 1)\,[\beta\upsilon_{t-2} + \upsilon_{t-1}]. \end{cases}$$

Moreover, it is also worth noting that whenever a leverage effect is present at the aggregate level, i.e.:

$$\mathrm{Cov}\left[\upsilon_{t-1}^{(2)}, \varepsilon_{t-1}^{(2)}\right] \ne 0$$

with $\varepsilon_{t-1}^{(2)} = (y_{t-1} + y_{t-2})/\sigma_{t-3}^{(2)}$ and $\upsilon_{t-1}^{(2)} = (\beta + 1)(\beta\upsilon_{t-2} + \upsilon_{t-1})$, it necessarily appears at the disaggregate level, i.e. $\mathrm{Cov}(\upsilon_t, \varepsilon_t) \ne 0$.

For the general case Meddahi and Renault (1995) show that model (4.1.5a-b) together with conditional moment restrictions (4.1.3a-c) is a class of processes closed under aggregation. Given this result, it is of interest to draw a comparison with the work of Drost and Nijman (1993) on temporal aggregation of GARCH. While establishing this link between Meddahi and Renault (1995) and Drost and Nijman (1993) we will also uncover issues of leverage properties in GARCH models. Indeed, contrary to what is often believed, we will find leverage effect restrictions in so-called weak GARCH processes defined below. Moreover, we will also find from the results of Meddahi and Renault that the class of weak GARCH processes includes certain SV models.
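The two-period aggregation formulas can be verified numerically. The sketch below assumes that $(\sigma_{t-1}^{(2)})^2$ is the variance of $y_{t+1} + y_t$ conditional on information at $t-1$, which under (4.1.3) and (4.1.5) equals $\omega + (1+\beta)\sigma_{t-1}^2$; the parameter values and the uniform distribution of the volatility innovations are illustrative choices, not taken from the text.

```python
import random


def check_two_period_aggregation(omega=0.05, beta=0.9, n=200, seed=0):
    """Largest absolute error in the m = 2 recursion along one simulated path."""
    rng = random.Random(seed)
    # Small bounded shocks keep sigma2 positive in this illustration
    ups = [rng.uniform(-0.01, 0.01) for _ in range(n)]
    sigma2 = [omega / (1 - beta)]                 # start at the stationary mean
    for t in range(1, n):
        sigma2.append(omega + beta * sigma2[t - 1] + ups[t])
    # agg[t] plays the role of (sigma^{(2)}_t)^2 = omega + (1 + beta) * sigma2_t
    agg = [omega + (1 + beta) * s for s in sigma2]
    w2, beta2 = 2 * omega * (1 + beta), beta ** 2
    max_err = 0.0
    for t in range(2, n):
        ups2 = (1 + beta) * (beta * ups[t - 1] + ups[t])   # aggregated innovation
        max_err = max(max_err, abs(agg[t] - (w2 + beta2 * agg[t - 2] + ups2)))
    return max_err


print(check_two_period_aggregation())
```

The recursion holds path by path up to floating point rounding, so the reported error is at the level of machine precision.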

To find a class of GARCH processes which is closed under aggregation Drost and Nijman (1993) weakened the definition of GARCH, namely for a positive stationary process $h_t$:

$$h_t = w + a y_t^2 + b h_{t-1} \tag{4.1.6}$$

where $a + b < 1$, they defined:

• strong GARCH if $y_{t+1}/\sqrt{h_t}$ is i.i.d. with mean zero and variance 1

• semi-strong GARCH if $E[y_{t+1} \mid y_\tau, \tau \le t] = 0$ and $E[y_{t+1}^2 \mid y_\tau, \tau \le t] = h_t$

• weak GARCH if $EL[y_{t+1} \mid y_\tau, y_\tau^2, \tau \le t] = 0$; $EL[y_{t+1}^2 \mid y_\tau, y_\tau^2, \tau \le t] = h_t$. [23]

[23] For any Hilbert space $H$ of $L^2$, $EL[x_t \mid z, z \in H]$ is the best linear predictor of $x_t$ in terms of 1 and $z \in H$. It should be noted that a strong GARCH process is a fortiori semi-strong which itself is also a weak GARCH process.


Drost and Nijman show that weak GARCH processes temporally aggregate and provide explicit formulae for their coefficients. In section 2.4 it was noted that the framework of SARV includes GARCH processes whenever there is no randomness specific to the volatility process. This property will allow us to show that the class of weak GARCH processes, as defined above, in fact includes more general SV processes which are strictly speaking not GARCH. The arguments, following Meddahi and Renault (1995), require a classification of the models defined by (4.1.3) and (4.1.5) according to the value of the correlation between $\upsilon_t$ and $y_t^2$, namely:

(a) Models with perfect correlation: This first class, henceforth denoted $C_1$, is characterized by a linear correlation between $\upsilon_t$ and $y_t^2$ conditional on $(\varepsilon_\tau, \upsilon_\tau, \tau < t)$ which is either 1 or -1 for the model in (4.1.5a-b).

(b) Models without perfect correlation: This second class, henceforth denoted $C_2$, has the above conditional correlation less than one in absolute value.

The class $C_1$ contains all semi-strong GARCH processes; indeed, whenever $\mathrm{Var}[y_t^2 \mid \varepsilon_\tau, \upsilon_\tau, \tau < t]$ is proportional to $\mathrm{Var}[\upsilon_t \mid \varepsilon_\tau, \upsilon_\tau, \tau < t]$ in $C_1$ we have a semi-strong GARCH. Consequently, a semi-strong GARCH process is a model (4.1.5a-b) with (1) restrictions (4.1.3), (2) a perfect conditional correlation as in $C_1$, and (3) restrictions on the conditional kurtosis dynamics. [24]

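Let us consider now the following assumption:

Assumption 4.1.1: The following two conditional expectations are zero:

$$\begin{aligned} E[\varepsilon_t\upsilon_t \mid \varepsilon_\tau, \upsilon_\tau, \tau < t] &= 0 \\ E[\varepsilon_t^3 \mid \varepsilon_\tau, \upsilon_\tau, \tau < t] &= 0. \end{aligned} \tag{4.1.7}$$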

This assumption amounts to an absence of leverage effects, where the latter is defined in a conditional covariance sense to capture the notion of instantaneous causality discussed in section 2.4.1 and applied here in the context of weak white noise. [25] It should also parenthetically be noted that (4.1.7a) and (4.1.7b) are in general not equivalent except for the processes of class $C_1$.

The class $C_2$ allows for randomness proper to the volatility process due to the imperfect correlation. Yet, despite this volatility-specific randomness one can show that under Assumption 4.1.1 processes of $C_2$ satisfy the weak GARCH definition. A fortiori, any SV model conformable to (4.1.3a-c), (4.1.5a-b) and Assumption 4.1.1 is a weak GARCH process. It is indeed the symmetry assumption (4.1.7a-b), or restrictions on leverage in GARCH, that ensures that $EL[y_{t+1}^2 \mid y_\tau, y_\tau^2, \tau \le t] = \sigma_t^2$ (together with the conditional moment restrictions (4.1.3a-c)) and yields the internal consistency for temporal aggregation found by Drost and Nijman (1993, example 2, p.915) for the class of so-called symmetric weak GARCH(1,1). Hence, this class of weak GARCH(1,1) processes can be viewed as a subclass of processes satisfying (4.1.3) and (4.1.5). [26]

[24] In fact, Nelson and Foster (1994) observed that the most commonly used ARCH models effectively assume that the variance of the variance rises linearly in $\sigma_t^4$, which is the main drawback of ARCH models to approximate SV models in continuous time (see also section 4.3).

[25] The conditional expectation (4.1.7b) can be viewed as a conditional covariance between $\varepsilon_t$ and $\varepsilon_t^2$. It is this conditional covariance which, if nonzero, produces leverage effects in GARCH.

[26] As noted before, the class of processes satisfying (4.1.3) and (4.1.5) is closed under temporal aggregation, including processes with leverage effects not satisfying Assumption 4.1.1.


4.1.2 Temporal aggregation of continuous time models

To facilitate our discussion we will specialize the general continuous time model (2.3.1) to processes with zero drift, i.e.:

$$\begin{aligned} d\log S_t &= \sigma_t\,dW_t \\ d\sigma_t &= \gamma_t\,dt + \delta_t\,dW_t^\sigma \\ \mathrm{Cov}(dW_t, dW_t^\sigma) &= \rho_t\,dt \end{aligned} \tag{4.1.8}$$

where the stochastic processes $\sigma_t$, $\gamma_t$, $\delta_t$ and $\rho_t$ are $I_t^\sigma = [\sigma_\tau, \tau \le t]$ adapted. To ensure that $\sigma_t$ is a nonnegative process one typically follows either one of two strategies: (1) considering a diffusion for $\log\sigma_t^2$ or (2) describing $\sigma_t^2$ as a CEV process (or Constant Elasticity of Variance process following Cox (1975) and Cox and Ross (1976)). [27] The former is frequently encountered in the option pricing literature (see e.g. Wiggins (1987)) and is also clearly related to Nelson (1991), who introduced EGARCH, and to the log-Normal SV model of Taylor (1986). The second class of CEV processes can be written as

$$d\sigma_t^2 = k(\theta - \sigma_t^2)\,dt + \gamma(\sigma_t^2)^\delta\,dW_t^\sigma \tag{4.1.9}$$

where $\delta \ge \tfrac{1}{2}$ ensures that $\sigma_t^2$ is a stationary process with nonnegative values. Equation (4.1.9) can be viewed as the continuous time analogue of the discrete time SARV class of models presented in section 2.4. This observation establishes links with the discussion of the previous section 4.1.1 and yields exact discretization results of continuous time SV models. Here, as in the previous section, it will be tempting to draw comparisons with the GARCH class of models, in particular the diffusions proposed by Drost and Werker (1994) in line with the temporal aggregation of weak GARCH processes.

Firstly, one should note that the CEV process in (4.1.9) implies an autoregressive model in discrete time for $\sigma_t^2$, namely:

$$\sigma_{t+\Delta t}^2 = \theta\left(1 - e^{-k\Delta t}\right) + e^{-k\Delta t}\sigma_t^2 + e^{-k\Delta t}\int_t^{t+\Delta t} e^{k(u-t)}\,\gamma\,(\sigma_u^2)^\delta\,dW_u^\sigma \tag{4.1.10}$$

Meddahi and Renault (1995) show that whenever (4.1.9) and its discretization (4.1.10) govern volatility then the discrete time process $\log S_{t+(k+1)\Delta t}/S_{t+k\Delta t}$, $k \in \mathbb{Z}$, is an SV process satisfying the model restrictions (4.1.3a-c) and (4.1.5a-b). Hence, from the diffusion (4.1.9) we obtain the class of discrete time SV models which is closed under temporal aggregation, as discussed in the previous section. To be more specific, consider for instance $\Delta t = 1$, then from (4.1.10) it follows that:

$$\begin{aligned} y_{t+1} &= \log S_{t+1}/S_t = \sigma_t\varepsilon_{t+1} \\ \sigma_t^2 &= w + \beta\sigma_{t-1}^2 + \upsilon_t \end{aligned} \tag{4.1.10}$$

where from (4.1.10):

$$\begin{cases} \beta = e^{-k}, \quad w = \theta\left(1 - e^{-k}\right), \\ \upsilon_{t+1} = e^{-k}\int_t^{t+1} e^{k(u-t)}\,\gamma\,(\sigma_u^2)^\delta\,dW_u^\sigma. \end{cases} \tag{4.1.11}$$

[27] Occasionally one encounters specifications which do not ensure nonnegativity of the $\sigma_t$ process. For the sake of computational simplicity some authors for instance have considered Ornstein-Uhlenbeck processes for $\sigma_t$ or $\sigma_t^2$ (see e.g. Stein and Stein (1991)).

It is important to note from (4.1.11) that absence of leverage effect in continuous time, i.e. $\rho_t = 0$ in (4.1.8c), means no such effect at low frequencies and the two symmetry conditions of Assumption 4.1.1 are fulfilled. This line of reasoning also explains the temporal aggregation result of Drost and Werker (1994), but as noted in the previous section does not require absence of leverage. Indeed, following Meddahi and Renault (1995) one can interpret discrete time SV models with leverage effects as exact discretizations of continuous time SV models with CEV diffusions for volatility.
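The mapping from the drift parameters of the CEV diffusion to the autoregressive coefficients in (4.1.11) is simple enough to tabulate. The sketch below is a minimal illustration, assuming only the drift $k(\theta - \sigma_t^2)$ of (4.1.9); the function name and the parameter values are hypothetical. It also shows that sampling $\sigma_t^2$ twice as coarsely squares the autoregressive coefficient while leaving the unconditional mean at $\theta$.

```python
import math


def cev_exact_discretization(k, theta, dt=1.0):
    """Intercept w and slope beta of the AR(1) for sigma^2 sampled every dt."""
    beta = math.exp(-k * dt)
    w = theta * (1.0 - beta)
    return w, beta


k, theta = 0.1, 0.04                 # illustrative mean reversion speed and level
w1, b1 = cev_exact_discretization(k, theta, dt=1.0)
w2, b2 = cev_exact_discretization(k, theta, dt=2.0)
print(b1, b2, b1 ** 2)               # b2 coincides with b1 squared
print(w1 / (1 - b1), w2 / (1 - b2))  # both equal theta
```

Because the discretization is exact, no step size restriction is needed: the same two formulas apply whatever the sampling interval.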

4.2 Option pricing and hedging

Section 4.2.1 is devoted to the basic option pricing model with SV or Hull and White model. This was introduced in section 2 but we are better equipped now to elaborate on its theoretical foundations. The practical implications appear in section 4.2.2 while 4.2.3 concludes with some extensions of the basic model.

4.2.1 The Basic Option Pricing Formula

Consider again formula (2.1.8) for a European option contract maturing at time $t + h = T$. As noted in section 2.1.2, we assume continuous and frictionless trading. Moreover no arbitrage profits can be made from trading in the underlying asset and riskless bonds; interest rates are nonstochastic so that $B(t, T)$ defined by (2.1.12) denotes the time $t$ price of a unit discount bond maturing at time $T$. Consider now the probability space $(\Omega, \mathcal{F}, P)$, which is the fundamental space of the underlying asset price process $S$:

$$\begin{aligned} dS_t/S_t &= \mu(t, S_t, U_t)\,dt + \sigma_t\,dW_t^S \\ \sigma_t^2 &= f(U_t) \\ dU_t &= a(t, U_t)\,dt + b(t, U_t)\,dW_t^\sigma \end{aligned} \tag{4.2.1}$$

where $W_t = (W_t^S, W_t^\sigma)$ is a standard two dimensional Brownian Motion ($W_t^S$ and $W_t^\sigma$ are independent, zero-mean and unit variance) defined on $(\Omega, \mathcal{F}, P)$. The function $f$, called the volatility function, is assumed to be one-to-one. In this framework (under suitable regularity conditions) the no free lunch assumption is equivalent to the existence of a probability distribution $Q$ on $(\Omega, \mathcal{F})$, equivalent to $P$, under which discounted price processes are martingales (see Harrison and Kreps (1979)). Such a probability is called an equivalent martingale measure and is unique if and only if the markets are complete (see Harrison and Pliska (1981)). [28] From the integral form of martingale representations (see Karatzas and Shreve (1988), problem 4.16, page 184), the (positive) density process of any probability measure $Q$ equivalent to $P$ can be written as:

$$M_t = \exp\left(-\int_0^t \lambda_u^S\,dW_u^S - \frac{1}{2}\int_0^t (\lambda_u^S)^2\,du - \int_0^t \lambda_u^\sigma\,dW_u^\sigma - \frac{1}{2}\int_0^t (\lambda_u^\sigma)^2\,du\right) \tag{4.2.2}$$

[28] Here, the market is seen as incomplete (before taking into account the market pricing of the option) so that we have to characterize a set of equivalent martingale measures.


where the processes $\lambda^S$ and $\lambda^\sigma$ are adapted to the natural filtration $\mathcal{F}_t = \sigma[W_\tau, \tau \le t]$, $t \ge 0$, and satisfy the integrability conditions (almost surely):

$$\int_0^t (\lambda_u^S)^2\,du < +\infty \quad\text{and}\quad \int_0^t (\lambda_u^\sigma)^2\,du < +\infty$$

By Girsanov's theorem the process $\tilde{W} = (\tilde{W}^S, \tilde{W}^\sigma)'$ defined by:

$$\tilde{W}_t^S = W_t^S + \int_0^t \lambda_u^S\,du \quad\text{and}\quad \tilde{W}_t^\sigma = W_t^\sigma + \int_0^t \lambda_u^\sigma\,du \tag{4.2.3}$$

is a two dimensional Brownian Motion under $Q$. The dynamics of the underlying asset price under $Q$ are obtained directly from (4.2.1) and (4.2.3). Moreover, the discounted asset price process $S_t B(0, t)$, $0 \le t \le T$, is a $Q$-martingale if and only if for $r_t$ defined in (2.1.11):

$$\lambda_t^S = \frac{\mu(t, S_t, U_t) - r_t}{\sigma_t} \tag{4.2.4}$$

Since $S$ is the only traded asset, the process $\lambda^\sigma$ is not fixed. The process $\lambda^S$ defined by (4.2.4) is called the asset risk premium. By analogy, any process $\lambda^\sigma$ satisfying the required integrability condition can be viewed as a volatility risk premium and for any choice of $\lambda^\sigma$, the probability $Q(\lambda^\sigma)$ defined by the density process $M$ in (4.2.2) is an equivalent martingale measure. Therefore, given the volatility risk premium process $\lambda^\sigma$:

$$C_t^{\lambda^\sigma} = B(t, T)\,E_t^{Q(\lambda^\sigma)}\left[\mathrm{Max}[0, S_T - K]\right], \qquad 0 \le t \le T \tag{4.2.5}$$

is an admissible price process of the European call option. [29]

The Hull and White option pricing model relies on the following assumption, which restricts the set of equivalent martingale measures:

Assumption 4.2.1: The volatility risk premium $\lambda_t^\sigma$ only depends on the current value of the volatility process: $\lambda_t^\sigma = \lambda^\sigma(t, U_t)$, $\forall t \in [0, T]$.

This assumption is consistent with an intertemporal equilibrium model where the agent preferences are described by time separable isoelastic utility functions (see He (1993) and Pham and Touzi (1993)). It ensures that $\tilde{W}^S$ and $\tilde{W}^\sigma$ are independent, so that the $Q(\lambda^\sigma)$ distribution of $\log S_T/S_t$, conditionally on $\mathcal{F}_t$ and the volatility path $(\sigma_t, 0 \le t \le T)$, is normal with mean $\int_t^T r_u\,du - \tfrac{1}{2}\gamma^2(t, T)$ and variance $\gamma^2(t, T) = \int_t^T \sigma_u^2\,du$. Under Assumption 4.2.1 one can compute the expectation in (4.2.5) conditionally on the volatility path, and we obtain finally:

$$C_t^{\lambda^\sigma} = S_t\,E_t^{Q(\lambda^\sigma)}\left[\Phi(d_1) - e^{-x_t}\Phi(d_2)\right] \tag{4.2.6}$$

with the same notation as in (2.1.20). To conclude it is worth noting that many option pricing formulae available in the literature have a feature common with (4.2.6) as they can be expressed as an expectation of the Black-Scholes price over a heterogeneity distribution of the volatility parameter (see Renault (1995) for an elaborate discussion on this subject).
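Formula (4.2.6) lends itself to a simple Monte Carlo illustration: simulate volatility paths, compute the integrated variance along each path, and average the Black-Scholes price. The sketch below rests on several assumptions that are not in the text: a constant interest rate $r$, a log-volatility that follows an Ornstein-Uhlenbeck process under $Q$ with hypothetical parameters `a`, `b`, `c`, an Euler scheme, and the reading $x_t = \log(S_t/(K B(t,T)))$, $d_1 = x_t/\gamma + \gamma/2$, $d_2 = d_1 - \gamma$ for the notation of (2.1.20).

```python
import math
import random


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bs_call_from_total_vol(S, K, B, gamma):
    """Black-Scholes call in terms of gamma = sqrt(integrated variance)."""
    x = math.log(S / (K * B))
    d1 = x / gamma + 0.5 * gamma
    d2 = d1 - gamma
    return S * (norm_cdf(d1) - math.exp(-x) * norm_cdf(d2))


def hull_white_call_mc(S, K, r, T, sigma0, a, b, c,
                       n_paths=2000, n_steps=50, seed=0):
    """Average the BS price over simulated paths of log sigma (Euler scheme)."""
    rng = random.Random(seed)
    dt = T / n_steps
    B = math.exp(-r * T)                       # discount bond price, constant rate
    total = 0.0
    for _ in range(n_paths):
        log_sigma, int_var = math.log(sigma0), 0.0
        for _ in range(n_steps):
            int_var += math.exp(2.0 * log_sigma) * dt   # left-point rule
            shock = rng.gauss(0.0, 1.0)
            log_sigma += a * (b - log_sigma) * dt + c * math.sqrt(dt) * shock
        total += bs_call_from_total_vol(S, K, B, math.sqrt(int_var))
    return total / n_paths


price = hull_white_call_mc(S=100.0, K=100.0, r=0.05, T=1.0,
                           sigma0=0.2, a=2.0, b=math.log(0.2), c=0.5)
print(price)
```

Setting `c = 0` with `b = log(sigma0)` switches off the volatility randomness, in which case the average collapses to the ordinary Black-Scholes price; this gives a convenient check of the implementation.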

[29] Here and elsewhere $E_t^Q(\cdot) = E^Q(\cdot \mid \mathcal{F}_t)$ stands for the conditional expectation operator given $\mathcal{F}_t$ when the price dynamics are governed by $Q$.


4.2.2 Pricing and Hedging with the Hull and White model

The Markov feature of the process $(S, \sigma)$ implies that the option price (4.2.6) only depends on the contemporaneous values of the underlying asset price and its volatility. Moreover, under mild regularity conditions, this function is differentiable. Therefore, a natural way to solve the hedging problem in this stochastic volatility context is to hedge a given option of price $C_t^1$ by $\Delta_t^*$ units of the underlying asset and $\Sigma_t^*$ units of any other option of price $C_t^2$ where the hedging ratios solve:

$$\begin{cases} \partial C_t^1/\partial S_t - \Delta_t^* - \Sigma_t^*\,\partial C_t^2/\partial S_t = 0 \\ \partial C_t^1/\partial\sigma_t - \Sigma_t^*\,\partial C_t^2/\partial\sigma_t = 0 \end{cases} \tag{4.2.7}$$

Such a procedure, known as the delta-sigma hedging strategy, has been studied by Scott (1991). By showing that any European option completes the market, i.e. $\partial C_t^2/\partial\sigma_t \ne 0$, $0 \le t \le T$, Bajeux and Rochet justify the existence of a unique solution to the delta-sigma hedging problem (4.2.7) and the implicit assumption in the previous sections that the available information $I_t$ contains the past values $(S_\tau, \sigma_\tau)$, $\tau \le t$.

Nevertheless, in practice, option traders focus on the risk due to the underlying asset price variations and consider the imperfect hedging strategy $\Sigma_t = 0$ and $\Delta_t = \partial C_t^1/\partial S_t$. Then, the Hull and White option pricing formula (4.2.6) provides directly the theoretical value of $\Delta_t$:

$$\Delta_t = \partial C_t^{\lambda^\sigma}/\partial S_t = E_t^{Q(\lambda^\sigma)}\,\Phi(d_1) \tag{4.2.8}$$
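The delta-sigma system (4.2.7) is triangular, so the two hedge ratios follow by back-substitution: the second equation fixes the position in the second option, and the first then fixes the position in the underlying asset. The sketch below illustrates this with hypothetical sensitivities; the function and argument names are not from the text.

```python
def delta_sigma_hedge(dC1_dS, dC1_dsigma, dC2_dS, dC2_dsigma):
    """Solve (4.2.7) for the units of the underlying and of the second option."""
    if dC2_dsigma == 0.0:
        raise ValueError("second option has no volatility sensitivity: no solution")
    n_option = dC1_dsigma / dC2_dsigma        # second equation of (4.2.7)
    n_asset = dC1_dS - n_option * dC2_dS      # first equation of (4.2.7)
    return n_asset, n_option


# Hypothetical price and volatility sensitivities of the two options
n_asset, n_option = delta_sigma_hedge(dC1_dS=0.6, dC1_dsigma=30.0,
                                      dC2_dS=0.4, dC2_dsigma=20.0)
print(n_asset, n_option)
```

The delta-only strategy used in practice corresponds to forcing the option position to zero and holding `dC1_dS` units of the asset, which leaves the volatility exposure `dC1_dsigma` unhedged.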

This theoretical value is hard to use in practice since: (1) even if we knew the $Q(\lambda^\sigma)$ conditional probability distribution of $d_1$ given $I_t$ (summarised by $\sigma_t$), the derivation of the expectation (4.2.8) might be computationally intensive and (2) the conditional probability is directly related to the conditional probability distribution of $\gamma^2(t, T) = \int_t^T \sigma_u^2\,du$ given $\sigma_t$, which in turn may involve nontrivially the parameters of the latent process $\sigma_t$. Moreover, these parameters are those of the conditional probability distribution of $\gamma^2(t, T)$ given $\sigma_t$ under the risk-neutral probability $Q(\lambda^\sigma)$ which is generally different from the Data Generating Process $P$. The statistical inference issues are therefore quite complicated. We will argue in section 5 that only tools like simulation-based inference methods involving both asset and option prices data (via an option pricing model) may provide some satisfactory solutions.

Nevertheless, a practical way to avoid these complications is to use the Black-Scholes option pricing model, even though it is known to be misspecified. Indeed, option traders know that they cannot generally obtain sufficiently accurate option prices and hedge ratios by using the BS formula with historical estimates of the volatility parameters based on time series of the underlying asset price. However, the concept of Black-Scholes implied volatility (2.1.23) is known to improve the pricing and hedging properties of the BS model. This raises two issues: (1) what is the internal consistency of the simultaneous use of the BS model (which assumes constant volatility) and of BS implied volatility which is clearly time-varying and stochastic and (2) how to exploit the panel structure of option pricing errors? [30]

Concerning the first issue, we noted in section 2 that the Hull and White option pricing model can indeed be seen as a theoretical foundation for this practice of pricing. Hedging issues and the panel structure of option pricing errors are studied in detail in Renault and Touzi (1992) and Renault (1995).

[30] The value of $\sigma$ which equates the BS formula to the observed market price of the option heavily depends on the actual date $t$, the strike price $K$, the time to maturity $(T - t)$ and therefore creates a panel data structure.


4.2.3 Smile or Smirk?

As noted in section 2.2, the smile effect is now a well documented empirical stylized fact. Moreover the smile sometimes becomes a smirk since it appears more or less lopsided (the so-called skewness effect). We cautioned in section 2 that some explanations of the smile/smirk effect are often founded on tempting analogies rather than on rigorous proofs.

To the best of our knowledge, the state of the art is the following: (i) the first formal proof that a Hull and White option pricing formula implies a symmetric smile was provided by Renault and Touzi (1992), (ii) the first complete proof that the smile/smirk effects can alternatively be explained by liquidity problems (the upper parts of the smile curve, i.e. the most expensive options are the least liquid) was provided by Platten and Schweizer (1994) using a microstructure model, (iii) there is no formal proof that asymmetries of the probability distribution of the underlying asset price process (leverage effect, non-normality, ...) are able to capture the observed skewness of the smile. A different attempt to explain the observed skewness is provided by Renault (1995). He showed that a slight discrepancy between the underlying asset price $\tilde{S}_t$ used to infer BS implied volatilities and the stock price $S_t$ considered by option traders may generate an empirically plausible skewness in the smile. Such nonsynchronous $\tilde{S}_t$ and $S_t$ may be related to various issues: bid-ask spreads, non-synchronous trading between the two markets, forecasting strategies based on the leverage effect, etc.

Finally, to conclude it is also worth noting that a new approach initiated by Gouriéroux, Monfort, Tenreiro (1994) and followed also by Ait-Sahalia, Bickel, Stoker (1994) is to explain the BS implied volatility using a nonparametric function of some observed state variables. Gouriéroux, Monfort, Tenreiro (1995) obtain for example a good nonparametric fit of the following form:

$$\sigma_t(S_t, K) = a(K) + b(K)\,(\log S_t/S_{t-1})^2.$$

A classical smile effect is directly observed on the intercept $a(K)$ but an inverse smile effect appears for the path-dependent effect parameter $b(K)$. For American options a different nonparametric approach is pursued by Broadie, Detemple, Ghysels and Torrès (1995) where besides volatility also exercise boundaries for the option contracts are nonparametrically obtained. [31]

4.3 Filtering and Discrete Time Approximations

In section 3.4.3 it was noted that the ARCH class of models could be viewed as filters to extract the (continuous time) conditional variance process from discrete time data. Several papers were devoted to the subject, namely Nelson (1990, 1992, 1995a,b) and Nelson and Foster (1994, 1995). It was one of Nelson's seminal contributions to bring together ARCH and continuous time SV. Nelson's first contribution in his 1990 paper was to show that ARCH models, which model volatility as functions of past (squared) returns, converge weakly to a diffusion process, either a diffusion for $\log\sigma_t^2$ or a CEV process as described in section 4.1.2. In particular, it was shown that a GARCH(1,1) model observed at finer and finer time intervals $\Delta t = h$ with conditional variance parameters $\omega_h = h\omega$, $\alpha_h = \alpha(h/2)^{1/2}$ and $\beta_h = 1 - \alpha(h/2)^{1/2} - \theta h$ and conditional mean $\mu_h = hc\sigma_t^2$ converges to a diffusion limit quite similar to equations (4.1.8a) combined with (4.1.9) with $\delta = 1$, namely

$$d\log S_t = c\sigma_t^2\,dt + \sigma_t\,dW_t$$

$$d\sigma_t^2 = (\omega - \theta\sigma_t^2)\,dt + \sigma_t^2\,dW_t^{\sigma} .$$

31 See also Bossaerts and Hillion (1995) for the use of a nonparametric hedging procedure and the smile effect.


Similarly, it was also shown that a sequence of AR(1)-EGARCH(1,1) models converges weakly to an Ornstein-Uhlenbeck diffusion for $\ln\sigma_t^2$:

$$d\ln\sigma_t^2 = \alpha\left(\beta - \ln\sigma_t^2\right)dt + dW_t^{\sigma} .$$

Hence, these basic insights showed that the continuous time stochastic differential equations emerging as diffusion limits of ARCH models were no longer ARCH but instead SV models. Moreover, following Nelson (1992), ARCH models kept desirable properties for extracting the continuous time volatility even when misspecified. The argument was that for a wide variety of misspecified ARCH models the difference between the ARCH filter volatility estimates and the true underlying diffusion volatilities converges to zero in probability as the length of the sampling time interval goes to zero at an appropriate rate. For instance the GARCH(1,1) model with $\omega_h$, $\alpha_h$ and $\beta_h$ described before estimates $\hat\sigma_t^2$ as follows:

$$\hat\sigma_t^2 = \omega_h(1-\beta_h)^{-1} + \sum_{i=0}^{\infty}\alpha_h\,\beta_h^{i}\,y^2_{t-h(i+1)}$$

where $y_t = \log S_t/S_{t-h}$. This filter can be viewed as a particular case of equation (3.4.1). The GARCH(1,1) and many other models effectively achieve consistent estimation of $\sigma_t$ via a lag polynomial function of past squared returns close to time $t$.

The fact that a wide variety of misspecified ARCH models consistently extract $\sigma_t$ from high frequency data raises questions regarding the efficiency of filters. The answers to such questions are provided in Nelson (1995a,b) and Nelson and Foster (1994, 1995). In section 3.4 it was noted that the linear state space Kalman filter can also be viewed as a (suboptimal) extraction filter for $\sigma_t$. Nelson and Foster (1994) show that the asymptotically optimal linear Kalman filter has asymptotic variance for the normalized estimation error $h^{-1/4}[\ln(\hat\sigma_t^2) - \ln\sigma_t^2]$ equal to $\lambda\,\Psi(1/2)^{1/2}$, where $\Psi(x) = d[\ln\Gamma(x)]/dx$ and $\lambda$ is a scaling factor. A model closely related to EGARCH, of the following form:

$$\ln(\hat\sigma^2_{t+h}) = \ln(\hat\sigma_t^2) + \rho\lambda\,(S_{t+h}-S_t)\,\hat\sigma_t^{-1} + \lambda(1-\rho^2)^{1/2}\left[\Gamma(1/2)^{1/2}\,\Gamma(3/2)^{1/2}\,|S_{t+h}-S_t|\,\hat\sigma_t^{-1} - 2^{-1/2}\right]$$

yields the asymptotically optimal ARCH filter with asymptotic variance for the normalized estimation error equal to $\lambda[2(1-\rho^2)]^{1/2}$, where the parameter $\rho$ measures the leverage effect. These results also show that the differences between the most efficient suboptimal Kalman filter and the optimal ARCH filter can be quite substantial.

Besides filtering one must also deal with smoothing and forecasting. Both of these issues were discussed in section 3.4 in discrete time SV models. The prediction properties of (misspecified) ARCH models were studied extensively by Nelson and Foster (1995). Nelson (1995) takes ARCH models a step further by studying smoothing filters, i.e. ARCH models involving not only lagged squared returns but also future realizations, i.e. $r = t - T$ in equation (3.4.1).
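The GARCH(1,1) filter discussed in this section can be illustrated with a minimal sketch. It assumes that returns before the start of the sample are treated as zero, and it uses made-up parameter values and simulated returns purely for illustration; the function names are hypothetical. The sketch checks that the lag polynomial form of the filter coincides with the familiar one-step recursion $\hat\sigma^2_{t} = \omega_h + \beta_h\hat\sigma^2_{t-h} + \alpha_h y^2_{t-h}$ started at $\omega_h/(1-\beta_h)$.

```python
import math
import random

def garch_filter_lag_sum(y, omega_h, alpha_h, beta_h):
    """Variance estimates from the lag polynomial in past squared returns.

    Entry t uses y[0], ..., y[t-1]; pre-sample returns are set to zero
    (an assumption made only for this illustration).
    """
    base = omega_h / (1.0 - beta_h)
    estimates = []
    for t in range(len(y) + 1):
        lag_sum = sum(alpha_h * beta_h ** i * y[t - 1 - i] ** 2 for i in range(t))
        estimates.append(base + lag_sum)
    return estimates

def garch_filter_recursive(y, omega_h, alpha_h, beta_h):
    """Same filter written as a one-step recursion."""
    s2 = [omega_h / (1.0 - beta_h)]
    for r in y:
        s2.append(omega_h + beta_h * s2[-1] + alpha_h * r * r)
    return s2

# Illustrative (hypothetical) values; the scaling in h follows the text.
h = 0.01
omega, alpha, theta = 0.1, 0.3, 0.5
omega_h = h * omega
alpha_h = alpha * math.sqrt(h / 2.0)
beta_h = 1.0 - alpha_h - theta * h

rng = random.Random(0)
y = [0.2 * math.sqrt(h) * rng.gauss(0.0, 1.0) for _ in range(200)]

by_sum = garch_filter_lag_sum(y, omega_h, alpha_h, beta_h)
by_rec = garch_filter_recursive(y, omega_h, alpha_h, beta_h)
print(max(abs(a - b) for a, b in zip(by_sum, by_rec)))
```

The recursion is what one would use in practice, since it costs one update per observation, while the lag sum makes the "weighted average of past squared returns" interpretation explicit.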

4.4 Long Memory

We conclude this section with a brief discussion of long memory in continuous time SV models. The purpose is to build continuous time long memory stochastic volatility models which are relevant for high frequency financial data and for (long term) option pricing. The reasons motivating the use of long memory models were discussed in sections 2.2 and 3.5.5. The advantage of continuous time long memory models is their relative ability to provide a more structural interpretation of the parameters governing short term and long term dynamics. The first subsection defines fractional Brownian Motion. Next we turn our attention to the fractional SV model, followed by a section on filtering and discrete time approximations.


4.4.1 Stochastic integration with respect to fractional Brownian Motion

We recall in this subsection a few definitions and properties of fractional and long memory processes in continuous time, extensively studied for instance in Comte and Renault (1993). Consider the scalar process:

$$x_t = \int_0^t a(t-s)\,dW_s \tag{4.4.1}$$

Such a process is asymptotically equivalent in quadratic mean to the stationary process:

$$y_t = \int_{-\infty}^t a(t-s)\,dW_s \tag{4.4.2}$$

whenever $\int_0^{+\infty}a^2(x)\,dx < +\infty$. Such processes are called fractional processes if $a(x) = x^{\alpha}\tilde a(x)/\Gamma(1+\alpha)$ for $|\alpha| < 1/2$ and some function $\tilde a$, where $\Gamma(1+\alpha)$ is a scaling factor useful for normalizing fractional derivation operators on $[0,T]$. Such processes admit several representations, and in particular they can also be written:

$$x_t = \int_0^t c(t-s)\,dW_{\alpha,s}, \qquad W_{\alpha,t} = \int_0^t\frac{(t-s)^{\alpha}}{\Gamma(1+\alpha)}\,dW_s \tag{4.4.3}$$

where $W_{\alpha}$ is the so-called fractional Brownian Motion of order $\alpha$ (see Mandelbrot and Van Ness (1968)).

The relation between the functions $a$ and $c$ is one-to-one. One can show that $W_{\alpha}$ is not a semi-martingale (see e.g. Rogers (1995)) but stochastic integration with respect to $W_{\alpha}$ can be defined properly. The processes $x_t$ are long memory if:

$$\lim_{x\to+\infty} x\,\tilde a(x) = a_{\infty}, \quad 0 < \alpha < 1/2 \ \text{ and } \ 0 < a_{\infty} < +\infty. \tag{4.4.4}$$

An example is the process defined by the SDE:

$$dx_t = -kx_t\,dt + \sigma\,dW_{\alpha,t}, \quad x_0 = 0,\ k > 0,\ 0 < \alpha < 1/2 \tag{4.4.5}$$

with its solution given by:

$$x_t = \int_0^t (t-s)^{\alpha}\,(\Gamma(1+\alpha))^{-1}\,dx_s^{(\alpha)}, \qquad x_t^{(\alpha)} = \int_0^t e^{-k(t-s)}\sigma\,dW_s \tag{4.4.6}$$

Note that $x_t^{(\alpha)}$, the derivative of order $\alpha$ of $x_t$, is a solution of the usual SDE: $dz_t = -kz_t\,dt + \sigma\,dW_t$.

4.4.2 The fractional SV model

To facilitate comparison with both the FIEGARCH model and the fractional extensions of the log-Normal SV model discussed in section 3.5.5, let us consider the following fractional SV model (henceforth FSV):

$$dS_t/S_t = \sigma_t\,dW_t, \qquad d\log\sigma_t = -k\log\sigma_t\,dt + dW_{\alpha,t} \tag{4.4.7}$$

where $k > 0$ and $0 \le \alpha < 1/2$. If nonzero, the fractional exponent $\alpha$ will provide some degree of freedom in the order of regularity of the volatility process, namely the greater $\alpha$ the smoother the path of the volatility process. If we denote the autocovariance function of $\sigma$ by $r_{\sigma}(\cdot)$ then:

$$\alpha > 0 \ \Rightarrow\ (r_{\sigma}(h) - r_{\sigma}(0))/h \to 0 \ \text{ as } h \to 0 .$$


This would be incorrectly interpreted as near-integrated behavior, widely found in high frequency data for instance, when:

$$(r_{\sigma}(h) - r_{\sigma}(0))/h = (\rho^{h} - 1)/h \to \log\rho \ \text{ as } h \to 0$$

and $\sigma_t$ is a continuous time AR(1) with correlation $\rho$ near 1.

The long memory continuous time approach allows us to model persistence with the following features: (1) the volatility process itself (and not only its logarithm) has hyperbolic decay of the correlogram; (2) the persistence of volatility shocks yields leptokurtic features for returns which vanish with temporal aggregation at a slow hyperbolic rate of decay.32 Indeed, for the rate of return on $[0,h]$:

$$\frac{E\left[\log S_{t+h}/S_t - E(\log S_{t+h}/S_t)\right]^4}{\left(E\left[\log S_{t+h}/S_t - E(\log S_{t+h}/S_t)\right]^2\right)^2} \to 3$$

as $h \to \infty$ at a rate $h^{2\alpha-1}$ if $\alpha \in\ ]0,1/2[$ and a rate $\exp(-kh/2)$ if $\alpha = 0$.

4.4.3 Filtering and Discrete Time Approximations

The volatility process dynamics are described by the solution to the SDE (4.4.5), namely:

$$\log\sigma_t = \int_0^t \frac{(t-s)^{\alpha}}{\Gamma(1+\alpha)}\,d\log\sigma_s^{(\alpha)} \tag{4.4.6}$$

where $\log\sigma^{(\alpha)}$ follows the O-U process:

$$d\log\sigma_t^{(\alpha)} = -k\log\sigma_t^{(\alpha)}\,dt + dW_t \tag{4.4.7}$$

To compute a discrete time approximation one must evaluate numerically the integral (4.4.6) using only values of the process $\log\sigma^{(\alpha)}$ on a discrete partition of $[0,t]$ at points $j/n$, $j = 0, 1, \ldots, [nt]$.33 A natural way to proceed is to use step functions, generating the following proxy process:

$$\log\hat\sigma_t^{n} = \sum_{j=1}^{[nt]} \frac{(t-(j-1)/n)^{\alpha}}{\Gamma(1+\alpha)}\,\Delta\log\sigma^{(\alpha)}_{j/n} \tag{4.4.8}$$

where $\Delta\log\sigma^{(\alpha)}_{j/n} = \log\sigma^{(\alpha)}_{j/n} - \log\sigma^{(\alpha)}_{(j-1)/n}$. Comte and Renault (1995) show that $\log\hat\sigma_t^{n}$ converges to the $\log\sigma_t$ process for $n \to \infty$ uniformly on compact sets. Moreover, by rearranging (4.4.8) one obtains:

$$\log\hat\sigma^{n}_{j/n} = \left[\sum_{i=0}^{j-1}\frac{(i+1)^{\alpha} - i^{\alpha}}{n^{\alpha}\,\Gamma(1+\alpha)}\,L_n^{i}\right]\log\sigma^{(\alpha)}_{j/n} \tag{4.4.9}$$

where $L_n$ is the lag operator corresponding to the sampling scheme $j/n$, i.e. $L_nZ_{j/n} = Z_{(j-1)/n}$. With this sampling scheme $\log\sigma^{(\alpha)}$ is a discrete time AR(1) deduced from the continuous time process with the following representation:

$$(1 - \rho_nL_n)\log\sigma^{(\alpha)}_{j/n} = u_{j/n} \tag{4.4.10}$$

32 With usual GARCH or SV models, it vanishes at an exponential rate (see Drost and Nijman (1993) and Drost and Werker (1994) for these issues in the short memory case).

33 $[z]$ is the integer $k$ such that $k \le z < k+1$.


where $\rho_n = \exp(-k/n)$ and $u_{j/n}$ is the associated innovations process. Since the process is stationary we are allowed to write (assuming $\log\sigma^{(\alpha)}_{j/n} = u_{j/n} = 0$ for $j \le 0$):

$$\log\hat\sigma^{(n)}_{j/n} = \left[\sum_{i=0}^{j-1}\frac{(i+1)^{\alpha} - i^{\alpha}}{n^{\alpha}\,\Gamma(1+\alpha)}\,L_n^{i}\right](1-\rho_nL_n)^{-1}u_{j/n} \tag{4.4.11}$$

which gives a parametrization of the volatility dynamics in two parts: (1) a long memory part which corresponds to the filter $\sum_{i=0}^{+\infty} a_iL_n^{i}/n^{\alpha}$ with $a_i = [(i+1)^{\alpha} - i^{\alpha}]/\Gamma(1+\alpha)$ and (2) a short memory part which is characterized by the AR(1) process: $(1-\rho_nL_n)^{-1}u_{j/n}$. Indeed, one can show that the long memory filter is "long-term equivalent" to the usual discrete time long memory filter $(1-L)^{-\alpha}$ in the sense that there is a long term relationship (a cointegration relation) between the two types of processes. However, this long-term equivalence between the long-memory filter and the usual discrete time one $(1-L)^{-\alpha}$ does not imply that the standard parametrization FARIMA$(1,\alpha,0)$ is well-suited in our framework. Indeed, one can show that the usual discrete time filter $(1-L)^{-\alpha}$ introduces some mixing between long and short term characteristics whereas the parsimonious continuous time model does not.34 This feature clearly puts the FSV at an advantage with regard to the discrete time SV and GARCH long-memory models.
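The step-function proxy (4.4.8) and its rearranged filter form (4.4.9) can be compared numerically. The following is a minimal sketch under stated assumptions: the short memory component is sampled exactly as an AR(1) at dates $j/n$ and started at zero at date 0, the parameter values are arbitrary, and all function names are hypothetical.

```python
import math
import random

def simulate_ou_grid(k, n, n_steps, rng):
    """Exact AR(1) sampling of dx = -k x dt + dW at dates j/n, with x_0 = 0."""
    rho_n = math.exp(-k / n)
    sd = math.sqrt((1.0 - rho_n ** 2) / (2.0 * k))
    x = [0.0]
    for _ in range(n_steps):
        x.append(rho_n * x[-1] + sd * rng.gauss(0.0, 1.0))
    return x

def proxy_step(x, alpha, n, j):
    """Step-function approximation at t = j/n: weighted sum of increments."""
    g = math.gamma(1.0 + alpha)
    total = 0.0
    for m in range(1, j + 1):
        weight = ((j - (m - 1)) / n) ** alpha / g
        total += weight * (x[m] - x[m - 1])
    return total

def proxy_filter(x, alpha, n, j):
    """Same quantity written as a lag filter applied to the levels of x."""
    g = math.gamma(1.0 + alpha)
    total = 0.0
    for i in range(j):
        coeff = ((i + 1.0) ** alpha - float(i) ** alpha) / (n ** alpha * g)
        total += coeff * x[j - i]
    return total

rng = random.Random(1)
k, alpha, n, n_steps = 1.0, 0.3, 50, 100
x = simulate_ou_grid(k, n, n_steps, rng)

step_vals = [proxy_step(x, alpha, n, j) for j in range(1, n_steps + 1)]
filt_vals = [proxy_filter(x, alpha, n, j) for j in range(1, n_steps + 1)]
print(max(abs(a - b) for a, b in zip(step_vals, filt_vals)))
```

The two forms agree because the short memory component starts at zero, so the boundary term produced by summation by parts vanishes. The filter form makes the hyperbolically decaying weights $(i+1)^{\alpha} - i^{\alpha}$ explicit.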

5 Statistical Inference

Evaluating the likelihood function of ARCH models is a relatively straightforward task. In sharp contrast, for SV models it is impossible to obtain explicit expressions for the likelihood function. This is a generic feature common to almost all nonlinear latent variable models. The lack of estimation procedures for SV models made them for a long time an unattractive class of models in comparison to ARCH. In recent years, however, remarkable progress has been made regarding the estimation of nonlinear latent variable models in general and SV models in particular. A flurry of methods are now available and are up and running on computers with ever increasing CPU performance. The early attempts to estimate SV models used a GMM procedure. A prominent example is Melino and Turnbull (1990). Section 5.1 is devoted to GMM estimation in the context of SV models. Obviously, GMM is not designed to handle continuous time diffusions as it requires discrete time processes satisfying mixing conditions. A continuous time GMM approach, developed by Hansen and Scheinkman (1994), involves moment conditions directly drawn from the continuous time representation of the process. This approach is discussed in Section 5.3. In between, namely in section 5.2, we discuss the QML approach suggested by Harvey, Ruiz and Shephard (1994) and Nelson (1988). It relies on the fact that the nonlinear (Gaussian) SV model can be transformed into a linear non-Gaussian state space model as in Section 3, and from this a Gaussian quasi-likelihood can be computed. None of the methods covered in Sections 5.1 through 5.3 involve simulation. However, increased computer power has made simulation-based estimation techniques increasingly popular. The simulated method of moments, or simulation-based GMM approach proposed by Duffie and Singleton (1993), is a first example which is covered in Section 5.4. Next we discuss the indirect inference approach of Gouriéroux, Monfort and Renault (1993) and the moment matching methods of Gallant and Tauchen (1994) in Section 5.5. Finally, Section 5.6 covers a very large class of estimators using computer intensive Markov Chain Monte Carlo methods applied in the context of SV models by Jacquier, Polson and Rossi (1994) and Kim and Shephard (1994), and simulation based ML estimation proposed in Danielsson (1994) and Danielsson and Richard (1993).

34 Namely, $(1-L_n)^{\alpha}\log\hat\sigma^{n}_{j/n}$ is not an AR(1) process.

In each section we limit our focus to the use of estimation procedures in the context of SV models and avoid details regarding econometric theory. Some useful references to complement the material which will be covered are (1) Hansen (1992), Gallant and White (1988), Hall (1993) and Ogaki (1993) for GMM estimation, (2) Gouriéroux and Monfort (1993b) and Wooldridge (1994) for QMLE, (3) Gouriéroux and Monfort (1995) and Tauchen (1995) for simulation based econometric methods including indirect inference and moment matching, and finally (4) Geweke (1995) and Shephard (1995) for Markov Chain Monte Carlo methods.

5.1 Generalized Method of Moments

Let us consider the simple version of the discrete time SV as presented in equations (3.1.2) and (3.1.3) with the additional assumption of normality for the probability distribution of the innovation process $(\varepsilon_t, \eta_t)$. This log-normal SV model has been the subject of at least two extensive Monte Carlo studies on GMM estimation of SV models. They were conducted by Andersen and Sørensen (1993) and Jacquier, Polson and Rossi (1994). The main idea is to exploit the stationary and ergodic properties of the SV model which yield the convergence of sample moments to their unconditional expectations. For instance, the second and fourth moments are simple expressions of $\sigma^2$ and $\sigma_h^2$, namely $\sigma^2\exp(\sigma_h^2/2)$ and $3\sigma^4\exp(2\sigma_h^2)$ respectively. If these moments are computed in the sample, $\sigma_h^2$ can be estimated directly from the sample kurtosis, $\hat\kappa$, which is the ratio of the fourth moment to the second moment squared. The expression is just $\hat\sigma_h^2 = \log(\hat\kappa/3)$. The parameter $\sigma^2$ can then be estimated from the second moment by substituting in this estimate of $\sigma_h^2$. We might also compute the first-order autocovariance of $y_t^2$, or simply the sample mean of $y_t^2y_{t-1}^2$, which has expectation $\sigma^4\exp(\{1+\phi\}\sigma_h^2)$ and from which, given the estimates of $\sigma^2$ and $\sigma_h^2$, it is straightforward to get an estimate of $\phi$.

The above procedure is an example of the application of the method of moments. In general terms, $m$ moments are computed. For a sample of size $T$, let $g_T(\theta)$ denote the $m \times 1$ vector of differences between each sample moment and its theoretical expression in terms of the model parameters $\theta$. The generalized method of moments (GMM) estimator is constructed by minimizing the criterion function

$$\hat\theta_T = \arg\min_{\theta}\ g_T(\theta)'\,W_T\,g_T(\theta)$$

where $W_T$ is an $m \times m$ weighting matrix reflecting the importance given to matching each of the moments.
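The simple method of moments procedure described above (kurtosis for $\sigma_h^2$, second moment for $\sigma^2$, mean of $y_t^2y_{t-1}^2$ for $\phi$) can be sketched as follows. The sketch assumes the log-normal SV model takes the form $y_t = \sigma\varepsilon_t\exp(h_t/2)$, $h_t = \phi h_{t-1} + \eta_t$ with $\sigma_h^2$ the stationary variance of $h_t$; the function names, parameter values and sample size are hypothetical choices for illustration.

```python
import math
import random

def sv_moment_estimates(m2, m4, m22):
    """Map E[y^2], E[y^4] and E[y_t^2 y_{t-1}^2] to (sigma2, sigma_h2, phi)."""
    kappa = m4 / m2 ** 2                    # kurtosis
    sigma_h2 = math.log(kappa / 3.0)        # from kurtosis = 3 exp(sigma_h2)
    sigma2 = m2 / math.exp(sigma_h2 / 2.0)  # from E[y^2] = sigma2 exp(sigma_h2 / 2)
    phi = math.log(m22 / sigma2 ** 2) / sigma_h2 - 1.0
    return sigma2, sigma_h2, phi

def simulate_sv(T, sigma, phi, sigma_h2, rng):
    """Simulate the assumed log-normal SV model from its stationary distribution."""
    sd_eta = math.sqrt(sigma_h2 * (1.0 - phi ** 2))
    h = rng.gauss(0.0, math.sqrt(sigma_h2))
    y = []
    for _ in range(T):
        y.append(sigma * math.exp(h / 2.0) * rng.gauss(0.0, 1.0))
        h = phi * h + sd_eta * rng.gauss(0.0, 1.0)
    return y

def sample_moments(y):
    """Sample analogues of the three population moments used above."""
    T = len(y)
    m2 = sum(v ** 2 for v in y) / T
    m4 = sum(v ** 4 for v in y) / T
    m22 = sum(y[t] ** 2 * y[t - 1] ** 2 for t in range(1, T)) / (T - 1)
    return m2, m4, m22

rng = random.Random(42)
y = simulate_sv(20000, sigma=1.0, phi=0.9, sigma_h2=0.5, rng=rng)
print(sv_moment_estimates(*sample_moments(y)))
```

With exact population moments the mapping recovers the parameters exactly. With sample moments the estimates are typically noisy when $\phi$ is close to one, which is the inefficiency the Monte Carlo studies cited in this section document.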

When $\varepsilon_t$ and $\eta_t$ are mutually independent, Jacquier, Polson and Rossi (1994) suggest using 24 moments. The first four are given by (??) for $c = 1, 2, 3, 4$, while the analytic expression for the others is:

$E[\,|y_t^{c}\,y_{t-\tau}^{c}|\,] = \left\{\sigma^{2c}\,2^{c}\left[\Gamma\!\left(\tfrac{c}{2}+\tfrac{1}{2}\right)\right]^2/\pi\right\}\exp\!\left(\tfrac{c^2}{4}\,\sigma_h^2\,[1+\phi^{\tau}]\right), \quad c = 1, 2, \ \ \tau = 1, 2, \ldots, 10$.35

In the more general case when $\varepsilon_t$ and $\eta_t$ are correlated, Melino and Turnbull (1990) included estimates of $E[\,|y_t|\,y_{t-\tau}]$, $\tau = 0, \pm 1, \pm 2, \ldots, \pm 10$. They presented an explicit expression in the case of $\tau = 1$ and showed that its sign is entirely determined by $\rho$.

35 A simple way to derive these moment conditions is via a two-step approach similar in spirit to (2.4.8) and (2.4.9) or (3.2.3).


The GMM method may also be extended to handle a non-normal distribution for $\varepsilon_t$. The required analytic expressions can be obtained as in section 3.2. On the other hand, the analytic expression of unconditional moments presented in section 2.4 for the general SARV model may provide the basis of GMM estimation in more general settings (see Andersen (1994)).

From the very start we expect the GMM estimator not to be efficient. The question is how much inefficiency should be tolerated in exchange for its relative simplicity. The generic setup of GMM leaves unspecified the number of moment conditions, except for the minimal number required for identification, as well as the explicit choice of moments. Moreover, the computation of the weighting matrix is also an issue since many options exist in practice. The extensive Monte Carlo studies of Andersen and Sørensen (1993) and Jacquier, Polson and Rossi (1994) attempted to answer these many outstanding questions. In general they find that GMM is a fairly inefficient procedure, primarily because of the stylized fact, noted in section 2.2, that $\phi$ in equation (3.1.3) is quite close to unity in most empirical findings because volatility is highly persistent. For parameter values of $\phi$ close to unity convergence to unconditional moments is extremely slow, suggesting that only large samples can rescue the situation. The Monte Carlo study of Andersen and Sørensen (1993) provides some guidance on how to control the extent of the inefficiency, notably by keeping the number of moment conditions small. They also provide specific recommendations for the choice of weighting matrix estimators with data-dependent bandwidth using the Bartlett kernel.

5.2 Quasi Maximum Likelihood Estimation

5.2.1 The Basic Model

Consider the linear state space model described in sub-section 3.4.1, in which (??) is the measurement equation and (??) is the transition equation. The QML estimators of the parameters $\phi$, $\sigma_{\eta}^2$ and the variance of $\xi_t$, $\sigma_{\xi}^2$, are obtained by treating $\xi_t$ and $\eta_t$ as though they were normal and maximizing the prediction error decomposition form of the likelihood obtained via the Kalman filter. As noted in Harvey, Ruiz and Shephard (1994), the quasi maximum likelihood (QML) estimators are asymptotically normal with covariance matrix given by applying the theory in Dunsmuir (1979, p. 502). This assumes that $\eta_t$ and $\xi_t$ have finite fourth moments and that the parameters are not on the boundary of the parameter space.

The parameter $\omega$ can be estimated at the same time as the other parameters. Alternatively, it can be estimated as the mean of the $\log y_t^2$'s, since this is asymptotically equivalent when $\phi$ is less than one in absolute value.

Application of the QML method does not require the assumption of a specific distribution for $\varepsilon_t$. We will refer to this as unrestricted QML. However, if a distribution is assumed, it is no longer necessary to estimate $\sigma_{\xi}^2$, as it is known, and an estimate of the scale factor, $\sigma^2$, can be obtained from the estimate of $\omega$. Alternatively, it can be obtained as suggested in sub-section 3.4.1.

If unrestricted QML estimation is carried out, a value of the parameter determining a particular distribution within a class may be inferred from the estimated variance of $\xi_t$. For example in the case of the Student's $t$, $\nu$ may be determined from the knowledge that the theoretical value of the variance of $\xi_t$ is $4.93 + \psi'(\nu/2)$ (where $\psi(\cdot)$ is the digamma function introduced in section 3.2.2).
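The prediction error decomposition used by QML can be sketched with a scalar Kalman filter. This is a minimal illustration, assuming the measurement equation is $x_t = h_t + \xi_t$ for the demeaned series $x_t = \log y_t^2 - \text{mean}$, and the transition equation is $h_t = \phi h_{t-1} + \eta_t$ with a stationary initial state; the function names and the simulated data are hypothetical.

```python
import math
import random

# Variance of log(eps^2) for standard normal eps, about 4.93.
VAR_LOG_CHI2 = math.pi ** 2 / 2.0

def qml_loglik(x, phi, sig_eta2, sig_xi2):
    """Gaussian quasi log-likelihood via the prediction error decomposition.

    x is the demeaned log y_t^2 series; xi_t and eta_t are treated as normal.
    """
    a = 0.0                                # one-step-ahead state prediction
    p = sig_eta2 / (1.0 - phi * phi)       # its variance (stationary start)
    ll = 0.0
    for xt in x:
        v = xt - a                         # prediction error
        f = p + sig_xi2                    # prediction error variance
        ll += -0.5 * (math.log(2.0 * math.pi) + math.log(f) + v * v / f)
        gain = phi * p / f
        a = phi * a + gain * v
        p = phi * phi * p - gain * gain * f + sig_eta2
    return ll

def simulate_log_y2(T, phi, sig_eta2, rng):
    """Demeaned log y_t^2 from an assumed Gaussian log-normal SV model."""
    h = rng.gauss(0.0, math.sqrt(sig_eta2 / (1.0 - phi * phi)))
    out = []
    for _ in range(T):
        y = math.exp(h / 2.0) * rng.gauss(0.0, 1.0)
        out.append(math.log(y * y))
        h = phi * h + math.sqrt(sig_eta2) * rng.gauss(0.0, 1.0)
    mean = sum(out) / T
    return [v - mean for v in out]

rng = random.Random(3)
x = simulate_log_y2(2000, phi=0.95, sig_eta2=0.05, rng=rng)

# Crude grid search over phi with the other parameters held fixed.
grid = [0.0, 0.5, 0.8, 0.9, 0.95, 0.98]
best = max(grid, key=lambda ph: qml_loglik(x, ph, 0.05, VAR_LOG_CHI2))
print(best)
```

A full implementation would maximize jointly over $\phi$, $\sigma_{\eta}^2$ and, in the unrestricted case, $\sigma_{\xi}^2$ with a numerical optimizer. The grid search here only shows how the likelihood is evaluated.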

5.2.2 Asymmetric Model

In an asymmetric model, QML may be based on the modified state space form in (??). The parameters $\sigma_{\xi}^2$, $\sigma_{\eta}^2$, $\phi$, $\mu^{*}$ and $\gamma^{*}$ can be estimated via the Kalman filter without any distributional assumptions, apart from the existence of fourth moments of $\eta_t$ and $\xi_t$ and the joint symmetry of $\xi_t$ and $\eta_t$. However, if an estimate of $\rho$ is wanted it is necessary to make distributional assumptions about the disturbances, leading to formulae like (??) and (??). These formulae can be used to set up an optimization with respect to the original parameters $\sigma^2$, $\sigma_{\eta}^2$, $\phi$ and $\rho$. This has the advantage that the constraint $|\rho| < 1$ can be imposed. Note that any t-distribution gives the same relationship between the parameters, so within this class it is not necessary to specify the degrees of freedom.

Using the QML method with both the original disturbances assumed to be Gaussian, Harvey and

Shephard (1993) estimate a model for the CRSP daily returns on a value weighted US market index for

3rd July 1962 to 31st December 1987. These data were used in the paper by Nelson (1991) to illustrate

his EGARCH model. The empirical results indicate a very high negative correlation, suggesting that the

Black-Scholes option pricing equation will be quite badly biased.

5.2.3 QML in the Frequency Domain

For a long memory SV model, QML estimation in the time domain becomes relatively less attractive because the state space form (SSF) can only be used by expressing $h_t$ as an autoregressive or moving average process and truncating at a suitably high lag. Thus the approach is cumbersome, though the initial state covariance matrix is easily constructed, and the truncation does not affect the asymptotic properties of the estimators. If the autoregressive approximation, and therefore the SSF, is not used, time domain QML requires the repeated construction and inversion of the $T \times T$ covariance matrix of the $\log y_t^2$'s; see Sowell (1992). On the other hand, QML estimation in the frequency domain is no more difficult than it is in the AR(1) case. Cheung and Diebold (1994) present simulation evidence which suggests that although time domain estimation is more efficient in small samples, the difference is less marked when a mean has to be estimated.

The frequency domain (quasi) log-likelihood function is, neglecting constants,

$$\log L = -\frac{1}{2}\sum_{j=1}^{T-1}\log g_j - \pi\sum_{j=1}^{T-1} I(\lambda_j)/g_j \tag{5.2.1}$$

where $I(\lambda_j)$ is the sample spectrum of the $\log y_t^2$'s and $g_j$ is the spectral generating function (SGF), which for (??) is

$$g_j = \sigma_{\eta}^2\,[2(1-\cos\lambda_j)]^{-d} + \sigma_{\xi}^2 .$$

Note that the summation in (??) is from $j = 1$ rather than $j = 0$. This is because $g_0$ cannot be evaluated for positive $d$. However, the omission of the zero frequency does remove the mean. The unknown parameters are $\sigma_{\eta}^2$, $\sigma_{\xi}^2$ and $d$, but $\sigma_{\xi}^2$ may be concentrated out of the likelihood function by a reparameterisation in which $\sigma_{\eta}^2$ is replaced by the signal-noise ratio $q = \sigma_{\eta}^2/\sigma_{\xi}^2$. On the other hand, if a distribution is assumed for $\varepsilon_t$, then $\sigma_{\xi}^2$ is known. Breidt, Crato and de Lima (1993) show the consistency of the QML estimator.

When $d$ lies between 0.5 and one, $h_t$ is nonstationary, but differencing the $\log y_t^2$'s yields a zero mean stationary process, the SGF of which is

$$g_j = \sigma_{\eta}^2\,[2(1-\cos\lambda_j)]^{1-d} + 2(1-\cos\lambda_j)\,\sigma_{\xi}^2 .$$

One of the attractions of long memory models is that inference is not affected by the kind of unit root issues which arise with autoregressions. Thus a likelihood based test of the hypothesis that $d = 1$ against the alternative that it is less than one can be constructed using standard theory; see Robinson (1993).
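The frequency domain quasi log-likelihood (5.2.1) can be evaluated directly from the sample spectrum. The sketch below makes assumptions that the text does not state explicitly: the frequencies are $\lambda_j = 2\pi j/T$ and the sample spectrum is normalized as $I(\lambda_j) = |\sum_t x_t e^{-i\lambda_j t}|^2/(2\pi T)$. The function names and the toy data are hypothetical, and the discrete Fourier transform is computed naively for clarity rather than speed.

```python
import cmath
import math

def periodogram(x):
    """Sample spectrum I(lambda_j), j = 0..T-1, with the 1/(2 pi T) normalization."""
    T = len(x)
    out = []
    for j in range(T):
        lam = 2.0 * math.pi * j / T
        s = sum(x[t] * cmath.exp(-1j * lam * t) for t in range(T))
        out.append(abs(s) ** 2 / (2.0 * math.pi * T))
    return out

def sgf(lam, sig_eta2, sig_xi2, d):
    """Spectral generating function of fractional noise plus white noise."""
    return sig_eta2 * (2.0 * (1.0 - math.cos(lam))) ** (-d) + sig_xi2

def whittle_loglik(x, sig_eta2, sig_xi2, d):
    """Quasi log-likelihood (5.2.1), summing over j = 1..T-1 only."""
    T = len(x)
    spec = periodogram(x)
    ll = 0.0
    for j in range(1, T):
        g = sgf(2.0 * math.pi * j / T, sig_eta2, sig_xi2, d)
        ll += -0.5 * math.log(g) - math.pi * spec[j] / g
    return ll

# Toy series standing in for the log y_t^2 observations.
x = [0.4, -1.1, 0.9, 2.3, -0.2, -1.7, 0.5, 1.0,
     -0.8, 0.1, 1.6, -2.0, 0.3, 0.7, -0.5, 1.2]
print(whittle_loglik(x, sig_eta2=0.1, sig_xi2=math.pi ** 2 / 2.0, d=0.3))
```

Because the zero frequency is skipped, adding a constant to the series leaves the criterion unchanged, which is the sense in which omitting $j = 0$ removes the mean.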


5.2.4 Comparison of GMM and QML

Simulation evidence on the finite sample performance of GMM and QML can be found in Ruiz (1994), Harvey and Shephard (1996), Jacquier, Polson and Rossi (1994), Andersen and Sørensen (1993) and Breidt and Carriquiry (1995). The general conclusion seems to be that QML gives estimates with a smaller MSE when the volatility is relatively strong as reflected in a high coefficient of variation. This is because the normally distributed volatility component in the measurement equation, (??), is large relative to the non-normal error term. With a lower coefficient of variation, GMM dominates. However, in this case Jacquier, Polson and Rossi (1994, p. 383) observe that "...the performance of both the QML and GMM estimators deteriorates rapidly." In other words the case for one of the more computer intensive methods outlined in Section 5.6 becomes stronger.

Other things being equal, an AR coefficient, $\phi$, close to one tends to favor QML because the autocorrelations are slow to die out and are hence captured less well by the moments used in GMM. For the same reason, GMM is likely to be rather poor in estimating a long memory model.

The attraction of QML is that it is very easy to implement and it extends easily to more general models, for example nonstationary and multivariate ones. At the same time, it provides filtered and smoothed estimates of the state, and predictions. The one-step ahead prediction errors can also be used to construct diagnostics, such as the Box-Ljung statistic, though in evaluating such tests it must be remembered that the observations are non-normal. Thus even if the hyperparameters are eventually estimated by another method, QML may have a valuable role to play in finding a suitable model specification.

5.3 Continuous Time GMM

Hansen and Scheinkman (1995) propose to estimate continuous time diffusions using a GMM procedure specifically tailored for such processes. In section 5.1 we discussed estimation of SV models which are either explicitly formulated as discrete time processes or else are discretizations of the continuous time diffusions. In both cases inference is based on minimizing the difference between unconditional moments and their sample equivalent. For continuous time processes Hansen and Scheinkman (1995) draw directly upon the diffusion rather than its discretization to formulate moment conditions. To describe the generic setup of the method they proposed let us consider the following (multivariate) system of $n$ diffusion equations:

$$dy_t = \mu(y_t;\theta)\,dt + \sigma(y_t;\theta)\,dW_t \tag{5.3.1}$$

A comparison with the notation in section 2 immediately draws attention to certain limitations of the setup. First, the functions $\mu_{\theta}(\cdot) \equiv \mu(\cdot;\theta)$ and $\sigma_{\theta}(\cdot) \equiv \sigma(\cdot;\theta)$ are parameterized by $y_t$ only, which restricts the state variable process $U_t$ in section 2 to contemporaneous values of $y_t$. The diffusion in (5.3.1) involves a general vector process $y_t$, hence $y_t$ could include a volatility process to accommodate SV models. Yet, the $y_t$ vector is assumed observable. For the moment we will leave these issues aside, but return to them at the end of the section. Hansen and Scheinkman (1995) consider the infinitesimal operator $A$ defined for a class of square integrable functions $\varphi: \mathbb{R}^n \to \mathbb{R}$ as follows:

$$A_{\theta}\varphi(y) = \frac{d\varphi(y)}{dy'}\,\mu_{\theta}(y) + \frac{1}{2}\,\mathrm{Tr}\!\left(\sigma_{\theta}(y)\,\sigma_{\theta}'(y)\,\frac{d^2\varphi(y)}{dy\,dy'}\right). \tag{5.3.2}$$

Because the operator is defined as a limit, namely:

$$A_{\theta}\varphi(y) = \lim_{t\to 0}\, t^{-1}\left[E(\varphi(y_t) \mid y_0 = y) - \varphi(y)\right],$$

it does not necessarily exist for all square integrable functions $\varphi$ but only for a restricted domain $D$. A set of moment conditions can now be obtained for this class of functions $\varphi \in D$. Indeed, as shown for instance by Revuz and Yor (1991), the following equalities hold:

$$E\,A_{\theta}\varphi(y_t) = 0, \tag{5.3.3}$$

$$E\left[A_{\theta}\varphi(y_{t+1})\,\tilde\varphi(y_t) - \varphi(y_{t+1})\,A_{\theta}^{*}\tilde\varphi(y_t)\right] = 0, \tag{5.3.4}$$

where $A_{\theta}^{*}$ is the adjoint infinitesimal operator of $A_{\theta}$ for the scalar product associated with the invariant measure of the process $y$.36 By choosing an appropriate set of functions, Hansen and Scheinkman exploit moment conditions (5.3.3) and (5.3.4) to construct a GMM estimator of $\theta$.
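The moment condition (5.3.3) can be made concrete for a scalar example. The sketch below assumes, purely for illustration, an observable Ornstein-Uhlenbeck process $dy_t = -k(y_t - m)\,dt + s\,dW_t$ and the test functions $\varphi(y) = y$ and $\varphi(y) = y^2$, for which the generator gives $A\varphi(y) = -k(y-m)$ and $A\varphi(y) = -2ky(y-m) + s^2$. The function names and parameter values are hypothetical.

```python
import math
import random

def generator_moments(y, k, m, s2):
    """Sample analogues of E[A phi(y_t)] for phi(y) = y and phi(y) = y^2."""
    n = len(y)
    g1 = sum(-k * (v - m) for v in y) / n
    g2 = sum(-2.0 * k * v * (v - m) + s2 for v in y) / n
    return g1, g2

def solve_moments(y):
    """Values of m and s^2/(2k) that set both sample moments to zero."""
    n = len(y)
    mean = sum(y) / n
    var = sum(v * v for v in y) / n - mean ** 2
    return mean, var

def simulate_ou(n, dt, k, m, s2, rng):
    """Exact discrete sampling of the assumed OU process, started in stationarity."""
    rho = math.exp(-k * dt)
    stat_var = s2 / (2.0 * k)
    y = [m + math.sqrt(stat_var) * rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        shock = math.sqrt(stat_var * (1.0 - rho ** 2)) * rng.gauss(0.0, 1.0)
        y.append(m + rho * (y[-1] - m) + shock)
    return y

rng = random.Random(7)
y = simulate_ou(5000, dt=0.1, k=2.0, m=0.5, s2=0.8, rng=rng)
m_hat, ratio_hat = solve_moments(y)
print(m_hat, ratio_hat)   # targets: m = 0.5 and s^2/(2k) = 0.2
```

In this example the two conditions pin down only $m$ and the ratio $s^2/(2k)$: any $k$ paired with $s^2 = 2k \cdot \text{ratio}$ sets both sample moments to zero. Separating $k$ from $s^2$ here would require conditions of the type (5.3.4), which involve two dates.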

The choice of the functions $\varphi \in D$ and $\tilde\varphi \in D^{*}$ determines what moments of the data are used to estimate the parameters. This obviously raises questions regarding the choice of functions to enhance efficiency of the estimator but first and foremost also the identification of $\theta$ via the conditions (5.3.3) and (5.3.4). It was noted in the beginning of the section that the multivariate process $y_t$, in order to cover SV models, must somehow include the latent conditional variance process. Gouriéroux and Monfort (1994, 1995) point out that since the moment conditions based on $\varphi$ and $\tilde\varphi$ cannot include any latent process it will often (but not always) be impossible to attain identification of all the parameters, particularly those governing the latent volatility process. A possible remedy is to augment the model with observations indirectly related to the latent volatility process, in a sense making it observable. One possible candidate would be to include in $y_t$ both the security price and the Black-Scholes implied volatilities obtained through option market quotations for the underlying asset. This approach is in fact suggested by Pastorello, Renault and Touzi (1993) although not in the context of continuous time GMM but instead using indirect inference methods which will be discussed in section 5.5.37 Another possibility is to rely on the time deformation representation of SV models as discussed in the context of continuous time GMM by Conley et al. (1995).

5.4 Simulated Method of Moments

The estimation procedures discussed so far do not involve any simulation techniques. From now on we cover methods combining simulation and estimation, beginning with the simulated method of moments (SMM) estimator, which is covered by Duffie and Singleton (1993) for time series processes (footnote 38). In section 5.1 we noted that GMM estimation of SV models is based on minimizing the distance between a set of chosen sample moments and unconditional population moments expressed as analytical functions of the model parameters. Suppose now that such analytical expressions are hard to obtain. This is particularly the case when such expressions involve marginalizations with respect to a latent process such as a stochastic volatility process. Could we then simulate data from the model for a particular value of the parameters and match moments from the simulated data with sample moments as a substitute? This strategy is precisely what SMM is all about. Indeed, quite often it is fairly straightforward to simulate processes and therefore take advantage of the SMM procedure.

36. Please note that $A^*_\theta$ is again associated with a domain $D^*$ so that $\varphi \in D$ and $\tilde\varphi \in D^*$ in (5.3.4).

37. It was noted in section 2.1.3 that implied volatilities are biased. The indirect inference procedures used by Pastorello, Renault and Touzi (1993) can cope with such biases, as will be explained in section 5.5. The use of option price data is further discussed in section 5.7.

38. SMM was originally proposed for cross-section applications, see Pakes and Pollard (1989) and McFadden (1989). See also Gouriéroux and Monfort (1993a).

Let us consider again, as point of reference and illustration, the (multivariate) diffusion of the previous section (equation (5.3.1)) and conduct $H$ simulations $i = 1, \ldots, H$ using a discretization:

$$\Delta \hat y^i_t(\theta) = \mu\left(\hat y^i_t(\theta); \theta\right) + \sigma\left(\hat y^i_t(\theta); \theta\right)\varepsilon_t, \qquad i = 1, \ldots, H, \quad t = 1, \ldots, T$$

where $\hat y_t(\theta)$ are simulated given a parameter $\theta$ and $\varepsilon_t$ is i.i.d. Gaussian (footnote 39). Subject to identification and other regularity conditions one then considers

$$\hat\theta_{HT} = \operatorname{Argmin}_\theta \left\| f(y_1, \ldots, y_T) - \frac{1}{H}\sum_{i=1}^{H} f\left(\hat y^i_1(\theta), \ldots, \hat y^i_T(\theta)\right) \right\|$$

with a suitable choice of norm, i.e. weighting matrix for the quadratic form as in GMM, and function $f$ of the data, i.e. moment conditions. The asymptotic distribution theory is quite similar to that of GMM, except that simulation introduces an extra source of random error affecting the efficiency of the SMM estimator in comparison to its GMM counterpart. The efficiency loss can be controlled by the choice of $H$ (footnote 40).
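The SMM recipe above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the drift $k(a - y)$, the constant volatility $s$, the three moments (mean, variance, first-order autocovariance), the identity weighting matrix and the grid search over $a$ are all choices made for this sketch, not part of the text. The simulated shocks are held fixed across parameter values so that the objective is a smooth function of $\theta$.

```python
import numpy as np

def simulate_paths(theta, shocks, y0=0.0):
    """Unit-step Euler scheme: y_t = y_{t-1} + mu(y_{t-1}; theta) + sigma(y_{t-1}; theta) * eps_t.

    Assumed for illustration only: mu(y; theta) = k * (a - y) and sigma(y; theta) = s.
    `shocks` has shape (H, T); row i drives simulated path i.
    """
    k, a, s = theta
    n_paths, n_obs = shocks.shape
    paths = np.empty((n_paths, n_obs))
    prev = np.full(n_paths, y0)
    for t in range(n_obs):
        prev = prev + k * (a - prev) + s * shocks[:, t]
        paths[:, t] = prev
    return paths

def moments(paths):
    """f(y_1, ..., y_T) for each path: mean, variance, first-order autocovariance."""
    paths = np.atleast_2d(paths)
    mean = paths.mean(axis=1)
    dev = paths - mean[:, None]
    var = (dev ** 2).mean(axis=1)
    acov1 = (dev[:, 1:] * dev[:, :-1]).mean(axis=1)
    return np.column_stack([mean, var, acov1])

def smm_objective(theta, y_obs, shocks, weight=None):
    """Quadratic distance between sample moments and the average of simulated moments."""
    gap = moments(y_obs)[0] - moments(simulate_paths(theta, shocks)).mean(axis=0)
    weight = np.eye(gap.size) if weight is None else weight
    return float(gap @ weight @ gap)

def smm_grid(y_obs, shocks, grid):
    """Return the grid point minimising the SMM objective (same shocks for every theta)."""
    values = [smm_objective(theta, y_obs, shocks) for theta in grid]
    return grid[int(np.argmin(values))]

# Usage: pseudo-observed data from theta = (k, a, s) = (0.2, 1.0, 0.5); search over a only.
rng = np.random.default_rng(0)
y_obs = simulate_paths((0.2, 1.0, 0.5), rng.standard_normal((1, 2000)))[0]
shocks = rng.standard_normal((10, 2000))            # H = 10 simulated paths
grid = [(0.2, a, 0.5) for a in np.linspace(0.0, 2.0, 41)]
theta_hat = smm_grid(y_obs, shocks, grid)
```

In practice one would replace the grid search by a numerical optimizer and the identity matrix by an estimate of the optimal weighting matrix.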

5.5 Indirect Inference and Moment Matching

The key insight of the indirect inference approach of Gouriéroux, Monfort and Renault (1993) and the moment matching approach of Gallant and Tauchen (1994) is the introduction of an auxiliary model parameterized by a vector, say $\beta$, in order to estimate the model of interest. In our case the latter is the SV model (footnote 41). In the first subsection we will describe the general principle while a second one will focus exclusively on estimating diffusions.

5.5.1 The Principle

We noted at the beginning of section 5 that ARCH type models are relatively easy to estimate in comparison to SV models. For this reason an ARCH type model may be a possible candidate as an auxiliary model. An alternative strategy would be to try to summarize the features of the data via a SNP density as developed by Gallant and Tauchen (1989). This empirical SNP density, or more specifically its score, could also fulfill the role of auxiliary model. Other possibilities could be considered as well. The idea is then to use the auxiliary model to estimate $\beta$, so that:

$$\hat\beta_T = \operatorname{Argmax}_\beta \sum_{t=1}^{T} \log f^*\left(y_t \mid y_{t-1}; \beta\right) \qquad (5.5.1)$$

where we restrict our attention here to a simple dynamic model with one lag for the purpose of illustration. The objective function $f^*$ in (5.5.1) can be a pseudo-likelihood function when the auxiliary model is deliberately misspecified to facilitate estimation. As an alternative $f^*$ can be taken from the class of SNP densities (footnote 42). Gouriéroux, Monfort and Renault then propose to estimate the same parameter vector $\beta$ not using the actual sample data but instead using samples $\{\hat y^i_t(\theta)\}_{t=1}^{T}$, $i = 1, \ldots, H$, simulated from the model of interest given $\theta$. This yields a new estimator of $\beta$, namely:

$$\hat\beta_{HT}(\theta) = \operatorname{Argmax}_\beta \frac{1}{H}\sum_{i=1}^{H}\sum_{t=1}^{T} \log f^*\left(\hat y^i_t(\theta) \mid \hat y^i_{t-1}(\theta); \beta\right). \qquad (5.5.2)$$

39. We discuss in detail the simulation techniques in the next section. Indeed, to control for the discretization bias, one has to simulate with a finer sampling interval.

40. The asymptotic variance of the SMM estimator depends on $H$ through a factor $(1 + H^{-1})$, see e.g. Gouriéroux and Monfort (1995).

41. It is worth noting that the simulation based inference methods we will describe here are applicable to many other types of models for cross-sectional, time series and panel data.

42. The discussion should not leave the impression that the auxiliary model can only be estimated via ML-type estimators. Any root $T$ consistent asymptotically normal estimation procedure may be used.

The next step is to minimize a quadratic distance using a weighting matrix $W_T$ to choose an indirect estimator of $\theta$ based on $H$ simulation replications and a sample of $T$ observations, namely:

$$\hat\theta_{HT} = \operatorname{Argmin}_\theta \left(\hat\beta_T - \hat\beta_{HT}(\theta)\right)' W_T \left(\hat\beta_T - \hat\beta_{HT}(\theta)\right) \qquad (5.5.3)$$

The approach of Gallant and Tauchen (1994) avoids the step of estimating $\hat\beta_{HT}(\theta)$ by computing the score function of $f^*$ and minimizing a quadratic distance similar to (5.5.3) but involving the score function evaluated at $\hat\beta_T$ and replacing the sample data by simulated series generated by the model of interest. Under suitable regularity conditions the estimator $\hat\theta_{HT}$ is root $T$ consistent and asymptotically normal. As with GMM and SMM there is again an optimal weighting matrix. The resulting asymptotic covariance matrix depends on the number of simulations in the same way the SMM estimator depends on $H$.

Gouriéroux, Monfort and Renault (1993) illustrated the use of the indirect inference estimator with a simple example that we would like to briefly discuss here. Typically AR models are easy to estimate while MA models require more elaborate procedures. Suppose the model of interest is a moving average model of order one with parameter $\theta$. Instead of estimating the MA parameter directly from the data they propose to estimate an AR($p$) model involving the parameter vector $\beta$. The next step then consists of simulating data using the MA model and proceeding further as described above (footnote 43). They found that the indirect inference estimator $\hat\theta_{HT}$ appeared to have better finite sample properties than the more traditional maximum likelihood estimators for the MA parameter. In fact the indirect inference estimator exhibited features similar to the median unbiased estimator proposed by Andrews (1993). These properties were confirmed and clarified by Gouriéroux, Renault and Touzi (1994) who studied the second order asymptotic expansion of indirect inference estimators and their ability to reduce finite sample bias.
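The MA(1) illustration can be sketched as follows. This is a minimal sketch under assumptions that are not in the text: the sign convention $y_t = \varepsilon_t + \theta\varepsilon_{t-1}$, an AR(3) auxiliary model fitted by OLS without intercept, an identity weighting matrix in the distance, and a grid search over $\theta$. The $H$ simulated paths are pooled in a single auxiliary regression, in the spirit of (5.5.2), and the simulated shocks are reused for every $\theta$.

```python
import numpy as np

def simulate_ma1(theta, eps):
    """MA(1) paths y_t = eps_t + theta * eps_{t-1}; eps has shape (H, T + 1)."""
    return eps[:, 1:] + theta * eps[:, :-1]

def fit_ar(paths, p):
    """Pooled OLS fit of an AR(p) without intercept; `paths` has shape (H, T) or (T,)."""
    paths = np.atleast_2d(paths)
    n_obs = paths.shape[1]
    lags = [np.column_stack([y[p - j:n_obs - j] for j in range(1, p + 1)]) for y in paths]
    targets = [y[p:] for y in paths]
    beta, *_ = np.linalg.lstsq(np.vstack(lags), np.concatenate(targets), rcond=None)
    return beta

def indirect_inference_ma1(y, p=3, n_sim=10, grid=None, seed=0):
    """Grid-search version of (5.5.3) with W_T equal to the identity matrix."""
    grid = np.linspace(-0.9, 0.9, 37) if grid is None else grid
    eps = np.random.default_rng(seed).standard_normal((n_sim, len(y) + 1))
    beta_data = fit_ar(y, p)                          # auxiliary estimate on the data
    distances = []
    for theta in grid:
        gap = beta_data - fit_ar(simulate_ma1(theta, eps), p)
        distances.append(gap @ gap)
    return float(grid[int(np.argmin(distances))])

# Usage: data generated with theta = 0.5.
rng = np.random.default_rng(1)
y = simulate_ma1(0.5, rng.standard_normal((1, 1001)))[0]
theta_hat = indirect_inference_ma1(y)
```

The grid is restricted to the invertible region, where the mapping from $\theta$ to the AR coefficients is one-to-one.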

5.5.2 Estimating Diffusions

Let us consider the same diffusion equation as in section 5.3 which dealt with continuous time GMM, namely:

$$dy_t = \mu(y_t; \theta)\,dt + \sigma(y_t; \theta)\,dW_t \qquad (5.5.4)$$

In section 5.3 we noted that the above equation holds under certain restrictions such as the functions $\mu$ and $\sigma$ being restricted to $y_t$ as arguments. While these restrictions were binding for the setup of section 5.3 this will not be the case for the estimation procedures discussed here. Indeed, equation (5.5.4) is only used as an illustrative example. The diffusion is then simulated either via exact discretizations or some type of approximate discretization (e.g. Euler or Mil'shtein, see Pardoux and Talay (1985) or Kloeden and Platen (1992) for further details). More precisely we define the process $y^{(\delta)}_t$ such that:

$$y^{(\delta)}_{(k+1)\delta} = y^{(\delta)}_{k\delta} + \mu\left(y^{(\delta)}_{k\delta}; \theta\right)\delta + \sigma\left(y^{(\delta)}_{k\delta}; \theta\right)\delta^{1/2}\varepsilon^{(\delta)}_{(k+1)\delta} \qquad (5.5.5)$$

43. Again one could use a score principle here, following Gallant and Tauchen (1994). In fact in a linear Gaussian setting the SNP approach to fit data generated by a MA(1) model would be to estimate an AR($p$) model. Ghysels, Khalaf and Vodounou (1994) provide a more detailed discussion of score-based and indirect inference estimators of MA models as well as their relation with more standard estimators.

Under suitable regularity conditions (see for instance Stroock and Varadhan (1979)) we know that the diffusion admits a unique solution (in distribution) and the process $y^{(\delta)}_t$ converges to $y_t$ as $\delta$ goes to zero. Therefore one can expect to simulate $y_t$ quite accurately for $\delta$ sufficiently small. The auxiliary model may be a discretization of (5.5.4) choosing $\delta = 1$. Hence, one formulates a ML estimator based on the nonlinear AR model appearing in (5.5.5) setting $\delta = 1$. To control for the discretization bias one can simulate the underlying diffusion with $\delta = 1/10$ or $1/20$, for instance, and aggregate the simulated data to correspond with the sampling frequency of the DGP. Broze, Scaillet and Zakoïan (1994a) discuss the effect of the simulation step size on the asymptotic distribution.
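A minimal sketch of this simulation step follows: the Euler recursion (5.5.5) is run on a fine grid and only the values at integer dates are kept, which is the appropriate aggregation for a variable observed at points in time (a flow variable would instead be summed over each unit interval). The function name `euler_sample` and the mean-reverting drift used in the usage lines are assumptions made for illustration.

```python
import numpy as np

def euler_sample(mu, sigma, theta, y0, n_obs, steps_per_obs, rng):
    """Simulate (5.5.5) with delta = 1 / steps_per_obs and keep one point per unit of time."""
    delta = 1.0 / steps_per_obs
    y = float(y0)
    observed = np.empty(n_obs)
    for k in range(n_obs):
        for _ in range(steps_per_obs):
            eps = rng.standard_normal()
            y = y + mu(y, theta) * delta + sigma(y, theta) * np.sqrt(delta) * eps
        observed[k] = y                      # value of the fine path at date k + 1
    return observed

# Hypothetical drift and volatility functions, theta = (k, a, s).
def mu(y, theta):
    return theta[0] * (theta[1] - y)

def sigma(y, theta):
    return theta[2]

# Usage: delta = 1/10, sampled at unit frequency.
path = euler_sample(mu, sigma, (0.5, 0.0, 0.3), y0=1.0, n_obs=200,
                    steps_per_obs=10, rng=np.random.default_rng(0))
```

Setting `steps_per_obs=1` reproduces the coarse discretization with $\delta = 1$ that serves as auxiliary model.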

The use of simulation-based inference methods becomes particularly appropriate and attractive when diffusions involve latent processes, such as is the case with SV models. Gouriéroux and Monfort (1994, 1995) discuss several examples and study their performance via Monte Carlo simulation. It should be noted that estimating the diffusion at a coarser discretization is not the only possible choice of auxiliary model. Indeed, Pastorello, Renault and Touzi (1993), Engle and Lee (1994) and Gallant and Tauchen (1994) suggest the use of ARCH-type models.

There have been several successful applications of these methods to financial time series. They include Broze et al. (1994b), Engle and Lee (1994), Gallant, Hsieh and Tauchen (1994), Gallant and Tauchen (1994, 1995), Ghysels, Gouriéroux and Jasiak (1995b), Ghysels and Jasiak (1994a and b), Pastorello et al. (1993), among others.

5.6 Likelihood-based and Bayesian Methods

In a Gaussian linear state space model the likelihood function is constructed from the one step ahead prediction errors. This prediction error decomposition form of the likelihood is used as the criterion function in QML, but of course it is not the exact likelihood in this case. The exact filter proposed by Watanabe (1993) will, in principle, yield the exact likelihood. However, as was noted in section 3.4.2, because this filter uses numerical integration, it takes a long time to compute and if numerical optimization is to be carried out with respect to the hyperparameters it becomes impractical.

Kim and Shephard (1994) work with the linear state space form used in QML but approximate the $\log(\chi^2)$ distribution of the measurement error by a mixture of normals. For each of these normals, a prediction error decomposition likelihood function can be computed. A simulated EM algorithm is used to find the best mixture and hence calculate approximate ML estimates of the hyperparameters.

The exact likelihood function can also be constructed as a mixture of distributions for the observations conditional on the volatilities, that is

$$L(y; \phi, \sigma_\eta^2, \sigma^2) = \int p(y \mid h)\,p(h)\,dh$$

where $y$ and $h$ contain the $T$ elements of $y_t$ and $h_t$ respectively. This expression can be written in terms of the $\sigma_t^2$'s rather than their logarithms, the $h_t$'s, but it makes little difference to what follows. Of course the problem is that the above likelihood has no closed form, so it must be calculated by some kind of simulation method. Excellent discussions can be found in Shephard (1995) and in Jacquier, Polson and Rossi (1994), including the comments. Conceptually, the simplest approach is to use Monte Carlo integration by drawing from the unconditional distribution of $h$ for given values of the parameters, $(\phi, \sigma_\eta^2, \sigma^2)$, and estimating the likelihood as the average of the $p(y \mid h)$'s. This is then repeated, searching over $\phi$, $\sigma_\eta^2$ until the maximum of the simulated likelihood is found. As it stands this procedure is not very satisfactory, but it may be improved by using ideas of importance sampling. This has been implemented for ML estimation of SV models by Danielsson and Richard (1993) and Danielsson (1994). However, the method becomes more difficult as the sample size increases.
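The crude Monte Carlo estimate described above can be sketched as follows. The sketch assumes the model $y_t = \sigma\varepsilon_t\exp(0.5h_t)$ with a zero-mean stationary AR(1) for $h_t$, $h_t = \phi h_{t-1} + \eta_t$, $\eta_t \sim N(0, \sigma_\eta^2)$; the function name, the number of draws and the fixed seed are choices made for this illustration. The average is computed on the log scale because the individual $p(y \mid h)$ values underflow quickly as $T$ grows, which is one symptom of the difficulty noted in the text.

```python
import numpy as np

def simulated_loglik(y, phi, sigma_eta, sigma, n_draws=1000, seed=0):
    """Log of (1/N) * sum_j p(y | h^(j)), with h^(j) drawn from the stationary law of h."""
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)        # fixed seed: same draws for every parameter value
    n_obs = len(y)
    h = np.empty((n_draws, n_obs))
    h[:, 0] = sigma_eta / np.sqrt(1.0 - phi ** 2) * rng.standard_normal(n_draws)
    for t in range(1, n_obs):
        h[:, t] = phi * h[:, t - 1] + sigma_eta * rng.standard_normal(n_draws)
    var = sigma ** 2 * np.exp(h)             # Var(y_t | h_t) for each draw
    log_p = -0.5 * np.sum(np.log(2.0 * np.pi * var) + y ** 2 / var, axis=1)
    top = log_p.max()                        # log-sum-exp guard against underflow
    return float(top + np.log(np.mean(np.exp(log_p - top))))

# Usage on a short artificial series.
y_demo = np.array([0.3, -1.2, 0.8, 0.1, -0.4])
value = simulated_loglik(y_demo, phi=0.9, sigma_eta=0.3, sigma=1.0)
```

An importance sampler would replace the unconditional draws of $h$ by draws from a proposal closer to $p(h \mid y)$ and reweight accordingly.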

A more promising way of attacking likelihood estimation by simulation techniques is to use Markov Chain Monte Carlo (MCMC) to draw from the distribution of volatilities conditional on the observations. Ways in which this can be done were outlined in sub-section 3.4.2 on nonlinear filters and smoothers. Kim and Shephard (1994) suggest a method of computing ML estimators by putting their multimove algorithm within a simulated EM algorithm. Jacquier, Polson and Rossi (1994) adopt a Bayesian approach in which the specification of the model has a hierarchical structure in which a prior distribution for the hyperparameters, $\varphi = (\sigma_\eta, \phi, \sigma)'$, joins the conditional distributions, $y \mid h$ and $h \mid \varphi$. (Actually the $\sigma_t$'s are used rather than the $h_t$'s.) The joint posterior of $h$ and $\varphi$ is proportional to the product of these three distributions, that is $p(h, \varphi \mid y) \propto p(y \mid h)\,p(h \mid \varphi)\,p(\varphi)$. The introduction of $h$ makes the statistical treatment tractable and is an example of what is called data augmentation; see Tanner and Wong (1987). From the joint posterior, $p(h, \varphi \mid y)$, the marginal $p(h \mid y)$ solves the smoothing problem for the unobserved volatilities, taking account of the sampling variability in the hyperparameters. Conditional on $h$, the posterior of $\varphi$, $p(\varphi \mid h, y)$, is simple to compute from standard Bayesian treatment of linear models. If it were also possible to sample directly from $p(h \mid \varphi, y)$ at low cost, it would be straightforward to construct a Markov chain by alternating back and forth drawing from $p(\varphi \mid h, y)$ and $p(h \mid \varphi, y)$. This would produce a cyclic chain, a special case of which is the Gibbs sampler. However, as was noted in sub-section 3.4.2, Jacquier, Polson and Rossi (1994) show that it is much better to decompose $p(h \mid \varphi, y)$ into a set of univariate distributions in which each $h_t$, or rather $\sigma_t$, is conditioned on all the others.

The prior distribution for $\omega$, the parameters of the volatility process in JPR (1994), is the standard conjugate prior for the linear model, a (truncated) Normal-Gamma. The priors can be made extremely diffuse while remaining proper. JPR conduct an extensive sampling experiment to document the performance of this and more traditional approaches. Simulating stochastic volatility series, they compare the sampling performance of the posterior mean with that of the QML and GMM point estimates. The MCMC posterior mean exhibits root mean squared errors anywhere between half and a quarter of the size of those of the GMM and QML point estimates. Even more striking are the volatility smoothing performance results. The root mean squared error of the posterior mean of $h_t$ produced by the Bayesian filter is 10% smaller than that of the point estimate produced by an approximate Kalman filter supplied with the true parameters.

Shephard and Kim in their comment on JPR (1994) point out that for very high $\phi$ and small $\sigma_\eta$, the rate of convergence of the JPR algorithm will slow down. More draws will then be required to obtain the same amount of information. They propose to approximate the volatility disturbance with a discrete mixture of normals. The benefit of the method is that a draw of the whole vector $h$ is then possible, which is faster than $T$ separate draws of each $h_t$. However this is at the cost that the draws navigate in a much higher dimensional space due to the discretisation effected. Also, the convergence of chains based upon discrete mixtures is sensitive to the number of components and their assigned probability weights. Mahieu and Schotman (1994) add some generality to the Shephard and Kim idea by letting the data produce estimates of the characteristics of the discretized state space (probabilities, mean and variance).

The original implementation of the JPR algorithm was limited to a very basic model of stochastic volatility, AR(1) with uncorrelated mean and volatility disturbances. In a univariate setup, correlated disturbances are likely to be important for stock returns, i.e., the so-called leverage effect. The evidence in Gallant, Rossi, and Tauchen (1994) also points at non-normal conditional errors with both skewness and kurtosis. Jacquier, Polson, and Rossi (1995a) show how the hierarchical framework allows the convenient extension of the MCMC algorithm to more general models. Namely, they estimate univariate stochastic volatility models with correlated disturbances, and skewed and fat-tailed variance disturbance, as well as multivariate models. Alternatively, the MCMC algorithm can be extended to a factor structure. The factors exhibit stochastic volatility and can be observable or non-observable.

5.7 Inference and Option Price Data

Some of the continuous time SV models currently found in the literature were developed to answer questions regarding derivative security pricing. Given this rather explicit link between derivatives and SV diffusions it is perhaps somewhat surprising that relatively little attention has been paid to the use of option price data to estimate continuous time diffusions. Melino (1994) in his survey in fact notes: "Clearly, information about the stochastic properties of an asset's price is contained both in the history of the asset's price and the price of any options written on it. Current strategies for combining these two sources of information, including implicit estimation, are uncomfortably ad hoc. Statistically speaking, we need to model the source of the prediction errors in option pricing and to relate the distribution of these errors to the stock price process." For example implicit estimation, like computation of BS implied volatilities, is certainly uncomfortably ad hoc from a statistical point of view. In general, each observed option price introduces one source of prediction error when compared to a pricing model. The challenge is to model the joint nondegenerate probability distribution of options and asset prices via a number of unobserved state variables. This approach has been pursued in a number of recent papers, including Christensen (1992), Renault and Touzi (1992), Pastorello et al. (1993), Duan (1994) and Renault (1995).

Christensen (1992) considers a pricing model for $n$ assets as a function of a state vector $x_t$ which is $(l + n)$-dimensional and divided in an $l$-dimensional observed ($z_t$) and an $n$-dimensional unobserved ($\omega_t$) component. Let $p_t$ be the price vector of the $n$ assets, then:

$$p_t = m(z_t, \omega_t, \theta) \qquad (5.7.1)$$

Equation (5.7.1) provides a one-to-one relationship between the $n$ latent state variables $\omega_t$ and the $n$ observed prices $p_t$, for given $z_t$ and $\theta$. From a financial viewpoint, it implies that the $n$ assets are appropriate instruments to complete the markets if we assume that the observed state variables $z_t$ are already mimicked by the price dynamics of other (primitive) assets. Moreover, from a statistical viewpoint it allows full structural maximum likelihood estimation provided the log-likelihood function for observed prices can be deduced easily from a statistical model for $x_t$. For instance, in a Markovian setting where, conditionally on $x_0$, the joint distribution of $x_1^T = (x_t)_{1 \le t \le T}$ is given by the density:

$$f_x\left(x_1^T \mid x_0; \theta\right) = \prod_{t=1}^{T} f\left(z_t, \omega_t \mid z_{t-1}, \omega_{t-1}; \theta\right) \qquad (5.7.2)$$

the conditional distribution of data $D_1^T = (p_t, z_t)_{1 \le t \le T}$ given $D_0 = (p_0, z_0)$ is obtained by the usual Jacobian formula:

$$f_D\left(D_1^T \mid D_0; \theta\right) = \prod_{t=1}^{T} f\left[z_t, m_\theta^{-1}(z_t, p_t) \mid z_{t-1}, m_\theta^{-1}(z_{t-1}, p_{t-1}); \theta\right] \times \left|\nabla_\omega m\left(z_t, m_\theta^{-1}(z_t, p_t); \theta\right)\right|^{-1} \qquad (5.7.3)$$

where $m_\theta^{-1}(z, \cdot)$ is the $\omega$-inverse of $m(z, \cdot, \theta)$ defined formally by $m_\theta^{-1}(z, m(z, \omega, \theta)) = \omega$, while $\nabla_\omega m(\cdot)$ represents the columns corresponding to $\omega$ of the Jacobian matrix. This MLE using price data of derivatives was proposed independently by Christensen (1992) and Duan (1994). Renault and Touzi (1992) were instead more specifically interested in the Hull and White option pricing formula with $z_t = S_t$, the observed underlying asset price, and $\omega_t = \sigma_t$, the unobserved stochastic volatility process. Then with the joint process $x_t = (S_t, \sigma_t)$ being Markovian we have a call price of the form:

$$C_t = m(x_t, \theta, K)$$

where $\theta = (\alpha', \gamma')'$ involves two types of parameters: (1) the vector $\alpha$ of parameters describing the dynamics of the joint process $x_t = (S_t, \sigma_t)$, which under the equivalent martingale measure allows one to compute the expectation with respect to the (risk-neutral) conditional probability distribution of the integrated squared volatility over $(t, t+h)$ given $\sigma_t$; and (2) the vector $\gamma$ of parameters which characterize the risk premia determining the relation between the risk neutral probability distribution of the $x$ process and the Data Generating Process.

Structural MLE is often difficult to implement. This motivated Renault and Touzi (1992) and Pastorello, Renault and Touzi (1993) to consider less efficient but simpler and more robust procedures involving some proxies of the structural likelihood (5.7.3).

To illustrate these procedures let us consider the standard log-normal SV model in continuous time:

$$d\log\sigma_t = k\left(a - \log\sigma_t\right)dt + c\,dW_t^\sigma. \qquad (5.7.4)$$

Standard option pricing arguments allow us to ignore misspecifications of the drift of the underlying asset price process. Hence, a first step towards simplicity and robustness is to isolate from the likelihood function the volatility dynamics, namely:

$$\prod_{i=1}^{n} \left(2\pi c^2\right)^{-1/2} \exp\left\{-\left(2c^2\right)^{-1}\left[\log\sigma_{t_i} - e^{-k\Delta t}\log\sigma_{t_{i-1}} - a\left(1 - e^{-k\Delta t}\right)\right]^2\right\} \qquad (5.7.5)$$

associated with a sample $\sigma_{t_i}$, $i = 1, \ldots, n$, and $t_i - t_{i-1} = \Delta t$. To approximate this expression one can consider a direct method, as in Renault and Touzi (1992), or an indirect method, as in Pastorello et al. (1993). The former involves calculating implied volatilities from the Hull and White model to create pseudo samples $\sigma_{t_i}$ parameterized by $k$, $a$ and $c$ and computing the maximum of (5.7.5) with respect to those three parameters (footnote 44). Pastorello et al. (1993) proposed several indirect inference methods, described in section 5.5, in the context of (5.7.5). For instance, they propose to use an indirect inference strategy involving GARCH(1,1) volatility estimates obtained from the underlying asset (also independently suggested by Engle and Lee (1994)). This produces asymptotically unbiased but rather inefficient estimates. Pastorello et al. indeed find that an indirect inference simplification of the Renault and Touzi direct procedure involving option prices is far more efficient. It is a clear illustration of the intuition that the use of option price data paired with suitable statistical methods should largely improve the accuracy of estimating volatility diffusion parameters.
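As a rough illustration of what maximizing (5.7.5) involves once a sample of (pseudo) volatilities is available, the sketch below evaluates the logarithm of (5.7.5) exactly as printed, that is with variance $c^2$ per sampling step, and maximizes it in closed form through an OLS regression of $\log\sigma_{t_i}$ on $\log\sigma_{t_{i-1}}$. The function names are hypothetical, and the sketch takes the volatility sample as given: it does not perform the Hull and White inversion that produces the pseudo samples in the direct method.

```python
import numpy as np

def volatility_loglik(params, log_sigma, dt):
    """Log of expression (5.7.5) as printed, for log-volatilities observed every dt."""
    k, a, c = params
    decay = np.exp(-k * dt)
    resid = log_sigma[1:] - decay * log_sigma[:-1] - a * (1.0 - decay)
    return float(-0.5 * resid.size * np.log(2.0 * np.pi * c ** 2)
                 - np.sum(resid ** 2) / (2.0 * c ** 2))

def fit_volatility_dynamics(log_sigma, dt):
    """Maximise (5.7.5) over (k, a, c) via the equivalent Gaussian AR(1) regression."""
    slope, intercept = np.polyfit(log_sigma[:-1], log_sigma[1:], 1)
    if not 0.0 < slope < 1.0:
        raise ValueError("fitted persistence outside (0, 1): k and a cannot be recovered")
    k = -np.log(slope) / dt                  # slope = exp(-k * dt)
    a = intercept / (1.0 - slope)            # intercept = a * (1 - exp(-k * dt))
    resid = log_sigma[1:] - slope * log_sigma[:-1] - intercept
    c = np.sqrt(np.mean(resid ** 2))
    return float(k), float(a), float(c)

# Usage on an artificial log-volatility series with k = 0.5, a = -1, c = 0.2, dt = 1.
rng = np.random.default_rng(0)
log_sigma = np.empty(500)
log_sigma[0] = -1.0
for i in range(1, 500):
    log_sigma[i] = (np.exp(-0.5) * log_sigma[i - 1] - (1.0 - np.exp(-0.5))
                    + 0.2 * rng.standard_normal())
k_hat, a_hat, c_hat = fit_volatility_dynamics(log_sigma, dt=1.0)
```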

5.8 Regression Models with Stochastic Volatility

A single equation regression model with stochastic volatility in the disturbance term may be written

$$y_t = x_t'\beta + u_t, \qquad t = 1, \ldots, T, \qquad (5.8.1)$$

where $y_t$ denotes the $t$-th observation, $x_t$ is a $k \times 1$ vector of explanatory variables, $\beta$ is a $k \times 1$ vector of coefficients and $u_t = \sigma\varepsilon_t\exp(0.5h_t)$ as discussed in section 3. As a special case, the observations may simply have a non-zero mean so that $x_t'\beta = \mu$ for all $t$.

Since $u_t$ is stationary, an OLS regression of $y_t$ on $x_t$ yields a consistent estimator of $\beta$. However it is not efficient.

44. The direct maximization of (5.7.5) using BS implied volatilities has also been proposed, see e.g. Heynen, Kemna and Vorst (1994). Obviously the use of BS implied volatility induces a misspecification bias due to the BS model assumptions.

For given values of the SV parameters, $\phi$ and $\sigma_\eta^2$, a smoothed estimator of $h_t$, $h_{t|T}$, can be computed using one of the methods outlined in section 3.4. Multiplying (5.8.1) through by $\exp(-0.5h_{t|T})$ gives

$$\tilde y_t = \tilde x_t'\beta + \tilde u_t, \qquad t = 1, \ldots, T \qquad (5.8.2)$$

where the $\tilde u_t$'s can be thought of as heteroskedasticity corrected disturbances. Harvey and Shephard (1993) show that these disturbances have zero mean, constant variance and are serially uncorrelated and hence suggest the construction of a feasible GLS estimator

$$\tilde\beta = \left[\sum_{t=1}^{T} e^{-h_{t|T}} x_t x_t'\right]^{-1} \sum_{t=1}^{T} e^{-h_{t|T}} x_t y_t \qquad (5.8.3)$$

In the classical heteroskedastic regression model $h_t$ is deterministic and depends on a fixed number of unknown parameters. Because these parameters can be estimated consistently, the feasible GLS estimator has the same asymptotic distribution as the GLS estimator. Here $h_t$ is stochastic and the MSE of its estimator is of $O(1)$. The situation is therefore somewhat different. Harvey and Shephard (1993) show that, under standard regularity conditions on the sequence of $x_t$, $\tilde\beta$ is asymptotically normal with mean $\beta$ and a covariance matrix which can be consistently estimated by

$$\widehat{\operatorname{avar}}(\tilde\beta) = \left[\sum_{t=1}^{T} e^{-h_{t|T}} x_t x_t'\right]^{-1} \left[\sum_{t=1}^{T} \left(y_t - x_t'\tilde\beta\right)^2 e^{-2h_{t|T}} x_t x_t'\right] \left[\sum_{t=1}^{T} e^{-h_{t|T}} x_t x_t'\right]^{-1} \qquad (5.8.4)$$

When $h_{t|T}$ is the smoothed estimate given by the linear state space form, the analysis in Harvey and Shephard (1993) suggests that, asymptotically, the feasible GLS estimator is almost as efficient as the GLS estimator and considerably more efficient than the OLS estimator. It would be possible to replace $\exp(h_{t|T})$ by a better estimate computed from one of the methods described in section 3.4 but this may not have much effect on the efficiency of the resulting feasible GLS estimator of $\beta$.
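The estimator (5.8.3) and the covariance estimate (5.8.4) translate directly into matrix operations. The sketch below assumes the smoothed values $h_{t|T}$ are already available as an array (the smoothing step of section 3.4 is not reproduced); the function name `feasible_gls` and the artificial data in the usage lines are illustrative only.

```python
import numpy as np

def feasible_gls(y, X, h_smooth):
    """beta-tilde of (5.8.3) and the covariance estimate (5.8.4); h_smooth holds h_{t|T}."""
    w = np.exp(-h_smooth)                                # e^{-h_{t|T}}
    bread = np.linalg.inv(X.T @ (w[:, None] * X))        # [sum_t w_t x_t x_t']^{-1}
    beta = bread @ (X.T @ (w * y))
    resid = y - X @ beta
    meat = X.T @ ((resid ** 2 * w ** 2)[:, None] * X)    # sum_t resid_t^2 e^{-2 h_{t|T}} x_t x_t'
    return beta, bread @ meat @ bread

# Usage with artificial data; here the true h_t stands in for the smoothed estimate.
rng = np.random.default_rng(0)
n = 300
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
h = 0.5 * rng.standard_normal(n)
y = X @ np.array([1.0, 2.0]) + np.exp(0.5 * h) * rng.standard_normal(n)
beta_tilde, avar = feasible_gls(y, X, h)
std_err = np.sqrt(np.diag(avar))
```

With `h_smooth` set to zero the function returns the OLS estimate together with a White-type heteroskedasticity-robust covariance matrix, which makes the sandwich structure of (5.8.4) easy to recognize.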

When $h_t$ is nonstationary, or nearly nonstationary, Hansen (1995) shows that it is possible to construct a feasible adaptive least squares estimator which is asymptotically equivalent to GLS.

6 Conclusions

No survey is ever complete. There are two particular areas we expect to flourish in the years to come but which we were not able to cover. The first is the area of market microstructures, which is well surveyed in a recent review paper by Goodhart and O'Hara (1995). With the ever increasing availability of high frequency data series, we anticipate more work involving game theoretic models. These can now be estimated because of recent advances in econometric methods, similar to those enabling us to estimate diffusions. Another area where we expect interesting research to emerge is that involving nonparametric procedures to estimate SV continuous time and derivative securities models. Recent papers include Ait-Sahalia (1994), Ait-Sahalia et al. (1994), Bossaerts, Hafner and Härdle (1995), Broadie et al. (1995), Conley et al. (1995), Elsheimer et al. (1995), Gouriéroux, Monfort and Tenreiro (1994), Gouriéroux and Scaillet (1995), Hutchinson, Lo and Poggio (1994), Lezan et al. (1995), Lo (1995), Pagan and Schwert (1992).

Research into the econometrics of Stochastic Volatility models is relatively new. As our survey has shown, there has been a burst of activity in recent years drawing on the latest statistical technology. As regards the relationship with ARCH, our view is that SV and ARCH are not necessarily direct competitors, but rather complement each other in certain respects. Recent advances such as the use of ARCH models as filters, the weakening of GARCH and temporal aggregation and the introduction of nonparametric methods to fit conditional variances, illustrate that a unified strategy for modelling volatility needs to draw on both ARCH and SV.


References

[1] Abramowitz, M. and N.C. Stegun (1970), Handbook of Mathematical Functions. New York: Dover Publications Inc.

[2] Ait-Sahalia, Y. (1994), "Nonparametric Pricing of Interest Rate Derivative Securities", Discussion Paper, Graduate School of Business, University of Chicago.

[3] Ait-Sahalia, Y., S.J. Bickel and T.M. Stoker (1994), "Goodness-of-Fit Tests for Regression Using Kernel Methods", Discussion Paper, University of Chicago.

[4] Amin, K.L. and V. Ng (1993), "Equilibrium Option Valuation with Systematic Stochastic Volatility", Journal of Finance 48, 881-910.

[5] Andersen, T.G. (1992), "Volatility", Discussion Paper, Northwestern University.

[6] Andersen, T.G. (1994), "Stochastic Autoregressive Volatility: A Framework for Volatility Modeling", Mathematical Finance 4, 75-102.

[7] Andersen, T.G. and T. Bollerslev (1995), "Intraday Seasonality and Volatility Persistence in Financial Markets", Discussion Paper, Northwestern University.

[8] Andersen, T.G. and B. Sørensen (1993), "GMM Estimation of a Stochastic Volatility Model: A Monte Carlo Study", Discussion Paper, Northwestern University.

[9] Andrews, D.W.K. (1993), "Exactly Median-Unbiased Estimation of First Order Autoregressive Unit Root Models", Econometrica 61, 139-165.

[10] Bachelier, L. (1900), "Théorie de la spéculation", Ann. Sci. Ecole Norm. Sup. 17, 21-86. [On the Random Character of Stock Market Prices (Paul H. Cootner, ed.), The MIT Press, Cambridge, Mass., 1964].

[11] Baillie, R.T. and T. Bollerslev (1989), "The Message in Daily Exchange Rates: A Conditional Variance Tale", Journal of Business and Economic Statistics 7, 297-305.

[12] Baillie, R.T. and T. Bollerslev (1991), "Intra Day and Inter Day Volatility in Foreign Exchange Rates", Review of Economic Studies 58, 565-585.

[13] Baillie, R.T., T. Bollerslev and H.O. Mikkelsen (1993), "Fractionally Integrated Generalized Autoregressive Conditional Heteroskedasticity", Journal of Econometrics (forthcoming).

[14] Bajeux, I. and J.C. Rochet (1992), "Dynamic Spanning: Are Options an Appropriate Instrument?", Mathematical Finance (forthcoming).

[15] Bates, D.S. (1995a), "Testing Option Pricing Models", Discussion Paper, Wharton School, University of Pennsylvania.


[16] Bates, D.S. (1995b), "Jumps and Stochastic Volatility: Exchange Rate Processes Implicit in PHLX Deutschemark Options", Review of Financial Studies (forthcoming).

[17] Beckers, S. (1981), "Standard Deviations Implied in Option Prices as Predictors of Future Stock Price Variability", Journal of Banking and Finance 5, 363-381.

[18] Bera, A.K. and M.L. Higgins (1995), "On ARCH Models: Properties, Estimation and Testing", in L. Exley, D.A.R. George, C.J. Roberts and S. Sawyer (eds.), Surveys in Econometrics (Basil Blackwell, Oxford). Reprinted from Journal of Economic Surveys.

[19] Black, F. (1976), "Studies in Stock Price Volatility Changes", Proceedings of the 1976 Business Meeting of the Business and Economic Statistics Section, American Statistical Association, 177-181.

[20] Black, F. and M. Scholes (1973), "The Pricing of Options and Corporate Liabilities", Journal of Political Economy 81, 637-654.

[21] Bollerslev, T. (1986), "Generalized Autoregressive Conditional Heteroskedasticity", Journal of Econometrics 31, 307-327.

[22] Bollerslev, T. (1993), "Long Memory in Stochastic Volatility", Discussion Paper, Northwestern University.

[23] Bollerslev, T., Y.C. Chou and K. Kroner (1992), "ARCH Modelling in Finance: A Selective Review of the Theory and Empirical Evidence", Journal of Econometrics 52, 201-224.

[24] Bollerslev, T. and R. Engle (1993), "Common Persistence in Conditional Variances", Econometrica 61, 166-187.

[25] Bollerslev, T., R. Engle and D. Nelson (1994), "ARCH Models", in R.F. Engle and D. McFadden (eds.), Handbook of Econometrics, Volume IV (North-Holland, Amsterdam).

[26] Bollerslev, T., R. Engle and J. Wooldridge (1988), "A Capital Asset Pricing Model with Time Varying Covariances", Journal of Political Economy 96, 116-131.

[27] Bollerslev, T. and E. Ghysels (1994), "On Periodic Autoregression Conditional Heteroskedasticity", Journal of Business and Economic Statistics (forthcoming).

[28] Bollerslev, T. and H.O. Mikkelsen (1995), "Modeling and Pricing Long-Memory in Stock Market Volatility", Journal of Econometrics (forthcoming).

[29] Bossaerts, P., C. Hafner and W. Härdle (1995), "Foreign Exchange Rates Have Surprising Volatility", Discussion Paper, CentER, University of Tilburg.

[30] Bossaerts, P. and P. Hillion (1995), "Local Parametric Analysis of Hedging in Discrete Time", Journal of Econometrics (forthcoming).

[31] Breidt, F.J., N. Crato and P. de Lima (1993), "Modeling Long-Memory Stochastic Volatility", Discussion Paper, Iowa State University.


[32] Breidt, F. J. and A.L. Carriquiry (1995), "Improved Quasi-Maximum Likelihood Estimation for Stochastic Volatility Models", Mimeo, Department of Statistics, University of Iowa.

[33] Broadie, M., J. Detemple, E. Ghysels and O. Torrès (1995), "American Options with Stochastic Volatility: A Nonparametric Approach", Discussion Paper, CIRANO.

[34] Broze, L., O. Scaillet and J.M. Zakoian (1994a), "Quasi Indirect Inference for Diffusion Processes", Discussion Paper CORE.

[35] Broze, L., O. Scaillet and J.M. Zakoian (1994b), "Testing for continuous time models of the short term interest rate", Journal of Empirical Finance (forthcoming).

[36] Campa, J. M. and P.H.K. Chang (1995), "Testing the expectations hypothesis on the term structure of implied volatilities in foreign exchange options", Journal of Finance 50, (forthcoming).

[37] Campbell, J.Y. and A.S. Kyle (1993), "Smart Money, Noise Trading and Stock Price Behaviour", Review of Economic Studies 60, 1-34.

[38] Canina, L. and S. Figlewski (1993), "The informational content of implied volatility", Review of Financial Studies 6, 659-682.

[39] Canova, F. (1992), "Detrending and Business Cycle Facts", Discussion Paper, European University Institute, Florence.

[40] Chesney, M. and L. Scott (1989), "Pricing European Currency Options: A comparison of the Modified Black-Scholes Model and a Random Variance Model", Journal of Financial and Quantitative Analysis 24, 267-284.

[41] Cheung, Y.-W. and F.X. Diebold (1994), "On Maximum Likelihood Estimation of the Differencing Parameter of Fractionally-Integrated Noise with Unknown Mean", Journal of Econometrics 62, 301-316.

[42] Chiras, D.P. and S. Manaster (1978), "The information content of option prices and a test of market efficiency", Journal of Financial Economics 6, 213-234.

[43] Christensen, B.J. (1992), "Asset Prices and the Empirical Martingale Model", Discussion Paper, New York University.

[44] Christie, A. A. (1982), "The Stochastic Behavior of Common Stock Variances: Value, Leverage, and Interest Rate Effects", Journal of Financial Economics 10, 407-432.

[45] Clark, P.K. (1973), "A Subordinated Stochastic Process Model with Finite Variance for Speculative Prices", Econometrica 41, 135-156.

[46] Clewlow, L. and X. Xu (1993), "The Dynamics of Stochastic Volatility", Discussion Paper, University of Warwick.

[47] Comte, F. and E. Renault (1993), "Long memory continuous time models", Journal of Econometrics (forthcoming).


[48] Comte, F. and E. Renault (1995), "Long memory continuous time Stochastic Volatility Models", Paper presented at the HFDF-I Conference, Zürich.

[49] Conley, T., L.P. Hansen, E. Luttmer and J. Scheinkman (1995), "Estimating Subordinated Diffusions from discrete Time Data", Discussion paper, University of Chicago.

[50] Cornell, B. (1978), "Using the Options Pricing Model to Measure the Uncertainty Producing Effect of Major announcements", Financial Management 7, 54-59.

[51] Cox, J.C. (1975), "Notes on Option Pricing I: Constant Elasticity of Variance Diffusions", Discussion Paper, Stanford University.

[52] Cox, J.C. and S. Ross (1976), "The Valuation of Options for Alternative Stochastic Processes", Journal of Financial Economics 3, 145-166.

[53] Cox, J.C. and M. Rubinstein (1985), Options Markets, (Englewood Cliffs, Prentice-Hall, New Jersey).

[54] Dacorogna, M.M., U.A. Müller, R.J. Nagler, R.B. Olsen and O.V. Pictet (1993), "A Geographical Model for the Daily and Weekly Seasonal Volatility in the Foreign Exchange Market", Journal of International Money and Finance 12, 413-438.

[55] Danielsson, J. (1994), "Stochastic Volatility in asset Prices: Estimation with Simulated Maximum Likelihood", Journal of Econometrics 61, 375-400.

[56] Danielsson, J. and J.F. Richard (1993), "Accelerated Gaussian Importance Sampler with Application to Dynamic Latent Variable Models", Journal of Applied Econometrics 3, S153-S174.

[57] Dassios, A. (1995), "Asymptotic expressions for approximations to stochastic variance models", mimeo, London School of Economics.

[58] Day, T.E. and C.M. Lewis (1988), "The behavior of the volatility implicit in the prices of stock index options", Journal of Financial Economics 22, 103-122.

[59] Day, T.E. and C.M. Lewis (1992), "Stock market volatility and the information content of stock index options", Journal of Econometrics 52, 267-287.

[60] Diebold, F.X. (1988), Empirical Modeling of Exchange Rate Dynamics, (Springer Verlag, New York).

[61] Diebold, F.X. and J.A. Lopez (1995), "Modeling Volatility Dynamics", in K. Hoover (ed), Macroeconomics: Developments, Tensions and Prospects.

[62] Diebold, F.X. and M. Nerlove (1989), "The Dynamics of Exchange Rate Volatility: A Multivariate Latent Factor ARCH Model", Journal of Applied Econometrics 4, 1-22.

[63] Ding, Z., C.W.J. Granger and R.F. Engle (1993), "A Long Memory Property of Stock Market Returns and a New Model", Journal of Empirical Finance 1, 83-108.

[64] Diz, F. and T.J. Finucane (1993), "Do the options markets really overreact?", Journal of Futures Markets 13, 298-312.


[65] Drost, F.C. and T. E. Nijman (1993), "Temporal Aggregation of GARCH Processes", Econometrica 61, 909-927.

[66] Drost, F.C. and B.J.M. Werker (1994), "Closing the GARCH Gap: Continuous Time GARCH Modelling", Discussion Paper CentER, University of Tilburg.

[67] Duan, J.C. (1994), "Maximum Likelihood Estimation Using Price Data of the Derivative Contract", Mathematical Finance 4, 155-167.

[68] Duan, J.C. (1995), "The GARCH Option Pricing Model", Mathematical Finance 5, 13-32.

[69] Duffie, D. (1989), Futures Markets, (Prentice-Hall International Editions).

[70] Duffie, D. (1992), Dynamic Asset Pricing Theory, (Princeton University Press).

[71] Duffie, D. and K. J. Singleton (1993), "Simulated Moments Estimation of Markov Models of Asset Prices", Econometrica 61, 929-952.

[72] Dunsmuir, W. (1979), "A central limit theorem for parameter estimation in stationary vector time series and its applications to models for a signal observed with noise", Annals of Statistics 7, 490-506.

[73] Easley, D. and M. O'Hara (1992), "Time and the Process of Security Price Adjustment", Journal of Finance 47, 577-605.

[74] Ederington, L.H. and J.H. Lee (1993), "How markets process information: news releases and volatility", Journal of Finance 48, 1161-1192.

[75] Elsheimer, B., M. Fisher, D. Nychka and D. Zirvos (1995), "Smoothing Splines Estimates of the Discount Function based on US Bond Prices", Discussion Paper Federal Reserve, Washington, D.C.

[76] Engle, R.F. (1982), "Autoregressive Conditional Heteroskedasticity with Estimates of the Variance of United Kingdom Inflation", Econometrica 50, 987-1007.

[77] Engle, R.F. and C.W.J. Granger (1987), "Co-Integration and Error Correction: Representation, Estimation and Testing", Econometrica 55, 251-276.

[78] Engle, R.F. and S. Kozicki (1993), "Testing for Common Features", Journal of Business and Economic Statistics 11, 369-379.

[79] Engle, R.F. and G.G.J. Lee (1994), "Estimating diffusion models of stochastic volatility", Discussion Paper, University of California at San Diego.

[80] Engle, R.F. and C. Mustafa (1992), "Implied ARCH models from option prices", Journal of Econometrics 52, 289-311.

[81] Engle, R.F. and V.K. Ng (1993), "Measuring and Testing the Impact of News on Volatility", Journal of Finance 48, 1749-1801.

[82] Fama, E.F. (1963), "Mandelbrot and The Stable Paretian Distribution", Journal of Business 36, 420-429.


[83] Fama, E.F. (1965), "The Behavior of Stock Market Prices", Journal of Business 38, 34-105.

[84] Foster, D. and S. Viswanathan (1993a), "The Effect of Public Information and Competition on Trading Volume and Price Volatility", Review of Financial Studies 6, 23-56.

[85] Foster, D. and S. Viswanathan (1993b), "Can Speculative Trading Explain the Volume Volatility Relation", Discussion paper, Fuqua School of Business, Duke University.

[86] French, K. and R. Roll (1986), "Stock Return Variances: The Arrival of Information and the Reaction of Traders", Journal of Financial Economics 17, 5-26.

[87] Gallant, A.R., D.A. Hsieh and G. Tauchen (1994), "Estimation of Stochastic Volatility Models with Suggestive Diagnostics", Discussion paper, Duke University.

[88] Gallant, A.R., P.E. Rossi and G. Tauchen (1992), "Stock Prices and Volume", Review of Financial Studies 5, 199-242.

[89] Gallant, A.R., P.E. Rossi and G. Tauchen (1993), "Nonlinear Dynamic Structures", Econometrica 61, 871-907.

[90] Gallant, A.R. and G. Tauchen (1989), "Semiparametric Estimation of Conditionally Constrained Heterogeneous Processes: Asset Pricing applications", Econometrica 57, 1091-1120.

[91] Gallant, A.R. and G. Tauchen (1992), "A Nonparametric Approach to Nonlinear Time Series Analysis: Estimation and Simulation", in E. Parzen, D. Brillinger, M. Rosenblatt, M. Taqqu, J. Geweke and P. Caines (eds.), New Dimensions in Time Series Analysis, Springer-Verlag, New York.

[92] Gallant, A.R. and G. Tauchen (1994), "Which Moments to Match", Econometric Theory (forthcoming).

[93] Gallant, A.R. and G. Tauchen (1995), "Estimation of Continuous Time Models for Stock Returns and Interest Rates", Discussion Paper, Duke University.

[94] Gallant, A.R. and H. White (1988), A Unified Theory of Estimation and Inference for Nonlinear Dynamic Models, (Basil Blackwell, Oxford).

[95] Garcia, R. and E. Renault (1995), "Risk Aversion, Intertemporal Substitution and Option Pricing", Discussion Paper CIRANO.

[96] Geweke, J. (1994), "Comment on Jacquier, Polson and Rossi", Journal of Business and Economic Statistics 12, 397-399.

[97] Geweke, J. (1995), "Monte Carlo Simulation and Numerical Integration", in H. Amman, D. Kendrick and J. Rust (ed.) Handbook of Computational Economics (North Holland).

[98] Ghysels, E., C. Gouriéroux and J. Jasiak (1995a), "Market Time and Asset Price Movements: Theory and Estimation", Discussion paper CIRANO and C.R.D.E., Université de Montréal.

[99] Ghysels, E., C. Gouriéroux and J. Jasiak (1995b), "Trading Patterns, Time Deformation and Stochastic Volatility in Foreign Exchange Markets", Paper presented at the HFDF Conference, Zürich.


[100] Ghysels, E. and J. Jasiak (1994a), "Comments on Bayesian Analysis of Stochastic Volatility Models", Journal of Business and Economic Statistics 12, 399-401.

[101] Ghysels, E. and J. Jasiak (1994b), "Stochastic Volatility and Time Deformation: An Application of Trading Volume and Leverage Effects", Paper presented at the Western Finance Association Meetings, Santa Fe.

[102] Ghysels, E., L. Khalaf and C. Vodounou (1994), "Simulation Based Inference in Moving Average Models", Discussion Paper, CIRANO and C.R.D.E.

[103] Ghysels, E., H.S. Lee and P. Siklos (1993), "On the (Mis)Specification of Seasonality and its Consequences: An Empirical Investigation with U.S. Data", Empirical Economics 18, 747-760.

[104] Goodhart, C.A.E. and M. O'Hara (1995), "High Frequency Data in Financial Markets: Issues and Applications", Paper presented at HFDF Conference, Zürich.

[105] Gouriéroux, C. and A. Monfort (1993a), "Simulation Based Inference: A Survey with Special Reference to Panel Data Models", Journal of Econometrics 59, 5-33.

[106] Gouriéroux, C. and A. Monfort (1993b), "Pseudo-Likelihood Methods", in Maddala et al. (ed.) Handbook of Statistics Vol. 11, (North Holland, Amsterdam).

[107] Gouriéroux, C. and A. Monfort (1994), "Indirect Inference for Stochastic Differential Equations", Discussion Paper CREST, Paris.

[108] Gouriéroux, C. and A. Monfort (1995), Simulation-Based Econometric Methods, (CORE Lecture Series, Louvain-la-Neuve).

[109] Gouriéroux, C., A. Monfort and E. Renault (1993), "Indirect Inference", Journal of Applied Econometrics 8, S85-S118.

[110] Gouriéroux, C., A. Monfort and C. Tenreiro (1994), "Kernel M-Estimators: Nonparametric Diagnostics for Structural Models", Discussion Paper, CEPREMAP.

[111] Gouriéroux, C., A. Monfort and C. Tenreiro (1995), "Kernel M-Estimators and Functional Residual Plots", Discussion Paper CREST - ENSAE, Paris.

[112] Gouriéroux, C., E. Renault and N. Touzi (1994), "Calibration by Simulation for Small Sample Bias Correction", Discussion Paper CREST.

[113] Gouriéroux, C. and O. Scaillet (1994), "Estimation of the Term Structure from Bond Data", Journal of Empirical Finance, forthcoming.

[114] Granger, C.W.J. and Z. Ding (1994), "Stylized Facts on the Temporal and distributional Properties of Daily Data for Speculative Markets", Discussion Paper, University of California, San Diego.

[115] Hall, A.R. (1993), "Some Aspects of Generalized Method of Moments Estimation", in Maddala et al. (ed.) Handbook of Statistics Vol. 11, (North Holland, Amsterdam).


[116] Hamao, Y., R.W. Masulis and V.K. Ng (1990), "Correlations in Price Changes and Volatility Across International Stock Markets", Review of Financial Studies 3, 281-307.

[117] Hansen, B.E. (1995), "Regression with Nonstationary Volatility", Econometrica 63, 1113-1132.

[118] Hansen, L.P. (1982), "Large Sample Properties of Generalized Method of Moments Estimators", Econometrica 50, 1029-1054.

[119] Hansen, L.P. and J.A. Scheinkman (1995), "Back to the Future: Generating Moment Implications for Continuous-Time Markov Processes", Econometrica 63, 767-804.

[120] Harris, L. (1986), "A Transaction Data Study of Weekly and Intradaily Patterns in Stock Returns", Journal of Financial Economics 16, 99-117.

[121] Harrison, M. and D. Kreps (1979), "Martingale and Arbitrage in Multiperiod Securities Markets", Journal of Economic Theory 20, 381-408.

[122] Harrison, J.M. and S. Pliska (1981), "Martingales and Stochastic Integrals in the Theory of Continuous Trading", Stochastic Processes and Their Applications 11, 215-260.

[123] Harrison, P.J. and C.F. Stevens (1976), "Bayesian Forecasting" (with discussion), J. Royal Statistical Society, Ser. B, 38, 205-247.

[124] Harvey, A.C. (1989), Forecasting, Structural Time Series Models and the Kalman Filter, (Cambridge University Press).

[125] Harvey, A.C. and A. Jaeger (1993), "Detrending Stylized Facts and the Business Cycle", Journal of Applied Econometrics 8, 231-247.

[126] Harvey, A.C. (1993), "Long Memory in Stochastic Volatility", Discussion Paper, London School of Economics.

[127] Harvey, A.C. and S.J. Koopman (1993), "Forecasting Hourly Electricity Demand using Time-Varying Splines", Journal of the American Statistical Association 88, 1228-1236.

[128] Harvey, A.C., E. Ruiz and E. Sentana (1992), "Unobserved Component Time Series Models with ARCH Disturbances", Journal of Econometrics 52, 129-158.

[129] Harvey, A.C., E. Ruiz and N. Shephard (1994), "Multivariate Stochastic Variance Models", Review of Economic Studies 61, 247-264.

[130] Harvey, A.C. and N. Shephard (1993), "Estimation and Testing of Stochastic Variance Models", STICERD Econometrics, Discussion paper, EM93/268, London School of Economics.

[131] Harvey, A.C. and N. Shephard (1996), "Estimation of an Asymmetric Stochastic Volatility Model for Asset Returns", Journal of Business and Economic Statistics (forthcoming).

[132] Harvey, C.R. and R.D. Huang (1991), "Volatility in the Foreign Currency Futures Market", Review of Financial Studies 4, 543-569.


[133] Harvey, C.R. and R.D. Huang (1992), "Information Trading and fixed Income Volatility", Discussion Paper, Duke University.

[134] Harvey, C.R. and R.E. Whaley (1992), "Market volatility prediction and the efficiency of the S&P 100 index option market", Journal of Financial Economics 31, 43-74.

[135] Hausman, J.A. and A. W. Lo (1991), "An Ordered Probit Analysis of Transaction Stock Prices", Discussion paper, Wharton School, University of Pennsylvania.

[136] He, H. (1993), "Option Prices with Stochastic Volatilities: An Equilibrium Analysis", Discussion Paper, University of California, Berkeley.

[137] Heston, S.L. (1993), "A Closed-Form Solution for Options with Stochastic Volatility with Applications to Bond and Currency Options", Review of Financial Studies 6, 327-343.

[138] Heynen, R., A. Kemna and T. Vorst (1994), "Analysis of the term structure of implied volatility", Journal of Financial Quantitative Analysis.

[139] Hull, J. (1993), Options, futures and other derivative securities, 2nd ed., (Prentice-Hall International Editions, New Jersey).

[140] Hull, J. (1995), Introduction to Futures and Options Markets, 2nd ed., (Prentice-Hall, Englewood Cliffs, New Jersey).

[141] Hull, J. and A. White (1987), "The Pricing of Options on Assets with Stochastic Volatilities", Journal of Finance 42, 281-300.

[142] Huffman, G.W. (1987), "A Dynamic Equilibrium Model of Asset Prices and Transactions Volume", Journal of Political Economy 95, 138-159.

[143] Hutchinson, J.M., A.W. Lo and T. Poggio (1994), "A Nonparametric Approach to Pricing and Hedging Derivative Securities via Learning Networks", Journal of Finance 49, 851-890.

[144] Jacquier, E., N.G. Polson and P.E. Rossi (1994), "Bayesian Analysis of Stochastic Volatility Models" (with discussion), Journal of Business and Economic Statistics 12, 371-417.

[145] Jacquier, E., N.G. Polson and P.E. Rossi (1995a), "Multivariate and Prior Distributions for Stochastic Volatility Models", Discussion Paper CIRANO.

[146] Jacquier, E., N.G. Polson and P.E. Rossi (1995b), "Stochastic Volatility: Univariate and Multivariate Extensions", Rodney White Center for Financial Research Working Paper 19-95, The Wharton School, University of Pennsylvania.

[147] Jacquier, E., N.G. Polson and P.E. Rossi (1995c), "Efficient Option Pricing under Stochastic Volatility", Manuscript, The Wharton School, University of Pennsylvania.

[148] Jarrow, R. and Rudd (1983), Option Pricing, (Irwin, Homewood III).

[149] Johnson, H. and D. Shanno (1987), "Option Pricing when the Variance is Changing", Journal of Financial and Quantitative Analysis 22, 143-152.


[150] Jorion, P. (1995), "Predicting volatility in the foreign exchange market", Journal of Finance 50, (forthcoming).

[151] Karatzas, I. and S.E. Shreve (1988), Brownian Motion and Stochastic Calculus, (Springer-Verlag, New York, NY).

[152] Karpoff, J. (1987), "The Relation between Price Changes and Trading Volume: A Survey", Journal of Financial and Quantitative Analysis 22, 109-126.

[153] Kim, S. and N. Shephard (1994), "Stochastic Volatility: Optimal Likelihood Inference and Comparison with ARCH Model", Discussion Paper, Nuffield College, Oxford.

[154] King, M., E. Sentana and S. Wadhwani (1994), "Volatility and Links Between National Stock Markets", Econometrica 62, 901-934.

[155] Kitagawa, G. (1987), "Non-Gaussian State Space Modeling of Nonstationary Time Series" (with discussion), Journal of the American Statistical Association 79, 378-389.

[156] Kloeden, P.E. and E. Platten (1992), Numerical Solutions of Stochastic Differential Equations, (Springer-Verlag, Heidelberg).

[157] Lamoureux, C. and W. Lastrapes (1990), "Heteroskedasticity in Stock Return Data: Volume versus GARCH Effect", Journal of Finance 45, 221-229.

[158] Lamoureux, C. and W. Lastrapes (1993), "Forecasting stock-return variance: towards an understanding of stochastic implied volatilities", Review of Financial Studies 6, 293-326.

[159] Latane, H. and R. Jr. Rendleman (1976), "Standard Deviations of Stock Price Ratios Implied in Option Prices", Journal of Finance 31, 369-381.

[160] Lezan, G., E. Renault and T. deVitry (1995), "Forecasting Foreign Exchange Risk", Paper presented at 7th World Congress of the Econometric Society, Tokyo.

[161] Lin, W.L., R.F. Engle and T. Ito (1994), "Do Bulls and Bears Move Across Borders? International Transmission of Stock Returns and Volatility as the World Turns", Review of Financial Studies, forthcoming.

[162] Lo, A.W. (1995), "Statistical Inference for Technical Analysis Via Nonparametric Estimation", Discussion Paper, MIT.

[163] Mahieu, R. and P. Schotman (1994a), "Stochastic volatility and the distribution of exchange rate news", Discussion Paper, University of Limburg.

[164] Mahieu, R. and P. Schotman (1994b), "Neglected Common Factors in Exchange Rate Volatility", Journal of Empirical Finance 1, 279-311.

[165] Mandelbrot, B.B. (1963), "The Variation of Certain Speculative Prices", Journal of Business 36, 394-416.


[166] Mandelbrot, B. and H. Taylor (1967), "On the Distribution of Stock Prices Differences", Operations Research 15, 1057-1062.

[167] Mandelbrot, B.B. and J.W. Van Ness (1968), "Fractional Brownian Motions, Fractional Noises and Applications", SIAM Review 10, 422-437.

[168] McFadden, D. (1989), "A Method of Simulated Moments for Estimation of Discrete Response Models Without Numerical Integration", Econometrica 57, 1027-1057.

[169] Meddahi, N. and E. Renault (1995), "Aggregations and Marginalisations of GARCH and Stochastic Volatility Models", Discussion Paper, GREMAQ.

[170] Melino, A. and M. Turnbull (1990), "Pricing Foreign Currency Options with Stochastic Volatility", Journal of Econometrics 45, 239-265.

[171] Melino, A. (1994), "Estimation of Continuous Time Models in Finance", in C.A. Sims (eds.) Advances in Econometrics (Cambridge University Press).

[172] Merton, R.C. (1973), "Rational Theory of Option Pricing", Bell Journal of Economics and Management Science 4, 141-183.

[173] Merton, R.C. (1976), "Option Pricing when Underlying Stock Returns are Discontinuous", Journal of Financial Economics 3, 125-144.

[174] Merton, R.C. (1990), Continuous Time Finance, (Basil Blackwell, Oxford).

[175] Merville, L.J. and D.R. Pieptea (1989), "Stock-price volatility, mean-reverting diffusion, and noise", Journal of Financial Economics 242, 193-214.

[176] Metropolis, N., A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller and E. Teller (1954), "Equation of State Calculations by Fast Computing Machines", The Journal of Chemical Physics 21, 1087-1092.

[177] Müller, U.A., M.M. Dacorogna, R.B. Olsen, W.V. Pictet, M. Schwarz and C. Morgenegg (1990), "Statistical Study of Foreign Exchange Rates. Empirical Evidence of a Price change Scaling Law and Intraday analysis", Journal of Banking and Finance 14, 1189-1208.

[178] Nelson, D.B. (1988), "Time Series Behavior of Stock Market Volatility and Returns", Ph.D. dissertation, MIT.

[179] Nelson, D.B. (1990), "ARCH Models as Diffusion Approximations", Journal of Econometrics 45, 7-39.

[180] Nelson, D.B. (1991), "Conditional Heteroskedasticity in Asset returns: A New Approach", Econometrica 59, 347-370.

[181] Nelson, D.B. (1992), "Filtering and Forecasting with Misspecified ARCH Models I: Getting the Right Variance with the Wrong Model", Journal of Econometrics 25, 61-90.

[182] Nelson, D.B. (1994), "Comment on Jacquier, Polson and Rossi", Journal of Business and Economic Statistics 12, 403-406.


[183] Nelson, D.B. (1995a), "Asymptotic Smoothing Theory for ARCH Models", Econometrica (forthcoming).

[184] Nelson, D.B. (1995b), "Asymptotic Filtering Theory for Multivariate ARCH Models", Journal of Econometrics (forthcoming).

[185] Nelson, D.B. and D.P. Foster (1994), "Asymptotic Filtering Theory for Univariate ARCH Models", Econometrica 62, 1-41.

[186] Nelson, D.B. and D.P. Foster (1995), "Filtering and Forecasting with Misspecified ARCH Models II: Making the Right Forecast with the Wrong Model", Journal of Econometrics (forthcoming).

[187] Noh, J., R.F. Engle and A. Kane (1994), "Forecasting volatility and option pricing of the S&P 500 index", Journal of Derivatives, 17-30.

[188] Ogaki, M. (1993), "Generalized Method of Moments: Econometric Applications", in Maddala et al. (ed.) Handbook of Statistics Vol. 11, (North Holland, Amsterdam).

[189] Pagan, A.R. and G.W. Schwert (1990), "Alternative Models for Conditional Stock Volatility", Journal of Econometrics 45, 267-290.

[190] Pakes, A. and D. Pollard (1989), "Simulation and the Asymptotics of Optimization Estimators", Econometrica 57, 995-1026.

[191] Pardoux, E. and D. Talay (1985), "Discretization and Simulation of Stochastic Differential Equations", Acta Applicandae Mathematica 3, 23-47.

[192] Pastorello, S., E. Renault and N. Touzi (1993), "Statistical Inference for Random Variance Option Pricing", Discussion Paper, CREST.

[193] Patell, J.M. and M.A. Wolfson (1981), "The Ex-Ante and Ex-Post Price Effects of quarterly Earnings Announcement Reflected in Option and Stock Price", Journal of Accounting Research 19, 434-458.

[194] Patell, J.M. and M.A. Wolfson (1979), "Anticipated Information Releases Reflected in Call Option Prices", Journal of Accounting and Economics 1, 117-140.

[195] Pham, H. and N. Touzi (1993), "Intertemporal Equilibrium Risk Premia in a Stochastic Volatility Model", Mathematical Finance (forthcoming).

[196] Platten, E. and Schweizer (1995), "On Smile and Skewness", Discussion Paper, Australian National University, Canberra.

[197] Poterba, J. and L. Summers (1986), "The persistence of volatility and stock market fluctuations", American Economic Review 76, 1142-1151.

[198] Renault, E. (1995), "Econometric Models of Option Pricing Errors", Invited Lecture presented at 7th W.C.E.S., Tokyo, August.

[199] Renault, E. and N. Touzi (1992), "Option Hedging and Implicit Volatility", Mathematical Finance (forthcoming).


[200] Revuz, A. and M. Yor (1991), Continuous Martingales and Brownian Motion, (Springer Verlag, Berlin).

[201] Robinson, P. (1993), "Efficient tests of nonstationary hypotheses", mimeo, London School of Economics.

[202] Rogers, L.C.G. (1995), "Arbitrage with Fractional Brownian Motion", University of Bath, Discussion paper.

[203] Rubinstein, M. (1985), "Nonparametric Tests of Alternative Option Pricing Models Using all Reported Trades and Quotes on the 30 Most Active CBOE Option Classes from August 23, 1976 through August 31, 1978", Journal of Finance 40, 455-480.

[204] Ruiz, E. (1994), "Quasi-maximum Likelihood Estimation of Stochastic Volatility Models", Journal of Econometrics 63, 289-306.

[205] Schwert, G.W. (1989), "Business Cycles, Financial Crises, and Stock Volatility", Carnegie-Rochester Conference Series on Public Policy 39, 83-126.

[206] Scott, L.O. (1987), "Option Pricing when the Variance Changes Randomly: Theory, Estimation and an Application", Journal of Financial and Quantitative Analysis 22, 419-438.

[207] Scott, L. (1991), "Random Variance Option Pricing", Advances in Futures and Options Research, Vol. 5, 113-135.

[208] Sheikh, A.M. (1993), "The behavior of volatility expectations and their effects on expected returns", Journal of Business 66, 93-116.

[209] Shephard, N. (1995), "Statistical Aspect of ARCH and Stochastic volatility", Discussion Paper 1994, Nuffield College, Oxford University.

[210] Sims, A. (1984), "Martingale-Like Behavior of Prices", University of Minnesota.

[211] Sowell, F. (1992), "Maximum likelihood estimation of stationary univariate fractionally integrated time series models", Journal of Econometrics 53, 165-188.

[212] Stein, J. (1989), "Overreactions in the Options Market", The Journal of Finance 44, 1011-1023.

[213] Stein, E.M. and J. Stein (1991), "Stock Price distributions with Stochastic Volatility: An Analytic Approach", Review of Financial Studies 4, 727-752.

[214] Stock, J.H. (1988), "Estimating Continuous Time Processes Subject to Time Deformation", Journal of the American Statistical Association 83, 77-84.

[215] Strook, D.W. and S.R.S. Varadhan (1979), Multi-dimensional Diffusion Processes, (Springer Verlag, Heidelberg).

[216] Tanner, T. and W. Wong (1987), "The Calculation of Posterior Distributions by Data Augmentation", Journal of the American Statistical Association 82, 528-549.


[217] Tauchen, G. (1995), "New Minimum Chi-Square Methods in Empirical Finance", Invited Paper presented at the 7th World Congress of the Econometric Society, Tokyo.

[218] Tauchen, G. and M. Pitts (1983), "The Price Variability-Volume Relationship on Speculative Markets", Econometrica 51, 485-505.

[219] Taylor, S.J. (1986), Modeling Financial Time Series, (John Wiley, Chichester).

[220] Taylor, S.J. (1994), "Modeling Stochastic Volatility: A Review and Comparative Study", Mathematical Finance 4, 183-204.

[221] Taylor, S.J. and X. Xu (1994), "The Term structure of volatility implied by foreign exchange options", Journal of Financial and Quantitative Analysis 29, 57-74.

[222] Taylor, S.J. and X. Xu (1993), "The Magnitude of Implied Volatility Smiles: Theory and Empirical Evidence for Exchange Rates", Discussion Paper, University of Warwick.

[223] Von Furstenberg, G.M. and B. Nam Jeon (1989), \International Stock Price Movements: Links andMessages", Brookings Papers on Economic Activity I, 125-180.

[224] Wang, J. (1993), \A Model of Competitive Stock Trading Volume", Discussion Paper, MIT.

[225] Watanabe, T. (1993), \The Time Series Properties of Returns, Volatility and Trading Volume inFinancial Markets", Ph.D. Thesis, Department of Economics, Yale University.

[226] West, M. and J. Harrison (1990), \Bayesian Forecasting and Dynamic Models", (Springer Verlag,Berlin).

[227] Whaley, R.E. (1982), \Valuation of American call options on dividend-paying stocks. Journal ofFinancial Economics 10, 29-58.

[228] Wiggins, J.B. (1987), "Option Values under Stochastic Volatility: Theory and Empirical Estimates", Journal of Financial Economics 19, 351-372.

[229] Wood, R., T. McInish and J.K. Ord (1985), "An Investigation of Transaction Data for NYSE Stocks", Journal of Finance 40, 723-739.

[230] Wooldridge, J.M. (1994), "Estimation and Inference for Dependent Processes", in R.F. Engle and D. McFadden (eds.), Handbook of Econometrics Vol. 4, (North Holland, Amsterdam).

List of CIRANO Publications

Cahiers CIRANO / CIRANO Papers (ISSN 1198-8169)

94c-1 Faire ou faire faire : La perspective de l’économie des organisations / by Michel Patry

94c-2 Commercial Bankruptcy and Financial Reorganization in Canada / by Jocelyn Martel

94c-3 L’importance relative des gouvernements : causes, conséquences, et organisations alternatives / by Claude Montmarquette

95c-1 La réglementation incitative / by Marcel Boyer

95c-2 Anomalies de marché et sélection des titres au Canada / by Richard Guay, Jean-François L’Her and Jean-Marc Suret

Série Scientifique / Scientific Series (ISSN 1198-8177)

95s-35 Capacity Commitment Versus Flexibility: The Technological Choice Nexus in a Strategic Context / Marcel Boyer and Michel Moreaux

95s-36 Some Results on the Markov Equilibria of a Class of Homogeneous Differential Games / Ngo Van Long and Koji Shimomura

95s-37 Dynamic Incentive Contracts with Uncorrelated Private Information and History Dependent Outcomes / Gérard Gaudet, Pierre Lasserre and Ngo Van Long

95s-38 Costs and Benefits of Preventing Workplace Accidents: The Case of Participatory Ergonomics / Paul Lanoie and Sophie Tavenas

95s-39 On the Dynamic Specification of International Asset Pricing Models / Maral Kichian, René Garcia and Eric Ghysels

95s-40 Vertical Integration, Foreclosure and Profits in the Presence of Double Marginalisation / Gérard Gaudet and Ngo Van Long

95s-41 Testing the Option Value Theory of Irreversible Investment / Tarek M. Harchaoui and Pierre Lasserre

95s-42 Trading Patterns, Time Deformation and Stochastic Volatility in Foreign Exchange Markets / Eric Ghysels, Christian Gouriéroux and Joanna Jasiak

95s-43 Empirical Martingale Simulation for Asset Prices / Jin-Chuan Duan and Jean-Guy Simonato

95s-44 Estimating and Testing Exponential-Affine Term Structure Models by Kalman Filter / Jin-Chuan Duan and Jean-Guy Simonato

95s-45 Costs and Benefits of Preventing Workplace Accidents: Going from a Mechanical to a Manual Handling System / Paul Lanoie and Louis Trottier

95s-46 Cohort Effects and Returns to Seniority in France / David N. Margolis

95s-47 Asset and Commodity Prices with Multiattribute Durable Goods / Jérôme Detemple and Christos I. Giannikos

95s-48 Is Workers' Compensation Disguised Unemployment Insurance? / Bernard Fortin, Paul Lanoie and Christine Laporte

95s-49 Stochastic Volatility / Eric Ghysels, Andrew Harvey and Eric Renault