This article describes a streamlined method for simultaneous integration of an entire family of interpolating functions that share one and the same interpolation grid in one or more dimensions. A method for creating customized quadrature/cubature rules that takes advantage of certain special features of Mathematica's InterpolatingFunction objects is presented. The use of such rules leads to a new and more efficient implementation of the method for optimal stopping of stochastic systems that was developed in [1]. In particular, this new implementation allows one to extend the scope of the method to free boundary optimal stopping problems in higher dimensions. Concrete applications to finance—mainly to American-style financial derivatives—are presented. In particular, the price of an American put option that can be exercised with either of two uncorrelated underlying assets is calculated as a function of the observed prices. This method is similar in nature to the well-known Longstaff-Schwartz algorithm, but does not involve Monte Carlo simulation of any kind.
Preliminaries
The most common encounter with the concept of dynamic programming—and here we will be concerned only with continuous time dynamic programming (CTDP)—occurs in the context of the optimal stopping of an observable stochastic process with a given termination payoff. This is a special case of stochastic optimal control where the control consists of a simple on-off switch, which, once turned off, cannot be turned on again. A generic example of an optimal stopping problem can be described as this: the investor observes a continuous Markov process $X = (X_t)_{t \ge 0}$ with state-space $\mathcal{S} \subseteq \mathbb{R}^d$ and with known stochastic dynamics and, having observed at time $t$ the value $X_t = x$, where $x$ denotes the observed realization of the random variable $X_t$, the investor must decide whether to terminate the process immediately and collect the payoff $\mathcal{U}(x)$, where $\mathcal{U}$ is some a priori determined (deterministic) payoff function, or to wait until time $t + \Delta t$ for the next opportunity to terminate the process (in the context of CTDP, the time step $\Delta t$ should be understood as an infinitesimally small quantity). In addition, having observed the quantity $X_t = x$, the investor must determine the value of the observable system (associated with that state) at time $t$. In many situations, the stopping of the process cannot occur after some finite deterministic time $T$.
There is a vast literature that deals with both the theoretical and the computational aspects of modeling, analysis, and optimal control of stochastic systems. The works of Bensoussan and Lions [2] and Davis [3] are good examples of classical treatments of this subject.
Many important problems from the realm of finance can be formulated—and solved—as optimal stopping problems. For example, a particular business can be established at the fixed cost of $K$ dollars, but the actual market price of that business follows some continuous stochastic process $X_t$, which is observable. Having observed at time $t$ the value $X_t = x$, an investor who owns the right to incorporate such a business must decide whether to exercise that right immediately, in which case the investor would collect the payoff of $x - K$ dollars, or to postpone the investment decision until time $t + \Delta t$, with the hope that the business will become more valuable. In addition, having observed the value $X_t = x$, the investor may need to determine the market price of the guaranteed right (say, a patent) to eventually incorporate this business—now, or at any time in the future.
Another classical problem from the realm of finance is the optimal exercise of a stock option of American type, that is, an option that can be exercised at any time prior to the expiration date. A stock option is a contract that guarantees the right to buy (a call option) or to sell (a put option) a particular stock (the underlying) at some fixed price $K$ (the strike price, stated explicitly in the option contract) at any time prior to the expiration date (the maturity date, also stated in the option contract). Consider, for example, an American style put option which expires at some future date $T$. On date $t < T$, the holder of the option observes the underlying price $X_t = x$ and decides whether to exercise the option, that is, to sell the stock at the guaranteed price $K$ (and, consequently, collect an immediate payoff of $K - x$ dollars) or to wait, hoping that at some future moment, but no later than the expiration date $T$, the price will fall below the current level $x$. If the option is not exercised before the expiration date, the option is lost if $X_T \ge K$ or, ignoring any transaction costs, is always exercised if $X_T < K$; that is, if the option is not exercised prior to the expiration date $T$, on that date the owner of the option collects the amount $(K - X_T)^+ \equiv \max\{K - X_T, 0\}$. In general, stock options are traded in the same way that stocks are traded and, therefore, have a market value determined by the laws of supply and demand. At the same time, the price of such contracts, treated as a function of the observed stock price $x$, can be calculated from general economic principles in conjunction with the principles of dynamic programming.
In general, the solution to any optimal stopping problem has two components: (1) for every moment in time, $0 \le t \le T$, one must obtain a termination rule in the form of a termination set $\Gamma_t \subseteq \mathcal{S}$, so that the process is terminated at the first moment $t$ at which the event $X_t \in \Gamma_t$ occurs; and (2) for every moment $t$, one must determine the value function $V_t(\cdot)$, which measures how "valuable" different possible observations are at time $t$. Clearly, if $\mathcal{U}$ denotes the termination payoff function (e.g., in the case of an American style put option that would be $\mathcal{U}(x) = (K - x)^+$, $x > 0$), then for every $x \in \Gamma_t$ one must have $V_t(x) = \mathcal{U}(x)$, while for every $x \notin \Gamma_t$ one must have $V_t(x) > \mathcal{U}(x)$; that is, one would choose to continue only when continuation is more valuable than immediate termination. Consequently, the termination sets $\Gamma_t$, $0 \le t \le T$, can be identified with the sets $\{x \in \mathcal{S} : V_t(x) = \mathcal{U}(x)\}$, and the associated optimal stopping time can be described as

$$\tau = \inf\{t \ge 0 : V_t(X_t) = \mathcal{U}(X_t)\}.$$
Thus, the entire solution to the optimal stopping problem can be expressed in terms of the family of value functions $V_t(\cdot)$, $0 \le t \le T$, which may be treated as a single function of the form

$$[0, T] \times \mathcal{S} \ni (t, x) \mapsto V_t(x).$$

It is important to recognize that the solution to the optimal stopping problem, that is, the value function $V$, must be computed before the observation process has begun.
In most practical situations, an approximate solution to the optimal stopping problem associated with some observable process $X$ and some fixed termination payoff $\mathcal{U}$ may be obtained as a finite sequence of approximate value functions

$$V_{t_k}(\cdot),\qquad t_k = k\,\Delta t,\quad \Delta t = T/n,\quad k = 0, 1, \dots, n,$$

where $T$ is the termination date and $n$ is some sufficiently large integer number. This approximate solution can be calculated from the following recursive rule, which is nothing but a special discrete version of what is known as the dynamic programming equation: for $k = n$, we set $V_{t_n} = \mathcal{U}$, that is, at time $t_n = T$, the value function coincides with the termination payoff on the entire state-space $\mathcal{S}$, and for $k = n-1, n-2, \dots, 0$, we set
$$V_{t_k}(x) = \max\Big\{\mathcal{U}(x),\; e^{-R(x)\,\Delta t}\, E^{Q}\big[\,V_{t_{k+1}}(X_{t_{k+1}})\;\big|\;X_{t_k} = x\,\big]\Big\}\qquad(1)$$
where $R(x)$ is the instantaneous discount rate at the state $x$, and the probability measure $Q$, with respect to which the conditional expectation is calculated, is the so-called pricing measure (in general, the instantaneous discount rate may depend on the observed position $x$ and the pricing measure may be different from the probability measure that governs the actual stochastic dynamics of the process $X$). Furthermore, for every $t_k$, $k = 0, 1, \dots, n$, the approximate termination set is given by

$$\Gamma_{t_k} = \{x \in \mathcal{S} : V_{t_k}(x) = \mathcal{U}(x)\}.$$
If, given any $t \in [0, T]$, the limit
$$V_t(x) = \lim_{n \to \infty} V^{(n)}_{\lfloor n t / T \rfloor\, T / n}(x)\qquad(2)$$
($\lfloor \cdot \rfloor$ denotes the integer part of a real number and $V^{(n)}$ denotes the approximate value function computed with time step $T/n$) exists, in some appropriate topology on the space of functions defined on the state-space $\mathcal{S}$, then one can declare that the function

$$[0, T] \times \mathcal{S} \ni (t, x) \mapsto V_t(x)$$

gives the solution to the optimal stopping problem with observable process $X$, payoff function $\mathcal{U}$, and pricing measure $Q$.
It is common to assume that under the pricing probability measure $Q$ the stochastic dynamics of the observable process $X$ are given by some diffusion equation of the form
$$dX_t = \sigma(X_t)\,dW_t + b(X_t)\,dt,\qquad X_0 = x,\qquad(3)$$
for some (sufficiently nice) matrix-valued function $\sigma$, vector field $b$, and some $\mathbb{R}^d$-valued Brownian motion process $W$, which is independent from the starting position $X_0 = x$ (note that $W$ is Brownian motion with respect to the law $Q$). Given any $x \in \mathcal{S}$, let $a(x)$ denote the matrix $\sigma(x)\,\sigma(x)^{\mathsf{T}}$ and let $\mathcal{L}$ denote the second-order field in $\mathcal{S}$ given by

$$\mathcal{L} = \frac{1}{2}\sum_{i,j=1}^{d} a_{ij}(x)\,\partial_{x_i}\partial_{x_j} + \sum_{i=1}^{d} b_i(x)\,\partial_{x_i}.$$
As is well known, under certain fairly general conditions for the coefficients $\sigma$ and $b$, and the termination payoff $\mathcal{U}$, the value function

$$V(t, x) \equiv V_t(x),\qquad 0 \le t \le T,\quad x \in \mathcal{S},$$

that is, the solution to the associated optimal stopping problem, is known to satisfy the following equation, which is nothing but a special case of the Hamilton-Jacobi-Bellman (HJB) equation
$$\max\big\{\mathcal{U}(x) - V(t, x),\ \partial_t V(t, x) + \mathcal{L}V(t, x) - R(x)\,V(t, x)\big\} = 0\qquad(4)$$
with boundary condition $V(T, x) = \mathcal{U}(x)$, $x \in \mathcal{S}$. This equation is a more or less trivial consequence of equations (1) and (2), in conjunction with the Itô formula. It is important to recognize that the dynamic programming equation (1) is primary for the optimal stopping problem, while the HJB equation (4) is only secondary, in that it is nothing more than a computational tool. Unfortunately, equation (4) admits a closed form solution only for some rather special choices of the payoff function $\mathcal{U}$, the diffusion coefficients $\sigma$ and $b$, and the discount rate $R$. Thus, in most practical situations, the HJB equation (4) happens to be useful only to the extent to which this equation gives rise to some numerical procedure that may allow one to compute the solution approximately. Essentially, all known numerical methods for solving equation (4) are some variations of the finite difference method. The most common approach is to reformulate equation (4) as a free boundary value problem: one must find a closed domain $\mathcal{C} \subset [0, T] \times \mathcal{S}$ with a piecewise differentiable boundary $\partial\mathcal{C}$ and nonempty interior $\mathcal{C}^{\circ}$, plus a continuous function $f$, which is continuously differentiable with respect to the variable $t$ and is twice continuously differentiable with respect to the variables $x_1, \dots, x_d$, that satisfies the following two relations everywhere inside $\mathcal{C}^{\circ}$

$$\partial_t f(t, x) + \mathcal{L}f(t, x) - R(x)\,f(t, x) = 0,\qquad f(t, x) > \mathcal{U}(x),$$

and, finally, satisfies the following boundary conditions along the free (i.e., unknown) boundary $\partial\mathcal{C}$

$$f(t, x) = \mathcal{U}(x),\qquad \nabla_{\!x} f(t, x) = \nabla_{\!x}\,\mathcal{U}(x),\qquad (t, x) \in \partial\mathcal{C}.$$
The last two conditions are known, respectively, as the value matching and the smooth pasting conditions. It can be shown that under some rather general—but still quite technical—assumptions, the free boundary value problem that was just described has a unique solution (consisting of a function $f$ and a domain $\mathcal{C}$), which then allows one to write the solution to the optimal stopping problem as follows: for any $0 \le t \le T$ and any $x \in \mathcal{S}$, one has

$$V(t, x) = f(t, x)\ \text{ if }\ (t, x) \in \mathcal{C},\qquad V(t, x) = \mathcal{U}(x)\ \text{ otherwise}.$$
The drawbacks from formulating an optimal stopping problem as a free boundary value problem for some parabolic partial differential equation (PDE), which—in principle, at least—can be solved numerically by way of finite differencing, are well known. First, the value matching and the smooth pasting conditions are difficult to justify and this makes the very formulation of the problem rather problematic in many situations. Second, just in general, for purely technical reasons, it is virtually impossible to use finite differencing when the dimension of the state-space is higher than 3 and, in fact, in the case of free boundary value problems, even a state-space of dimension 2 is rather challenging. Third, with the exception of the explicit finite difference scheme, which is guaranteed to be stable only under some rather restrictive conditions, the implementation of most finite differencing procedures on parallel processors is anything but trivial.
At the time of this writing, it is safe to say that finite differencing is no longer at the center of attention in computational finance, where most problems are inherently complex and multidimensional. Indeed, most of the research in computational finance in the last five years or so appears to be focused on developing new simulation-based tools—see [4–8], for example (even a cursory review of the existing literature from the last few years is certain to be very long and is beyond the scope of this article). Among these methods, the Longstaff-Schwartz algorithm [8] seems to be the most popular among practitioners.
The attempts to avoid the use of finite differencing are nothing new (there is a section in the book Numerical Recipes in C entitled There Is More to Life than Finite Differencing—see [9], p. 833). Indeed, Monte Carlo, finite element, and several other computational tools have been in use for solving fixed boundary value problems for PDEs for quite some time. Apparently, from the point of view of stochastic optimal control, it is more efficient—and in some ways more natural—to develop numerical procedures directly from the dynamic programming equation (1) and skip the formulation of the HJB equation altogether—at least this is the approach that most recent studies in computational finance are taking.
Since the solution to an optimal stopping problem is given by an infinite family of functions of the form $V_t(\cdot)$, $0 \le t \le T$, defined on the state-space $\mathcal{S}$, from a computational point of view the actual representation of any such object involves two levels of discretization. First, one must discretize the state-space $\mathcal{S}$, that is, functions defined on $\mathcal{S}$ must be replaced with finite lists of values attached to some fixed set of distinct abscissas

$$x_1, x_2, \dots, x_N \in \mathcal{S},$$

for some fixed $N$. This means that for every list of values $v = (v_1, \dots, v_N)$, one must be able to construct a function $\widehat{V}[v]$ which "extends" the discrete assignment $x_i \mapsto v_i$, $1 \le i \le N$, to some function defined on the entire state-space $\mathcal{S}$. Second, one must discretize time, that is, replace the infinite family of functions $V_t(\cdot)$, $0 \le t \le T$, not just with the finite sequence of functions

$$V_{t_0}, V_{t_1}, \dots, V_{t_n},$$

determined by the discrete dynamic programming equation (1), but, rather, with a finite sequence of lists

$$v^{(k)} = \big(v^{(k)}_1, \dots, v^{(k)}_N\big),\qquad 0 \le k \le n,$$

computed from the following space-discretized analog of the time-discrete dynamic programming equation (1): for $k = n$, we set $v^{(n)}_i = \mathcal{U}(x_i)$, $1 \le i \le N$,
and then define
$$v^{(k)}_i = \max\Big\{\mathcal{U}(x_i),\; e^{-R(x_i)\,\Delta t}\, E^{Q}\big[\,\widehat{V}[v^{(k+1)}](X_{t_{k+1}})\;\big|\;X_{t_k} = x_i\,\big]\Big\}\qquad(5)$$
consecutively, for $k = n-1, n-2, \dots, 0$ and $1 \le i \le N$, where $\widehat{V}[\cdot]$ is simply the map that "extends" the discrete assignment $x_i \mapsto v_i$.
Of course, in order to compute the conditional expectation in equation (5), the distribution law of the random variable $X_{t_{k+1}}$ relative to the pricing measure $Q$ and conditioned to the event $X_{t_k} = x_i$ must be expressed in computable form. Unfortunately, this is possible only for some rather special choices for the diffusion coefficients $\sigma$ and $b$ that govern the stochastic dynamics of the process $X$ under the pricing measure $Q$. This imposes yet another—third—level of approximation. The simplest such approximation is to replace $X_{t_{k+1}}$ in equation (5) with the random quantity
$$x_i + \sigma(x_i)\,\sqrt{\Delta t}\; Z + b(x_i)\,\Delta t,\qquad(6)$$
where $Z$ is some standard normal $\mathbb{R}^d$-valued random variable. As a result of this approximation, expression (5) becomes
$$v^{(k)}_i = \max\Big\{\mathcal{U}(x_i),\; e^{-R(x_i)\,\Delta t} \int_{\mathbb{R}^d} \widehat{V}[v^{(k+1)}]\big(x_i + \sigma(x_i)\sqrt{\Delta t}\,z + b(x_i)\,\Delta t\big)\,\varphi(z)\,dz\Big\}\qquad(7)$$
where $\varphi$ stands for the standard normal probability density function in $\mathbb{R}^d$. Another (somewhat more sophisticated) approximation scheme for the diffusion process in the case $d = 1$ is discussed in [1]. Fortunately, in the most widely used diffusion models in finance, the conditional law of $X_{t_{k+1}}$, given the event $X_{t_k} = x_i$, coincides with the law of a random variable of the form $F(x_i, \sqrt{\Delta t}\,Z)$ for some function $F$ that can be expressed in computable form, so that in this special case the right side of equation (7), written with the argument of $\widehat{V}[v^{(k+1)}]$ replaced by $F(x_i, \sqrt{\Delta t}\,z)$, is exactly identical to the right side of equation (5) and the approximation given in equation (6) is no longer necessary.
One must be aware that, in general, the distribution law of the random variable $X_{t_{k+1}}$ is spread on both sides of the point $x_i$ and, even though for a reasonably small time step $\Delta t$ this law is concentrated almost exclusively in some small neighborhood of the node $x_i$, when this node happens to be very close to the border of the interpolation region, the computation of the integral on the right side of equation (7) inevitably requires information about the function $\widehat{V}[v^{(k+1)}]$ outside the interpolation domain. However, strictly speaking, interpolating functions are well defined only inside the convex hull of the abscissas used in the interpolation. One way around this problem is to choose the interpolation domain in such a way that along its border the solution to the optimal stopping problem takes values that are very close to certain asymptotic values. For example, if one is pricing an American put option with strike price of $K$, one may assume that the value of the option is practically $0$ when the price of the underlying asset is sufficiently far above $K$, regardless of the time left to maturity (of course, this assumption may be grossly inaccurate when the volatility is very large and/or when the option matures in the very distant future). Consequently, for those nodes that are close to the border of the interpolation domain, the respective values will not depend on $k$ and will not be updated according to the prescription (7). At the same time, the remaining nodes must be chosen in such a way that the probability mass of the random variable $X_{t_{k+1}}$ is concentrated almost exclusively inside the interpolation region.
In general, numerical procedures for optimal stopping of stochastic systems differ in the following: First, they differ in the space-discretization method, that is, the method for choosing the abscissas $x_i$ and for "reconstructing" the function $\widehat{V}[v]$ from the (finite) list of values $v$ assigned to the abscissas $x_i$. Second, numerical procedures differ in the concrete quadrature rule used in the calculation of the conditional expectation in the discretized dynamic programming equation (5). The method described in this article is essentially a refinement of the method of dynamic interpolation and integration described in [1]. The essence of this method is that the functions $\widehat{V}[v]$ are defined by way of spline interpolation—or some other type of interpolation—from the list of values $v$, while the integral in equation (7) is calculated by using some standard procedures for numerical integration. As was illustrated in [1], the implementation of this procedure in Mathematica is particularly straightforward, since NIntegrate accepts as an input InterpolatingFunction objects, and, in fact, can handle such objects rather efficiently.
The key point in the method developed in this article is a procedure for computing a universal list of vectors $\Xi_i \in \mathbb{R}^N$, $1 \le i \le N$, associated with the abscissas $x_i$, $1 \le i \le N$, that depend only on the stochastic dynamics of $X$ and the time step $\Delta t$, so that one can write
$$\int_{\mathbb{R}^d} \widehat{V}[v]\big(x_i + \sigma(x_i)\sqrt{\Delta t}\,z + b(x_i)\,\Delta t\big)\,\varphi(z)\,dz = \Xi_i \cdot v\qquad(8)$$
for any list $v \in \mathbb{R}^N$ and for any $x_i \in \mathcal{D}$, where $\mathcal{D}$ stands for the convex hull of the interpolation grid $\{x_1, \dots, x_N\}$, that is, the domain of interpolation. Essentially, equation (8) is a variation of the well-known cubic spline quadrature rule—see §4.0 in [9] and §5.2, p. 89, in [10]. It is also analogous to the "cubature formula," which was developed in [11] with the help of a completely different computational tool (following [14] and [11], we use the term cubature as a reference to some integration rule for multiple integrals and the term quadrature as a reference to some integration rule for single integrals). One of the principal differences between the interpolation quadrature/cubature rules and the cubature formula obtained in [11] is that the former allows for a greater freedom in the choice of the abscissas $x_i$. This means that, in practice, one can choose the evaluation points in a way that takes into account the geometry of the payoff function $\mathcal{U}$ and not just the geometry of the diffusion $X$. From a practical point of view, this feature is quite important and, in fact, was the main reason for developing what is now known as Gaussian quadratures.
An entirely different method for computing conditional averages of the form

$$E\big[\,f(X_{t+\Delta t})\;\big|\;X_t = x\,\big]$$

was developed in [12] and [13]. In this method, quantities of this form are approximated by finite products of linear operators acting on the function $f$. One must be aware that the methods described in [11-13] require a certain degree of smoothness for the integrand $f$, which, generally, InterpolatingFunction objects may not have.
The method developed in this article also bears some similarity to the quantization algorithm described in [4, 5]. The key feature in both methods is that one can separate and compute—once and for all—certain quantities that depend on the observable process but not on the payoff function. Essentially, this means that one in effect solves, simultaneously, an entire family of boundary value problems corresponding to different payoff functions. The two methods differ in the way in which functions on the state-space are encoded in terms of finite lists of values. The advantages in using interpolating functions in general, or splines in particular, are well known: one can "restore" the pricing maps from fewer evaluation points and this feature is crucial for optimal stopping problems in higher dimensions.
Making Quadrature/Cubature Rules with Mathematica
In this section we show how to make quadrature/cubature rules for a fixed set of abscissas that are similar to the well-known Gaussian rules or the cubic spline rules. We consider quadrature rules on the real line first. To begin with, suppose that one must integrate some cubic spline $f$ that is defined from the abscissas $x_1 < x_2 < \dots < x_N$ and from some list of tabulated values $v = (v_1, \dots, v_N)$. Since $f$ is a cubic spline, for any fixed $1 \le i < N$ there are real numbers $a_i$, $b_i$, $c_i$, and $d_i$ for which we have

$$f(x) = a_i + b_i\,x + c_i\,x^2 + d_i\,x^3,\qquad x_i \le x \le x_{i+1}.$$
As a result, one can write
$$\int_{x_1}^{x_N} f(x)\,dx = \sum_{i=1}^{N-1} \int_{x_i}^{x_{i+1}} \big(a_i + b_i\,x + c_i\,x^2 + d_i\,x^3\big)\,dx,\qquad(9)$$
which reduces the computation of the integral to the computation of the coefficients $a_i$, $b_i$, $c_i$, $d_i$ from the tabulated values $v_i$ and the abscissas $x_i$.
Though not immediately obvious, it so happens that when all abscissas are fixed, each coefficient $a_i$, $b_i$, $c_i$, $d_i$—and, consequently, the entire expression on the right side of equation (9)—can be treated as a linear function of the list $v$ (in fact, each coefficient is a linear function only of a small portion of that list). This property is instrumental for the quadrature/cubature rules that will be developed in the next section. To see why one can make this claim, recall that (e.g., see §3.3 in [9]) the cubic spline created from the discrete assignment $x_j \mapsto v_j$, $1 \le j \le N$, is given by
$$f(x) = A\,v_j + B\,v_{j+1} + C\,v''_j + D\,v''_{j+1},\qquad x_j \le x \le x_{j+1},\qquad(10)$$
where

$$A = \frac{x_{j+1} - x}{x_{j+1} - x_j},\qquad B = \frac{x - x_j}{x_{j+1} - x_j},\qquad C = \tfrac{1}{6}\big(A^3 - A\big)\big(x_{j+1} - x_j\big)^2,\qquad D = \tfrac{1}{6}\big(B^3 - B\big)\big(x_{j+1} - x_j\big)^2,$$
and $v''_j \equiv f''(x_j)$, $1 \le j \le N$ (note that the definition implies $f(x_j) = v_j$, $1 \le j \le N$). Usually, one sets $v''_1 = v''_N = 0$ (which gives the so-called natural splines) and determines the remaining values $v''_2, \dots, v''_{N-1}$ from the requirement that the cubic spline has a continuous first derivative on the entire interval $[x_1, x_N]$ (it is a trivial matter to check that with this choice the second derivative of the function defined in equation (10) is automatically continuous on $[x_1, x_N]$). Finally, recall that this last requirement leads to a system of $N - 2$ linear equations with $N - 2$ unknowns and that the right side of each of these equations is a linear function of the list $v$, while the coefficients in the left side are linear functions of the abscissas $x_j$—see equation (3.3.7) in [9], for example. This means that each quantity in the list $v''_1, \dots, v''_N$ can be treated as a linear function of the list $v$. Since the quantities $A$, $B$, $C$, and $D$ are all independent from the choice of $v$, one can claim that the entire expression in the right side of equation (10) is a linear function of the list $v$. Thus, for every $1 \le j < N$ and every $x_j \le x \le x_{j+1}$, we can write

$$f(x) = \sum_{i=1}^{N} P_{j,i}(x)\,v_i,$$
where $P_{j,i}$ are polynomials of degree at most 3. More importantly, these polynomials are universal in the sense that they depend on the abscissas $x_1, \dots, x_N$ but not on the tabulated values $v$. In fact, this representation holds for spline interpolation objects of any degree, except that in the general case the degree of the polynomials $P_{j,i}$ may be bigger than 3. An analogous representation is valid for other (non-spline) interpolating functions; in particular, such a representation is valid for interpolating functions defined by way of divided differences interpolation, which is the method used by ListInterpolation. For example, if $f$ denotes the usual interpolating polynomial of degree $N - 1$ defined from the discrete assignment $x_i \mapsto v_i$, $1 \le i \le N$, then $P_{j,i} \equiv P_i$ is simply the $i$-th Lagrange polynomial for the abscissas $x_1, \dots, x_N$. Thus, when $f$ is an interpolating function object (defined by, say, splines, divided differences, or standard polynomial interpolation) from the assignment $x_i \mapsto v_i$, $1 \le i \le N$, assuming that $h$ is some (fixed) continuous and strictly monotone function, that $g$ is some (fixed) integrable function, and, finally, assuming that the limits of integration are chosen so that $h(s)$ stays inside the interpolation domain $[x_1, x_N]$, one can write
$$\int f\big(h(s)\big)\,g(s)\,ds = \sum_{i=1}^{N} w_i\,v_i = w \cdot v,\qquad(11)$$
where

$$w_i = \int P_{j(s),\,i}\big(h(s)\big)\,g(s)\,ds,$$

with $j(s)$ denoting the index for which $x_{j(s)} \le h(s) \le x_{j(s)+1}$.
Since the list $w = (w_1, \dots, w_N)$ can be computed once and for all and independently from the interpolated values $v$, and since the calculation of the integral comes down to the calculation of the dot product $w \cdot v$, the computation of the list $w$ may be viewed as a simultaneous integration of all functions of the form $f \circ h$ (weighted by $g$) for all interpolating functions $f$ (of the same interpolation type) that are based on one and the same (forever fixed) set of abscissas $x_1, \dots, x_N$. Indeed, the knowledge of the list $w$ turns the expression $\int f(h(s))\,g(s)\,ds$ into a trivial function of the list $v$, that is, a trivial function defined on the entire class of interpolating functions $f$. In other words, in equation (11) we have made a quadrature (or cubature) rule.
The preceding remarks contain nothing new or surprising in any way and we only need to find a practical way for computing the weights $w_i$. Fortunately, Mathematica allows one to define InterpolatingFunction objects even if the interpolated values are defined as symbols that do not have actual numerical values assigned to them. Furthermore, such objects can be treated as ordinary functions that can be evaluated, differentiated, and so on—all in symbolic form. We illustrate this powerful feature next (recall that ListInterpolation has default setting InterpolationOrder -> 3).
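For instance, here is a minimal sketch of this idea (the grid Range[0, 5] and the symbol name v are our own illustrative choices):

```
X = Range[0, 5];
V = Array[v, 6];                 (* purely symbolic tabulated values v[1], ..., v[6] *)
f = ListInterpolation[V, {X}];   (* default InterpolationOrder -> 3 *)
f[2.3]                           (* a linear combination of the symbols v[1], ..., v[6] *)
Derivative[1][f][2.3]            (* derivatives can be taken symbolically as well *)
```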
Since, just by definition, the restriction of the function $f$ to the interval $[X_j, X_{j+1}]$ between two consecutive interpolation nodes coincides with a polynomial of degree 3, by Taylor's theorem, for any fixed $x_0 \in (X_j, X_{j+1})$, we have

$$f(x) = \sum_{k=0}^{3} \frac{f^{(k)}(x_0)}{k!}\,(x - x_0)^k,\qquad X_j \le x \le X_{j+1}.$$

Thus, if we define the list of scaled derivatives $\big(f(x_0),\ f'(x_0),\ \tfrac{1}{2}f''(x_0),\ \tfrac{1}{6}f'''(x_0)\big)$, then on the interval $[X_j, X_{j+1}]$ we can identify the function $f$ with the dot product of this list with the list $\big(1,\ (x - x_0),\ (x - x_0)^2,\ (x - x_0)^3\big)$. Mathematica can rearrange this expression as an ordinary polynomial of $x$ whose coefficients are linear combinations of the tabulated values. Essentially, this is some sort of reverse engineering of the InterpolatingFunction object $f$, that is, a way to "coerce" Mathematica into revealing what polynomials the object $f$ is constructed of between the interpolation points.
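In code, the reverse engineering may look as follows (a sketch, assuming the symbolic object f constructed above; the name S34 and the base point are tied to the interval between the third and the fourth abscissas):

```
x0 = 2.5;   (* any point strictly inside the interval [X[[3]], X[[4]]] = [2, 3] *)
S34[x_] = Sum[(Derivative[k][f][x0]/k!) (x - x0)^k, {k, 0, 3}];
Collect[Expand[S34[x]], x]   (* the cubic polynomial that f coincides with on [2, 3] *)
```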
There are other tricks that we can use in order to reverse engineer an InterpolatingFunction object. For example, since the restriction of $f$ to $[X_j, X_{j+1}]$ is an (unknown) cubic polynomial, that restriction can be restored from the values of $f$ at any four distinct points inside the interval. To do the restoration we must solve a linear system for the four unknown coefficients in some (unknown) cubic polynomial. This can be implemented in Mathematica rather elegantly without the use of CoefficientList. The only drawback in this approach is that one must resort to Solve or LinearSolve, which, generally, are more time-consuming, especially in dimensions 2 or higher. Of course, instead of using reverse engineering, one can simply implement the interpolation algorithm step-by-step, which, of course, involves a certain amount of work.
In the definition of S34, the variable x0 can be given any numerical value strictly inside the respective interval between two consecutive interpolation nodes; the choice $x_0 = 2.5$ was completely arbitrary.
In the same way, one can reverse engineer InterpolatingFunction objects that depend on any number of independent abscissas. Here is an example.
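A two-dimensional sketch (with hypothetical grids XX and YY, a symbolic array VV of tabulated values, and an arbitrary expansion point (2.5, 2.5)):

```
XX = Range[0, 5]; YY = Range[0, 5];
VV = Array[vv, {6, 6}];                    (* symbolic values vv[i, j] *)
ff = ListInterpolation[VV, {XX, YY}];
S[x_, y_] = Sum[(Derivative[k, l][ff][2.5, 2.5]/(k! l!)) *
     (x - 2.5)^k (y - 2.5)^l, {k, 0, 3}, {l, 0, 3}];
```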
In this example, for $X_3 \le x \le X_4$ and for $Y_3 \le y \le Y_4$, the quantity $ff(x, y)$ can be expressed as the polynomial of the independent variables $x$ and $y$ that was just constructed. If one does not suppress the output from the last line (which would produce a fairly long output), it becomes clear that all coefficients in the polynomial of the variables $x$ and $y$ are linear functions of the array VV. Consequently, if the variables $x$ and $y$ are integrated out, the expression will turn into a linear combination of all the entries (symbols) in the array VV.
Now we turn to concrete applications of the features described in this section. For the purpose of illustration, all time-consuming operations in this notebook are executed within the Timing function. The CPU times reflect the speed on Dual 2 GHz PowerPC processors with 2.5 GB of RAM. The Mathematica version is 6.0.1. For all inputs with computing time longer than 1 minute, a file that contains the returned value is made available.
Some Examples of Cubic Cubature Rules for Gaussian Integrals in $\mathbb{R}^2$
First, define the standard normal density in $\mathbb{R}^2$.
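A sketch (the function name p is our choice):

```
p[x_, y_] := Exp[-(x^2 + y^2)/2]/(2 Pi)
```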
In this section, we develop cubature rules for computing integrals of the form

$$\int_{\mathbb{R}^2} u(x, y)\,p(x, y)\,dx\,dy$$

for certain functions $u$. We suppose that the last integral can be replaced by

$$\int_{-a}^{a}\!\!\int_{-a}^{a} u(x, y)\,p(x, y)\,dx\,dy\qquad(12)$$

without a significant loss of precision, where the value $a > 0$ is chosen so that the probability mass of the density $p$ outside the square $[-a, a]^2$ is negligible.
The next step is to choose the abscissas $X_i$, $1 \le i \le n_1$, and $Y_j$, $1 \le j \le n_2$:
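For instance (a hypothetical choice; the actual grid used in the original notebook is not essential for what follows):

```
a = 3;
XX = Range[-a, a, 1/2]; n1 = Length[XX];   (* 13 equally spaced abscissas *)
YY = Range[-a, a, 1/2]; n2 = Length[YY];
```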
For the sake of simplicity, in display equations we will write $X_i$ instead of XX[[i]] and $Y_j$ instead of YY[[j]]. Then we define the associated array V of symbolic values attached to the interpolation nodes and produce the actual InterpolatingFunction object:
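A sketch:

```
V = Array[v, {n1, n2}];               (* purely symbolic tabulated values *)
f = ListInterpolation[V, {XX, YY}];
```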
Note that V was defined simply as an array of symbols and that, therefore, f can be treated as a symbolic object, too. In any case, we have

$$\int_{-a}^{a}\!\!\int_{-a}^{a} f(x, y)\,p(x, y)\,dx\,dy = \sum_{i=1}^{n_1 - 1}\sum_{j=1}^{n_2 - 1} \int_{X_i}^{X_{i+1}}\!\!\int_{Y_j}^{Y_{j+1}} f(x, y)\,p(x, y)\,dx\,dy.$$
Note that each double integral on the right side of the last identity is actually an integral of some polynomial of the variables $x$ and $y$ multiplied by the standard normal density in $\mathbb{R}^2$. Furthermore, as was pointed out earlier, every coefficient in this polynomial is actually a linear function of the array V. Since the InterpolatingFunction object f was defined with InterpolationOrder -> 3 (the default setting for ListInterpolation), we can write, on every rectangle $[X_i, X_{i+1}] \times [Y_j, Y_{j+1}]$,

$$f(x, y) = \sum_{0 \le k, l \le 3} c^{(i,j)}_{k,l}\; x^k\,y^l,$$

where the coefficients $c^{(i,j)}_{k,l}$ are linear functions of the tensor V that can be computed once and for all by using the method described in the previous section. In addition, the integrals in the right side of the last identity also can be computed once and for all by using straightforward integration—this is the only place in the procedure where actual integration takes place. Once these calculations are completed, the integral (12) can be expressed as an explicit linear function of the array V of the form
$$\sum_{i=1}^{n_1}\sum_{j=1}^{n_2} \Lambda_{i,j}\,V_{i,j} \equiv \Lambda \cdot V$$

for some fixed tensor $\Lambda$. Of course, the actual calculation of the tensor $\Lambda$ is much more involved and time consuming than the direct numerical evaluation of the integral (12) in those instances where f is not a symbolic object, that is, when the symbols $V_{i,j}$ are given concrete numeric values. The point is that the calculation of the tensor $\Lambda$ is tantamount to writing the integral (12) as an explicit function of the symbols $V_{i,j}$ that define f; that is, we evaluate simultaneously the entire family of integrals for all possible choices of actual numeric values that can be assigned to the symbols $V_{i,j}$. Once the tensor $\Lambda$ is calculated, we can write
$$\int_{-a}^{a}\!\!\int_{-a}^{a} f(x, y)\,p(x, y)\,dx\,dy = \Lambda \cdot V.\qquad(13)$$
Note that this identity is exact if the abscissas are exact numbers. In other words, by computing the tensor $\Lambda$ we have made a cubature rule. Furthermore, when $u$ is some function defined in the rectangle $[-a, a]^2$ that has already been defined and the symbols V are given the numeric values

$$V_{i,j} = u(X_i, Y_j),$$

then the right side of equation (13) differs from the left side by no more than a fixed constant times the uniform distance between the function $u$ and the interpolating function created from the assignment $(X_i, Y_j) \mapsto u(X_i, Y_j)$.
First, we compute—once and for all—all integrals of the form

$$\int_{X_i}^{X_{i+1}}\!\!\int_{Y_j}^{Y_{j+1}} x^k\,y^l\,p(x, y)\,dx\,dy$$

for all choices of $1 \le i < n_1$, $1 \le j < n_2$, $0 \le k \le 3$, and $0 \le l \le 3$, and we represent these integrals as explicit functions of the symbols $X_i$, $X_{i+1}$, $Y_j$, $Y_{j+1}$.
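A sketch of this step (for the product density p the double integral factors into two one-dimensional pieces; the names mono and locInt are ours):

```
mono = Table[Integrate[z^k Exp[-z^2/2]/Sqrt[2 Pi], {z, lo, hi}], {k, 0, 3}];
locInt[{x1_, x2_}, {y1_, y2_}] :=
  Table[(mono[[k + 1]] /. {lo -> x1, hi -> x2}) *
        (mono[[l + 1]] /. {lo -> y1, hi -> y2}), {k, 0, 3}, {l, 0, 3}]
(* e.g., locInt[{XX[[i]], XX[[i + 1]]}, {YY[[j]], YY[[j + 1]]}] *)
```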
Instead of calculating the tensor of integrals, one may load its (precomputed) value from the files integrals.mx or integrals.txt (if available):
or
Next, we replace the symbols $X_i$, $X_{i+1}$, $Y_j$, $Y_{j+1}$ with the actual numeric values of the associated abscissas on the grid:
The next step is to create the lists of the midpoints between all neighboring abscissas, which would allow us to produce the Taylor expansion of the InterpolatingFunction object f at the points $\big(\tfrac{X_i + X_{i+1}}{2}, \tfrac{Y_j + Y_{j+1}}{2}\big)$, for any $1 \le i < n_1$ and any $1 \le j < n_2$. As noted earlier, for every choice of $i$ and $j$, the associated expansion coincides with the object f everywhere in the rectangle $[X_i, X_{i+1}] \times [Y_j, Y_{j+1}]$.
Now we will compute the entire expression

$$\sum_{i=1}^{n_1 - 1}\sum_{j=1}^{n_2 - 1}\ \sum_{0 \le k, l \le 3} c^{(i,j)}_{k,l} \int_{X_i}^{X_{i+1}}\!\!\int_{Y_j}^{Y_{j+1}} x^k\,y^l\,p(x, y)\,dx\,dy$$

as a linear function of the symbols $V_{i,j}$, $1 \le i \le n_1$, $1 \le j \le n_2$. Note that for every fixed $i$ and $j$, the third summation simply gives the dot product between the vector of coefficients $c^{(i,j)}_{k,l}$ and the vector of local integrals. It is important to recognize that the sum of these dot products gives the exact value of the integral

$$\int_{-a}^{a}\!\!\int_{-a}^{a} f(x, y)\,p(x, y)\,dx\,dy.$$
Unfortunately, the exact values of the integrals stored in the list WM are quite cumbersome:
Since working with such expressions becomes prohibitively slow, we convert them to floating-point numbers (we can, of course, use any precision that we wish).
Note that this last operation is the only source of errors in the integration of the interpolating function f.
Finally, we compute the integral

$$\int_{-a}^{a}\!\!\int_{-a}^{a} f(x, y)\,p(x, y)\,dx\,dy$$

in symbolic form, that is, as an explicit function of the symbols $V_{i,j}$ that define the interpolating function f—what we are going to compute are the actual numerical values of the coefficients for the symbols $V_{i,j}$ in the linear combination that represents this integral. Consequently, an expression in symbolic form for the integral is nothing but a tensor of actual numeric values. We again stress that the only loss of precision comes from the conversion of the exact numeric values for the integrals stored in WM into the floating-point numbers stored in WMN.
Instead of calculating the object CoeffList in the following, which contains the actual linear combination of the symbols $V_{i,j}$, one may load its (precomputed) value from the files CoeffList.mx or CoeffList.txt (if available):
or
Now we extract the numeric values for the coefficients and store those values in the tensor $\Lambda$:
Note that in order to produce the actual cubature rule we need to compute the tensor $\Lambda$ only once, so that several minutes of computing time is quite acceptable. Note also that $\Lambda$ (i.e., the cubature rule) can be calculated with any precision. This means that if we approximate an integral of the form

$$\int_{-a}^{a}\!\!\int_{-a}^{a} u(x, y)\,p(x, y)\,dx\,dy$$

with the dot product $\Lambda \cdot V$, where $V_{i,j} = u(X_i, Y_j)$, then the upper estimate for the error in this approximation can be made arbitrarily close to the uniform distance between the function $u$ and the interpolating function created from the values that u takes on the grid multiplied by the area of the integration domain, which is $4a^2$.
Now we turn to some examples, in which the value $\Lambda \cdot V$ produced by the cubature rule is compared with the result of a direct numerical integration for a couple of concrete integrands $u$.
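A sketch of such a comparison (the test function is our own choice; the tensor $\Lambda$ is assumed to be stored in a variable named Lam):

```
u[x_, y_] := Exp[-Abs[x] - Abs[y]];           (* a hypothetical test integrand *)
Vnum = Outer[u, XX, YY];                       (* the values of u on the grid *)
Total[Lam Vnum, 2]                             (* the cubature value Lam . V *)
NIntegrate[u[x, y] p[x, y], {x, -a, a}, {y, -a, a}]   (* direct check *)
```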
One must be aware that NIntegrate is a very efficient procedure and, generally, replacing NIntegrate with a “hand made” cubature rule of the type described in this section may be justified only when the integrand cannot be defined in terms of standard functions, or when a better control of the error is needed, provided that one can control the uniform distance between the integrand and the respective interpolating function created from the grid. In general, such “hand made” cubatures become useful mostly in situations where one has to compute a very large number of integrals for similarly defined integrands—assuming, of course, that the uniform error from the interpolation in all integrands can be controlled (this would be the case, for example, if one has a global bound on a sufficient number of derivatives).
Interpolation quadrature rules for single integrals can be obtained in much the same way. It must be noted that, in general, spline quadrature rules are considerably more straightforward than spline cubature rules. This is because smooth interpolation in $\mathbb{R}^1$ can be obtained without any additional information about the derivatives at the grid points, and this is not the case in spaces of dimension greater than or equal to 2. For example, cubic splines in $\mathbb{R}^1$ are uniquely determined by the values at the grid points and by the requirement that the second derivative vanishes at the first and last grid points (the so-called natural splines). In contrast, smooth interpolation in two or more dimensions depends on the choice of the gradient and at least one mixed derivative at every grid point, and when information about these quantities is not available—as is often the case—one usually resorts to divided difference interpolation. In general, the efficiency of the interpolation may be increased by using a nonuniform grid and by placing more grid points in the regions where the functions that are being interpolated are more "warped." We will illustrate this approach in the last section. The use of nonsmooth interpolation, such as bilinear interpolation in $\mathbb{R}^2$, for example, reduces considerably the computational complexity of the problem. In any case, the error from the interpolation must be controlled in one way or another. While the discussion of this issue is beyond the scope of this article, it should be noted that rather powerful estimates for the interpolation error are well known.
Some Examples of Bilinear Cubature Rules for Gaussian Integrals in $\mathbb{R}^2$
In general, just as one would expect, cubature rules based on bilinear interpolation are less accurate than the cubature rules described in the previous section (assuming, of course, one uses the same grid and keeping in mind that this claim is made just in general). However, from a computational point of view, bilinear cubature rules are considerably simpler and easier to implement, especially for more general integrals of the form

$$\int u\big(h(x, y)\big)\,g(x, y)\,dx\,dy$$

for some fixed functions $h$ and $g$. The greater computational simplicity actually allows one to use a denser grid, which, in many cases, is a reasonable compensation for the loss in smoothness. The dynamic integration procedures in $\mathbb{R}^2$ that are discussed in the next section are all based on bilinear cubature rules (in the case of $\mathbb{R}^1$ we will use cubic quadratures).
Here we will be working with the same interpolation grid $X_i$, $1 \le i \le n_1$, $Y_j$, $1 \le j \le n_2$, that was defined in the previous section and, just as before, in all display formulas we will write $X_i$ instead of XX[[i]] and $Y_j$ instead of YY[[j]]. Instead of working with interpolation objects of the type used so far, we will work with bilinear interpolation objects that we are going to "construct from scratch" as explicit functions of the variables $x$ and $y$ that depend on the respective grid points and tabulated values. This means that for

$$X_i \le x \le X_{i+1},\qquad Y_j \le y \le Y_{j+1},$$

the bilinear interpolation object can be expressed as

$$\mathrm{bi}\big[\{X_i, Y_j\}, \{X_{i+1}, Y_{j+1}\}, \{V_{i,j}, V_{i,j+1}, V_{i+1,j}, V_{i+1,j+1}\}, x, y\big],$$
where the function bi is defined as
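A sketch of the standard bilinear element (the argument convention, corner points first and then the four tabulated values, is our own choice):

```
bi[{x1_, y1_}, {x2_, y2_}, {v11_, v12_, v21_, v22_}, x_, y_] :=
  (v11 (x2 - x) (y2 - y) + v21 (x - x1) (y2 - y) +
   v12 (x2 - x) (y - y1) + v22 (x - x1) (y - y1))/((x2 - x1) (y2 - y1))
```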
It is clear from this definition that, treated as a function of the variables $x$ and $y$, the expression bi can be written as

$$c_1 + c_2\,x + c_3\,y + c_4\,x\,y,$$

where the entire list $(c_1, c_2, c_3, c_4)$ can be treated as a function of the grid points and the tabulated values $V_{i,j}$. In fact, this function (that is, a list of functions) can be written as follows.
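A sketch (cfbi is the name referenced later in the article; the coefficient extraction below assumes the definition of bi just given, with x and y left unassigned):

```
cfbi[{x1_, y1_}, {x2_, y2_}, v_] := Module[{cl},
  cl = CoefficientList[Expand[bi[{x1, y1}, {x2, y2}, v, x, y]], {x, y}];
  {cl[[1, 1]], cl[[2, 1]], cl[[1, 2]], cl[[2, 2]]}]  (* coefficients of 1, x, y, x y *)
```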
For every fixed $i$ and $j$, the integral of the bilinear interpolating object taken over the rectangle

$$[X_i, X_{i+1}] \times [Y_j, Y_{j+1}]$$

is simply the dot product between the list $(c_1, c_2, c_3, c_4)$ and the list of integrals (over the same rectangle) of the functions

$$p(x, y),\qquad x\,p(x, y),\qquad y\,p(x, y),\qquad x\,y\,p(x, y).$$
Now we will calculate—and store—all these lists (of integrals) of dimension 4 for all choices $1 \le i < n_1$ and $1 \le j < n_2$:
and then compute the sum of all dot products:
This sum is a linear function of the symbols $V_{i,j}$, and this linear function represents the integral of the interpolating function over the entire interpolation region (each dot product gives the integral of the interpolating object in the respective sub-region). Finally, we must extract the coefficients for the symbols $V_{i,j}$ from the linear combination (of those symbols) that we just obtained.
Now we can use the resulting tensor $\Lambda^{\mathrm{bi}}$ in the same way in which we used the tensor $\Lambda$ in the previous section: although $\Lambda^{\mathrm{bi}}$ contains concrete numeric values, this tensor still has the meaning of "integral of f in symbolic form." For example, if we set
then the dot product $\Lambda^{\mathrm{bi}} \cdot V$ turns out to be reasonably close to the value obtained by direct numerical integration. In this case, the cubic cubature rule from the previous section is considerably more accurate.
Here is another example.
This function is nonsmooth, as the following 3D plot shows.
Finally, we remark that—at least in principle—the procedures developed in this and in the previous sections may be used with any other, that is, non-Gaussian, probability density $p$, including nonsmooth densities.
Application to Optimal Stopping Problems and Option Pricing
The main objective in this section is to revisit the method of dynamic interpolation and integration described in [1] and to present a considerably more efficient implementation of this method, which takes advantage of the fact that the objects that are being integrated sequentially are actually interpolating functions. More importantly, we will show how to implement this method in the case of optimal stopping problems for certain diffusions in $\mathbb{R}^2$. Essentially, this section is about the actual implementation in Mathematica of the method encoded in the space-discretized version of the time-discrete dynamic programming equation (7). The solution is approximated in terms of dynamic cubature rules of the form (8) by using the procedure encoded in equation (11). For the sake of simplicity, the examples presented in this section will be confined to the case where the (diffusion type) observation process $X$ is computable, in the sense that for any fixed $t$ and $\Delta t$, the random variable $X_{t+\Delta t}$ can be expressed as

$$X_{t+\Delta t} = F\big(X_t,\; W_{t+\Delta t} - W_t\big)$$

for some computable function $F$ (note that the position of the diffusion process at time $t$, $X_t$, is independent of the future increment of the Brownian motion $W_{t+\Delta t} - W_t$). In other words, the conditional law of $X_{t+\Delta t}$ given the event $X_t = x$ can be identified with the distribution law of a random variable of the form

$$F\big(x, \sqrt{\Delta t}\,Z\big),$$

where $Z$ is a standard normal $\mathbb{R}^d$-valued random variable. Of course, when $X$ is computable in the sense that we just described, then there is no need to use the approximation (6), and, if $X$ is not computable, in principle at least, the procedures described in this section can still be used, provided that the definition of the function $F$ is changed according to the recipe (6).
We note that the Mathematica code included in this section is independent of the code included in all previous sections.
To begin with, define the following symmetric probability density function in $\mathbb{R}^2$.
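A sketch (the function name p and the parameter name rho are our own choices):

```
p[x_, y_] := Exp[-(x^2 - 2 rho x y + y^2)/(2 (1 - rho^2))]/(2 Pi Sqrt[1 - rho^2])
```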
This is nothing but the probability density of a bivariate normal random variable $(Z_1, Z_2)$, with standard normal marginals $Z_1$ and $Z_2$ and with correlation $\rho$. Let $W^1 = (W^1_t)_{t \ge 0}$ and $W^2 = (W^2_t)_{t \ge 0}$ be two standard Brownian motions that are independent, so that $E[W^1_t\,W^2_s] = 0$ for any $t, s \ge 0$. Next, for some fixed $-1 < \rho < 1$ and for $t \ge 0$, set $B^1_t = W^1_t$ and $B^2_t = \rho\,W^1_t + \sqrt{1 - \rho^2}\,W^2_t$. Note that the joint distribution law of $(B^1_t, B^2_t)/\sqrt{t}$ is the same as the joint distribution law of $(Z_1, Z_2)$. It is a trivial matter to check that
$$X_t = \big(X^1_t, X^2_t\big) = \Big(x_1\,e^{\sigma B^1_t - \sigma^2 t/2},\; x_2\,e^{\sigma B^2_t - \sigma^2 t/2}\Big)$$

is a diffusion process in $\mathbb{R}^2$ with generator

$$\mathcal{L} = \frac{\sigma^2}{2}\Big(x_1^2\,\partial^2_{x_1} + 2\rho\,x_1 x_2\,\partial_{x_1}\partial_{x_2} + x_2^2\,\partial^2_{x_2}\Big).$$
In what follows, we study the problem of optimal stopping of the process $X$ no later than time $T$, for a given termination payoff $\mathcal{U}$. For the sake of simplicity, we will suppose that the discount rate for future payoffs is $0$. In the special case where $\mathcal{U}(x_1, x_2) = u(x_1)$, for some "sufficiently nice" function of one variable $u$, the problem comes down to the optimal stopping of the one-dimensional diffusion $X^1$. We analyze this one-dimensional case first.
The time step in the procedure, that is, the quantity $\Delta t$, is set to $0.05$ years (so that a calendar year amounts to 20 iterations).
Next, define the interpolation grid:
Just as before, in all display equations we will write $Y_i$ instead of YY[[i]]. The boundary conditions will be prescribed at the first and last two abscissas on the extended grid. This means that the recursive procedure will update only the values at the interior abscissas. Thus, the procedure that we develop can be applied only to situations where the value function does not change at those points in the state-space that are either close to $0$ or happen to be very large.
The next step is to define the list of tabulated symbols
and the associated (symbolic) InterpolatingFunction object
Next, extract the coefficients of the cubic polynomials that represent the object g between the interpolation nodes.
For example, between two consecutive interpolation nodes $Y_j$ and $Y_{j+1}$, the expression g[x] can be identified with an explicit cubic polynomial of $x$ whose coefficients are linear combinations of the tabulated symbols.
Next, since the diffusion is the exponential process $X^1_{t+\Delta t} = x\,e^{\sigma\sqrt{\Delta t}\,z - \sigma^2 \Delta t/2}$, note that the relation

$$Y_j \le x\,e^{\sigma\sqrt{\Delta t}\,z - \sigma^2 \Delta t/2} \le Y_{j+1}$$

is equivalent to

$$\frac{\ln(Y_j/x) + \sigma^2 \Delta t/2}{\sigma\sqrt{\Delta t}} \le z \le \frac{\ln(Y_{j+1}/x) + \sigma^2 \Delta t/2}{\sigma\sqrt{\Delta t}}.$$
For every fixed abscissa $Y_i$, we will compute all integrals of the monomials $z^k$, $0 \le k \le 3$, multiplied by the standard normal density, taken over the intervals of $z$ values that correspond to the grid intervals $[Y_j, Y_{j+1}]$, and will store these integrals in a matrix (initially filled with zeros):
In particular, the integral
can be replaced by
and we remark that
The entire expression for the conditional expectation then becomes a linear function of the symbols attached to the abscissas. This linear function (i.e., a vector of dimension $N$) depends on the index $i$ and represents a quadrature rule for the integral in the left side in the last identity. We stress that we need one such rule for every abscissa $Y_i$, $1 \le i \le N$. We will store all these rules (i.e., vectors of dimension $N$) in the tensor $\Xi$ (initially filled with zeroes):
As a first application, we compute the price of an American put option with strike price 40, that is, with termination payoff $u(x) = (40 - x)^+$:
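A sketch (the strike 40 is taken from the discussion later in this section):

```
u[x_] := Max[40 - x, 0]
```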
The first iteration will be done by using a direct numerical integration—not interpolation quadrature rules.
Now we will do 19 additional iterations by using interpolation quadratures. The time step is $\Delta t = 0.05$ years. This will give us the approximate price of an American put with strike price 40 and one year left to expiry. Note that the list $\Xi_i$ gives the integral associated with the abscissa $Y_i$. The idea is to keep the values at the first two and the last two abscissas on the extended grid fixed, throughout the iterations, at their initial values.
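Schematically, each of these iterations performs the update below (a sketch; it assumes that YY is the extended grid, that Xi[[i]] is the quadrature vector for the abscissa YY[[i]], that vals already holds the values produced by the direct first iteration, and that the discount rate is 0):

```
step[v_] := Table[
   If[3 <= i <= Length[YY] - 2, Max[u[YY[[i]]], Xi[[i]].v], v[[i]]],
   {i, Length[YY]}];
vals = Nest[step, vals, 19];   (* 19 further iterations of size dt = 0.05 *)
```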
Note that the first iteration took about 20 times longer to compute than the remaining 19. Now we define the value function after 20 iterations—this is nothing but the option price with one year left to maturity treated as a function defined on the entire range of observable stock prices.
We plot the value function together with the termination payoff $u$.
The range of abscissas at which the value function and the termination payoff coincide is precisely the price range where immediate exercise of the option is optimal, assuming that there is exactly one year left to expiry.
Now we consider the same optimal stopping problem but with a smooth payoff function, namely
Essentially, this is an example of some sort of a generalized obstacle problem.
In this case, there is no need to perform the first iteration by using direct numerical integration.
Now we perform 60 iterations, which will give us the value function with three years left to the termination deadline.
The most common approach to dynamic programming problems of this type is to use the free boundary problem formulation, followed by some suitable variation of the finite difference scheme. However, such an approach is much less straightforward and usually takes much longer to compute—a fraction of a second would be very hard to imagine even on a very powerful computer. In fact, we could double the number of the interpolation nodes and cut the time step in half, which would roughly quadruple the timing, and still complete the calculation within one second.
It is important to point out that, in terms of the previous procedure, there is nothing special about the standard normal density. In fact, we could have used just about any other reasonably behaved probability density function $p$—as long as the associated integrals are computable. In particular, one can use this procedure in the context of financial models in which the pricing process is driven by some non-white noise process, including a noise process with jumps. The only requirement is that we must be able to compute the integrals

$$\int_{\alpha}^{\beta} z^k\,p(z)\,dz,\qquad 0 \le k \le 3,$$

for any $\alpha < \beta$.
Finally, we consider the two-dimensional case. We use the same grid-points on both axes:
Although the lists Y and YY that we use for the second coordinate are the same as the ones used for the first coordinate, in the notation that we use we will pretend that these lists might be different and will write $X_i$ for the abscissas attached to the first coordinate and $Y_j$ for the abscissas attached to the second coordinate, and set
We will use bilinear cubature rules, which are easier to implement—we need one such rule for every grid point $(X_i, Y_j)$, $1 \le i \le n_1$, $1 \le j \le n_2$. Thus, we must compute all integrals
$$\int_{c}^{C}\!\!\int_{d}^{D} x^k\,y^l\,p(x, y)\,dx\,dy\qquad(14)$$

for $k = 0, 1$, for $l = 0, 1$, and for all choices of the integration limits $c < C$ and $d < D$ that are determined by the interpolation grid and by the grid point with which the respective cubature rule is associated (because of the symmetry in the density $p$, it is enough to consider only the grid points with $i \le j$). Now we need to operate with substantially larger lists of data and for that reason we need to organize the Mathematica code differently.
Assuming that $g$ is an interpolating function constructed by way of bilinear interpolation from the tabulated symbols $V_{i,j}$, we can express each of the integrals of $g$ over a single grid cell in the form of a dot product

$$\mathrm{cfbi}[\,\cdot\,] \cdot \mathrm{locIntegrals},$$

where $\mathrm{cfbi}[\,\cdot\,]$ is the list of coefficients produced by the function cfbi from the previous section and locIntegrals is the matrix of the integrals (14) indexed by $(k, l)$ ($k$ is the first index and $l$ is the second). In fact, with our special choice for the bi-variate density $p$, all these integrals can be calculated in closed form:
After fixing the values for the volatility $\sigma$ and the time step $\Delta t$, the integrals can be written as explicit functions of the integration limits:
Our method depends in a crucial way on the fact that each such integral of $g$ is actually a linear function of the symbols $V_{i,j}$. In fact, we can compute the coefficients in this linear combination explicitly, as we now demonstrate.
The next two definitions are taken from the previous section, so that the code included in this section can be self-contained.
Thus, the entire integral

$$\int\!\!\int g\big(x_1\,e^{\sigma\sqrt{\Delta t}\,z_1 - \sigma^2 \Delta t/2},\; x_2\,e^{\sigma\sqrt{\Delta t}\,z_2 - \sigma^2 \Delta t/2}\big)\,p(z_1, z_2)\,dz_1\,dz_2\qquad(15)$$

can be treated as a linear function of the symbols $V_{i,j}$, $1 \le i \le n_1$, $1 \le j \le n_2$.
This linear function is nothing but a tensor of dimension $n_1 \times n_2$ and this tensor is nothing but a cubature rule for the integral (15), which, obviously, depends on the indices $i$ and $j$. Consequently, we have one such cubature rule (tensor) for every choice of $i$ and every choice of $j$, and all such rules will be stored in the tensor $\Xi$ (initially filled with zeros):
The calculation of the tensor $\Xi$ is the key step in the implementation of the method of dynamic interpolation and integration in dimension 2. This is the place where we must face the "curse of the dimension." It is no longer efficient to express the global integral (15) as the sum of local integrals of the form (14) and then extract from the sum the coefficients for the symbols $V_{i,j}$. It turns out to be much faster if one updates the coefficients for the symbols sequentially, as the local integrals in (14) are calculated one by one. Just as one would expect, dealing with the boundary conditions in higher dimensions is also trickier. Throughout the dynamic integration procedure we will be updating the values of the value function only at the interior grid points $(X_i, Y_j)$; the values at the remaining grid points—those for which $i$ or $j$ points to one of the first two or the last two abscissas—will be kept unchanged, or, depending on the nature of the problem, will be updated according to some other—one-dimensional, perhaps—dynamic quadrature rule.
Now we turn to the actual computation of the tensor $\Xi$. On a single "generic" processor, this task takes about an hour (if more processors are available the task can be distributed between several different Mathematica kernels in a trivial way). The key point is that this is a calculation that we do once and for all. As will be illustrated shortly, once the tensor $\Xi$ is calculated, a whole slew of optimal stopping problems can be solved within seconds. Of course, the "slew of optimal stopping problems" is limited to the ones where the termination payoff and the value functions obtained throughout the dynamic integration can be approximated reasonably well by way of bilinear interpolation from the same interpolation grid. In general, the set of these "quickly solvable" problems can be increased by choosing a denser grid and/or a grid that covers a larger area. However, doing so may become quite expensive. For example, if we were to double the number of grid points in each coordinate, the computing time for the tensor $\Xi$ would increase roughly 16 times—again, everything is easily parallelizable.
It is important to recognize that the calculation of the list $\Xi$ involves only the evaluation of standard functions and accessing the elements of a fairly large tensor ($\Xi$ has a total of 1,245,621 entries). This is close to the "minimal number of calculations" that one may expect to get away with: if the number of discrete data from which the value function can be restored with a reasonable degree of accuracy is fairly large, then there is a fairly large number of values that will have to be calculated no matter what. Furthermore, the evaluation of standard functions is very close to the fastest numerical procedure that one may hope to get away with in this situation. From an algorithmic point of view, the efficiency of this procedure can be improved in two ways: (1) find a new way to encrypt the value function with fewer discrete values; and/or (2) represent all functions involved in the procedure (i.e., all combinations of standard functions) in a form that allows for an even faster evaluation. In terms of hardware, the speed of the procedure depends not only on the speed of the processor but also on the speed and the organization of the memory.
Instead of calculating the list $\Xi$, one may load its (precomputed) value from the files Xi.mx or Xi.txt (if available):
or
Now we can turn to some concrete applications. First, we will consider an American put option with strike price 40 on the choice of one of two uncorrelated assets (the owner of the option can choose one of the two assets when the option is exercised). The termination payoff from this option is

$$\mathcal{U}(x_1, x_2) = \big(40 - \min(x_1, x_2)\big)^+.$$
The two underlying assets are uncorrelated and follow the processes $X^1_t = x_1\,e^{\sigma B^1_t - \sigma^2 t/2}$, $t \ge 0$, and $X^2_t = x_2\,e^{\sigma B^2_t - \sigma^2 t/2}$, $t \ge 0$, where $B^1$ and $B^2$ are the Brownian motions described earlier in this section. Let $V_s$ be the value function with $s$ years left to expiry, that is, if the time left to expiry is $s$ years and the prices of the two assets are, respectively, $x_1$ and $x_2$, then the value of the option is $V_s(x_1, x_2)$, and we remark that what is meant here as "the value of the option" is actually a function defined on the entire range of prices, that is, the range of prices covered by the interpolation grid. Clearly, we must have $V_0 = \mathcal{U}$. Furthermore, when one of the assets has a very large value, then the option is very close to a canonical American put option on the other asset. Consequently, the values at the grid points at which one of the coordinates points to one of the two largest abscissas will never be updated by the two-dimensional cubature rules. When the second coordinate is fixed at either of the two largest abscissas, the values at the grid points along the first coordinate will be updated exactly as we did earlier in the case of an American put on a single asset with a strike price 40. Similarly, when the first coordinate is fixed at either of the two largest abscissas, the values along the second coordinate will be updated in the same way, that is, as if we were dealing with an American put on a single asset with strike price 40. Of course, since the payoff $\mathcal{U}$ is symmetric and so is the density $p$, we only need to update the values at the grid points $(X_i, Y_j)$ for $i \le j$.
Now we turn to the actual calculation. By using the method of dynamic integration of interpolating functions, we will compute—approximately—the pricing function that maps the prices of the underlying assets in the range covered by the grid into the price of the option with one year left to maturity. With time step $\Delta t = 0.05$, we need to perform 20 iterations in the dynamic integration procedure. Note that if we choose to perform the first iteration by direct numerical integration of the termination payoff, then that would require the calculation of 741 integrals.
We initialize the procedure by tabulating the termination payoff over the interpolation grid.
Then we do 20 iterations—at each iteration we update the tensor of tabulated values.
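Schematically, each iteration may be organized as follows (a sketch; it assumes that Xi[[i, j]] is the cubature tensor for the node (YY[[i]], YY[[j]]), that the discount rate is 0, that the assumed payoff has strike 40, and that the boundary entries are simply kept fixed rather than updated by the one-dimensional rule described above):

```
U[x_, y_] := Max[40 - Min[x, y], 0];       (* assumed termination payoff *)
n = Length[YY];
step2[VT_] := Table[
   If[3 <= i <= n - 2 && 3 <= j <= n - 2,
    Max[U[YY[[i]], YY[[j]]], Total[Xi[[i, j]] VT, 2]], VT[[i, j]]],
   {i, n}, {j, n}];
VT20 = Nest[step2, Outer[U, YY, YY], 20];  (* 20 iterations = one year *)
```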
Now we can produce the pricing map with one year left to maturity.
Calculating the derivatives of the value function (treated as functions, too) is straightforward, as we now illustrate. Before we do, we remark that in the realm of finance, information about these derivatives (known as the deltas of the option) is often more important than the pricing function itself.
In our second example, we consider an American put option on the more expensive of two given underlying assets (in contrast, in our previous example we dealt with an American put option on the less expensive of the two underlying assets). In this case, the termination payoff function is given by

$$\mathcal{U}(x_1, x_2) = \big(K - \max(x_1, x_2)\big)^+$$

for some strike price $K$.
This time the boundary conditions must take into account the fact that when one of the assets becomes too expensive, the option becomes worthless, and when one of the assets is worthless, the option may be treated as a standard American option on the other asset. For the purpose of illustration we will compute the value of the option with three years left to maturity.
This is a good example of a value function which is neither everywhere smooth nor convex.
In our final example we consider optimal stopping of the same diffusion in but with smooth termination payoff of the form
With this termination payoff the boundary conditions must reflect the fact that the option is worthless when one of the assets is worthless or when one of the assets is too expensive. We will execute 60 iterations, which will give us the value function with three years left to expiry.
Conclusions
Most man-made computing devices can operate only with finite lists of numbers or symbols. Any use of such devices for modeling, analysis, and optimal control of stochastic systems inevitably involves the encoding of inherently complex phenomena in terms of finite lists of numbers or symbols. Thus one can think of two general directions that may lead to expanding the realm of computable models. First, one may try to construct computing devices that can handle larger and larger lists, faster and faster. Second, one may try to develop “smarter” procedures with which more complex objects can be encrypted with shorter lists of numbers. With this general direction in mind, the methodology developed in this article is entirely logical and natural. Indeed, the approximation of functions with splines or other types of interpolating functions is a familiar, well developed, and entirely natural computing tool. The use of special quadrature/cubature rules—as opposed to general methods for numerical integration—in the context of dynamic integration of interpolating functions is just as logical and natural. However, only recently have computer languages in which one can implement such procedures in a relatively simple and straightforward fashion become widely available. This article is an attempt to demonstrate how the advent of more sophisticated computing technologies may lead to the development of new and more efficient algorithms. Curiously, new and more efficient algorithms often lead to the design of new and more efficient computing devices. In particular, all procedures described in this article can be implemented on parallel processors or on computing grids in essentially a trivial way. There is a strong incentive to build computing devices that can perform simultaneously several numerical integrations, or can compute dot products between very large lists of numbers very fast.
Finally, we must point out that most of the examples presented in the article are only prototypes. They were meant to be executed on a generic (and slightly out of date) laptop computer with the smallest possible number of complications. Many further improvements in terms of higher accuracy and overall efficiency can certainly be made.
Acknowledgments
The author thanks all three anonymous referees for their extremely helpful comments and suggestions.
References
[1] A. Lyasoff, "Path Integral Methods for Parabolic Partial Differential Equations with Examples from Computational Finance," The Mathematica Journal, 9(2), 2004 pp. 399-422.
[2] A. Bensoussan and J. L. Lions, Applications of the Variational Inequalities in Stochastic Control, Amsterdam: North Holland, 1982.
[3] M. H. A. Davis, Markov Models and Optimization, Monographs on Statistics and Applied Probability, Vol. 49, Chapman & Hall/CRC Press, 1993.
[4] V. Bally and G. Pagès, "A Quantization Algorithm for Solving Discrete Time Multidimensional Optimal Stopping Problems," Bernoulli, 9(6), 2003 pp. 1003-1049.
[5] V. Bally and G. Pagès, "Error Analysis of the Quantization Algorithm for Obstacle Problems," Stochastic Processes and their Applications, 106(1), 2003 pp. 1-40.
[6] B. Bouchard, I. Ekeland, and N. Touzi, "On the Malliavin Approach to Monte Carlo Approximation of Conditional Expectations," Finance and Stochastics, 8(1), 2004 pp. 45-71.
[7] M. Broadie and P. Glasserman, "Pricing American-Style Securities Using Simulation," Journal of Economic Dynamics and Control, 21(8-9), 1997 pp. 1323-1352.
[8] F. A. Longstaff and E. S. Schwartz, "Valuing American Options by Simulation: A Simple Least-Squares Approach," Review of Financial Studies, 14(1), 2001 pp. 113-147.
[9] W. H. Press, B. Flannery, S. A. Teukolsky, and W. T. Vetterling, Numerical Recipes in C: The Art of Scientific Computing, 2nd ed., Cambridge: Cambridge University Press, 1992.
[10] G. E. Forsythe, M. A. Malcolm, and C. B. Moler, Computer Methods for Mathematical Computations, Englewood Cliffs, NJ: Prentice Hall, 1977.
[11] T. Lyons and N. Victoir, "Cubature on Wiener Space," Proceedings of the Royal Society of London, Series A (Mathematical and Physical Sciences), 460(2041), 2004 pp. 169-198.
[12] S. Kusuoka, "Approximation of Expectation of Diffusion Process and Mathematical Finance," in Advanced Studies in Pure Mathematics, Proceedings of the Taniguchi Conference on Mathematics (Nara 1998), (M. Maruyama and T. Sunada, eds.), 31, 2001 pp. 147-165.
[13] S. Kusuoka, "Approximation of Expectation of Diffusion Processes Based on Lie Algebra and Malliavin Calculus," Advances in Mathematical Economics, 6, 2004 pp. 69-83.
[14] A. H. Stroud, Approximate Calculation of Multiple Integrals, Englewood Cliffs, NJ: Prentice-Hall, 1971.
A. Lyasoff, "Dynamic Integration of Interpolating Functions and Some Concrete Optimal Stopping Problems," The Mathematica Journal, 2011. dx.doi.org/10.3888/tmj.10.4-3.
Additional Material
Lyasoff.zip contains:
integrals.mx
integrals.txt
CoeffList.mx
CoeffList.txt
Xi.mx
Xi.txt
Available at content.wolfram.com/sites/19/2011/12/Lyasoff.zip.
About the Author
Andrew Lyasoff is director of the Graduate Program in Mathematical Finance at Boston University. His research interests are mainly in the areas of Stochastic Analysis, Optimization Theory, and Mathematical Finance and Economics.
Andrew Lyasoff
Boston University
Mathematical Finance Program
Boston, MA
alyasoff@bu.edu