Parametric Techniques on Other Distributions- USING THE PARAMETERS TO FIND OPTIMAL F



Now that we have found the best-fitting parameter values, we can find the optimal f on this distribution. We can follow the same procedure we used to find the optimal f on the Normal Distribution. The only difference now is that the associated probabilities for each standard value (X value) are calculated by a different procedure. With the Normal Distribution, we find our associated probabilities column directly from an equation. Here, to find our associated probabilities, we must follow the procedure detailed previously:

1. For a given standard value, X, we figure its corresponding N'(X).
2. For each standard value, we also have the interim step of keeping a running sum of the N'(X)'s corresponding to each value of X.
3. Now, to find N(X), the resultant probability for a given X, add together the running sum corresponding to the X value with the running sum corresponding to the previous X value. Divide this sum by 2. Then divide this quotient by the sum total of the N'(X)'s, the last entry in the column of running sums. This new quotient is the associated 1-tailed probability for a given X.
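The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `associated_probabilities` is hypothetical, the running sum before the first X is assumed to be zero, and the standard Normal density stands in for N'(X) because the adjustable distribution's density is not restated in this section.

```python
import math

def associated_probabilities(xs, density):
    """Estimate the 1-tailed probability N(X) for each standard value in xs.

    xs must be equally spaced and in ascending order; density is N'(X).
    """
    # Steps 1 and 2: evaluate N'(X) and keep a running sum of the values.
    running_sums = []
    total = 0.0
    for x in xs:
        total += density(x)
        running_sums.append(total)

    # Step 3: average this running sum with the previous one, then divide
    # by the grand total (the last entry in the running-sum column).
    probabilities = []
    previous = 0.0  # assumption: the running sum before the first X is zero
    for current in running_sums:
        probabilities.append((current + previous) / 2.0 / total)
        previous = current
    return probabilities

def normal_density(x):
    """Stand-in for N'(X); NOT the adjustable distribution of the text."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

# 61 equally spaced standard values from -3 to +3 sigmas.
grid = [-3.0 + 0.1 * i for i in range(61)]
probs = associated_probabilities(grid, normal_density)
```

With a symmetric density on a symmetric grid, the probability at X = 0 comes out at one half, which is a quick sanity check on the running-sum arithmetic.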

Since we now have a procedure to find the associated probabilities for a given standard value, X, for a given set of parameter values, we can find our optimal f. The procedure is exactly the same as that detailed for finding the optimal f on the Normal Distribution. The only difference is that we calculate the associated probabilities column differently. In our 232-trade example, the parameter values that result in the lowest K-S statistic are .02, 2.76, 0, and 1.78 for LOC, SCALE, SKEW, and KURT respectively. We arrived at these parameter values by using the optimization procedure. This resulted in a K-S statistic of .0835529, and a significance level of 7.8384%.

Figure  Adjustable distribution fit to the 232 trades.

If we take these parameters and find the optimal f on this distribution, bounding the distribution from +3 to -3 sigmas and using 100 equally spaced data points, we arrive at an optimal f value of .206, or 1 contract for every $23,783.17. Compare this to the empirical method, which showed that optimal growth is obtained at 1 contract for every $7,918.04 in account equity. But that is the result we get if we bound the distribution at 3 sigmas either side of the mean. In reality, in the empirical stream of trades, we had a worst-case loss of 2.96 sigmas and a best-case gain of 6.94 sigmas. 

Now if we go back and bound our distribution at 2.96 sigmas on the left of the mean and 6.94 on the right, we obtain an optimal f of .954 or 1 contract for every $5,062.71 in account equity. Why does this differ from the empirical optimal f of $7,918.04? The difference is in the "roughness" of the actual distribution. Recall that the significance level of our best-fitting parameters was only 7.8384%. Let us take our 232-trade distribution and bin it into 12 bins from -3 to +3 sigmas.

    Bin (sigmas)
    From      To        Number of Trades
    -3.0      -2.5        2
    -2.5      -2.0        1
    -2.0      -1.5        2
    -1.5      -1.0       24
    -1.0      -0.5       39
    -0.5       0.0       43
     0.0       0.5       69
     0.5       1.0       38
     1.0       1.5        7
     1.5       2.0        2
     2.0       2.5        0
     2.5       3.0        2

Notice that out on the tails of the distribution are gaps, areas or bins where there isn't any empirical data. These areas invariably get smoothed over when we fit our adjustable distribution to the data, and it is these smoothed-over areas that cause the difference between the parametric and the empirical optimal fs. Why doesn't our distribution fit the observed better, especially in light of how malleable it is? The reason has to do with the observed distribution having too many points of inflection.

A parabola can be cupped upward or downward. Yet over the extent of a parabola, the direction of the cup, whether it points upward or downward, is unchanged. We define a point of inflection as any point where the direction of the concavity changes, whether from up to down or from down to up. Therefore, a parabola has 0 points of inflection, since the direction of the concavity never changes. An object shaped like the letter S lying on its side has one point of inflection, one point where the concavity changes direction.

Figure  Points of inflection on a bell-shaped distribution.

Notice there are two points of inflection in a bell-shaped curve such as the Normal Distribution. Depending on the value for SCALE, our adjustable distribution can have zero points of inflection or two points of inflection. The reason our adjustable distribution does not fit the actual distribution of trades any better than it does is that the actual distribution has too many points of inflection.
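As a rough illustration of this "roughness," one can count how often the concavity of the binned trade counts changes sign. The sketch below is an assumption of this edit, not a method from the text: it uses the second difference of adjacent bin counts as a crude discrete proxy for concavity, and the function name `count_concavity_changes` is hypothetical.

```python
def count_concavity_changes(counts):
    """Count sign changes in the second difference of a list of bin counts."""
    second_diffs = [
        counts[i - 1] - 2 * counts[i] + counts[i + 1]
        for i in range(1, len(counts) - 1)
    ]
    # Ignore flat spots (zero second difference) when comparing signs.
    signs = [1 if d > 0 else -1 for d in second_diffs if d != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

# Bin counts of the 232 trades between -3 and +3 sigmas, from the table.
observed_bins = [2, 1, 2, 24, 39, 43, 69, 38, 7, 2, 0, 2]

# A made-up smooth bell shape, for comparison only.
smooth_bell = [1, 4, 10, 16, 19, 16, 10, 4, 1]

observed_changes = count_concavity_changes(observed_bins)
smooth_changes = count_concavity_changes(smooth_bell)
```

By this crude measure the smooth bell shows two changes of concavity, while the observed bins show more, which is consistent with the argument that the empirical distribution has more points of inflection than the adjustable distribution can reproduce.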

Does this mean that our fitted adjustable distribution is wrong? Probably not. If we were so inclined, we could create a distribution function that allowed for more than two points of inflection, which would better curve-fit to the actual observed distribution. If we created a distribution function that allowed for as many points of inflection as we desired, we could fit to the observed distribution perfectly. Our optimal f derived therefrom would then be nearly the same as the empirical. However, the more points of inflection we were to add to our distribution function, the less robust it would be (i.e., it would probably be less representative of the trades in the future). In any case, we are not trying to fit the parametric f to the observed exactly.

We are trying to determine how the observed data is distributed so that we can determine with a fair degree of accuracy what the optimal f in the future will be if the data is distributed as it was in the past. When we look at the adjustable distribution that has been fit to our actual trades, the spurious points of inflection are removed. An analogy may clarify this. Suppose we are using Galton's board. We know that asymptotically the distribution of the balls falling through the board will be Normal. However, we are only going to see 4 balls rolled through the board. Can we expect the outcomes of the 4 balls to be perfectly conformable to the Normal? How about 5 balls? 50 balls? In an asymptotic sense, we expect the observed distribution to flesh out to the expected as the number of trades increases.

Fitting our theoretical distribution to every point of inflection in the actual will not give us any greater degree of accuracy in the future. As more trades occur, we can expect the observed distribution to converge toward the expected, as we can expect the extraneous points of inflection to be filled in with trades as the number of trades approaches infinity. If the process generating the trades is accurately modeled by our parameters, the optimal f derived from the theoretical will be more accurate over the future sequence of trades than the optimal f derived empirically over the past trades. In other words, if our 232 trades are a proxy of the distribution of the trades in the future, then we can expect the trades in the future to arrive in a distribution more like the theoretical one that we have fit than like the observed with its extraneous points of inflection and its roughness due to not having an infinite number of trades. 

In so doing, we can expect the optimal f in the future to be more like the optimal f obtained from the theoretical distribution than it is like the optimal f obtained empirically over the observed distribution. So, we are better off in this case to use the parametric optimal f rather than the empirical. The situation is analogous to a game of 20 coin tosses. If we expect 60% wins at a 1:1 payoff, the optimal f is correctly .2. However, if we only had empirical data of the last 20 tosses, 11 of which were wins, our optimal f would show as .1, even though .2 is what we should optimally bet on the next toss since it has a 60% chance of winning. We must assume that the parametric optimal f is correct because it is the optimal f on the generating function. As with the coin-toss game just mentioned, we must assume that the optimal f for the next trade is determined parametrically by the generating function, even though this may differ from the empirical optimal f.
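The coin-toss figures can be checked numerically. The sketch below is a minimal illustration, assuming a 1:1 payoff and a simple grid search over f in steps of .001; the function names are hypothetical. It maximizes the expected logarithmic growth per toss, which is equivalent to maximizing the geometric mean.

```python
import math

def coin_toss_growth(f, p_win):
    """Expected log growth per toss when betting fraction f at 1:1 odds."""
    return p_win * math.log(1.0 + f) + (1.0 - p_win) * math.log(1.0 - f)

def coin_toss_optimal_f(p_win, steps=1000):
    """Grid search for the f in (0, 1) that maximizes expected log growth."""
    candidates = [i / steps for i in range(1, steps)]
    return max(candidates, key=lambda f: coin_toss_growth(f, p_win))

parametric_f = coin_toss_optimal_f(0.60)       # the generating function: 60% wins
empirical_f = coin_toss_optimal_f(11 / 20)     # the observed sample: 11 wins in 20
```

The search returns .2 for the generating function and .1 for the 20-toss sample, matching the values in the text.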

Obviously, the bounding parameters have a very important effect on the optimal f. Where should you place the bounding parameters so as to obtain the best results? Look at what happens as we move the upper bound up. The following table is compiled by bounding the lower end at -3 sigmas, and using 100 equally spaced data points and the optimal parameters to our 232 trades:

Upper Bound       f        f$
3 Sigmas         .206     $23,783.17
4 Sigmas         .588     $8,332.51
5 Sigmas         .784     $6,249.42
6 Sigmas         .887     $5,523.73
7 Sigmas         .938     $5,223.41
8 Sigmas         .963     $5,087.81
100 Sigmas       .999     $4,904.46

Notice that, keeping the lower bound constant, the higher we move the upper bound, the more the optimal f approaches 1. Thus, the more we move the upper bound up, the more the optimal f in dollars will approach the magnitude of the lower bound. In this case, where our lower bound is at -3 sigmas, the lower bound is $330.13 - (1743.23 * 3) = -$4,899.56, so the optimal f in dollars approaches $4,899.56 as a limit. Now observe what happens when we keep the upper bound constant (at 3), but move the lower bound lower. Very soon into this process the arithmetic mathematical expectation turns negative. This happens because more than 50% of the area under the characteristic function is to the left of the zero axis. Consequently, as we move the lower bounding parameter lower, the optimal f quickly goes to zero.
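The relation between f, the lower bound, and f$ in the table can be reproduced with simple arithmetic. The sketch below assumes, as the figures in the text imply, an arithmetic average trade of $330.13 and a standard deviation of $1,743.23, and that f$ is the magnitude of the worst-case loss divided by f. The function names are hypothetical.

```python
MEAN_TRADE = 330.13   # arithmetic average trade in dollars (from the text)
STD_DEV = 1743.23     # standard deviation of trades in dollars (from the text)

def worst_case_loss(lower_sigmas):
    """Dollar value of a lower bound placed lower_sigmas below the mean."""
    return MEAN_TRADE - STD_DEV * lower_sigmas

def dollars_per_contract(f, lower_sigmas):
    """f$: account equity needed to finance one contract at fraction f."""
    return -worst_case_loss(lower_sigmas) / f

limit = -worst_case_loss(3)                        # f$ as f approaches 1
near_limit = dollars_per_contract(0.999, 3)        # the 100-sigma row
```

Because the f values in the table are rounded to three places, recomputing f$ from them agrees with the other rows of the table only approximately.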

Now consider what happens when we move both bounding parameters out at the same rate. Here we are using the optimal parameter set of .02, 2.76, 0, and 1.78 on our distribution of 232 trades, and 100 equally spaced data points:

Upper and Lower Bound      f        f$
3 Sigmas                  .206     $23,783.17
4 Sigmas                  .158     $42,040.42
5 Sigmas                  .126     $66,550.75
6 Sigmas                  .104     $97,387.87
10 Sigmas                 .053     $322,625.17

Notice that our optimal f approaches 0 as we move both bounding parameters out to plus and minus infinity. Furthermore, since our worst-case loss gets greater and greater, and gets divided by a smaller and smaller optimal f, our f$, the amount to finance 1 unit by, approaches infinity as well. The problem of where the best place is to put the bounding parameters is best rephrased as, "Where, in the extreme case, do we expect the best and worst trades in the future to occur?" The tails of the distribution itself actually go to plus and minus infinity. To account for this we would optimally finance each contract by an infinitely high amount. If we were going to trade for an infinitely long time into the future, our optimal f in dollars would be infinite. 

But we're not going to trade this market system forever. The optimal f in the future over which we are going to trade this market system is a function of what the best and worst trades in that future are. Recall that if we flip a coin 100 times and record what the longest streak of consecutive tails is, then flip the coin another 100 times, the longest streak of consecutive tails at the end of 200 flips will more than likely be greater than it was after only the first 100 flips. Similarly, if the worst-case loss seen over our 232-trade history was a 2.96-sigma loss, then we should expect a loss of greater than 3 sigmas in the future over which we are going to trade this market system.

Therefore, rather than bounding our distribution at what the bounds of the past history of trades were (-2.96 and +6.94 sigmas), we will bound it at -4 and +6.94 sigmas. We should perhaps expect the high-end bound to be violated in the future, much as we expect the low-end bound to be violated. However, we won't make this assumption for a couple of reasons. The first is that trading systems notoriously do not trade as well into the future, in general, as they have over historical data, even when there are no optimizable parameters involved. It gets back to the principle that mechanical trading systems seem to suffer from a continually deteriorating edge. 

Second, the fact that we pay a lesser penalty for erring in optimal f if we err to the left of the peak of the f curve than if we err to the right of it suggests that we should err on the conservative side in our prognostications about the future. Therefore, we will determine our parametric optimal f by using the bounding parameters of -4 and +6.94 sigmas and use 300 equally spaced data points. However, in calculating the probabilities at each of the 300 equally spaced data points, it is important that we begin our distribution 2 sigmas before and after our selected bounding parameters. We therefore determine the associated probabilities by creating bars from -6 to +8.94 sigmas, even though we are only going to use the bars between -4 and +6.94 sigmas. 

In so doing, we have enhanced the accuracy of our results. Using our optimal parameters of .02, 2.76, 0, and 1.78 now yields an optimal f of .837, or 1 contract per every $7,936.41. So long as our selected bounding parameters are not violated, our model of reality is accurate in terms of the bounds selected. That is, so long as we do not see a loss greater than 4 sigmas ($330.13 - (1743.23 * 4) = -$6,642.79) or a profit greater than 6.94 sigmas ($330.13 + (1743.23 * 6.94) = $12,428.15), we have accurately modeled the bounds of the distribution of trades in the future. The possible divergence between our model and reality is our blind spot. That is, the optimal f derived from our model is the optimal f for our model, not necessarily for reality.
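One practical way to watch this blind spot is to convert the selected bounds to dollars once and test each new trade against them. The sketch below is illustrative only; the helper names `sigmas_to_dollars` and `within_model_bounds` are hypothetical, and the mean and standard deviation are the dollar figures used in the text.

```python
def sigmas_to_dollars(sigmas, mean=330.13, std_dev=1743.23):
    """Convert a bound expressed in sigmas into a dollar P&L."""
    return mean + std_dev * sigmas

LOWER_BOUND = sigmas_to_dollars(-4.0)    # worst loss the model allows for
UPPER_BOUND = sigmas_to_dollars(6.94)    # best profit the model allows for

def within_model_bounds(pnl):
    """True if a trade's P&L lies inside the bounds the model assumed."""
    return LOWER_BOUND <= pnl <= UPPER_BOUND
```

A trade for which `within_model_bounds` returns False is a signal that the selected optimal f is no longer the optimal one and that the bounds should be revisited.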

If our selected bounding parameters are violated in the future, our selected optimal f cannot then be the optimal. We would be smart to defend this blind spot with techniques, such as long options, that limit our liability to a prescribed amount. While we are discussing weaknesses with the method, one final weakness should be pointed out. Once you have obtained your parametric optimal f, you should be aware that the actual distribution of trade profits and losses is one in which the parameters are constantly changing, albeit slowly. You should frequently run the technique on your trade profits and losses for each market system you are trading to monitor these dynamics of the distributions.


Once you have obtained your parametric optimal f, you can perform "What if" types of scenarios on your distribution function by altering the parameters LOC, SCALE, SKEW, and KURT of the distribution function to replicate different expected outcomes in the near future and observe the effects. Just as we can tinker with stretch and shrink on the Normal distribution, so, too, can we tinker with the parameters LOC, SCALE, SKEW, and KURT of our adjustable distribution.

The "What if" capabilities of the parametric technique are the strengths that help to offset the weaknesses of the actual distribution of trade P&L's moving around. The parametric techniques allow us to see the effects of changes in the distribution of actual trade profits and losses before they occur, and possibly to budget for them. When tinkering with the parameters, a suggestion is in order. When finding the optimal f, rather than tinkering with the LOC, the location parameter, you are better off tinkering with the arithmetic average trade in dollars that you are using as input.

Figure  Altering location parameters.

Notice that changing the location parameter LOC moves the distribution right or left in the "window" of the bounding parameters. But the bounding parameters do not move with the distribution. Thus, a change in the LOC parameter also affects how many equally spaced data points will be left of the mode and right of the mode of the distribution. By changing the actual arithmetic mean, the window of the bounding parameters moves also. When you alter the arithmetic average trade as input, or alter the shrink variable in the Normal Distribution mechanism, you still have the same number of equally spaced data points to the right and left of the mode of the distribution that you had before the alteration.


The technique was shown using data that was not equalized. We can also use this very same technique on equalized data. If we want to determine an equalized parametric optimal f, we would convert the raw trade profits and losses over to percentage gains and losses. Next, we would convert these percentage profits and losses by multiplying them by the current price of the underlying instrument. For example, P&L number 1 is .18. Suppose the entry price to this trade was 100.50. The percentage gain on this trade would be .18/100.50 = .001791044776. Now suppose that the current price of this underlying instrument is 112.00. Multiplying .001791044776 by 112.00 translates into an equalized P&L of .2005970149.
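The equalizing step is a single line of arithmetic per trade. A minimal sketch follows, with a hypothetical function name, using the example figures above:

```python
def equalize_pnl(raw_pnl, entry_price, current_price):
    """Restate a raw P&L as if the trade were made at the current price."""
    percentage_gain = raw_pnl / entry_price
    return percentage_gain * current_price

# P&L number 1 from the text: .18 on an entry price of 100.50,
# restated at a current underlying price of 112.00.
equalized = equalize_pnl(0.18, 100.50, 112.00)
```

Applied to the first trade, this reproduces the equalized P&L of roughly .2006 shown in the text.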

If we were seeking to do this procedure on an equalized basis, we would perform this operation on all 232 trade profits and losses. We would then calculate the arithmetic mean and population standard deviation on the equalized trades and would use these to standardize the trades. Next, we could find the optimal parameter set for LOC, SCALE, SKEW, and KURT on the equalized data exactly as we did for nonequalized data. The rest of the procedure is the same in terms of determining the optimal f, geometric mean, and TWR. The by-products of the geometric average trade, arithmetic average trade, and threshold to the geometric are only valid for the current price of the underlying instrument.

When the price of the underlying instrument changes, the procedure must be done again, going back to step one and multiplying the percentage profits and losses by the new underlying price. When you go to redo the procedure with a different underlying price, you will obtain the same optimal f, geometric mean, and TWR. However, your arithmetic average trade, geometric average trade, and threshold to the geometric will be different based upon the new price of the underlying instrument. The number of contracts to trade must be changed. The worst-case associated P&L, the W variable, will be different as a result of the changes caused in the equalized data by a different current price.


At this point you should realize that there are many other ways you can determine your parametric optimal f. We have covered a procedure for finding the optimal f on Normally distributed data. Thus we have a procedure that will give us the optimal f for any Normally distributed phenomenon. That same procedure can be used to find the optimal f on data of any distribution, so long as the cumulative density function of the selected distribution is available. When the cumulative density function is not available, the optimal f can be found for any other function by the integration method to approximate the cumulative densities, the areas under the curve. I have elected to model the actual distribution of trades by way of our adjustable distribution.

This amounts to little more than finding a function and its appropriate values, which model the actual density function of the trade P&L's with a maximum of 2 points of inflection. You could use or create many other functions and methods to do this, such as polynomial interpolation and extrapolation, rational function interpolation and extrapolation, or using splines to fit a theoretical function to the actual. Once any theoretical function is found, the associated probabilities can be determined by the same method of integral estimation as was used in finding the associated probabilities of our adjustable distribution or by using integration techniques of calculus. There is a problem with fitting any of these other functions.

Part of the thrust has been to allow users of systems that are not purely mechanical to have the same account management power that users of purely mechanical systems have. As such, the adjustable distribution route that I took only requires estimates for the parameters. These parameters pertain to the first four moments of the distribution. It is these moments (location, scale, skewness, and kurtosis) that describe the distribution. Thus, someone trading on some not purely mechanical basis (e.g., Elliott wave) could estimate the parameters and have access to optimal f and its by-product calculations. A past history of trades is not a prerequisite for estimating these parameters.

If you were to use any of the other fitting techniques mentioned, you wouldn't necessarily need a past history of trades either, but the estimates for the parameters of those fitting techniques do not necessarily pertain to the moments of the distribution. What they pertain to is a function of the particular function you are using. These other techniques would not necessarily allow you to see what would happen if kurtosis increased or skewness changed or the scale were altered, and so on. Our adjustable distribution is the logical choice for a theoretical function to fit to the actual, since the parameters not only measure the moments of the distribution, they give us control over those moments when prognosticating about future changes to the distribution. Furthermore, estimating the parameters of our adjustable distribution is easier than with fitting any other function which I am aware of.


People who forecast for a living have a notorious history for incorrect forecasts, but most decisions anyone must make in life usually require making a forecast about the future. A couple of pitfalls immediately crop up here. To begin with, people generally make assumptions about the future that are more optimistic than the actual probabilities. Most people feel that they are far more likely to win the lottery this month than they are to die in an auto accident, even though the probabilities of the latter are greater. This is not only true on the level of the individual, it is even more pronounced at the level of the group. When people work together, they tend to see a favorable outcome as the most likely result, otherwise they would quit the project they are a part of (unless, of course, we have all become automatons mindlessly slaving away on sinking ships).

The second and more harmful pitfall is that people make straight-line forecasts into the future. People try to predict the price of a gallon of gas two years from now, predict what will happen with their jobs, who will be the next president, what the next styles will be, and on and on. Whenever we think of the future, we tend to think in terms of a single, most likely outcome. As a result, whenever we must make decisions, whether as an individual or a group, we tend to make these decisions based on what we think will be the single most likely outcome in the future. As a consequence, we are extremely vulnerable to unpleasant surprises. Scenario planning is a partial solution to this problem. A scenario is simply a possible forecast, a story about one way that the future might unfold. 

Scenario planning is a collection of scenarios to cover the spectrum of possibilities. Of course, the complete spectrum can never be covered, but the scenario planner wants to cover as many possibilities as he or she can. By acting in this manner, as opposed to a straight-line forecast of the most likely outcome, the scenario planner can prepare for the future as it unfolds. Furthermore, scenario planning allows the planner to be prepared for what might otherwise be an unexpected event. Scenario planning is tuned to reality in that it recognizes that certainty is an illusion. Suppose you are involved in long-run planning for your company. Say you make a particular product. Rather than making a single-most-likely-outcome, straight-line forecast, you decide to exercise scenario planning. You will need to sit down with the other planners and brainstorm for possible scenarios.

What if you cannot get enough of the raw materials to make your product? What if one of your competitors fails? What if a new competitor emerges? What if you have severely underestimated demand for this product? What if a war breaks out on such-and-such a continent? What if it is a nuclear war? Because each scenario is only one of several, each scenario can be considered seriously. But what do you do once you have defined these scenarios? To begin with, you must determine what goal you would like to achieve for each given scenario. Depending upon the scenario, the goal need not be a positive one. For instance, under a bleak scenario your goal may simply be damage control. Once you have defined a goal for a given scenario, you then need to draw up the contingency plans pertaining to that scenario to achieve the desired goal. 

For instance, in the rather unlikely bleak scenario where your goal is damage control, you need to have plans formulated so that you can minimize the damage. Above all else, scenario planning provides the planner with a course of action to take should a certain scenario develop. It forces you to make plans before the fact; it forces you to be prepared for the unexpected. Scenario planning can do a lot more, however. There is a hand-in-glove fit between scenario planning and optimal f. Optimal f allows us to determine the optimal quantity to allocate to a given set of possible scenarios. We can exist in only one scenario at a time, even though we are planning for multiple futures. Scenario planning puts us in a position where we must make a decision regarding how much of a resource to allocate today given the possible scenarios of tomorrow.

This is the true heart of scenario planning: quantifying it. We can use another parametric method for optimal f to determine how much of a certain resource to allocate given a certain set of scenarios. This technique will maximize the utility obtained in an asymptotic geometric sense. First, we must define each unique scenario. Second, we must assign a number to the probability of that scenario's occurrence. Being a probability means that this number is between 0 and 1. Scenarios with a probability of 0 we need not consider any further. Note that these probabilities are not cumulative. In other words, the probability assigned to a given scenario is unique to that scenario. Suppose we are a decision maker for XYZ Manufacturing Corporation. Two of the many scenarios we have are as follows.

In one scenario XYZ Manufacturing files for bankruptcy, with a probability of .15; in the other scenario XYZ is being put out of business by intense foreign competition, with a probability of .07. Now, we must ask if the first scenario, filing for bankruptcy, includes filing for bankruptcy due to the second scenario, intense foreign competition. If it does, then the probabilities in the first scenario have not taken the probabilities of the second scenario into account, and we must amend the probabilities of the first scenario to be .08 (.15-.07). Note also that just as important as the uniqueness of each probability to each scenario is that the sum of the probabilities of all of the scenarios we are considering must equal 1 exactly, not 1.01 nor .99, but 1.

For each scenario we now have assigned a probability of just that scenario occurring. We must also assign an outcome result. This is a numerical value. It can be dollars made or lost as a result of a scenario manifesting itself, it can be units of utility, medication, or anything. However, our output is going to be in the same units that we put in as input. You must have at least one scenario with a negative outcome in order to use this technique. This is mandatory. Since we are trying to answer the question "How much of this resource should we allocate today given the possible scenarios of tomorrow?", if there is not a negative outcome scenario, then we should allocate 100% of this resource. Further, without a negative outcome scenario it is questionable how tuned to reality this set of scenarios really is.

A last prerequisite to using this technique is that the mathematical expectation, the sum of all of the outcome results times their respective probabilities, must be greater than zero.

ME = ∑[i = 1,N] (Pi * Ai)

where
Pi = The probability associated with the ith scenario.
Ai = The result of the ith scenario.
N = The total number of scenarios under consideration.

If the mathematical expectation equals zero or is negative, the following technique cannot be used. That's not to say that scenario planning itself cannot be used. It can and should. However, optimal f can only be incorporated with scenario planning when there is a positive mathematical expectation. When the mathematical expectation is zero or negative, we ought not allocate any of this resource at this time.

Lastly, you must try to cover as much of the spectrum of outcomes as possible. In other words, you really want to account for 99% of the possible outcomes. This may sound nearly impossible, but many scenarios can be made broader so that you don't need 10,000 scenarios to cover 99% of the spectrum. In making your scenarios broader, you must avoid the common pitfall of three scenarios: an optimistic one, a pessimistic one, and a third where things remain the same. This is too simple, and the answers derived therefrom are often too crude to be of any value. Would you want to find your optimal f for a trading system based on only three trades?

So even though there may be an unknowably large number of scenarios covering the entire spectrum, we can cover what we believe to be about 99% of the spectrum of outcomes. If this makes for an unmanageably large number of scenarios, we can make the scenarios broader to trim down their number. However, by trimming down their number we lose a certain amount of information. When we trim down the number of scenarios down to only three, a common pitfall, we have effectively eliminated so much information that this technique is severely hampered in its effectiveness.

What is a good number of scenarios to have then? As many as you can and still manage them. Here, a computer is a great asset. Assume again that we are decision making for XYZ. We are looking at marketing a new product of ours in a primitive, remote little country. We are looking at five possible scenarios (in reality you should have many more than this, but we'll use five for the sake of illustration). These five scenarios portray what we perceive as possible futures for this primitive remote country, their probabilities of occurrence, and the gain or loss of investing there.

Scenario       Probability    Result
War            .1             -$500,000
Trouble        .2             -$200,000
Stagnation     .2             0
Peace          .45            $500,000
Prosperity     .05            $1,000,000
               Sum 1.00

The sum of our probabilities equals 1. We have at least 1 scenario with a negative result, and our mathematical expectation is positive:

(.1*-$500,000)+(.2*-$200,000)+.. = $185,000

We can therefore use the technique on this set of scenarios. Notice first, however, that if we used the single most likely outcome method we would conclude that peace will be the future of this country, and we would then act as though peace was to occur, as though it were a certainty, only vaguely remaining aware of the other possibilities.
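The three prerequisites (probabilities summing to 1, at least one negative outcome, and a positive mathematical expectation) are easy to check mechanically. The sketch below is one way to do so, with hypothetical function names and the five XYZ scenarios entered as (probability, result) pairs; the small tolerance on the probability sum is an assumption made to absorb floating-point rounding.

```python
def mathematical_expectation(scenarios):
    """ME: the sum of each scenario's probability times its result."""
    return sum(p * a for p, a in scenarios)

def check_prerequisites(scenarios, tol=1e-9):
    """Raise ValueError if the scenario set cannot be used; else return ME."""
    if abs(sum(p for p, _ in scenarios) - 1.0) > tol:
        raise ValueError("probabilities must sum to exactly 1")
    if not any(a < 0 for _, a in scenarios):
        raise ValueError("at least one scenario must have a negative result")
    me = mathematical_expectation(scenarios)
    if me <= 0:
        raise ValueError("mathematical expectation must be positive")
    return me

xyz_scenarios = [
    (0.10, -500_000),    # War
    (0.20, -200_000),    # Trouble
    (0.20, 0),           # Stagnation
    (0.45, 500_000),     # Peace
    (0.05, 1_000_000),   # Prosperity
]
expectation = check_prerequisites(xyz_scenarios)
```

For the XYZ scenarios all three checks pass and the expectation comes out at $185,000, matching the figure above.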

Returning to the technique, we must determine the optimal f. The optimal f is that value for f (between 0 and 1) which maximizes the geometric mean:

Geometric mean = TWR^(1/∑[i = 1,N] Pi) and
TWR = ∏[i = 1,N] HPRi and
HPRi = (1+(Ai/(W/-f))) ^ Pi therefore
Geometric mean = (∏[i = 1,N] (1+(Ai/(W/-f))) ^ Pi) ^ (1/∑[i = 1,N] Pi) 

Finally then, we can compute the real TWR as:

TWR = Geometric Mean ^ X


where:
N = The number of different scenarios.
TWR = The terminal wealth relative.
HPRi = The holding period return of the ith scenario.
Ai = The outcome of the ith scenario.
Pi = The probability of the ith scenario.
W = The worst outcome of all N scenarios.
f = The value for f which we are testing.
X = However many times we want to "expand" this scenario out. 

That is, the real TWR is what we would expect to make if we invested f amount into these possible scenarios X times. The TWR returned by the product of the HPRs is just an interim value we must have in order to obtain the geometric mean. Once we have this geometric mean, the real TWR can be obtained by raising it to the power X, per the last equation above. Here is how to perform these equations. To begin with, we must decide on an optimization scheme, a way of searching through the f values to find that f which maximizes our equation. Again, we can do this with a straight loop with f from .01 to 1, through iteration, or through parabolic interpolation.
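The equations above translate almost line for line into code. What follows is a sketch only, in Python, using the XYZ scenarios as input; the function names geometric_mean and real_twr are our own labels, and X = 10 is an arbitrary value chosen for illustration.

```python
def geometric_mean(f, outcomes, probs):
    """Geometric mean of the scenario HPRs for a test value of f (0 < f <= 1)."""
    worst = min(outcomes)              # W, the most negative outcome
    if worst >= 0:
        raise ValueError("at least one scenario must be a loss")
    unit = worst / -f                  # W / -f, a positive amount
    twr = 1.0                          # interim TWR: the product of the HPRs
    for a, p in zip(outcomes, probs):
        twr *= (1.0 + a / unit) ** p   # HPR_i = (1 + A_i / (W / -f)) ^ P_i
    return twr ** (1.0 / sum(probs))


def real_twr(f, outcomes, probs, x):
    """Real TWR: the geometric mean expanded out X times."""
    return geometric_mean(f, outcomes, probs) ** x


# XYZ Corporation scenarios from the text.
outcomes = [-500_000, -200_000, 0, 500_000, 1_000_000]
probs = [0.10, 0.20, 0.20, 0.45, 0.05]

gm = geometric_mean(0.01, outcomes, probs)
twr_10 = real_twr(0.01, outcomes, probs, 10)   # X = 10 is an arbitrary choice
print(f"geometric mean at f = .01: {gm:.9f}")
print(f"real TWR for X = 10:       {twr_10:.6f}")
```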

Next, we must determine what the worst possible result for a scenario is of all of the scenarios we are looking at, regardless of how small the probabilities of that scenario's occurrence are. In the example of XYZ Corporation this is -$500,000. Now for each possible scenario, we must first divide the worst possible outcome by negative f. In our XYZ Corporation example, we will assume that we are going to loop through f values from .01 to 1. Therefore we start out with an f value of .01. Now, if we divide the worst possible outcome of the scenarios under consideration by the negative value for f:

-$500,000/-.01 = $50,000,000

Negative values divided by negative values yield positive results, so our result in this case is positive. As we go through each scenario, we divide the outcome of the scenario by the result just obtained. Since the outcome to the first scenario is also the worst scenario, a loss of $500,000, we now have:

-$500,000/$50,000,000 = -.01

The next step is to add this value to 1. This gives us:

1+(-.01) = .99

Lastly, we take this answer to the power of the probability of its occurrence, which in our example is .1:

.99^.1 = .9989954713

Next, we go to the next scenario, labeled "Trouble," where there is a .2 probability of a loss of $200,000. Our worst-case result is still -$500,000. The f value we are working on is still .01, so the value we want to divide this scenario's result by is still $50,000,000:

-$200,000/$50,000,000 = -.004

Working through the rest of the steps to obtain our HPR:

1+(-.004) = .996
.996^.2 = .9991987169

If we continue through the scenarios for this test value of .01 for f, we will find the 3 HPRs corresponding to the last 3 scenarios:

Stagnation    1.0
Peace         1.004487689
Prosperity    1.000990622

Once we have turned each scenario into an HPR for the given f value, we must multiply these HPRs together:

.9989954713*.9991987169*1.0*1.004487689*1.000990622 = 1.003667853

This gives us the interim TWR, which in this case is 1.003667853. Our next step is to take this to the power of 1 divided by the sum of the probabilities. Since the sum of the probabilities is 1, we can state that we must raise the TWR to the power of 1 to give us the geometric mean. Since anything raised to the power of 1 equals itself, we can say that our geometric mean equals the TWR in this case. We therefore have a geometric mean of 1.003667853. If, however, we relaxed the constraint that each scenario must have a unique probability, then we could allow the sum of the probabilities of the scenarios to be greater than 1. In such a case, we would have to raise our TWR to the power of 1 divided by this sum of the probabilities in order to derive the geometric mean.
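The hand calculation for the test value f = .01 can be replayed step by step. The sketch below, in Python, follows the same order as the walk-through: divide the worst outcome by negative f, turn each scenario into an HPR, multiply the HPRs into the interim TWR, then take the root given by the sum of the probabilities. The variable names are our own.

```python
# Scenario table from the text: (name, probability, result in dollars).
scenarios = [
    ("War",        0.10,  -500_000),
    ("Trouble",    0.20,  -200_000),
    ("Stagnation", 0.20,         0),
    ("Peace",      0.45,   500_000),
    ("Prosperity", 0.05, 1_000_000),
]

f = 0.01                                   # the test value for f
worst = min(a for _, _, a in scenarios)    # -500,000
unit = worst / -f                          # -500,000 / -.01 = 50,000,000

# One HPR per scenario: (1 + outcome / unit) ^ probability.
hprs = {}
for name, p, a in scenarios:
    hprs[name] = (1.0 + a / unit) ** p
    print(f"{name:<11} {hprs[name]:.10f}")

# Interim TWR is the product of the HPRs.
interim_twr = 1.0
for h in hprs.values():
    interim_twr *= h

# Geometric mean: interim TWR to the power 1 / (sum of probabilities).
prob_sum = sum(p for _, p, _ in scenarios)
geo_mean = interim_twr ** (1.0 / prob_sum)
print(f"interim TWR    {interim_twr:.9f}")
print(f"geometric mean {geo_mean:.9f}")
```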

The answer we have just obtained in our example is our geometric mean corresponding to an f value of .01. Now we move on to an f value of .02, and repeat the whole process until we have found the geometric mean corresponding to an f value of .02. We keep on proceeding until we arrive at that value for f which yields the highest geometric mean. In our example we find that the highest geometric mean is obtained at an f value of .57, which yields a geometric mean of 1.1106. Dividing our worst possible outcome to a scenario (-$500,000) by the negative optimal f yields a result of $877,192.35. In other words, if XYZ Corporation wants to commit to marketing this new product in this remote country, they will optimally commit this amount to this venture at this time. 
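The straight loop over f values described here is short enough to sketch in full. The Python below is one possible reading of it, assuming a step of .01 from .01 to 1; the names geometric_mean and optimal_f_grid are ours. At f = .57 the division of the worst outcome by negative f should land within about a dollar of the figure quoted in the text.

```python
def geometric_mean(f, outcomes, probs):
    """Geometric mean of the scenario HPRs for a test value of f (0 < f <= 1)."""
    unit = min(outcomes) / -f                      # W / -f
    twr = 1.0
    for a, p in zip(outcomes, probs):
        twr *= (1.0 + a / unit) ** p               # HPR of each scenario
    return twr ** (1.0 / sum(probs))


def optimal_f_grid(outcomes, probs, steps=100):
    """Straight loop over f = 1/steps, 2/steps, ..., 1; keep the best f."""
    best_f, best_gm = None, 0.0
    for i in range(1, steps + 1):
        f = i / steps
        gm = geometric_mean(f, outcomes, probs)
        if gm > best_gm:
            best_f, best_gm = f, gm
    return best_f, best_gm


# XYZ Corporation scenarios from the text.
outcomes = [-500_000, -200_000, 0, 500_000, 1_000_000]
probs = [0.10, 0.20, 0.20, 0.45, 0.05]

best_f, best_gm = optimal_f_grid(outcomes, probs)
commit = min(outcomes) / -best_f                   # worst outcome / -optimal f

print(f"optimal f        = {best_f:.2f}")
print(f"geometric mean   = {best_gm:.4f}")
print(f"amount to commit = ${commit:,.2f}")
```

A finer grid, or parabolic interpolation, would locate the peak more precisely than two decimal places; the loop is used here only because it is the simplest of the three schemes mentioned.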

As time goes by and things develop, so do the scenarios, and as their resultant outcomes and probabilities change, so does this f amount change. The more XYZ Corporation keeps abreast of these changing scenarios, and the more accurate the scenarios they develop as input are, the more accurate their decisions will be. Note that if XYZ Corporation cannot commit this $877,192.35 to this undertaking at this time, then they are too far beyond the peak of the f curve. This is equivalent to the trader who has too many commodity contracts on with respect to what the optimal f says he or she should have on. If XYZ Corporation commits more than this amount to this project at this time, the situation would be analogous to a commodity trader with too few contracts on.

Furthermore, although the quantity discussed here is a quantity of money, it could be a quantity of anything and the technique would be just as valid. The approach can be used for any quantitative decision in an environment of favorable uncertainty. If you create different scenarios for the stock market, the optimal f derived from this methodology will give you the correct percentage to be invested in the stock market at any given time. For instance, if the f returned is .65, then that means that 65% of your equity should be in the stock market with the remaining 35% in, say, cash. This approach will provide you with the greatest geometric growth of your capital in the long run. Of course, again, the output is only as accurate as the input you have provided the system with in terms of scenarios, their probabilities of occurrence, and resultant payoffs and costs. 

Furthermore, recall that everything said about optimal f applies here, and that also means that the expected drawdowns will approach a 100% equity retracement. If you exercise this scenario planning approach to asset allocation, you can expect close to 100% of the assets allocated to the endeavor in question to be depleted at any one time in the future. For example, suppose you are using this technique to determine what percentage of investable funds should be in the stock market and what percentage should be in a risk-free asset. Assume that the answer is to have 65% invested in the stock market and the remaining 35% in the risk-free asset. You can expect the drawdowns in the future to approach 100% of the amount allocated to the stock market.

In other words, you can expect to see, at some point in the future, almost 100% of your entire 65% allocated to the stock market to be gone. Yet this is how you will achieve maximum geometric growth. This same process can be used as an alternative parametric technique for determining the optimal f for a given trade. Suppose you are making your trading decisions based on fundamentals. If you wanted to, you could outline the different scenarios that the trade may take. The more scenarios, and the more accurate the scenarios, the more accurate your results would be. Say you are looking to buy a municipal bond for income, but you're not planning on holding the bond to maturity. 

You could outline numerous different scenarios of how the future might unfold and use these scenarios to determine how much to invest in this particular bond issue. This concept of using scenario planning to determine the optimal f can be used for everything from military strategies to deciding the optimal level to participate in an underwriting to the optimal down payment on a house. For our purposes, this technique is perhaps the best technique, and certainly the easiest to employ for someone not using a mechanical means of entering and exiting the markets. Those who trade on fundamentals, weather patterns, Elliott waves, or any other approach that requires a degree of subjective judgment, can easily discern their optimal fs with this approach. This approach is easier than determining distributional parameter values.

The arithmetic average HPR of a group of scenarios can be computed as:

AHPR = (∑[i = 1,N] ((1+(Ai/(W/-f)))*Pi)) / ∑[i = 1,N] Pi


where:
N = the number of scenarios.
f = the f value employed.
Ai = the outcome (gain or loss) associated with the ith scenario.
Pi = the probability associated with the ith scenario.
W = the most negative outcome of all the scenarios.
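As a sketch of the AHPR calculation, the Python below applies the formula to the XYZ Corporation scenarios at f = .57. The function name ahpr is our own, and the choice of the XYZ scenarios and of f = .57 as input is an assumption made purely for illustration.

```python
def ahpr(f, outcomes, probs):
    """Probability-weighted arithmetic average HPR for a given f."""
    unit = min(outcomes) / -f                      # W / -f
    weighted = sum((1.0 + a / unit) * p for a, p in zip(outcomes, probs))
    return weighted / sum(probs)                   # divide by the sum of the Pi


# XYZ Corporation scenarios from the text, evaluated at f = .57.
outcomes = [-500_000, -200_000, 0, 500_000, 1_000_000]
probs = [0.10, 0.20, 0.20, 0.45, 0.05]

value = ahpr(0.57, outcomes, probs)
print(f"AHPR at f = .57: {value:.4f}")
print(f"AHPR - 1:        {value - 1.0:.4f}")
```

Because each HPR inside the sum is linear in f, the AHPR here reduces to 1 + f * (mathematical expectation / size of the worst loss), or 1 + .57 * (185,000 / 500,000) = 1.2109.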

The AHPR will be important later in the text when we will need to discern the efficient frontier of numerous market systems. We will need to determine the expected return of a given market system. This expected return is simply AHPR-1. The technique need not be applied parametrically, as detailed here; it can also be applied empirically. In other words, we can take the trade listing of a given market system and use each of those trades as a scenario that might occur in the future, the profit or loss amount of the trade being the outcome result of the given scenario. Each scenario would have an equal probability of occurrence, 1/N, where N is the total number of trades.

This will give us the optimal f empirically. This technique bridges the gap between the empirical and the parametric. There is not a fine line that delineates the two schools. As you can see, there is a gray area. When we are presented with a decision where there is a different set of scenarios for each facet of the decision, selecting the scenario whose geometric mean corresponding to its optimal f is greatest will maximize our decision in an asymptotic sense. Often this flies in the face of conventional decision-making rules such as the Hurwicz rule, maximax, minimax, minimax regret, and greatest mathematical expectation.

For example, suppose we must decide between two possible choices. We could have many possible choices, but for the sake of simplicity we choose two, which we call "white" and "black." If we select the decision labeled "white," we determine that it will present the following possible future scenarios to us:

White Decision
Scenario       Probability       Result
A                      .3                     -20
B                      .4                      0
C                      .3                      30
Mathematical expectation = $3.00
Optimal f = .17
Geometric mean = 1.0123

It doesn't matter what these scenarios are, they can be anything, and to further illustrate this they will simply be assigned letters, A, B, C in this discussion. Further, it doesn't matter what the result is, it can be just about anything. The Black decision will present the following scenarios:

Black Decision
Scenario       Probability       Result
A                     .3                     -10
B                     .4                       5
C                     .15                     6
D                     .15                    20
Mathematical expectation = $2.90
Optimal f = .31
Geometric mean = 1.0453

Many people would opt for the white decision, since it is the decision with the higher mathematical expectation. With the white decision you can expect, "on average," a $3.00 gain versus black's $2.90 gain. Yet the black decision is actually the correct decision, because it results in a greater geometric mean. With the black decision, you would expect to make 4.53% (1.0453-1) "on average" as opposed to white's 1.23% gain. When you consider the effects of reinvestment, the black decision makes more than three times as much, on average, as does the white decision!
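The comparison between the two decisions can be reproduced with the same search used for a single set of scenarios. The Python sketch below assumes a straight loop over f in steps of .01; the function names and the dictionary layout are ours, not part of the original presentation.

```python
def geometric_mean(f, outcomes, probs):
    """Geometric mean of the scenario HPRs for a test value of f (0 < f <= 1)."""
    unit = min(outcomes) / -f                      # W / -f
    twr = 1.0
    for a, p in zip(outcomes, probs):
        twr *= (1.0 + a / unit) ** p
    return twr ** (1.0 / sum(probs))


def optimal_f_grid(outcomes, probs, steps=100):
    """Straight loop over f in steps of 1/steps; return (best f, best GM)."""
    candidates = [i / steps for i in range(1, steps + 1)]
    best_f = max(candidates, key=lambda f: geometric_mean(f, outcomes, probs))
    return best_f, geometric_mean(best_f, outcomes, probs)


# The two decisions from the text: (outcomes, probabilities).
decisions = {
    "white": ([-20, 0, 30],    [0.30, 0.40, 0.30]),
    "black": ([-10, 5, 6, 20], [0.30, 0.40, 0.15, 0.15]),
}

results = {}
for name, (outcomes, probs) in decisions.items():
    expectation = sum(a * p for a, p in zip(outcomes, probs))
    best_f, best_gm = optimal_f_grid(outcomes, probs)
    results[name] = (expectation, best_f, best_gm)
    print(f"{name}: expectation = {expectation:.2f}, "
          f"optimal f = {best_f:.2f}, geometric mean = {best_gm:.4f}")

# Choose by greatest geometric mean rather than greatest expectation.
choice = max(results, key=lambda name: results[name][2])
print(f"decision with the greater geometric mean: {choice}")
```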

"Hold on, pal," you say. "We're not doing this thing over again, we're doing it only once. We're not reinvesting back into the same future scenarios here. Won't we come out ahead if we always select the highest arithmetic mathematical expectation for each set of decisions that present themselves this way to us?" The only time we want to be making decisions based on greatest arithmetic mathematical expectation is if we are planning on not reinvesting the money risked on the decision at hand. 

Since, in almost every case, the money risked on an event today will be risked again on a different event in the future, and money made or lost in the past affects what we have available to risk today, we should decide based on geometric mean to maximize the long-run growth of our money. Even though the scenarios that present themselves tomorrow won't be the same as those of today, by always deciding based on greatest geometric mean we are maximizing our decisions. It is analogous to a dependent trials process such as a game of blackjack. Each hand the probabilities change, and therefore the optimal fraction to bet changes as well. By always betting what is optimal for that hand, however, we maximize our long-run growth. Remember that to maximize long-run growth, we must look at the current contest as one that expands infinitely into the future. 

In other words, we must look at each individual event as though we were to play it an infinite number of times over if we want to maximize growth over many plays of different contests. As a generalization, whenever the outcome of an event has an effect on the outcome(s) of subsequent event(s) we are best off to maximize for greatest geometric expectation. In the rare cases where the outcome of an event has no effect on subsequent events, we are then best off to maximize for greatest arithmetic expectation. Mathematical expectation (arithmetic) does not take the variance between the outcomes of the different scenarios into account, and therefore can lead to incorrect decisions when reinvestment is considered, or in any environment of geometric consequences.

Using this method in scenario planning gets you quantitatively positioned with respect to the possible scenarios, their outcomes, and the likelihood of their occurrence. The method is inherently more conservative than positioning yourself per the greatest arithmetic mathematical expectation. Recall that the geometric mean is never greater than the arithmetic mean. Likewise, this method can never have you take a greater commitment than selecting by the greatest arithmetic mathematical expectation would. In the asymptotic sense, the long-run sense, this is not only a superior method of positioning yourself, as it achieves greatest geometric growth, it is also a more conservative one than positioning yourself per the greatest arithmetic mathematical expectation, which would invariably put you to the right of the peak of the f curve.

Since reinvestment is almost always a fact of life (except on the day before you retire) - that is, you reuse the money that you are using today - we must make today's decision under the assumption that the same decision will present itself a thousand times over in order to maximize the results of our decision. We must make our decisions and position ourselves in order to maximize geometric expectation. Further, since the outcomes of most events do in fact have an effect on the outcomes of subsequent events, we should make our decisions and position ourselves based on maximum geometric expectation. This tends to lead to decisions and positions that are not always obvious.


Now we come to the case of finding the optimal f and its by-products on binned data. This approach is also something of a hybrid between the parametric and the empirical techniques. Essentially, the process is almost identical to the process of finding the optimal f on different scenarios, only rather than different payoffs for each bin, we use the midpoint of each bin. Therefore, for each bin we have an associated probability figured as the total number of elements (trades) in that bin divided by the total number of elements (trades) in all the bins. Further, for each bin we have an associated result of an element ending up in that bin. The associated results are calculated as the midpoint of each bin.

For example, suppose we have 3 bins of 10 trades. The first bin we will define as those trades where the P&L's were -$1,000 to -$100. Say there are 2 elements in this bin. The next bin, we say, is for those trades which are -$100 to $100. This bin has 5 trades in it. Lastly, the third bin has 3 trades in it and is for those trades that have P&L's of $100 to $1,000.

Bin Low    Bin High    Trades    Associated Probability    Associated Result
-1,000     -100        2         .2                        -550
-100       100         5         .5                        0
100        1,000       3         .3                        550

Now it is simply a matter of solving, where each bin represents a different scenario. Thus, for the case of our 3-bin example here, we find that our optimal f is at .2, or 1 contract for every $2,750 in equity (our worst-case loss being the midpoint of the first bin, or (-$1,000+-$100)/2 = -$550).

This technique, though valid, is also very rough. To begin with, it assumes that the biggest loss is the midpoint of the worst bin. This is not always the case. Often it is helpful to make a single extra bin to hold the worst-case loss. As applied to our 3-bin example, suppose we had a trade that was a loss of $1,000. Such a trade would fall into the -$1,000 to -$100 bin, and would be recorded as -$550, the midpoint of the bin. Instead we can bin this same data as follows:

Bin Low    Bin High    Trades    Associated Probability    Associated Result
-1,000     -1,000      1         .1                        -1,000
-999       -100        1         .1                        -550
-100       100         5         .5                        0
100        1,000       3         .3                        550

Now, the optimal f is .04, or 1 contract for every $25,000 in equity. Are you beginning to see how rough this technique is? So, although this technique will give us the optimal f for binned data, we can see that the loss of information involved in binning the data to begin with can make our results so inaccurate as to be useless. If we had more data points and more bins to start with, the technique would not be rough at all. In fact, if we had infinite data and an infinite number of bins, the technique would be exact. (Another way in which this method could be exact is if the data in each of the bins equaled the midpoints of their respective bins exactly.) The other problem with this technique is that the average element in a bin is not necessarily the midpoint of the bin. 
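To see how sensitive the binned result is, both binnings can be run through the same scenario procedure. The Python below is a sketch under stated assumptions: each bin becomes a scenario whose outcome is the bin midpoint and whose probability is its share of the trades, f is searched in steps of .01, and all function names are ours. The second bin of the 4-bin case is entered with edges -1,000 and -100 so that its midpoint matches the -550 recorded in the table.

```python
def bins_to_scenarios(bins):
    """bins: list of (low, high, trade count) -> (midpoint outcomes, probabilities)."""
    total = sum(n for _, _, n in bins)
    outcomes = [(low + high) / 2 for low, high, _ in bins]
    probs = [n / total for _, _, n in bins]
    return outcomes, probs


def geometric_mean(f, outcomes, probs):
    """Geometric mean of the scenario HPRs for a test value of f (0 < f <= 1)."""
    unit = min(outcomes) / -f                      # W / -f
    twr = 1.0
    for a, p in zip(outcomes, probs):
        twr *= (1.0 + a / unit) ** p
    return twr ** (1.0 / sum(probs))


def optimal_f_grid(outcomes, probs, steps=100):
    """Straight loop over f in steps of 1/steps; return the best f."""
    candidates = [i / steps for i in range(1, steps + 1)]
    return max(candidates, key=lambda f: geometric_mean(f, outcomes, probs))


three_bins = [(-1000, -100, 2), (-100, 100, 5), (100, 1000, 3)]
# Extra bin holding the single worst-case loss of $1,000.
four_bins = [(-1000, -1000, 1), (-1000, -100, 1), (-100, 100, 5), (100, 1000, 3)]

summary = {}
for label, bins in (("3 bins", three_bins), ("4 bins", four_bins)):
    outcomes, probs = bins_to_scenarios(bins)
    f = optimal_f_grid(outcomes, probs)
    per_contract = min(outcomes) / -f              # equity per 1 contract
    summary[label] = (f, per_contract)
    print(f"{label}: optimal f = {f:.2f}, 1 contract per ${per_contract:,.0f}")
```

Altering the trade counts in the bin lists is one simple way to run the "what if" type of simulation on binned data.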

In fact, the average of the elements in a bin will tend to be closer to the mode of the entire distribution than the midpoint of the bin is. Hence, the dispersion tends to be greater with this technique than is the real case. There are ways to correct for this, but these corrections themselves can often be incorrect, depending upon the shape of the distribution. Again, this problem would be alleviated and the results would be exact if we had an infinite number of elements (trades) and an infinite number of bins. If you happen to have a large enough number of trades and a large enough number of bins, you can use this technique with a fair degree of accuracy if you so desire. You can do "What if" types of simulations by altering the number of elements in the various bins and get a fair approximation for the effects of such changes.


We have now seen that we can find our optimal f from an empirical procedure as well as from a number of different parametric procedures for both binned and unbinned data. Further, we have seen that we can equalize the data as a means of preprocessing, to find what our optimal f should be if all trades occurred at the present underlying price. At this point you are probably asking for the real optimal f to please stand up. Which optimal f is really optimal? For starters, the straight (nonequalized) empirical optimal f will give you the optimal f on past data. Using the empirical optimal f technique, which is also detailed in Portfolio Management Formulas, will yield the optimal f that would have realized the greatest geometric growth on a past stream of outcomes. However, we want to discern what the value for this optimal f will be in the future, considering that we are absent knowledge regarding the outcome of the next trade.

We do not know whether it will be a profit, in which case the optimal f would be 1, or a loss, in which case the optimal f would be 0. Rather, we can only express the outcome of the next trade as an estimate of the probability distribution of outcomes for the next trade. That being said, our best estimate, for traders employing a mechanical system, is most likely to be obtained by using the parametric technique on our adjustable distribution function as detailed in this chapter on either equalized or nonequalized data. If there is a material difference in using equalized versus nonequalized data, then there is most likely too much data, or not enough data at the present price level. For non-system traders, the scenario planning approach is the easiest to employ accurately. In my opinion, these techniques will result in the best estimate of the probability distribution of outcomes on the next trade.

