Benefits of Accurately Determining Electricity Price Distributions: Better Risk Metrics, Beating the Market on Trades
MICHAEL A. S. GUTH, Ph.D., J.D.
Most pricing models and software used in the power industry assume prices are lognormally distributed and their returns are normally distributed. As shown in this article, other distributions fit power price data much better than either the lognormal or normal distribution.
Too many electricity pricing software packages assume that electricity prices follow the lognormal distribution and that price returns follow the symmetrical normal distribution. In fact, electricity prices spike up during acute shortages and are more likely to spike up than down. Electricity prices are directly tied to weather patterns, and weather patterns are too unpredictable to follow a lognormal distribution. Nevertheless, through its Value at Risk measures of exposure from trading floor transactions and through its use of the standard option pricing formulas, the power industry continues to embrace the fiction that power prices are (approximately) lognormally distributed.
Analyzing electricity price data for the daily markets in Cinergy, PJM, and Entergy shows that power prices do not follow any single distribution. In fact, using the Kolmogorov-Smirnov test statistic, the Anderson-Darling test statistic, and the conventional Chi-squared test, we can demonstrate that fact statistically. The three distributions that most often provide the best fit to the electricity price data are the Inverse Gauss distribution, the Loglogistic distribution, and the Pearson 5 distribution. Each of these three distributions shows pronounced skewness (indicating that electricity markets are more likely to experience external shocks causing shortages in supply rather than abundant supplies). The distributions also have comparatively fat tails, measured by kurtosis, because the probability of extreme temperatures is greater than either the normal or lognormal distribution would predict.
What does all this mean? First, Value at Risk (VaR) calculations at power marketing firms need to incorporate statistical distribution assumptions tailored to power markets, rather than passively adopt assumptions of VaR models designed for foreign exchange and capital markets. Pity the poor risk manager who has to try to defend his company’s use of risk measures that were based on completely unrealistic assumptions (lognormality, geometric Brownian motion), when the overwhelming published empirical evidence suggests otherwise.
Second, knowledge of the correct statistical distribution of prices is valuable information in pricing electricity products. In theory, a trader armed with this additional piece of information should be able to profit at the expense of counter-parties who still rely on the old models using normal and lognormal distributions. Towards the end of this article, we provide some examples of trades using forward, futures, and option contracts that take advantage of this knowledge of the correct underlying distribution of power prices.
BASIC COURSE. Power traders and analysts typically have limited knowledge of statistics gleaned from introductory courses during their college days. These introductory courses taught students how to form 95% confidence intervals around sample means using two standard deviations on either side of the mean. They also illustrated the familiar bell-shaped normal curve. The technical name for that curve is a probability density function (pdf).
However, introductory courses rarely concluded with a word of caution that out in the real world, students would generally confront data from a wide array of statistical distributions other than the normal. Thus, the former students (who are now power marketers and traders) might not want to assume all data they ever analyze has the convenient properties of the normal distribution. For example, power price data are rarely distributed anything like the familiar bell-shaped pdf.
Our intuition about power prices tells us that power prices do not move randomly up or down. Power prices do not follow a random walk or Brownian motion process, either of which would support the normal distribution. We expect power prices to be high in the summer when electricity is needed for cooling; we don’t expect prices to move with equal probability up and down when summertime temperatures outside are rising.
Aside from seasonality, power prices are influenced by planned and unplanned losses of generating units which can cause regional supply disruptions. Other causes of price spikes include losses in transmission line availability – either due to a break in a transmission line or congestion – and extreme weather events. Extreme weather occurs more frequently than either the tails of the lognormal or the normal distribution would predict (see note 1). Once the precursor events for a price spike have occurred, electricity prices will not thereafter randomly tick up and down. Instead, prices will remain high until the generating unit comes back on-line, the extreme weather passes, or the transmission line again becomes available.
Power prices are more likely to spike up than fall precipitously, and to remain high for a period of time after some precursor event has occurred. Thus, our intuition tells us the actual distribution of power prices will likely be skewed to the right (towards the higher prices), and will not appear like a symmetrical bell-shape. Furthermore, price “crashes” in the power market are generally not as rapid or as large as the price spikes.
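The intuition about right skew and fat tails can be checked directly from a price sample. The following is a minimal sketch, assuming simple population moments; the function name and the ten prices are hypothetical and are not the article's data.

```python
import math

def sample_moments(prices):
    """Return (mean, std, skewness, excess kurtosis) from population moments."""
    n = len(prices)
    mean = sum(prices) / n
    m2 = sum((p - mean) ** 2 for p in prices) / n
    m3 = sum((p - mean) ** 3 for p in prices) / n
    m4 = sum((p - mean) ** 4 for p in prices) / n
    std = math.sqrt(m2)
    skewness = m3 / m2 ** 1.5            # 0 for a symmetric sample
    excess_kurtosis = m4 / m2 ** 2 - 3   # 0 for a normal distribution
    return mean, std, skewness, excess_kurtosis

# Hypothetical on-peak prices in $/MWh: a quiet market with two spike days.
prices = [22.0, 24.5, 23.0, 25.0, 21.5, 26.0, 24.0, 23.5, 95.0, 60.0]
mean, std, skewness, excess_kurtosis = sample_moments(prices)
print(f"mean={mean:.2f} std={std:.2f} "
      f"skew={skewness:.2f} excess kurtosis={excess_kurtosis:.2f}")
```

A positive skewness indicates a longer right tail, and a positive excess kurtosis indicates tails fatter than those of the normal distribution.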
STATISTICAL TESTS. For hypothesis testing, we form various statistics (functions of the data), and then ask whether the realized value of each statistic has a high or low probability of arising. If power prices follow a distribution other than the familiar normal distribution, then we must test for the correct distribution given a data sample of power prices. The appropriate technique to test alternative hypotheses about the correct statistical distribution of data is a goodness-of-fit test, which measures how far the sample departs from each candidate distribution. The best known is the χ²-test; we describe it and two alternatives below.
Due to space limitations, we omit the mathematical expression for each of the three test statistics, but they can be found on the Internet or in standard textbooks on statistics. Only the first of the three is referred to the χ² distribution; the other two have their own reference distributions. With more than one goodness-of-fit statistic available, we have no hard rule to decide which test will give us the “best” ranking of possible distributions. Each of the three tests has its strengths and weaknesses (see note 2).
The first statistic we might use to test for the correct distribution of power prices is the standard χ²-statistic. This statistic can be used with both continuous and discrete sample data. Although this statistic is the best known goodness-of-fit statistic, it has a drawback in that the x-axis domain (power prices in our example) must be divided into several “bins.” Unfortunately, no clear guidelines exist to tell the user how many or how wide the bins should be. It is possible to reach different conclusions from the same sample data depending on how the bins are specified.
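The sensitivity to bin choice is easy to reproduce. The sketch below assumes equal-width bins over the sample range, with the outer bins absorbing the tails, and a normal distribution fitted by sample mean and standard deviation. The prices and the function name are hypothetical; this is an illustration, not the binning rule used by any particular software package.

```python
from statistics import NormalDist, mean, stdev

def chi_squared_stat(sample, dist, n_bins):
    """Pearson chi-squared statistic with n_bins equal-width bins."""
    lo, hi = min(sample), max(sample)
    width = (hi - lo) / n_bins
    n = len(sample)
    observed = [0] * n_bins
    for x in sample:
        # The maximum value falls on the last edge, so clamp the index.
        idx = min(int((x - lo) / width), n_bins - 1)
        observed[idx] += 1
    stat = 0.0
    for k in range(n_bins):
        # First and last bins absorb the tails of the fitted distribution.
        p_left = 0.0 if k == 0 else dist.cdf(lo + k * width)
        p_right = 1.0 if k == n_bins - 1 else dist.cdf(lo + (k + 1) * width)
        expected = n * (p_right - p_left)
        stat += (observed[k] - expected) ** 2 / expected
    return stat

# Hypothetical on-peak prices in $/MWh (illustrative only).
prices = [18, 20, 21, 22, 23, 24, 25, 26, 28, 31, 45, 70]
fitted = NormalDist(mean(prices), stdev(prices))
stat_3 = chi_squared_stat(prices, fitted, 3)
stat_6 = chi_squared_stat(prices, fitted, 6)
print(f"3 bins: {stat_3:.2f}   6 bins: {stat_6:.2f}")
```

The same sample and the same fitted distribution give noticeably different statistics under the two binnings, and the degrees of freedom change as well, so the accept-or-reject conclusion can change with the bin specification.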
The second goodness-of-fit statistic, suitable for continuous sample data, is the Kolmogorov-Smirnov (K-S) statistic. Unlike the standard χ²-statistic, the K-S statistic does not require the selection of bins. However, the K-S statistic does not detect tail discrepancies well. Consequently, we will use the K-S statistic to characterize power prices in months when the data seem clustered about the mean, with few outliers.
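For readers who want to see the K-S statistic in action, here is a minimal sketch. It computes the largest vertical gap between the empirical step function and a candidate cumulative distribution function. The two candidates (normal, and lognormal fitted on log prices) are chosen only because the Python standard library supports them; the prices and the function name are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF and the candidate CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Check the gap just after and just before each step of the empirical CDF.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Hypothetical on-peak prices in $/MWh (illustrative only).
prices = [18, 20, 21, 22, 23, 24, 25, 26, 28, 31, 45, 70]

normal = NormalDist(mean(prices), stdev(prices))
logs = [math.log(p) for p in prices]
log_normal = NormalDist(mean(logs), stdev(logs))

d_normal = ks_statistic(prices, normal.cdf)
d_lognormal = ks_statistic(prices, lambda x: log_normal.cdf(math.log(x)))
print(f"K-S normal: {d_normal:.3f}   K-S lognormal: {d_lognormal:.3f}")
```

A smaller statistic indicates a closer fit, so candidate distributions can be ranked by it. No bins are involved, which is the advantage noted above.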
For months where the “fat tails” of the distribution are important to characterize accurately, we prefer to use the Anderson-Darling (A-D) statistic, which is the third possible statistic we could use to rank the possible distributions. The A-D statistic does not require the assignment of the sample data to bins, and it highlights differences between the tails of the fitted distribution and the empirical data. Thus, for January – February and June – August, when the distributions are likely to have significant tails due to extreme weather, we rank the possible distributions using the A-D statistic.
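A sketch of the A-D statistic follows, assuming the standard textbook form A² = -n - (1/n) Σ (2i - 1)[ln F(x_i) + ln(1 - F(x_{n+1-i}))] over the sorted sample. The logarithms are what give the statistic its tail sensitivity. The prices, the function name, and the choice of normal and lognormal candidates are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling(sample, cdf):
    """Anderson-Darling A^2 statistic for a fully specified candidate CDF."""
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        f_low = cdf(x)            # F at the i-th smallest observation
        f_high = cdf(xs[n - i])   # F at the i-th largest observation
        # The log terms blow up when an observation sits far out in a tail
        # of the candidate distribution, which penalizes poor tail fits.
        total += (2 * i - 1) * (math.log(f_low) + math.log(1.0 - f_high))
    return -n - total / n

# Hypothetical on-peak prices in $/MWh (illustrative only).
prices = [18, 20, 21, 22, 23, 24, 25, 26, 28, 31, 45, 70]

normal = NormalDist(mean(prices), stdev(prices))
logs = [math.log(p) for p in prices]
log_normal = NormalDist(mean(logs), stdev(logs))

ad_normal = anderson_darling(prices, normal.cdf)
ad_lognormal = anderson_darling(prices, lambda x: log_normal.cdf(math.log(x)))
print(f"A-D normal: {ad_normal:.2f}   A-D lognormal: {ad_lognormal:.2f}")
```

As with the K-S statistic, a smaller value indicates a better fit. The function fails with a math domain error if the candidate CDF returns exactly 0 or 1 at an observation, which itself signals a candidate that cannot have produced the data.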
The list of 38 possible distributions that we test for
in this study is shown in Table 1.
Further description of each distribution is contained in the @Risk
user’s manual [Ref. 1].
Table 1. List of 38 Statistical Distributions Used for Tests of Goodness-of-Fit
Beta, BetaGeneral, BetaSubj, Binomial, ChiSq, Cumulative, Discrete, D-uniform, Error Function, Erlang, Exponential, ExtValue, Gamma, General, Geometric, Histogram, Hypergeometric, InvGauss, IntUniform, Logistic, Loglogistic, Lognormal, Lognormal2, Negbin, Normal, Pareto, Pareto2, Pearson5, Pearson6, Pert, Poisson, Rayleigh, Simtable, Student, Triang, Trigen, Uniform, Weibull
EMPIRICAL EVIDENCE. For Cinergy Jan-Feb daily,
volume-weighted average spot market prices, 5x16 (on-peak) over the
period 1998-2001, we found the Inverse Gauss distribution ranked as the best
fit using the A-D test. As depicted in Fig. 1, the Inverse Gauss distribution has skewness
to the right and kurtosis, a tail that extends far to the right. Visual inspection of the fitted red Inverse
Gauss pdf or the blue histogram of
data in Fig. 1 should alert practitioners that power prices do not follow the symmetrical, bell-shaped curve of the normal distribution.
Figure 1. Cinergy Jan-Feb Average Daily, Volume-Weighted Spot Prices 1998-2001 (see note 4)
The fitted (Inverse Gauss) and input (assumed normal) statistical parameters and moments of the data sample are given in Table 2.
Table 2. Cinergy Jan-Feb 1998-2001 Statistical Results


Note the “Diff. P” values in the two columns of Table 2. “Diff. P” represents the probability assigned to the confidence interval depicted by two light grey lines that bound the interval [$14.99/MWh, $58.26/MWh]. To the extent the past accurately represents the future, the fitted Inverse Gauss distribution enables us to say with 90% confidence that the average price for Cinergy Jan-Feb 2002 will be in the interval [$14.99/MWh, $58.26/MWh]. With a bell-shaped Normal pdf, by contrast, the probability assigned to the same interval is (from the Diff. P line) 93.90%.
Instead of focusing on the small difference between 90% and 93.90% and erroneously concluding the Inverse Gauss and Normal distributions are alike, practitioners should observe that different distributions yield different confidence levels for the same value or interval being tested. The difference between having a trading signal based on a best-fitted distribution and one based on the default normal distribution can affect profits and buy-sell decisions of traders.
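To illustrate how the probability assigned to one interval depends on the assumed distribution, the sketch below evaluates the interval [$14.99/MWh, $58.26/MWh] under a two-parameter inverse Gaussian distribution and under a normal distribution with the same mean and variance. Only the mean ($26.29/MWh) and the interval come from the article. The shape parameter LAM is a hypothetical value chosen for illustration, and the fitted distribution in @Risk may use a different parameterization, so the output will not reproduce the 90% and 93.90% figures of Table 2.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def invgauss_cdf(x, mu, lam):
    """CDF of the two-parameter inverse Gaussian (mean mu, shape lam)."""
    if x <= 0:
        return 0.0
    root = math.sqrt(lam / x)
    return (STD_NORMAL.cdf(root * (x / mu - 1.0))
            + math.exp(2.0 * lam / mu) * STD_NORMAL.cdf(-root * (x / mu + 1.0)))

MU = 26.29            # four-year mean reported in the article, $/MWh
LAM = 150.0           # HYPOTHETICAL shape parameter, not the fitted value
LOW, HIGH = 14.99, 58.26

p_invgauss = invgauss_cdf(HIGH, MU, LAM) - invgauss_cdf(LOW, MU, LAM)

# Normal benchmark with matching moments: the variance of the inverse
# Gaussian is mu^3 / lam.
normal = NormalDist(MU, math.sqrt(MU ** 3 / LAM))
p_normal = normal.cdf(HIGH) - normal.cdf(LOW)

print(f"P(interval) inverse Gaussian: {p_invgauss:.3f}")
print(f"P(interval) normal:           {p_normal:.3f}")
```

Even with identical mean and variance, the two distributions place different probability on the interval, because the inverse Gaussian shifts mass from the left tail into the right tail.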
At this point, we have tested Cinergy Jan-Feb daily spot market data for 1998-2001 and concluded that the Inverse Gauss distribution best fits the observed data. Now we would like to know if that distribution is stable. Suppose we repeat the test, but this time use only the first three years of data (1998-2000) and omit data for 2001.
With only three years of data, the Pearson 5 distribution best fits Cinergy Jan-Feb daily prices. Perhaps most impressive, the Pearson 5 distribution ranked as the best fitting distribution using all three tests: the standard χ²-statistic, the K-S statistic, and the A-D statistic! By comparison with four years of data, the Pearson 5 distribution ranks as the third best-fitting distribution, behind the Inverse Gauss and the Lognormal2 (see note 3) distributions, according to the A-D test. We conclude that the Inverse Gauss distribution may not be a stable fit to Cinergy Jan-Feb prices; consequently, we might want to qualify any inferences drawn from that distribution or consider inferences that take account of both the Pearson 5 and Inverse Gauss distributions.
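A complementary and simpler stability check, which is not the ranking comparison described above, is to refit one distribution to the full sample and to a sub-sample and see how far its parameters move. The sketch below assumes the closed-form maximum likelihood estimates for a two-parameter inverse Gaussian (no shift parameter). The yearly prices and all names are hypothetical.

```python
def fit_invgauss(sample):
    """Closed-form maximum likelihood fit of a two-parameter inverse Gaussian.

    mu is the sample mean; lam is n / sum(1/x - 1/mu). Requires positive
    prices that are not all identical.
    """
    n = len(sample)
    mu = sum(sample) / n
    lam = n / sum(1.0 / x - 1.0 / mu for x in sample)
    return mu, lam

# HYPOTHETICAL Jan-Feb on-peak prices by year, $/MWh (illustrative only).
prices_by_year = {
    1998: [19.0, 21.0, 22.5, 20.0],
    1999: [20.5, 23.0, 24.0, 21.5],
    2000: [24.0, 27.5, 31.0, 26.0],
    2001: [33.0, 58.0, 41.0, 29.0],
}

four_year = [p for ps in prices_by_year.values() for p in ps]
three_year = [p for year, ps in prices_by_year.items() if year != 2001 for p in ps]

mu4, lam4 = fit_invgauss(four_year)
mu3, lam3 = fit_invgauss(three_year)
print(f"1998-2001: mu={mu4:.2f} lam={lam4:.1f}")
print(f"1998-2000: mu={mu3:.2f} lam={lam3:.1f}")
```

A large change in the fitted parameters when one year is dropped is a warning that inferences drawn from the fitted distribution depend heavily on that year.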
TRADING SIGNALS. But our analysis does not stop here, or it would all be academic. From a Middle Office financial controls perspective, we have a piece of information on the actual distribution of Jan-Feb power prices. Our next step should be to look for ways to modify the Value-at-Risk metric to incorporate the probabilities from the Inverse Gauss or Pearson 5 distributions, both of which are skewed to the right, and have longer tails than either the normal or lognormal distribution.
Traders are interested in quantitative models that suggest profitable trading strategies. They will want to hear the reasons why they should buy or sell the forward contract, a call, a put, or some combination of forwards and options. From a trading strategy perspective, knowledge of the correct distribution of prices enables us to construct a tighter confidence interval about today’s forward price than if we relied on the customary normal distribution.
We have calculated the four-year mean of the Cinergy Jan-Feb daily spot prices to be $26.29/MWh. On Dec. 3, 2001, the forward price for Cinergy Jan-Feb, 5x16 on-peak, was $25.75/MWh. Because we did not expect a large price swing in either direction, the market conditions indicated that writing an at-the-money straddle (simultaneously selling both a call and a put with a strike price at $25.75/MWh or $26/MWh) could be a profitable trading strategy. Profitability would depend on whether the premium offered for the two options fairly compensated the trader for incurring the risk of price movements in either direction.
A less risky strategy could be to write a strangle perhaps with a $6 spread between the call and put (simultaneously sell a call with strike price $29/MWh and sell a put with strike price $23/MWh). However, the power option markets are so illiquid that traders might have to implement a strangle by settling for strike prices in multiples of 5: sell a call with strike price $30/MWh and sell a put with strike price $20/MWh or $25/MWh.
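The payoff profiles of the written straddle and strangle can be sketched as follows. The strikes come from the text; the option premiums are hypothetical placeholders, since the article does not quote any, and the function names are illustrative. The price argument is the settlement price against which the options are exercised.

```python
def short_call_pnl(price, strike, premium):
    """Profit per MWh from writing a call: keep the premium, pay any excess over the strike."""
    return premium - max(price - strike, 0.0)

def short_put_pnl(price, strike, premium):
    """Profit per MWh from writing a put: keep the premium, pay any shortfall below the strike."""
    return premium - max(strike - price, 0.0)

def short_strangle_pnl(price, call_strike, put_strike, call_premium, put_premium):
    """Written strangle; with equal strikes this is a written straddle."""
    return (short_call_pnl(price, call_strike, call_premium)
            + short_put_pnl(price, put_strike, put_premium))

# HYPOTHETICAL premiums in $/MWh; strikes follow the $29/$23 strangle in the text.
CALL_STRIKE, PUT_STRIKE = 29.0, 23.0
CALL_PREMIUM, PUT_PREMIUM = 1.0, 0.8

for settle in (18.0, 23.0, 26.0, 29.0, 35.0):
    pnl = short_strangle_pnl(settle, CALL_STRIKE, PUT_STRIKE, CALL_PREMIUM, PUT_PREMIUM)
    print(f"settlement ${settle:5.2f}/MWh  profit ${pnl:6.2f}/MWh")

# Break-even settlement prices for the writer.
upper_break_even = CALL_STRIKE + CALL_PREMIUM + PUT_PREMIUM
lower_break_even = PUT_STRIKE - CALL_PREMIUM - PUT_PREMIUM
```

The writer keeps the full premium while the settlement price stays between the strikes and loses dollar for dollar beyond the break-even prices. That is why the size of the right tail of the price distribution matters so much to the call leg.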
Yet another strategy emerging from these market conditions might be to write a covered call: buy the forward contract and write a call against it. The ideal scenario for writing a covered call would be if the forward prices dipped down low, say to $21/MWh, and we had reason to believe that the market overreacted and would correct itself, then move back to the $25-$26/MWh level in the following weeks. Our contingent strategy might be to buy the forward at $21/MWh. At that point we could write a $20/MWh-strike monthly (covered) call to hedge the position or wait in anticipation that the underlying price would move up. To refine this strategy, we would need to consult an accurate regional price forecasting model to learn where a forward-looking model predicted equilibrium prices would prevail for next winter. Relying solely on a statistical distribution of past prices for future trading strategies is risky business. It is analogous to driving a car by looking out the rearview mirror.
INFORMED DECISIONS. Most traders will want to know the probability that an option will go in the money, and by how much, before they will consider writing that option. Because our proposed strategy involves writing monthly strike options, we can avoid the irksome problem of trying to estimate the number of trading days in a given month that a daily strike option would be exercised. Instead, we can focus on the average price for the month.
For Cinergy Jan-Feb spot prices, the Inverse Gauss distribution fitted to our four-year data sample gives a 50% confidence level at a price of $20.00/MWh, 70% confidence level at price $25.80/MWh, and 78% confidence at a price of $29.99/MWh. See the figures below.
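These confidence levels translate directly into the probabilities a writer of monthly strike options cares about. The sketch below uses only the three cumulative probabilities quoted above and treats them as P(average price is at or below the level). The function names are illustrative, and the calculation assumes the option settles against the monthly average price.

```python
# Cumulative probabilities quoted in the article for the fitted Inverse Gauss
# distribution: P(average Jan-Feb price <= level), with level in $/MWh.
fitted_cdf = {20.00: 0.50, 25.80: 0.70, 29.99: 0.78}

def prob_call_in_money(strike):
    """Probability the average price finishes above the call strike."""
    return 1.0 - fitted_cdf[strike]

def prob_put_in_money(strike):
    """Probability the average price finishes at or below the put strike."""
    return fitted_cdf[strike]

def prob_between(low, high):
    """Probability the average price finishes between two quoted levels."""
    return fitted_cdf[high] - fitted_cdf[low]

print(f"Call near $30 finishes in the money:  {prob_call_in_money(29.99):.0%}")
print(f"Put at $20 finishes in the money:     {prob_put_in_money(20.00):.0%}")
print(f"Both legs of a 20/30 strangle expire: {prob_between(20.00, 29.99):.0%}")
```

Under this fitted distribution a written call struck near $30/MWh has roughly a one-in-five chance of finishing in the money, while a put struck at $20/MWh is an even bet, so a 20/30 strangle is far from symmetric in risk.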



In other circumstances, traders will want to base their decisions on a combination of “low technology” visual inspection of the data with the “high technology” method of fitting a distribution to sample data. For example, with the PJM March contract using data for 1997, 1999, 2000, and 2001, we found with 77% confidence that the price of the March contract would average $30.09/MWh or less. See the next figure.

However, visual inspection of the data revealed that most of the outliers for this distribution were from year 2001. Although it violates classical principles of statistics to visually inspect data and eliminate outliers from the sample, there are times when we are more concerned with getting the best possible forecast for next year’s price, rather than rigidly complying with the requirements of statistical theory. Therefore, if the traders could advance sound reasons to justify excluding 2001 data, then our fitted distribution using 1997, 1999, and 2000 data would appear as shown in the following chart.

Now we see that a 90% confidence interval can be formed with the price of $30.15/MWh as the upper bound on the interval. If we were to perform a one-tailed test, then with 95% confidence, the average price of power in March 2002 would be $30.15/MWh or less. That is the kind of high confidence level signal that traders might find useful. Of course, we must exercise caution in drawing inferences from a data sample where part of the sample was eliminated. The fitted distribution for PJM March prices with three years of data is the Inverse Gauss distribution. With four years of data, the best fit was the Loglogistic distribution. As hard as it may be for statistical purists to accept a methodology that eliminates sample data, in the real world the elimination of that abnormal data may provide a better forecast than including the entire sample.
We can see this point illustrated more clearly below. Note in the table that follows that the average price of power for the Entergy July-August contract is $82.84/MWh when calculated with a five-year data sample. However, the average price for 2000 and 2001 is $51.09/MWh. Which average price is a better representation of the long-run equilibrium to which summer of 2002 prices will migrate? Most traders will tell you that $51.09/MWh is much better than the five-year sample average of $82.84/MWh. In fact as of Dec. 2001, the futures contract for Entergy July-August 2002 has already settled at a value close to $51/MWh.

When we attempt to fit the best statistical distribution to the five-year data sample for Entergy July-August prices, the distribution appears highly leptokurtic due to the presence of outliers so far to the right that they do not appear on the chart.

However, when we fit the best distribution to price data for 2000 and 2001, then the distribution starts to resemble more of a bell-shape.

This article has shown that electricity prices do not follow the lognormal distribution commonly assumed in Value-at-Risk and option pricing models. Instead, power prices are governed by statistical distributions with skewness and kurtosis. Knowledge of these alternative distributions can lead to better risk metrics for trading floor exposure and to possible trading strategies.
Michael Guth is Manager, Quantitative Risk Management, and Gloria Zhang is Principal Business Analyst at Progress Energy, Raleigh, North Carolina. They can be reached via e-mail at michael.guth@pgnmail.com and gloria.zhang@pgnmail.com.
1 In technical terms, the higher frequency of extreme prices associated with extreme weather means the power price distribution will have higher kurtosis, or fatter tails, than either the Normal or Lognormal distribution.
2 Guide to Using @Risk Version 4, Palisade Corporation, (Feb. 2001), p. 138.
3 The Lognormal2 distribution in @Risk “specifies a lognormal distribution where the entered mean and the standard deviation equal the mean and standard deviation of the corresponding normal distribution. The arguments entered are the mean and standard deviation of the normal distribution for which an exponential of the values in the distribution was taken to generate the desired lognormal.” Guide to Using @Risk Version 4, Palisade Corporation, (Feb. 2001), p. 353.
4 Data for PJM prices in 1998 was not available. Analysts in the power industry frequently must find ways to work around the problem of missing or nonexistent data.
© Copyright 2004 by Michael A. S. Guth. All Rights Reserved. No portion of this site, including the contents of this web page may be copied, retransmitted, reposted, duplicated, or otherwise used without the express written permission of Dr. Michael Guth. Reprinted from The Risk Desk (January 2002) with permission of the publisher, Scudder Publishing Group, LLC. www.scudderpublishing.com.