This article was originally published in the September/October 1997 issue of Home Energy Magazine. Some formatting inconsistencies may be evident in older archive content.
Home Energy Magazine Online September/October 1997
Home Energy Rating Systems: Actual Usage May Vary
by Jeff Ross Stein
Jeff Ross Stein, a former research assistant at Lawrence Berkeley National Laboratory, is currently a design engineer at ACCO, an HVAC contractor.
Home energy ratings attempt to predict typical energy costs for a given residence and estimate the savings potentials of various energy retrofits. But one question has gone unanswered: How accurate are these ratings at predicting actual energy consumption? A new analysis suggests the ratings could do better.
Home energy rating systems (HERS) and related energy efficiency financing products have been in use since the late '70s. Today, 21 states have HERS. These systems score homes and estimate how much typical occupants would spend on energy. Consumers use the scores and annual cost estimates to compare the current and potential energy consumption of different homes.
The estimated energy costs of a high-rated home can help buyers to qualify for larger mortgages. However, if they get a larger mortgage based on the rating and still have high energy bills, their wallets will feel the squeeze. And if the estimate of resident energy use is wrong, the list of suggested cost-effective improvements that comes with the rating may include money-losing investments. To avoid these problems, HERS estimates need to approximate actual energy cost in homes.
As a research project for Lawrence Berkeley National Laboratory, Alan Meier and I compared home energy ratings with actual utility billing data for about 500 houses. The ratings were supplied to us by HERS providers in four states--the California Home Energy Efficiency Rating System (CHEERS), Energy Rated Homes of Colorado, Home Energy Ratings of Ohio, and Midwest Energy, a utility company and HERS provider in Kansas.
The CHEERS ratings were conducted in 1994; the others in 1996. These HERS all used different rating software and had slightly different rating procedures. For example, the CHEERS ratings did not include blower door testing, while the other ratings did. All of the HERS providers assured us that the samples were representative of the house types they rate and were within the expected accuracy of their ratings. (However, CHEERS has changed its software significantly since 1994, so our analysis of their predictions may no longer be relevant.)
We examined weather data from local federal weather stations for all of the locations, to ensure that utility bills during our study period were not thrown off by unusual weather. Since the heating degree-days during our study period were all close enough to the long-term averages used in the HERS software, we deemed weather normalization unnecessary.
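The weather check described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study. The 65°F base temperature is the common U.S. convention for heating degree-days. The function names, the 10% tolerance, and the sample temperatures are assumptions made for the example.

```python
def heating_degree_days(daily_mean_temps_f, base_f=65.0):
    """Sum of (base - daily mean) over days colder than the base temperature."""
    return sum(max(0.0, base_f - t) for t in daily_mean_temps_f)


def needs_weather_normalization(observed_hdd, long_term_hdd, tolerance=0.10):
    """True if observed HDD differ from the long-term average by more than
    the tolerance. The 10% default is an assumed threshold, not a study value."""
    deviation = abs(observed_hdd - long_term_hdd) / long_term_hdd
    return deviation > tolerance


# Hypothetical daily mean temperatures (deg F) for one week.
week = [30.0, 42.5, 55.0, 65.0, 70.0, 61.0, 48.5]
print(heating_degree_days(week))                   # 88.0
print(needs_weather_normalization(5200.0, 5000.0)) # False (4% deviation)
```

If the observed degree-days fall outside the chosen tolerance, the utility bills would need to be adjusted to a typical year before being compared with the rating.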
One of our most surprising discoveries was that none of the HERS we examined showed any clear relationship between rating score and total energy use or energy cost. Technically, rating scores only measure a house's individual potential for energy improvement; they are not designed to be used to compare different houses in the same way miles-per-gallon ratings are designed to compare cars. However, many consumers and HERS-related financing programs assume that houses with higher scores will have lower energy costs. Unfortunately, houses with higher scores, even when compared to houses of similar size, did not tend to use any less energy than houses with lower scores. The dashed line in Figure 1 shows the regression line of the CHEERS predictions. The declining energy use with higher ratings would seem to make sense. However, the solid line shows the regression line of actual average energy cost. It was constant at about $1,000 per year, regardless of the score.
The discrepancy between scores and energy use may be due to the take-back effect. Take-back occurs when people with more efficient homes use more energy than expected because they are less cautious about maintaining thermostat setbacks and other basic efficiency measures. In other words, higher-scoring houses may indeed be more efficient than lower-scoring houses, but only if they are operated in the same manner.
Because of the way the results are presented, people are being led to believe that energy use and cost predictions are more precise than they really are. HERS predictions sometimes calculate energy costs or life cycle savings to four significant digits, a much higher level of precision than is necessary or realistic. A sample rating from CHEERS stated, "Upgrading the cooling system to SEER 12.0 will save $2,166 on a life cycle basis." However, even rating systems that are quite accurate on average have large margins of overprediction and underprediction for individual homes.
Three of the four HERS--Kansas, Ohio, and Colorado--were remarkably accurate at predicting actual energy cost or energy use for all homes in our sample (see Table 1). For example, on average, the Colorado system underpredicted the actual energy use by only 3%. The fourth system, CHEERS, tended to overpredict the actual energy cost by about 50%, but it was much more accurate for newer houses, underpredicting them by 8% on average.
Again, while the average estimates were close to the real average in most cases, individual errors were often high. For example, the standard deviation of CHEERS predictions from actual energy use was 80%, with about one-third of the houses overpredicted by more than 130% or underpredicted by more than 30%. While much of this individual error can be attributed to occupant behavior, the magnitude (and CHEERS's consistent tendency to overpredict energy use) implies the existence of a systematic error in the rating procedure.
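The figures above fit together arithmetically: an average overprediction of about 50% with a standard deviation of 80% puts the one-standard-deviation band between 30% under and 130% over, and for a roughly bell-shaped spread about one-third of houses fall outside that band. The sketch below shows one way such statistics could be computed from paired predictions and bills. The function names are illustrative, and the use of a population standard deviation is an assumption, since the article does not say which was used.

```python
from statistics import mean, pstdev


def percent_errors(predicted, actual):
    """Signed error of each prediction as a percentage of actual use.
    Positive values are overpredictions, negative values underpredictions."""
    return [100.0 * (p - a) / a for p, a in zip(predicted, actual)]


def summarize(errors):
    """Return (average error, standard deviation, share outside one SD)."""
    avg = mean(errors)
    sd = pstdev(errors)
    outside = sum(1 for e in errors if abs(e - avg) > sd) / len(errors)
    return avg, sd, outside


def one_sd_band(avg, sd):
    """Error range lying within one standard deviation of the average."""
    return avg - sd, avg + sd


# Figures quoted in the article for CHEERS: +50% average error, 80% SD.
print(one_sd_band(50, 80))  # (-30, 130)
```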
A HERS rating comes with recommended measures to improve a given home's energy efficiency. The recommended measures are expected to be cost-effective. For example, a HERS might calculate that a hot-water tank wrap will reduce water heating stand-by losses and pay for itself in a particular house in one year.
We wanted to know what the impact of these recommended measures really was. We compared the actual energy use of CHEERS homes to the total energy savings that CHEERS predicted the occupants would receive if they implemented all recommendations.
We found obvious errors--some ratings predicted that homeowners would save more energy than they actually used, and many ratings predicted savings greater than 50% of the actual consumption. When total savings estimates are impossibly high, it is likely that some recommended measures are not actually cost-effective. This is especially likely because HERS only require that life cycle cost be less than predicted life cycle savings. Recommendations do not always have a built-in margin of safety to account for likely variation between occupants.
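A simple screen for the errors described above might look like the following sketch. It is not part of any HERS software. The function names, the labels, and the `safety_margin` parameter are hypothetical, and the 50% threshold simply mirrors the figure mentioned in the text. Savings and actual use are assumed to be in the same units, such as dollars per year.

```python
def screen_savings(predicted_savings, actual_use, suspect_share=0.5):
    """Classify a total savings prediction against metered consumption."""
    if actual_use <= 0:
        raise ValueError("actual_use must be positive")
    share = predicted_savings / actual_use
    if share > 1.0:
        return "impossible"   # predicted savings exceed what was actually used
    if share > suspect_share:
        return "suspect"      # more than half of actual consumption
    return "plausible"


def is_cost_effective(life_cycle_cost, life_cycle_savings, safety_margin=0.0):
    """With safety_margin=0.0 this is the bare rule that cost must be less
    than predicted savings. A positive margin (an assumed extension) discounts
    the predicted savings to allow for variation between occupants."""
    return life_cycle_cost < life_cycle_savings * (1.0 - safety_margin)


print(screen_savings(1200.0, 1000.0))           # impossible
print(is_cost_effective(900.0, 1000.0))         # True
print(is_cost_effective(900.0, 1000.0, 0.25))   # False
```

The last two calls show how a measure that barely passes the bare rule can fail once even a modest margin of safety is applied.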
On the other hand, the value of many typical HERS recommendations is not dependent on the accuracy of the rating. In the water-tank wrap example above, the rating calculated that the wrap would pay back in one year. Even if the rating overpredicted hot water use by 300%, the tank wrap would still pay for itself in about three years. The detailed economic information that usually comes with HERS recommendations, such as simple payback period, allows consumers to compare the financial aspects of different options and possibly reduce the risk of a bad investment.
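The tank wrap arithmetic can be made explicit. Simple payback is installed cost divided by annual savings, so if savings are assumed to scale with the predicted end use, the real payback is the predicted payback multiplied by the ratio of predicted to actual use. The sketch below uses a hypothetical function name and reads the 300% figure as a prediction about three times actual use, which gives the three-year result. Read strictly as 300% above actual (four times actual use), the same formula gives four years.

```python
def actual_payback_years(predicted_payback_years, predicted_to_actual_ratio):
    """Scale a predicted simple payback by how far the end use was overpredicted.
    Assumes savings are proportional to the predicted end use."""
    return predicted_payback_years * predicted_to_actual_ratio


print(actual_payback_years(1.0, 3.0))  # 3.0 years if predicted use is 3x actual
print(actual_payback_years(1.0, 4.0))  # 4.0 years if predicted use is 4x actual
```

Either way, a measure with a one-year predicted payback stays attractive under a large error, which is the point of the example.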
Moreover, many recommended improvements also provide intangible benefits, such as increased comfort, reduced noise, greater security, and better aesthetics.
Why Isn't HERS Perfect?
The algorithms most HERS use to rate a house include many variables, among them the dimensions of every window, wall, and floor in the house. To satisfy the ratings formulas, raters must also collect data on a wide range of variables, from duct leakage rates to insulation thickness to window overhang dimensions.
Accurate measurements for each of these are necessary for accurate predictions. Although raters are required to be trained and certified, they can introduce errors by collecting or recording inaccurate data. For example, the six raters who rated the 185 CHEERS homes used for our study produced ratings with significantly different average error and variance, suggesting that some raters may have collected or entered data incorrectly.
In addition, the simulation algorithms can be based on incorrect assumptions. For example, algorithms make assumptions about local weather (based on typical years); about some physical features, such as the number of appliances; and about the occupants.
Occupant behavior is probably the single most significant determinant of actual energy use (see "Can We Transform the Market Without Transforming the Customer?" HE Jan/Feb '94, p. 17). HERS have the difficult task of making assumptions based on typical occupant behavior. Reality can easily diverge from these assumptions; predicted energy use or energy cost can be off by 50% or more due to occupant behavior. Other variables also rely on assumptions rather than on measurement. For example, the weather variable is based on long-term averages, while the actual weather can differ considerably from the average in a given year. Any assumption can introduce error (see "Differences Between HERS and HERS").
Improving HERS
Our study results suggest several areas in which HERS could be improved, including better software, training, evaluation, and disclaimers. As national HERS accreditation moves forward, minimum standards in each of these areas may help to resolve many of HERS's problems.
Accurate Disaggregation of End Uses
The accuracy of specific end-use predictions, such as space heating and cooling and hot water heating, must be improved if recommendations are to be accurate. Suppose that a HERS provider calibrated a software package by assuming less hot water use and higher winter thermostat settings. The rating system might recommend replacing a lot of furnaces and not replacing hot-water heaters when, in reality, the opposite might be more appropriate.
Philip Fairey of the Florida Solar Energy Center studied HERS in Florida. By submetering certain end uses, he showed that the total energy use prediction was generally quite accurate, but that HERS tended to overpredict some end uses and underpredict others. While submetering particular equipment can be very expensive, it is the best way to verify and improve accuracy. Disaggregation is one area where the continual evolution of software can have a beneficial effect.
Error Correlations and Corrections
Analysis of billing data can be taken a step further by looking for correlations between rating accuracy and house characteristics. For example, we found that CHEERS overpredicted gas use more in Eureka, California, which has a relatively cold climate, than in Fresno, California, which has a relatively hot climate, and that it overpredicted electricity use more in Fresno. In general, we found that climates calling for more heating or cooling were the climates with more overpredicted energy use. CHEERS may be using incorrect heating and cooling setpoints, infiltration rates, or conduction rates.
Analyzing utility billing data can be a valuable and inexpensive way to improve HERS accuracy; however, it doesn't give the whole picture. Other types of research are also needed to document and improve accuracy. For example, the HERS BESTEST, which benchmarks HERS against DOE-2 and other state-of-the-art simulation software, is a valuable tool for testing the simulation properties of HERS.
To evaluate ratings on an ongoing basis, some HERS get utility bills for many rated homes. As nationwide accreditation leads to nationwide monitoring and evaluation, HERS guidelines may be modified. Accreditation will give HERS administrators a chance to note uniform irregularities nationwide. The currently proposed process for accreditation would require each HERS provider to collect utility bills for at least 10% of homes rated annually or 500 homes annually, whichever is less.
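The proposed sampling rule reduces to taking the smaller of two numbers. A minimal sketch, with a hypothetical function name and with rounding up to a whole home as an assumption the proposal does not spell out:

```python
def bills_required(homes_rated_per_year):
    """Homes per year for which utility bills must be collected:
    10% of homes rated or 500, whichever is less."""
    ten_percent = (homes_rated_per_year + 9) // 10  # integer ceiling of 10%
    return min(ten_percent, 500)


print(bills_required(1200))  # 120
print(bills_required(8000))  # 500
```

Under this reading, the 500-home cap only matters for providers rating more than 5,000 homes a year.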
Training the Raters
Another important trend we found in the CHEERS data was that some raters tended to produce more accurate ratings than others. This emphasizes the need for rater training, supervision, and retraining, and the need to minimize rater judgment calls in the rating procedures. Rater training varies in length and detail from state to state. For example, Indiana uses a weeklong course; across the border in Illinois, training takes just two days. Also, raters bring different backgrounds to the job. Some have no experience; some have done weatherization; and some are contractors who are familiar with blower doors, analysis software, and the whole-house approach. Again, accreditation may provide an opportunity to require minimum training levels.
Disclaimers: The Scores Are Not What You Think!
HERS providers need to give consumers more information about the accuracy and meaning of the ratings. HERS agencies generally do not explain how scores are calculated or how they should be interpreted. Rating scores are not designed to compare houses in the same way that miles-per-gallon ratings are used to compare cars.
Today, many people in the HERS industry want to overhaul or eliminate the scoring system and focus consumers' attention exclusively on energy use and cost predictions. However, these predictions might be more accurately presented as a range of savings, which would make the uncertainty in the calculation visible instead of hiding it behind a single number.
This approach has its critics. Mark Janssen of Indiana's HERS believes that rating software is accurate enough to be trusted. More importantly, he points out that customers want a number, not a range. They want to be told whether an improvement will be cost-effective or not.
Regardless of how accurate the ratings are, an increasing number of HERS are including a lengthy disclaimer. These disclaimers attempt to communicate to customers that savings estimates do not guarantee savings.
Many homes have been rated by HERS across the country in the last several years. However, agencies using ratings systems have not rigorously evaluated whether the ratings are providing accurate and useful information. To improve their ratings, agencies need to rigorously evaluate their programs and make data about their programs available to researchers. Researchers studying HERS can start with easy-to-use, low-cost data--for example, actual utility bill data for rated houses. Utility data can be used to validate accuracy, to calibrate rating systems, and to help identify and correct specific system errors.
At this point, those in the field do not generally consider accuracy to be a significant barrier to widespread HERS use. But everyone agrees that accuracy is important for credibility and long-term success of the programs.
Furthermore, a lack of accuracy may eventually catch up with some HERS and create a stigma that could spread to other programs. When other energy efficiency technologies, such as solar water heaters or compact fluorescent light bulbs, have failed to live up to initial expectations, they have suffered from serious and long-lasting problems. For these reasons, HERS organizations and HERS providers must continue to document and improve accuracy.