Monday, December 9, 2019

To Analyze a Sales Population, Consider the Expanded Percentile Curve




As we know, not all segments of the market move in tandem. When the market starts to move up, the move generally begins at the bottom of the value strata (starter homes) and graduates up the value ladder. Therefore, while analyzing a large sales population, it is prudent to use the entire percentile curve (as shown in the Miami graphic above) rather than just the Median, as the Median may mask the actual picture at both ends of the curve, say below the 25th percentile and above the 75th, and more precisely below the 10th and above the 90th.
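A minimal sketch of how such an expanded percentile curve might be computed (the data are simulated and the column names are assumptions, not the actual Miami dataset):

```python
import numpy as np
import pandas as pd

# Simulated stand-in for a large sales population (illustrative only).
rng = np.random.default_rng(0)
n = 5000
sales = pd.DataFrame({
    "sale_price": rng.lognormal(mean=12.5, sigma=0.5, size=n),
    "sale_date": pd.to_datetime("2018-01-01")
                 + pd.to_timedelta(rng.integers(0, 365, size=n), unit="D"),
    "bldg_sf": rng.normal(1800, 400, size=n).clip(600),
})
sales["asp"] = sales["sale_price"]  # time-adjusted later, in item 4's sketch

# Expanded percentile curve: the 1st through 99th percentiles of sale prices.
pctiles = np.arange(1, 100)
curve = pd.Series(np.percentile(sales["asp"], pctiles), index=pctiles)

# The Median (50th) alone hides what happens at the two ends of this curve.
print(curve.loc[[1, 5, 10, 25, 50, 75, 90, 95, 99]].round(0))
```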


How to Analyze a Sales Population


1. Single Parameter – Instead of relying on just one parameter (like the Median), it's better to consider the expanded percentile curve, preferably the 1st percentile through the 99th, avoiding the minimum and maximum as they may skew the picture as well.


2. Sample Selection – When confronted with all sales, meaning both arms-length and non-arms-length, and virtually no time to validate them, the 5th-to-95th percentile sample is more meaningful, without having to spend time on manual validations. Conversely, if the sample comprises only arms-length sales, the 1st-to-99th percentile range could be more meaningful.


3. Outlier Analysis – By the same token, while studying the outliers of a sample lacking validation, it's better to consider only the cases below the 5th and above the 95th percentile. Likewise, below the 1st and above the 99th could be a better starting point for a validated sample, gradually extending out to the outliers on both ends of the percentile curve (as time permits).
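A short helper makes the sampling and outlier rules in items 2 and 3 concrete (a sketch, continuing with the hypothetical sales frame from above):

```python
def split_by_percentile(df, col, lo, hi):
    """Split rows into a core (between the lo-th and hi-th percentiles of
    `col`) and the outliers falling outside that band."""
    low, high = df[col].quantile([lo / 100, hi / 100])
    in_band = df[col].between(low, high)
    return df[in_band], df[~in_band]

# Unvalidated (mixed) sample: work with the 5th-to-95th percentile core first.
core_5_95, outliers_5_95 = split_by_percentile(sales, "asp", 5, 95)

# Validated arms-length sample: a wider 1st-to-99th band is a better start.
core_1_99, outliers_1_99 = split_by_percentile(sales, "asp", 1, 99)
```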


4. Sales Timeframe – When the timeframe is extended (9 to 24 months), sales must be time-adjusted, preferably at the monthly level (deriving monthly time factors). If the sample comprises 3-4 years of sales, quarterly adjustments will make more statistical sense. When an extended series (e.g., 10+ years) is analyzed, annual factors would be appropriate. Most extended-series analyses are performed to detect seasonality in the data.
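A minimal sketch of monthly time factors, assuming the hypothetical sale_date and sale_price columns from above and the 01-01-2019 valuation date used later in this post:

```python
# Median sale price by month drives the monthly time-adjustment factors.
sales["sale_month"] = sales["sale_date"].dt.to_period("M")
monthly_median = sales.groupby("sale_month")["sale_price"].median()

# Anchor the factors to the last month before the valuation date.
anchor = pd.Period("2018-12", freq="M")
factors = monthly_median.loc[anchor] / monthly_median

# Restate every sale as of the valuation date; for 3-4 years of sales, swap
# "M" for "Q" (quarterly); for 10+ years, use annual factors instead.
sales["asp"] = sales["sale_price"] * sales["sale_month"].map(factors)
```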


5. Growth Factors – As we all know, the residential market is as local as it gets. Therefore, a good sales analysis must additionally be broken down to the sub-market level, as long as those sub-markets are well-established and accepted. Since growth rates vary by market, time adjustment factors must be derived at the sub-market level (e.g., 12% in our City of Miami example). Applying national or even regional factors could produce flawed and indefensible results. Time adjustment in AVMs is generally different (to be discussed later).


6. Use of Median – If time constraints force one to choose a single parameter to ascertain time's impact, it must be the Median, as it is less prone to outliers (outliers heavily influence the average, often distorting the analysis). In that event, the sales Median must be compared with the normalized (by Bldg SF) Median, ensuring they tell a consistent story.



7. Spatial Distribution – As part of the sales sampling, one must also ensure that the sales are spatially distributed in line with the population, so a meaningful spatial chart is in order alongside the data tables. In the above example, one must understand that the Median ASP and Median Bldg SF are independent statistics; they may be combined to get a general idea of the ASP/SF, but not for any serious analysis. To analyze the normalized ASP/SF, one needs to create the organic variable (row-wise ASP/SF) and then run the percentile stats.
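Item 7's point about the organic variable, in sketch form (column names assumed, continuing the running example):

```python
# Median ASP divided by Median Bldg SF is only a back-of-the-envelope figure.
# For serious analysis, build the row-wise (organic) ASP/SF first, then run
# the percentile stats on that variable.
sales["asp_per_sf"] = sales["asp"] / sales["bldg_sf"]

pctiles = [0.01, 0.05, 0.10, 0.25, 0.50, 0.75, 0.90, 0.95, 0.99]
print(sales["asp_per_sf"].quantile(pctiles).round(1))
```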


8. Creating Sales Ratios – In the above example, in addition to the percentile analysis of sales, the distribution of sales ratios (the ratio of County Market Values to Time-adjusted Sales, or ASP) is shown, thus connecting the apples-to-apples dots. The spatial chart additionally depicts the stratified sales ratios. Caution: While creating the sales ratios, one must time-adjust the sales to the valuation date (in this case, 01-01-2019), as the tax roll values are as of that date; otherwise, it would be an apples-to-oranges comparison.
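The ratio construction in item 8, as a sketch (the market_value column is hypothetical and simulated here; in practice it comes from the County tax roll):

```python
# Hypothetical roll values, simulated only to keep the sketch runnable.
sales["market_value"] = sales["asp"] * rng.normal(0.95, 0.10, size=len(sales))

# Sales ratio = tax roll Market Value / time-adjusted sale price (ASP), with
# sales already adjusted to the 01-01-2019 valuation date, so the comparison
# stays apples-to-apples with the roll.
sales["sales_ratio"] = sales["market_value"] / sales["asp"]

print(sales["sales_ratio"].quantile([0.01, 0.25, 0.50, 0.75, 0.99]).round(3))
```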


9. Regression Values – Ideally, ASP should be modeled (using multiple regression or any other industry-accepted methodology). The resulting sales ratios off the regression values (which are smoother and statistically more significant) should be used in all analyses, including the definition and removal of model outliers. Regression values could also be used to challenge the tax roll market values. When there is a paucity of comps, such regression values could even proxy actual comps in a comparables grid.
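A deliberately bare-bones hedonic regression along the lines of item 9 (statsmodels is our illustrative choice, and a single predictor is used purely for brevity; real models carry many more variables):

```python
import numpy as np
import statsmodels.api as sm

# Model log(ASP) on property characteristics, then back-transform.
X = sm.add_constant(sales[["bldg_sf"]])
fit = sm.OLS(np.log(sales["asp"]), X).fit()

sales["regression_value"] = np.exp(fit.predict(X))

# Smoother, regression-based ratios: usable for defining/removing outliers,
# challenging roll values, or proxying comps when actual comps are scarce.
sales["regression_ratio"] = sales["regression_value"] / sales["asp"]
```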


In a nutshell, to get a better picture of the overall market, an expanded percentile distribution analysis of sales is significantly more meaningful than a simplistic median-based sales analysis. Additionally, normalized values and spatial sales ratios could provide better insight into the building blocks.


-Sid Som, MBA, MIM

homequant@gmail.com



Sales Ratio Study is largely Ineffective, if not Counter-productive

A Sales Ratio study examines the relationship of the Market Values on the Assessment Roll to Time-adjusted Sale Prices (adjusted to the Valuation/Taxable Status Date). A Sales Ratio study, unlike an Automated Valuation Model (AVM), is not an econometric solution that could be used in any meaningful decision-making. Unfortunately, sales ratio studies are often developed and used to test the assessment rolls' mettle and progressivity.

Since sales ratios are developed using sales complexes only, two very similar homes in a given neighborhood - with very different effective ages, say 15 vs. 50 - will be evaluated alike. On the other hand, a properly developed AVM will effectively assess those differences and return values that are different, yet statistically significant and econometrically defensible.

What does a sales complex comprise?

-Sale Price
-Sale Date (to time-adjust sales to valuation/status date)
-Sale Validation (to ensure only arms-length sales are used)
-Classification (to ensure the right class of properties is used)
-Market Value (from the Tax Assessment Roll)
-Assessed Value (when Residential Assessment Ratio or RAR is also required)
Additionally, some consultants retain a few other variables, like Town (to evaluate sub-markets if it is a county-wide study) and Living Area (to consider normalized scenarios). Of course, sub-market and normalized ratios are rarely required by statute.
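A sketch of the sales complex as a data structure (field names are assumptions, not any jurisdiction's actual schema):

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class SalesComplex:
    """One record of the 'sales complex' described above."""
    sale_price: float
    sale_date: date            # to time-adjust to the valuation/status date
    arms_length: bool          # sale validation flag
    property_class: str        # to ensure the right class of properties
    market_value: float        # from the Tax Assessment Roll
    assessed_value: Optional[float] = None   # only when RAR is also required
    town: Optional[str] = None                # optional: sub-market evaluation
    living_area: Optional[float] = None       # optional: normalized scenarios
```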

So, why does a sales ratio study become an ineffective solution? Let's consider the following reasons...

1. Sales Ratio Studies are at best Heuristic analyses -- Most large and even medium-sized tax jurisdictions have moved to primarily AVM-based tax rolls. Therefore, when the raw sale prices (generally time-adjusted) are compared with the scientifically derived AVM values (to compute the sales ratio), it is no longer an apples-to-apples comparison. Sale prices do indeed reflect property characteristics (in addition to location, etc.). They are nonetheless highly subjective, reflecting individual (un-equalized) economic behavior, including personal tastes and preferences (e.g., when one is bent on buying a pink house, one will overpay). Exterior walls and conditions are actual modeling variables, while exterior color is not. Therefore, the data variables force AVMs (and hence the tax roll values) to ignore those emotional premiums, while the standalone sale prices in sales ratio studies can neither differentiate nor ignore them.

2. Sales Ratio Studies do not require "Representative" Tests -- The underlying assumption of a sales sample is that it statistically represents the population it is derived from. But that assumption is not necessarily valid. When the sample is large, it tends to be representative at the body of the curve (between the 25th and 75th percentiles), but not necessarily on the short (<25th percentile) and long (>75th percentile) ends of the curve. The reason is simple: not all segments of the market move in tandem. When a market starts its upswing, it usually begins at the lower end of the curve, followed by the mid-range and further up. Thus, without a proper representative test, a sales ratio study is, at best, hit or miss. The additional sub-market or normalized ratios remain equally unreliable.
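One way such a representative test could be run (a sketch; the two-sample Kolmogorov-Smirnov test is our illustrative choice, not a mandated standard, and the frames reuse the hypothetical example from the earlier sketches):

```python
import numpy as np
from scipy import stats

# Full frame as the 'population'; the trimmed core as the 'sample' under test.
population, sample = sales, core_5_95

# Whole-distribution check: does the sales sample track the population?
ks_stat, p_value = stats.ks_2samp(sample["asp"], population["asp"])
print(f"KS statistic {ks_stat:.3f}, p-value {p_value:.3f}")

# Tail check: the body (25th-75th) often matches even when the ends do not.
for p in (5, 10, 25, 75, 90, 95):
    s, q = np.percentile(sample["asp"], p), np.percentile(population["asp"], p)
    print(f"{p}th percentile: sample {s:,.0f} vs. population {q:,.0f}")
```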

3. Sales Ratio Studies do not require Price Segmentation Tests -- Sales ratio studies are the perfect "one size fits all," meaning a median-based ratio alone does the trick. The absence of price-segmented (<25th percentile; 25th to 50th; 50th to 75th; >75th percentile) ratios makes for, at best, a limited-scope analysis, which is the primary reason why many tax rolls are regressive, i.e., why middle-class neighborhoods heavily subsidize wealthy districts. State Boards and industry technical bodies must require the full price-segmented ratios (in addition to the median ratio) to minimize the incidence of compensating errors, leading to fair and equitable adjustments.
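The missing price-segmentation test is easy to state in code (a sketch, reusing the hypothetical sales frame and its sales_ratio column):

```python
import pandas as pd

# Median ratio within each price segment, instead of one global median.
sales["segment"] = pd.qcut(sales["asp"], q=[0, 0.25, 0.50, 0.75, 1.0],
                           labels=["<25th", "25th-50th", "50th-75th", ">75th"])
print(sales.groupby("segment", observed=True)["sales_ratio"].median().round(3))
```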

4. Sales Ratio Studies do not require Champ-Challenger Validations -- Before an AVM is finalized, it is optimized and then tested against a mutually exclusive hold-out sample (Challenger). If the hold-out test results are very similar, the model is considered final (Champ) and is ready to be applied to the population. Of course, when it comes to sales ratio studies, there are no such requirements. A forward sales sample would be an ideal challenger. For example, if the statutory ratio is developed off the 2018 calendar-year sales, it could be tested against a forward sales sample (comprising validated Q1/Q2-2019 sales). Seasoned listings could be added to bolster the forward sample. The forward-sample test must produce results comparable to the statutory sample's. Before rushing to pronounce the roll results sound, concerned third parties like local newspaper reporters and independent review consultants should, at the very least, undertake this challenger test and let the dust settle.
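What the challenger comparison might look like (a sketch; forward_sales stands in for validated Q1/Q2-2019 sales, time-adjusted back to the valuation date, and is simulated here only to keep the snippet runnable):

```python
# Hypothetical forward sample; in practice, a fresh, validated sales file.
forward_sales = sales.sample(500, random_state=1)

# Statutory median (2018 calendar-year sales) vs. forward-sample median.
statutory_median = sales["sales_ratio"].median()
forward_median = forward_sales["sales_ratio"].median()

drift = abs(forward_median - statutory_median) / statutory_median
print(f"Median ratio drift: {drift:.1%}")  # a large drift flags the roll
```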

5. Sales Ratio Studies do not require Stratified Time Adjustments -- As explained before, not all segments of the market move in tandem; hence, the time adjustment factors in each segment are often different. Applying one median factor generally distorts both ends of the curve, forcing the outer segment values to move further away from the AVM (Roll) values. Again, the State Boards and industry technical bodies must require that all ratio analyses - from sales sampling to time adjustments to error ratios - be performed and broken down into statistically significant price segments. In fast-moving markets, time becomes a critical issue, so the time adjustment factors must be analyzed and applied by statistical segment. By contrast, even when a median time factor is used in an AVM, it does not pose any threat, as it interacts with other variables, including location, and gets corrected.
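Stratified time factors in sketch form (one factor curve per price segment, reusing the segment labels from the snippet above; the anchor month is an assumption):

```python
import pandas as pd

def monthly_factors(df, anchor):
    """Monthly time factors for one stratum, anchored to the valuation month."""
    med = df.groupby(df["sale_date"].dt.to_period("M"))["sale_price"].median()
    return med.loc[anchor] / med

# One factor curve per price segment, instead of one median factor for all.
anchor = pd.Period("2018-12", freq="M")
segment_factors = {seg: monthly_factors(grp, anchor)
                   for seg, grp in sales.groupby("segment", observed=True)}
print(segment_factors["<25th"].round(3))
```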

6. Sales Ratio Studies do not require any meaningful Spatial Tests -- While a system-wide median ratio could be acceptable for a small and mostly homogeneous jurisdiction, it is not very meaningful for large and complex jurisdictions with multiple towns, boroughs, etc. For example, a system-wide coefficient of dispersion (COD) of 15 for New York City is neither very instructive nor very helpful, as a low COD of 9 for Staten Island, a relatively homogeneous borough, may compensate for Brooklyn's 20, owing to Brooklyn's highly heterogeneous housing stock. In this example, Staten Island passes with flying colors while Brooklyn fails, even though the city-wide COD remains compliant. Therefore, the study of assessment equity requires meaningful analysis by major spatial parts as well as the aforesaid economic/market attributes and segments.
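The COD itself is cheap to compute, which makes the borough-wise breakdown easy to demand (a sketch using the textbook COD formula; the borough column is an assumption, simulated here to keep the snippet runnable):

```python
import pandas as pd

def cod(ratios: pd.Series) -> float:
    """Coefficient of Dispersion: average absolute deviation from the median
    ratio, expressed as a percent of the median."""
    med = ratios.median()
    return 100 * (ratios - med).abs().mean() / med

# Hypothetical borough assignments for the running example.
sales["borough"] = rng.choice(["Brooklyn", "Staten Island"], size=len(sales))

print("System-wide COD:", round(cod(sales["sales_ratio"]), 1))
print(sales.groupby("borough")["sales_ratio"].apply(cod).round(1))
```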

7. Sales Ratio Studies do not require the use of MLS data to Test Data Validity -- Granted, most tax jurisdictions tend to be more careful about the quality of the sales data (easy pickings for the media, etc.) than about the unsold properties. Yet this data quality is nowhere near as clean and up-to-date as the MLS data, which are professionally inspected and verified. Therefore, the State Boards and other technical bodies should urge jurisdictions to develop the ratio-eligible database after comparing the internal sales data with the MLS data. Only the (arms-length) sales matched and confirmed by the MLS data should qualify for the ratio study. There is a long-term benefit to this exercise as well: by studying the unmatched data, an AI logic could be developed and applied to the unsold population to isolate (or at least narrow down) the cases requiring immediate attention. An AI-driven, auto-regressive data update process is always preferable (inherently more surgical) to the traditional cyclical approach (e.g., update all data on a 5-year cycle), considering that only a small percentage of the entire population might need attention or update; the cyclical approach costs taxpayers an unnecessary fortune.

8. Sales Ratio Studies do not require any Data Convergence schema -- Although the raw sale prices are being compared with the modeled values, no data convergence schema is required to make the two closely align. As indicated before, absent the data variables, it is difficult to explain why two very similar homes in very close proximity to each other fetch somewhat different prices. While an AVM will correct and explain that difference, sales ratio studies will have no explanation. Therefore, sales verification must also include "Effective Age" and "GIS Implications." If the GIS implications are noted alongside the verified sales, the highly impacted sales could easily be avoided, a priori. Similarly, the Effective Age ranges could effectively serve as sub-stratification criteria to compare and analyze the genuinely similar properties. The point is that the sale price alone is inadequate for any meaningful decision-making.

9. Sales Ratio Studies have yet to factor in the Impact of the Cap on SALT deductions -- The new $10,000 cap on SALT deductions (including property taxes) has started to impact the high-end residential markets, especially in high-tax coastal markets. A report from the New York Federal Reserve concluded that the caps on taxes and mortgage debts "have negatively impacted the housing market" by lowering the sales volume. Of course, the market will take a while to manifest any meaningful medium- to long-term effect. As the volume wanes on the long end of the value curve, sales must be adequately replaced by hand-worked appraisals. It would be imprudent and highly regressive to re-populate that stretch of the curve by drawing from the 50th-to-75th percentile range. That type of re-population perhaps works in the physical sciences, but not in economics. State Boards and technical bodies must recognize and act on this emerging trend.

Sales ratio studies urgently need a fresh and forward look. The median-based one-size-fits-all concept has to be replaced with meaningful market segmentation analyses, coupled with a handful of spatial and economic attribute tests. 

-Sid Som, MBA, MIM
homequant@gmail.com



Tuesday, December 3, 2019

Why AVM Error Rates (CODs) across Tax Rolls are not Comparable

The analysis of Assessment Rolls of major Jurisdictions requires advanced technical training and quantitative knowledge. It's hilarious when a local staff reporter settles the score annually with a lengthy and superficial article, and the politicians run with it, silencing the unhappy taxpayers. And the cycle continues, year in and year out.

A recent local newspaper article indicated that the percent error rates "of the five largest cities for which studies have been completed in the last two years... including New York at 17.6 percent, Chicago at 25.1 percent, and Philadelphia at 20.2 percent. Houston's error rate was 7 percent in its most recent study, and Phoenix's was 8.1 percent."
Though Automated Valuation Modeling ("AVM") was used to develop all of the above Assessment Rolls ("Roll"), the modeling error rates as indicated above (generally defined by the Coefficient of Dispersion or "COD" of the underlying AVM) are not comparable. 

While there are general AVM guidelines, they are not like the SAT or GRE. AVM development is highly subjective, depending mainly on the acumen of the in-house modeler(s) or the hired consultant. Since the actual models are not published, the external re-validation of those model CODs is even more subjective and circular.

So, why are the above CODs not comparable? Here are the fundamental reasons:

1. Sales Validation -- All market AVMs are developed off of recent, arms-length sales. Thus, all sales have to be validated, and then a random or stratified random sample of arms-length sales serves as the modeling sample. Of course, there is no hard science behind the sales validation process. Therefore, if Jurisdiction X considers all of its borderline cases arms-length, while Jurisdiction Y aggressively removes them from an identical universe, the resulting AVM of the former, ceteris paribus, will produce a higher COD than the latter's. Unfortunately, when the local reporters compare the competing CODs, they will have no idea how the respective jurisdictions validated the sales.

2. Sales Sampling -- From the universe of the validated arms-length sales, a sample properly representing the overall population is then derived. The sales sample must statistically "represent" the population, failing which the resulting AVM will be invalid, paving the way for a flawed Assessment Roll (statutorily, an Assessment Roll must be fair and equitable). Again, there is no hard and fast rule as to the extraction of the sales sample. If Jurisdiction X restricts the representative test to the 1st-to-99th percentile range while Jurisdiction Y takes a more lax approach of 5th-to-95th percentile, the AVM of X, ceteris paribus, will have a higher COD than Y's. Of course, the local reporters would not even know of this requirement, let alone perform the test.

3. Removal of Outliers -- As part of the model optimization, a set of outliers is systematically identified and removed. While there are various methods to identify and remove outliers, the (sales) ratio percentile range is typical. Of course, some would use a very conservative range or approach, while others (those obsessed with better stats, i.e., lower CODs) would be more aggressive. Ceteris paribus, the modeler who conservatively removes only the cases below the 1st percentile and above the 99th will have a much higher model COD than someone who aggressively removes everything below the 5th and above the 95th. Case in point: Chicago's 25.1 vs. Houston's 7. Unfortunately, the local reporters would try to justify both (perhaps they already have) without even knowing the underlying modeling criteria, as models are rarely published.
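A sketch of the point: identical ratios, trimmed at different percentile bands before the COD is computed, yield very different "error rates" (reusing the cod() helper and the hypothetical sales frame from the earlier sketches):

```python
# Identical data, different outlier policies, incomparable CODs.
ratios = sales["sales_ratio"]
for lo, hi in [(0.01, 0.99), (0.05, 0.95)]:
    low, high = ratios.quantile([lo, hi])
    trimmed = ratios[ratios.between(low, high)]
    print(f"{lo:.0%}-{hi:.0%} band -> COD {cod(trimmed):.1f}")
```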

4. Sub-market Modeling -- Many modelers and consultants build their AVMs bottom-up, instead of the customary top-down. Here is an example of what bottom-up modeling means: let's say that the Roll is for the County as a whole, though the County comprises five Towns. Now, if the modeling takes place at the Town level (bottom-up), instead of at the normal County level (top-down), the average Town-wise CODs will be lower than under the customary top-down modeling, even though the objective remains unchanged: to produce a fair and equitable County-wide Roll. The problem with this type of bottom-up modeling is that there will be significant noise along the Town lines, generating a considerable amount of inconsistent values. Of course, the rush-to-approve local reporters would never know any of this, as those models are rarely made public. Jurisdictions even disregard FOIL requests by citing 3rd-party software copyright, etc.

5. Spatial Tests -- Irrespective of #4 above, publication of Town-wise results is not typical. Again, while the County-wide COD could be compliant, the Town-wise CODs could be far apart. If Town-1 is highly urban (requiring complex modeling, hence a higher COD), whereas Town-5 is highly suburban (involving easier modeling, thus a much lower COD), the CODs are expected to be quite different. Of course, the modeling criteria (sales sampling, outliers, etc.) must remain uniform across all Towns. Absent publication of the actual models, taxpayer advocacy groups must, at least, insist on the CODs by major sub-markets (e.g., Towns), in addition to the system-wide COD. They must also insist on knowing whether the modeling criteria were uniform across all of the major sub-markets. Of course, the local reporters vouching for the Rolls would confidently do so without even knowing how the modeling had taken place.
   
6. Equity Analysis -- A system-wide COD is just the beginning. It does not confirm that the Roll is fair and equitable. Let's assume that the reported COD is 15, which is compliant, a priori. Now, let's also assume that the unreported Town-wise average sales ratios range between 85 and 115. Since the Rolls tend to be regressive, it's highly likely that the 85 ratios would pertain to the most affluent Town in the County while the 115 would represent one of the middle-class Towns. In essence, the poor and middle-class neighborhoods perennially subsidize their wealthy counterparts. While the rich would make a big splash about their Roll values, they would be reticent when they sell their homes at twice those same Roll values. The average ratio of 85 does not mean that all homes in that Town are assessed strictly at that level. The 1st-to-99th range could be 70 to 100 (generally wider), while the Town with an average ratio of 115 could have a 1st-to-99th range of 100 to 130. Now, compare the 70-to-75 stretch with the 125-to-130 stretch. The local reporters who boldly confirm the Rolls would be clueless about this regressivity.

7. Data Maintenance -- Intra (i.e., within the Jurisdiction) comparison: sales are dressed and staged, so the sale data are inherently cleaner and more up-to-date than the unsold property data, thereby producing lower CODs for the modeling sample. Also, the sold parcels with data inconsistencies fall off by way of model outliers and resurface upon applying the model to the population. It's a classic 'hide and seek' unless those data errors are heeded before the model application. Of course, nobody knows what happens behind the curtain. Generally, the local MLS plays a significant role in (indirectly) forcing the Jurisdiction to keep the sale data up-to-date (obviously, sale data are easy pickings for the media and other interested groups). Inter (i.e., across Jurisdictions) comparison: two adjoining Jurisdictions may have vastly different outlooks on managing the population data. One may be very proactive while the other may be reactive, at best. Ceteris paribus, the lot fraction defective of the former's Roll would be significantly lower, generating far fewer tax appeals (an excellent metric to follow) than the latter's. Again, the local reporters confirming those Rolls would be clueless about these competing scenarios.

8. Model Testing -- The modelers and consultants who apply their draft models to mutually exclusive hold-out samples, ceteris paribus, will produce more sound and reliable Rolls than those who tend to skip this critical modeling step. This step helps identify the errors and inconsistencies - from sample selection to outliers to optimization to spatial ratios and CODs - in draft models, often to the extent that they get sent back and reworked from square one. The hold-out sample must have the same attributes as the modeling sample (and, in turn, the population), so this test is one of the most established ways to finalize a model, leading to its successful application. Again, the Jurisdiction that methodically performs this step produces a more sound and reliable Roll, with potentially far fewer tax appeals than its counterpart that boldly skips it. Of course, the local reporters confirming these Rolls would not know any of these crucial details.

9. Forward Sales Ratio Study -- A forward sales ratio study would be an ideal way to begin the Roll investigation process. For example, if the Roll was developed off of the 2018 calendar-year sales, it could be tested against a set of forward sales ratios (comprising validated Q1/Q2-2019 sales, etc.). To bolster the size of the forward sales sample, seasoned listings could also be added. Once time-adjusted back to the valuation date, the forward sales ratio test should produce results that closely parallel the published Roll. Therefore, before rushing to hire expensive consultants, the taxpayer advocacy groups should consider hiring local analysts to compile forward sales samples and run the ratio tests. The results must then be studied multi-dimensionally, meaning by major sub-markets, value ranges, non-waterfront vs. waterfront, Non-GIS vs. GIS, etc. If the results turn out very different, a challenger AVM is in order. At that point, instead of hiring someone from the universe of so-called industry experts (who would not shoot themselves in the foot), an outside economic consulting firm would be preferable, as that firm would provide real analysis along with a coordinated strategic action plan.

So, what is the solution? To minimize the damage done by the low-knowledge local reporters who rush to confirm the Roll (to please the ruling party), the taxpayer advocacy groups must present their critical viewpoints via op-eds in competing papers and magazines.

There is no denying that well-respected billionaire businessmen like Warren Buffett and Sam Zell have written off the print media.

-Sid Som, MBA, MIM
President, Homequant, Inc.
homequant@gmail.com