
A Comparison of Bandwidth and Kernel Function Selection in Geographically Weighted Regression for House Valuation

Joseph Awoamim Yacim, Douw Gert Brand Boshoff


Published at : 28 Jan 2019
IJtech : IJtech Vol 10, No 1 (2019)
DOI : https://doi.org/10.14716/ijtech.v10i1.975

Cite this article as:
Yacim, J.A., Boshoff, D.G.B., 2019. A Comparison of Bandwidth and Kernel Function Selection in Geographically Weighted Regression for House Valuation. International Journal of Technology. Volume 10(1), pp. 58-68
Joseph Awoamim Yacim Department of Estate Management and Valuation, School of Environmental Studies, Federal Polytechnic, Nasarawa, P.M.B. 001, Nasarawa 962101, Nigeria
Douw Gert Brand Boshoff Urban Real Estate Research Unit, Department of Construction Economics and Management, University of Cape Town, Private Bag X3, Rondebosch 7701, South Africa

Abstract

The study examines the influence of four spatial weighting functions and bandwidths on the performance of geographically weighted regression (GWR), namely fixed Gaussian and bi-square kernel functions and adaptive Gaussian and bi-square kernel functions, relative to the global hedonic ordinary least squares (OLS) models. A demonstration of the techniques using data on 3,232 house sales in Cape Town suggests that the Gaussian-shaped adaptive kernel bandwidth provides a better fit, spatial patterns and predictive accuracy than the other schemes used in GWR. Thus, we conclude that the Gaussian shape with both fixed and adaptive kernel functions provides a suitable framework for house price valuation in Cape Town.

Global model; Geographically weighted regression; House price; Kernel function

Introduction

Global hedonic ordinary least squares (OLS) models have, over the years, been used for a variety of purposes in different fields. In housing and related fields, these techniques have been used for over 50 years to identify the marginal contribution of each housing feature to price. Nonetheless, the global techniques are limited by their inability to completely remove spatial dependence and spatial heterogeneity in the data. These problems, if ignored, might result in biased and unreliable parameter estimates. Des Rosier and Thériault (2008) reported that creating appropriate market segments, transforming the data, ensuring adequate model specification and applying the right spatial models are possible ways of dealing with these limitations. Of particular interest in this study is the use of spatial models to control spatial dependence and spatial heterogeneity. The models are based on refined hedonic techniques devoid of parametric restrictions, with built-in features that adequately capture spatial autocorrelation (dependence) and/or variation or non-stationarity (heterogeneity) in housing prices.

Though data driven, spatial models are problem-specific solvers (tackling autocorrelation or heteroskedasticity); nevertheless, they have the potential to reduce other problems found in the property market. For instance, local regression methods such as geographically weighted regression (GWR), which are designed to, among other functions, control spatial heterogeneity, have been found to reduce spatial autocorrelation in residual errors (McCluskey & Borst, 2011). However, despite its capability of controlling spatial heterogeneity in the property market, GWR has received little attention in the literature (Bitter et al., 2007), particularly from a pan-African perspective.

Thus far, GWR has been used in a number of housing price studies, including those published by Bitter et al. (2007), Páez et al. (2008), Borst and McCluskey (2008), Lockwood and Rossini (2011), McCluskey et al. (2013), and Bidanset and Lombard (2014). To date, the only known pan-African housing study involving GWR was undertaken by Yacim and Boshoff (forthcoming) in Cape Town, South Africa. The present study is considered more comprehensive in terms of house price analysis because of the range of kernel functions and other schemes employed within its GWR framework.

GWR attempts to capture spatial variation in the interactions between the response variable and the different explanatory variables at each regression point in the study area, assigning weights to all observations relative to their distance from the regression point. Accordingly, the nearer an observation is to the regression point, the greater the weight it is assigned, so that it exerts more of an effect on the regression estimates than more distant observations. Kernel (density) function and bandwidth schemes are central to the effective performance of GWR. Thus, the optimal performance of GWR in house price estimation reflects the proper selection of kernel density and bandwidth, and their parameter settings. According to Bitter et al. (2007) and Guo et al. (2008), a larger bandwidth will produce coefficient estimates that are similar to the estimates of the global OLS models, with a spatial pattern that appears smooth across the geographic space of the local market. Conversely, if a smaller bandwidth is used, the coefficient estimates will be based only on observations that are close to the regression points, thereby causing high variance (Fotheringham et al., 2000; Fotheringham et al., 2002). To address these problems, and because house data behave differently relative to geographical location, different GWR schemes were tested with South African data to identify the best kernel and bandwidth specifications.
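
The weighting idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian and bi-square forms below are the ones commonly given in the GWR literature, the adaptive bandwidth is taken as the distance to the k-th nearest observation, and all function names (`gaussian_kernel`, `bisquare_kernel`, `adaptive_bandwidth`, `local_fit`) are hypothetical.

```python
import numpy as np

def gaussian_kernel(d, b):
    """Gaussian weights: positive everywhere, decaying smoothly with distance."""
    return np.exp(-0.5 * (d / b) ** 2)

def bisquare_kernel(d, b):
    """Bi-square weights: decay to exactly zero at and beyond distance b."""
    w = (1.0 - (d / b) ** 2) ** 2
    return np.where(d < b, w, 0.0)

def adaptive_bandwidth(d, k):
    """Assumed adaptive rule: distance to the k-th nearest observation.

    If the regression point is itself an observation, it counts as the
    nearest one (distance zero).
    """
    return np.sort(d)[k - 1]

def local_fit(X, y, coords, point, kernel, bandwidth=None, k=None):
    """Weighted least squares at one regression point.

    Pass `bandwidth` for a fixed kernel or `k` for an adaptive kernel.
    Returns the local coefficients and the weights that produced them.
    """
    d = np.linalg.norm(coords - point, axis=1)
    b = bandwidth if k is None else adaptive_bandwidth(d, k)
    w = kernel(d, b)
    Xw = X * w[:, None]                      # rows of X scaled by their weights
    beta = np.linalg.solve(X.T @ Xw, Xw.T @ y)  # (X'WX)^-1 X'Wy
    return beta, w
```

A fixed kernel uses the same distance `bandwidth` at every regression point, whereas an adaptive kernel widens in sparse areas and narrows in dense ones because `k` neighbours are always included.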

Previously, Bidanset and Lombard (2014) examined the combined contribution of bandwidths and kernel functions to a house price analysis conducted in Norfolk, Virginia. The main goal of this study is to see whether data from South Africa (a region with different socioeconomic and contextual settings that influence buyers’ attitudes) replicate the results of that earlier housing price analysis. The findings of this study could be of great interest to analysts and modelers, as they provide a simple framework for selecting an optimal kernel function and bandwidth without the need to try different schemes in the GWR assessment.

Conclusion

The global OLS models produce regression coefficients that do not reflect the true relationship within housing datasets because of their limitations in correcting for spatial effects. GWR permits local variation of parameter estimates within a geographic region, thereby producing reliable results. However, the optimal performance of GWR is predicated on the choice of spatial weights kernels and bandwidths. Providing a framework for analysts and modelers, particularly within a pan-African context, is among the main motivations of this study. Accordingly, the study compares the performance of different spatial kernel bandwidth weighting specifications in GWR relative to the global models using an example of housing data from Cape Town, South Africa. Specifically, the golden section bandwidth search scheme with the AIC criterion was applied to fixed (Gaussian and bi-square) and adaptive (Gaussian and bi-square) kernel functions.
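
The bandwidth search mentioned above can be illustrated with a short Python sketch. It is not the authors' implementation: it assumes a fixed Gaussian kernel, the corrected AIC (AICc) form commonly used for GWR, and synthetic data generated only for the example. The names `gwr_aicc` and `golden_section_min`, the search bounds and the data values are all hypothetical.

```python
import numpy as np

def gwr_aicc(bandwidth, X, y, coords):
    """AICc of a fixed-bandwidth Gaussian GWR calibrated at every observation.

    Assumed form: n*ln(sigma^2) + n*ln(2*pi) + n*(n + tr(S)) / (n - 2 - tr(S)),
    where S is the hat matrix (y_hat = S @ y) and sigma^2 = RSS / n.
    """
    n = len(y)
    S = np.empty((n, n))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)
        XtW = (X * w[:, None]).T                      # X'W_i
        S[i] = X[i] @ np.linalg.solve(XtW @ X, XtW)   # x_i'(X'W_iX)^-1 X'W_i
    resid = y - S @ y
    sigma2 = resid @ resid / n
    tr_s = np.trace(S)
    return n * np.log(sigma2) + n * np.log(2 * np.pi) + n * (n + tr_s) / (n - 2 - tr_s)

def golden_section_min(f, lo, hi, tol=1e-3):
    """Golden-section search for a minimum of f on [lo, hi] (f assumed unimodal)."""
    inv_phi = (np.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:            # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:                  # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2

# Synthetic example (hypothetical data): the slope drifts from west to east.
rng = np.random.default_rng(42)
n = 80
coords = rng.uniform(0, 10, size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
slope = 1.0 + 0.3 * coords[:, 0]
y = 2.0 + slope * X[:, 1] + rng.normal(scale=0.5, size=n)

# The lower bound is kept away from zero so that n - 2 - tr(S) stays positive.
best_b = golden_section_min(lambda b: gwr_aicc(b, X, y, coords), 1.0, 15.0, tol=0.01)
print("selected bandwidth:", best_b)
```

The same search could be repeated with a bi-square kernel or with the number of nearest neighbours as the search variable to cover the adaptive schemes; the resulting AICc values would then be compared across the four specifications.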

While all GWR models improve upon the results of the stationary coefficient global models, despite the inclusion of the second-order polynomial location coordinates, the adaptive kernel bandwidth–Gaussian shaped GWR (model 4) outperformed all other specifications in this study. The fixed kernel bandwidth–Gaussian shaped GWR (model 2) trailed closely behind, revealing that Gaussian-shaped GWR is suitable for house price valuation in Cape Town. One notable finding that this study shares with Bidanset and Lombard (2014) is that the Gaussian-shaped scheme, with fixed or adaptive kernels, is optimal. However, while this study found the adaptive kernel to perform best, their study found the fixed kernel to be optimal. Thus, analysts and modelers should consider the use of the Gaussian-shaped scheme with either fixed or adaptive kernel functions in their assessments of house prices, as suggested in both studies. Additionally, the results provide evidence that any of the spatial weights specifications in GWR is a viable alternative in situations where price estimation is the principal interest.

One area of concern is the high coefficient of dispersion (COD) and price-related differential (PRD) produced by the models used in this analysis. This result might be the consequence of inaccurate data collection, specification errors exacerbated by omitted attribute information, or market inefficiencies. Further research might be necessary to identify the causes before a definite position can be reached regarding the Cape Town housing data.
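
For readers unfamiliar with the two ratio statistics, the sketch below shows how they are conventionally computed in mass appraisal ratio studies. It assumes the standard definitions (COD as the average absolute deviation of estimate-to-price ratios from their median, expressed as a percentage of the median; PRD as the mean ratio divided by the value-weighted mean ratio). The function names and the three sample sales are hypothetical and are not taken from the Cape Town data.

```python
import numpy as np

def cod(estimates, prices):
    """Coefficient of dispersion: mean absolute deviation of the ratios
    from the median ratio, as a percentage of the median ratio."""
    ratios = np.asarray(estimates, dtype=float) / np.asarray(prices, dtype=float)
    median_ratio = np.median(ratios)
    return 100.0 * np.mean(np.abs(ratios - median_ratio)) / median_ratio

def prd(estimates, prices):
    """Price-related differential: mean ratio divided by the
    weighted mean ratio (total estimated value / total sale price)."""
    estimates = np.asarray(estimates, dtype=float)
    prices = np.asarray(prices, dtype=float)
    mean_ratio = np.mean(estimates / prices)
    weighted_mean_ratio = estimates.sum() / prices.sum()
    return mean_ratio / weighted_mean_ratio

# Hypothetical sales: the cheapest house is over-valued, the dearest under-valued.
prices = [100.0, 200.0, 400.0]
estimates = [110.0, 200.0, 360.0]
print("COD:", cod(estimates, prices))
print("PRD:", prd(estimates, prices))
```

A lower COD indicates more uniform estimates, and a PRD above 1 indicates that higher-priced houses tend to be under-valued relative to lower-priced ones, as in the hypothetical sales above.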

References

Berawi, A.R.B., Delgado, R., Calcada, R., Vale, C., 2010. Evaluating Track Geometrical Quality through Different Methodologies. International Journal of Technology, Volume 1(1), pp. 38–47

Bidanset, P.E., Lombard, J.R., 2014. The Effect of Kernel and Bandwidth Specification in Geographically Weighted Regression Models on the Accuracy and Uniformity of Mass Real Estate Appraisal. Journal of Property Tax Assessment & Administration, Volume 10(3), pp. 5–14

Bitter, C., Mulligan, G.F., Dall’erba, S., 2007. Incorporating Spatial Variation in Housing Attribute Prices: A Comparison of Geographically Weighted Regression and the Spatial Expansion Method. Journal of Geographical Systems, Volume 9(1), pp. 7–27

Borst, R.A., McCluskey, W.J., 2008. Using the Geographically Weighted Regression to Detect Housing Submarkets: Modelling Large-scale Spatial Variations in Value. Journal of Property Tax Assessment & Administration, Volume 5(1), pp. 21–54

Cameron, A.C., Trivedi, P.K., 2005. Microeconometrics: Methods and Applications. Cambridge University Press, New York

Cho, S-H., Lambert, D.M., Chen, Z., 2010. Geographically Weighted Regression Bandwidth Selection and Spatial Autocorrelation: An Empirical Example using Chinese Agriculture Data. Applied Economics Letters, Volume 17(8), pp. 767–772

Des Rosier, F., Thériault, M., 2008. Mass Appraisal, Hedonic Price Modelling and Urban Externalities: Understanding Property Value Shaping Processes. In: Kauko, T., d’Amato, M. (eds.), Mass Appraisal Methods: An International Perspective for Property Valuers, pp. 1–24. Oxford: Wiley–Blackwell

Fotheringham, A., Brunsdon, C., Charlton, M., 2000. Quantitative Geography: Perspectives on Spatial Data Analysis. Sage Publications, London

Fotheringham, A.S., Brunsdon, C., Charlton, M., 2002. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships. Wiley, New York

Griffith, D.A., 2008. Spatial-Filtering-Based Contributions to a Critique of Geographically Weighted Regression (GWR). Environment and Planning A, Volume 40(11), pp. 2751–2769

Guo, L., Ma, Z., Zhang, L., 2008. Comparison of Bandwidth Selection in Application of Geographically Weighted Regression: A Case Study. Canadian Journal of Forest Research, Volume 38(9), pp. 2526–2534

Leung, Y., Mei, C-L., Zhang, W-X., 2000. Statistical Tests for Spatial Nonstationarity based on the Geographically Weighted Regression Model. Environment and Planning A, Volume 32(1), pp. 9–32

Lockwood, T., Rossini, P., 2011. Efficacy in Modelling Location within the Mass Appraisal Process. Pacific Rim Property Research Journal, Volume 17(3), pp. 418–442

McCluskey, W.J., Borst, R.A., 2011. Detecting and Validating Residential Housing Submarkets: A Geostatistical Approach for Use in Mass Appraisal. International Journal of Housing Markets and Analysis, Volume 4(3), pp. 290–318

McCluskey, W.J., McCord, M., Davis, P.T., Haran, M., McIlhatton, D., 2013. Prediction Accuracy in Mass Appraisal: A Comparison of Modern Approaches. Journal of Property Research, Volume 30(4), pp. 239–265

Moore, W., Myers, J., 2010. Using Geographic-attribute Weighted Regression for CAMA Modelling. Journal of Property Tax Assessment & Administration, Volume 7(3), pp. 5–28

Páez, A., Long, F., Farber, S., 2008. Moving Window Approaches for Hedonic Price Estimation: An Empirical Comparison of Moving Techniques. Urban Studies, Volume 45(8), pp. 1561–1581

Thériault, M., Des Rosier, F., Villeneuve, P., Kestens, Y., 2003. Modelling Interactions of Location with Specific Value of Housing Attributes. Property Management, Volume 21(1), pp. 25–62

Tobler, W.R., 1970. A Computer Movie Simulating Urban Growth in the Detroit Region. Economic Geography, Volume 46(2), pp. 234–240

Yacim, J.A., Boshoff, D.G.B., 2018. Impact of Artificial Neural Networks Training Algorithms on Accurate Prediction of Property Values. Journal of Real Estate Research, Volume 40(3), pp. 375–418