Useful facts when going through this analysis

As a young man who recently got my MSc in physics, living and working in Copenhagen, the next rational step is to buy an apartment. My early estimates of when I could possibly buy one, based on my pay, dampened my spirits, because apartment prices were all in the range of 2 - 4 mio. kr (270k - 540k $) for three-room apartments of 60 - 100 m^2. Having lived in Denmark in my youth, I remembered that my parents bought a ~100 m^2 apartment for far less. For this reason I became very interested in how apartment prices in Copenhagen had developed over the past years. Luckily, Danish realtors have data for sales in Denmark all the way back to 1992, which is available at www.boliga.dk. The data contains the following information on each sale:

## 'data.frame':    70447 obs. of  14 variables:
##  $ city                       : Factor w/ 1 level "Copenhagen": 1 1 1 1 1 1 1 1 1 1 ...
##  $ sales_price                : num  3645000 3975000 1350000 3350000 3250000 ...
##  $ price_percentage_difference: num  -1 -3 0 NA -7 -5 NA -5 0 NA ...
##  $ municipality               : Factor w/ 18 levels "Albertslund",..: 12 4 12 12 12 12 4 15 15 12 ...
##  $ size_square_meters         : int  94 114 45 74 77 60 45 46 116 56 ...
##  $ price_per_square_meter     : num  38776 34868 30000 45270 42207 ...
##  $ postal_code                : int  2450 1800 2400 2100 1307 2300 2000 2630 2630 2100 ...
##  $ sales_type                 : Factor w/ 5 levels "-","Alm. Salg",..: 2 2 2 2 2 2 5 2 2 5 ...
##  $ day                        : chr  "28" "28" "28" "28" ...
##  $ month                      : chr  "01" "01" "01" "01" ...
##  $ year                       : chr  "2016" "2016" "2016" "2016" ...
##  $ nRooms                     : int  3 4 2 3 3 3 2 1 4 2 ...
##  $ housing_built              : int  2007 1897 1937 1903 1806 1902 1885 1968 1974 1932 ...
##  $ date                       : Date, format: "2016-01-28" "2016-01-28" ...

To start with, we can look at some basics: how have the sales prices and the prices per square meter (PPSM) changed over the years?

These figures show the sales prices and the price per square meter as a function of time, with the median and the first and third quartiles on top. We see that the sales prices have grown continuously, except for a dip from 2006 to 2009. This dip likely occurred because of the economic crisis and resulted in a fall in PPSM of nearly a third. But as we can see from the figures, the prices have grown again since then, and it looks like the PPSM has now returned to its 2006 level of around 30.000 kr/m^2. Compared to the measly average of about 6.000 kr/m^2 in 1992, the PPSM in Copenhagen has grown at an average rate of about 1.000 kr/m^2 per year, to roughly 500% of its 1992 level (a 400% increase). In the figure on the left, the average sales price (ASP) has grown from about 500.000 kr to 2.000.000 kr, i.e. to about 400% of its starting level, which corresponds to an average increase in sales price (SP) of about 62.500 kr per year. Given that prices are increasing, is this reflected in the price percentage difference (PPD), i.e. the difference between the initial price set by the realtor/owner and the actual sales price?
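As a rough check of the growth figures quoted above, here is a minimal base R sketch. The helper `growth_summary` is a hypothetical name introduced for illustration, and the inputs are the round numbers read off the plots, not values computed from the dataset. It separates "level relative to the start" from "percentage increase", which are easy to mix up.

```r
# Hypothetical helper (not part of the original analysis): summarise growth
# between a start value and an end value over a number of years.
growth_summary <- function(start, end, years) {
  c(factor           = end / start,                  # end level relative to start
    percent_increase = (end - start) / start * 100,  # increase in percent
    per_year         = (end - start) / years)        # average absolute increase per year
}

n_years <- 2016 - 1992  # 24 years, the span covered by the sales data

# Price per square meter: about 6.000 -> 30.000 kr/m^2
ppsm_growth <- growth_summary(6000, 30000, n_years)
ppsm_growth   # factor 5, percent_increase 400, per_year 1000

# Sales price: about 500.000 -> 2.000.000 kr
price_growth <- growth_summary(500000, 2000000, n_years)
price_growth  # factor 4, percent_increase 300, per_year 62500
```

So a factor of five in PPSM corresponds to a 400% increase, and a factor of four in sales price to a 300% increase.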

This plot doesn’t really show anything, because one sale has a positive PPD of around 162.999.900%. To find out whether this is an error, I went to the website where I got my data. According to the site, this apartment was initially listed at 1 kr and sold for 1.630.000 kr, resulting in the 162.999.900%. To see the bulk of the data, we can restrict the axis range and show the distribution of PPD.
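The outlier can be reproduced with a few lines of base R. This is only a sketch: the function name `price_percentage_difference` and the cutoff of 50% are assumptions made for illustration, and the short `ppd` vector reuses the first few values from the data listing above rather than the full dataset.

```r
# Hypothetical helper (not from the original analysis): PPD in percent,
# negative when the apartment sold below the initial price.
price_percentage_difference <- function(asking_price, sales_price) {
  (sales_price - asking_price) / asking_price * 100
}

# The outlier described above: listed at 1 kr, sold for 1.630.000 kr
outlier_ppd <- price_percentage_difference(1, 1630000)
outlier_ppd  # 162999900

# First few PPD values from the listing above, plus the outlier
ppd <- c(-1, -3, 0, NA, -7, -5, outlier_ppd)

# Keep only the bulk of the data; the cutoff of 50 is an arbitrary assumption
ppd_bulk <- ppd[!is.na(ppd) & abs(ppd) <= 50]
median(ppd_bulk)  # -3
```

Restricting the values before plotting, or restricting the axis range of the plot itself, both make the bulk of the distribution visible.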

So it looks as though the actual sales price is almost always lower than the initial price, with a median PPD of -3%. In the left plot it is quite hard to describe what is happening, but in the right plot it is clear that the PPD is almost normally distributed, with a peak at around -3%.

From these figures we can see that the ASP increases, and that the sales price is almost always lower than the initial price. To see if average incomes have increased as much, I downloaded the average income data for people in the same municipalities as in my sales dataset from Danmarks Statistik at dst.dk. One issue with this dataset is that the smaller areas that are present in the apartment sales data, like Valby, Vanløse, Hedehusene, etc., are all grouped under the København municipality. This will hide much of the information related to København, because it is a large city with many different areas, which have a huge price gap and most likely also a huge income gap.

This figure shows how the average yearly income (AYI) has increased since 1994. By 2014 it had reached about 170% of its 1994 level, which corresponds to an average increase in AYI of about 5.500 kr/year. The slope of the curve might seem a little steep, but that is because the y axis starts at 150.000 and not 0. It is also interesting to see that the dip we found in the sales prices and PPSM also shows up as a break in the linear increase of the income curve. The economic crisis didn’t affect the growth in AYI as strongly, but we have to take into account that even though apartment prices plummeted for about 3 years, they grew explosively afterwards, which is not reflected in the income growth. Comparing the AYI, which reached about 170% of its starting level, with apartment prices and PPSM, which reached about 400% and 500% of theirs respectively, we can deduce that it must have become more difficult for people to enter the apartment market. To see if this is reflected in the number of sales, we can look at the number of sales per year.
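The income growth can be sanity-checked against the yearly mean incomes for 1994 and 2014 that appear in the summary tables further down in this document. This is a minimal sketch using only those two numbers; the variable names are made up for illustration.

```r
# Mean yearly income taken from the summary tables later in this document
income_1994 <- 180876.4
income_2014 <- 304428.2

income_ratio     <- income_2014 / income_1994                        # level relative to 1994
income_increase  <- (income_2014 - income_1994) / income_1994 * 100  # increase in percent
income_per_year  <- (income_2014 - income_1994) / (2014 - 1994)      # kr per year

round(income_ratio, 2)   # about 1.68, i.e. roughly 170% of the 1994 level
round(income_increase)   # about 68 percent increase
round(income_per_year)   # roughly 6.200 kr per year with these two means
```

With these table means the yearly increase comes out a little above the rough figure quoted above, but in the same range. Either way, the income level grew by well under a factor of two.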

This figure is generated by counting the number of sales per month. It shows that the number of sales has actually increased, except for a decrease in the period from 2006 to 2009. To see it differently, we can show the median sales price per year per municipality and the total number of sales per municipality on a map.
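Counting sales per month could be done along the following lines in base R. This is a sketch on a tiny made-up data frame, not the code used for the figure. It only assumes a `date` column of class `Date`, as in the data listing above.

```r
# Tiny made-up sample; only the `date` column matters here
sales <- data.frame(
  date = as.Date(c("2016-01-28", "2016-01-28", "2016-02-03",
                   "2015-12-30", "2016-02-17"))
)

# Reduce each date to its year and month, then count occurrences
sales$year_month <- format(sales$date, "%Y-%m")
sales_per_month  <- table(sales$year_month)
sales_per_month
```

The resulting table is sorted by the `year_month` label, so it can be plotted directly as a time series of monthly counts.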

So as you can see in the figure, the number of sales is increasing in all municipalities, but it looks as though most of the sales occur in København and Frederiksberg.

To see how the sales price has developed over the years per municipality, we can plot this on a map as well.

This clearly shows that the median sales price is growing in every municipality over the years, and as with the number of sales, København seems to be the area where the median sales price increases the most.

Now, if we just compare the average income increase with the average increase in apartment prices, we have an average increase in yearly income of about 5.500 kr/year, while the average increase in apartment prices is about 62.500 kr/year. This means that the average apartment price (AAP) grows at about 1100% of the rate at which the average yearly income grows. One might wonder how anyone who finished their education in the past couple of years can ever hope to buy an apartment.
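The ratio quoted above follows directly from the two yearly increases. A minimal sketch, using the rounded figures from the text:

```r
# Rounded yearly increases quoted in the text (kr per year)
price_increase_per_year  <- 62500
income_increase_per_year <- 5500

growth_ratio <- price_increase_per_year / income_increase_per_year
round(growth_ratio, 1)    # 11.4
round(growth_ratio * 100) # 1136, i.e. roughly 1100%
```

Note that this compares absolute increases in kroner, not relative growth rates, so the ratio depends heavily on the fact that an apartment costs several times a yearly income to begin with.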

One explanation of why apartment prices continue to grow is that banks are increasingly willing to grant larger loans at very low interest, in the form of what is called a Realkreditlån - see more at bolius.dk. Those interested in the history of Realkreditlån can read more about it at realkreditraadet.dk. This low-interest loan was introduced in the 18th century as a result of a fire in 1795. Initially, Realkredit institutions would lend at most 60% of the property price, but in 1989 this increased to 80%. The loan can only be used on property, and as a side note, in May the interest on Realkredit loans was down to a minimum of 2.3%. This means that to buy an apartment in Copenhagen, you don’t need more than 20% of the property price up front. But you do still need that 20%, which in late 2015 was 20% of 2.000.000 kr, or about 400.000 kr, to even get a Realkreditlån. Given that AAPs grow faster than AYIs, we can assume that the average price of property will continue to grow faster than the average income. So as time goes by, how are people supposed to afford homes? Many people choose to save up an initial 5-10% of the property price, and then take a regular bank loan for the remaining 10-15%. This is a viable option, because interest rates on regular bank loans have been declining steadily since the 1980s, as shown in the figure below - see more about the diskonto and interest rates.
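To make the financing split concrete, here is a small base R sketch. The function `financing_split` and its parameter names are hypothetical, and it only encodes the simplified picture described above (a mortgage of up to 80%, with the rest covered by savings and a regular bank loan). Real lending rules are more involved.

```r
# Hypothetical helper: split a purchase price into mortgage, bank loan and
# own savings, assuming the mortgage covers a fixed share of the price.
financing_split <- function(price, savings_share, mortgage_share = 0.80) {
  if (savings_share + mortgage_share > 1 + 1e-9) {
    stop("savings and mortgage together exceed the purchase price")
  }
  # Whatever is not covered by mortgage and savings needs a regular bank loan
  bank_share <- max(0, 1 - mortgage_share - savings_share)
  c(mortgage  = price * mortgage_share,
    bank_loan = price * bank_share,
    savings   = price * savings_share)
}

# An apartment at 2.000.000 kr with 5% saved up
split_5 <- financing_split(2000000, savings_share = 0.05)
split_5  # mortgage 1.600.000, bank loan 300.000, savings 100.000

# The same apartment with the full 20% saved up, so no bank loan is needed
split_20 <- financing_split(2000000, savings_share = 0.20)
split_20
```

With 5% saved up, the buyer brings 100.000 kr and borrows the remaining 300.000 kr of the 400.000 kr gap from a regular bank.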

Getting back to the figures, the AYI figure also showed that there is quite a gap between the points, and it looks as though this gap is increasing. To see if this gap can be explained by which municipality one lives in, we can look at the same plot colored by municipality as well as show a map of this AYI.

Sadly, I couldn’t get all the municipalities on the map, but from the colored AYI development plot on the left, it should be clear that there is a distinct difference in AYI between people living in different municipalities. Even two bordering municipalities like Ishøj and Vallensbæk have an enormous gap of about 80.000 kr.

Now, I’ve gone through some of the questions about the data that I found most intriguing. But maybe by creating a scatterplot matrix, we can see some correlations that I haven’t thought of.

Looking at this scatterplot matrix, we can see that there is a correlation between median sales price and average income. This is not surprising, as both the average income figure and the sales price figure showed an increase. According to the scatterplot matrix, the correlation is 0.663, which is quite high. This is not the same as causation: the average income is correlated with the sales price, but the sales price of apartments is not the cause of increasing income. To see this correlation a bit better, we can plot it.
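A correlation coefficient like the one above can be computed with `cor()` from base R, which gives the Pearson correlation by default. The sketch below uses made-up numbers purely for illustration, so it will not reproduce the 0.663 from the scatterplot matrix.

```r
# Made-up yearly values, for illustration only (not from the dataset)
median_sales_price <- c(450000, 600000, 900000, 1400000, 1900000, 1500000, 1800000)
average_income     <- c(180000, 200000, 215000, 250000, 270000, 285000, 300000)

# Pearson correlation between the two series
r <- cor(median_sales_price, average_income)
r

# Two series that both trend upward over time will correlate strongly,
# even if neither one drives the other.
```

This is why a high correlation between two quantities that both grow over time should be read with care.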

As with the income gap, this figure shows that the relationship between income and sales price differs between municipalities.

One more thing I wanted to look at is how the median sales price and income by municipality looks when plotted on horizontal bar charts in an ordered fashion.

Looking at how this has developed over the years we can facet it by years. I’ve chosen to group the years.

What is interesting about this development is the effect of the economic crisis on average income in the years 2000-2006 and 2006-2010. We see that all the municipalities have a lower average income, but it looks as though the average income decreases more in the top 3 municipalities, resulting in an almost uniform distribution in the 2006-2010 period.

Now we can look at the same figures for sales price.

This is similar to the map we plotted earlier, and shows that Karlslunde is in fact the most expensive municipality in the Copenhagen area. Faceted by year, this gives:

Final Plots and Summary

These final plots were chosen because I found them the most informative and they were important in my understanding of a market as complex as the real estate market.

Plot One

Description One

This plot was the first plot of the analysis and was what sparked my interest in investigating how sales prices developed by year and by municipality, as well as later looking into average income by the same factors. The plot clearly shows growing sales prices and prices per square meter, but most importantly it also shows the large effect of the economic crisis on apartment prices. I found it very interesting that apartment prices seemed to explode from 1992 to 2006: a period of 14 years in which the average price per square meter grew from 6.000 kr to 30.000 kr, i.e. by almost 30% of the initial 1992 price per square meter each year. I had some talks with people older and wiser than me, who enlightened me about Realkreditlån. I then read a bit about its history and found that in 1989 the share of the property price that a Realkreditlån could cover was increased, meaning that people would need less cash at hand to buy an apartment. This has of course had a huge effect on the market, as large low-interest loans mean that most people can afford bigger apartments. This in turn means a bigger demand for apartments, which in turn leaves fewer apartments available, and as of now we are at the point on the supply and demand curve where the supply of apartments is low and prices are very high. Without going deeper into my own investigation of this, I can’t predict anything very well, but with everything I’ve learned from this analysis, the little knowledge I now have of bank loans and other low-interest loans, and some basic economics like supply and demand, I have a strong suspicion that apartment prices will drop drastically again soon.

## Source: local data frame [3 x 5]
## 
##    year     mean median     Q1     Q3
##   (chr)    (dbl)  (dbl)  (dbl)  (dbl)
## 1  1992 7641.443   5940 5057.0 6917.5
## 2  1993 6633.410   5529 4454.0 6536.0
## 3  1994 5830.283   5647 4559.5 6714.0
## Source: local data frame [3 x 5]
## 
##    year     mean median      Q1    Q3
##   (chr)    (dbl)  (dbl)   (dbl) (dbl)
## 1  2014 27044.79  26450 20181.0 32960
## 2  2015 29419.10  28508 21777.5 36024
## 3  2016 27889.79  26744 20160.0 34615
## Source: local data frame [3 x 5]
## 
##    year     mean   median     Q1       Q3
##   (chr)    (dbl)    (dbl)  (dbl)    (dbl)
## 1  1992 697361.0 460000.0 349500 670000.0
## 2  1993 643976.0 444744.0 318000 650000.0
## 3  1994 571301.3 453761.5 305350 670783.2
## Source: local data frame [3 x 5]
## 
##    year    mean  median      Q1      Q3
##   (chr)   (dbl)   (dbl)   (dbl)   (dbl)
## 1  2014 2297956 1870000 1298000 2995000
## 2  2015 2400103 1915000 1350000 3014500
## 3  2016 2256202 1725000 1275000 2845000
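Summary tables like the ones above (mean, median and first and third quartiles per year) could be produced along these lines. The output format suggests the original used dplyr, but the code is not shown, so this is a base R sketch on made-up data and not the author's code. Only the column names `year` and `price_per_square_meter` are taken from the data listing.

```r
# Small made-up sample with the same column names as the sales data
sales <- data.frame(
  year = c("1992", "1992", "1992", "1993", "1993", "1993"),
  price_per_square_meter = c(5000, 6000, 7000, 4500, 5500, 6500)
)

# Split by year and compute the summary statistics for each group
yearly_summary <- do.call(rbind, lapply(split(sales, sales$year), function(d) {
  x <- d$price_per_square_meter
  data.frame(
    year   = d$year[1],
    mean   = mean(x),
    median = median(x),
    Q1     = quantile(x, 0.25, names = FALSE),  # first quartile (R's default type 7)
    Q3     = quantile(x, 0.75, names = FALSE)   # third quartile
  )
}))
yearly_summary
```

The same pattern works for `sales_price` by swapping the column used for `x`.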

Plot Two

Description Two

Looking at approximately the same period as for the apartment sales, this plot shows the average income of people in the same municipalities. I chose this plot as one of my finals because it led to the finding that the average apartment price grew at about 1100% of the rate of the average income, which really piqued my curiosity. By no means does my research into the apartment market in Copenhagen end here, and this plot together with the first has given me the idea that there might be a new apartment price drop coming soon.

## Source: local data frame [3 x 5]
## 
##    year     mean   median       Q1       Q3
##   (int)    (dbl)    (dbl)    (dbl)    (dbl)
## 1  1994 180876.4 176124.0 170918.2 191053.2
## 2  1997 198750.5 191246.5 186488.8 214050.5
## 3  1999 214924.1 206226.0 200769.5 234954.5
## Source: local data frame [3 x 5]
## 
##    year     mean   median       Q1       Q3
##   (int)    (dbl)    (dbl)    (dbl)    (dbl)
## 1  2012 295170.5 285716.0 278348.5 321322.8
## 2  2013 298707.3 288764.5 282200.5 326237.8
## 3  2014 304428.2 294604.5 285679.0 332232.5

Plot Three

Description Three

This plot was chosen to give a better overview of how and where the prices grow. It clearly shows the same tendency as the first of my final plots, which is an increase in median sales price. However, one thing that is otherwise quite hard to see, but can be seen in this plot, is the movement of expensive municipalities. In the years from 1992 to 1999, the increase in median sales price per municipality seems to be quite uniform. After 2000 and until 2006, København and Frederiksberg seem to grow quite a lot faster than the other municipalities. After the economic crisis, København, Frederiksberg, Herlev and Vallensbæk seem to grow equally fast. According to the previous bar plot, Vallensbæk and Herlev are actually well below the values of København and Frederiksberg. This map plot was the best I could do, and therefore it was important for me to have it as a part of the final plots. I’m not very satisfied with it, as my initial idea for this analysis was to get a better picture of smaller areas like Valby, Vanløse, etc., which in the map data are grouped under København.

Reflection

There were a lot of interesting results for me in going through this analysis, like the very visible effect of the economic crisis on average income and, most visibly, on the sales price of apartments. I was shocked to see that the growth in average income and in apartment sales prices didn’t match, but I learned a whole lot about how low-interest loans, especially Realkreditlån, have had a very large effect on the increase in apartment prices. It was quite interesting to see how apartment prices had exploded again after 2010, while the average income had kept the same low rate of increase that set in at the beginning of the economic crisis. However, I was quite sad to find out that the maps I could find from GADM and around the internet didn’t have shapes for postal districts in central Copenhagen, since I know for a fact that there is a lot of variation between the Copenhagen areas. Furthermore, I found that some of the postal districts that I included in my apartment sales price analysis actually turned out to be quite far from central Copenhagen, even though they had a postal code matching central Copenhagen. Also, because the municipality data for average income and sales price didn’t match up, some of the results might be a bit different if other data were available. I had initially also planned to create a heat map of the sales, but since the BBR data, which is what I got from boliga.dk, doesn’t contain latitudes and longitudes, I couldn’t. I tried using the Google API to convert addresses to latitudes and longitudes, but I could only do 2500 conversions a day, which was far too few when my dataset consisted of 70.000 sales. I looked at commercial options, but I didn’t feel like paying $200 for converting addresses to latitudes and longitudes.
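To put the geocoding limit into perspective, here is a small base R sketch of how long the free quota would take and how the rows could be split into daily batches. It uses the row count from the data listing (70447) and the limit of 2500 conversions per day mentioned above. No actual geocoding API is called, and the batching scheme is just one possible approach.

```r
n_sales     <- 70447  # number of rows in the sales data
daily_limit <- 2500   # conversions per day mentioned above

# Number of days needed if one full batch is geocoded per day
days_needed <- ceiling(n_sales / daily_limit)
days_needed  # 29

# Assign each row index to a daily batch (1, 2, ..., days_needed)
batch_id <- ceiling(seq_len(n_sales) / daily_limit)
batches  <- split(seq_len(n_sales), batch_id)

length(batches)               # 29 batches
length(batches[[1]])          # 2500 rows in a full batch
length(batches[[days_needed]])  # 447 rows in the last batch
```

So the free quota would have been enough, but only by spreading the conversions over about a month.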