Aspen Times Real Estate

I decided to condense the Aspen Times, one of the two daily papers in Aspen, Colorado. Because the entire paper is about the size of one section in a major paper, I decided to work with the whole paper from Friday, January 8, 2016.

7c7f7bb4

One thing I've noticed about the Aspen Times is the large number of real estate advertisements. Aspen is well known for having some of the most expensive real estate in the country, and that's well represented in the paper.

f4084d87

Another reason I chose the real estate ads is that there is a reasonably complete set of data associated with each listing:

  • page number
  • page space/location
  • value
  • realtor: company and person
  • MLS#
  • font
  • text description
  • days listed
  • address - geo location
  • owners
  • building records
  • lot size
  • building size
  • distance to nearest gondola, store, highway, etc.
  • building age
  • building shape

Of course, not all of this data is contained in the advertisement, but all of it is probably available with some effort. I will start with the most accessible data contained within the paper and then decide where it should be augmented.

Here are some of my initial sketches:

ef81b4d3

e2f03f0d

7c48fbac

Initial Thoughts and Questions:

So far, I've been focusing on the relationship between page location, page space, and property value.  The questions I have in mind are:

  1. How property value relates to page space and page location
  2. How description relates to property value
  3. How description and property value relate to geographic location

I also like the idea of including some aspect of the text descriptions, like using word frequency analysis (thanks Nicholas for the tip) to see how often words like 'splendor' and 'luxury' come up.  I could then see if there is some relationship between the type of words used to describe a property and the property's value and location. Real estate is also inherently geographic, so I keep working with some map representation, but I'm failing to link them together.
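The word frequency idea can be sketched in a few lines of Python. This is only a rough illustration: the sample descriptions and the list of excluded words are placeholders I made up for the example, not text from the paper, and the tokenizing rule (lowercase letters and apostrophes) is an assumption.

```python
import re
from collections import Counter

# Hypothetical listing descriptions; placeholders, not text from the paper.
descriptions = [
    "Luxury home with mountain views near the ski lifts",
    "Mountain home with luxury finishes and ski access",
    "Cozy condo with views of Aspen",
]

# Words to leave out of the count (an assumed, illustrative stop list).
EXCLUDED = {"with", "and", "the", "of", "near", "a", "in"}

def word_frequencies(texts, excluded=EXCLUDED):
    """Count lowercase words across all texts, skipping excluded words."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in excluded)
    return counts

freq = word_frequencies(descriptions)
print(freq.most_common(5))
```

Running the same function separately on high-value and low-value listings would be one way to compare the vocabulary used for each group.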

Initial Look at the data

I finally finished collecting the data.  I had street addresses, but to do any geographic analysis or visualization, I needed geographic coordinates.  I discovered geocod.io, which allows batch geocoding.  You upload a data file, tell it which fields contain the address, and it returns a new data file with the latitude and longitude added.  There were a few errors that I fixed manually, but it worked pretty well.  I tried to use databasic.io to do a word count on the text descriptions, but the descriptions are short and the result was not very interesting:

002eef68

So I decided to use the Alchemy API, which does a more sophisticated text analysis.  It extracts keywords and a sentiment score from each text description.

There seems to be some correlation between the property value (in millions of US dollars) and the amount of page space the advertisement occupies:

32c5dd54
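One way to put a number on "some correlation" is a Pearson correlation coefficient. The sketch below is not how the chart above was made; the two lists are invented placeholder values (property value in millions of dollars and the fraction of a page each ad occupies), and the names are my own.

```python
import math

def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)

# Placeholder values for illustration only, not the data from the paper.
values_musd = [1.2, 3.5, 7.9, 12.0, 24.5]      # property value, millions of dollars
page_fraction = [0.125, 0.25, 0.25, 0.5, 1.0]  # share of a page the ad occupies

r = pearson(values_musd, page_fraction)
print(f"r = {r:.2f}")
```

A value near 1 would mean larger ads go with more expensive properties; a value near 0 would mean little linear relationship. With so few listings, the number should be read as a rough indication only.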

The relationship between value and page number is less clear:

40189454

and interestingly, the sentiment score (higher number => more positive) is lower for lower-valued properties:

37aa1ca5

and here's a map of some of the properties with the marker scaled by value (made with cartodb.com):

07ba7db7

More work to be done.

Version 1

14336d20

Here is a first version showing total real estate value on each page of the paper, as well as the overall sentiment of the text descriptions on each page and the most frequently used words (although I excluded words like room, bed, kitchen, etc.).  I made the basic visualization with Processing and then did the final layout in Illustrator.
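For anyone curious about the per-page totals, here is a minimal sketch of that aggregation step in Python (the actual visualization was made with Processing). The listing records and field names are hypothetical placeholders, not the data I collected.

```python
from collections import defaultdict

# Hypothetical listing records; page numbers and values are placeholders.
listings = [
    {"page": 2, "value_musd": 4.5},
    {"page": 2, "value_musd": 1.5},
    {"page": 5, "value_musd": 12.0},
    {"page": 9, "value_musd": 3.0},
]

def total_value_by_page(rows):
    """Sum listing values (millions of dollars) for each page, ordered by page."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["page"]] += row["value_musd"]
    return dict(sorted(totals.items()))

totals = total_value_by_page(listings)
print(totals)
```

The same grouping could be reused for the average sentiment per page by collecting the scores in a list for each page instead of summing values.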

Observations:

  1. Many advertisements are loaded in the first pages.
  2. Counterintuitively, most of the property value is in the middle of the paper, rather than on the first and last pages as I had expected.
  3. The top 5 words are "Aspen", "home", "mountain", "ski", and "views".

Thoughts:

  1. I like the idea of using the paper title as a header to immediately indicate that the graphic represents something about the paper.
  2. The only color I used was from the paper title
  3. It feels visually heavy on the bottom
  4. Layout of the frequent words was difficult. Initially I had them near the bottom of each line, but there was too much overlap.
  5. Still, when it comes to page layout and typography, I feel like I'm groping in the dark.  It's difficult for me to identify what's working and what's not. 
