Enabling social change through analytics
FORUM
Topic: Another dataset
I've downloaded and reverse geocoded the data available at http://www.philly.com/inquirer/multimedia/15818502.html

At the moment the dataset includes all the homicides from 2000 to 2009 (with zipcodes).

Some information is missing for some records, such as "cause", "weapon" or "motive". One record from 2001 and 11 from 2000 had no location information, so they have not been included.

Here is the link to the data:
http://spreadsheets.google.com/ccc?key=0AhFZsxLIS502dGhaNG1WWjgwcXJ2cGo2VnNEVGhVYmc&hl=en_GB

If there is any problem with it let me know.     
< rsallo : 01/12/2010 03:22 PM >
This is a wonderful dataset. How did you get this information all the way back to 2000? I thought they only had date information and no location.      < Siah : 01/12/2010 07:00 PM >
Thanks for pulling this together.      < jerrytown : 01/12/2010 07:20 PM >
rsallo thanks!      < hanih : 01/13/2010 12:22 PM >
@Siah: They have data going back to 1988 (look at the tabs just above the map). I'm still working on the reports from 1988 to 1999, that's why they are not yet in the dataset.      < rsallo : 01/13/2010 04:14 PM >
@rsallo I am still confused about how you do screen scraping off a Flash page. Or is there another HTML page on that website that I don't see?

Thanks again for your help and the dataset. I am really looking forward to seeing a dataset that goes back many years. Let me know if I can be of any help.

Cheers
Siah (http://analyticsx.wordpress.com/)     
< Siah : 01/13/2010 07:39 PM >
@Siah apologies for taking so long to reply.

Re. the Flash map: luckily the crime data is not embedded in the code, but is requested from the server every time you select a different year. If you use Firefox with the Firebug extension, you'll see the URL of the file with the data:

http://inquirer.philly.com/graphics/homicide_map_2/data/murders_yy.txt, with yy ranging from 88 to 09. I then used www.batchgeocode.com/reverse/ to reverse geocode the lat and long and get the postcode of each crime.
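For anyone who wants to repeat this, here is a minimal sketch of how the per-year file URLs could be generated. It only builds the URL strings from the pattern above; the function name and the assumption that every year from 1988 to 2009 has a file are illustrative, and the format of the files themselves is not covered here.

```python
# URL pattern quoted in the post above; "yy" is the two-digit year.
BASE_URL = "http://inquirer.philly.com/graphics/homicide_map_2/data/murders_{yy}.txt"


def murder_file_urls(first_year=1988, last_year=2009):
    """Return one data-file URL per year (hypothetical helper).

    Assumes the two-digit suffix is simply the last two digits of the
    year, zero padded (1988 -> "88", 2000 -> "00", 2009 -> "09").
    """
    return [
        BASE_URL.format(yy=f"{year % 100:02d}")
        for year in range(first_year, last_year + 1)
    ]


if __name__ == "__main__":
    for url in murder_file_urls():
        print(url)
```

Each URL could then be fetched with something like `urllib.request.urlopen(url)`, assuming the files are still being served.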

At the moment I'm busy with other projects and don't have much time to reverse geocode the data from 1988 to 1999, so if anyone else can take the lead that would be great.     
< rsallo : 01/19/2010 03:49 PM >
@rsallo thanks a lot for the clarification.

I have used the Yahoo API for geocoding and will see if I can help with producing this larger dataset. (Yahoo has a daily limit of 5000 entries.)
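With a daily cap like that, one way to plan the work is to split the records into chunks of at most 5000 and send one chunk per day. This is only a rough sketch: the helper name is made up, and it does not call any geocoding service.

```python
def daily_batches(records, daily_limit=5000):
    """Split records into consecutive chunks of at most daily_limit items.

    Hypothetical helper: the 5000 default mirrors the daily limit
    mentioned above; no geocoding request is made here.
    """
    if daily_limit <= 0:
        raise ValueError("daily_limit must be a positive integer")
    return [
        records[start:start + daily_limit]
        for start in range(0, len(records), daily_limit)
    ]


if __name__ == "__main__":
    # Placeholder record ids, not real data.
    batches = daily_batches(list(range(12001)))
    for day, batch in enumerate(batches, start=1):
        print(f"day {day}: {len(batch)} records")
```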

Thanks again     
< Siah : 01/19/2010 05:54 PM >
I have reverse geocoded the addresses from 89 to 2000. A number of locations were translated incorrectly (it reverse geocoded them to Michigan!). It would be great if somebody else could do a little QA on this dataset and correct those addresses.
Here is a link: http://www.box.net/shared/4duuaabsvs
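A quick first pass at that QA could be a bounding-box check that flags any record whose coordinates fall outside Philadelphia. The sketch below is an assumption-heavy starting point: the box limits are rough approximations of the city limits, and the record layout (dicts with "lat" and "long" keys) is a guess, not the actual layout of the shared file.

```python
# Rough, approximate bounding box for the city of Philadelphia (assumption).
PHILLY_LAT_RANGE = (39.86, 40.14)
PHILLY_LON_RANGE = (-75.29, -74.95)


def outside_philadelphia(lat, lon):
    """Return True if the point lies outside the approximate city box."""
    lat_ok = PHILLY_LAT_RANGE[0] <= lat <= PHILLY_LAT_RANGE[1]
    lon_ok = PHILLY_LON_RANGE[0] <= lon <= PHILLY_LON_RANGE[1]
    return not (lat_ok and lon_ok)


def flag_suspect_records(records):
    """Return the records that need a manual look (hypothetical layout)."""
    return [r for r in records if outside_philadelphia(r["lat"], r["long"])]


if __name__ == "__main__":
    # Made-up sample points: one in central Philadelphia, one in Michigan.
    sample = [
        {"id": 1, "lat": 39.95, "long": -75.16},
        {"id": 2, "lat": 42.33, "long": -83.05},
    ]
    print(flag_suspect_records(sample))
```

A box check like this will miss wrong addresses that still land inside the city, so it only catches the obvious outliers.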
< Siah : 01/24/2010 01:46 AM >
Thanks a lot Siah, well done :)

I had a few problems with some locations using "long" and "lat", so I tried xLAT and xLON instead, which worked most of the time (apologies for not mentioning that before...)     
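In code, that fallback might look something like the sketch below. It assumes each record has already been parsed into a dict keyed by field name, which is a guess about the file format; only the field names "lat", "long", "xLAT" and "xLON" come from the data.

```python
def _to_float(value):
    """Convert a raw field to float, or return None if missing or unparsable."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def pick_coords(record):
    """Return (lat, lon) for a record, preferring xLAT/xLON over lat/long.

    Hypothetical helper: the first pair in which both values parse as
    numbers wins; None is returned if neither pair is usable.
    """
    for lat_key, lon_key in (("xLAT", "xLON"), ("lat", "long")):
        lat = _to_float(record.get(lat_key))
        lon = _to_float(record.get(lon_key))
        if lat is not None and lon is not None:
            return lat, lon
    return None


if __name__ == "__main__":
    # Made-up record with empty xLAT/xLON, so lat/long is used instead.
    print(pick_coords({"xLAT": "", "xLON": "", "lat": "39.9", "long": "-75.2"}))
```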
< rsallo : 01/27/2010 03:16 PM >