Watch Out What's About?

Project Info

Data @ Heart

Project Description


https://public.tableau.com/profile/tones#!/vizhome/CrimeByPostcode1995-2017WithSocialPostOverlaid/CrimeByPostcode1995-2017WithSocialPostsIncidences?publish=yes

https://public.tableau.com/profile/tones#!/vizhome/CrimeByPostcode1995-2017WithSocialPostOverlaid/CrimeByPostcode1995-2017Categories?publish=yes

https://public.tableau.com/profile/tones#!/vizhome/CrimeByPostcode1995-2017WithSocialPostOverlaid/SocialPostsofCommunityReportsPotentialWarnings?publish=yes


Data Story


Structural Topic Model Project - Crime Data

Sarah Fawcett + Tony Nguy

07/09/2018

library(stm)
library(igraph)
library(stmCorrViz)
library(tidyverse)
library(dplyr)
library(stringr)
library(tidytext)
library(car)
library(reshape2)
library(lubridate)
library(ggpmisc)

Set working directory

setwd("~/Documents/DATA-SCIENCE/GOVHACK/DATA")

Clean Out Old Objects

# rm(list = ls()) clears every object in the workspace, so a separate rm(Crime9517_wide) is not needed
rm(list = ls())

1: INGEST (PROTOTYPE)

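Crime9517_wide <- read.csv("CrimePostcodeData1995_2017.csv", header = TRUE)

Convert to Long Format

# Keep the three identifier columns and stack every other column into variable/value pairs
Crime9517_long <- melt(Crime9517_wide, id.vars = c("Postcode", "Offence.category", "Subcategory"))

# Second long-format copy, written to CSV in the WRITE step
Crime9517longdate <- melt(Crime9517_wide, id.vars = c("Postcode", "Offence.category", "Subcategory"))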
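The wide-to-long step can be illustrated on a toy table. This is a minimal sketch: the postcodes, offence names, counts and the month column labels (Jan.1995, Feb.1995) are made up for illustration and are not taken from the BOCSAR file.

```r
library(reshape2)

# Toy wide table: one row per postcode and offence, one column per month
# (all values and column labels here are hypothetical)
toy_wide <- data.frame(
  Postcode = c(2000, 2010),
  Offence.category = c("Theft", "Assault"),
  Subcategory = c("Motor vehicle theft", "Non-domestic assault"),
  Jan.1995 = c(12, 3),
  Feb.1995 = c(9, 5)
)

# melt() keeps the id columns and stacks the remaining columns:
# the old column name goes into `variable`, the cell into `value`
toy_long <- melt(
  toy_wide,
  id.vars = c("Postcode", "Offence.category", "Subcategory")
)

print(toy_long)
```

Each postcode and offence pair now appears once per month, which is the shape the later type conversions and the model expect.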

2: PREDICT

Test of factor

class(Crime9517_long$variable)

Convert Data to Date If Necessary

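# Preview how lubridate parses the labels as month-day-year; the result is not stored
mdy(Crime9517_long$variable)

# Store the column as Date (as.Date() without a format expects ISO-style labels such as 1995-01-01)
Crime9517_long$variable <- as.Date(Crime9517_long$variable)

# Then store it as a factor, so that each period becomes one categorical level
Crime9517_long$variable <- as.factor(Crime9517_long$variable)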
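If the melted column holds month labels rather than full dates, the conversion needs an explicit parsing step. The sketch below assumes labels of the form "Jan.1995", which is only a guess at how read.csv() would rename the month columns; the helper name month_label_to_date is hypothetical.

```r
# Hypothetical labels, assumed to look like "Jan.1995"
labels <- factor(c("Jan.1995", "Feb.1995", "Dec.2017"))

# Split each label on the dot, look the month up in month.abb,
# and build a Date for the first day of that month
month_label_to_date <- function(x) {
  parts <- strsplit(as.character(x), ".", fixed = TRUE)
  month_number <- match(vapply(parts, `[`, character(1), 1), month.abb)
  year_number <- as.integer(vapply(parts, `[`, character(1), 2))
  as.Date(ISOdate(year_number, month_number, 1))
}

dates <- month_label_to_date(labels)
print(dates)
```

Labels that do not match the assumed pattern come back as NA, which makes a wrong format assumption easy to spot.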

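Convert Postcode, Subcategory and Value to Numeric

class(Crime9517_long$Postcode)

# Encode Subcategory as integer level codes (assigning it to Postcode would overwrite the postcodes)
Crime9517_long$Subcategory <- as.numeric(as.factor(Crime9517_long$Subcategory))

Crime9517_long$Postcode <- as.numeric(Crime9517_long$Postcode)

Crime9517_long$value <- as.numeric(Crime9517_long$value)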

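Generalised Linear Regression Model

# Bare column names plus data = replace attach() and the Crime9517long$ prefixes, so predict() can use newdata
model.glm.crime <- glm(Postcode ~ value + Subcategory + variable, data = Crime9517_long)

Summary

summary(model.glm.crime)

Predict

# type is an argument of predict(), not a column of the new data frame
# predict.glm() has no interval argument; se.fit = TRUE returns standard errors instead
# The newdata columns must have the same types and levels as the columns used to fit the model
predict(model.glm.crime, newdata = data.frame(value = 30, Subcategory = 71, variable = 21017), type = "response", se.fit = TRUE)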
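Because predict.glm() does not return confidence intervals directly, one way to get them is from the standard errors. The following is a minimal sketch on simulated data: the column names (month_index, venues, incidents) and all numbers are hypothetical and chosen only to show the mechanics, not to reproduce the project's model.

```r
# Simulated toy data (not BOCSAR figures)
set.seed(1)
toy <- data.frame(
  month_index = 1:24,
  venues = rep(c(2, 5, 9), times = 8)
)
toy$incidents <- 10 + 0.5 * toy$month_index + 2 * toy$venues + rnorm(24)

# Gaussian GLM (the glm() default family), fitted with a data argument
fit <- glm(incidents ~ month_index + venues, data = toy)

# New observation with the same column names and types as the training data
new_obs <- data.frame(month_index = 25, venues = 5)
pred <- predict(fit, newdata = new_obs, type = "response", se.fit = TRUE)

# 95% confidence interval for the fitted mean, built from the standard error
fit_value <- unname(pred$fit)
margin <- qt(0.975, df.residual(fit)) * unname(pred$se.fit)
ci <- c(lower = fit_value - margin, fit = fit_value, upper = fit_value + margin)
print(ci)
```

For a Gaussian model this matches what predict.lm() reports with interval = "confidence"; for other families the interval is usually computed on the link scale and then transformed back.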

WRITE

write.csv(Crime9517longdate, file = "~/Documents/DATA-SCIENCE/GOVHACK/DATA/Crime9517longdate.csv", row.names=FALSE)


Evidence of Work

Video

Homepage

Team DataSets

Bureau of Crime Statistics and Research (BOCSAR) Monthly data on all criminal incidents recorded by police

Description of Use: We join BOCSAR crime data (22 years of data across NSW postcodes) with layers of geolocated, time-filtered social posts (Twitter, Instagram and Facebook) that have been publicly posted and pinned. We process these posts with NLP for semantic and entity understanding in order to predict crime trends and offer insight to users of our app. Trend predictions for crime in the locations around users help inform them of the risks they might encounter.

Australian Bureau of Statistics (ABS) Counts of Australian Businesses

Description of Use: Our geolocation mapping layers also blend in the locations of pubs and restaurants, which we use to weight crime incidence for factors such as alcohol consumption.

Australian Institute of Health and Welfare Consumption of Alcohol

Description of Use: The AIHW data lets us incorporate trends and predictions in alcohol consumption (particularly the increase in wine consumption in pubs and restaurants).

Social Data Scraped from Twitter, Facebook and Instagram

Description of Use: We join BOCSAR crime data (22 years of data across NSW postcodes) with layers of geolocated, time-filtered social posts (Twitter, Instagram and Facebook) that have been publicly posted and pinned. We process the "Tweet_Body" text with NLP for semantic and entity understanding in order to make predictions and offer insight to users of our app.

NSW State Police Crime Reports

Description of Use: We used a corpus of text from police crime reports to train topic vectors, then scored publicly available social posts against them with a closeness matching index, in order to overlay the posts that are suitable and relevant onto our risk assessment mapping app.
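The idea of a closeness index between report text and social posts can be sketched with plain word counts. This is a simplified stand-in, not the structural topic model used in the project: it scores each post by the cosine similarity of its word-count vector to a report's word-count vector. The example texts and the helper names (tokenize, term_counts, cosine_similarity) are hypothetical.

```r
# Made-up example texts, not real police reports or social posts
report <- "car stolen from station car park overnight"
posts <- c(
  "my car was stolen near the station last night",
  "great coffee at the new cafe on main street"
)

# Lowercase the text and split it into words
tokenize <- function(text) {
  words <- strsplit(tolower(text), "[^a-z]+")[[1]]
  words[nzchar(words)]
}

# Word counts for one text over a shared vocabulary
term_counts <- function(text, vocab) {
  as.numeric(table(factor(tokenize(text), levels = vocab)))
}

# Cosine of the angle between two count vectors (1 = same direction, 0 = no shared words)
cosine_similarity <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

vocab <- unique(unlist(lapply(c(report, posts), tokenize)))
report_vec <- term_counts(report, vocab)
scores <- vapply(
  posts,
  function(p) cosine_similarity(report_vec, term_counts(p, vocab)),
  numeric(1),
  USE.NAMES = FALSE
)
print(scores)
```

Posts scoring above a chosen threshold would be the ones overlaid on the map; the threshold itself would need tuning against real data.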

Challenges

Data4Good

Region: New South Wales

Australians' stories

Region: Australia

Spatial data challenge

Region: New South Wales

Bounty: Mix and Mashup

Region: Australia
