Insolvency Risk Profiler

Project Info

Team Name

Insolvency Oracle

Team Members

.*? and 3 other members with unpublished profiles.

Project Description

We built an AI-based platform that uses information about an individual to quantify their relative insolvency risk.

This relative risk is expressed as a single number indicating how many times more or less likely than average a given individual is to become insolvent.

We also used this model to derive broad demographic trends in personal insolvency. Geographic insolvency trends are shown on an interactive map which highlights SA3 regions according to their expected insolvency rates. Trends relating to occupation, gender, and family composition are also visualised.

Data Story

Overview

We wanted to estimate, using a Bayesian AI model, the likelihood of a person becoming personally insolvent, given certain information about them. To do this, we needed the marginal and prior distributions for each variable. Statistics on these variables for the general population were difficult to obtain, and some key variables had to be abandoned (e.g. assets, liabilities). However, a few key variables could be matched to census data.
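As a rough illustration of why both the insolvency data and the census data are needed, Bayes' rule for a single variable can be sketched as below. This is a minimal sketch, not the project's code: the function name, the category labels, the counts, and the overall insolvency rate are all hypothetical values chosen for the example.

```python
def posterior_by_category(debtor_counts, population_counts, base_rate):
    """Estimate Pr(insolvent | category) with Bayes' rule.

    debtor_counts: category -> number of debtors (from insolvency records)
    population_counts: category -> number of people (from census tables)
    base_rate: assumed overall Pr(insolvent); a hypothetical input here
    """
    n_debtors = sum(debtor_counts.values())
    n_people = sum(population_counts.values())
    posterior = {}
    for category, people in population_counts.items():
        if people == 0:
            continue  # no census population, so the estimate is undefined
        likelihood = debtor_counts.get(category, 0) / n_debtors  # Pr(category | insolvent)
        marginal = people / n_people                             # Pr(category)
        posterior[category] = likelihood * base_rate / marginal
    return posterior


# Hypothetical counts, for illustration only.
debtors = {"couple_with_children": 30, "one_parent": 50, "other": 20}
census = {"couple_with_children": 500, "one_parent": 200, "other": 300}
result = posterior_by_category(debtors, census, base_rate=0.01)
```

The census supplies the denominator Pr(category), which is why a variable with no matching population statistics cannot be used in this kind of estimate.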

Variables of interest

The variables which were common to the given non-compliance-in-personal-insolvencies.csv dataset and 2016 census data are:
- SA3 of debtor
- Family situation of debtor (Census dataset B25 SA3)
- Sex of debtor (Census dataset B57A SA3)
- Debtor occupation code (these seem to be Sub-Major Groups in the ANZSCO ontology, see http://www.abs.gov.au/ANZSCO; the closest relevant dataset was B57A SA3, which used ANZSCO Major Groups)

Approach

Because we don't have the joint distribution of Debtor occupation and family situation, we can't do this with a single model.
Instead, we'll have to construct two models:
- Estimating Pr(non-compliance) given SA3, sex, and family situation
- Estimating Pr(non-compliance) given SA3, sex, and debtor occupation
We then needed a way to combine these two predictors into a single number. Normalising each prediction by the marginal probability of non-compliance and then adding the results in quadrature seemed to be a sensible option.
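The combination step could look something like the sketch below. The text does not give the exact formula, so this is one possible reading: each model's probability is divided by the marginal probability, and the two ratios are combined as a root-mean-square. The division by 2 inside the square root is an assumption added here so that two average scores combine to exactly 1. Function names and the example probabilities are hypothetical.

```python
import math


def relative_risk(probability, marginal_probability):
    """Express a predicted probability in units of the marginal probability."""
    return probability / marginal_probability


def combine_in_quadrature(risk_family, risk_occupation):
    """Combine two relative risks in quadrature.

    Assumption: the sum of squares is halved (root-mean-square) so that
    two average scores (1.0, 1.0) combine to 1.0.
    """
    return math.sqrt((risk_family ** 2 + risk_occupation ** 2) / 2)


# Hypothetical predictions from the two models for one person.
marginal = 0.02
risk_from_family_model = relative_risk(0.03, marginal)      # about 1.5x average
risk_from_occupation_model = relative_risk(0.05, marginal)  # about 2.5x average
combined = combine_in_quadrature(risk_from_family_model, risk_from_occupation_model)
```

A property of quadrature worth noting is that the larger of the two scores dominates the result, so a person flagged strongly by one model stays flagged even if the other model rates them as average.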

So we calculated the average expected risk of non-compliance (i.e. the marginal risk of non-compliance), and expressed every prediction in units of this quantity (e.g. person X is 2.5x more likely to be non-compliant than average given their demographic information).
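A minimal sketch of this normalisation, assuming the marginal risk is taken as the population-weighted average of the per-group predictions. The group labels, predicted probabilities, and population shares below are hypothetical.

```python
def marginal_risk(predicted, population_share):
    """Population-weighted average of the predicted risks."""
    return sum(predicted[group] * population_share[group] for group in predicted)


def in_units_of_average(predicted, population_share):
    """Rescale each prediction so that the average person scores 1.0."""
    average = marginal_risk(predicted, population_share)
    return {group: risk / average for group, risk in predicted.items()}


# Hypothetical predictions and population shares (shares sum to 1).
predicted = {"group_a": 0.01, "group_b": 0.05, "group_c": 0.02}
share = {"group_a": 0.5, "group_b": 0.2, "group_c": 0.3}

average = marginal_risk(predicted, share)
relative = in_units_of_average(predicted, share)
```

With these made-up numbers the average risk is 0.021, so group_b comes out at roughly 2.4 times the average.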

Evidence of Work

Video

Homepage

Team DataSets

ArcGIS Data for SA3 Geographic Boundaries

Description of Use: We used this to plot SA3 boundaries on an interactive map.

Data Set

Non-compliance in personal insolvencies

Description of Use: We used this data to build the posterior distributions in our Bayesian learner.

Data Set

2016 Census GCP Statistical Area 3 for AUST

Description of Use: We used these to build the prior and marginal distributions for our Bayesian learner for family composition, sex, and occupation by ANZSCO Major Groups.

Data Set

Challenge Entries

Bounty: Is seeing truly believing?

How can we tell a story with visualisations, that speaks the truest representation of our data?

28 teams have entered this challenge.

Show Us The Numbers

How can we use open finance data to turn numbers into stories?

13 teams have entered this challenge.

To bankruptcy or not to bankruptcy, keeping the process real.

Helping predict non-compliance in the personal insolvency system. How can Artificial Intelligence and Machine Learning assist us in the future?

Eligibility: Must use this dataset: https://data.gov.au/dataset/non-compliance-personal-insolvencies

13 teams have entered this challenge.

More than apps and maps: help government decide with data

How can we combine data to help government make their big and small decisions? Government makes decisions every day—with long term consequences such as the location of a school, or on a small scale such as the rostering of helpdesk staff.

Eligibility: Use at least two data sets (at least one from data.gov.au) to help government make a decision that will improve services for people. Any code produced for your entry must be published on github under an open license. If your entry is not software, you will need to show the working behind your use of data along with any calculations and analysis you did. You must indicate which specific government agency (at any level of government) can take action based on your entry.

58 teams have entered this challenge.
