Project Description
SmartTax AI is a proof of concept that embeds a chatbot assistant directly inside myTax. It uses open ATO datasets to provide proactive, personalised nudges about deductions, offsets, and superannuation contributions. Instead of relying on generic help text, it guides taxpayers based on what people in similar income bands, occupations, or age groups typically claim.
Every suggestion is transparent and backed by data, with a simple “Why am I seeing this?” link. The aim is to make tax easier, fairer, and less stressful for individuals, while also helping the ATO reduce errors and compliance costs.
Data Story
I built SmartTax AI on the idea that tax compliance becomes easier when insights from real taxpayer behaviour are brought directly into the myTax experience. To demonstrate this, I worked with five ATO open datasets.
I started with Individual Income Tax Return Data (Table 6A), which provides detailed information on income, tax paid, and work-related deductions across states, postcodes, and income levels. From this dataset I extracted patterns showing that, for example, low-income workers in Queensland typically claimed small amounts for uniforms, while mid-income earners had higher average car and travel expenses. These cohort insights formed the basis of proactive deduction nudges.
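The cohort averages described above can be sketched roughly as follows. This is a minimal illustration, not the notebook code itself: it assumes each row of the aggregated table carries a claimant count and a total dollar amount, and the field names (`state`, `income_band`, `item`, `claimants`, `total_amount`) and all figures are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows shaped like an aggregated table: each row carries a
# claimant count and a total dollar amount. All values are made up.
ROWS = [
    {"state": "QLD", "income_band": "low", "item": "uniform",
     "claimants": 200, "total_amount": 30000},
    {"state": "QLD", "income_band": "low", "item": "uniform",
     "claimants": 100, "total_amount": 12000},
    {"state": "QLD", "income_band": "mid", "item": "car_travel",
     "claimants": 50, "total_amount": 100000},
]


def cohort_averages(rows):
    """Return the average claim per claimant for each (state, income_band, item)."""
    totals = defaultdict(lambda: [0, 0])  # key -> [claimants, total_amount]
    for row in rows:
        key = (row["state"], row["income_band"], row["item"])
        totals[key][0] += row["claimants"]
        totals[key][1] += row["total_amount"]
    # Skip cohorts with no claimants to avoid dividing by zero.
    return {
        key: amount / count
        for key, (count, amount) in totals.items()
        if count > 0
    }


averages = cohort_averages(ROWS)
```

Summing counts and totals before dividing gives a claimant-weighted average, which avoids the bias of averaging per-postcode averages directly.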
Next, Individual Income Tax Return Data (Table 14A/14B) allowed me to link deductions to occupations. This showed, for instance, that hospitality workers often claim uniforms and that professional occupations frequently claim self-education expenses. Embedding these insights into SmartTax AI makes the guidance feel personal and job-specific.
I then turned to Superannuation Contribution Data (Tables 22, 23A, 24A), which break down contributions by account balance, income range, age, sex, and state. By analysing these cohorts, I could show when a taxpayer’s contributions were below average for their age or income group, or when they were close to hitting annual contribution caps. This enabled SmartTax AI to provide nudges about topping up superannuation or avoiding excess contributions.
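One way to express the two superannuation checks is sketched below. The function name `super_nudge`, the `near_cap_ratio` threshold, and the dollar figures in the usage lines are all assumptions for illustration; the real cap and cohort average would come from the relevant rules and tables.

```python
def super_nudge(contribution, cohort_average, cap, near_cap_ratio=0.9):
    """Pick a nudge type for a taxpayer's superannuation contributions.

    contribution   -- the taxpayer's contributions for the year
    cohort_average -- average contribution for their age or income cohort
    cap            -- the annual contribution cap (passed in, not hard-coded)
    near_cap_ratio -- hypothetical threshold for "close to the cap"
    """
    # The cap check comes first: warning about excess contributions
    # matters more than suggesting a top-up.
    if contribution >= near_cap_ratio * cap:
        return "near_cap"
    if contribution < cohort_average:
        return "below_average"
    return None  # No nudge needed.


# Usage with made-up numbers:
low = super_nudge(2000, cohort_average=5000, cap=10000)
high = super_nudge(9500, cohort_average=5000, cap=10000)
```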
All datasets were cleaned, aggregated, and analysed in Python Jupyter notebooks. From this, I produced summary tables and visualisations showing average claims and contribution rates across different cohorts. These outputs were then linked to example nudges in my chatbot prototype. Each nudge includes a “Why am I seeing this?” link that cites the dataset, table, and year, making the assistant transparent and trustworthy.
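A nudge that carries its own citation could be modelled along these lines. This is a sketch under assumptions: the `Nudge` class, its field names, and the wording of the explanation are hypothetical, and the year is left as a placeholder rather than a real release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nudge:
    """A suggestion plus the source it was derived from."""
    message: str
    dataset: str
    table: str
    year: str

    def why(self):
        """Text shown behind the "Why am I seeing this?" link."""
        return f"Based on {self.dataset}, {self.table}, {self.year}."


# Example nudge; the message and the year placeholder are illustrative only.
example = Nudge(
    message="People in your occupation often claim uniform expenses.",
    dataset="Individual Income Tax Return Data",
    table="Table 14A",
    year="YYYY-YY",
)
explanation = example.why()
```

Keeping the dataset, table, and year on the nudge itself means a suggestion cannot be shown without a source to cite.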
In short, my data story is about transforming raw, aggregated tax and superannuation data into actionable, personalised guidance. The datasets are not just numbers in spreadsheets, but evidence that powers nudges to help taxpayers make better decisions, reduce mistakes, and improve financial wellbeing.