Youth education and employment
How might we use publicly available data to identify education and employment opportunities for our youth?
Go to Challenge | 25 teams have entered this challenge.
Dizzie
Discover(ie) jobs related to you! We use your personal interests (media consumption on YouTube) and your personality type, together with machine learning, to suggest jobs you might be interested in!
Finding a job is hard; finding a job you enjoy is harder. Most traditional approaches match skill sets to potential careers, which overlooks the individual's potential enjoyment, satisfaction and interest.
Furthermore, current resources such as the Australian Skills Classification website are large and difficult to navigate. They have a high barrier to entry for the general public (due to the number of clicks required) and for young people (due to fairly complex language).
With the abundance of data out there, we aimed to help narrow those choices down.
We have created a platform, Discoverie, which enables this by keying into two additional factors:
1. Your personal interests
2. Your personality type
We do this in a novel way which:
1. Queries your YouTube subscriptions, using the channels you regularly watch as a metric for your interests.
2. Scrapes potential careers for each Myers-Briggs personality type from 4 data sources (Indeed, Glassdoor, Workopolis, NovoResume).
3. Parses the Australian Skills Classification (xls) data: https://www.nationalskillscommission.gov.au/our-work/australian-skills-classification#resources
Note: Interests are represented as Wikipedia entities, e.g. a YouTube channel that discusses physical fitness has the topic https://en.wikipedia.org/wiki/Physical_fitness (I believe this is powered by the Knowledge Graph), which is made available via the YouTube API.
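As a rough illustration of the note above, the sketch below turns topic URLs into interest labels and counts them across channels. It assumes the input is the `items` list of a YouTube Data API `channels.list` response requested with `part=topicDetails`, where topics appear under `topicDetails.topicCategories`. The helper names `topic_label` and `interests_from_channels` and the sample data are made up for this example and are not taken from our codebase.

```python
from collections import Counter
from typing import Dict, List
from urllib.parse import unquote, urlparse


def topic_label(wikipedia_url: str) -> str:
    """Turn a Wikipedia topic URL into a readable interest label."""
    # The last path segment is the article title, with underscores for spaces.
    title = urlparse(wikipedia_url).path.rsplit("/", 1)[-1]
    return unquote(title).replace("_", " ")


def interests_from_channels(channel_items: List[Dict]) -> Counter:
    """Count topic labels across the items of a channels.list response."""
    counts: Counter = Counter()
    for item in channel_items:
        # Channels without topic data are treated as having no topics.
        urls = item.get("topicDetails", {}).get("topicCategories", [])
        counts.update(topic_label(url) for url in urls)
    return counts


# Hand-written sample shaped like an API response (not real data).
sample_items = [
    {"id": "channel-a", "topicDetails": {"topicCategories": [
        "https://en.wikipedia.org/wiki/Physical_fitness",
        "https://en.wikipedia.org/wiki/Lifestyle_(sociology)",
    ]}},
    {"id": "channel-b", "topicDetails": {"topicCategories": [
        "https://en.wikipedia.org/wiki/Physical_fitness",
    ]}},
    {"id": "channel-c"},
]

interests = interests_from_channels(sample_items)
print(interests.most_common(1))  # [('Physical fitness', 2)]
```

Counting topics across all subscribed channels gives a simple weighting: a topic shared by many channels ranks higher than one that appears once.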
We use these 3 data sources to build a corpus of text documents, which is then used to train a gensim Doc2Vec (machine learning) model.
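One possible way to build such a corpus from the CSV exports is sketched below, grouping all text rows under their occupation so that each occupation becomes one document. The column names `Occupation` and `Task`, the function name `corpus_from_csv` and the sample rows are assumptions for illustration only; the real Australian Skills Classification files may use different headers.

```python
import csv
import io
from collections import defaultdict


def corpus_from_csv(csv_file, key_column, text_columns):
    """Group the text of every row under its occupation name."""
    documents = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Join the non-empty text cells of this row into one snippet.
        text = " ".join(row[c].strip() for c in text_columns if row.get(c))
        if text:
            documents[row[key_column].strip()].append(text)
    # One document per occupation: all of its snippets joined together.
    return {occupation: " ".join(parts) for occupation, parts in documents.items()}


# Made-up rows; the real export's column names may differ.
sample = io.StringIO(
    "Occupation,Task\n"
    "Fitness Instructor,Plan exercise programs\n"
    "Fitness Instructor,Demonstrate equipment use\n"
    "Software Engineer,Write and test code\n"
)
corpus = corpus_from_csv(sample, "Occupation", ["Task"])
print(corpus["Fitness Instructor"])
```

The same function could be run once per file (occupation descriptions, specialist tasks, core competencies) and the resulting dictionaries merged before training.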
The Doc2Vec model learns vector representations of text (which can be a sentence or a paragraph) using a neural network and the distributed bag of words approach. This allows us to query for similarity between job descriptions and other data points (e.g. your interests, your personality type).
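A minimal sketch of this training and querying step with gensim is shown below. It assumes gensim 4.x (where document vectors live in `model.dv`) and a dictionary mapping occupation titles to description text. The hyperparameters (`vector_size=50`, `epochs=40`, `min_count=1`) and the helper names are illustrative guesses, not the settings we actually used.

```python
import re


def tokenize(text):
    """Lowercase a document and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train_job_model(job_descriptions, vector_size=50, epochs=40):
    """Train a Doc2Vec model with one tagged document per occupation."""
    # Imported here so tokenize() can be used without gensim installed.
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    corpus = [
        TaggedDocument(words=tokenize(text), tags=[title])
        for title, text in job_descriptions.items()
    ]
    # dm=0 selects the distributed bag of words (PV-DBOW) training mode.
    model = Doc2Vec(vector_size=vector_size, min_count=1, epochs=epochs, dm=0)
    model.build_vocab(corpus)
    model.train(corpus, total_examples=model.corpus_count, epochs=model.epochs)
    return model


def similar_jobs(model, query_text, topn=3):
    """Rank occupation tags by similarity to a free-text query."""
    # Infer a vector for unseen text, then compare it to the trained tags.
    query_vector = model.infer_vector(tokenize(query_text))
    # `model.dv` is the gensim 4.x name; gensim 3.x called it `model.docvecs`.
    return model.dv.most_similar([query_vector], topn=topn)
```

Calling `similar_jobs(model, "physical fitness")` on a trained model returns a list of `(occupation, similarity)` pairs. With a corpus as small as a toy example the rankings are noisy, so meaningful results need the full set of occupation documents.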
To demonstrate this, we created a Flask backend to host our model and a React frontend where users can:
* Enter their personality type
* Select their YouTube interests once logged in
The frontend then asks the backend for similar jobs, ranked by Doc2Vec model similarity, and renders a narrowed-down list of occupations that the user can explore further on the Australian Skills Classification website.
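The request flow between frontend and backend could look roughly like the sketch below. The route `/api/jobs`, the JSON field names `personality` and `interests`, and the injected `find_similar` callable are all hypothetical; they stand in for whatever the real backend exposes around the Doc2Vec model.

```python
def build_query(personality_type, interests):
    """Combine a personality type and interest labels into one query string."""
    parts = [personality_type.strip().upper()] + [i.strip() for i in interests]
    return " ".join(p for p in parts if p)


def create_app(find_similar):
    """Build a Flask app; find_similar(query, topn) returns (job, score) pairs."""
    # Imported here so build_query() can be used without Flask installed.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/api/jobs", methods=["POST"])
    def jobs():
        # silent=True yields None instead of an error on a malformed body.
        payload = request.get_json(silent=True) or {}
        query = build_query(
            payload.get("personality", ""), payload.get("interests", [])
        )
        if not query:
            return jsonify(error="personality or interests required"), 400
        matches = find_similar(query, 5)
        return jsonify(jobs=[{"title": t, "score": float(s)} for t, s in matches])

    return app
```

Passing the similarity function in as an argument keeps the web layer separate from the model, so the endpoint can be exercised with a stub before a trained model is available.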
We also use the ABS API to show job vacancies for each occupation.
This is a novel, low-barrier-to-entry approach that uses machine learning to find jobs you might be interested in based on personality and media consumption! It narrows down the entry point to the Australian Skills Classification website and uses the abundance of valuable YouTube data (especially among Millennials and Gen Z) to characterize the user, which makes it more appealing to the general public.
This seeks to simplify the powerful, but potentially overwhelming, Australian Skills Classification website (https://www.nationalskillscommission.gov.au/our-work/australian-skills-classification) by tailoring results to you, the individual.
Backend: Python, Flask, gensim (Doc2Vec model + corpus labelling), YouTube API
Frontend: React, Material UI
Description of Use: Parsed Occupation Descriptions-Table 1.csv, Specialist tasks-Table 1.csv and Core_competencies-Table 1.csv to create a training corpus for our Doc2Vec model.
Description of Use: Used to show job vacancies for each occupation.
Description of Use: Scraped & parsed the website to create a training corpus for our Doc2Vec model.
Description of Use: Scraped & parsed the website to create a training corpus for our Doc2Vec model.
Description of Use: Scraped & parsed the website to create a training corpus for our Doc2Vec model.
Description of Use: Scraped & parsed the website to create a training corpus for our Doc2Vec model.
Description of Use: Used to query and categorize a user's interests based on their YouTube subscriptions and to create a training corpus for our Doc2Vec model. These interests are represented as Wikipedia entities, e.g. a YouTube channel that discusses physical fitness has the topic https://en.wikipedia.org/wiki/Physical_fitness (I believe this is powered by the Knowledge Graph). Note: we also used https://pypi.org/project/python-youtube/, but https://developers.google.com/youtube/v3/docs is the underlying resource.
Eligibility: Teams are encouraged to use datasets from a variety of sources, but at least one must be drawn from the ABS Data API.
Go to Challenge | 20 teams have entered this challenge.
Eligibility: Participants must use the Australian Skills Classification dataset.