Our project, DataStat, is a website and database that lets users search, view, and download raw data that has been scraped and cleaned from existing open data portals and catalogs.
The site will compile data from existing individual databases in one location, saving valuable time when searching in the field. DataStat will include existing metadata along with value-added search and usage metrics for each set of raw data, and will use these to rank the data against similar sets.
Our design relies on dataset metadata sourced from CKAN metadata APIs. Once the data is acquired, an algorithm generates additional metadata from previously unused variables. The data and metadata are then run through statistical analysis software with an integrated visualisation package, and the results are uploaded to the website for public use.
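As an illustration of the metadata-generation step, here is a minimal sketch in Python. It assumes each dataset arrives as a CKAN-style package dictionary with the usual `resources`, `tags`, `notes` and `metadata_modified` fields. The function name `derive_metadata` and the particular derived fields are hypothetical choices for this example, not the algorithm the team actually used.

```python
from datetime import datetime, timezone


def derive_metadata(package, now=None):
    """Derive extra descriptive fields from a CKAN-style package dict.

    The derived fields below are illustrative assumptions only.
    """
    now = now or datetime.now(timezone.utc)
    resources = package.get("resources", [])

    # Distinct file formats across all resources, normalised to upper case.
    formats = sorted(
        {r["format"].upper() for r in resources if r.get("format")}
    )

    # Age of the metadata in whole days; timestamps without an offset
    # are assumed to be UTC.
    days_since_update = None
    modified = package.get("metadata_modified")
    if modified:
        stamp = datetime.fromisoformat(modified)
        if stamp.tzinfo is None:
            stamp = stamp.replace(tzinfo=timezone.utc)
        days_since_update = (now - stamp).days

    return {
        "name": package.get("name"),
        "resource_count": len(resources),
        "formats": formats,
        "tag_count": len(package.get("tags", [])),
        "has_description": bool((package.get("notes") or "").strip()),
        "days_since_update": days_since_update,
    }
```

Fields such as `resource_count` or `days_since_update` could then feed the ranking of a dataset against similar sets, although how they would be weighted is left open here.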
Evidence of Work
Queensland API Data Retrieval
Description of Use: The API script performs web queries against CKAN and Drupal portals. It is intended to interact with open data websites, extract metadata, eliminate anomalies, and then host the result on DataStat. Additionally, users can access the API key and so customise the code for personal use. The end product is a dynamic, interactive, and user-friendly data visualisation framework built on the CKAN module.
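A retrieval script of this kind might look roughly like the Python sketch below. It uses CKAN's `package_search` action. The portal address in `BASE_URL`, the list of fields in `KEEP`, and the definition of an "anomaly" (a record with no name, a duplicate name, or no resources) are all assumptions made for the example rather than details taken from the team's script.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of the Queensland open data portal.
BASE_URL = "https://www.data.qld.gov.au"

# Hypothetical subset of metadata fields to keep for each dataset.
KEEP = ("name", "title", "notes", "metadata_modified", "license_id")


def build_search_url(query, rows=10, start=0, base_url=BASE_URL):
    """Build a CKAN package_search URL for a free-text query."""
    params = urlencode({"q": query, "rows": rows, "start": start})
    return f"{base_url}/api/3/action/package_search?{params}"


def fetch_packages(query, rows=10):
    """Query the portal and return the list of matching package dicts."""
    with urlopen(build_search_url(query, rows), timeout=30) as response:
        payload = json.load(response)
    return payload["result"]["results"]


def clean_packages(packages):
    """Keep selected metadata and drop records treated as anomalies."""
    cleaned, seen = [], set()
    for package in packages:
        name = package.get("name")
        # Anomalies (assumed): missing name, duplicate name, no resources.
        if not name or name in seen or not package.get("resources"):
            continue
        seen.add(name)
        cleaned.append({key: package.get(key) for key in KEEP})
    return cleaned
```

A typical call would be `clean_packages(fetch_packages("weather"))`, which needs network access to the portal.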
Weather Data QLD
Description of Use: Used in conjunction with the API scripts and CKAN to generate a new metadata interface.
Description of Use: Metadata is communicated to DataStat through API queries. Using the metadata and API as a resource, and the dataset as the core entity, the CKAN module offers a robust framework for dynamic data visualisation on a web interface that is readily available on the DataStat site. Responses are returned in JSON format. CKAN also logs metadata changes, and therefore supports viewing and reverting changes as well as metadata exchange.
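To show what handling the JSON response can involve, here is a small Python sketch. CKAN's action API wraps every reply in an envelope with a `success` flag and either a `result` or an `error` object. The helper `parse_action_response` and the `CkanError` exception are hypothetical names introduced for this example.

```python
import json


class CkanError(Exception):
    """Raised when a CKAN action response reports failure."""


def parse_action_response(text):
    """Decode a CKAN action API reply and return its result payload."""
    payload = json.loads(text)
    if not payload.get("success"):
        # Failed calls carry an "error" object, usually with a message.
        error = payload.get("error") or {}
        raise CkanError(error.get("message", "CKAN request failed"))
    return payload["result"]
```

Checking the `success` flag before reading `result` keeps failed queries from being passed on to DataStat as empty datasets.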