ACC Datasets
Description:
Information recorded in the process of administering a claim and rehabilitation where needed. This is operational data represented and accessed via the ACC data warehouse known as In Fact. Data is available from 1974 although more recent data is of better quality and more complete.
Additional information:
Have_(encrypted)_NHI | Yes |
Personally identifiable (e.g. linked to NHI numbers) and longitudinal or aggregated (e.g. for planning, clinical research etc.)? | ACC claims and claimants can be linked to MOH data, particularly the NMDS (National Minimum Data Set). Claimants can be identified via NHI and ACC claim numbers (mostly ACC45 claim form numbers). ACC records full details of claimants (name, address, date of birth, etc.), including demographics. |
Volume of data (e.g. how many records) Since when? | ACC processes about 1.7 million claims per annum at a total cost of about $2.5B for medical, compensation and rehabilitation costs. |
Purpose and governance including ethics committee/patient consent mechanisms. Q: How do you get around ethics/privacy issues with your data sources? Esp. DHBs? | Application for privacy level data must be made to the ACC Ethics Committee (details on the ACC website, acc.co.nz, in the section "About ACC"). If use is beyond bona fide research purposes, ACC will solicit the claimant's consent before releasing information. |
Scope | National |
Does the data contain diagnoses and clinical outcomes? Does the data contain procedures, device information and medication for therapy? Does this data set have cost / price data? | 1. The data does contain diagnostic information, both primary and secondary diagnoses, in READ, ICD9 and ICD10 coding depending on when and by whom the information was captured. PHOs and DHBs capture the information and submit it to ACC. 2. Some procedure/medication information is recorded as part of the treatment and rehabilitation of injured claimants. 3. ACC pays for services directly in some cases, but the medical treatment provided by DHBs is bulk funded and is not recorded in ACC's systems apart from the fact that it was an ACC related case. |
Presence of Data dictionary? Column headings in Excel or any kind of data model if residing in a relational database (e.g. Access, SQL Server, Oracle etc.) | A data dictionary is not directly available because this data comes from operational systems. However, descriptions of subsets of data can be provided based on specific requirements. The data is hosted in an Oracle database using OBIEE as a front end tool. In some cases SAS datasets are still being used, but these are being phased out. |
Linked (or linkable) to other datasets within your organisation or across the Sector | Can be linked to levy payers in ACC and external datasets such as CAS (motor vehicle traffic accidents) and NMDS datasets as described. |
How often does this data set get updated? Daily? Weekly? Monthly? Quarterly? Yearly? | Updated daily from production systems. |
Indication of data quality (e.g. missing values, duplications, inconsistencies etc.). Q: Audits? How do you ensure the data is valid and correct? | Data quality is in line with the main function performed by ACC (paying for claims-related costs), including the management of long-term cases and rehabilitation. Overall quality is rated as good. Detailed information about specific data areas can be provided where required. |
Brief info about the systems and processes used to collect/manage data. Q: Where the data is collected, in what form, and accessibility? | Medical practitioners/DHBs record injury information and submit this to ACC. In about 90% of cases only medical costs are involved. In more severe cases, where compensation for loss of earnings is paid as well as rehabilitation, this area is managed by ACC but provided by external providers. Access to ACC data is by request; if confidential information is required, access must be applied for via the ACC Ethics Committee for review and approval. |
Data format, e.g., data structure, data types, and storage form (relational database, Excel, csv, etc.). | Data can be provided as extracted CSV files, encrypted if they contain private information, and supplied on CD because of the data volumes involved (typically 500MB for 2001-2012 claims data). |
How well the data is structured, e.g. free text VS coded text VS pick-list (drop-down list) | Most data fields are structured, with very few free text fields. |
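
Example (illustrative only): the entry above states that extracts are supplied as CSV files and that claims can be linked to other datasets such as NMDS via the (encrypted) NHI. The sketch below shows roughly how a recipient might load two such extracts and link them. The real extract layout is not documented here, so every column name (`claim_number`, `encrypted_nhi`, `diagnosis_code`, `coding_system`, `event_id`) and every value is a hypothetical placeholder; the inline strings stand in for decrypted CSV files.

```python
import csv
import io

# Synthetic stand-ins for two CSV extracts. All column names and values are
# hypothetical placeholders, not the real ACC or NMDS layout.
CLAIMS_CSV = """claim_number,encrypted_nhi,diagnosis_code,coding_system
C001,ENC-AAA,DX-1,ICD10
C002,ENC-BBB,DX-2,READ
C003,,DX-3,ICD9
"""

NMDS_CSV = """encrypted_nhi,event_id
ENC-AAA,E100
ENC-AAA,E101
ENC-CCC,E200
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def link_on_nhi(claims, events, key="encrypted_nhi"):
    """Left-join claims to events on a shared key.

    Claims with a blank key, or with no matching event, are kept with an
    empty list so that unlinked records remain visible.
    """
    events_by_key = {}
    for event in events:
        events_by_key.setdefault(event[key], []).append(event)

    linked = []
    for claim in claims:
        nhi = claim[key]
        matches = events_by_key.get(nhi, []) if nhi else []
        linked.append({**claim, "event_ids": [m["event_id"] for m in matches]})
    return linked


claims = read_rows(CLAIMS_CSV)
events = read_rows(NMDS_CSV)
linked = link_on_nhi(claims, events)

for row in linked:
    print(row["claim_number"], row["coding_system"], row["event_ids"])
```

Keeping unmatched claims in the output, rather than dropping them, makes it easier to check how complete the linkage is. The `coding_system` column is carried through because diagnoses may be coded in READ, ICD9 or ICD10 depending on when and by whom they were captured.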