Technicians failed to effectively respond to an escalating IT meltdown at the Tax Office for nearly four hours last year, adding to chaos caused by warning and backup systems that were not fully operational.
The ATO has blamed stressed fibre optic cables and multinational Hewlett Packard for the unprecedented crashes of internal storage networks on December 11 and 12, 2016, and February 2 this year, which caused much of the agency's work to slow to a halt and scores of staff to be sent home.
Nearly a petabyte of data - a million gigabytes - had to be recovered from the outages, with much of the chaos repeated in early February when a cable replacement exercise went wrong and data cards were dislodged, prompting another system shutdown.
A report released on Thursday into what Tax Commissioner Chris Jordan described as the organisation's "worst unplanned system outage in recent memory" found performance of the IT systems had been favoured over stability, resilience and cost, with no planning for the series of events, which interrupted thousands of businesses and individuals in their dealings with the ATO and even put this year's tax return season at risk.
The report said the three-day December failure followed outages on Sydney-based networks, caused in part by stressed fibre optic cabling, and unsuccessful attempts to restore normal operations. The storage networks were unable to provide key services to applications, and control, management and monitoring systems were also not operating.
Despite the first signs of the crash being reported at 3.35am on December 12, the report said the incident was not escalated to a "priority 1" threat level until about 7.00am. "Monitoring and resilience features" had not been enabled, weakening warnings about the meltdown.
The report said there had been no hacking of the systems and no data permanently lost but the ATO website and six of its most popular applications including online services, tax return lodgement, the Australian Business Register and outbound communications all failed.
Signs of problems had been reported to Hewlett Packard over the six months prior to the December crash, prompting replacement of fibre optic cabling. The efforts weren't enough, but the five-day February crash was immediately classed as a "priority 1" event.
"The [storage area networks system] was neither designed nor built to cater for greater than single drive failure or single cage failure," the report said.
"This established a risk to our business due to the large number of business systems that depended on the SAN for normal operation."
A series of recommendations from the report are being worked on, including correcting the faults, enhanced IT management and better communication with clients and the general public. A four-day shutdown of ATO systems over Easter was used for testing of improvements.
Mr Jordan said the outages were likely "unprecedented" but no taxpayer data had been lost and government revenue for 2016–17 had not been impacted.
"All refunds were paid inside our service standards and any affected taxpayers were automatically allowed additional time to lodge or make payments to us," he said.
"We have begun implementing a range of measures to enhance the stability and resilience of our systems, which includes the replacement of the faulty hardware that caused the outages.
"With these measures in place, we are confident that when tax time 2017 commences on July 1 2017, we can match the experience of tax time 2016 and taxpayers will be able to lodge their returns and receive their refunds."
Mr Jordan told Senate estimates hearings a confidential settlement had been reached over the incidents.