Hard disk test 'surprises' Google
The impact of heavy use and high temperatures on hard disk drive failure may be overstated, says a report by three Google engineers. The report examined 100,000 commercial hard drives, ranging from 80GB to 400GB in capacity, used at Google since 2001.
The firm uses "off-the-shelf" drives to store cached web pages and services.
"Our data indicate a much weaker correlation between utilisation levels and failures than previous work has suggested," the authors noted.
A wide variety of manufacturers and models were included in the report, but a breakdown was not provided.
Widely held belief
There is a widely held belief that hard disks which are subject to heavy use are more likely to fail than those used intermittently. It was also thought that hard drives preferred cool temperatures to hotter environments.
The authors wrote: "We expected to notice a very strong and consistent correlation between high utilisation and higher failure rates.
"However our results appear to paint a more complex picture. First, only very young and very old age groups appear to show the expected behaviour."
A hard disk was described as having "failed" if it needed to be replaced.
The report was compiled by Eduardo Pinheiro, Wolf-Dietrich Weber and Luiz Andre Barroso, and was presented at a storage conference in California last week.
In the report the authors said Google had developed an infrastructure which collected "vital information" about all of the firm's systems every few minutes.
'Essentially forever'
The firm then stores that information "essentially forever".
Google employs its own file system to organise the storage of data, using inexpensive commercially available hard drives rather than bespoke systems.
Hard drives less than three years old that are used heavily are less likely to fail than similarly aged hard drives that are used infrequently, according to the report. "One possible explanation for this behaviour is the survival of the fittest theory," said the authors, speculating that drives which failed early in their lifetime had been removed from the overall sample, leaving only the older, more robust units.
The report said that there was a clear trend showing "that lower temperatures are associated with higher failure rates".
"Only at very high temperatures is there a slight reversal of this trend."
But hard drives which were three years old or older were more likely to suffer a failure when used in warmer environments.
"This is a surprising result, which could indicate that data centre or server designers have more freedom than previously thought when setting operating temperatures for equipment containing disk drives," said the authors.
The report also looked at the impact of scan errors - problems found on the surface of a disk - on hard drive failure.
"We find that the group of drives with scan errors are 10 times more likely to fail than the group with no errors," said the authors.
They added: "After the first scan error, drives are 39 times more likely to fail within 60 days than drives without scan errors."
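To illustrate what a "10 times more likely" comparison means, the short sketch below computes the ratio of failure rates between two groups of drives. The counts are invented purely for the example and are not figures from the Google report, and the function name is hypothetical.

```python
def relative_failure_rate(failed_a, total_a, failed_b, total_b):
    """Return how many times more likely group A is to fail than group B.

    Equivalent to (failed_a / total_a) / (failed_b / total_b), written as a
    single division to limit floating-point rounding.
    """
    if total_a <= 0 or total_b <= 0:
        raise ValueError("group sizes must be positive")
    if failed_b == 0:
        raise ValueError("baseline group has no failures; ratio is undefined")
    return (failed_a * total_b) / (failed_b * total_a)


# Hypothetical counts, chosen only to show the arithmetic:
# 200 failures among 1,000 drives with scan errors,
# 20 failures among 1,000 drives without scan errors.
ratio = relative_failure_rate(200, 1000, 20, 1000)
print(f"Drives with scan errors are {ratio:.0f} times more likely to fail")
```

A ratio of this kind compares the two groups with each other; it does not by itself say how likely any individual drive is to fail.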