A new approach to crunching website visitor data promises the best of both worlds: accuracy and privacy.
Data gleaned from people's behaviour online is an important tool in everything from marketing to social planning, but the more data is collected about consumers, the less control they have over their privacy.
The trick is in knowing as much as possible about you without identifying you as an individual, and computer scientists from Saarland University and the Center for IT Security, Privacy and Accountability (CISPA), in Germany, and the Italian IMT Institute for Advanced Studies might have cracked the code.
Their technology, known as Privada, uses peer-to-peer file sharing as the inspiration to send parts of website visitor data to different servers for processing and storage.
When Privada collects a behavioural metric on visitors (women aged 35-45, for example) it sends it to a third-party server. Other metrics are sent to other servers, so no central database has the complete picture.
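The idea of spreading metrics across servers can be illustrated with a short Python sketch. It is not Privada's implementation: the function name `distribute`, the server names, the metric names and the counts are all hypothetical, and the round-robin assignment is an assumption made purely for illustration.

```python
from collections import defaultdict


def distribute(metrics, servers):
    """Assign each metric to exactly one server, round-robin.

    With at least two servers and two metrics, no single server
    ends up holding the complete set of metrics.
    """
    stores = defaultdict(dict)
    for i, (metric, count) in enumerate(sorted(metrics.items())):
        server = servers[i % len(servers)]
        stores[server][metric] = count
    return dict(stores)


# Hypothetical visitor metrics and third-party servers.
metrics = {
    "women_35_45": 1200,
    "men_18_24": 800,
    "returning_visitors": 450,
}
servers = ["server_a", "server_b", "server_c"]

stores = distribute(metrics, servers)
for server, held in stores.items():
    print(server, held)
```

Each server in this sketch sees only the metric it was handed, so the full picture exists only if the servers pool their records.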
Each server then adds up to 10 per cent of data "noise" to its records, enough to keep any single user from being identified while leaving the reassembled data 90 per cent accurate.
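A rough sketch of what adding noise to a record might look like follows. The article does not say how Privada draws its noise, so the uniform distribution used here is an assumption, as are the names `add_noise` and `MAX_NOISE` and the example counts.

```python
import random

MAX_NOISE = 0.10  # "up to 10 per cent", as described above


def add_noise(count, rng):
    """Shift a count by a random amount of at most MAX_NOISE of its value.

    Uniform noise is an assumption for illustration only; the real
    system may draw its noise from a different distribution.
    """
    return count * (1 + rng.uniform(-MAX_NOISE, MAX_NOISE))


rng = random.Random(42)  # seeded so the sketch is repeatable

# Hypothetical true counts held by the servers.
true_counts = {"women_35_45": 1200, "men_18_24": 800}

noisy_counts = {m: add_noise(c, rng) for m, c in true_counts.items()}
relative_errors = {
    m: abs(noisy_counts[m] - c) / c for m, c in true_counts.items()
}

for metric, error in relative_errors.items():
    print(f"{metric}: off by {error:.1%}")
```

Because every count moves by no more than a tenth of its value, each reported figure stays within 10 per cent of the truth, which is one way to read the 90 per cent accuracy figure.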
"It's a bit like tearing a picture apart and giving pieces to friends," explains PhD student Fabienne Eigner, who is part of the development team. "They can only see the whole image if they put their pieces together."
At that level of accuracy, the data would still be of value to some businesses.
Bert-Jan van Essen, spokesman for Singapore wealth-management provider Dragon Wealth, said even aggregated data that doesn't identify individual users is "extraordinarily useful". The company collects contextual information about its users based on questions and online behaviour.
"It's not necessary for us to know who the user is to explain what other people are doing," he says.
Thomas P Keenan, adjunct professor of computer science at the University of Calgary and author of the book Technocreep, isn't convinced.
He refers to a paper written by the Privada team that says the system assumes "the majority of the computation parties are not colluding", an assumption Dr Keenan considers to belong to the "pre-Snowden" world.
"All our internet traffic is being monitored and collected, at least by governments and quite possibly by others," he said. "Someone with good cryptographic expertise could attack the cryptography if they had enough reason to do it."
Dana Simberkoff, chief compliance and risk officer of compliance and governance software vendor AvePoint, believes that giving users control over what websites collect is more important than finding better ways to collect data.
"The only way to fully ensure consumer privacy protection is not to take their information at all," she says. "It comes back to educating consumers on the risks they're taking with their data and transparency and accountability from those who collect it."
Privada project lead Dr Aniket Kate said that while the project's architecture and basic implementation are ready, more work is needed on usability and the interface before it can be made public.
There was no significant commercial interest in the technology yet, but he said the team would publish a new scientific paper, after which he expected commercial interest to follow.