Paper accepted at IRI 2016!

The paper "Improving the Utility of Anonymized Datasets through Dynamic Evaluation of Generalization Hierarchies" has been accepted for publication at the IEEE 17th International Conference on Information Reuse and Integration.


This paper [1] presents a method to evaluate Value Generalisation Hierarchies — tree-like structures used in the data anonymisation process — with the aim of improving their effectiveness. The paper will be accessible online soon, but in the meantime you can have a look at the abstract:

The dissemination of textual personal information has become a key driver for innovation and value creation. However, due to the possible content of sensitive information, this data must be anonymized, which can reduce its usefulness for secondary uses. One of the most used techniques to anonymize data is generalization. However, its effectiveness can be hampered by the Value Generalization Hierarchies (VGHs) used to dictate the anonymization of data, as poorly-specified VGHs can reduce the usefulness of the resulting data. To tackle this problem, we propose a metric for evaluating the quality of textual VGHs used in anonymization. Our evaluation approach considers the semantic properties of VGHs and exploits information from the input datasets to predict with higher accuracy (compared to existing approaches) the potential effectiveness of VGHs for anonymizing data. As a consequence, the utility of the resulting datasets is improved without sacrificing the privacy goal. We also introduce a novel rating scale to classify the quality of the VGHs into categories to facilitate the interpretation of our quality metric for practitioners.
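If you have never seen a VGH, here is a rough idea of what generalization with one looks like. This is only a toy sketch: the `VGH` dictionary, its job-title values and the `generalize` helper are made up for illustration and are not taken from the paper, and the code does not implement the quality metric the paper proposes.

```python
# Hypothetical toy VGH for a "job" attribute: each value maps to its parent.
# The values and structure are illustrative only, not taken from the paper.
VGH = {
    "nurse": "healthcare",
    "surgeon": "healthcare",
    "teacher": "education",
    "professor": "education",
    "healthcare": "any",
    "education": "any",
}


def generalize(value, levels, vgh=VGH):
    """Climb `levels` steps up the hierarchy, stopping at the root."""
    for _ in range(levels):
        if value not in vgh:  # the root has no parent
            break
        value = vgh[value]
    return value


records = ["nurse", "surgeon", "teacher"]
level1 = [generalize(r, 1) for r in records]
level2 = [generalize(r, 2) for r in records]
print(level1)  # ['healthcare', 'healthcare', 'education']
print(level2)  # ['any', 'any', 'any']
```

Each step up the tree hides more detail: at level 1 a nurse and a surgeon become indistinguishable, and at level 2 every record collapses to the same value. How values are grouped in the hierarchy therefore determines how much useful information survives anonymization.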

This research is mainly led by Vanessa Ayala-Rivera. It will be presented at the IEEE IRI conference, which will be held in Pittsburgh, PA, USA, in July. It supplements a series of other papers on the same topic [2, 3, 4].

It’s the second time I’m lucky enough to get a paper accepted to this conference. The first time, in 2013, our paper [5] received the best paper award. Let’s hope this one will be appreciated too.


References

[1] V. Ayala-Rivera, C. Thorpe, T. Cerqueus, L. Murphy. Improving the Utility of Anonymized Datasets through Dynamic Evaluation of Generalization Hierarchies. In IEEE 17th International Conference on Information Reuse and Integration (IEEE IRI 2016), pp. 30-39, 2016.

[2] V. Ayala-Rivera, P. McDonagh, T. Cerqueus, L. Murphy. A Systematic Comparison and Evaluation of k-Anonymization Algorithms for Practitioners. In Transactions on Data Privacy (7:3), pp. 337-370, 2014.

[3] V. Ayala-Rivera, P. McDonagh, T. Cerqueus, L. Murphy. Ontology-Based Quality Evaluation of Value Generalization Hierarchies for Data Anonymization. In 6th International Conference on Privacy in Statistical Databases (PSD 2014), 2014.

[4] V. Ayala-Rivera, P. McDonagh, T. Cerqueus, L. Murphy. Synthetic Data Generation using Benerator Tool. Technical report UCD-CSI-2013-03, November 2013.

[5] S. T. Buda, T. Cerqueus, J. Murphy, M. Kristiansen. VFDS: Very Fast Database Sampling System. In 14th IEEE International Conference on Information Reuse and Integration (IEEE IRI 2013), pp. 153-160, 2013.