Protecting Privacy Using k-Anonymity
Journal title: | Journal of the American Medical Informatics Association |
---|---|
Authors and corporate bodies: | El Emam, Khaled; Dankar, Fida Kamal |
In: | Journal of the American Medical Informatics Association, 15, 2008, 5, pp. 627-637 |
Format: | E-Article |
Language: | English |
Published: | Oxford University Press (OUP) |
Subjects: | Health Informatics |
openURL | url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fvufind.svn.sourceforge.net%3Agenerator&rft.title=Protecting+Privacy+Using+k-Anonymity&rft.date=2008-09-01&genre=article&issn=1067-5027&volume=15&issue=5&spage=627&epage=637&pages=627-637&jtitle=Journal+of+the+American+Medical+Informatics+Association&atitle=Protecting+Privacy+Using+k-Anonymity&aulast=Dankar&aufirst=Fida+Kamal&rft_id=info%3Adoi%2F10.1197%2Fjamia.m2716&rft.language%5B0%5D=eng |
SOLR | |
_version_ | 1792345308028469251 |
author | El Emam, Khaled; Dankar, Fida Kamal
author_facet | El Emam, Khaled; Dankar, Fida Kamal
author_sort | el emam, khaled |
container_issue | 5 |
container_start_page | 627 |
container_title | Journal of the American Medical Informatics Association |
container_volume | 15 |
description | Abstract. Objective: There is increasing pressure to share health information and even make it publicly available. However, such disclosures of personal health information raise serious privacy concerns. To alleviate such concerns, it is possible to anonymize the data before disclosure. One popular anonymization approach is k-anonymity. There have been no evaluations of the actual re-identification probability of k-anonymized data sets. Design: Through a simulation, we evaluated the re-identification risk of k-anonymization and three different improvements on three large data sets. Measurement: Re-identification probability is measured under two different re-identification scenarios. Information loss is measured by the commonly used discernability metric. Results: For one of the re-identification scenarios, k-Anonymity consistently over-anonymizes data sets, with this over-anonymization being most pronounced with small sampling fractions. Over-anonymization results in excessive distortions to the data (i.e., high information loss), making the data less useful for subsequent analysis. We found that a hypothesis testing approach provided the best control over re-identification risk and reduces the extent of information loss compared to baseline k-anonymity. Conclusion: Guidelines are provided on when to use the hypothesis testing approach instead of baseline k-anonymity.
doi_str_mv | 10.1197/jamia.m2716 |
facet_avail | Online, Free |
finc_class_facet | Medicine, Computer Science
format | ElectronicArticle |
format_de105 | Article, E-Article |
format_de14 | Article, E-Article |
format_de15 | Article, E-Article |
format_de520 | Article, E-Article |
format_de540 | Article, E-Article |
format_dech1 | Article, E-Article |
format_ded117 | Article, E-Article |
format_degla1 | E-Article |
format_del152 | Book
format_del189 | Article, E-Article |
format_dezi4 | Article |
format_dezwi2 | Article, E-Article |
format_finc | Article, E-Article |
format_nrw | Article, E-Article |
geogr_code | not assigned |
geogr_code_person | not assigned |
id | ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTE5Ny9qYW1pYS5tMjcxNg |
imprint | Oxford University Press (OUP), 2008 |
imprint_str_mv | Oxford University Press (OUP), 2008 |
institution | DE-Zwi2, DE-D161, DE-Gla1, DE-Zi4, DE-15, DE-Pl11, DE-Rs1, DE-105, DE-14, DE-Ch1, DE-L229, DE-D275, DE-Bn3, DE-Brt1 |
issn | 1527-974X, 1067-5027 |
issn_str_mv | 1527-974X, 1067-5027 |
language | English |
last_indexed | 2024-03-01T17:21:10.237Z |
match_str | elemam2008protectingprivacyusingkanonymity |
mega_collection | Oxford University Press (OUP) (CrossRef) |
physical | 627-637 |
publishDate | 2008 |
publishDateSort | 2008 |
publisher | Oxford University Press (OUP) |
record_format | ai |
recordtype | ai |
series | Journal of the American Medical Informatics Association |
source_id | 49 |
spelling | El Emam, Khaled Dankar, Fida Kamal 1527-974X 1067-5027 Oxford University Press (OUP) Health Informatics http://dx.doi.org/10.1197/jamia.m2716 Protecting Privacy Using k-Anonymity Journal of the American Medical Informatics Association
spellingShingle | El Emam, Khaled, Dankar, Fida Kamal, Journal of the American Medical Informatics Association, Protecting Privacy Using k-Anonymity, Health Informatics |
title | Protecting Privacy Using k-Anonymity |
title_full | Protecting Privacy Using k-Anonymity |
title_fullStr | Protecting Privacy Using k-Anonymity |
title_full_unstemmed | Protecting Privacy Using k-Anonymity |
title_short | Protecting Privacy Using k-Anonymity |
title_sort | protecting privacy using k-anonymity |
title_unstemmed | Protecting Privacy Using k-Anonymity |
topic | Health Informatics |
url | http://dx.doi.org/10.1197/jamia.m2716 |
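
The abstract in this record refers to k-anonymity and to the discernability metric without defining either. As a rough illustration only, the Python sketch below checks whether a small table is k-anonymous (every combination of quasi-identifier values is shared by at least k records) and computes one common formulation of the discernability metric, in which each record in a class of size at least k is charged the class size and each record in a smaller class is charged the size of the whole data set. The function names, the sample records, and the choice of quasi-identifiers are all assumptions made for this example; they are not taken from the article, which may use a different variant of the metric.

```python
from collections import Counter


def equivalence_classes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every equivalence class contains at least k records."""
    classes = equivalence_classes(records, quasi_identifiers)
    return all(size >= k for size in classes.values())


def discernibility(records, quasi_identifiers, k):
    """Sum of size^2 for classes with size >= k, plus n * size for smaller classes.

    This is one common formulation; it is an assumption here, not the
    article's stated definition.
    """
    n = len(records)
    classes = equivalence_classes(records, quasi_identifiers)
    return sum(size * size if size >= k else n * size for size in classes.values())


# Hypothetical, already-generalized records (made up for illustration).
records = [
    {"age": "30-39", "zip": "K1A*", "dx": "flu"},
    {"age": "30-39", "zip": "K1A*", "dx": "asthma"},
    {"age": "30-39", "zip": "K1A*", "dx": "flu"},
    {"age": "40-49", "zip": "K2B*", "dx": "diabetes"},
    {"age": "40-49", "zip": "K2B*", "dx": "flu"},
]
QUASI_IDENTIFIERS = ["age", "zip"]

print(is_k_anonymous(records, QUASI_IDENTIFIERS, 2))  # True: class sizes are 3 and 2
print(is_k_anonymous(records, QUASI_IDENTIFIERS, 3))  # False: one class has only 2 records
print(discernibility(records, QUASI_IDENTIFIERS, 2))  # 3*3 + 2*2 = 13
```

A higher value of the metric means records are less distinguishable from one another, which corresponds to the information loss the abstract associates with over-anonymization.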