class: left, top
background-image: url(people.png)
background-size: 100%

# .solidbackground[Encryption, Anonymisation & Privacy Enhancing Technologies]
.solidbackground[.colour2[*Chris James*], 26 March 2024]
??? --- # Who am I? .big[ • Solicitor, qualified 2008 • General Counsel of a *fintech within a bank* • Fellow and Chair of the Advisory Board of the Society for Computers & Law • Former systems developer; technologist • _Not_ a maths nerd 🙁 ] --- # Agenda .big[ 1. .colour3[Encryption] 2. .colour3[Anonymisation], and 3. Other .colour3[Privacy Enhancing Technologies] ] --- .biggest.center[# 1. Encryption] --- # 1. Encryption ### OECD: .bigger.colour2["Encryption" means the transformation of data by the use of cryptography to produce unintelligible data (encrypted data) to ensure its confidentiality.] Source: OECD,
"Recommendation of the Council concerning Guidelines for Cryptography Policy"
, 2022 --- # 1. Encryption ### Simple 'symmetric' encryption .center[
] --- # 1. Encryption ### Asymmetric encryption / "public-key cryptography" .center[
] --- # 1. Encryption ### PKI = Public Key Infrastructure .center[
] --- # 1. Encryption
.bigger[https://etc.ch/5qQg]
--- # 1. Encryption .center[
] Attribution:
Google
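The symmetric and public-key schemes pictured above can be illustrated with a toy sketch (hypothetical values, deliberately insecure — real systems use vetted implementations such as AES and RSA/ECC via a maintained library or TLS):

```python
# Toy sketch only - NOT secure cryptography.

# Symmetric: one shared secret key both encrypts and decrypts.
def xor_cipher(data: bytes, key: bytes) -> bytes:
    # XOR with a repeating key; applying it twice restores the input
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

shared_key = b"secret"
ciphertext = xor_cipher(b"hello world", shared_key)
assert xor_cipher(ciphertext, shared_key) == b"hello world"

# Asymmetric ("public-key"): textbook RSA with tiny primes.
p, q = 61, 53
n = p * q                          # public modulus
e = 17                             # public exponent (published with n)
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent, kept secret

m = 42                     # message encoded as a number < n
c = pow(m, e, n)           # anyone can encrypt with the public key...
assert pow(c, d, n) == m   # ...only the private-key holder can decrypt
```

Note the asymmetry: the encryption key can be published freely, yet decryption needs the separately held private exponent.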
--- # 1. Encryption in transit * HTTPS uses .colour3["Transport Layer Security"] * This *helps* protect the data .colour2["in transit"] (aka "in flight") against interception by a bad actor --- # 1. Encryption in transit * HTTPS uses .colour3["Transport Layer Security"] * This *helps* protect the data .colour2["in transit"] (aka "in flight") against interception by a bad actor .center[
] --- # 1. Encryption at rest * Encryption ".colour2[at rest]" is also increasingly common. .center[
] --- # 1. Encryption at rest * Encryption ".colour2[at rest]" is also increasingly common. * E.g. major cloud services: * .small.colour3[Google]: .smaller["All data that is stored by Google is encrypted at the storage layer using the Advanced Encryption Standard (AES) algorithm ..."] * .small.colour3[Amazon]: .smaller["Customers can control when data is decrypted, by whom, and under which conditions as it passes to and from their applications and AWS services."] * .small.colour3[Microsoft]: .smaller["Data encryption at rest is available for services across the software as a service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS) cloud models."]
GDPR Art.32
). * Encryption is a "technical measure" to protect data. .colour2[_"
The ICO
has seen numerous incidents of personal data being subject to unauthorised or unlawful processing, loss, damage or destruction. In many cases, the damage and distress caused by these incidents may have been reduced or even avoided had the personal data been encrypted."_] --- # 1. Encryption * Data is processed "in use" - and so processors usually need the keys .center[
] --- # 1. Encryption * Data is processed "in use" - and so processors usually need the keys * E.g. Data uploaded to providers is usually encrypted/decrypted there: * Providers usually manage the keys * Therefore, providers have the data __and__ the keys --- # Three .colour3[key] tricks for encryption 1. Use .colour2[state of the art] encryption technology --- # Three .colour3[key] tricks for encryption 1. Use .colour2[state of the art] encryption technology * Be careful to keep up with algorithmic advances * Watch out for vulnerabilities --- # Three .colour3[key] tricks for encryption 1. Use .colour2[state of the art] encryption technology 2. Consider .colour2[external key management] when using cloud --- # Three .colour3[key] tricks for encryption 1. Use .colour2[state of the art] encryption technology 2. Consider .colour2[external key management] when using cloud * Keys held by you or trusted third parties, so only temporarily used by providers * Complex to implement, may cause performance issues --- # Three .colour3[key] tricks for encryption 1. Use .colour2[state of the art] encryption technology 2. Consider .colour2[external key management] when using cloud 3. Can you .colour2[go end-2-end?] --- # Three .colour3[key] tricks for encryption 1. Use .colour2[state of the art] encryption technology 2. Consider .colour2[external key management] when using cloud 3. Can you .colour2[go end-2-end?] * Type of encryption where keys never leave sender or recipient * Good for untrusted environments e.g. payment terminals * More complex for a lot of other situations --- # e2e Encryption 3. Can you .colour2[go end-2-end?] .center[
] --- # e2e Encryption 3. Can you .colour2[go end-2-end?] .center[
] --- .biggest.center[# 2. Anonymisation] --- # 2. Anonymisation * Anonymisation is the process of taking personal data and making it: .small.colour2[“information which does not relate to an identified or identifiable natural person or ... rendered anonymous in such a manner that the data subject is not or no longer identifiable"] --- # 2. Anonymisation * Anonymisation is the process of taking personal data and making it: .small.colour2[“information which does not relate to an identified or identifiable natural person or ... rendered anonymous in such a manner that the data subject is not or no longer identifiable"] * We will cover: * Anonymisation * K-Anonymity * Generalisation and suppression * Adversarial attacks --- # 2. Anonymisation * Is __not__ just removing the .colour3[obvious personal data]
--- # 2. Anonymisation
.bigger[https://etc.ch/fmW2]
Source:
Luc Rocher, Julien Hendrickx, and Yves-Alexandre de Montjoye. “Estimating the success of re-identifications in incomplete datasets using generative models.” Nature Communications 10 (2019) 3069.
--- # 2. Anonymisation * Is __not__ just removing the .colour3[obvious personal data] * .colour3[Risk of re-identification] is inherent in most sets of personal data
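A minimal sketch of why stripping names is not enough — entirely made-up records, linked on quasi-identifiers in the style of a classic linkage attack:

```python
# Entirely made-up records. Direct identifiers are removed from the
# health data, yet (postcode, dob, sex) still single people out when
# matched against a public register - the classic linkage attack.

health_data = [  # "anonymised": names removed
    {"postcode": "B1 1AA", "dob": "1980-03-02", "sex": "F", "diagnosis": "asthma"},
    {"postcode": "B2 4QA", "dob": "1975-07-19", "sex": "M", "diagnosis": "diabetes"},
]
public_register = [  # e.g. an electoral roll
    {"name": "Alice Smith", "postcode": "B1 1AA", "dob": "1980-03-02", "sex": "F"},
    {"name": "Bob Jones", "postcode": "B2 4QA", "dob": "1975-07-19", "sex": "M"},
]

for record in health_data:
    matches = [p for p in public_register
               if (p["postcode"], p["dob"], p["sex"]) ==
                  (record["postcode"], record["dob"], record["sex"])]
    if len(matches) == 1:  # a unique match re-identifies the data subject
        print(matches[0]["name"], "->", record["diagnosis"])
```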
--- # 2. Anonymisation * Is __not__ just removing the .colour3[obvious personal data] * .colour3[Risk of re-identification] is inherent in most sets of personal data * Measure using "K-Anonymity" or a similar clustering method * Reduce identifiability through: * generalisation * suppression --- # 2. Anonymisation * K-Anonymity: "information for each person contained in the released table cannot be distinguished from at least k-1 individuals whose information also appears in the release"
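The k in k-anonymity can be measured directly — a minimal sketch over made-up quasi-identifier tuples (postcode area, age band, sex):

```python
from collections import Counter

def k_anonymity(quasi_identifier_rows: list) -> int:
    # k is the size of the *smallest* group of records sharing the same
    # quasi-identifier values - the weakest link in the dataset.
    return min(Counter(quasi_identifier_rows).values())

rows = [
    ("B1", "1980-1989", "F"),
    ("B1", "1980-1989", "F"),
    ("B2", "1970-1979", "M"),  # a group of one
]
print(k_anonymity(rows))  # -> 1: this dataset is only 1-anonymous
```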
--- # 2. Anonymisation * .colour3[Approach:] Generalisation
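A minimal sketch of generalisation (made-up fields): precise values become broader bands so more people share each quasi-identifier combination:

```python
def generalise(record: dict) -> dict:
    # Replace exact age and full postcode with coarser categories.
    decade = record["age"] // 10 * 10
    return {
        "age_band": f"{decade}-{decade + 9}",            # 34 -> "30-39"
        "postcode_area": record["postcode"].split()[0],  # "B1 1AA" -> "B1"
    }

print(generalise({"age": 34, "postcode": "B1 1AA"}))
# -> {'age_band': '30-39', 'postcode_area': 'B1'}
```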
--- # 2. Anonymisation * .colour3[Approach:] Suppression
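A minimal sketch of suppression (made-up rows): records whose quasi-identifier combination is rarer than the target k are dropped rather than published:

```python
from collections import Counter

def suppress(rows: list, k: int = 2) -> list:
    # Keep only rows that blend into a group of at least k identical
    # quasi-identifier combinations; suppress the rest.
    counts = Counter(rows)
    return [row for row in rows if counts[row] >= k]

rows = [("B1", "30-39"), ("B1", "30-39"), ("B2", "70-79")]
print(suppress(rows))  # the unique ("B2", "70-79") row is suppressed
```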
--- # 2. Anonymisation * Adversarial attacks .colour2[bolstered by AI] * _ICO_: When anonymising, consider "all the means reasonably likely to be used, by yourself or a third party, to identify an individual that the information relates to" * Artificial intelligence algorithms allow for "profiling attacks" using non-overlapping data See further,
Rigaki, Garcia (2023)
,
ICO
--- .biggest.center[# 3. Privacy Enhancing Technologies] --- # 3. Privacy Enhancing Technologies * Privacy Enhancing Technologies (.colour2["PETs"]) 1. Pseudonymisation 2. Hashing 3. Differential privacy 4. Synthetic data 5. SMPC and Federated learning --- # 3. PETs ## 1. Pseudonymisation * Pseudonymisation is a word well-known in privacy circles - but may as well be 🤔⁉️🤔 to others * .smaller[Definition:] .smallest[The processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person’. This means that the use of ‘additional information’ can lead to the identification of the individuals, which is why pseudonymous personal data is still personal data.] --- # 3. PETs ## 1. Pseudonymisation * Pseudonymisation is a word well-known in privacy circles - but may as well be 🤔⁉️🤔 to others * .smaller[Example:]
--- # 3. PETs ## 2. Hashing ✅ Allows for pseudonymous (not anonymous) data sharing
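A sketch of keyed hashing for matching records across organisations (hypothetical shared key). The output is pseudonymous, not anonymous: anyone holding the key can hash a guessed identifier and link it back:

```python
import hashlib
import hmac

PEPPER = b"shared-secret-key"  # hypothetical key agreed between the parties

def hash_id(email: str) -> str:
    # Keyed (HMAC) hashing resists simple "hash the guess" attacks by
    # outsiders, but holders of the key can still link identifiers.
    return hmac.new(PEPPER, email.strip().lower().encode(),
                    hashlib.sha256).hexdigest()

# Both parties derive the same token for the same person, so datasets
# can be joined without exchanging raw email addresses:
assert hash_id("Alice@example.com") == hash_id(" alice@example.com ")
```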
--- # 3. PETs ## 3. Differential privacy ✅ Adds noise to the data (.colour2["perturbing"]) ⚖️ Balances privacy and data integrity
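A minimal sketch of the Laplace mechanism behind many differentially private counts (illustrative epsilon; real deployments use audited libraries such as OpenDP):

```python
import math
import random

def laplace_noise(scale: float) -> float:
    # The difference of two exponential samples follows Laplace(0, scale)
    e1 = -math.log(1 - random.random())
    e2 = -math.log(1 - random.random())
    return scale * (e1 - e2)

def dp_count(true_count: int, epsilon: float) -> float:
    # A count changes by at most 1 when one person joins or leaves the
    # dataset (sensitivity = 1), so the noise scale is 1 / epsilon.
    # Smaller epsilon => more noise => stronger privacy guarantee.
    return true_count + laplace_noise(1.0 / epsilon)

noisy = dp_count(1000, epsilon=0.5)  # e.g. 1001.7 - close, but deniable
```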
--- # 3. PETs ## 3. Differential privacy * Best for statistical algorithms .smallest["... an algorithm is said to be differentially private if by looking at the output, one cannot tell whether any individual's data was included in the original dataset or not. In other words, the guarantee of a differentially private algorithm is that its behavior hardly changes when a single individual joins or leaves the dataset -- anything the algorithm might output on a database containing some individual's information is almost as likely to have come from a database without that individual's information."]
.smallest[Credit:
OpenDP https://opendp.org/
] --- # 3. PETs ## 4. Synthetic data * Creation of partially or wholly new data, statistically representative of original data * Techniques: .smaller[•.colour3[ redaction]: completely removing data from the dataset]
.smaller[•.colour3[ replacing/masking]: replacing parts of the dataset]
.smaller[•.colour3[ coarsening]: reducing the precision of the data]
.smaller[•.colour3[ mimicking]: generate a dataset that closely matches the real dataset but does not contain exactly the same entries]
.smaller[•.colour3[ simulation]: generating part or all of the dataset that is similar in essential ways to the real data but is different with regard to sensitive information]
.smallest[Credit:
Data in Government
] --- # 3. PETs ## 4. Synthetic data * Creation of partially or wholly new data, statistically representative of original data
.smallest[Credit:
Data in Government
] --- # 3. PETs ## 4. Synthetic data
.center[thispersondoesnotexist.com] --- # 3. PETs ## 5. SMPC and Federated learning * Secure Multiparty Computation (".colour3[SMPC]"): .small["allows at least two different parties to jointly perform processing on their combined data, without any party needing to share all of its data with each of the other parties.] .small[All parties (or a subset of the parties) may learn the result"] .smallest[Credit:
ICO, Privacy Enhancing Technologies
] --- # 3. PETs ## 5. SMPC and Federated learning * Federated learning .center[
] .smallest[Credit:
Deloitte
] --- # Emerging Privacy Enhancing Technologies * Trusted execution environments * Zero knowledge proofs * Homomorphic encryption --- # Emerging Privacy Enhancing Technologies * Trusted execution environments * Zero knowledge proofs * Homomorphic encryption
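The SMPC definition quoted above often rests on additive secret sharing — a toy sketch with made-up salary figures:

```python
import random

MOD = 2**61 - 1  # work modulo a large prime so shares look random

def share(value: int, n_parties: int) -> list:
    # Split a private value into random-looking shares that sum back to
    # the value; any subset short of all shares reveals nothing about it.
    shares = [random.randrange(MOD) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares

salaries = [52_000, 48_000, 95_000]   # each party's private input
all_shares = [share(s, 3) for s in salaries]

# Party i sums the i-th share of every input...
partial_sums = [sum(column) % MOD for column in zip(*all_shares)]
# ...and only the combined total is ever revealed:
total = sum(partial_sums) % MOD
assert total == sum(salaries)  # the sum, with no salary disclosed
```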
.center[
] --- # Gotchas ## "I've been round the Wrekin and I've still not found the perfect solution" * PETs are often: * .colour2[Costly] - significant computing resources required * .colour2[Complex] - highly skilled engineers required * .colour2[Less accurate] - depending on technique * .colour2[Open to attack] (reconstruction, exfiltration etc.) --- # 3. PETs
.bigger[https://etc.ch/VY7B]
--- # .center[My answer:] .center[
] --- # .center[My answer:] .center[
] .center["Good physical security protects against exfiltration attack"] --- # Thank you * Questions? ### Chris James E-mail: mail@chrisjames.me.uk
www.scl.org
.smallest[Cover image generated using Midjourney and available for use CC BY-NC-SA 4.0 DEED]