Kyle Pena

Addressing The Threat of Deep Fakes With Trusted Devices and Public Key Infrastructure

Introduction

I've got a bold claim, but it's one I'm willing to stick by: We can neutralize some of the gravest threats posed by deepfakes, and we can do it using existing technology platforms and a little bit of cooperation.

To get straight to the point: We're massively over-investing in detecting and/or steganographically watermarking deepfakes. It is far easier to digitally sign authentic images using Public Key Infrastructure (PKI), and then use the digital signature to verify the authenticity of the image.

It's a relatively simple solution, and I'll explain what it means in more detail shortly. It's so simple, in fact, that it's surprising that it isn't the mainstream approach already.

I believe there are two main reasons why:
(1) There is currently no compelling way to stop bad actors from digitally signing inauthentic images.
(2) We need a widespread, highly available and trusted mechanism that allows the public to verify digitally signed images themselves.

This blog post proposes a solution to both of these problems. Neither solution is purely technological. In fact, the solution can't be purely technological, because it involves notions of trust. Perhaps that is why, as a society, we have been unable to muster an effective response - as frayed as our trust in institutions has become. As such, this post is as much a guide as it is a clarion call to technologists to cooperate to solve this problem, and to start rebuilding trust in our society - different and better.

The Solution - In Short

There are three legs to the stool.

  1. Trusted Devices With Integrated Hardware Security Modules (HSMs)
  2. Public Key Infrastructure
  3. Browser Integration

I am defining Trusted Devices as digital cameras with a Hardware Security Module (HSM) directly integrated with the camera sensor. When an image is captured by the camera sensor, the HSM digitally signs it using a certificate issued under a "Trusted Device Root Certificate" (TDRC). To authenticate the image, the digital signature can be easily verified as having been produced by a valid certificate chaining back to the TDRC, marking the image as authentic.

In other words, every image captured by the Trusted Device can be verified as having been produced by the camera sensor of a Trusted Device, as long as:

  • The TDRC is not compromised
  • The individual certificate granted to the device is not compromised
  • The HSM has not been tampered with and/or somehow extracted from the Trusted Device

The Public Key Infrastructure already has measures in place to handle compromised certificates, mitigating the first and second risks. HSMs are already widely deployed and have a variety of tamper-resistant designs, mitigating the third risk. I would be unsurprised if there are HSM designs that make them completely inoperable outside of their intended hardware environment. If such guarantees do not presently exist in HSMs, I have an intuition that they could be implemented without too much difficulty.
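To make the signing step concrete, here is a minimal sketch in Python using the `cryptography` library. It is purely illustrative: in a real Trusted Device the private key would be generated inside the HSM and would never leave it, and the key type, metadata format, and function names are my own assumptions rather than any existing standard.

```python
# Illustrative sketch of device-side signing (hypothetical names).
# In a real Trusted Device, the private key lives inside the HSM.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

# Stand-in for the HSM-resident key; its certificate would chain
# up to the Trusted Device Root Certificate (TDRC).
device_private_key = ec.generate_private_key(ec.SECP256R1())

def sign_capture(raw_sensor_bytes: bytes) -> bytes:
    """Sign the raw sensor data at capture time."""
    return device_private_key.sign(
        raw_sensor_bytes,
        ec.ECDSA(hashes.SHA256()),
    )

# The device would then embed the signature and its certificate chain
# (device cert -> intermediates -> TDRC) in the image metadata so any
# verifier can check it later.
signature = sign_capture(b"...raw sensor data...")
```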

So, that covers the Trusted Devices and Public Key Infrastructure.
Here's the last part that brings it all together: the browser.

The browser is the perfect platform for instantly assessing the authenticity of an image. It's widespread, trusted, and already deeply integrated with the PKI. And it was literally designed to display pictures.

Hacking together a Chromium build to verify digitally signed images would be the easy part:

  • Add the TDRC to the list of root certificates
  • Add code to automatically detect Trusted Device signatures in image metadata, and then verify the signature using the PKI (see the sketch below)
  • Indicate to the user that the image being displayed is authentic
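As a rough sketch of what that verification step might look like (names and structure are my assumptions; a real implementation would live in the browser's native image pipeline and reuse its existing certificate-chain and revocation machinery):

```python
# Illustrative sketch of the browser-side check (hypothetical names).
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

def is_authentic_image(image_bytes: bytes, signature: bytes, device_public_key) -> bool:
    """Return True if the signature over the image bytes verifies.

    Assumes the device certificate embedded in the image metadata has
    already been chain-validated against the TDRC and checked for
    revocation using the browser's existing PKI machinery.
    """
    try:
        device_public_key.verify(signature, image_bytes, ec.ECDSA(hashes.SHA256()))
        return True
    except InvalidSignature:
        return False
```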

The difficult part is designing an "indicator of authenticity" that is effortlessly understandable and un-spoofable.

The lock icon on the address bar is one example of a successful deployment of a similar concept - you can't fake the lock icon with CSS because the address bar isn't in the page. Most people automatically associate a lock with some kind of security or protection. Check, and check.

However, images are in the page - and you need the indicator of authenticity to be closely associated with the image - ideally, on it or around it. So, it would be natural to, say, "overlay" the image with some kind of special icon. But any such mark could be imitated with markup and CSS. It's a bit of a conundrum.

I am tempted to propose some specific solutions, but I feel I would be out of my lane. Browsers employ talented UX people, and I don't want to poison the well with a bad suggestion. Suffice it to say that I think the problem is solvable, and would involve some combination of within-page indicators and out-of-page indicators. But perhaps there is another obvious solution which involves just one or neither? I would love to hear some of your ideas in the comments.

The Perfect Is The Enemy of the Good

Should Trusted Devices be available for purchase by the consumer?
Should we integrate HSMs into the latest Android phones and iPhones so that everyone has a trusted device in their pocket? Dear technologist, I'm going to share my beliefs on this point, and in doing so I'm going to break with accepted wisdom.

My claim is that Trusted Devices should not be widely distributed or purchasable by consumers. Instead, they should be distributed to persons and organizations we have good reason to trust. But, before you accuse me of suggesting that we create a "Ministry of Truth" or suggest that I am naive, allow me to make my case.

My argument has three parts:
(1) Limiting usage of Trusted Devices to certain important societal functions would solve a problem which currently has no good solution, and would be eminently useful.
(2) Widespread availability of Trusted Devices would damage the credibility of the system through misuse.
(3) The standard retort of "who watches the watchmen" needlessly invalidates the public good that trust-based institutions provide.

Limiting Trusted Devices To Trusted People

If Trusted Devices are not widely available, who might use them, and for what purposes? Here are some examples:

1) A highly regarded journalist documenting atrocities and genocides.
2) Courts and congresses recording government business "for the record".
3) Notaries and local officials recording official proceedings.
4) Licensed and accredited 3rd party agencies gathering photographic evidence for criminal cases, claims, and lawsuits.

To wit, expanding on (4), I could easily imagine the creation of a "Trusted Device Holders Licensing Board", where an individual, after being accredited and background checked, would be granted a Trusted Device and License. They would derive their livelihood from providing verifiable photographic evidence as a service to the public, and would have a vested interest in acting honestly due to the threat of license revocation (which would correspond with revocation of the device's certificate and loss of income).

Dear reader, if you're uncomfortable with yielding any amount of trust, let's consider the alternative. Imagine the financial chaos if insurance adjustors begin deepfaking images of property to show no damage, in order to avoid paying claims. Or the collapse of jurisprudence when no photographs can be submitted as evidence because any image might be deep-faked. Or the innumerable abuses of the truth possible under despotic tyranny.

My claim is that the only reasonable remedy here is to allow for the distribution of Trusted Devices to actors and institutions who are already generally trustworthy, or can be made trustworthy because it is in their best interest to act in a trustworthy fashion (e.g., notary-like functions).

Mitigations can be put in place. The PKI already supports certificate revocation, for example. And what's more, putting Trusted Devices in everyone's hands would deeply erode the value of what would otherwise be a very trustworthy and worthwhile institution.

The Potential For Misuse of Trusted Devices With General Distribution

Trusted Devices should not be widely distributed or purchasable by the public because they have an obvious griefing vector: the owner can take an authentic photograph of an inauthentic image. For example, generate a deepfake, print it out at very high resolution on very good paper, light the room completely neutrally, and take a photo.

If this occurs often enough - and even if the certificates of the offending devices are speedily revoked - the public perception of the system will be that it is flawed and doesn't work. What's more, the 'unilateral' revocation of certificates will test the limits of free speech protections, especially if the Trusted Device system becomes quasi-governmental or enshrined in law.

I doubt that there is a durable and practical way to detect and address the image-of-inauthentic-image misuse of Trusted Devices. To me, this indicates that at a certain level, you have to trust the owner of the device.

You might argue that if we are required to trust the owners of devices in order to trust the authenticity of the image, this is no better than our current situation.

I emphatically disagree: In the world where trusted actors own Trusted Devices, we would have a small population of producers of images that are 99.99% guaranteed to be real - and a way to promptly deal with any bad actors if they act in a provably dishonest way. In this world, the public has a defensible reason to believe that a special class of images are authentic. This trust in the system is partly technological, and also partly social. It's a three-way mutualism: The technology underpins the institution and makes it worthy of trust, the institution safeguards the technology and lends it legitimacy in the eyes of the public, and the public trusts the system because the system acts in a trustworthy manner.

(As an aside, if the circumstances permit, proving the dishonesty would be straightforward: have two other Trusted Device owners take a photograph proving the first photograph was inauthentic.)

I think that's a much better spot than we are currently in. Presently there's just one class of image, and we are rapidly approaching an age where all images are suspect.

Multi-Level Certificates for Scalable Enforcement Against Misuse

In order for the benefits of Trusted Devices to be felt at multiple levels in society (down to the local level, for example), there has to be a certain level of hierarchical distribution and ownership, and therefore hierarchical levels of accountability for the misuse of devices.

Therefore, it may be worthwhile to create sub-certificates under the TDRC corresponding to the hierarchy of organizations owning the Trusted Devices. This would be paired with a policy which revokes the higher-level certificate if enough abuses accumulate at the lower level. This aligns the interests of all the individual actors in the Trusted Device ecosystem with the greater public good of having a trustworthy system.
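As an illustration of the kind of policy I have in mind (the names, structure, and threshold are entirely hypothetical), here is a toy sketch of threshold-based revocation at the organization level:

```python
# Toy sketch of a hierarchical revocation policy (all names and
# thresholds are hypothetical). Revoking an organization's intermediate
# certificate implicitly invalidates every device certificate beneath it.
from dataclasses import dataclass, field

ABUSE_THRESHOLD = 3  # hypothetical per-organization limit

@dataclass
class OrgCertificate:
    name: str
    revoked: bool = False
    abuse_counts: dict = field(default_factory=dict)

    def report_abuse(self, device_id: str) -> None:
        """Record a verified misuse by one of the organization's devices."""
        self.abuse_counts[device_id] = self.abuse_counts.get(device_id, 0) + 1
        if sum(self.abuse_counts.values()) >= ABUSE_THRESHOLD:
            self.revoked = True  # the whole subtree loses trust

org = OrgCertificate("Example Licensing Board Chapter")
for device in ["cam-01", "cam-02", "cam-03"]:
    org.report_abuse(device)
print(org.revoked)  # True once the threshold is crossed
```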

Quis custodiet ipsos custodes?

The standard libertarian retort to granting any kind of trust to any institution is that the institution will inevitably act in its own self-interest. After all, "Who watches the watchmen?"

It's a powerful argument, and there's some truth to it.

But let's think about the situation by way of analogy: A king, unable to find an ironclad way to keep the guardsmen watched, decides that it would be better if there were no guardsmen at all! (Or institutes a system that is completely secure but so burdensome that the guardsmen are ineffective.)

The king is not wrong to fear the guard, but misjudges where the balance of the threats lie. Leaving the castle unguarded is far worse than yielding a measured amount of trust to a useful institution. Reasonable measures can be taken to ensure the trustworthiness of the guard. It's not perfect, but it's better.

For the time being (until we inevitably get our collective asses handed to us), technologists are the rulers of our own domains. Let's not betray the trust the public has placed in us by insisting on ideological purity.

Increasingly Dangerous Tigers Kept By Underfunded Zookeepers

You might be convinced that detecting or watermarking deep-fakes is the way forward. Google seems to think so, perhaps disingenuously.

You will have to pardon me for being blunt, but I think this position is laughable.

Think of deepfakes like zoo animals:

Deepfakes are either "captured" or "escaped". They are "captured" if they are detectable (via algorithm, steganography, or watermark). They are "escaped" if they are not.

What does it take for a deepfake to "escape"? Removal of the steganography (easy), removal of the watermark (easy), and/or improvement of deepfake fidelity past the point of detectability.
The last is the most concerning, because undetectable deepfakes feel inevitable.

The whole industry is geared towards producing them. The quality of generative AI is effectively defined by how well it can escape a detector. The very advancement of the field is tied to beating detectors. Any quality deepfake detector will simply be used as a benchmark to beat by the latest model.

So, being in the AI safety field is a bit like an arms race between underfunded cage builders and ever-more-dangerous tigers. And in a very direct way, making a great deepfake detector only furthers the arms race and makes the problem worse.

Practicalities

We understand the problem. We have the technology to address it.

The problem is that implementing the solution takes cooperation between certificate authorities (the Public Key Infrastructure), browser vendors, and HSM manufacturers.

Of the three parties involved, HSM manufacturers are the most commercially minded and would therefore require a traditional business relationship. As such, substantial funds may be involved. What's more, it might be prudent to have multiple vendors to avoid supply chain attacks.

It's also unclear to me if the economic incentives required to implement the system currently exist, but I have an inkling that the general difficulties caused by rampant deepfakes may create a sort of public incentive to get the problem solved.

But by that point it may be too late. Relationships with organizations whose interests would align with a successful implementation could be cultivated right now. For example, a company that is not well positioned to dominate AI might instead position itself to address AI's dangers. Apple and Google come to mind. What's more, both are also major browser vendors (Safari and Chrome).

Reaching out to browser standards organizations like the W3C seems necessary as well, but perhaps the way to exert influence on the standards bodies is to work through Apple and Google.

Summary

The plan:

  • Produce Trusted Devices that leverage PKI to digitally sign images at the moment of capture.
  • Push out a browser update that indicates to the average user that an image is verifiable and verified.
  • Encourage the development of trust-based institutions through licensing and careful distribution of Trusted Devices.

I'm sure that others have had similar ideas. There are already partial realizations through organizations like DocuSign. I would not at all be discouraged if the idea or a very similar one has already been proposed or even attempted.

Credit for originality is nice, but I'd much rather have the credit for cultivating the collective will to make this thing a reality.
