Shufflecake: plausible deniability for multiple hidden filesystems on Linux

Introduction

Shufflecake is a tool for Linux that allows you to create multiple hidden volumes on a storage device in such a way that it is very difficult, even under forensic inspection, to prove the existence of such volumes. This is useful for people whose freedom of expression is threatened by repressive authorities or dangerous criminal organizations, in particular whistleblowers, investigative journalists, and human rights activists in oppressive regimes. You can consider Shufflecake a "spiritual successor" of tools such as Truecrypt and Veracrypt, but vastly improved: it works natively on Linux, it supports any filesystem of choice, and it can manage multiple nested volumes per device, so as to make deniability of the existence of these partitions really plausible.

In Shufflecake, each hidden volume is encrypted with a different secret key, scrambled across the empty space of an underlying existing storage medium, and indistinguishable from random noise when not decrypted. Even if the presence of the Shufflecake software itself cannot be hidden (and hence the presence of secret volumes is suspected), the number of volumes remains hidden. This allows a user to create a hierarchy of plausible deniability, where "most hidden" secret volumes are buried under "less hidden" decoy volumes, whose passwords can be surrendered under pressure. In other words, a user can plausibly "lie" to a coercive adversary about the existence of hidden data, by providing a password that unlocks "decoy" data. Every volume can be managed independently as a virtual block device, i.e. partitioned, formatted with any filesystem of choice, and mounted and dismounted like a normal disc. The whole system is very fast, with only a minor slowdown in I/O throughput compared to a bare LUKS-encrypted disc, and with negligible waste of memory and disc space.

Shufflecake is FLOSS (Free/Libre, Open Source Software). Source code is available via the Install section and is released under the GNU General Public License v3.0 or later.

Usage

Shufflecake is still experimental software. Please do not rely on its security for anything important!

A user must first initialise a device, for example a physical disc, a partition therein, or a virtual block device such as a file-backed loop device. This first overwrites the disc with random data and then creates an encrypted header section at the beginning of the device. The header contains metadata and allocation tables for 15 Shufflecake volumes. The user is asked to provide N different passwords (where N is between 1 and 15). Then the first N sections of the header are encrypted with the N passwords, one each, while the others are left random. The order of the given passwords is important, because it establishes a hierarchy from "less hidden" to "more hidden" volumes. Notice that it is impossible to know how many volumes there are without decrypting.

Then the user can open the volumes inside a given Shufflecake-initialised device. This is done by providing only one of the N given passwords, which unlocks one of the 15 slots in the header, and hence a device area allocated for the corresponding volume. Furthermore, the unlocked slot contains a key that decrypts the previous (i.e. "less hidden") slot in the hierarchy, so that all the less sensitive volumes are opened automatically and recursively. All these volumes appear as virtual block devices under /dev/mapper and can be mounted, formatted, and used to store data.
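The init and open steps described above can be illustrated with a toy model. This is only a sketch under loud assumptions: the slot layout, the function names (init_header, open_header, seal, unseal), the KDF parameters and the XOR-based toy cipher are all invented here for illustration, and none of this is the real Shufflecake on-disc format or cryptography. It only shows the chaining idea: N slots are sealed under password-derived keys, unused slots are plain random bytes, and each slot stores the key of the previous one.

```python
import hashlib
import hmac
import os

MAX_SLOTS = 15        # number of header slots, as stated in the text
SALT_LEN = 16         # toy parameters, NOT the real on-disc layout
KEY_LEN = 32
TAG_LEN = 32
KDF_ROUNDS = 1_000    # deliberately low so the sketch runs quickly


def derive_key(password: str, salt: bytes) -> bytes:
    """Turn a password into a slot key (toy parameters)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, KDF_ROUNDS, KEY_LEN)


def _keystream(key: bytes, length: int) -> bytes:
    return hashlib.shake_256(b"enc" + key).digest(length)


def seal(key: bytes, payload: bytes) -> bytes:
    """Toy authenticated encryption: XOR keystream plus an HMAC tag."""
    ct = bytes(p ^ k for p, k in zip(payload, _keystream(key, len(payload))))
    return ct + hmac.new(b"mac" + key, ct, hashlib.sha256).digest()


def unseal(key: bytes, blob: bytes):
    """Return the payload, or None if the key does not match this slot."""
    ct, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(b"mac" + key, ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return bytes(c ^ k for c, k in zip(ct, _keystream(key, len(ct))))


def init_header(passwords):
    """Seal the first N slots with N passwords; leave the rest random."""
    if not 1 <= len(passwords) <= MAX_SLOTS:
        raise ValueError("need between 1 and 15 passwords")
    header, prev_key = [], bytes(KEY_LEN)   # slot 0 has no predecessor
    for password in passwords:
        salt = os.urandom(SALT_LEN)
        key = derive_key(password, salt)
        header.append(salt + seal(key, prev_key))   # slot stores previous key
        prev_key = key
    while len(header) < MAX_SLOTS:
        header.append(os.urandom(SALT_LEN + KEY_LEN + TAG_LEN))
    return header


def open_header(header, password):
    """Return the indices of all volumes unlocked by a single password."""
    for index, slot in enumerate(header):
        key = derive_key(password, slot[:SALT_LEN])
        if unseal(key, slot[SALT_LEN:]) is None:
            continue                      # wrong slot, or just random bytes
        opened = []
        while True:
            opened.append(index)
            if index == 0:
                break
            key = unseal(key, header[index][SALT_LEN:])   # key of previous slot
            index -= 1
        return sorted(opened)
    return []
```

In this model, a header created with three passwords opens only volume 0 when given the first password, and volumes 0, 1 and 2 when given the third. A wrong password opens nothing, and nothing in the header reveals that only three of the 15 slots are in use.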

Finally, a user can close a device and all the supported volumes therein with a single command.

Install

Shufflecake is made of two components: dm-sflc, which is a kernel module implementing the Shufflecake scheme as a device-mapper target for the Linux kernel, and shufflecake-userland, which is a command-line tool allowing the user to create and manage hidden volumes. The kernel module must be loaded before using the userland tool.

Source code and installation instructions can be found on the project repository.

For now, support is limited to Debian/Ubuntu and similar derivatives. Testing has been done on Linux kernel v5.13. Work is in progress to broaden this.

Documentation

Shufflecake is originally based on the EPFL M.Sc. Thesis "Hidden Filesystems Design and Improvement".

A more up-to-date research paper will be available soon.

For an overview of different plausibly deniable storage approaches check the paper "SoK: Plausibly Deniable Storage".

Usage documentation can be found on the project repository.

How does Shufflecake work?

In a nutshell, Shufflecake allocates space for each volume as encrypted slices at random positions of the underlying device. Slices are allocated dynamically, as soon as the kernel module decides that more space than the currently used quota is required, and are interleaved to make forensic analysis more difficult. Data about the position of used and unused slices is stored in a volume-specific "position map", which is indexed within an encrypted header at the beginning of the device. Both position map and header are indistinguishable from random data without the correct decryption key, and every slot in the header (currently up to 15 volumes) has a field containing the decryption key for the previous (i.e., "less hidden") header and volume, thereby recursively linking all volumes and allowing the user to open all of them with a single password. This also makes overcommitment possible: if you have a 1 GiB device and you create 3 Shufflecake volumes on it, by default you will see each of these 3 volumes as being 1 GiB in size (although you will start receiving I/O errors if you try to write more than 1 GiB in total across all 3). This is also crucial for plausible deniability, because an adversary can never tell for sure how many other volumes there are. Notice, in fact, that if some volumes are left unopened they are not considered for the total space allocation.
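The slice allocation and overcommitment described above can be sketched with a small in-memory model. The class and attribute names (Device, Volume, position_map) are hypothetical and chosen for this illustration only. The model ignores encryption and the on-disc header entirely, and it uses Python's non-cryptographic random module where a real implementation would need a secure random source.

```python
import random


class Device:
    """Toy device made of a fixed number of physical slices."""

    def __init__(self, total_slices, seed=None):
        self.total_slices = total_slices
        self.free = list(range(total_slices))   # unallocated physical slices
        self.rng = random.Random(seed)          # assumption: stand-in for a CSPRNG

    def take_random_slice(self):
        """Hand out one free physical slice at a random position."""
        if not self.free:
            raise OSError("no free physical slices left")
        pick = self.rng.randrange(len(self.free))
        self.free[pick], self.free[-1] = self.free[-1], self.free[pick]
        return self.free.pop()


class Volume:
    """Toy volume: a position map from logical to physical slices."""

    def __init__(self, device):
        self.device = device
        self.position_map = {}

    @property
    def logical_slices(self):
        # Overcommitment: every volume appears as large as the whole device.
        return self.device.total_slices

    def write(self, logical_slice):
        """Allocate a physical slice on first write; reuse it afterwards."""
        if not 0 <= logical_slice < self.logical_slices:
            raise IndexError("logical slice out of range")
        if logical_slice not in self.position_map:
            self.position_map[logical_slice] = self.device.take_random_slice()
        return self.position_map[logical_slice]
```

With an 8-slice device and three volumes, each volume reports 8 logical slices, yet only 8 first-time writes succeed across all of them. The ninth raises an I/O error, mirroring the behaviour described in the text.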

This sounds wasteful, how much space is occupied by headers, position maps, etc?

Actually very little: for a 1 TiB device, less than 1% of the space is occupied by this encrypted metadata. Options are being evaluated to sacrifice further space in exchange for extra useful features.

Is this fast?

Quite fast: I/O slowdown is roughly 2x compared to a "normal" LUKS-encrypted volume, which is still barely noticeable for daily desktop use. A decent amount of memory (roughly 60 MiB per open volume) is required to keep the position maps in RAM for better efficiency. There is certainly room for improvement: we didn't focus too much on optimization for the first release, and we expect performance to improve in future versions.

If I do not open all volumes but only some of them, what happens to the other unopened volumes?

They will likely get badly corrupted. This is the desired behaviour: it is necessary for plausible deniability, because the adversary must observe a consistent random on-demand slice allocation even if not all volumes are opened. So the recommended practice is to unlock all volumes on a device for daily use, even without using or mounting them, and to unlock only a subset of them when under coercion.
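The reason unopened volumes get corrupted can be shown with a minimal sketch. It assumes, for illustration only, that the allocator treats as "in use" exactly the slices listed in the position maps of the opened volumes. The function name grow_volume and the set-based representation are hypothetical.

```python
import random


def grow_volume(total_slices, opened_maps, count, rng):
    """Pick `count` new physical slices, avoiding only opened volumes' slices.

    `opened_maps` is a list of sets of physical slices, one per opened volume.
    Slices of unopened volumes are invisible here and look like free space.
    """
    used = set().union(*opened_maps)
    free = [s for s in range(total_slices) if s not in used]
    if count > len(free):
        raise OSError("no free physical slices left")
    return set(rng.sample(free, count))


rng = random.Random(0)          # assumption: stand-in for a secure RNG
decoy = {0, 1}                  # slices owned by the decoy volume
hidden = {2, 3, 4}              # slices owned by the hidden volume

# Only the decoy is opened: filling the device overwrites the hidden slices.
overwritten = grow_volume(8, [decoy], 6, rng) & hidden

# Both volumes opened: the hidden volume's slices are never handed out.
safe = grow_volume(8, [decoy, hidden], 3, rng) & hidden
```

Here overwritten contains every slice of the hidden volume, while safe is empty, which is why the text recommends unlocking all volumes for daily use.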

What filesystems are supported?

Anything you want. Shufflecake is filesystem-agnostic, so users can format the volume as they wish. Certain filesystems, for example Ext4, exhibit better granularity features that improve performance a bit.

Can you boot Linux from a Shufflecake-encrypted partition?

In theory yes, and we are working on that. It requires a bit of work, because you need to load Shufflecake at bootloader time, but yes, it's possible.

Can I use Shufflecake on any platform? Mobile? Embedded? VM?

In theory yes, but Shufflecake is designed and tested mainly on laptop/desktop platforms, so YMMV.

Is Shufflecake similar to Truecrypt / Veracrypt?

Similar, yes, but vastly improved. Shufflecake is indeed inspired by Truecrypt and similar solutions, but with the precise goal of overcoming many technical limitations that make the adoption of such software unrealistic nowadays. Most importantly, Shufflecake works natively on Linux, supports arbitrary filesystems, and can manage different nested volumes per device, so as to make deniability of the existence of these partitions really plausible.

Is this steganography?

Not really. Steganography is like "there is no encrypted information at all here", while plausible deniability is "there is encrypted information, but I forgot the password", or "here is my decryption key, I swear I do not have any other one".

Why not just encrypt my disc with LUKS / BitLocker / etc?

None of these systems provide plausible deniability. In a nutshell: XKCD 538.

Who needs this? What is the use case?

People who live in constant danger of being interrogated with coercive methods to reveal sensitive information. Think of an undercover journalist who is investigating an organized crime ring or a corruption scandal in some low-democracy country and needs to protect not only themselves but also their informants; a human rights activist in a repressive regime with information on other members of a persecuted minority or about the organization of an upcoming protest; or a whistleblower who is about to become the next Edward Snowden, provided they're not caught first and tried in some secret military court.

Aren't you concerned that criminals can use Shufflecake?

Yes, of course we are concerned about criminals in general. If there were a magic switch that would allow us to make Shufflecake only be used by people without nefarious purpose we would gladly press it. Sadly, this switch does not exist. And, overall, we believe that the current status of humanity is such that we need more rather than less protection against invasive surveillance and coercive interrogations as a whole.

How secure, really, is Shufflecake?

To be honest, probably not very, at the moment. We believe that the Shufflecake scheme itself offers a great balance of usability vs. security, and we think it can be developed into a very robust solution. A cryptographic security proof is also available. However, the Shufflecake implementation is currently little more than a prototype. There is still a lot of work to do: features that are planned but missing, and probably bugs. We would need an independent security audit at some point, but not before a good cleanup of the code and a stable milestone. If you would like to help us make it happen, feel free to contribute.

What if I am monitored by a trojan / keylogger?

Then it's game over. Shufflecake only aims at protecting against a very specific threat scenario, and does not replace sound security practice.

Can Shufflecake protect against forensic inspection of used disc sector traces?

Currently, no. The so-called "multi-snapshot adversary" is a very strong security model that takes into account the fact that, especially on modern devices such as SSDs, overwriting a logical sector often results in the underlying physical sector being simply marked as "unused" rather than being really overwritten, thereby leaving "traces" or "snapshots" of the data content at previous points in time. This in turn can (in theory) make it possible to break plausible deniability, because empty, unused space should not change over time. Multi-snapshot attacks are a well-known issue in plausible deniability systems (Truecrypt and derivatives are also very vulnerable in this sense); there are techniques for mitigation, but they come with drawbacks. That said, consider the following: 1) Multi-snapshot attacks are very complex and expensive. There is so much circuitry and complexity involved that 100% evidence of the presence of a hidden volume based only on past sector traces is unlikely to be reached, and an accusation in this sense will probably not stand in most courts. In fact, we are not aware of a single case in public literature of a conviction based on forensic detection of hidden data through multi-snapshot attacks. On the contrary, there are many documented cases where even a simple system such as Truecrypt was enough to grant acquittal of a suspect. 2) Thanks to its hierarchical design, Shufflecake scrambles volumes in such a random way that an analysis of unused sectors in this sense is likely to be even more complex than in Truecrypt. 3) Regardless, the ability of Shufflecake to independently manage different volumes belonging to the same hierarchy offers us ways to protect against multi-snapshot attacks by simulating the action of a virtual user on empty space.
More concretely, we have plans to add another component to Shufflecake in the future: a daemon that simulates user queries on the empty space of the topmost unlocked volume, regardless of whether there are further volumes hidden there or not. We believe that this strategy can thwart multi-snapshot attacks effectively at a marginal performance cost. See the documentation for more details.
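To make the multi-snapshot reasoning concrete, here is a minimal sketch of what such an adversary could compute in principle. It assumes an idealised setting in which the adversary holds two complete block-level images of the device and knows which blocks the disclosed (decoy) volumes account for. The function names and the list-of-bytes representation are hypothetical, and real forensic analysis of SSD internals is far messier than this.

```python
def changed_blocks(snapshot_a, snapshot_b):
    """Indices of blocks whose content differs between two snapshots."""
    if len(snapshot_a) != len(snapshot_b):
        raise ValueError("snapshots must cover the same device")
    return {i for i, (a, b) in enumerate(zip(snapshot_a, snapshot_b)) if a != b}


def unexplained_changes(snapshot_a, snapshot_b, disclosed_blocks):
    """Changed blocks that no disclosed volume accounts for.

    `disclosed_blocks` is the set of block indices belonging to the volumes
    whose passwords were surrendered. Anything outside it is supposedly
    empty space, which should not change between snapshots.
    """
    return changed_blocks(snapshot_a, snapshot_b) - set(disclosed_blocks)


# Toy device with four blocks: block 1 belongs to the decoy volume,
# block 3 is claimed to be empty but changed anyway.
before = [b"aaaa", b"bbbb", b"cccc", b"dddd"]
after = [b"aaaa", b"BBBB", b"cccc", b"DDDD"]
suspicious = unexplained_changes(before, after, disclosed_blocks={1})
```

In this toy case suspicious contains only block 3. If writes to supposedly empty space also happen when no hidden volume exists, as with the planned daemon, such a difference no longer distinguishes the two situations.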

Why is Shufflecake not based on more secure techniques such as ORAMs?

ORAMs (Oblivious Random Access Machines) are cryptographic schemes that aim at obfuscating the access patterns (in addition to the data content itself) of a trusted agent accessing an untrusted storage. The connection between ORAMs and plausibly deniable storage systems has been investigated since the breakthrough HIVE paper of 2014. In a nutshell, the idea is that if we use an ORAM to access a device, then nobody, not even a run-time backdoor in the device firmware, can know which volume we access and how. However, ORAMs are extremely slow. They are so slow, in fact, that precise theoretical bounds are known, telling us that no secure ORAM can be faster than extremely slow. The HIVE paper circumvented this problem with the following observation: if we are not worried by run-time backdoors in the device firmware, but are only concerned about "traditional" multi-snapshot adversaries, i.e. post-arrest investigation of the device physical layer, then we do not need a fully-fledged ORAM, because read operations do not change the state of the device. So all we need is a "write-only" ORAM (WORAM) that only obfuscates write requests. The advantage is that there are currently no known efficiency bounds for WORAMs, and in fact existing WORAM constructions seem to be slightly better than fully-fledged ORAMs. When initially designing Shufflecake, we also considered WORAMs, but eventually we opted against this solution for the following reasons: 1) Even the most performant WORAM schemes known are still very slow or wasteful. For example, HIVE has a slowdown of roughly 200x in I/O throughput, while some recent constructions reach a slowdown of "only" 5x but at the cost of wasting 75% of the disc space. We wanted Shufflecake to be practical. 2) WORAMs are themselves not bulletproof. In fact, we believe that the idea that read requests do not change the underlying state of the physical device is a somewhat strong assumption, and hard to justify with modern, complex SSDs that might, for example, cache read requests in some undocumented memory area of the firmware. The only way to be 100% safe would be to use a full ORAM (which, again, would not be practical for daily use).
That said, we are still very interested in ORAM techniques, and we are keeping an eye on the evolving research in this field. If anything changes in the future, we might consider rewriting Shufflecake, keeping the overall functionality but replacing the underlying slice mapping algorithm with an ORAM scheme.

Why did you write this in C?

Because we are old-school graybeards. More seriously, we are investigating Rust, and might port Shufflecake to Rust in the future.

Why did you not host it on GitHub?

We believe that the git hosting provider we currently use (Codeberg, a German provider backed by a non-profit organization) offers better guarantees in terms of freedom of expression and protection of the digital rights sought by the GNU GPL license we use (Copilot, we're looking at you).

Who is behind the Shufflecake Project?

Please see the About section.

How do I know you are not an NSA honeypot?

Because we are not anonymous, we are at least somewhat known in the cryptography community, and the whole idea makes absolutely no sense since the code is open source.

Can I contribute to the project?

Absolutely yes! Please check the project repository.

About

websiteATshufflecake.net

The Shufflecake Project (including code, website, and all infrastructure and communication) is created and maintained by Elia Anzuoni and Tommaso Gagliardoni. All opinions expressed herein belong to us only, and do not necessarily reflect the point of view of anyone else.

Shufflecake was initially developed in 2022 as an EPFL Master's thesis project by Elia Anzuoni, under the supervision of Dr. Tommaso Gagliardoni and Prof. Edouard Bugnion, during an internship at the Cybersecurity Research Team of Kudelski Security.