Binary Privacy

source link: https://matklad.github.io//2022/05/29/binary-privacy.html

May 29, 2022

This post documents one rule of thumb I find useful when coding:

Either make all fields of a type public, or make none of them public

Being a rule of thumb, it naturally has exceptions, but those are relatively few. The primary context here is application development. Libraries with a semver-constrained API follow other guidelines: the rules are different at the boundaries.

This privacy rule is a manifestation of the fact that the two most popular kinds of entities in programs are:

  • Abstract data types (ADTs): complex objects with an opaque implementation, which guard interior invariants and expose an intentionally limited API to the outside world

  • Data — relatively simple objects which group a bunch of related attributes together

If some fields of a type are private, it can’t be data. If some fields of a type are public, it can still be an ADT, but the abstraction boundary will be a bit awkward. It is better to add getters for the (usually few) fields that can be exposed, so that the role the type plays is immediately obvious.
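As a minimal sketch of the getter approach, consider a hypothetical `Span` type (it is not taken from rust-analyzer; the name, fields, and methods are assumptions made for illustration). Both fields stay private, and read-only getters expose the values that are safe to read:

```rust
/// Hypothetical ADT: a half-open range `start..end` with the
/// invariant `start <= end`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    // Both fields are private. If `start` alone were public, a caller
    // could move it past `end` and silently break the invariant.
    start: u32,
    end: u32,
}

impl Span {
    /// Returns `None` when the bounds are out of order.
    pub fn new(start: u32, end: u32) -> Option<Span> {
        if start <= end {
            Some(Span { start, end })
        } else {
            None
        }
    }

    // Read-only getters: callers can observe the bounds but not change them.
    pub fn start(&self) -> u32 {
        self.start
    }

    pub fn end(&self) -> u32 {
        self.end
    }

    /// Cannot underflow, because the constructor guarantees `start <= end`.
    pub fn len(&self) -> u32 {
        self.end - self.start
    }
}
```

With every field private, a reader can tell at a glance that `Span` is an ADT, and `len` can rely on the invariant without re-checking it.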

An example of an ADT would be FileSet from rust-analyzer’s virtual file system implementation.

use std::collections::HashMap;

#[derive(Default)]
pub struct FileSet {
  // Both maps are private so that they can only change together.
  files: HashMap<VfsPath, FileId>,
  paths: HashMap<FileId, VfsPath>,
}

impl FileSet {
  pub fn insert(&mut self, file_id: FileId, path: VfsPath) {
    self.files.insert(path.clone(), file_id);
    self.paths.insert(file_id, path);
  }

  pub fn len(&self) -> usize {
    self.files.len()
  }

  pub fn file_for_path(
    &self,
    path: &VfsPath,
  ) -> Option<FileId> {
    self.files.get(path).copied()
  }

  pub fn path_for_file(
    &self,
    file: &FileId,
  ) -> Option<&VfsPath> {
    self.paths.get(file)
  }

  pub fn iter(&self) -> impl Iterator<Item = FileId> + '_ {
    self.paths.keys().copied()
  }
}

This type maintains a bidirectional mapping between string paths and integral file ids. How exactly the mapping is maintained (hash map, search tree, trie?) is irrelevant: this implementation detail is abstracted away. Additionally, there’s an invariant: the files and paths fields are consistent, complementary mappings. So this is a case where all fields are private and access goes through a bunch of accessor functions.
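To make the invariant concrete, here is a trimmed-down, self-contained sketch. The `FileId` and `VfsPath` definitions are simplified stand-ins for the real rust-analyzer types, and `is_consistent` and `break_invariant` are hypothetical helpers added only for illustration:

```rust
use std::collections::HashMap;

// Simplified stand-ins for rust-analyzer's `FileId` and `VfsPath`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct FileId(pub u32);

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct VfsPath(pub String);

#[derive(Default)]
pub struct FileSet {
    files: HashMap<VfsPath, FileId>,
    paths: HashMap<FileId, VfsPath>,
}

impl FileSet {
    pub fn insert(&mut self, file_id: FileId, path: VfsPath) {
        self.files.insert(path.clone(), file_id);
        self.paths.insert(file_id, path);
    }

    /// Hypothetical helper: true when the two maps are exact inverses
    /// of each other.
    pub fn is_consistent(&self) -> bool {
        self.files.len() == self.paths.len()
            && self
                .files
                .iter()
                .all(|(path, id)| self.paths.get(id) == Some(path))
    }
}

/// Simulates what any caller could do if `files` were a public field:
/// update one map and forget the other. This compiles only because the
/// function lives in the same module as `FileSet`.
pub fn break_invariant(set: &mut FileSet) {
    set.files.insert(VfsPath("stray.rs".to_string()), FileId(99));
}
```

Outside the defining module, the only way to add an entry is `insert`, which updates both maps. That is exactly what keeping both fields private buys.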

An example of data would be the Directories struct:

#[derive(Debug, Clone, Default)]
pub struct Directories {
    pub extensions: Vec<String>,
    pub include: Vec<AbsPathBuf>,
    pub exclude: Vec<AbsPathBuf>,
}

This type specifies a set of paths to include in VFS, a sort-of simplified gitignore. This is an inert piece of data — a bunch of extensions, include paths and exclude paths. Any combination of the three is valid, so there’s no need for privacy here.
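Because every field is public, callers can build and take apart such a value with ordinary struct syntax. The sketch below is an illustration rather than rust-analyzer code: `std::path::PathBuf` stands in for `AbsPathBuf` to keep it self-contained, and `rust_sources` is a hypothetical helper.

```rust
use std::path::PathBuf;

// `PathBuf` is a stand-in for rust-analyzer's `AbsPathBuf`.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Directories {
    pub extensions: Vec<String>,
    pub include: Vec<PathBuf>,
    pub exclude: Vec<PathBuf>,
}

/// Hypothetical helper: builds a value with a plain struct literal.
pub fn rust_sources(root: &str) -> Directories {
    Directories {
        extensions: vec!["rs".to_string()],
        include: vec![PathBuf::from(root)],
        // Fields not listed above take their `Default` value (empty here).
        ..Directories::default()
    }
}
```

A caller can then write `dirs.exclude.push(...)` or destructure with `let Directories { include, .. } = dirs;`. No constructor or setter is needed, since there is no invariant to protect.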

Connections

This rule is very mechanical, but it reflects a deeper distinction between flavors of types. For a more thorough treatment of the underlying phenomenon, see the “Be clear what kind of class you’re writing” chapter from Sutter and Alexandrescu’s “C++ Coding Standards” and “The Expression Problem” from the ever thought-provoking Kaminski.

