Syncing all the things

 2 years ago
source link: https://lwn.net/Articles/861978/

Computing devices are wonderful; they surely must be, since so many of us have so many of them. The proliferation of computers leads directly to a familiar problem, though: the files we want are always on the wrong machine. One solution is synchronization services that keep a set of files up to date across a multitude of machines; a number of companies have created successful commercial offerings based on such services. Some of us, though, are stubbornly resistant to the idea of placing our data in the hands of corporations and their proprietary systems. For those of us who would rather stay in control of our data, systems like Syncthing offer a possible solution.

The core idea behind synchronization systems is essentially the same for all of them: given a list of directories and a list of systems, ensure that those directories have the same contents on each system. If a file is added on one, it is copied out to the rest; modifications and deletions are (usually) propagated as well. The trouble is always in the details, though; from fiddly setup procedures to data corruption and security problems, there are a lot of ways in which synchronization can go wrong. So users have to put a lot of trust in these systems; open source code is an important step toward that goal, but it is also necessary to believe that the developers involved have thought carefully through the issues.
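To make that core idea concrete, here is a minimal sketch of the one-way case: take a content snapshot of each directory, then work out what would have to be copied, updated, or deleted on the target. The function names `snapshot` and `plan_sync` are invented for illustration and do not come from Syncthing; a real tool must also handle conflicts, renames, and files that change during the scan.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            result[path.relative_to(root).as_posix()] = digest
    return result


def plan_sync(source: dict[str, str], target: dict[str, str]) -> dict[str, list[str]]:
    """List what must change on the target to make it match the source."""
    return {
        # Present on the source only: copy it out.
        "copy": sorted(p for p in source if p not in target),
        # Present on both sides with different contents: propagate the change.
        "update": sorted(p for p in source if p in target and source[p] != target[p]),
        # Present on the target only: propagate the deletion.
        "delete": sorted(p for p in target if p not in source),
    }
```

Note that the "delete" list is exactly where the trouble mentioned above lives: a plan computed from a wrong or empty source will faithfully remove everything on the target.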

Syncthing

[Syncthing management screen]

When the northern-hemisphere summer sets in and fresh news becomes relatively scarce, the opportunity arises to finally get around to checking out an interesting software project or two. Your editor, thus, has been duly playing around with the Syncthing 1.17.0 release, obtained from the Fedora and CentOS (EPEL) repositories. Starting the system on Fedora is just a matter of running the syncthing command to start the synchronization daemon and trying to not be dismayed at the volume of log data generated. The EPEL package, instead, appears to install Syncthing as a systemd user service, meaning that all that was needed was to log into the system and the daemon started automatically.

The daemon is managed through an internal web server that shows up, by default, on port 8384; see the example image on the right. There is initially no authentication required; users will likely want to fix that as one of the first things they do. The web server is, by default, only accessible via the loopback interface; a change to a configuration file can make it available to the Internet as a whole, but that sounds like a daring thing to do even with authentication enabled. An alternative for gaining access to the web interface on a remote machine, as suggested in the documentation, is to set up an SSH tunnel from the local system.
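As a rough sketch of the SSH-tunnel approach, the helper below builds an `ssh` command line that forwards a local port to the remote machine's loopback-only web interface. The login `user@remote.example.com` and the local port 9090 are placeholders chosen for illustration; only port 8384 comes from the default described above.

```python
def tunnel_command(remote: str, local_port: int = 9090, gui_port: int = 8384) -> list[str]:
    """Build an ssh command forwarding local_port to the remote loopback GUI port."""
    return [
        "ssh",
        "-N",  # run no remote command; only forward the port
        "-L", f"{local_port}:127.0.0.1:{gui_port}",
        remote,
    ]


# Placeholder login; replace it with a real account on the remote machine.
print(" ".join(tunnel_command("user@remote.example.com")))
```

The resulting list can be handed to `subprocess.run()`; while the tunnel is open, the remote interface should be reachable at http://127.0.0.1:9090/ on the local system.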

When it starts for the first time, Syncthing generates a "device ID" identifying the local system; it looks like this:

    RS5RZ7K-CORJAP3-TZECYOH-IBLFDZM-KSFWOXB-VBEIYSB-F7MWECH-VQCGLAZ

Setting up synchronization between two machines resembles the Bluetooth pairing process; it is done by providing each side with the device ID belonging to the other. Use of copy-and-paste is advisable here. Alternatively, if both systems are on the same local net, they will discover each other through broadcasts and ask (through the management interface) whether a connection should be established.
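For those who end up handling IDs by hand anyway, a small format check can catch truncated pastes. The pattern below is inferred from the example ID shown above (eight dash-separated groups of seven characters drawn from A-Z and 2-7) and is an assumption rather than Syncthing's own validation; in particular, it does not verify any check characters that the ID may contain.

```python
import re

# Eight dash-separated groups of seven characters from A-Z and 2-7
# (shape inferred from the example ID; not taken from Syncthing's code).
DEVICE_ID_PATTERN = re.compile(r"[A-Z2-7]{7}(-[A-Z2-7]{7}){7}")


def normalize_device_id(text: str) -> str:
    """Strip surrounding whitespace and upper-case a pasted ID."""
    return text.strip().upper()


def looks_like_device_id(text: str) -> bool:
    """Return True if text has the overall shape of a device ID."""
    return DEVICE_ID_PATTERN.fullmatch(normalize_device_id(text)) is not None
```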

After a connection between systems is made, users must tell Syncthing which directories should be synchronized; that is a matter of setting up folders and sharing them with any or all of the known remote systems (which Syncthing calls "devices"). Once the share has been accepted on the remote end, file changes will be propagated back and forth. When possible, Syncthing requests file-change notifications from the kernel; that leads to relatively fast propagation times.

There are a lot of options that can be set to control sharing. Sharing can be made one-way, for example, so that a particular system might create files and send them out without accepting changes from the other systems. One especially interesting (though new and "beta") feature is the ability to share files to specific systems in encrypted form. If one system is, for example, a cloud server that is used primarily for backup or distribution purposes, it can be given encrypted data that it cannot read. Any other system in the sharing network that has the correct password will be able to read those files, though. There are also various ways of handling versioning, which keeps older versions of files around when one system changes them.
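The simplest form of versioning can be sketched in a few lines: before an incoming change overwrites a file, the old copy is moved aside under a timestamped name. The function name, the `~` separator, and the layout of the versions directory are assumptions made for this example, not a description of how Syncthing stores old versions.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def replace_with_version(target: Path, new_content: bytes, versions_dir: Path) -> Optional[Path]:
    """Write new_content to target, first moving any existing copy into versions_dir.

    Returns the path of the archived copy, or None if target did not exist.
    """
    archived = None
    if target.exists():
        versions_dir.mkdir(parents=True, exist_ok=True)
        stamp = time.strftime("%Y%m%d-%H%M%S")
        # Two replacements within the same second would collide; a real
        # implementation needs a finer-grained or unique suffix.
        archived = versions_dir / f"{target.name}~{stamp}"
        shutil.move(str(target), str(archived))
    target.write_bytes(new_content)
    return archived
```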

It's worth noting that, while it is possible to configure a set of Syncthing clients all connected to a central server, nothing in Syncthing requires that sort of architecture. Systems can be connected in any way that seems to make sense. If a system finds that it needs files that have already propagated to multiple connected peers, it can receive the needed data in blocks, BitTorrent-style, from whichever system can provide it first.
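Block-wise transfer can be illustrated with a short sketch: hash a file in fixed-size blocks, compare against the block list of the wanted version, and fetch only what differs from peers that hold it. The 128KiB block size and the request-spreading rule here are assumptions made for the example; as described above, Syncthing itself takes each block from whichever peer delivers it first.

```python
import hashlib

BLOCK_SIZE = 128 * 1024  # assumed block size, for illustration only


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return the SHA-256 hex digest of each fixed-size block of data."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def missing_blocks(local: list[str], wanted: list[str]) -> list[int]:
    """Indexes of wanted blocks that the local copy lacks or holds with other contents."""
    return [
        index for index, digest in enumerate(wanted)
        if index >= len(local) or local[index] != digest
    ]


def assign_blocks(missing: list[int], peers: dict[str, set[int]]) -> dict[int, str]:
    """Pick, for each missing block, one peer that holds it."""
    plan = {}
    for index in missing:
        holders = sorted(name for name, have in peers.items() if index in have)
        if holders:
            # Spread requests across the holders instead of racing them.
            plan[index] = holders[index % len(holders)]
    return plan
```

Only the blocks whose hashes differ need to cross the network, so a small edit to a large file does not cost a full retransmission.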

Discovery and security

Interestingly, neither host names nor IP addresses are involved in any stage of the configuration process — by default, at least; the systems find each other based only on the device ID regardless of which networks they are attached to. This, clearly, requires some third-party help. The Syncthing project runs a set of "discovery" servers that will help systems find each other based on their device IDs. There is also a set of "relay servers" that can relay data in situations where the systems involved cannot reach each other directly — when they are both behind NAT firewalls, for example.

Some thought has clearly gone into the security implications of this architecture. Data only goes through relay servers if there is no alternative, for example, and it is encrypted at the endpoints. But there is still some information that a hostile discovery or relay server could obtain that might worry some users. For anybody who is truly worried, the code for both types of server is available; anybody can set up private servers and configure their Syncthing instances to use only those.

According to the documentation, device IDs need not be kept secret, since an affirmative action is required on both sides to set up a connection. One might wonder whether an attacker might try to set up a system with a target's device ID and thus gain access to the managed files. That ID, though, is essentially a public key, and the connection process involves proving possession of the associated private key, so such an attack should not be possible. The Syncthing documentation describes device IDs in more detail.
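The pinning half of that argument can be sketched as follows: derive a fingerprint from the certificate a peer presents and compare it with the one configured for that device. This is a simplified illustration, not Syncthing's exact ID encoding, and the function names are invented here; proof that the peer holds the matching private key is assumed to come from the TLS handshake, which this sketch does not perform.

```python
import base64
import hashlib
import hmac


def fingerprint(cert_der: bytes) -> str:
    """Base32-encoded SHA-256 digest of a certificate's DER bytes, without padding."""
    digest = hashlib.sha256(cert_der).digest()
    return base64.b32encode(digest).decode("ascii").rstrip("=")


def peer_is_expected(expected: str, presented_cert_der: bytes) -> bool:
    """Accept a peer only if its certificate hashes to the pinned fingerprint."""
    return hmac.compare_digest(expected, fingerprint(presented_cert_der))
```

An attacker who merely knows the fingerprint cannot produce a different certificate that hashes to it, and cannot complete the handshake with the real certificate without its private key.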

Syncthing on the move

[Syncthing Android app]

Perhaps the most common use of synchronization on today's net is copying photos from a phone handset to a central server. Since Android phones, at least, are Linux-based, one need only set up a normal shell environment on it and put Syncthing there to achieve this goal; the process shouldn't take more than a day or so. Or one could just install the Android app, which is available on F-Droid and the Google Play Store as well. This app, shown on the right, comes with a folder for the camera (set for send-only sharing) configured out of the box, so it is just a matter of setting up the peers. And, lest one worry about typing one of those device IDs with an on-screen keyboard, the app can read the QR code that the web interface will helpfully provide, easing that process considerably.

One slightly surprising behavior is that the app asks for location permission, which doesn't seem like something it would need. That permission is needed to determine which WiFi network (if any) the phone is on, which is useful for the feature configuring when synchronization should (and should not) be performed. Users of metered WiFi services may want to use this mechanism to avoid synchronization when it could cost them money. In the absence of this permission, the app will, by default, perform synchronization whenever it is connected to any WiFi network.

One need not look far to find complaints from users that the Android app drains the battery quickly. Your editor has not observed this behavior in a limited amount of testing; it is possible that the worst problems have already been fixed.

Closing thoughts

The project states that "security is one of the primary project goals", and the developers do appear to have put some thought into the issue. Encryption is used in the right places, certificates are verified, etc. A quick CVE search turns up two entries over the last four years, one of which enabled the overwriting of arbitrary files. Exploiting that vulnerability would require first gaining control of one of the machines in the sharing network, at which point the battle is likely lost anyway. It does not seem that any sort of formal security audit has been done, but the Syncthing developers are at least making the right kinds of noises.

With regard to reliability, it is not hard to search for (and find) various scary stories from users who have lost data with Syncthing. It seems that many of those problems are the result of operator error; if you set up a system and allow it to delete all your data, it may eventually conclude that you want it to do exactly that. Synchronization can be amazingly efficient at propagating mistakes. Use of versioning can help, as can avoiding the use of two-way synchronization whenever possible. Syncthing doesn't seem like it has a lot of data-losing bugs, but backups are always a good idea.

Syncthing has been syncing things since at least 2013, when the first commit appears in its Git repository; LWN looked at it in 2014. The project is written mostly in Go, and is distributed under the Mozilla Public License. The current Syncthing release is 1.18.0; it came out on July 6, while this article was being written. The project shows a nearly monthly release cadence in the last year; 1.7.0 was released on July 7, 2020. There have been 728 non-merge commits to the Syncthing repository over the last year from 40 developers; the top three developers (Simon Frei, Jakob Borg, and Jesse Lucas) account for just over 76% of those commits. The project is thus not swarming with developers, but it appears healthy enough for now.

A company called Kastelo offers support subscriptions for Syncthing and provides significant resources for Syncthing development. The company also is part of the Syncthing Foundation which, in turn, manages the project's infrastructure and makes grants for development projects.

All told, Syncthing leaves a favorable impression. The developers seem to have done the work to create a system that is capable, reliable, secure, and which performs reasonably well. But they have also done the work to make it all easy to set up and make use of — the place where a lot of free-software projects seem to fall down. It is an appealing tool for anybody wanting to take control of their data synchronization and replication needs.

