X11: How does “the” clipboard work?
source link: https://www.uninformativ.de/blog/postings/2017-04-02/0/POSTING-en.html
2017-04-02
If you have used another operating system before you switched to something that runs X11, you will have noticed that there is more than one clipboard:
- Sometimes, you can use the mouse to select some text, switch to another window, and then hit the middle mouse button to paste text.
- Sometimes, you can select text, then hit some hotkey, e.g. Ctrl+C, switch to another window, hit another hotkey, e.g. Ctrl+V, and paste said text.
- Sometimes, you can do both.
Those two clipboards usually don’t interfere. You can keep the content of the “Ctrl+C clipboard” while using the “middle mouse clipboard” to copy and paste something else.
How does that work? Is there more than one clipboard? How many are there? Do all X11 clients support all forms of clipboards?
Here’s the appropriate section of ICCCM on this topic.
Selections as a form of IPC
First things first, in X11 land, “clipboards” are called “selections”.
Yes, there is more than one selection and they all work independently. In fact, you can use as many selections as you wish. In theory, that is. When using selections, you make different clients communicate with each other. This means that those clients have to agree on which selections to use. You can’t just invent your own selection and then expect Firefox to be compatible with it.
Looking at it from a very high altitude, it goes like this:
Client A                  X Server                  Client B
----------------------------------------------------------------

(1) | I own selection FOO!   |
    | ---------------------> |

                             | Write sel. FOO to BAR!  | (2)
                             | <---------------------- |

    | Write sel. FOO to BAR! |
    | <--------------------- |

    |                   Here is FOO.                   |
    | ------------------------:----------------------> |

    |                  Okay, got it.                   |
    | <-----------------------:----------------------- |
(1) means every client can claim ownership of any selection at any time. It only informs the X server about that; no data is transferred yet. This is an important thing to understand. The X server is nothing more than a broker. It takes note of which client owns which selection.
In (2), another client asks the X server to send it the content of selection “FOO”. The X server simply relays that request to the current owner of that selection. Client A is then responsible for actually transmitting the data to client B.
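To make the broker idea concrete, here is a toy model in plain C. It is not Xlib code and every name in it (toy_server, toy_set_owner, toy_relay_target) is made up for illustration. It only shows the bookkeeping described above: the server remembers who owns which selection and never stores any content.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_SELECTIONS 8
#define TOY_NO_OWNER 0   /* plays the role of "no owner" */

/* The toy server only remembers who owns what; it never stores content. */
struct toy_server {
    const char *names[TOY_MAX_SELECTIONS];
    int owners[TOY_MAX_SELECTIONS];
    size_t count;
};

/* Step (1): a client claims a selection. Returns the previous owner, so
 * the caller could tell that client it has lost ownership. */
int
toy_set_owner(struct toy_server *srv, const char *sel, int client)
{
    size_t i;
    int previous;

    for (i = 0; i < srv->count; i++)
    {
        if (strcmp(srv->names[i], sel) == 0)
        {
            previous = srv->owners[i];
            srv->owners[i] = client;
            return previous;
        }
    }

    assert(srv->count < TOY_MAX_SELECTIONS);
    srv->names[srv->count] = sel;
    srv->owners[srv->count] = client;
    srv->count++;
    return TOY_NO_OWNER;
}

/* Step (2): the server looks up the owner; a request for this selection
 * would be relayed to that client. */
int
toy_relay_target(const struct toy_server *srv, const char *sel)
{
    size_t i;

    for (i = 0; i < srv->count; i++)
        if (strcmp(srv->names[i], sel) == 0)
            return srv->owners[i];
    return TOY_NO_OWNER;
}
```

If client 1 claims “FOO” and client 2 claims it afterwards, toy_set_owner() returns 1 on the second call and toy_relay_target() then points at client 2.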
How are selections identified?
Above, I just called it “selection FOO”, meaning it’s a rather arbitrary identifier that you can choose. If you have worked with X11 before, this won’t be surprising to you: Selections are identified by atoms.
Quick recap: Atoms are a way to identify something in X11 and they are basically strings. Internally, a number is allocated for each atom, but you rarely need to ask the X server, “what’s the name of atom number 42?”
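The following is a rough sketch of what “a number is allocated for each atom” means. It is a toy table in plain C, not how any X server actually implements atoms, and the names toy_atoms, toy_intern and toy_atom_name are hypothetical. The only_if_exists flag mirrors the third argument of XInternAtom().

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_ATOMS 16
#define TOY_NONE 0

/* A toy table: the atom number is simply the position in the table plus
 * one. It stores the caller's pointers; a real table would copy names. */
struct toy_atoms {
    const char *names[TOY_MAX_ATOMS];
    int count;
};

/* Return the number for a name, allocating one on first use. If
 * only_if_exists is set, unknown names yield TOY_NONE instead. */
int
toy_intern(struct toy_atoms *t, const char *name, int only_if_exists)
{
    int i;

    for (i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return i + 1;

    if (only_if_exists)
        return TOY_NONE;

    assert(t->count < TOY_MAX_ATOMS);
    t->names[t->count++] = name;
    return t->count;
}

/* The reverse lookup: "what's the name of atom number n?" */
const char *
toy_atom_name(const struct toy_atoms *t, int atom)
{
    if (atom < 1 || atom > t->count)
        return NULL;
    return t->names[atom - 1];
}
```

Interning the same string twice returns the same number, which is why clients can compare atoms with == instead of comparing strings.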
There are three “standard” selection names:
- PRIMARY: The “middle mouse clipboard”
- SECONDARY: Virtually unused these days
- CLIPBOARD: The “Ctrl+C clipboard”
“Standard” means that they are specified by ICCCM 2.6.1. Yes, it’s confusing that one of the selections is named “clipboard”.
Program 1: Query selection owners
Knowing what we know now, we can ask the X server to tell us who owns which selection. This is xowners.c:
#include <stdio.h>
#include <X11/Xlib.h>

int
main()
{
    Display *dpy;
    Window owner;
    Atom sel;
    char *selections[] = { "PRIMARY", "SECONDARY", "CLIPBOARD", "FOOBAR" };
    size_t i;

    dpy = XOpenDisplay(NULL);
    if (!dpy)
    {
        fprintf(stderr, "Could not open X display\n");
        return 1;
    }

    for (i = 0; i < sizeof selections / sizeof selections[0]; i++)
    {
        sel = XInternAtom(dpy, selections[i], False);
        owner = XGetSelectionOwner(dpy, sel);
        printf("Owner of '%s': 0x%lX\n", selections[i], owner);
    }

    return 0;
}
Compilation of this program (and all of the following ones in a similar manner):
cc -Wall -Wextra -o xowners xowners.c -lX11
FOOBAR is a non-standard selection name. It’s perfectly valid to use it, but don’t expect it to work with all clients. :-)
As you can see, the program prints IDs of windows:
$ ./xowners
Owner of 'PRIMARY': 0x60080F
Owner of 'SECONDARY': 0x0
Owner of 'CLIPBOARD': 0x1E00024
Owner of 'FOOBAR': 0x0
Windows are another basic form of communication between clients, meaning they do not necessarily work as “boxes of pixels”. Unmapped windows can exist in an X11 session just fine (and there usually are many of them).
We can use the xwininfo tool to find out more about those two windows:
$ xwininfo -id 0x60080F | grep '^xwininfo'
xwininfo: Window id: 0x60080f "xiate"
$ xwininfo -id 0x1E00024 | grep '^xwininfo'
xwininfo: Window id: 0x1e00024 "lariza"
Aha, so xiate is holding the PRIMARY selection, while lariza owns CLIPBOARD.
Let’s have a look at the full output of one of these commands:
$ xwininfo -id 0x60080F
xwininfo: Window id: 0x60080f "xiate"
Absolute upper-left X: -100
Absolute upper-left Y: -100
Relative upper-left X: -100
Relative upper-left Y: -100
Width: 10
Height: 10
Depth: 0
...
Map State: IsUnMapped
...
This is, in fact, an unmapped window. Clients often do this. They create a window with the sole purpose of managing selections. Clients could use their visible window, but that’s problematic. Sometimes, visible windows are short-lived and ownership of a selection is lost when the window dies.
Content type and conversion
So far, so good. And so simple.
Things start to get complicated once you realize that some clients might use clipboards for text, others might use them for images, some might use them for audio data, and some other client might use them for a form of data that you have never heard of.
And then there are situations where you can provide the same data in different forms. To illustrate this, just select some text in a web browser. Copy it and paste it into Vim. You’ll get plain text. But if you paste the same selection into a program like LibreOffice Writer, you’ll not only get text but also text attributes, like “this is bold, this is a code block”, and so on.
Recall the diagram from above. Step 2 said: Client B tells the X server to write selection “FOO” to “BAR”. (We have not yet covered what “BAR” is, but we’ll get there soon.) Actually, it’s more like this: “Write selection ‘FOO’ to ‘BAR’ as content type ‘BAZ’.” In other words, client B can request the current content of selection “FOO” as text. Or as an image. Or as something else.
That’s why the library call to “get” the current content of a selection is called XConvertSelection() instead of XGetSelection().
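The idea of “the same selection in different forms” can be sketched with a toy owner in plain C. This is not Xlib code. toy_convert and the two hard-coded strings are assumptions made for illustration, and text/html merely stands in for “a richer form of the same data”.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* What the toy owner currently "has selected", in two forms. */
static const char toy_plain[] = "hello";
static const char toy_html[]  = "<b>hello</b>";

/* Write the selection to out in the requested form. Returns the number
 * of bytes written (without the trailing NUL), or -1 if the owner cannot
 * convert to that target or out is too small. */
int
toy_convert(const char *target, char *out, size_t out_size)
{
    const char *src;

    assert(target != NULL && out != NULL);

    if (strcmp(target, "UTF8_STRING") == 0)
        src = toy_plain;
    else if (strcmp(target, "text/html") == 0)
        src = toy_html;
    else
        return -1;

    if (strlen(src) + 1 > out_size)
        return -1;

    strcpy(out, src);
    return (int)strlen(src);
}
```

A plain-text editor would ask for UTF8_STRING and a word processor for the richer form. Asking this owner for image/png fails, just like a real owner can refuse a conversion.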
Program 2: Get clipboard as UTF-8
This is an example of “client B”:
#include <stdio.h>
#include <X11/Xlib.h>

void
show_utf8_prop(Display *dpy, Window w, Atom p)
{
    Atom da, incr, type;
    int di;
    unsigned long size, dul;
    unsigned char *prop_ret = NULL;

    /* Dummy call to get type and size. */
    XGetWindowProperty(dpy, w, p, 0, 0, False, AnyPropertyType,
                       &type, &di, &dul, &size, &prop_ret);
    XFree(prop_ret);

    incr = XInternAtom(dpy, "INCR", False);
    if (type == incr)
    {
        printf("Data too large and INCR mechanism not implemented\n");
        return;
    }

    /* Read the data in one go. */
    printf("Property size: %lu\n", size);

    XGetWindowProperty(dpy, w, p, 0, size, False, AnyPropertyType,
                       &da, &di, &dul, &dul, &prop_ret);
    printf("%s", prop_ret);
    fflush(stdout);
    XFree(prop_ret);

    /* Signal the selection owner that we have successfully read the
     * data. */
    XDeleteProperty(dpy, w, p);
}

int
main()
{
    Display *dpy;
    Window owner, target_window, root;
    int screen;
    Atom sel, target_property, utf8;
    XEvent ev;
    XSelectionEvent *sev;

    dpy = XOpenDisplay(NULL);
    if (!dpy)
    {
        fprintf(stderr, "Could not open X display\n");
        return 1;
    }

    screen = DefaultScreen(dpy);
    root = RootWindow(dpy, screen);

    sel = XInternAtom(dpy, "CLIPBOARD", False);
    utf8 = XInternAtom(dpy, "UTF8_STRING", False);

    owner = XGetSelectionOwner(dpy, sel);
    if (owner == None)
    {
        printf("'CLIPBOARD' has no owner\n");
        return 1;
    }
    printf("0x%lX\n", owner);

    /* The selection owner will store the data in a property on this
     * window: */
    target_window = XCreateSimpleWindow(dpy, root, -10, -10, 1, 1, 0, 0, 0);

    /* That's the property used by the owner. Note that it's completely
     * arbitrary. */
    target_property = XInternAtom(dpy, "PENGUIN", False);

    /* Request conversion to UTF-8. Not all owners will be able to
     * fulfill that request. */
    XConvertSelection(dpy, sel, utf8, target_property, target_window,
                      CurrentTime);

    for (;;)
    {
        XNextEvent(dpy, &ev);
        switch (ev.type)
        {
            case SelectionNotify:
                sev = (XSelectionEvent*)&ev.xselection;
                if (sev->property == None)
                {
                    printf("Conversion could not be performed.\n");
                    return 1;
                }
                else
                {
                    show_utf8_prop(dpy, target_window, target_property);
                    return 0;
                }
                break;
        }
    }
}
This is more code than you expected? Yup. But bear with me. We’ll go through it step by step.
First, let’s uncover what “BAR” is. You see that the code above creates a target_window and an atom target_property. These two things together are “BAR”. When client A sends the content of a selection to client B, it does so by writing the data to a property on a window. This is virtually the only way two X11 clients can communicate arbitrary data through the X server.
Remember that X11 is network transparent. Clients A and B need not run on the same host. They need not even use the same network protocols. One might use TCP/IP, the other might use … whatever. ICCCM uses DECnet as an example, which nobody uses anymore today, probably. As a result, they must not communicate directly, but only through the X server.
Okay. Our target “BAR” is a window and a property.
We also need a content type. Here, I used UTF8_STRING. You won’t find this atom name in ICCCM. UTF-8 did not even exist when ICCCM was first published. Newer clients support it, though.
We then ask the X server to “perform” the conversion: XConvertSelection(). Now look closely at the first diagram at the top of this article. There is no immediate response to XConvertSelection(). The X server must first relay that request to client A, provided that there even is a selection owner right now. Then, at some point in the future, client A decides to do its work, or maybe not. This means that we can only wait for some X event to happen. That’s what the loop at the bottom of the code is for. The event SelectionNotify tells us that a conversion has happened or failed. We can then go ahead and read the property on our very own window; client A should have written its data to that property.
Some things to note:
- Client A might fail to deliver its data. It may have crashed. Or whatever. Client B must not block and wait for the data transfer to be finished.
- Client A might fail to convert the data. This happens, for example, when you ask The GIMP to give you UTF-8 from the clipboard, when GIMP has actually stored image data.
- The call to XDeleteProperty() tells client A that we have successfully read the data.
- It’s not required to ask for the current owner of a selection before asking for a conversion. I only did that to check if there is a selection owner right now. (If you don’t do that, you just get a “conversion failed”.)
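Program 2 waits in XNextEvent() forever if client A never answers. One way to avoid that, sketched here under the assumption of a POSIX system, is to wait on the connection’s file descriptor with a timeout. The helper name wait_readable is made up for this example, and the code contains no Xlib calls so that it stands on its own.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>

/* Wait until fd becomes readable or timeout_ms milliseconds elapse.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int
wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    assert(fd >= 0);

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry if a signal interrupted the call. Note that this restarts
     * the full timeout, which is good enough for a sketch. */
    do
        rc = poll(&pfd, 1, timeout_ms);
    while (rc == -1 && errno == EINTR);

    if (rc > 0)
        return (pfd.revents & POLLIN) ? 1 : -1;

    return rc;  /* 0 = timeout, -1 = error */
}
```

In an Xlib client, the descriptor would come from ConnectionNumber(dpy). Since Xlib may already have read events into its own queue, it makes sense to check XPending(dpy) first and only wait on the descriptor when nothing is queued.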
Program 3: Owning a selection
This is the other direction: a client that claims ownership of CLIPBOARD and provides data if asked for type UTF8_STRING. So, this is client A:
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <X11/Xlib.h>

void
send_no(Display *dpy, XSelectionRequestEvent *sev)
{
    XSelectionEvent ssev;
    char *an;

    an = XGetAtomName(dpy, sev->target);
    printf("Denying request of type '%s'\n", an);
    if (an)
        XFree(an);

    /* All of these should match the values of the request. */
    ssev.type = SelectionNotify;
    ssev.requestor = sev->requestor;
    ssev.selection = sev->selection;
    ssev.target = sev->target;
    ssev.property = None;  /* signifies "nope" */
    ssev.time = sev->time;

    XSendEvent(dpy, sev->requestor, True, NoEventMask, (XEvent *)&ssev);
}

void
send_utf8(Display *dpy, XSelectionRequestEvent *sev, Atom utf8)
{
    XSelectionEvent ssev;
    time_t now_tm;
    char *now, *an;

    now_tm = time(NULL);
    now = ctime(&now_tm);

    an = XGetAtomName(dpy, sev->property);
    printf("Sending data to window 0x%lx, property '%s'\n", sev->requestor, an);
    if (an)
        XFree(an);

    XChangeProperty(dpy, sev->requestor, sev->property, utf8, 8, PropModeReplace,
                    (unsigned char *)now, strlen(now));

    ssev.type = SelectionNotify;
    ssev.requestor = sev->requestor;
    ssev.selection = sev->selection;
    ssev.target = sev->target;
    ssev.property = sev->property;
    ssev.time = sev->time;

    XSendEvent(dpy, sev->requestor, True, NoEventMask, (XEvent *)&ssev);
}

int
main()
{
    Display *dpy;
    Window owner, root;
    int screen;
    Atom sel, utf8;
    XEvent ev;
    XSelectionRequestEvent *sev;

    dpy = XOpenDisplay(NULL);
    if (!dpy)
    {
        fprintf(stderr, "Could not open X display\n");
        return 1;
    }

    screen = DefaultScreen(dpy);
    root = RootWindow(dpy, screen);

    /* We need a window to receive messages from other clients. */
    owner = XCreateSimpleWindow(dpy, root, -10, -10, 1, 1, 0, 0, 0);

    sel = XInternAtom(dpy, "CLIPBOARD", False);
    utf8 = XInternAtom(dpy, "UTF8_STRING", False);

    /* Claim ownership of the clipboard. */
    XSetSelectionOwner(dpy, sel, owner, CurrentTime);

    for (;;)
    {
        XNextEvent(dpy, &ev);
        switch (ev.type)
        {
            case SelectionClear:
                printf("Lost selection ownership\n");
                return 1;
                break;
            case SelectionRequest:
                sev = (XSelectionRequestEvent*)&ev.xselectionrequest;
                printf("Requestor: 0x%lx\n", sev->requestor);
                /* Property is set to None by "obsolete" clients. */
                if (sev->target != utf8 || sev->property == None)
                    send_no(dpy, sev);
                else
                    send_utf8(dpy, sev, utf8);
                break;
        }
    }
}
It creates an invisible window and then claims ownership of CLIPBOARD. As you can see, it is not “the client” that owns a selection, but a window.
The program then waits for events. SelectionClear is simple: Some other client has claimed ownership of the clipboard. Yes, that can happen at any time.
SelectionRequest is sent to client A by the X server. It’s the event that the X server generates due to a call to XConvertSelection() by client B. We now simply check if target is UTF8_STRING. If it’s not, we deny the request. But if it is, we call XChangeProperty() to alter the given property on the given target window. Once we’ve done that, we generate a SelectionNotify event and send it to client B.
This client sends the current date and time to requestors. I did this to illustrate further how selections don’t store data in the X server. Data is converted (and possibly generated) only when another client asks for it.
Program 4: Content type TARGETS
There are some special content types. You can ask the owner of a selection to convert the selection into the type TARGETS. This sounds a bit weird, but it’s simple. Client A will not respond with the actual data but with a list of atoms. Each atom is a valid target for the current data.
#include <stdio.h>
#include <X11/Xatom.h>
#include <X11/Xlib.h>

void
show_targets(Display *dpy, Window w, Atom p)
{
    Atom type, *targets;
    int di;
    unsigned long i, nitems, dul;
    unsigned char *prop_ret = NULL;
    char *an = NULL;

    /* Read the first 1024 atoms from this list of atoms. We don't
     * expect the selection owner to be able to convert to more than
     * 1024 different targets. :-) */
    XGetWindowProperty(dpy, w, p, 0, 1024 * sizeof (Atom), False, XA_ATOM,
                       &type, &di, &nitems, &dul, &prop_ret);

    printf("Targets:\n");
    targets = (Atom *)prop_ret;
    for (i = 0; i < nitems; i++)
    {
        an = XGetAtomName(dpy, targets[i]);
        printf("    '%s'\n", an);
        if (an)
            XFree(an);
    }
    XFree(prop_ret);

    XDeleteProperty(dpy, w, p);
}

int
main()
{
    Display *dpy;
    Window target_window, root;
    int screen;
    Atom sel, targets, target_property;
    XEvent ev;
    XSelectionEvent *sev;

    dpy = XOpenDisplay(NULL);
    if (!dpy)
    {
        fprintf(stderr, "Could not open X display\n");
        return 1;
    }

    screen = DefaultScreen(dpy);
    root = RootWindow(dpy, screen);

    sel = XInternAtom(dpy, "CLIPBOARD", False);
    targets = XInternAtom(dpy, "TARGETS", False);
    target_property = XInternAtom(dpy, "PENGUIN", False);

    target_window = XCreateSimpleWindow(dpy, root, -10, -10, 1, 1, 0, 0, 0);

    XConvertSelection(dpy, sel, targets, target_property, target_window,
                      CurrentTime);

    for (;;)
    {
        XNextEvent(dpy, &ev);
        switch (ev.type)
        {
            case SelectionNotify:
                sev = (XSelectionEvent*)&ev.xselection;
                if (sev->property == None)
                {
                    printf("Conversion could not be performed.\n");
                    return 1;
                }
                else
                {
                    show_targets(dpy, target_window, target_property);
                    return 0;
                }
                break;
        }
    }
}
Running this when a typical GTK client currently owns a simple text selection reveals something interesting:
$ ./xtargets
Targets:
'TIMESTAMP'
'TARGETS'
'MULTIPLE'
'SAVE_TARGETS'
'UTF8_STRING'
'COMPOUND_TEXT'
'TEXT'
'STRING'
'text/plain;charset=utf-8'
'text/plain'
X11 is old and many conventions exist on how to specify data types. Some of them are legacy, some are ambiguous, many not even mentioned by ICCCM. MIME types are fine today, but ICCCM does not talk about MIME types in any way.
This feels a little messy, yes. Being compatible with today’s clients and clients from 30 years ago isn’t easy.
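Given such a list, a requestor has to decide which target to ask for. One simple approach, sketched here in plain C, is to walk your own preference list and take the first entry that the owner offers. pick_target is a hypothetical helper, not part of Xlib, and it works on atom names rather than Atom values to stay self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the first entry of preferred[] that also appears in offered[],
 * or NULL if the two lists have nothing in common. */
const char *
pick_target(const char *const *offered, size_t n_offered,
            const char *const *preferred, size_t n_preferred)
{
    size_t i, j;

    assert(offered != NULL && preferred != NULL);

    for (i = 0; i < n_preferred; i++)
        for (j = 0; j < n_offered; j++)
            if (strcmp(preferred[i], offered[j]) == 0)
                return preferred[i];

    return NULL;
}
```

With the GTK list from above and the preferences UTF8_STRING, then STRING, this picks UTF8_STRING. If you only accept image/png, you get NULL and know not to bother asking.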
Handling binary data using xclip
I’ve been wondering for a long time why I’m unable to paste an image using xclip. It should be simple: xclip -o >foo.img. Well, no.
Knowing what I know now, it finally is simple. :-)
First, copy an image using a tool like The GIMP.
xclip can query TARGETS:
$ xclip -o -target TARGETS -selection clipboard
TIMESTAMP
TARGETS
MULTIPLE
SAVE_TARGETS
image/png
image/tiff
image/x-icon
image/x-ico
image/x-win-bitmap
image/vnd.microsoft.icon
application/ico
image/ico
image/icon
text/ico
image/bmp
image/x-bmp
image/x-MS-bmp
image/jpeg
Choose something that you like. And then ask for the data:
$ xclip -o -target image/png -selection clipboard >foo.png
$ file foo.png
foo.png: PNG image data, 373 x 309, 8-bit/color RGBA, non-interlaced
Not a big deal. Using xclip to copy image data works the same way, just specify a MIME type using -t.
Large amounts of data
You might have noticed that program 2 aborts if there’s something involved called INCR. This is one of the many hacks in the world of X11 selections.
Properties on windows can only hold a limited amount of data, because they live in the memory of the X server. If you want to transfer several megabytes by using selections, you can still do that. You just have to chunk your data and client B must read data in chunks. Usually, the size of each chunk is about 256 kB. Not that much, but sufficient in most cases. It makes clients more complicated, though, because each client must implement that chunking mechanism.
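The chunking handshake can be modelled without an X server. The following toy in plain C is only a sketch of the idea, not the real protocol: the owner writes one chunk into a single “property” slot, the reader appends it to its buffer and deletes the property, and a zero-length chunk marks the end. All names (toy_property, toy_owner_write, toy_reader_take, toy_incr_transfer) and the tiny chunk size are made up for illustration. In real INCR, both sides are driven by PropertyNotify events instead of a loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_CHUNK 4  /* tiny on purpose; real chunks are far larger */

/* Stand-in for the property on client B's window. */
struct toy_property {
    unsigned char data[TOY_CHUNK];
    size_t len;
    int present;  /* 0 after the reader "deleted" the property */
};

/* Owner side: write the next chunk. A zero-length chunk marks the end.
 * Returns the new offset into src. */
size_t
toy_owner_write(struct toy_property *p, const unsigned char *src,
                size_t src_len, size_t offset)
{
    size_t n;

    assert(!p->present);  /* wait until the reader has consumed it */
    assert(offset <= src_len);

    n = src_len - offset;
    if (n > TOY_CHUNK)
        n = TOY_CHUNK;

    memcpy(p->data, src + offset, n);
    p->len = n;
    p->present = 1;
    return offset + n;
}

/* Reader side: append the chunk to dst and delete the property.
 * Returns the chunk length; 0 means the transfer is complete. */
size_t
toy_reader_take(struct toy_property *p, unsigned char *dst, size_t *dst_len)
{
    size_t n = p->len;

    assert(p->present);
    memcpy(dst + *dst_len, p->data, n);
    *dst_len += n;
    p->present = 0;  /* deleting signals "send the next one" */
    return n;
}

/* Drive both sides in lockstep. Returns the number of chunks written,
 * including the terminating empty one. dst must hold src_len bytes. */
size_t
toy_incr_transfer(const unsigned char *src, size_t src_len,
                  unsigned char *dst, size_t *dst_len)
{
    struct toy_property p = { {0}, 0, 0 };
    size_t offset = 0, chunks = 0;

    *dst_len = 0;
    for (;;)
    {
        offset = toy_owner_write(&p, src, src_len, offset);
        chunks++;
        if (toy_reader_take(&p, dst, dst_len) == 0)
            return chunks;
    }
}
```

Transferring 12 bytes with a chunk size of 4 takes four writes: three data chunks and the empty one that ends the transfer.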
Clipboard managers
In your everyday work, you might have noticed this: You open a window, select some text, hit Ctrl+C, and then close the window. What happens? The selection is lost. Of course it is, the client window that owned the selection is gone. This is different from other operating systems. And even if all operating systems worked like that, it would still be annoying.
There is no “clean” solution to this problem in X11. Instead, ICCCM suggests the use of clipboard managers. They work like this:
- The clipboard manager shall claim ownership of a selection.
- Once it loses ownership, it will:
- Ask the current owner for the content.
- Provide the content itself.
- Re-claim ownership.
This feels like there are many race conditions involved. It will also break when a client does not support the TARGETS target. Yes, supporting this target is required by ICCCM, so it “should” work.
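The steps in the list above can be sketched as a toy in plain C. This is not how any particular clipboard manager is implemented. toy_manager and its functions are hypothetical, and the content of the new owner is passed in as a plain string instead of being fetched through a real conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_CACHE_SIZE 64

struct toy_manager {
    char cache[TOY_CACHE_SIZE];
    int is_owner;
};

/* Steps taken when the manager loses ownership to another client.
 * new_owner_content stands for what the new owner returned when asked
 * to convert the selection; NULL means the conversion failed.
 * Returns 1 if the manager re-claimed ownership, 0 otherwise. */
int
toy_manager_on_lost(struct toy_manager *m, const char *new_owner_content)
{
    assert(m != NULL);
    m->is_owner = 0;

    /* Ask the current owner for the content. */
    if (new_owner_content == NULL
        || strlen(new_owner_content) >= TOY_CACHE_SIZE)
        return 0;

    /* Keep a copy so the manager can provide the content itself. */
    strcpy(m->cache, new_owner_content);

    /* Re-claim ownership. */
    m->is_owner = 1;
    return 1;
}

/* What the manager hands out while it owns the selection. */
const char *
toy_manager_serve(const struct toy_manager *m)
{
    return m->is_owner ? m->cache : NULL;
}
```

The window between losing ownership and re-claiming it is where the race conditions live: if the new owner disappears before the manager has fetched the content, there is nothing left to cache.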
Summary
I think it’s important to understand that the X server is just a broker. Clients talk to each other (via the server), exchanging content. There is no clipboard “inside” of the server. Data is converted on the fly. You can have as many selections as you like, but not all clients support all of them.
One final thing to note: At first sight, selections in X11 appear to be simple. I fear, though, that they are almost as complicated as time zones. Even the “standard” utility xclip isn’t strictly ICCCM-compliant and contains the occasional “FIXME”. There are many race conditions and many corner cases.
tl;dr: If possible, use a library.
Addendum, 2019-07-28
The original code from 2017 contained some calls to XSelectInput(). I wanted to tell the X server that my windows are to receive events like SelectionNotify. Doing this was wrong, though. There simply are no masks for those events and you don’t need to select for them.
Thanks, Ulrich!