The Science of Visual Data Communication: What Works


Abstract

Effectively designed data visualizations allow viewers to use their powerful visual systems to understand patterns in data across science, education, health, and public policy. But ineffectively designed visualizations can cause confusion, misunderstanding, or even distrust—especially among viewers with low graphical literacy. We review research-backed guidelines for creating effective and intuitive visualizations oriented toward communicating data to students, coworkers, and the general public. We describe how the visual system can quickly extract broad statistics from a display, whereas poorly designed displays can lead to misperceptions and illusions. Extracting global statistics is fast, but comparing between subsets of values is slow. Effective graphics avoid taxing working memory, guide attention, and respect familiar conventions. Data visualizations can play a critical role in teaching and communication, provided that designers tailor those visualizations to their audience.

This report presents research-backed guidelines for creating powerful and intuitive visualizations oriented toward communicating data to students, coworkers, and the general public. We begin by reviewing guidelines for helping viewers extract data from visualizations in precise and unbiased ways, avoiding a set of known illusions and distortions. We then describe when visual processing of visualizations is powerful (processing broad statistics) versus where it slows to a crawl (making individual comparisons), and we provide a tool kit for avoiding that slowdown. We review guidelines for ensuring that a viewer properly maps visualized values to the right concepts in the world (e.g., viewers can extract the size of an error bar on a graph, but do they understand what it means?), allowing viewers to use visualizations as effective tools for reasoning. We then review guidelines for conveying uncertainty and risk (e.g., how could a physician express survival odds for a treatment to a patient?). Finally, we summarize a set of guidelines for creating visualizations that communicate clearly and suggest resources for readers interested in learning more.

Data visualizations range from simple graphs in elementary school classrooms, to depictions of uncertainty in election forecasts in news media, to complex data displays used by scientists and analysts. When designed effectively, these displays leverage the human visual system’s massive processing power, allowing rapid foraging through patterns in data and intuitive communication of those patterns to other viewers. But when designed ineffectively, these displays leave critical patterns opaque or leave viewers confused about how to navigate unfamiliar displays.

We review methods, empirical findings, theories, and prescriptions across the fields of visual perception, graph comprehension, information visualization, data-based reasoning, uncertainty representation, and health risk communication. These research communities study similar questions and use complementary expertise and styles of inquiry, yet they too rarely connect. We ignore artificial boundaries among these research fields, and instead integrate across them.

The Importance of Visualization Design and Literacy

Thinking and communicating with data visualizations is critical for an educated public (Börner et al., 2019). Science education standards require students to use visualizations to understand relationships, to reason about scientific models, and to communicate data to others (National Governors Association Center for Best Practices and Council of Chief State School Officers, 2010; National Research Council, 2013). Evidence-based public policy prescriptions about climate change, vaccines, and policing are argued to be most effectively built (Kohlhammer et al., 2012) and communicated to the public (Otten et al., 2015) with visualizations. Journalists at The New York Times Upshot, FiveThirtyEight, The Economist, and The Washington Post use visualizations to communicate data and evidence about statistics and policy. Data visualizations are ubiquitous in the workplace—in data-analysis software, in data-overview dashboards, and in millions of slide presentations created each day (Berinato, 2016; Parker, 2001). Physicians rely on them to show data about the risks of medical procedures, and meteorologists use them to show the uncertainty in a hurricane’s potential path (Ancker et al., 2006; Ruginski et al., 2016).

In each of these domains, low graphical literacy and ineffective design lead many viewers to struggle to understand these otherwise powerful thinking tools. Many students can find textbook visualizations too challenging to understand or integrate with nearby text (Nistal et al., 2009; Shah & Hoeffner, 2002). Public policy visualizations can be counterintuitively designed, leading many viewers to draw a conclusion opposite the one suggested by the depicted data (Engel, 2014). Dozens of best-selling guides have decried the state of visualizations in the workplace and offered prescriptions for more powerful, clear, and persuasive graphs (see the Recommended Practitioner Books section at the end of this article and Ajani et al., 2021, for a more exhaustive list). Medical-risk visualizations can lead patients to fundamentally misunderstand the base rates or risk factors for diseases or medical procedures (Ancker et al., 2006). When a prediction has a high level of uncertainty that is not intuitively conveyed, the public can lose trust in scientists and analysts. For example, when a hurricane’s path deviates somewhat from the most likely trajectory, or when a politician with a 20% predicted chance to win an election prevails, these outcomes may be consistent with the uncertainty inherent to the predictions. But if the forecaster does not effectively visually communicate that uncertainty, their reputation can suffer when their prediction is “wrong” (Padilla et al., 2021).

Who Studies the Design and Comprehension of Visualizations?

Research on the design and pedagogy of data visualizations takes place in several communities. A psychologist focusing on perception might study the mapping between a color value in a heat map and the abstract magnitude that an observer extracts from it (Stevens, 1957). A cognitive psychologist might explore how working memory limits the complexity of the statistical relationships that a viewer might extract (Halford et al., 2007; Padilla et al., 2018). An education researcher might try to remove roadblocks for students struggling to translate visual depictions to their underlying concepts (Börner & Polley, 2014; Shah & Hoeffner, 2002) or seek multimedia design principles for designing effective graphics and integrating them with text (e.g., Mayer & Fiorella, in press). Researchers in public policy communication or political science might study why viewers find some visualizations to be more trustworthy or persuasive than others (Nyhan & Reifler, 2019). Health communication researchers evaluate how to effectively communicate the risk of a medical procedure to patients with low numeracy (i.e., ability to work with numbers and mathematics; Ancker et al., 2006). Specialists in statistical cognition and communication seek ways to communicate uncertainty across election outcomes, bus arrival times, and hurricane paths (Hullman, 2019). Finally, a research community housed in computer and information sciences studies data visualization at multiple levels, from data types and algorithms to the creation of user task taxonomies, to design prescriptions for visually powerful displays and fluid interaction (Munzner, 2014).

In this article, we also draw advice from communities of practitioners who might not engage in empirical research but use extensive in-context experience to generate prescriptions for powerful and intuitive visualizations. At the end of this review, we include a list of recommended visualization-design guidebooks. Although many of these guides are oriented toward business analysts, their prescriptions extrapolate directly to science, education, and public policy visualizations. We also discuss design techniques used by a new wave of journalists focused on communicating data analysis to the general public.

The Structure of Our Review

This review focuses on how to effectively design visualizations that communicate data to students and the general public. We review evidence-based prescriptions for designing visualizations that help people understand and reason about the patterns, models, and uncertainties carried by a data set. Another important topic, which we do not cover systematically here, is how to measure visualization literacy and the effectiveness of teaching techniques that improve it (see Börner et al., 2019; Lee et al., 2016). We also restrict our scope to quantitative visualizations, omitting discussion of qualitative visualizations of text data, diagrams, and processes (see Hegarty, 2011; Henderson & Segal, 2013, for review). We focus on research and prescriptions that are most relevant for communication to nonspecialist audiences, instead of the design of powerful tools for data analysis within expert communities.

We first illustrate why visualizations can be such powerful tools for thinking about data. Because the human visual system is highly developed for rapid parallel extraction of behavior-relevant features and relationships, visualizations allow us to process some types of patterns across an entire two-dimensional array of values at once. We describe the limited set of visual channels that can effectively depict magnitudes to a viewer, such as the position of a value in a dot plot, the size of a circle hovering over a map, or the color intensity of an activation pattern in a functional MRI (fMRI) image.

We then discuss design guidelines for ensuring that the human eye accurately decodes those depicted values. We review evidence for a ranking of some visual channels (e.g., position) as more precise than others (e.g., color intensity) for at least one common task but also discuss how new work has begun to dismantle that ranking for a broader array of tasks. We list a set of common errors and illusions that cause viewers to extract the underlying values from visual channels incorrectly—for example, y-axis manipulations that exaggerate differences among values, confusion about whether circles depict values with their area or diameter (which can change the extracted value by an order of magnitude), a common illusion produced by line graphs, and other illusions and categorical distortions that can arise when depicting value with color intensity. We also include a brief review of accessibility considerations for viewers with color blindness. Finally, we discuss best practices for distinguishing between groups of data (say, two groups of points on a scatterplot) by marking them with different shapes or colors.

Next, we discuss an important dissociation in visual processing power: Whereas computing statistics across an image is broad and instantaneous, making comparisons among subsets of values is slow and limited to two or three comparisons per second. We review the types of grouping cues that loosely control what information is compared by a typical viewer and further techniques for precisely guiding a viewer to the right comparison. We discuss the importance of respecting a viewer’s limited working memory, including avoiding legends and animated displays that can engage but also confuse. Finally, we review evaluations of whether visualizations should have rich and memorable designs, as opposed to a minimalist and clean aesthetic.

The next section introduces visualization schemas, or knowledge structures that include default expectations, rules, and associations that a viewer uses to extract conceptual information from a data visualization. We illustrate the importance of schemas by introducing the reader to a small set of new visualization designs that will likely be unfamiliar. We then provide examples of common schema elements that are known to more graphically literate audiences (but not always respected by designers), such as the assumption that larger values are plotted upward. We shift to a brief review of human reasoning about visualizations, including formal models that draw links from visual depictions, to numeric values, to their underlying concepts and the designer’s intended message. We then explore two case studies: reasoning about graphs illustrating scientific concepts and reasoning about graphs of mathematical functions.

The subsequent sections review research on visualizing uncertainty or risk. Communication failures can start with a lack of understanding of critical statistical concepts, even among scientists. We give examples of how viewers tend to misread error bars as depicting the edges of a range of data instead of correctly understanding them as parameters of a distribution. Probability information expressed as risk is critical for people such as patients considering a medical procedure and potential evacuees who may be in the path of a hurricane, but depictions of risk are frequently misunderstood. We review guidelines for showing uncertainty or risk more intuitively, including depicting samples of discrete outcomes, showing probability density functions, and depicting data with arrays of icons.

Finally, we summarize a set of evidence-based prescriptions for creating powerful visualizations for intuitive communication of data and provide a list of recommended practitioner guides (Box 1), websites, and data-journalism outlets for further reading and inspiration (for a concise review of similar guidelines, see also Zacks & Franconeri, 2020).

Box 1

Recommended Reading

The following books offer excellent, concise guides to designing effective visualizations for communication of data and analyses:

Camões, J. (2016). Data at work: Best practices for creating effective charts and information graphics in Microsoft Excel. New Riders.

Evergreen, S. D. H. (2017). Presenting data effectively: Communicating your findings for maximum impact (2nd ed.). SAGE Publications.

Knaflic, C. N. (2015). Storytelling with data: A data visualization guide for business professionals. Wiley.

Schwabish, J. (2021). Better data visualizations: A guide for scholars, researchers, and wonks. Columbia University Press.

For a review of the ethics of persuasion with visualizations, we recommend the following books:

Cairo, A. (2016). The truthful art: Data, charts, and maps for communication. New Riders.

Cairo, A. (2019). How charts lie: Getting smarter about visual information. W.W. Norton.

For a review of accessibility considerations in data visualization, we recommend the following article:

Kim, N. W., Joyner, S. C., Riegelhuth, A., & Kim, Y. (2021). Accessible visualization: Design space, opportunities, and challenges. Computer Graphics Forum, 40(3), 173–188.

The Power of Visualization

Visualizations let viewers see beyond summary statistics

Visualizations allow powerful processing of an entire two-dimensional rectangle of information at once, in stark contrast to the limitation of reading handfuls of symbolic numbers per second. As a demonstration, Figure 1 (top left) contains four sets of 11 pairs of values. Take a moment to compare those columns, and notice that reading symbolically represented numbers takes time. As you seek patterns within each set, or make comparisons among the four sets, progressively processing more pairs of values becomes increasingly difficult. Worse, these tasks quickly exhaust memory capacity, such that new numbers or patterns tend to displace ones that were previously seen. These limitations on symbolic processing of numbers lead viewers to rely instead on summary statistics that compress data sets into a single group of numbers. For the four sets of numbers in Figure 1, those statistics on the bottom row—means, standard deviations, and correlation coefficients—are identical, which might lead you to believe that the numbers contributing to the statistics are similar (Anscombe, 1973). However, because statistics summarize larger sets of numbers by abstracting over them and making assumptions about the patterns that they might contain, many sets of numbers can generate the same statistics. For these four sets of numbers, relying on statistics turns out to be dangerous.

Fig. 1. Examples of how visualizations can let viewers see beyond summary statistics. At left, four sets of 11 numbers have identical statistics but dramatically different patterns, as revealed by the scatterplots below each column. At right is a more extreme example of nine dramatically different scatterplots (including one that looks suspiciously like a dinosaur) depicting data with identical statistics, down to the second decimal place. The graphs on the right are adapted with permission of the Association for Computing Machinery, from “Same Stats, Different Graphs: Generating Datasets With Varied Appearance and Identical Statistics Through Simulated Annealing,” by J. Matejka and G. Fitzmaurice, CHI ’17: Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (https://doi.org/10.1145/3025453.3025912). Copyright 2017 Association for Computing Machinery.

The differences between the four sets of numbers in Figure 1 are immediately visible when translated into images in the form of scatterplots (bottom left). These images allow you to leverage the two-dimensional processing power of your visual system, the largest single processing system of the brain (Van Essen et al., 1992). If you are familiar with statistics, then the first image at the bottom left likely matches what you assumed the numbers should look like given the statistics on the bottom rows: an orderly positive relationship. But the other sets are clearly different in important ways. The right side of Figure 1 depicts frames of an animation that provides a more sophisticated example (Matejka & Fitzmaurice, 2017), in which visual processing allows one to immediately see that, despite identical statistics (to the second decimal place) for each scatterplot, the nine plots contain saliently different patterns. When exploring a new data set, it is good practice to visualize every column of data with a histogram, and every potentially interesting pairing of columns with scatterplots, before turning to statistical summaries (Moore et al., 2017; Wongsuphasawat et al., 2015).
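
As a concrete sketch of that practice, the following Python snippet plots a histogram for every column and a scatterplot for every pairing of columns before any summary statistics are computed. It assumes the pandas and matplotlib libraries; the small data set shown is the first of the four sets from Figure 1 (Anscombe, 1973).

    import pandas as pd
    import matplotlib.pyplot as plt

    # The first of the four Anscombe (1973) sets from Figure 1.
    df = pd.DataFrame({
        "x": [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5],
        "y": [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68],
    })

    df.hist()                       # one histogram per column
    pd.plotting.scatter_matrix(df)  # one scatterplot per pairing of columns
    plt.show()

    # Only then turn to the compressed statistical summaries.
    print(df.describe())
    print(df["x"].corr(df["y"]))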

Visual channels translate numbers into images

Visualizations rely on several visual channels to transform numbers into images that the visual system can efficiently process (Bertin, 1983; Mackinlay, 1986; see Munzner, 2014, for a more complete list). Knowing these channels allows a designer to consider which might be best suited for a given data set and context—particularly given that each is associated with differential levels of precision and potential illusions, as described in the following sections. The first column of Figure 2 depicts five of the more frequently used channels. Dot plots and scatterplots, such as those in Figure 1, represent values as position. Bar graphs represent values not only as positions (of the tips of the bars) but also as one-dimensional lengths (and, some argue, even two-dimensional areas; Yuan et al., 2019). If two bars do not rest on the same baseline, such as segments within the same bar in a stacked bar graph, the comparison relies only on length or area. Next, area codes numbers exclusively as two-dimensional regions (typically circles), a technique often used to overlay values across maps. Angle typically emerges when points are connected to form a line graph, organically allowing an encoding of the difference between adjacent points (a bigger difference creates a steeper slope and, typically, a longer line). Outside of pie charts, angle is less frequently used to depict numbers directly—perhaps on local areas of a map to represent wind directions. Numerosity is omitted from the figure, but it often implicitly shows higher-level attributes of data. For example, you can immediately estimate the number of points in a scatterplot, segments in a stacked bar chart, or icons in an infographic. Intensity is an umbrella term (often also called lightness or value) for either luminance contrast or color saturation, as used in a heat map or fMRI activation map. Motion is also not included in the figure, but animating a scatterplot to show values changing over time can encode the rate and direction of change in the speed and direction of the dots’ motion.
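
As an illustrative sketch (not taken from the article), the Python snippet below maps the same five numbers onto three of these channels with matplotlib: vertical position, circle area, and grayscale intensity.

    import matplotlib.pyplot as plt

    values = [2, 5, 3, 8, 6]
    xs = range(len(values))
    fig, axes = plt.subplots(1, 3, figsize=(9, 2.5))

    # Position: each value sets the height of a dot, as in a dot plot.
    axes[0].scatter(xs, values)

    # Area: each value sets the area of a circle (s is in points squared).
    axes[1].scatter(xs, [0] * 5, s=[v * 40 for v in values])

    # Intensity: each value sets the darkness of a mark, as in a heat map.
    axes[2].scatter(xs, [0] * 5, c=values, cmap="Greys", s=200)

    plt.show()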

Fig. 2. A summary of the power and limits of visual data processing. The two columns on the left show a quick reference guide to channels that can depict data visually and common illusions for each channel. The column in the center presents a summary of how visual statistics are powerful. The two columns on the right illustrate how comparisons are severely limited and present a set of design techniques that focus viewers on the “right” ones.

How to Design a Perceptually Accurate Visualization

Understand how to leverage visual channels

Visual channels are ranked by their perceptual accuracy

These channels differ in how precisely they convey numeric values to a viewer, and knowing the ranking of these channels allows a designer to prioritize what information to show most precisely. The leftmost column of Figure 2 presents five of the channels that can depict metric data to the human visual system. This list is ordered by the typical precision with which a viewer can verbally state the ratios between the two values shown; more precise ways of communicating numbers are at the top and less precise ways are at the bottom (Cleveland & McGill, 1984, 1985; Heer & Bostock, 2010). It should be clear from the figure that the 1:7 ratio can be relatively precisely extracted for position, but that task is a bit tougher for area, and far more difficult for intensity, at the bottom of the list.

Because position is the clear winner for precision, visualization designers often prioritize the vertical and horizontal dimensions of two-dimensional space when depicting or organizing quantitative data. Faced with a single column of numbers in a spreadsheet, a visualization designer might depict those data vertically with position (in a bar or line graph) and rely on horizontal position to organize the values into categories, as in a typical bar chart. If faced with two columns of numbers, a designer might simply create two of those same types of graphs or organize each set of numbers along both the vertical and horizontal axes of position, as in a scatterplot.

The advantage of position over length for precisely depicting ratios between numbers is demonstrated in the second column of Figure 2, which shows a horizontally oriented stacked bar graph in its second row of examples. Because the black segments of the bars are aligned on a common axis at their left, their right tips provide a precise position code, allowing the viewer to see the delicate 0.9:1 ratio between the bars. However, the next set of medium-gray segments is tougher to distinguish because the positions of their right tips are no longer useful, so the viewer must rely on length—a lower-precision channel—to extract the same ratio.

Mapping visual ratios back to numbers can cause perceptual errors

Using visualizations can unlock powerful data-pattern processing. However, a designer must be aware of several perceptual illusions that can lead viewers to map visual depictions back to their original numeric values incorrectly (Huff, 1954; Tufte, 1983). If two plotted values have a 1:7 ratio, then the visualization should cause a typical viewer to see that 1:7 ratio veridically. Even for a precise visual channel such as position, this requirement can be tougher than anticipated. For example, see the dot plot and bar graph at the top of the second column of Figure 2. The dot plot uses position as its visual channel, and the bar graph depicts the same data with both position and length. The second value appears to be roughly double the first value. Look more closely at the y-axis: The second value is only about 1% bigger than the first; the difference appears greater because the axis baseline does not start at zero. In theory, the data are transparently depicted—but in reality, such graphs are frequently misinterpreted.

Figure 3 illustrates some real-world examples of this problem. The line graphs in the upper left, adapted from Darrell Huff’s classic 1954 book How to Lie With Statistics, show how a line graph’s scale can be stretched to make a trend appear steeper (Huff, 1954). In March 2014, a version of the bar graph at the upper right appeared on Fox News, a network with an avowedly opposite political orientation to Barack Obama, the U.S. president at the time. Around 6 million U.S. citizens had signed up for a new health-care program sponsored by the president, and the government specified a goal of 7 million sign-ups by March 31. Although the numbers presented are honest (a 6:7 ratio), the visualization’s truncation of the y-axis tells a different story (a 1:3 ratio) to the viewer’s visual system, suggesting a failure of the president’s plan.

Fig. 3. Deceptive axis manipulations across a line graph (top left) and a bar graph (top right). So, should data always be plotted relative to zero? The graph on the bottom left depicts climate change by plotting temperature data from a baseline of 0 °F, yet most would agree that graph is less informative than the version to its right, which allows the differences in temperatures to be seen. At the bottom right of the figure, a cover of The Economist from September 2019 maps the same restricted range of data to the full range of a blue-to-red color scale. The graphs at top left are inspired by Huff (1954); those at bottom left are inspired by Correll et al. (2020). The magazine cover at bottom right is reprinted, with permission, from The Economist (September 19, 2019).

Will people not simply read the y-axis labels and easily overcome this initial misperception? Unfortunately, both real-world anecdotes and laboratory studies suggest that this practice can be deeply misleading (Correll et al., 2020; Hofman et al., 2020; Pandey et al., 2014). For example, using a visualization similar to the Fox News example, researchers asked crowdsourced workers to rate the contrast between the two depicted values on a 1-to-5 scale. Ratings in a zero-baseline condition averaged around 1.5, whereas ratings in a deceptive-baseline condition averaged 2.8 (Pandey et al., 2015). In that study, stretching the y-axis also strongly affected ratings of the strength of the trend. The crowdsourced workers were not simply inattentive: Only participants who passed attention-check trials were included. Moreover, the deceptive effect persisted when participants were asked to type the numeric values represented by each bar before making their effect-size rating and were reminded of the y-axis’s truncation, which was indicated by a “broken axis” symbol (such as that shown in Fig. 2, top right) at the base of the y-axis (Correll et al., 2020).

The prevalence of this deceptive effect has led to quantitative prescriptions for how to set y-axis boundaries to produce accurate measures of statistical effect sizes by typical viewers (Witt, 2019; B. W. Yang et al., 2021). For example, if the relevant data are far from zero, starting the y-axis at zero can make effect sizes illegible. One approach to increase legibility is to center the y-axis on the data’s mean, then extend the y-axis 0.75 SD above and below the mean (Witt, 2019). This approach can be appropriate when the overall scale is not essential.
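
A minimal sketch of that 0.75-SD prescription, with hypothetical temperature readings far from zero; it assumes NumPy.

    import numpy as np

    def witt_axis_limits(values, window_sd=0.75):
        """Center the y-axis on the mean and extend it 0.75 standard
        deviations above and below, following Witt (2019)."""
        values = np.asarray(values, dtype=float)
        mean, sd = values.mean(), values.std(ddof=1)
        return mean - window_sd * sd, mean + window_sd * sd

    # Hypothetical readings: a zero baseline would flatten these differences.
    print(witt_axis_limits([57.2, 57.9, 58.4, 58.1, 59.0]))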

Visually conveying an appropriate difference often depends on what “appropriate” means. Many visualization designers subscribe to the principle that a y-axis must always start at zero so that the visually depicted ratios match the ratios in the data. By contrast, others argue that this guideline must be subject to context (for a distillation of arguments by designers, see Correll et al., 2020). For example, it seems clear that designers should consider the practically relevant range of data values when small but essential differences would be tough to see on a zero-baseline graph. A now-infamous (and now-deleted) story in the National Review used a graph similar to one at the bottom left of Figure 3 to suggest that the recent global temperature rise is inconsequential (Correll et al., 2020). It even went beyond using a zero baseline for the axis, intriguingly using −10 °F as a lower bound. The graphic to its right shows the pattern of data widely agreed by the scientific community to be more honest: a “hockey stick” pattern indicating a recent rapid rise.

The image at the bottom right of Figure 3 is a now-famous cover of The Economist magazine from September 2019, which similarly mapped a restricted range of data to the full range of a blue-to-red color scale. Another issue highlighted by these examples of temperature data is that not all variables have a single natural zero point. Zero degrees Fahrenheit is not a meaningful value for ratio comparisons. Temperature in Celsius has a somewhat meaningful zero point (the typical freezing point of water on earth), but that baseline is not relevant to climate change.

These mismappings of visually depicted ratios are not isolated to the position and length channels. One can imagine similar differences in interpretations arising for size or intensity. Imagine a map of a country with a circle over each major city. One might depict crime rates with the intensity of the circles’ color, but with what link to the original data? If zero were mapped to gray and 100 violent crimes per capita per year (a very high rate) to gray with a small amount of red, the country might seem safe regardless of the underlying data. Another depiction might link 100 violent crimes per year to a bright red. Side by side, those two depictions would suggest a huge difference in crime rates despite showing the same data. One could imagine the same trick, but this time mapping crime rates to the circles’ size. If 100 violent crimes were linked to a 1 mm circle, the country would seem safe, but if the same number of crimes were linked to a 10 mm circle, it would seem more dangerous.
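
The sketch below renders that hypothetical crime-rate example two ways in matplotlib: the identical values look faint when the color scale is stretched to 1,000 and alarming when it tops out at 100. All numbers are illustrative.

    import matplotlib.pyplot as plt

    rates = [20, 45, 80, 100]  # hypothetical violent crimes per capita per year
    xs = range(len(rates))
    fig, axes = plt.subplots(1, 2, figsize=(7, 2))

    # Mapping A: the scale tops out at 1,000, so even the highest rate
    # gets only a faint tint and the whole "map" looks safe.
    axes[0].scatter(xs, [0] * 4, c=rates, cmap="Reds", vmin=0, vmax=1000, s=400)

    # Mapping B: the same data, but the scale tops out at 100, so the
    # highest rate is drawn as a fully saturated red.
    axes[1].scatter(xs, [0] * 4, c=rates, cmap="Reds", vmin=0, vmax=100, s=400)

    plt.show()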

Which visual channel is linked to the data values?

In other cases, viewers can misunderstand which visual property encodes the data. In a classic example at the top left of Figure 4, values are encoded by one-dimensional length (the height of each person), producing a 1:2 ratio of the two numbers. However, you might find that your estimate of the depicted values is determined instead by the area taken up by each person, leading to something closer to a 1:4 ratio (or even a 1:8 ratio, if the icons suggest three-dimensional volume). People do indeed make this error, even when the numeric data are printed saliently near the visual representation (Pandey et al., 2015).
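
Because the distortion compounds geometrically, a doubled height quadruples the apparent area and multiplies the apparent volume by 8, as a quick check shows; the scale factor here is hypothetical.

    k = 2  # the icon's height is doubled to encode a 1:2 ratio
    print(f"height 1:{k}, area 1:{k ** 2}, volume 1:{k ** 3}")
    # height 1:2, area 1:4, volume 1:8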

Fig. 4. A set of common visual confusions, illusions, and distortions. At upper left, the two human icons represent vastly different ratios depending on whether the data are represented by their one-dimensional heights, their two-dimensional areas, or their three-dimensional volumes. The donut graph to their right shows that three-dimensional depictions can artificially inflate data values in the two-dimensional plane. At bottom left, the same fMRI data are plotted with two color mappings. The brain image on the left produces less of a spurious categorical boundary effect, whereas the image on the right shows a common red-to-green color map that makes continuous variation appear exaggerated when it maps onto transitions from one color category to another. At center, an illusion prohibits the accurate recovery of differences between values in a line graph. At right, correlation is easier to detect from the scatterplot, but individual histograms for each dimension are easier to see when plotted separately.

In the donut plot to the right of the human icons in Figure 4, there are two ways that numbers are potentially miscommunicated. First, although the data are mapped onto angle, viewer judgments are more strongly determined by two-dimensional area (with some potential contribution from arc length, which is tough to dissociate from area). Because the two donuts are different sizes, this means that the larger donut’s value will be artificially inflated. Second, the graph is depicted in 3D. If the viewer can recover the actual three-dimensional geometry from the two-dimensional depiction, then the values should be accurately perceived. By contrast, if the viewer pulls values from the two-dimensional image (the number of green or purple pixels on the screen), the values in the “front” will be inflated because of the perspective projection. Unfortunately, this technique is indeed substantially misleading because static two-dimensional projections do not typically lead to effective recovery of three-dimensional structures (Tittle et al., 2001; but see Brath, 2014).

Avoid common illusions and misperceptions

A common optical illusion in line graphs

In the line graph in the middle of Figure 4, the two curves are identical (y = x³), but the darker line is translated vertically upward by a constant of 1,000. Even if the viewer knows that the two shapes are identical, it is difficult to see that the vertical distance between the two lines is the same across their entire horizontal span. Instead, given any point on the dark line, viewers tend to see its distance from the closest point on the gray line, which becomes progressively smaller as both lines increase. This illusion makes it difficult to visually estimate differences between lines, especially lines with steep slopes (Cleveland & McGill, 1984). A similar example is depicted in the second column of Figure 2. This illusion is well known to electrophysiology researchers: When faced with visualizing the difference between two measured waves, they will explicitly plot a “difference wave” that shows the difference as a single line (Luck & Kappenman, 2012).
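
A short sketch of that practice, assuming NumPy and matplotlib: the left panel reproduces the two identical curves from Figure 4, and the right panel plots their difference directly, revealing the constant 1,000-unit gap that the illusion hides.

    import numpy as np
    import matplotlib.pyplot as plt

    x = np.linspace(1, 10, 100)
    y1 = x ** 3
    y2 = x ** 3 + 1000  # the identical curve, shifted up by a constant

    fig, (left, right) = plt.subplots(1, 2, figsize=(8, 3))

    # The two curves invite the illusion: the gap seems to shrink
    # as the slopes steepen.
    left.plot(x, y1, color="gray")
    left.plot(x, y2, color="black")

    # The "difference wave" shows the gap as a single flat line.
    right.plot(x, y2 - y1)

    plt.show()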

Illusory contrast effects for intensity

The bottom of the second column of Figure 2 shows a final illusion that can warp our perception of visualized data. Both on the map and in the rectangle, the two vertically separated circles have the same luminance value. However, the lower circle is subjectively darker to the eye because it has been placed on a lighter background and has a higher contrast with its surroundings. In the real world, converting luminance to contrast is a critically important mechanism for seeing accurate luminance and color despite changes in the brightness and color profile of light in the environment (Purves et al., 2004). However, this correction leads to misperceptions of intensity-coded values in the artificial world of data visualizations (Szafir, 2018). One rule of thumb is never to plot intensities on top of other intensities that vary, as in the map in Figure 2.

Misleading illusions that combine separate values

When plotting two sets of numbers on a map, designers typically rely on intensity for one set of values and map the other set of values to the area of circles. This design solution works because, rather than plotting one intensity on top of another intensity, which creates an integral representation of contrast, intensity and area are relatively separable representations (Garner, 1974). Other examples of integral representations include encoding two sets of data in rectangles—one set in their widths and one set in their heights. Instead of seeing these values separately, the eye is tempted to translate them into the aspect ratio and the area of each rectangle. The eye then focuses on the ratios and multiplication of each pair of values (Shechter & Hochstein, 1992). As an extreme case, it is unwise to attempt to use the red component of a single color (imagine using the RGB sliders to change the color of an object in presentation software) to depict one number and the green component for another. Red and green will combine in an integral fashion when both are at their highest value, and the viewer will see a single integral percept of yellow (Ware, 2019).

A final example is shown in the scatterplot in Figure 4. Once two sets of numbers are combined into a single two-dimensional plot, new integral percepts emerge, such as the distance between any two points across both their x and y values, points that are outliers on both axes, or the global shape of all points that we can easily interpret as a correlation (F. Yang et al., 2019). However, there is a trade-off. The distribution of values of either set in isolation is now tougher to disentangle (Mackinlay, 1986). This is why data scientists often pair a scatterplot with “marginal histograms” that allow them to see those data in a separable way.

The biasing effect of categorical perception

When continuous values are encoded through visual channels, those values can be warped by categorical perception. A classic example is the seven discrete colors that we see in a rainbow, which are not present in the rainbow itself. Those color categories are created by an automatic process that systematically bins continuous wavelengths into one of several perceptual categories, exaggerating metric differences among values that straddle those boundaries and shifting percepts toward prototypes for the categories (Goldstone & Hendrickson, 2010; Newcombe et al., 1999). This same phenomenon occurs when data are depicted by hues, as in the bottom left of Figure 4. Across the two brain images, the one on the right uses a greater variety of hues to depict activation values. These additional hues create new color-category boundaries that can dramatically exaggerate the differences between values that straddle them (Y. Liu & Heer, 2018; Quinan et al., 2019). In The Economist cover shown in Figure 3, the blue-to-red scale creates a salient categorical color boundary at the color transition point, which makes the temperature increase in the past few years especially salient. Similar category boundaries can affect the perception of values depicted by other channels such as position or length. In a pie graph or stacked bar graph, values that are near gridlines or the implicit 50% mark in the middle of the bar or pie are recalled as being farther from that category boundary (Ceja et al., 2021; McColeman et al., 2021; Spence & Krizel, 1994; Xiong, Ceja, et al., 2020).
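
The sketch below illustrates the effect with matplotlib’s built-in colormaps: a smooth, structureless gradient looks continuous under a perceptually uniform map but appears to break into bands under a rainbow map, whose hue-category boundaries exaggerate some differences.

    import numpy as np
    import matplotlib.pyplot as plt

    # A smooth gradient with no real structure in it.
    data = np.linspace(0, 1, 256).reshape(1, -1)

    fig, (top, bottom) = plt.subplots(2, 1, figsize=(6, 2))
    top.imshow(data, cmap="viridis", aspect="auto")  # perceptually uniform
    bottom.imshow(data, cmap="jet", aspect="auto")   # rainbow: spurious bands
    plt.show()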

Design for color-vision impairments

Beyond illusions and biases, certain combinations of colors in visualizations can be problematic for people who are color-blind or have other color-vision impairments. Color-vision impairments are estimated to affect 4% of the global population (Olson & Brewer, 1997), or roughly 300 million people. Further, older adults can have less sensitivity to color (Silva et al., 2011). Some forms of color blindness leave viewers unable to distinguish particular pairs of colors; protanopia, a form of red–green color blindness, is the most common, but many other forms exist. Color-vision impairments are highly problematic for visualizations, as a portion of the audience may literally not see important patterns in data.

Numerous online color-blindness simulators allow users to upload an image to learn how someone with color blindness would see it (Asada, 2019). For example, the first row in Figure 5 shows a scatterplot encoded with two colors, green and purple. People with typical vision can see that the green dots have a steep positive correlation and the purple dots make a flat line. However, when the scatterplot is processed through a color-blindness simulator, the colors look the same, and all the dots appear to show a positive correlation. The simplest way to make visualizations accessible to viewers with color blindness is to avoid using hue as the only encoding channel or allow viewers to change the color palette (Silva et al., 2011). Designers can also double-encode a variable, using hue and another encoding channel (Plaisant, 2005), as in the second row of Figure 5. The most thorough and inclusive option is to use color palettes that are safe for people with color-blindness, such as those proposed by Harrower and Brewer (2003), as seen in the bottom row of Figure 5.
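
As a sketch of the double-encoding option, the snippet below marks the two groups from Figure 5 with both hue and shape, so the grouping survives even if the hues become indistinguishable; the hex values are illustrative picks from a palette commonly recommended as colorblind-safe, not the article’s exact colors.

    import matplotlib.pyplot as plt

    xs = [1, 2, 3, 4]
    group_a = [1, 2, 3, 4]  # steep positive trend
    group_b = [2, 2, 2, 2]  # flat trend

    # Double encoding: each group differs in both hue and marker shape.
    plt.scatter(xs, group_a, c="#0072B2", marker="o", label="Group A")
    plt.scatter(xs, group_b, c="#E69F00", marker="^", label="Group B")
    plt.legend()
    plt.show()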

Fig. 5. Three ways to encode data for two groups in a scatterplot, as seen by observers with typical color perception and those with protanopia, a form of color blindness.

Design for perceptual accuracy across a broad array of tasks

The ranking of precision for visual channels depicted in the first column of Figure 2 is based on a measure of precision in a particular task: the average error in verbal reports of the ratio between two depicted values (1:7 for each of the examples in the first column; Cleveland & McGill, 1984; Heer & Bostock, 2010).

Although this task is surely important when comparing two values, it is not the only task (or perhaps even the primary task) that viewers complete when examining visualizations (Bertini et al., 2020). One proposed taxonomy of graph-interpretation tasks separated them into three levels: elementary, intermediate, and “overall,” roughly corresponding to simple fact retrieval, comparison and identification of trends, and gist understanding (Bertin, 1981). Additional taxonomies have been developed in the context of cognitive models of graph comprehension. Other taxonomies are based on analyses of cognitive processes involved in different tasks (Tan & Benbasat, 1990), reliance on local versus global features (Carswell, 1992), or other criteria (MacDonald-Ross, 1977; Washburne, 1927; Wickens, 1989). Work in the data-visualization literature has begun to catalog such tasks and to test the types of displays that best support performance for each. One list of tasks is distilled from questions submitted by students who were asked to analyze a large data set (Amar et al., 2005). This list includes analytic tasks such as “retrieve value,” “sort,” “determine range,” “correlate,” and “characterize distribution.” Another list of tasks focuses on how people generate summary statistics for plotted data (Szafir et al., 2016).

The ranking derived from performance on two-value ratio judgments does not always extrapolate across these alternative tasks. For example, depicting data with a line graph, which relies on “precise” position coding, can lead to lower efficiency in seeing big-picture statistical properties such as means. Intriguingly, the opposite is true for intensity coding. Multiple studies have shown that for identifying particular values, position is far more precise than intensity, but for judging an average across many values, intensity is more precise (Albers et al., 2014; Correll et al., 2012). The reason for this dissociation is not well understood. It is possible that the lower precision of intensity better allows its values to blend together to construct aggregate values, or perhaps intensity is simply processed by a different mechanism that affords aggregate judgments (Szafir et al., 2016). Some visualization designs purposely use the low precision of the intensity channel to focus viewers on the big-picture trend of the data instead of being distracted by precise details, as in the visualization from the cover of The Economist magazine (Fig. 3), which focused viewers on the big picture of climate while de-emphasizing more detailed (but less relevant) variability in monthly weather.

Other work has tested whether people can more quickly or accurately complete the types of tasks listed above (judge mean, correlate, etc.) given different graph formats (scatterplots, bar graphs, tables, etc.) and data-set sizes (small vs. large; Y. Kim & Heer, 2018) or different arrangements of data within a graph (Jardine et al., 2019; Ondov et al., 2019). Although such studies have found consistent interactions—better performance on Task X with Design Y in Arrangement Z for Data-Set Size S—the existing pattern of results is currently too complex to generalize to novel combinations. To create generalizable guidelines, the research community will likely require a more complex model of the underlying perceptual operations that produce these complex interactions (Jardine et al., 2019; Ondov et al., 2021).

Among these alternative tasks, the perception of correlation is particularly well studied (Harrison et al., 2014; Jardine et al., 2019; Rensink & Baldridge, 2010; F. Yang et al., 2019; Yuan et al., 2019), leading to initial suggestions of its underlying perceptual operations. For example, scatterplots allow for more accurate judgments of correlation than pairs of bar charts presenting the same data (Harrison et al., 2014).

Effectively distinguish among groups in data

Scatterplots present two metric variables in relation to each other. Imagine that the horizontal position represents the longitude of a collection of oil wells and the vertical position represents their latitude. This example makes clear that scatterplots and data maps share DNA. Suppose the designer now wants to differentiate the set of points according to a nominal (categorical) variable, such as the company that owns the oil well. Because the two dimensions of position (vertical and horizontal) are already being used to represent the metric variables, designers would typically differentiate points from each group with either different colors or shapes.

Plotting the groups as categorically different colors is often the first choice because the visual system processes color differences more efficiently than shape differences across the two-dimensional visual plane, as measured by performance in visual-search and texture-segmentation tasks (Wolfe & Horowitz, 2017). These findings, based on simple displays used in laboratory studies, extrapolated to a data-visualization context in which viewers compared the average heights of multiple color-coded or shape-coded clouds of points in a scatterplot: Color coding produced far better performance (Correll et al., 2012).

When color differences distinguish data from two separate groups or classes, differentiating those classes is easier if the encoded colors are farther apart in a perceptual color space. For example, it is easier to differentiate red from blue than from orange-red. Researchers have constructed effective palettes from perceptually informed color spaces (e.g., CIELAB; Y. Liu & Heer, 2018) to suggest colors to use to differentiate N nominal classes; colors become progressively crowded together as N increases. One such set that is viewable (and customizable) online is ColorBrewer 2.0 (Brewer, 1994a, 1994b). Another tool, Colorgorical, balances perceptual differentiation with aesthetic considerations (Gramazio et al., 2016).
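
A minimal sketch of checking a candidate palette in a perceptual color space, assuming scikit-image is available; larger CIELAB distances (Delta E) indicate colors that are easier to tell apart, so red versus blue scores far higher than red versus orange-red.

    import numpy as np
    from skimage import color  # assumes scikit-image is installed

    # Candidate categorical colors as RGB triples in [0, 1].
    palette = np.array([[0.89, 0.10, 0.11],   # red
                        [0.22, 0.49, 0.72],   # blue
                        [1.00, 0.45, 0.10]])  # orange-red

    # Convert through a one-pixel-high "image" to CIELAB.
    lab = color.rgb2lab(palette[np.newaxis, :, :])[0]

    for i in range(len(lab)):
        for j in range(i + 1, len(lab)):
            print(f"colors {i} vs {j}: Delta E = "
                  f"{color.deltaE_cie76(lab[i], lab[j]):.1f}")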

Picking a color for a nominal value should also be constrained by the semantic congruence of the value and the color. If the nominal values are lemons and cherries, it is easier for viewers to answer questions about a chart that labels those values with yellow and red, compared with a standard palette that does not consider semantic congruence (Lin et al., 2013; Schloss et al., 2018). An algorithm can automatically generate intuitive color choices for a given noun by analyzing the color profile of images pulled from Web searches for the noun and then optimizing color assignments in terms of perceptual spacing and semantic fit. Such algorithms can perform as well as human experts in quickly picking color palettes for nominal data (Lin et al., 2013; Setlur & Stone, 2015).

Sometimes a visualization designer needs to show a second set of nominal values in the same plot. Thinking back to our map, imagine that we have used a unique color to represent each oil company (A, B, C) but also wish to depict each company’s nationality (Canada, Brazil, Mexico). Typically, shape would be used to show that second nominal variable. Because it is less perceptually effective than color, it should be used for the less important variable, or the one with fewer values to differentiate. The shape sets used in commercial software (e.g., Microsoft Excel) gravitate toward intuitive shapes, such as circles, triangles, squares, and diamonds, that are not actually well separated in perceptual space.

Researchers have begun to explore perceptual shape space, and some have extrapolated from pairwise subjective similarity ratings of combinations of candidate shapes (Demiralp et al., 2014). Other work has relied on objective performance tasks on actual simulated scatterplots (Burlinson et al., 2017; Huang, 2020). This work is also new, but so far, human shape space (at least for the simple shapes used in visualizations, and at least for the types of tasks tested so far) appears to prioritize the difference between closed (circle, square, triangle) and open (×, +, *) shapes, such that differentiating points is easier when they differ in that property (Burlinson et al., 2017; Huang, 2020). An initial full three-dimensional perceptual shape space (Huang, 2020) adds the additional properties of intersection and spikiness; Figure 6 depicts a clear improvement in shape differentiability compared with the typical sets used even in professional data-visualization software.

Fig. 6. The standard shape set for Microsoft Excel (left) compared with a perceptually spaced set (right; inspired by Huang, 2020). Try to pick out the four instances of each shape in each display—you should find that task easier on the right side.

Finally, some software automatically differentiates nominal variables with both color and shape, under the assumption that more differentiation is better. However, work has shown that color is already so dominant in its effectiveness that redundant encoding does not substantially improve visual processing efficiency (Gleicher et al., 2013) unless the viewer has color-vision impairments or the viewer’s task is exceptionally difficult (Nothelfer et al., 2017). Given anecdotal claims from some expert designers that redundant encoding can cause confusion in viewers, who typically expect color and shape to signal different nominal variables (Tufte, 1983), the lack of evidence for a perceptual advantage suggests that redundant encoding should be avoided in most cases, except when used to make visualizations accessible for viewers with color-vision impairments.

How to Design a Perceptually Efficient Visualization

Use visualizations to allow viewers to powerfully compute statistics

One core advantage of visualizations is that they capitalize on our visual system’s ability to extract information efficiently. The encoding properties in the first column of Figure 2 are properties that we process in parallel across two-dimensional images (Treisman, 1998; Wolfe & Horowitz, 2017). Faster than an eyeblink, we can pull statistics from data encoded as positions, one-dimensional lengths, two-dimensional areas, angles, or intensities. In the natural world, we typically use these statistics to recognize the type of scene we are in (e.g., a beach scene containing light brown, blue, and horizontal angles from the sand, sky, and horizon; Oliva, 2005) and to help provide a summary of an otherwise complex world (Brady et al., 2009; M. A. Cohen et al., 2016). A substantial literature in visual cognition has examined the power and limits of this ability to explore what types of statistics can be extracted under what conditions (e.g., Baek & Chong, 2020; Haberman & Whitney, 2012).

Much of this work in visual cognition relies on simplified laboratory displays of colored squares and circles, which are conveniently similar to the structure of the artificial worlds of data visualizations (for a review of insights from this literature for data-visualization design, see Szafir et al., 2016). The third column of Figure 2 summarizes the types of statistics (minimums, means, maximums, and outliers) that can be pulled from any of the five visual properties in the first column. In the dot plot (encoding numbers as positions), a viewer can easily extract the minimum, mean, or maximum height. In the stacked bar chart below, the lighter segments present numbers through both their lengths and the positions of their tips because their bases are aligned. The darker segments are offset in arbitrary ways by the lighter segments underneath, so only their lengths visually represent values—yet the observer can still pick out minimum, mean, or maximum values quickly. The same is true for the bubble chart and the angles of the slope graph (a specialized line graph whose lines have only two points). Finally, for the heat map at the bottom that represents values as luminance contrasts, the lightest (minimum) and darkest (maximum) values are easy to pick out, and one can also quickly imagine the contrast of the mean value.

Avoid a visual processing limit: making comparisons

Within their first glance at a data visualization, viewers can immediately pull general statistics from the positions, lengths, areas, slopes, and intensities (Szafir et al., 2016). This provides viewers with a starting point in understanding the distribution of the data and whether there might be outliers. However, the second step is to extract relations from the data by making a sequence of comparisons, a critical mental operation when viewing visualizations (Franconeri, 2021; Gleicher et al., 2011; Tufte, 1983). Comparisons include local comparisons between elements (“this point is higher than that point”; “this line segment is shallower than that one”), groups of elements (“these two bars have a greater range than those two”; “these red circles are on average larger than the green ones”), or global trends (“this section of the heat map is more saturated than that section”).

Each of those verbal sentences describes a single visual comparison that must be extracted serially, as part of a sequence (Michal & Franconeri, 2017; Roth & Franconeri, 2012). These comparisons can each take hundreds of milliseconds to process (Franconeri et al., 2012; Logan & Compton, 1998; Wolfe, 1998), so that viewers tend to process only a handful per second (Franconeri et al., 2012). Extracting the necessary multitude of comparisons from a visualization is therefore not like instantly recognizing a particular face, place, or Pokémon but instead more like the serial and controlled reading of a paragraph (Carpenter & Shah, 1998; Shah et al., 2005; see Franconeri, 2013, for a discussion of the different types of attention demanded by these two scenarios). Each single comparison is limited in its complexity as well—by some estimates, encompassing interactions among approximately four variables at maximum (Halford et al., 2007).

This severe capacity limit has been confirmed for realistic comparison tasks in data visualizations. When viewers are asked to perform the types of tasks shown in the fourth column of Figure 2—for example, to locate pairs of graphed values in which the second value is bigger, among pairs in which the second value is smaller—the time needed to make each of those comparisons accumulates into painfully slow overall performance (Nothelfer & Franconeri, 2019). Likewise, the bar graph in Figure 7 (left) shows two test scores for each student, one before and one after an intervention. You can feel the sluggishness of those comparisons by answering the question: Who is the only student who got worse?

As a second example, when viewers are challenged to complete a tough comparison task with the blue dots of the scatterplot shown in Figure 7 (right), they can fail to process the positions of the green dots, such that 93% fail to notice the presence of a dinosaur shape among them (Boger et al., 2021).

Fig. 7. Visual comparison as a serial process. In the bar graph on the left, which student has a second bar that is lower than the first? To find the answer, the viewer needs to process each set of bars individually, rather than all at once. On the right, viewers tasked with processing the blue set of marks failed to notice the dinosaur shape created by the green set of marks. The image on the right is reprinted from Boger et al. (2021).

A few hundred milliseconds of processing time for a single comparison goes unnoticed. Across many comparisons, however, that time quickly adds up. Imagine a bar graph showing the performance of two groups, A and B, in both treatment and control conditions, for a total of four bars. Even in this small set, there are six possible pairwise comparisons, plus two main effects, plus multiple ways to look at interactions. In a graph with 12 bars, there are 66 pairwise comparisons alone!
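
To make the arithmetic explicit, the number of pairwise comparisons among $n$ marks is

$$\binom{n}{2} = \frac{n(n-1)}{2}, \qquad \text{so} \qquad \binom{4}{2} = \frac{4 \times 3}{2} = 6 \quad \text{and} \quad \binom{12}{2} = \frac{12 \times 11}{2} = 66,$$

before counting main effects and interactions, which add still more comparisons.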

When visualization designers create data-heavy displays that require fast processing, such as a business-metric-monitoring dashboard, many implicitly realize that comparisons are a sluggish visual operation. One common solution is to remove the need for the viewer to explicitly compare values by providing a second view of the data that plots the differences between values (Fig. 2, upper right; Few, 2009). With this technique, known as directly depicting deltas, dashboards commonly show differences between current financial values, such as revenue, and baseline values, such as revenue from the same month in the previous year, or between actual and budgeted values.
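
As an illustration, here is a minimal Python/matplotlib sketch of directly depicting deltas; the revenue figures are hypothetical. The bottom panel plots the precomputed differences, so the viewer reads them off rather than computing each comparison visually:

```python
# A sketch of "directly depicting deltas" (hypothetical data): the bottom
# panel plots the precomputed differences so the viewer need not compare.
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr"]
revenue_now = [120, 135, 128, 150]    # current year, in $K
revenue_prior = [110, 140, 121, 139]  # same months in the previous year

deltas = [now - prior for now, prior in zip(revenue_now, revenue_prior)]
colors = ["seagreen" if d >= 0 else "firebrick" for d in deltas]

fig, (top, bottom) = plt.subplots(2, 1, sharex=True)
top.plot(months, revenue_now, marker="o", label="This year")
top.plot(months, revenue_prior, marker="o", label="Last year")
top.legend()
top.set_ylabel("Revenue ($K)")
bottom.bar(months, deltas, color=colors)       # the comparison, precomputed
bottom.axhline(0, color="gray", linewidth=0.8)
bottom.set_ylabel("Δ revenue ($K)")
plt.show()
```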

Control comparison with visual grouping cues

A visualization that is designed to guide viewers toward the “right” visual comparisons can lead those viewers to more meaningful insights than they would gain on their own. One major cue for this guidance is to visually group the data values that should be compared with each other, or that should be compared as a group to other groups. The same classic grouping cues studied in the perception literature can control which values are selected and compared. Figure 8 depicts four of the primary cues, roughly in order of strength: connecting lines, position proximity, and similarity in either color or shape (Brooks, 2015; Palmer, 1995). The two rightmost columns of Figure 2 also show examples of grouping-defined comparisons.

Fig. 8. Several grouping cues that can control how data values are compared. Connecting lines are particularly powerful cues, followed by proximity, color, and shape (Brooks, 2015).

Many of these grouping cues may stem from metaphors about real-world scenarios (Tversky, 2001). Objects that are spatially nearby are likely to stem from the same source, such as tomatoes from the same plant or birds in the same nest. When three objects fall close to a common line, their spatial arrangement naturally conveys conceptual ordering because real-world processes produce spatial order: If an animal leaves a trail, the order of the prints naturally reflects the timing of how they were laid down. Lines imply connections because connected things in the world tend to belong to the same objects, such as grapes connected by vines. Closed contours such as circles or blobs define objects because objects in the world tend to have closed, discernible borders or boundaries.

In line graphs, the choice of which variable is grouped by connecting lines (e.g., Fig. 8) has a profound impact on the interpretation of data. For example, one study presented viewers with line graphs that showed the effects of two independent variables on a third variable; one independent variable was split across separate lines, and the other was plotted along the x-axis, its values connected within each line. Viewers could answer more sophisticated questions about the quantitative relationships depicted within the connected lines but only relatively simple questions about relationships between the separate lines (Shah & Carpenter, 1995). In many cases, viewers were unable to recognize the same data plotted with a different choice of how points were connected by lines. Relationships between values that straddle different lines or different panes must be integrated across multiple comparisons, which requires controlled processing, taxes working memory, and introduces a risk of error.

The graphs at the top of Figure 9 show how these grouping cues can control which comparisons are prioritized. In the example on the left, both proximity and color facilitate comparisons among categories (Social Security is highest) and overall comparisons between years (the yellow bars have a larger range than the green bars). However, you are less likely to compare a single category across the two years, which requires a jump of your eye to find the companion bar. The opposite is true of the example on the right (Shah & Freedman, 2011; Shah et al., 1999). The word clouds at the bottom of Figure 9 show an example of controlling comparisons with proximity grouping. In the word cloud on the left, the relatively weak color grouping makes it tougher to identify the common theme in each of five groups of words. On the right, identifying the common themes (restaurant, baseball, hands, etc.) is easier because the more powerful cue of proximity grouping has been added (Hearst et al., 2020).

Fig. 9. How visual grouping cues can control visual comparison. At top, a combination of color and proximity grouping leads the viewer to different visual comparisons across the two bar graphs. At bottom, comparisons in a word cloud are weakly controlled by color grouping and more strongly controlled by proximity grouping.

Guide the viewer to the most important comparison

A good visualization relies on the grouping techniques described in the previous section, including connectivity and proximity, to help guide a viewer to compare one set of values or another. However, even within that one set, there are still many possible comparisons to make.

When multiple groups compete for comparison, that competition tends to be won by groups that are different or brighter in color, larger in size, or presented at the top or left of a display. Such visual salience can be modeled by showing viewers pictures or visualizations, recording their eye movements, and then feeding the images and responses into computational models that predict human attention (Bylinskii et al., 2017). Many of these models exist for salience in natural scenes, ranging in complexity from simpler weighted linear combinations of relative differences in features (uniqueness in color or orientation, at various locations and spatial scales) to more complex models that extract object contours or predict salience in movies (Borji et al., 2013). But many of these models fail to predict salience in the novel context of visualizations because the statistical profiles of those images differ substantially, containing large areas of blank space, text, axes, and titles (Haass et al., 2016). New models of salience for artificial information displays, trained on eye movements or on mouse-tracking data that are closely correlated with eye movements (N. W. Kim et al., 2017), can reach higher levels of predictive power (Bylinskii et al., 2017; Matzen et al., 2018).

Visualization designers will often deliberately control visual salience to bring the viewer’s eye straight to the critical comparison. One technique is to add salient color highlighting a single group of items (Fig. 10) to ensure that viewers process that comparison first (Ajani et al., 2021; Grant & Spivey, 2003; Hullman & Diakopoulos, 2011; Mayer & Moreno, 2003). Some practitioner guides also recommend using color-coded accompanying text, as in the graph at the top of Figure 10, to ensure that the viewer will match the pattern in the data to the relevant reference in the visualization designer’s argument (e.g., Knaflic, 2015). More generally, a good visualization should place verbal information near relevant visual information so that viewers do not need to glance back and forth in a time-consuming search to see what text matches what visual pattern (Moreno & Mayer, 1999). These methods of guiding viewers to the most relevant comparisons are particularly important for low-knowledge readers, who benefit from visualizations that present or highlight only the relevant comparisons (Canham & Hegarty, 2010). These techniques are less important for experts, who rely on prior experience to guide their attention.

Fig. 10. Color highlighting and direct annotation to help viewers make the right comparison first and know what conclusion is supported by that pattern in the data. The graphic at the top illustrates a color-highlighting technique suggested in business-oriented practitioner guides (e.g., Knaflic, 2015). The graphs at the bottom (inspired by Bostock et al., 2012) are an adaptation of a graph by data journalists using grouping, highlighting, and verbal annotation.
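
In code, the highlighting technique at the top of Figure 10 might look like the following matplotlib sketch (the series values and annotation wording are hypothetical): every series is drawn in muted gray except the one the argument concerns, and the annotation sits next to the pattern it describes.

```python
# A sketch of salience-based highlighting (hypothetical data): the critical
# series is drawn in a strong color; all others recede into gray.
import matplotlib.pyplot as plt

years = [2018, 2019, 2020, 2021]
series = {"A": [3, 4, 4, 5], "B": [6, 5, 5, 4], "C": [2, 2, 6, 8]}

fig, ax = plt.subplots()
for name, values in series.items():
    if name == "C":  # the comparison we want viewers to process first
        ax.plot(years, values, color="crimson", linewidth=3)
        ax.annotate("Group C doubled after 2019", xy=(2019.6, 7.4),
                    color="crimson")
    else:
        ax.plot(years, values, color="lightgray", linewidth=1.5)
ax.set_xticks(years)
plt.show()
```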

Some research has studied the techniques of data journalists, who are tasked with clearly communicating data-based arguments to nonexperts. This work has cataloged techniques used by news outlets such as The New York Times Upshot, The Washington Post, The Economist, and FiveThirtyEight (Hullman & Diakopoulos, 2011; Hullman, Diakopoulos, & Adar, 2013; Segel & Heer, 2010) and has used computation to automatically generate visualizations that use particular strategies (Gao et al., 2014; Hullman, Drucker, et al., 2013; N. W. Kim et al., 2017). These outlets employ specially trained reporters who convey data-based political, economic, health, and science stories to the lay public. They will often show a single pattern in the data at a time, relying on the viewer to step or scroll (Amabili, 2019) through a sequence of patterns. In each step, a pattern of data is highlighted to ensure that the viewer’s limited processing capacity is directed to the values simultaneously described in a verbal annotation. The bottom row of Figure 10 illustrates how data journalists might redesign the example on the left, which requires viewers to navigate text that is placed far from the patterns that it describes and to use their imagination to fill in the patterns referred to by the text. The example on the right addresses both of these issues, leading to more effective communication of a key data pattern (Ajani et al., 2021). The two rightmost columns of Figure 2 summarize these techniques, showing that grouping, highlighting, and annotating can help viewers quickly make the right comparisons.

If helping viewers prioritize critical comparisons in a visualization is so important, then why do so few presenters do it? One likely reason is that presenters have a curse of knowledge—an inability to simulate the perspective of the naive viewer because they cannot ignore what they know and see (Birch & Bloom, 2007; Camerer et al., 1989). Figure 11 presents an empirical demonstration of this curse from a lab study (Xiong, Van Weelden, & Franconeri, 2020). Participants heard an intriguing story about a dip and rise in a political candidate’s popularity in the polls (the top line in the top left graph); the story made those patterns stick out to the participants. They were then told to forget the story and predict what patterns another person, naive to the story, would find interesting or salient in the graph. The graph at the upper right presents their collective predictions—and makes it clear that people think that others will see what they see, even when they know that others do not have the same knowledge. The graphs in the bottom row show how, when the original story concerned a different candidate, the pattern of results changed accordingly. It is impossible to “turn off” your own expertise, which makes it difficult to see through the eyes of nonexperts.

Fig. 11. A demonstration of the curse of knowledge in data visualizations. Participants were told a story involving the data patterns highlighted in the top two lines (top left) or the bottom two lines (bottom left) of a line graph. They were then told to forget the story and to circle the patterns that a naive viewer would notice first on an unannotated version of the graph. Their predictions (right column) mirrored the story they had been told, showing a “curse of knowledge,” or an inability to inhibit one’s own knowledge. Adapted with permission from C. Xiong, L. Van Weelden, and S. Franconeri (2020), “The Curse of Knowledge in Visual Data Communication,” IEEE Transactions on Visualization and Computer Graphics, 26(10), pp. 3051–3062, https://doi.org/10.1109/TVCG.2019.2917689. Copyright 2020 by IEEE.

Persuade with visualizations

If grouping, highlighting, and annotation can guide viewers toward a certain comparison or pattern, then for better or worse, a visualization designer has some control over what their viewers see in a data set. Figure 12 shows an example, adapted from The New York Times, in which the same unemployment data might be seen—or designed to be seen—in different ways. During Barack Obama’s presidency, a member of his party (the blue glasses) might see a drop in unemployment rates and highlight that downward-sloping pattern in the orientation channel. But a member of the rival party (the red glasses) might see a failure to meet a goal of 8% unemployment and emphasize the information in the area channel by highlighting the area under the curve (Bostock et al., 2012). These visual manipulations are similar in spirit to choosing which of those features to highlight in a verbal argument.

Fig. 12. An example of emphasizing different perspectives in a single data set (inspired by Bostock et al., 2012). One data set can be seen with dramatically different perspectives, depending on which patterns an observer does and does not extract.

An analysis of distortions and biased annotations in news media visualizations showed that rhetorical biases are pervasive in data graphics and that labels and framing may be skewed toward progressive or conservative positions, depending on the news outlet (Mehta & Guzmán, 2018). A taxonomy of “visualization rhetoric” likens visualization design to an editorial process in which decisions about what data to include, how to encode them, how to use titles and labels, how to describe the data’s provenance, and what interactions to allow represent rhetorical choices aimed at guiding viewers toward preferred interpretations (Hullman & Diakopoulos, 2011). People’s perceptions of a visualization’s message, and their ability to recall it, are particularly influenced by the visualization’s title (Kong et al., 2018, 2019). For an engaging tour of the complexities of truth and deception in visualization with an emphasis on a journalism perspective, we point the curious reader to the books of Alberto Cairo (2016, 2019).

Viewers can also be influenced by expectations or social pressures, even for relatively low-level visual judgments. For example, when participants made judgments about correlations in scatterplots, their estimates were higher when the data were labeled as personality variables, which one would expect to be correlated, than when the data were unlabeled (Freedman & Smith, 1996). In another study, participants were asked to make judgments about visualizations (e.g., rating the linear association in a scatterplot) presented either alone or with a histogram plotting other people’s ratings. The other ratings were either true ratings or a distribution of the true ratings shifted by 1 SD. When participants were provided with the manipulated histograms, their judgments were biased in the direction of the social influence (Hullman et al., 2011).

Finally, the format of a visualization can also guide the types of conclusions that viewers draw from the underlying data. Imagine data showing that students who eat breakfast more often tend to have higher GPAs. A viewer might see this correlation and assume a causal relationship whereby a good breakfast causes better grades. Although plausible, this conclusion cannot be drawn from these data. When shown such visualizations, viewers made unwarranted causal claims about similar correlational data, and they did so more often when the visualizations aggregated the data into fewer groups (e.g., a two-bar graph) than when the visualizations showed more groups (e.g., a scatterplot showing all of the individual data values; Xiong, Shapiro, et al., 2020), perhaps because data shown in fewer groups are implicitly associated with data gathered through an experimental manipulation.

Avoid taxing limited working memory

Given that comparisons are already highly capacity limited, any extraneous demands on working memory due to the design of visualizations should be avoided. Interpreting the graphs in the middle and at the right of Figure 13 requires individuals to map the symbols and colors in the graphs to their referents in the legends below, a task that is highly demanding of limited working memory resources. If the legend’s mappings are lost from working memory while interpreting a graph, viewers might make interpretation errors or require extra time to reinspect the legend. Indeed, one study found that people answered questions about data both faster and more accurately when the data were directly labeled in graphs than when there was a legend (Lohse, 1993). Therefore, instead of legends, use direct labels whenever possible (e.g., Wong, 2010). Note that there may be exceptions to this recommendation, if legends provide a way to index data values that labels cannot. For example, a map of an amusement park might list locations under the map to allow the viewer to browse the locations alphabetically or clustered by type (rides, food, etc.), which is not possible for map locations that are already organized in space.

Fig. 13. A demonstration of the advantage of direct labels over legends. Take a moment to state the names of the four groups shown in the line graph at left in top-to-bottom order. (Answer: b, d, a, c.) Now do the same for the graphs at center and right, which require coordination with color and shape legends. You should notice a substantial slowdown because of the need to frequently look back and forth between the graph and the legend. If you attempt to memorize the legend first, you will experience the capacity limit of your working memory.
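
For chart builders, a minimal matplotlib sketch of the direct-labeling recommendation (hypothetical data, arranged so the lines end in the order b, d, a, c as in Figure 13): each line is labeled at its right end, in a matching color, so the viewer never shuttles to a legend.

```python
# A sketch of direct labeling (hypothetical data): names sit at line ends,
# replacing the legend and its working memory cost.
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
groups = {"b": [5, 6, 7, 8], "d": [4, 5, 5, 6],
          "a": [3, 3, 4, 4], "c": [2, 2, 2, 3]}

fig, ax = plt.subplots()
for name, y in groups.items():
    (line,) = ax.plot(x, y)
    # Label the line just past its final point, in the line's own color.
    ax.text(x[-1] + 0.05, y[-1], name, color=line.get_color(), va="center")
ax.set_xlim(0, 3.4)  # leave room on the right for the labels
plt.show()
```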

When visualizations are well designed, they can help viewers overcome working memory limits by offloading information storage and some types of processing to a page or screen (Kirsh, 2005; Tversky, Heiser, et al., 2002). Compared with the slow difficulty of reading and comparing symbolic numbers, a visualization can allow these steps to unfold far more quickly and efficiently. For example, in one study, older adults with normal age-related cognitive decline were asked to compare multiple health-care plans. They were given plan information (e.g., about monthly premiums, deductibles, and gap coverage [supplementary insurance to cover medical costs incurred before reaching the deductible]) either in a table full of text and numbers or in a table with visual categorical encodings of the information (e.g., green circles for the best gap coverage and red circles for no coverage). The adults using the visually encoded table made better and, in some cases, faster decisions about which health-care plan to select because they found it easier to make comparisons (Price et al., 2016).

Beware the working memory load of animation

Limits on working memory can also be strained by animation. Some kinds of visual motion, such as patterns of translation or expansion that accompany moving the head or walking, can be tracked efficiently and automatically (Gibson, 1979). However, our capacity for tracking the motion of objects that move in arbitrary directions is highly limited, to as few as one or two objects at a time (Alvarez & Thompson, 2009; Scimeca & Franconeri, 2014; Xu & Franconeri, 2015), including in moving scatterplots (Chevalier et al., 2014). If a viewer’s capacity is overwhelmed during an animation, they may not retain the motion information before the animation is over. Education researchers have termed this the transient information effect, a “loss of learning due to information disappearing before the learner has time to adequately process it or link it with new information” (Sweller et al., 2011, p. 220).

Examples of both of these problems can be seen in studies of mechanical diagrams. In a review of roughly 100 studies on the use of animated diagrams for teaching complex mechanical, biological, or computational systems, researchers found that students’ descriptions of processes were no more accurate with animations than with labeled static diagrams (Tversky, Morrison, & Betrancourt, 2002). In one experiment, students saw static or animated diagrams of the mechanical processes involved in flushing toilet tanks. Both groups could identify the number of sequential stages. However, students who saw the animated diagram made more errors (20% for animated vs. 5% for static) about individual stages, such as whether air or water ends the flushing process (Kriz & Hegarty, 2007). Even when viewers can extract global patterns from an animated diagram, the capacity limitation in processing animations can generate a cost for other information in the scene.

The mere presence of animation can also induce an illusion of understanding, erroneously inflating observers’ confidence in what they have learned. For example, observers who saw animated diagrams of a toilet’s flushing mechanism were not only less accurate at recalling the names and functional roles of parts relative to observers who saw static diagrams, but they also reported less perceived difficulty and higher engagement than did the more successful static-diagram learners (Paik & Schraw, 2013).

There is little evidence that animation facilitates understanding of information displays, but there is one important exception: animation used to convey probabilistic processes and uncertainty, in which draws from a distribution provide a metaphor for random sampling (Hofman et al., 2020; Hullman et al., 2015; Kale et al., 2019). Such animations are effective in conveying sampling uncertainty. One reason appears to be automatic (i.e., not requiring conscious attention) processing of frequency information (Hasher & Zacks, 1984). Using animation to convey uncertainty does not tax working memory because neither the sequence of samples nor the specific properties of individual samples are important for understanding that variability.
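
In the spirit of those animated uncertainty displays, here is a minimal Python sketch (hypothetical group estimates, assuming normal sampling distributions): each frame shows one random draw per group, and the frame-to-frame variability, rather than any single frame, conveys the uncertainty.

```python
# A sketch of animated draws from a distribution (hypothetical estimates):
# the bars jitter frame to frame, conveying sampling uncertainty.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

rng = np.random.default_rng(0)
means, sds = [10, 12], [2, 2]  # hypothetical group means and spreads

fig, ax = plt.subplots()
bars = ax.bar(["Control", "Treatment"], means)
ax.set_ylim(0, 20)

def draw_frame(_):
    # Each frame is one possible outcome, not the mean itself.
    for bar, m, s in zip(bars, means, sds):
        bar.set_height(rng.normal(m, s))
    return bars

anim = FuncAnimation(fig, draw_frame, frames=30, interval=400)
plt.show()
```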

Allowing a viewer to control an animation manually may prepare them to focus their limited capacity on the right subsets of information at the right time and to replay critical portions of an animation. Viewer-controlled animation has shown some success (ChanLin, 1998; Faraday & Sutcliffe, 1997; Mayer & Moreno, 2003; Schwan & Riempp, 2004; Tversky, Morrison, & Betrancourt, 2002). However, empirical evaluation shows that interactivity does not always improve performance. One famous example of an animated data visualization is Gapminder’s Trendalyzer (Gapminder Foundation, 2007), a scatterplot containing circles for countries and plotting, for example, the countries’ gross domestic product (GDP) on the y-axis, life expectancy on the x-axis, and population as the size of their circle. The Trendalyzer uses animation to show trends over time, moving the dots to show how each of these factors changes. When participants were shown a similar display with interactive controls for the animation, they were better able to answer precise questions about each country than participants without interactive controls, but they also became worse at extracting global trends across countries compared with participants shown static alternatives (Abukhodair et al., 2013; Robertson et al., 2008). Furthermore, user-controlled interactivity is not always possible for some systems or audiences.

Nevertheless, viewers frequently report that animated diagrams and data visualizations are more engaging and enjoyable than static versions, which may explain animation’s continued use (Abukhodair et al., 2013; Robertson et al., 2008; Tversky, Morrison, & Betrancourt, 2002). In a communication context, anecdotal evidence suggests that an engaging animation can still communicate patterns in data when carefully deployed. The example in Figure 14 depicts a TED talk by Hans Rosling (2006), in which Rosling used the moving display to depict changing world health statistics over time. Rosling made his dynamic data story easy to understand by carefully using language to guide his audience (“look at this cluster, it’s moving up . . .”), paired with exaggerated gestural cues to help the audience focus on the relevant data values. Such cues have been shown to help students integrate verbal information when interpreting diagrams (Mautone & Mayer, 2007), graphs (Michal et al., 2018), animations (de Koning & Tabbers, 2011), and other educational materials (Goldin-Meadow, 1999). Future research inspired by Rosling’s case study might help outline concrete rules for using animation in ways that would allow his success to be replicated.

Fig. 14. A screenshot from Hans Rosling’s (2006) TED talk on the power of visualized data. Hans Rosling helped viewers see relevant patterns in a complex animated visualization with exaggerated gestures and clear linguistic guidance toward the critical visual comparisons that supported his arguments.

Should your visualizations be rich or minimalistic?

Given the limited working memory resources of visualization viewers, designers often recommend a minimalist aesthetic that strips away any design elements that are not critically needed (Few, 2004; Knaflic, 2015; Tufte, 1983). One popular mantra is to maximize the “data–ink ratio” (Tufte, 1983), though the definition of “ink” can be frustratingly vague (Correll & Gleicher, 2014a). Figure 15 depicts variants of a visualization based on this prescription. The visualization at the top is filled with “clutter”: grid lines, a background pattern, and varied colors across the bars. The middle image is a “decluttered” visualization that omits these elements. Although newer editions of Microsoft Excel have eliminated some forms of clutter shown in the top image, critics of the 2007 edition argued that the software encouraged users to create graphs containing dense grid lines, unnecessary labels, unneeded color variation, and even three-dimensional effects that transformed simple bars into cylinders or pyramids (Kirk, 2012; Kosslyn, 2010; Su, 2008; Ware, 2010, 2019).

Fig. 15. A “cluttered” visualization (top), a minimalist “decluttered” version (middle), and a version that incorporates pictorial embellishment (bottom). The graph at the bottom was created by Nigel Holmes for TIME Magazine and was reprinted in his 1984 book, Designer’s Guide to Creating Charts & Diagrams. Used with permission.

Despite strong calls to declutter visualizations (e.g., Tufte, 1983), there is only mixed evidence that this practice improves aesthetic ratings and little evidence that the prescription affects objective performance. Several studies have measured aesthetic ratings for cluttered versus decluttered charts; some have shown clear preferences for decluttered versions (Ajani et al., 2021), and others, surprisingly, have shown the opposite (Hill et al., 2017; Inbar et al., 2007). Researchers who have found the opposite have typically argued either that viewers’ greater familiarity with cluttered charts makes those charts more attractive or that decluttered charts that are too minimalistic become boring. Another possibility is that users may prefer particular depiction styles for particular purposes, mindful of their audience and goals (Levy et al., 1996). Objective performance measures, such as the speed with which viewers can compute means across values in a bar graph, also present mixed evidence. For example, that speed can be slightly faster when some forms of “clutter,” such as axis tick marks, are removed but slower when other elements are removed (Gillan & Richman, 1994). Given the large number of design elements that could count as clutter, combined with the large number of tasks that one could complete with a visualization, some have argued that a simple rule for whether to declutter is unlikely to arise and have discouraged further empirical testing, given the small and mixed effects found so far (Ajani et al., 2021).

The bottom of Figure 15 shows the same bar graph embedded within a cartoonish monster. The addition of such pictorial elements that embellish data with anthropomorphic or metaphorical elements—intended to enhance engagement or memory—has been demonized as “chartjunk” (e.g., Tufte, 1983). Various studies have shown that adding these elements leads to no improvement in memory for the data (Helgeson & Moriarty, 1993; Kelly, 1989), mixed results depending on the details of the task and context (Gillan & Richman, 1994; Li & Moacdieh, 2014), or better memory for the data content or message (Bateman et al., 2010; Borkin et al., 2016; Haroz et al., 2015b). Like animation, these visual embellishments can increase ratings of engagement and aesthetic value (Li & Moacdieh, 2014). And despite mixed evidence as to whether their presence improves memory for the data, pictorial elements do improve memory for the fact that a visualization was previously seen, both in the short and the longer term (Borkin et al., 2013).

How to Design an Understandable Visualization

Use familiar designs to show data intuitively

Visualizations can be powerful, but a poorly designed visualization can easily confuse or even mislead (Burns et al., 2020; Cairo, 2019; Szafir, 2018). Because the interpretation of visualized data is in the eye and mind of the human beholder, we must consider the psychology of the observer as the translator of images into an understanding of the original data and the patterns that they hold. Below, we outline a set of common translation errors that can confuse and mislead.

Understanding a visualization can depend on a graph schema: a knowledge structure that includes default expectations, rules, and associations that a viewer uses to extract conceptual information from a data visualization. Figure 16 serves as an example of why a graph schema is often needed to interpret a data visualization. It depicts the GDP (on a log scale) and population of the 10 most populous countries. Take a moment to interpret the data.

Fig. 16. An example of a visualization with an unclear schema.

If you are having trouble extracting the data from this visualization, it is not your fault—you do not have the needed schema. First, if you have never seen this type of visualization, you cannot know which aspects of its variation are meaningful. The bubbles differ in their areas, hues, horizontal positions, vertical positions, proximity and enclosure relationships, and depth planes, which causes confusion. Why are Russia, Brazil, and the USA “hiding” behind the other countries? Why does India enclose Mexico? Why are there two main clusters? Is it meaningful that Bangladesh and Pakistan have the same horizontal position? Which of these many sources of variability should the viewer translate back to the original data?

You are experiencing the same confusion felt by a novice viewer encountering a given type of visualization for the first time. A young student viewing a bar graph might see rectangles with widths, heights, tops, bottoms, left sides, right sides, legend entries, and grid lines that vary in vertical position. Reading a bar graph requires a schema that allows the viewer to do what you now do instantly: focus on the positions of the bar tops, because they signal the numbers depicted by the bars.

Here is a hint for Figure 16: Only the sizes and hues matter; the rest of the variation is irrelevant. Areas map to population, and hues map to GDP. Now, how does the variability in those channels map back to numeric values? You will guess that size is linearly related to population because you have implicitly memorized that association from previous experience with visualizations. However, how do the hues map to GDP? The hues appear to vary between red and blue, but is red high because it is “hot” or low because it is “bad”? You may have already inferred that red maps to higher values by stepping back to your knowledge of the data: China has a high GDP, so red probably maps to high. However, if you did not have that knowledge, your existing schema would not have allowed you to recover that relationship confidently.

Humans develop similar schemas for many other domains. A face schema, for example, specifies what features are typically in a face (eyes, ears, mouth, nose . . .), how those features tend to vary (many eyes have brown irises, but some are blue, green, or gray), and what their relations tend to be (noses are between, and below, the eyes; Palmer, 1975). Having well-developed schemas enables a viewer to more effectively perceive, understand, and remember—without a face schema, a face presents a set of tens of thousands of independent pixels that vary in unpredictable ways. When one views a graph, one may depend on graph schemas that encode knowledge about the graph type, and also on schemas that encode knowledge about what the graph is about—for example, understanding a high-low graph of stock prices requires knowledge about that graph type and also about stocks and the stock market.

Schemas can bias judgments when a new instance deviates from the schema (Mandler & Ritchey, 1977; Rumelhart, 1980; Tversky, 2001). In one example of bias in the context of graphs, participants saw simple displays labeled as either “graphs” or “maps” and were asked to draw the displays from memory. When the displays were called “graphs,” participants distorted a central line as being closer to a 45° angle. When these same displays were called “maps,” the participants distorted the lines as being closer to horizontal or vertical, suggesting that their memory for the image was influenced by canonical representations of these two types of visualizations (Tversky & Schiano, 1989).

In much of the world, children learn about common graph types such as bar graphs, line graphs, and scatterplots in school—but an audience without such schooling may not appreciate the conventions for how Cartesian space is used and how axes are drawn and labeled. Even relatively common visualizations such as scatterplots are misunderstood by 37% of adults in the United States (Goo, 2015).

Given that the reader of this article is likely familiar with common visualizations, we illustrate the need to learn how a new visualization works by introducing examples that are less likely to be familiar. Expert communities have developed idiosyncratic visualizations that serve important data-analysis or data-communication purposes but require learning new graph schemas, which can lead to severe difficulties for new viewers. For a broader tour of a curated “zoo” of novel visualizations, see Heer et al. (2010). For those ready for a true safari of new visualization schemas, we recommend browsing the Xenographics website (Lambrechts, n.d.).

The left column of Figure 17 depicts four visualizations that are likely familiar to the reader, and the right column shows four equivalent visualizations that likely require a new schema. The first row shows two line graphs plotted in the same space. An alternative way to plot those data is with the connected scatterplot shown at right (Haroz et al., 2015a; Peebles & Cheng, 2003). For each time point on the x-axis of the two-line graph, plot the value of Line 1 on the x-axis of the scatterplot and the value of Line 2 on the y-axis. Now connect all points with a line, in temporal order. Engineers and economists use this format because it facilitates views of relationships between two sets of data: The slopes and directions of the line segments reflect the two measures’ ratios and changes in polarity.

Fig. 17. Four common types of visualizations and equivalent representations used by expert communities.
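
The transformation is easier to see in code. The following matplotlib sketch (hypothetical data) converts two time series into a connected scatterplot, labeling each point with its year so the temporal path stays readable:

```python
# A sketch of a connected scatterplot (hypothetical data): the two time
# series become x and y, and points are joined in temporal order.
import matplotlib.pyplot as plt

years = [2000, 2005, 2010, 2015, 2020]
series1 = [1.0, 1.4, 1.9, 2.1, 2.0]  # would be one line in a line graph
series2 = [3.0, 2.8, 2.9, 3.5, 4.1]  # would be the other line

fig, ax = plt.subplots()
ax.plot(series1, series2, marker="o")  # list order = temporal order
for x, y, year in zip(series1, series2, years):
    ax.annotate(str(year), (x, y), textcoords="offset points", xytext=(5, 5))
ax.set_xlabel("Series 1")
ax.set_ylabel("Series 2")
plt.show()
```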

The second row depicts standard bar graphs on the left, showing the square footage, price, and time on the market for three houses. It becomes tougher to find relationships among similar houses in the bar graph as the number of metrics increases or the data set grows because both the measures and the houses become more spatially separated. Parallel coordinates, as shown on the right, are a popular visualization format for analysts facing a much larger data set. The same data are now plotted as the heights of lines that straddle parallel axes representing each measure. When combined with interactivity, this format allows users to extract individual values and perceive correlations between neighboring measures: flat or parallel line segments reflect positive relationships, and crossing segments reflect negative relationships.
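
A minimal sketch using pandas’ built-in parallel-coordinates helper (the three-house data set is hypothetical); each measure is rescaled to a common 0–1 range first so that no single axis dominates:

```python
# A sketch of parallel coordinates with pandas (hypothetical house data).
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import parallel_coordinates

df = pd.DataFrame({
    "house": ["House 1", "House 2", "House 3"],
    "sq_ft": [1200, 2400, 1800],
    "price_k": [250, 480, 390],
    "days_on_market": [30, 12, 45],
})

# Rescale each measure to 0-1 so one axis does not dwarf the others.
measures = df[["sq_ft", "price_k", "days_on_market"]]
normed = (measures - measures.min()) / (measures.max() - measures.min())
normed["house"] = df["house"]

parallel_coordinates(normed, class_column="house")  # one polyline per house
plt.show()
```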

The third row shows a simple hierarchy of the content of a hard drive, in the style of an organizational diagram. In this representation, vertical position relies on a “levels of height” metaphor to indicate a folder’s place in the hierarchy, whereas horizontal position is typically irrelevant. The alternative depiction at right is called a tree map: tree because it shows a hierarchy, and map because of its two-dimensional spatial layout (Shneiderman, 1992). Reading a tree map requires the schema of knowing that folders are now differently sized rectangles (with area indicating the size of each file and color somewhat arbitrarily mapping to different branches of the hierarchy). Critically, the hierarchy itself is no longer shown through a vertical “levels” metaphor but is instead depicted through an “enclosure” metaphor, as in a physical file drawer. Higher-level folders are now akin to bento boxes enclosing other bento boxes.
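
A single level of a tree map can be sketched with the third-party squarify package (hypothetical folder names and sizes); a full hierarchy would recurse, drawing each folder’s children inside its own rectangle to realize the enclosure metaphor:

```python
# A sketch of a single-level tree map (hypothetical data); rectangle area
# is proportional to folder size.
import matplotlib.pyplot as plt
import squarify  # third-party: pip install squarify

sizes = [500, 250, 120, 80, 50]  # folder sizes in MB
labels = ["Photos", "Video", "Music", "Docs", "Apps"]

squarify.plot(sizes=sizes, label=labels, pad=True)
plt.axis("off")
plt.show()
```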

The final row shows network data as a node-link diagram, which could depict a social network. A node-link diagram can intuitively show connections (e.g., between people) with lines, making it useful for lay audiences. However, for complex analysis of large data sets, specialists prefer variants of the depiction at the right, in which each “node” is both a row and a column in the matrix, and their intersecting square is filled only if those nodes are connected (Heer et al., 2010). Similar in spirit to a correlation matrix, the display is symmetric across the diagonal if all connections are bidirectional. However, if the connections are directional, then the two halves carry distinct information.
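
A minimal sketch of the matrix view (hypothetical people and links): each node becomes both a row and a column, a filled cell marks a connection, and because these links are bidirectional the matrix comes out symmetric about the diagonal.

```python
# A sketch of an adjacency-matrix view of a network (hypothetical data).
import numpy as np
import matplotlib.pyplot as plt

names = ["Ana", "Ben", "Cam", "Dee"]
adj = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (0, 3), (2, 3)]:
    adj[i, j] = adj[j, i] = 1  # fill both halves: connections are mutual

fig, ax = plt.subplots()
ax.imshow(adj, cmap="Greys")
ax.set_xticks(range(4))
ax.set_xticklabels(names)
ax.set_yticks(range(4))
ax.set_yticklabels(names)
plt.show()
```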

Respect assumptions about how visual channels map to “less” and “more”

Some elements of a schema are relatively universal for moderately experienced visualization users. For example, as illustrated in Figure 18, people have expectations for which “end” of some visual channels (e.g., bottom vs. top for position, light vs. dark for luminance) should map to smaller and larger numerical values. Larger values tend to be mapped to higher vertical positions, perhaps because of the natural metaphor of stacking more objects in higher piles, or larger people’s being taller. Likewise, many communities—particularly those whose languages are read from left to right—use a consistent mapping of larger values as being on the right side of horizontal space, as in the number lines found on the walls of elementary school classrooms (Tversky, 2000, 2001). Note how naturally you assume that the “first” column of any table is the one on the far left.

Fig. 18. Four types of counterintuitive graph designs. The graph at the top left shows a confusing flip of the typical mapping: The y-axis is inverted, so positive numbers increase as they move downward (Pandey et al., 2015). At right, larger values naturally map to dark values when placed on a light background (recommended), but the mapping is less clear when the colors are placed on a dark background (Schloss et al., 2018). As shown on the bottom left, a bar graph is a counterintuitive way to plot a nominal variable: Mapping the country of origin of a car company to bar length makes the data “feel” metric (Mackinlay, 1986). At right, bar graphs encourage discrete comparisons between points (“150-pounders are taller”), whereas line graphs encourage descriptions of trends (“One tends to get taller as one becomes more Dutch”); in either case, viewers may extract the wrong conceptual message (Zacks & Tversky, 1999).

When these mappings are violated, viewers can become confused (Gattis & Holyoak, 1996). The brain electrophysiology research community shifted from a convention of plotting negative values (for microvolts) up on the y-axis to a more intuitive convention of plotting positive values up, forcing experienced researchers to relearn how to read familiar patterns that were now inverted (Handy, 2005). Visualizations from popular news media have been criticized for violating the convention of using higher position to indicate larger numbers, leading to confusion among viewers (Gattis & Holyoak, 1996). One such visualization depicted an increase in gun deaths in Florida after a permissive gun law was passed (Engel, 2014; see also Kosara, 2014). It used a counterintuitively inverted y-axis, with smaller values at the top and larger values at the bottom, so that the designer could evoke a metaphor of downward streams of blood. Critics reported being initially confused by the break in convention, seeing increases in deaths as decreases. One study found that when graphs used this unconventional mapping (Fig. 18), viewers incorrectly assumed that positive values plotted below the y-axis midpoint were negative values, failing to notice the critical change in the y-axis labels (Pandey et al., 2015).

For length and area, the mapping is straightforward: Larger is more. Because these channels cannot intrinsically represent negative numbers (sizes cannot be negative), other channels must pitch in to differentiate negative values: Negative lengths can be positioned below an axis, and negative areas could be colored red. For angle, larger angles correspond to more, perhaps relying on the metaphor of climbing a hill: A steeper slope leads one higher, a flat slope does not change elevation, and a downward slope leads one lower. Of course, all of that assumes that you are “walking” from left to right, relying on a standard mapping of time and narrative as progressing from left to right (Boroditsky, 2011; Tversky et al., 1991).

For intensity, some previous work had shown that people tend to map larger values to darker colors, whereas other work had suggested the same for colors that appear more opaque. One study appears to have solved this long-standing uncertainty by showing that both factors affect the mapping depending on the background color. In Figure 18, in the map at the upper right with a white background, darker colors also appear more opaque, so darkness and opacity predict the same mapping: Either appears to show the larger number. However, in the map with a dark background, the darker areas can be seen as “see-through” so that the lighter areas seem comparatively opaque. Now, associations with darker as larger compete with associations of more opaque as larger. The clear mapping on the left is preferred, and the unclear mapping on the right must be rescued with adjustments to color combinations (Schloss et al., 2018).

One could imagine many alternative rules that could govern the mappings from numbers to visual properties. A computer vision system could be programmed to map larger numbers to higher positions, as humans do, or just as easily programmed to map them to lower positions, leftward positions, shorter lengths, or lighter values. Continua need not even be mapped consistently over continuous space: one could map the number 1 to position 1, 2 to 9, 3 to 5, and so on; as long as that translation is available to the computer vision system, it could easily recover the original values. The fixed architecture of the human visual system could not efficiently process those mappings.
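
As a toy illustration of that point (using the same hypothetical number-to-position table as above): a machine can invert any lookup table, however arbitrary, but the human visual system is built for only a few of these mappings.

```python
# A sketch of an arbitrary but machine-invertible encoding: fine for a
# computer vision system, unreadable for human vision.
encode = {1: 1, 2: 9, 3: 5, 4: 2, 5: 7}             # number -> position
decode = {pos: num for num, pos in encode.items()}  # trivially invertible

positions = [encode[v] for v in [3, 1, 5]]   # "plot" the values
recovered = [decode[p] for p in positions]   # the machine recovers them
print(recovered)  # [3, 1, 5]
```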

Respect associations between visualization designs and data types

We have so far focused only on metric visualization—the visualization of continuous magnitudes that can be positive or negative. There are other types of data (Stevens, 1946) that are frequently shown in visualizations (Bertin, 1983; Mackinlay, 1986; Munzner, 2014). One is nominal data, categorical values that can be counted or can be used to divide a set of metric numbers. Data sets often include both metric and nominal data. For example, a company might have a spreadsheet recording sales with columns of metric values (number of orders, sales in dollars, shipping time in hours) and columns of nominal values (customer name, product ID, country code). One might sum over all of the metric sales numbers or split those numbers by product ID to compute averages for each ID. Another type of data falls in between metric and nominal: ordinal values, which represent ranks that can be compared and ordered but do not reflect continuous metric differences. Examples include stars or dollar signs for restaurant ratings or price levels; designations of coach, business, and first class on an airplane; or rankings of colleges in a magazine. These values may have been generated with metric data (e.g., the cost of airplane tickets, numeric ratings of colleges), but they do not contain metric information per se.

Viewers expect particular types of data to be mapped to particular visual properties. Metric data are typically mapped to position, length, area, angle, or intensity. Nominal data can be mapped to position (e.g., a bar graph showing sales for each product ID might spread those IDs from left to right or from top to bottom) or to color or shape (e.g., to identify each point’s country code on a scatterplot pitting sales against shipping time). But if nominal data are plotted with lengths or areas, viewers can become confused. An example of this type of confusing mapping is depicted in Figure 18 (adapted from Mackinlay, 1986). The varied lengths of the bars make their values “feel” metric, such that the viewer wonders why Japan is “larger” than Italy.

Figure 18 also shows that the visual system responds to points or bars as separate objects, whereas a line connecting points “feels like” a single object that has been stretched or has moved over time and left a path (Tversky, 2001). This has led to a convention of using bars to depict values from different nominal categories and using lines to depict values from different places on a metric continuum. This mapping of nominal data to bars and metric data to lines is commonly described in textbooks and guides on graphing—however, there is more to the story. What matters more than the class of the data is the intended message to be conveyed by the visualization. Suppose one wants viewers to compare two values along a continuous dimension—for example, mood at 5 p.m. versus 9 a.m. Using bars to represent mood at those two time points would encourage interpretations such as “Mood is more positive at 5 p.m. than at 9 a.m.”; this could be the right choice despite the fact that time is a metric variable. Using the wrong visualization for a given conceptual message can lead to odd conclusions: In the height examples at the bottom right of Figure 18, the graphs violate both the data-type convention and the intended conceptual convention. Showing values as a function of weight using a bar graph tends to lead people to conclude that “150-pounders are taller than 125-pounders,” which is reasonable but perhaps not as helpful an insight as “Height increases with increasing weight.” Likewise, showing values as a function of nationality using a line graph could lead a substantial number of viewers to conclude that “One tends to become taller as one becomes more Dutch” (Zacks & Tversky, 1999).

Connect relationships in visualizations to those in the world

Even if a viewer can extract values and data relationships from a visualization, they can fail to connect those to important real-world concepts in ways that are critical for reasoning. For example, consider the graph and corresponding text at the top of Figure 19, given to 673 high school students by Farrar (2012, as cited in Whitacre & Saul, 2016). Only 4% of the students noted that the text did not correspond to the graph’s information. In another study, high school students in honors chemistry and environmental science received similar materials on the topic of teen pregnancy, and only 25% noted a discrepancy between the text and graph (Whitacre & Saul, 2016). In a more complex task (Fig. 19, bottom), sixth through eighth graders were asked to create a position–time graph on the basis of a description of a runner’s training routine. Only 1% of the students did so correctly, and most drew “graphs” representing a map of the runner’s position.

Fig. 19. Graphical reasoning problems. When shown the graph at the top, only 4% of high school students noticed inconsistencies with the accompanying text (Farrar, 2012). The bottom shows an example item from Lai et al. (2016); asked to plot a runner’s routine on a time–distance graph, most middle-school students drew “maps,” and only 1% drew a correct graph. Bottom figure adapted by permission from Springer: Journal of Science Education and Technology, “Measuring Graph Comprehension, Critique, and Construction in Science,” by K. Lai, J. Cabrera, J. M. Vitale, J. Madhok, R. Tinker, and M. C. Linn (https://doi.org/10.1007/s10956-016-9621-9). Copyright 2016.

Although students can easily extract individual values and relationships from the climate and running visualizations shown in Figure 19, they have difficulty linking the depicted patterns with the accompanying text because graphing is often taught with a focus on plotting data points and representing quantitative functions. Solving the example problems in Figure 19 instead requires that they see the underlying link between the visual and verbal representations of the patterns. This problem arises whenever visualizations must be linked to text, tables, or mathematical equations. One intriguing proposal for helping students move past extracting values and focus on making these external links is to omit numbers on axes entirely and provide only “qualitative” graphs that distill critical categorical relations. In one study, activities in which students constructed or critiqued qualitative graphs distinctly contributed to their understanding of cancer cell division and treatment impact (Matuk et al., 2019).

In the classroom, another effective method for helping students see these links is to give them abundant and explicit practice in translating between such representations. One study found that prealgebra students generated better solutions for word problems involving simple linear functions when they first generated graphical, tabular, and equation representations of those problems, compared with a control group who generated and solved equations for similar word problems (Brenner et al., 1997). The intervention group also gained a substantially better conceptual understanding of functions compared with the control group. Graphing calculators can also help students quickly see how changing features of equations affects visualized functions (Doerr & Zangor, 2000; Hollar & Norwood, 1999; Mesa, 2008); a meta-analysis of 42 studies of graphing-calculator use found benefits from middle-school mathematics through first-semester college calculus (Ellington, 2006). Another useful technique was to switch rapidly (within seconds) among multiple representations of similar concepts (tables, graphs, equations, and verbal rules), which helped students develop an understanding of functions (Kalchman & Koedinger, 2005).

Another way to facilitate these links is by starting with familiar contexts, such as earning money for each mile walked in a walkathon, allowing students to see abstractions of those functions from concrete meaningful situations (Kalchman & Koedinger, 2005). Likewise, instructors can provide students with scenarios in which they collect and summarize data and guide them toward inventing visualizations, so that the mappings between the statistical concepts and visualizations are clear (Lehrer & English, 2018; Lehrer & Romberg, 1996; Lehrer et al., 2000).

Visually Communicating Uncertainty and Risk

Why communicating uncertainty is critical

Quantifying uncertainty is critical for analysts and scientists who want to make informed decisions from data. Uncertainty is inherent in the generation of data and models, and it comes from multiple sources. Individual measurements contain noise. Measurements of a population of values typically rely on a sample of those values that is incomplete, or even biased. To estimate average human height, for example, we must rely on measuring a subset of all people. Averaging 100 people’s heights produces an estimate, but an uncertain one, because we did not definitively measure the height of every person in the world. If we assume that the 100 people are randomly sampled from the population, then our uncertainty can be calculated using statistical theory. However, if we were to sample 100 people who all happen to be NBA players, our sample would not be representative of the world, and standard statistical approaches for random samples would not adequately capture the true uncertainty.
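
A small simulation makes the point concrete (a Python sketch with assumed population parameters): each random sample of 100 heights yields a different mean, and the spread of those means is exactly the uncertainty that a single-sample estimate carries.

```python
# A sketch of sampling uncertainty: every sample of 100 "heights" gives a
# different mean, so any one sample's mean is an uncertain estimate.
import numpy as np

rng = np.random.default_rng(1)
population_mean, population_sd = 170, 10  # cm; assumed for illustration

sample_means = [rng.normal(population_mean, population_sd, 100).mean()
                for _ in range(5)]
print([round(m, 1) for m in sample_means])
# e.g., something like [169.8, 171.1, 170.3, 168.9, 170.6] -- the spread
# of such means is the standard error, about population_sd / sqrt(100)
# = 1 cm here.
```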

Effective communication of uncertainty is necessary for scientific literacy among the public; without understanding uncertainty, people cannot accurately calibrate their internal sense of how reliable a pattern or claim is. Yet communicating uncertainty is challenging as a result of many well-documented biases related to decision making, many of which are characterized by misuse of statistical evidence in reasoning (e.g., Tversky & Kahneman, 1974). Unfortunately, the level of uncertainty in analysts’ calculations is often judged to be too complex or time-consuming to communicate to lay readers or busy decision makers (Fischhoff & Davis, 2014).

Communicating uncertainty is also important for maintaining public trust. In weather forecasts, when uncertainty is presented counterintuitively (or not presented at all), people tend to discount it. Unfortunately, when the forecast turns out to be “wrong,” this can lead to a lack of trust in scientific forecasting (Binder et al., 2016; Joslyn & LeClerc, 2012). Among scientists, underemphasis or misunderstandings of sampling error (a form of uncertainty) and the likelihood of replicating experimental results may contribute to continued use of underpowered studies and the “replication crisis” across many previously accepted findings (Ioannidis, 2005).

Some common cognitive biases are related to misunderstandings of uncertainty. One reason this may be so is that many core statistical concepts, such as variability, estimation, and sampling, are complex and can be defined clearly only with reference to statistical theory. A key challenge in communicating uncertainty is that lay audiences may not appreciate that estimates are subject to variability and may not be familiar with the statistical abstractions commonly used to express these concepts (Gal, 2002).

This lack of understanding of statistical constructs is present even among researchers. In a frequentist statistical paradigm, it is not possible to make statements about the probability that a specific interval contains the true population parameter value (e.g., the true average height of U.S. males) because probability can be defined only as frequency over a long run of trials. Instead, the frequentist 95% confidence interval refers to the probability that a confidence interval constructed in the same way as the plotted interval contains the population parameter. However, even users with statistical training, such as graduate students, are prone to mistakenly conclude that it is 95% probable that a 95% confidence interval contains the true parameter (Hoekstra et al., 2012). Likewise, the relationship between statistical significance and whether or not two error bars overlap is often misunderstood: When two frequentist 95%-confidence-interval error bars do not overlap, it is correct to assume that there is a significant difference between the two quantities at an alpha level of 0.05. However, when the two intervals do overlap, researchers incorrectly assume that there is no significant difference between the two quantities (Belia et al., 2005).
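
The frequentist guarantee can itself be simulated, which may be clearer than the verbal definition (a Python sketch with assumed parameters): about 95% of intervals constructed this way cover the true mean, a statement about the procedure over many repetitions rather than about any single interval.

```python
# A sketch of what "95% confidence" promises: coverage of the procedure
# across repeated samples, not a probability for one interval.
import numpy as np

rng = np.random.default_rng(2)
true_mean, sd, n, trials = 170, 10, 100, 10_000

covered = 0
for _ in range(trials):
    sample = rng.normal(true_mean, sd, n)
    se = sample.std(ddof=1) / np.sqrt(n)
    lo, hi = sample.mean() - 1.96 * se, sample.mean() + 1.96 * se
    covered += (lo <= true_mean <= hi)
print(covered / trials)  # approximately 0.95
```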

Errors such as these likely stem from challenges that students of statistics and others face concerning how to interpret the sampling distribution of the mean implied by a 95% confidence interval or standard-error interval (Chance et al., 2004; Hullman et al., 2017). The sampling distribution is the distribution of means expected if one were to repeatedly draw samples of a given size n from a population. Although the sampling distribution is a more natural choice when trying to answer questions about the probability that a difference is not zero (i.e., statistical significance) and is systematically related to the variance in the underlying measurements (and in estimates of the parameter value in the population), this distribution does not directly address the sorts of questions one might ask about the effect of applying a treatment in the world.

Common visualizations of uncertainty are often misinterpreted

Uncertainty is challenging to communicate because the concept of uncertainty is difficult to understand in the first place. In graphs designed for a lay audience, depictions of uncertainty are often simply omitted, even when the graphs are intended to support inferences beyond the data sample that is shown (Hullman, 2019). When uncertainty is presented, it is most frequently shown through error bars, which typically represent a standard-deviation range, a standard-error range, or a confidence interval (e.g., a frequentist 95% confidence interval or a Bayesian 95% credible interval). Which of these constructs is depicted has a large influence on a viewer's impression of effect size, a quantitative measure of the magnitude of a difference, such as the difference in means between two distributions divided by their pooled standard deviation (Hofman et al., 2020). For example, when viewing results of an evaluation of a new drug relative to a control, one might wonder how much taking the new drug is likely to help a randomly drawn patient. When shown frequentist 95% confidence intervals representing uncertainty on bars representing the control and treatment outcomes (Hofman et al., 2020; see Fig. 20, left), lay viewers were inclined to pay more for a treatment and to overestimate effect size relative to when they saw error bars showing a predictive interval representing uncertainty in the underlying measurements (Fig. 20, middle). Although not as effective as showing predictive intervals, rescaling the y-axis of a chart showing 95%-confidence-interval error bars to display the range required for the predictive interval slightly reduced overpayment and bias in effect-size estimations (Fig. 20, right).

Fig. 20. Examples of misunderstandings of variance depictions. Hofman et al. (2020) found that participants in an online study who viewed the chart on the left, showing a range of 1.96 SE, were more likely to pay more for a treatment and to overestimate the size of the treatment effect than those who saw the chart in the center, showing a range of 1.96 SD. Rescaling the y-axis of a chart showing error bars representing 1.96 SE to the range required to show 1.96 SD, as in the rightmost chart, slightly reduced overpayment and overestimation. Republished with permission of the Association for Computing Machinery, from “How Visualizing Inferential Uncertainty Can Mislead Readers About Treatment Effects in Scientific Results,” by J. M. Hofman, D. G. Goldstein, and J. Hullman, CHI ’20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (https://doi.org/10.1145/3313831.3376454). Copyright 2020 Association for Computing Machinery.

Error bars can lead lay participants to use visual reasoning strategies that result in error-prone perceptions of effect size. For example, when lay participants were asked to judge the effect size between two distributions shown as bars representing means with error bars, their answers appeared to depend mostly on the difference between the means and were unaffected by the level of uncertainty around those means (Hullman et al., 2015). One investigation of the visual reasoning strategies that lay participants employ to make effect-size judgments and decisions from error bars and other common uncertainty visualizations found that many participants used approximations based on visual distances in the chart rather than reasoning more deliberately about the size and uncertainty of effects (Kale et al., 2021).

If lay viewers are unfamiliar with standard-deviation ranges and confidence intervals, it is perhaps unsurprising that they tend to misunderstand the depictions of those parameters through error bars or confidence envelopes. Error bars are not seen as parameter estimates for a continuous distribution but are instead mistaken for depictions of the discrete range of the data. When shown bars representing 95% confidence intervals of a forecasted nighttime low temperature, participants incorrectly believed that the upper and lower bounds represented forecasted high and low temperatures (Joslyn & LeClerc, 2012). This mistake occurred despite the presence of a highly visible key that depicted the correct interpretation. Figure 21 illustrates how not just error bars but bar graphs themselves can lead to similar misunderstandings of uncertainty: Viewers appear to implicitly believe that a bar contains the full distribution of data values that it summarizes because of its metaphorical status as a container. Even when reminded that the tip of a bar graph represents the mean of a distribution of values, viewers still rate points that fall slightly below the tip as more likely to belong to that distribution than points placed slightly above it (Correll & Gleicher, 2014b; Newman & Scholl, 2012).

Fig. 21. Examples of misunderstandings of the boundaries of graphical objects. At left, the dots below and above the bars are equidistant from the bars’ tips, but viewers rate the point inside the bar as being more likely to belong within the distribution the bar summarizes. At right, the graph presents an adapted version of the cone of uncertainty created by Le Liu and Donald House and tested in Ruginski et al. (2016).

Another notable example of this “discrete range” error occurs with hurricane-forecast visualizations. The most popular method for displaying hurricane paths, produced by the National Hurricane Center (see Fig. 21), uses a “cone of uncertainty” to represent 66% confidence intervals across thousands of storm-track simulations. Note that this means that many predicted storms would travel outside of this cone. Unfortunately, people incorrectly believe that the cone depicts a discrete range, such that areas in the cone are in danger and areas outside are relatively safe (e.g., Padilla et al., 2017, 2020). Even worse, many viewers do not recognize the cone as reflecting the storm’s potential paths, instead assuming that it depicts growth in the storm’s size over time.

How to visualize uncertainty more intuitively

Instead of depicting summary statistics (e.g., confidence intervals) that are rarely well understood by lay audiences, a better approach is to show the underlying probability distribution. Density plots (Fig. 22, top left) and violin plots (similar, but typically mirrored to show a single symmetric shape) show distributions by mapping the probability of a given value to width. These plots can help viewers intuitively understand the predicted variability of values, location (e.g., mean), and distribution shape (e.g., normal, skewed). They can also promote intuitive assessments of how likely or surprising different values are, in ways that are more closely aligned with normative statistical definitions (Correll & Gleicher, 2014b). Several studies have found that density plots lead to better-quality decisions than error bars showing predictive or confidence intervals (Fernandes et al., 2018; Hofman et al., 2020; Kale et al., 2021).
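As a concrete illustration, the short Python sketch below draws a density plot for a hypothetical nighttime-low forecast. The sample values and the 34 °F center are our own assumptions for illustration, not data from any cited study.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
draws = rng.normal(34, 3, 1000)            # hypothetical forecast samples (deg F)

xs = np.linspace(draws.min(), draws.max(), 200)
density = gaussian_kde(draws)(xs)          # estimated probability density

fig, ax = plt.subplots()
ax.fill_between(xs, density, alpha=0.4)    # width shows where values concentrate
ax.axvline(32, linestyle="--", color="k")  # reference line at freezing
ax.set(xlabel="Forecast low (°F)", ylabel="Density")
plt.show()
```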

Fig. 22. Depictions of distributions. At left, a probability density plot (top) and the same distribution represented as a quantile dot plot (bottom), which uses a frequency (discrete-outcome) framing; black dots or shaded regions indicate values above freezing temperatures. At right, a jittered variation with a categorical variable added via color shows uncertainty surrounding the 2020 U.S. presidential election outcome. Graph at right adapted by Anna Wiederkehr of FiveThirtyEight (for more on FiveThirtyEight's creation of similar graphics, see Wiederkehr, 2020). Used with permission from FiveThirtyEight.

In other work, some researchers have argued that visual features such as fuzziness, color value, and disorderly line arrangement convey probability more naturally than more abstract mappings, such as mappings between uncertainty and size (MacEachren et al., 2012; see Fig. 23, left). However, representations that were judged to be more associated with uncertainty did not necessarily lead to faster or more accurate judgments. These findings illustrate a trade-off: Some encoding types provide a subjective impression of uncertainty but may also reduce precision, making some tasks hard to complete. Experts might therefore prefer more precise representations because they have no trouble understanding how these visualizations convey uncertainty, whereas lay viewers likely benefit more from less precise visualizations that are more semantically resonant (i.e., more naturally interpreted as uncertainty).

Fig. 23. Methods for depicting uncertainty. On the left of the figure are different visual variables for conveying uncertainty tested by MacEachren et al. (2012), who found that fuzziness, location, arrangement, and color value were rated as relatively more logical than other variables for representing uncertainty. The right shows a bivariate color map that separately encodes value (via hue) and uncertainty (via lightness). A value-suppressing uncertainty palette uses a tree metaphor to collapse value distinctions in proportion to the uncertainty surrounding the values (Correll et al., 2018). Left figure reprinted with permission from "Visual Semiotics & Uncertainty Visualization: An Empirical Study," by A. M. MacEachren, R. E. Roth, J. O'Brien, B. Li, D. Swingley, and M. Gahegan, 2012, IEEE Transactions on Visualization and Computer Graphics, 18(12), p. 2497 (https://doi.org/10.1109/TVCG.2012.279). Copyright 2012 by IEEE. Right figure republished with permission of the Association for Computing Machinery, from "Value-Suppressing Uncertainty Palettes," by M. Correll, D. Moritz, and J. Heer, CHI '18: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (https://doi.org/10.1145/3173574.3174216). Copyright 2018 Association for Computing Machinery.

Yet another approach for showing statistical uncertainty intuitively is to rely on visual depictions that are inherently perceptually uncertain, preventing the viewer from resolving a value precisely. If the location of a point on a map could be blurred proportionally to the uncertainty in the position, then the recovered position would intuitively contain at least that level of uncertainty. Value-suppressing uncertainty palettes attempt this for color, taking advantage of a natural relationship among hue, saturation, and lightness: Hues are easier to tell apart when they are more saturated and darker (Correll et al., 2018; Fig. 23, right). Value is mapped to hue and certainty to saturation and lightness (darker for more certain). As a result, value judgments are more difficult for more uncertain values—the most uncertain values all appear as the same shade of gray. When this approach is applied to maps, users weigh uncertainty more heavily than they do when using conventional bivariate maps separating value and uncertainty.
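A rough Python sketch of this idea follows. It is our own continuous simplification rather than the discretized, tree-structured palette of Correll et al. (2018): value drives hue, and increasing uncertainty desaturates and lightens the colors until the most uncertain values converge on a common gray.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import hsv_to_rgb

value = np.linspace(0, 1, 256)        # data value, mapped to hue
uncertainty = np.linspace(0, 1, 256)  # 0 = certain, 1 = maximally uncertain
V, U = np.meshgrid(value, uncertainty)

hsv = np.stack([0.7 * V,              # hue varies with value
                1.0 - U,              # less saturated when more uncertain
                0.4 + 0.5 * U],       # lighter when more uncertain
               axis=-1)
plt.imshow(hsv_to_rgb(hsv), origin="lower", extent=[0, 1, 0, 1], aspect="auto")
plt.xlabel("Value")
plt.ylabel("Uncertainty")
plt.show()
```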

Present uncertainty as examples over space or time

Mounting evidence suggests that individuals can more effectively understand probabilities (e.g., 10%) when they are communicated as frequencies (1 out of 10; e.g., Hoffrage & Gigerenzer, 1998; Peters et al., 2011; Schapira et al., 2001). For example, individuals were informed about a headache medication's side effects, framed as either a percentage (e.g., 10% of patients got a bad rash from the medication) or a frequency (e.g., 10 out of 100 patients got a bad rash from the medication), before they rated the riskiness of the treatment. Individuals with lower numeracy rated the treatment as less risky when the information was communicated with a percentage, as opposed to a frequency, framing. Individuals with high numeracy did not show a difference in their riskiness ratings as a function of framing. These results imply that the framing influenced individuals with low numeracy because the percentage was harder for them to understand than the frequency, whereas those with high numeracy understood both formats equally well (Peters et al., 2011). The frequency framing may be more effective because it maps more naturally to how we experience probability in our daily lives (Hoffrage & Gigerenzer, 1998).

Convey probability with frequency-based visualizations

Frequency-based visualizations are those that express probability as frequency or ratio, as in a set of icons that show the numerator and denominator (icon arrays; see Fig. 24) or dots that represent discrete probabilities in a distribution (quantile dot plots; see Fig. 22, left). The defining characteristic of frequency-framing visualizations is that they allow the viewer to infer probability via the frequency of visual attributes (e.g., position or size) of discrete visual elements such as icons, dots, or lines.

Fig. 24. Icon arrays from Galesic et al. (2009), which help viewers visually compare the relative rate of stroke or heart attack with and without aspirin. Copyright 2009 by the American Psychological Association. Adapted with permission from "Using Icon Arrays to Communicate Medical Risks: Overcoming Low Numeracy," by M. Galesic, R. Garcia-Retamero, and G. Gigerenzer, Health Psychology, 28(2), p. 211 (https://doi.org/10.1037/a0014474).

For example, a substantial body of research has demonstrated that icon arrays are one of the most effective ways to communicate probability in a health-care context because they use frequency framing (Fagerlin et al., 2005; Feldman-Stewart et al., 2007; Garcia-Retamero & Galesic, 2009; Garcia-Retamero et al., 2010; Hawley et al., 2008; Tait et al., 2010; Waters et al., 2006, 2016). In one study, students and older adults read descriptions of multiple medical scenarios framing the effects of treatments in terms of either absolute or relative health risks (e.g., “For people with symptoms of arterial disease, aspirin can reduce the risk of having a stroke or heart attack by 13%” vs. “8% of such people who did not take aspirin had a stroke or heart attack, compared with 7% of such people who did take aspirin”; Galesic et al., 2009). For a second group of participants, each text description was accompanied by two icon arrays that reinforced the numeric information (see Fig. 24). When asked to estimate outcomes with and without treatment (e.g., how many people out of 1,000 would have a stroke or heart attack if they did and did not take aspirin), participants who had received the additional icon arrays were significantly more accurate than those who had received only text-based probabilities. The benefits of icon arrays were observed for both students and older adults, and individuals with low numeracy drove the effect (Galesic et al., 2009; see also Garcia-Retamero & Galesic, 2009).

Researchers have conceptually replicated the advantages of communicating ratio data as icon arrays numerous times (e.g., Garcia-Retamero et al., 2010; Okan et al., 2012; Stone et al., 2003; Zikmund-Fisher et al., 2014; for a meta-review, see Garcia-Retamero & Cokely, 2017). Another study also found that patients trust icon arrays more than other common visualization techniques (Hawley et al., 2008), and other work has confirmed that icon arrays help patients with low numeracy correctly interpret probabilities (e.g., Galesic et al., 2009; Garcia-Retamero & Galesic, 2009; Hawley et al., 2008).
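To show what a well-formed icon array involves, here is a minimal Python sketch. The 8-in-100 risk and the styling are our own illustrative choices, not stimuli from the studies above; the key properties are a systematic grid and a numerator and denominator visible in the same display.

```python
import matplotlib.pyplot as plt

numerator, rows, cols = 8, 10, 10     # 8 affected out of 100

fig, ax = plt.subplots(figsize=(4, 4))
for i in range(rows * cols):
    x, y = i % cols, i // cols        # systematic grid, easy to count
    affected = i < numerator
    ax.scatter(x, y, s=200, c="crimson" if affected else "lightgray")
ax.set(title="8 out of 100 people affected", xticks=[], yticks=[])
ax.set_aspect("equal")
plt.show()
```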

A frequency-framing approach is also recommended for conveying distributional information. In the hurricane cone-of-uncertainty example in Figure 21, viewers apply the wrong schema when translating the visualization back to the data that it represents. Viewers misinterpret the edge of the cone, which depicts error bars showing 66% confidence intervals, as ranges (Ruginski et al., 2016) or mismap the width of the cone to the storm’s size instead of its potential path (Padilla et al., 2017). Given that policymakers and the general public rely on hurricane-path information to make decisions about preparation and potential evacuation, it is critical that they understand the depicted information immediately and intuitively.

Because it seems unrealistic to attempt to teach lay viewers both the requisite underlying statistical concepts and the schema for how they are mapped to a visualization, a more fruitful route is to redesign the visualization to use a frequency-based approach that pulls from existing metaphors known to lay viewers. One strategy is to simply show numerous examples of the samples that created the summary statistics (or simulated draws from the population estimate). This mimics the experience of collecting samples in the real world: flipping coins to gauge the percentage of tails, trying multiple menu items to gauge a restaurant, or testing how various kicking styles might put spin on a soccer ball. Providing discrete-outcome framings of data can provide a frequency metaphor that reduces probability-processing errors (Hullman et al., 2015; Kale et al., 2019, 2021; Kay et al., 2016; Padilla et al., 2017).

To apply this alternative approach to hurricane forecasts, one solution to misunderstandings of the cone of uncertainty is to instead plot a sample of possible hurricane paths, allowing the viewer to intuit which areas are in most danger (Fig. 25). This method stops viewers from confusing uncertainty about the hurricane’s path with the size of the storm and helps them understand which areas are most endangered (L. Liu et al., 2018). When showing uncertainty information using a frequency metaphor, one must ensure that the samples shown are representative of the uncertainty in the data. In hurricane forecasting, track visualizations are sometimes referred to as “spaghetti plots” because they can look messy, like a plate of noodles, when randomly sampled (Fig. 25, left). To increase the usefulness of this technique, researchers have developed reconstruction procedures to ensure that the tracks clearly show the full uncertainty in a storm’s trajectory (see Fig. 25, right; L. Liu et al., 2018).

Fig. 25. Techniques for depicting uncertainty in hurricane paths. At left, randomly sampled ensemble hurricane paths, known as spaghetti plots, can look messy. At right, a new procedure for reconstructing ensemble paths can show the distribution’s full spread and reduce visual clutter. When paths are easy to distinguish, additional information—such as the storm’s size and intensity—can be added to the plot (L. Liu et al., 2018).
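To give a flavor of how such track ensembles are drawn, the toy sketch below plots synthetic random-walk tracks. It illustrates only the plotting idea, not the reconstruction procedure of L. Liu et al. (2018), and the drift parameters are arbitrary.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(7)
n_tracks, steps = 30, 50

fig, ax = plt.subplots()
for _ in range(n_tracks):
    drift = rng.normal(0.15, 0.05)                # each track veers differently
    y = np.cumsum(rng.normal(drift, 0.1, steps))  # accumulated lateral drift
    ax.plot(np.arange(steps), y, color="steelblue", alpha=0.3)
ax.set(xlabel="Hours from now", ylabel="Lateral position (arbitrary units)")
plt.show()
```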

Density and violin plots are useful for conveying the data’s shape, but it can be difficult to use them to extract specific cumulative values (What is the probability of a randomly selected value’s being less than this one?), which requires the viewer to visually calculate an integral. In contrast, quantile dot plots provide a generalizable approach for showing a distribution yet also make values easier to read, by providing probabilities not only through the heights of the function (as in the density plot) but also through the (easily countable) number of dots that contribute to that height. As seen in Figure 22 (left), a quantile dot plot represents a distribution using stacked dots that account for a specific portion of the data; in this case, each dot depicts a 5% probability.

Using the graph at the left of Figure 22 as an illustration, imagine that the viewer’s task is to decide whether the nighttime low temperature will drop below freezing. With the quantile dot plot, the viewer could count the dots to determine that there is a 30% chance the temperature will be 32 °F or lower. This same task is much harder with a density plot because it requires the viewer to visually calculate an integral under the curve. Evidence suggests that quantile dot plots improve accuracy and memory compared with density plots (Hullman et al., 2017; Kay et al., 2016) and outperform summary plots, density plots, and text descriptions of uncertainty for decisions with risk (Fernandes et al., 2018).
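A minimal sketch of this counting logic follows. The forecast parameters (a normal distribution centered at 34 °F with a 3 °F standard deviation) are assumptions of ours chosen so that the rounded dots reproduce the 30% example; with 20 dots, each dot carries 5% of the probability.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# 20 evenly spaced quantiles of an assumed forecast distribution; each dot = 5%.
quantiles = norm.ppf((np.arange(20) + 0.5) / 20, loc=34, scale=3)
bins = np.round(quantiles)                     # bin dots to whole degrees

fig, ax = plt.subplots()
counts = {}
for b in bins:
    counts[b] = counts.get(b, 0) + 1           # stack dots within a bin
    ax.scatter(b, counts[b], s=150,
               c="crimson" if b <= 32 else "steelblue")
freeze_prob = np.mean(bins <= 32)              # count dots at or below freezing
ax.set(xlabel="Forecast low (°F)",
       title=f"P(at or below 32 °F) ≈ {freeze_prob:.0%}")
plt.show()
```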

To present their 2020 U.S. presidential forecast, FiveThirtyEight used a variant of a quantile dot plot, which they called a “ball swarm” plot, showing 100 random samples from their model (Fig. 22, right; for details on the creation of these plots, see Wiederkehr, 2020). Traditionally, dot plots have been used to show distributions of experimental samples (Wilkinson, 1999). In the ball-swarm plot used by FiveThirtyEight, a total of 100 balls representing 100 hypothetical elections were clustered along an x-axis denoting the margin by which the election could be won by either candidate. Some balls were colored red in proportion to the forecast’s current predictions of Donald Trump’s chance of winning; the remainder were colored blue to depict Joe Biden’s chances.

Showing multiple samples or simulated draws from a distribution can more intuitively communicate that distribution to viewers without training in statistics. Showing those samples simultaneously can lead in many cases to accurate estimates of probability, but in other cases it can lead to new misconceptions. For example, when shown a sample of possible hurricane tracks, viewers often see that set as reflecting the entire population of possibilities (no more and no fewer) instead of merely a subset. When shown visualizations such as those in Figure 25, viewers can overreact if one of the paths directly hits their town (Padilla et al., 2020), even though the locations between neighboring paths are generally as likely to be hit as the locations underneath a path.

One solution to this problem is to show those samples serially across time instead of simultaneously in space, relying on the limited capacity of working memory to force viewers to naturally abstract those samples into intuitive statistics. Hypothetical outcome plots (HOPs; Hullman et al., 2015; Kale et al., 2019) animate a set of random draws from a distribution, showing each draw for a short duration (< 500 ms). Figure 26 presents an example set of frames in which each frame is one random draw from the joint distribution represented by the error bars on the left. The frames are shown in a random sequence, which creates an animation that can give viewers an intuitive sense of the uncertainty surrounding the true trend.

Fig. 26. Example of a hypothetical outcome plot. An example set of frames (right) in which each frame is one random draw from the distribution on the left. The frames are shown in a random sequence, which creates an animation that can give viewers an intuitive sense of the uncertainty in the true mean (Hullman et al., 2015).
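For readers who want to experiment, below is a bare-bones HOPs sketch in Python using matplotlib animation. The group means and standard deviations are made-up values; each frame redraws the two bars from one random draw, shown for 400 ms, within the under-500-ms range noted above.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

rng = np.random.default_rng(3)
means, sds = [10.0, 12.0], [2.0, 2.0]          # assumed group parameters

fig, ax = plt.subplots()
bars = ax.bar(["A", "B"], means)
ax.set_ylim(0, 20)

def update(_):
    for bar, m, s in zip(bars, means, sds):
        bar.set_height(rng.normal(m, s))       # one hypothetical outcome per frame
    return bars

anim = FuncAnimation(fig, update, frames=50, interval=400)  # ~400 ms per draw
plt.show()
```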

Presenting animated draws over time has several advantages relative to static approaches. First, joint probabilities (e.g., What is the chance that the rate of disease really is higher in Location A than in Location B?) are naturally expressed as patterns over time, but showing them in a static visualization would require intentionally choosing an encoding capable of presenting joint probability (e.g., a heat map). The visual system’s natural ability to extract summary statistics from temporal frequency (Hasher & Zacks, 1984) can allow viewers to judge effect size more easily than they could with static depictions of each distribution (e.g., as bars with error bars or violin plots, which do not convey dependence between distributions).

Second, viewers may be likely to ignore uncertainty in favor of simpler heuristics, for instance, judging an effect size using only the size of the visual difference between the means of two variables, regardless of axis scaling. Any static visualization that encodes central tendency therefore runs the risk of allowing users to discount uncertainty in their judgments. This is true whether a mean is depicted with a direct summary mark, such as a mean temperature forecast presented along with a confidence interval of the mean (Joslyn & LeClerc, 2012), or implicitly, through the highest (or widest) point in a density or violin plot of a Gaussian distribution (Hullman et al., 2015; Kale et al., 2021).

Finally, in visualizations that use the most precise visual channels to show data—for example, a map that uses both vertical and horizontal dimensions of position to show geographic locations—it can be difficult to convey uncertainty because that requires adding a new visual channel (e.g., saturation of position markers) to an already complex visualization. However, as long as a visualization is not already animated, HOPs can be used without requiring the designer to choose a new way to depict uncertainty. This has inspired visualization researchers to use probabilistic animation to show uncertainty in depictions of geospatial data (Ehlschlaeger et al., 1997; Fisher, 1993) as well as other complex visualizations (Feng et al., 2010).

Designers might also draw from the best of both worlds by combining animated draws with static displays. Each animated draw of a hypothetical outcome plot could leave a trace that slowly builds into a static display such as a gradient plot, or animated draws could help visually explain a static technique such as a density plot or error bar. The New York Times presented animated dots in a simulation to show inequalities in wealth distribution due to race (Badger et al., 2018). This combination of animated and static depictions of uncertainty is fertile ground for research at the intersection of psychology, statistical cognition, and data visualization.

How to communicate health risk

The rapid advancement of technology has made it increasingly easy to collect and analyze massive amounts of health-related data (Ola & Sedig, 2016). In addition to traditional medical records, doctors and patients now have access to information from population surveys, genomic sequencing, ancestral records, wearable devices, and medical implants. Given the high volume, rapid development, and diversity of this information, it is understandable that making decisions with health data can be complicated. A large body of research is dedicated to examining whether visualizations can help doctors and patients incorporate data into health and wellness planning (for reviews, see Ancker et al., 2006; Garcia-Retamero & Cokely, 2017; Garcia-Retamero et al., 2012; Lipkus, 2007; Lipkus & Hollands, 1999).

A consistent finding from research on the visual communication of health data is that visual aids can clarify numeric information for both patients and doctors. A systematic review of visual aids in health research found that effectively designed visual aids can improve risk assessments, resulting in reductions in health-decision biases, improvements in healthy behaviors, and increased trust in treatments (Garcia-Retamero & Cokely, 2017). It is noteworthy that the benefits of visual aids are strongest for individuals with low numeracy (Lipkus et al., 2001) and low risk literacy (i.e., ability to evaluate one's own risk; Ancker & Kaufman, 2007). Low numeracy tends to correlate with less advanced education, which disproportionately affects minoritized groups (Rodríguez et al., 2013). However, one study with 463 college-educated participants found that 16% to 20% incorrectly answered relatively simple questions about risk and probability (e.g., "Which represents the larger risk: 1%, 5%, or 10%?"; Lipkus et al., 2001).

For visual aids to effectively help people reason about health-care data, visualizations must be designed with care (Fagerlin et al., 2011). The following section provides a summary of what works in the communication of health-care data, drawing from the most consistent and reliable evidence-based findings to date.

Convey risks with absolute—not relative—rates

A rational decision maker should not make different treatment decisions depending on the way that medical risk is conveyed. Unfortunately, the presentation of medical-risk information has a profound influence on judgments, even for highly educated people. In a study of 235 practicing physicians who were asked to indicate how likely they were to treat patients for hypertension and hypercholesterolemia on the basis of information about the treatments, 41.3% were more likely to recommend a treatment when its effects were framed as a relative reduction in risk rather than an absolute reduction in risk (Forrow et al., 1992). Absolute risk is the overall probability of experiencing something, such as getting a disease, during some period of time. Relative risk is a comparison between the absolute risks of two groups (Natter & Berry, 2005). As Fagerlin et al. (2011) illustrated, the change in breast cancer risk from 4% to 2% associated with the drug Tamoxifen can be communicated in a relative format (a reduction of 50%) or an absolute format (an absolute reduction of 2 percentage points).

Absolute-risk formats are preferable: Many studies have found that the same treatment outcomes are perceived more favorably when communicated in terms of relative risk as opposed to absolute risk (for a meta-analysis, see Covey, 2007). Further, relative-risk framing can make changes in risk appear larger than absolute-risk framing (e.g., a 50% relative change incorrectly seems larger than a 2% absolute change; Akl et al., 2011; Baron, 1997; Forrow et al., 1992; Malenka et al., 1993). This framing bias makes physicians more likely to prescribe interventions when information about their effects is communicated using a relative-risk format (e.g., Bucher et al., 1994; Lacy et al., 2001; Naylor et al., 1992). A large body of evidence suggests that the absolute-risk format leads to the least biased decisions. Although this has not been tested with visualizations, the evidence points to absolute risk as the best choice.
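The arithmetic behind the Tamoxifen example is worth making explicit; the short snippet below computes both framings from the same two risks taken from the text.

```python
baseline_risk, treated_risk = 0.04, 0.02       # Tamoxifen example from the text

absolute_reduction = baseline_risk - treated_risk         # 0.02 -> "2 points"
relative_reduction = absolute_reduction / baseline_risk   # 0.50 -> "50%"

print(f"Absolute risk reduction: {absolute_reduction:.0%}")
print(f"Relative risk reduction: {relative_reduction:.0%}")
```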

Convey ratios with icon arrays and pictographs

In addition to conferring the previously described benefits of frequency framing, icon arrays help to reduce common errors in health-care communication, such as individuals' focusing on the numerator and neglecting the denominator, an effect called denominator neglect (for a review, see Garcia-Retamero et al., 2012). For example, when comparing a cancer with a mortality rate of 1,286 out of 10,000 people to a cancer with a mortality rate of 24.14 out of 100 people, undergraduate participants reported that the former cancer was riskier (Yamagishi, 1997). Researchers have proposed that individuals pay more attention to the relative differences in numerators (in this case, 1,286 vs. 24.14 deaths), even though they should consider the relative ratios (12.86% vs. 24.14% mortality; e.g., Garcia-Retamero et al., 2012; Yamagishi, 1997). A variety of studies (including Galesic et al., 2009, described above) have shown that icon arrays can reduce denominator neglect by allowing patients to compare relative ratios visually (see Fig. 24; see also Garcia-Retamero & Galesic, 2009).
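Restating the Yamagishi (1997) comparison as ratios makes the error plain:

```python
rate_a = 1286 / 10_000   # 12.86% mortality, larger numerator
rate_b = 24.14 / 100     # 24.14% mortality, smaller numerator but deadlier
print(f"{rate_a:.2%} vs. {rate_b:.2%}")  # the second cancer is riskier
```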

Icon arrays can help patients overcome common biases. In one study, participants read statistical information about two treatment options for angina—bypass surgery, which is relatively onerous but 75% successful, and balloon angioplasty, which is less onerous but only 50% successful—and anecdotes about individuals' experiences with each procedure that were either unrepresentative or representative of those success rates (50% positive and 50% negative for both procedures vs. 75% positive and 25% negative for bypass surgery). Some participants were also provided with icon arrays that reinforced the statistical information. Of the participants who did not see the icon arrays, 41% of those who received more anecdotes about bypass surgery's positive effects reported that they would likely prefer surgery, but only 20% of those who read fewer positive stories responded that they would undergo surgery, even though the statistical information was identical in both cases. When participants were provided with the icon arrays, the effect of the anecdotes disappeared (Fagerlin et al., 2005).

To avoid denominator neglect, it is best to use icon arrays that clearly show part-to-whole comparisons by putting both the denominator and the numerator in the same array (as in Fig. 24; Garcia-Retamero & Cokely, 2017). Designers should avoid showing only the numerator with icons and leaving the denominator in the text alone, because viewers will be likely to neglect the denominator (Stone et al., 2003). Icons should also be arranged in a systematic grid that makes them easy to count, because icon arrays that are not arranged systematically are challenging to use (Feldman-Stewart et al., 2007), particularly for viewers with low numeracy (Ancker et al., 2011; Zikmund-Fisher et al., 2012).

Designers can also consider using anthropomorphic icons, such as outlines of people. In one study, people saw icon arrays of their personalized cardiovascular risks based on data about their age, weight, and health metrics. Specifically, each individual was shown one of the icon-array formats shown in Figure 27 and was then asked to report their perceived personal risk on a scale from 0 (extremely small) to 100 (extremely big). People were also asked to report how likely they thought they were to have heart disease or a stroke within 10 years and were later asked how well they remembered the icon array’s information. Individuals with higher graph literacy and numeracy who were shown restroom icons had the highest correlation between their perceived risk and actual risk (Zikmund-Fisher et al., 2014). By contrast, those with low graph literacy and numeracy had the same relationship between their perceived risk and actual risk across all visualizations. In addition, participants’ memory for their risk scores was more accurate when they were shown anthropomorphic icons (restroom icons, heads, or photos), particularly for those with low graph literacy and numeracy. Participants also rated the restroom icons as the most preferable. An anthropomorphic font called Wee People provides an easy-to-use set of icons depicting realistic human silhouettes and is available under a Creative Commons license (Cairo & Klein, 2018).

Fig. 27. Icon arrays tested in Zikmund-Fisher et al. (2014). Original figure published under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.

Summary of Key Guidelines

  • A viewer’s visual system can extract broad statistics about the data within a display, such as the mean and extrema, within a fraction of a second. Visualize your data with histograms and scatterplots before trusting statistical summaries.

  • Beware common visual illusions and confusions. Failing to start axes at zero can cause viewers to overestimate differences. When plotting data with circles or squares, map the data to their areas, not their diameters. The differences between lines in a line graph are increasingly perceptually distorted as the lines increase in slope. Avoid plotting intensities on top of other intensities, which causes contrast illusions. Mapping a continuous set of numbers to a spectrum of different hues exaggerates differences that happen to straddle the hue boundaries. For accessibility to color-blind viewers, pair red with blue instead of green.

  • Although extracting global statistics is fast, comparisons between subsets of values are slow—limited to only a handful per second. So use visual grouping cues to control which set of comparisons a viewer should make, and use annotation and highlighting to narrow that set to the single most important comparison that supports your message. In a live presentation, rely on language and gesture to illustrate what you see. Do this even when you feel it is not needed: Presenters suffer from a “curse of knowledge” that causes them to overestimate how well others see what they see.

  • Avoid taxing working memory by converting legends into direct labels. When possible, integrate relevant text into visualizations as direct annotations. Avoid animations, which typically lead to confusion. Graphical embellishments, sometimes derided as “chart junk,” can distract if unrelated to the data, but if they are related, they can improve viewers’ memory and engagement.

  • New visualization formats must be learned, so try to rely on formats that are familiar to your audience. Respect common associations, such as “up” mapping to “more” for vertical position and “more opaque” mapping to “more” for intensity.

  • Graph comprehension depends on both bottom-up and top-down factors. Use bottom-up visual salience and top-down direct labels to drive attention to relevant features. Use a graph format that guides viewers to the conceptual message you are trying to convey, respecting their previous experience with graphs.

  • When communicating uncertainty to a lay audience, avoid error bars, which can be misinterpreted as data ranges. Instead, show examples of discrete outcomes, either simultaneously or over time.

  • When communicating risk to low-numeracy audiences, rely on absolute instead of relative rates, convey probabilities with frequencies (e.g., 3 out of 10) instead of percentages (e.g., 30%), and use well-constructed icon arrays with the same denominator.

  • Supporting comprehension and understanding is especially important when the intended audience may have low domain knowledge, knowledge about graphing conventions, numeracy, or working memory capacity.

Acknowledgements

We thank Evan Anderson and Madison Tyrcha for their invaluable assistance in preparing this manuscript.

Transparency

Editor: Nora S. Newcombe

Declaration of Conflicting Interests
The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.

Funding
S. L. Franconeri was supported by National Science Foundation Grants IIS-CHS-1901485 and IIS-HCC-2107490. L. M. Padilla was supported by National Science Foundation Grants 2122174 and 2028374. P. Shah was supported by National Science Foundation Grants 2030059 and 2027822 and Institute of Education Sciences Grant R305A170489-F046846. J. M. Zacks was supported by National Institute on Aging Grant R01-AG062438, Office of Naval Research Grant N00014-17-1-2961, and a James S. McDonnell Foundation Opportunity Award. J. Hullman was supported by National Science Foundation Grant 1907941 and a Microsoft Research Faculty Fellowship.

References

Abukhodair, F. A., Riecke, B. E., Erhan, H. I., Shaw, C. D. (2013). Does interactive animation control improve exploratory data analysis of animated trend visualization? In Visualization and Data Analysis 2013 (Vol. 8654, p. 86540I). International Society for Optics and Photonics. https://doi.org/10.1117/12.2001874
Ajani, K., Lee, E., Xiong, C., Knaflic, C. N., Kemper, W., Franconeri, S. (2021). Declutter and focus: Empirically evaluating design guidelines for effective data communication. IEEE Transactions on Visualization and Computer Graphics. Advance online publication. https://doi.org/10.1109/TVCG.2021.3068337
Akl, E. A., Oxman, A. D., Herrin, J., Vist, G. E., Terrenato, I., Sperati, F., Costiniuk, C., Blank, D., Schünemann, H. (2011). Using alternative statistical formats for presenting risks and risk reductions. The Cochrane Database of Systematic Reviews, 2011(3), Article CD006776. https://doi.org/10.1002/14651858.CD006776.pub2
Albers, D., Correll, M., Gleicher, M. (2014). Task-driven evaluation of aggregation in time series visualization. In CHI ’14: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 551–560). Association for Computing Machinery. https://doi.org/10.1145/2556288.2557200
Alvarez, G. A., Thompson, T. W. (2009). Overwriting and rebinding: Why feature-switch detection tasks underestimate the binding capacity of visual working memory. Visual Cognition, 17(1–2), 141–159. https://doi.org/10.1080/13506280802265496
Amabili, L. (2019, August 22). From storytelling to scrollytelling: A short introduction and beyond. Medium. https://medium.com/nightingale/from-storytelling-to-scrollytelling-a-short-introduction-and-beyond-fbda32066964
Amar, R., Eagan, J., Stasko, J. (2005). Low-level components of analytic activity in information visualization. In Stasko, J., Ward, M. (Eds.), IEEE Symposium on Information Visualization (InfoVis 2005, pp. 111–117). Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/INFVIS.2005.1532136
Ancker, J. S., Kaufman, D. (2007). Rethinking health numeracy: A multidisciplinary literature review. Journal of the American Medical Informatics Association, 14(6), 713–721. https://doi.org/10.1197/jamia.M2464
Ancker, J. S., Senathirajah, Y., Kukafka, R., Starren, J. B. (2006). Design features of graphs in health risk communication: A systematic review. Journal of the American Medical Informatics Association, 13(6), 608–618. https://doi.org/10.1197/jamia.M2115
Ancker, J. S., Weber, E. U., Kukafka, R. (2011). Effect of arrangement of stick figures on estimates of proportion in risk graphics. Medical Decision Making, 31(1), 143–150. https://doi.org/10.1177/0272989X10369006
Anscombe, F. J. (1973). Graphs in statistical analysis. The American Statistician, 27(1), 17–21. https://doi.org/10.1080/00031305.1973.10478966
Asada, K. (2019). Chromatic vision simulator. https://asada.website/cvsimulator/e/index.html
Badger, E., Cain Miller, C., Pearce, A., Quealy, K. (2018, March 19). Extensive data shows punishing reach of racism for black boys. The New York Times. https://www.nytimes.com/interactive/2018/03/19/upshot/race-class-white-and-black-men.html
Baek, J., Chong, S. C. (2020). Ensemble perception and focused attention: Two different modes of visual processing to cope with limited capacity. Psychonomic Bulletin & Review, 27(4), 602–606. https://doi.org/10.3758/s13423-020-01718-7
Baron, J. (1997). Confusion of relative and absolute risk in valuation. Journal of Risk and Uncertainty, 14(3), 301–309. https://doi.org/10.1023/A:1007796310463
Bateman, S., Mandryk, R. L., Gutwin, C., Genest, A., McDine, D., Brooks, C. (2010). Useful junk? The effects of visual embellishment on comprehension and memorability of charts. In CHI ’10: Proceedings of the 28th International Conference on Human Factors in Computing Systems (pp. 2573–2582). Association for Computing Machinery. https://doi.org/10.1145/1753326.1753716
Belia, S., Fidler, F., Williams, J., Cumming, G. (2005). Researchers misunderstand confidence intervals and standard error bars. Psychological Methods, 10(4), 389–396. https://doi.org/10.1037/1082-989X.10.4.389
Berinato, S. (2016). Good charts: The HBR guide to making smarter, more persuasive data visualizations. Harvard Business Review Press.
Bertin, J. (1981). Graphics and the graphical analysis of data (Berg, W. J., Scott, P., Trans.). Walter de Gruyter.
Bertin, J. (1983). Semiology of graphics. The University of Wisconsin Press.
Bertini, E., Correll, M., Franconeri, S. (2020). Why shouldn’t all charts be scatter plots? Beyond precision driven visualizations. In VIS: 2020 IEEE Visualization Conference (pp. 206–210). Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/VIS47514.2020.00048
Binder, A. R., Hillback, E. D., Brossard, D. (2016). Conflict or caveats? Effects of media portrayals of scientific uncertainty on audience perceptions of new technologies. Risk Analysis, 36(4), 831–846. https://doi.org/10.1111/risa.12462
Birch, S. A., Bloom, P. (2007). The curse of knowledge in reasoning about false beliefs. Psychological Science, 18(5), 382–386. https://doi.org/10.1111/j.1467-9280.2007.01909.x
Boger, T., Most, S. B., Franconeri, S. L. (2021). Jurassic mark: Inattentional blindness for a datasaurus reveals that visualizations are explored, not seen. arXiv. https://arxiv.org/abs/2108.05182v2
Borji, A., Sihite, D. N., Itti, L. (2013). Quantitative analysis of human-model agreement in visual saliency modeling: A comparative study. IEEE Transactions on Image Processing, 22(1), 55–69. https://doi.org/10.1109/TIP.2012.2210727
Borkin, M. A., Bylinskii, Z., Kim, N. W., Bainbridge, C. M., Yeh, C. S., Borkin, D., Pfister, H., Oliva, A. (2016). Beyond memorability: Visualization recognition and recall. IEEE Transactions on Visualization and Computer Graphics, 22(1), 519–528. https://doi.org/10.1109/TVCG.2015.2467732
Borkin, M. A., Vo, A. A., Bylinskii, Z., Isola, P., Sunkavalli, S., Oliva, A., Pfister, H. (2013). What makes a visualization memorable? IEEE Transactions on Visualization and Computer Graphics, 19(12), 2306–2315. https://doi.org/10.1109/TVCG.2013.234
Börner, K., Bueckle, A., Ginda, M. (2019). Data visualization literacy: Definitions, conceptual frameworks, exercises, and assessments. Proceedings of the National Academy of Sciences, USA, 116(6), 1857–1864. https://doi.org/10.1073/pnas.1807180116
Börner, K., Polley, D. E. (2014). Visual insights: A practical guide to making sense of data. MIT Press.
Boroditsky, L. (2011). How languages construct time. In Dehaene, S., Brannon, E. M. (Eds.), Space, time and number in the brain (pp. 333–341). Academic Press.
Bostock, M., Carter, S., Cox, A., Quealy, K. (2012, October 5). One report, diverging perspectives. The New York Times. https://archive.nytimes.com/www.nytimes.com/interactive/2012/10/05/business/economy/one-report-diverging-perspectives.html
Brady, T. F., Konkle, T., Alvarez, G. A. (2009). Compression in visual working memory: Using statistical regularities to form more efficient memory representations. Journal of Experimental Psychology: General, 138(4), 487–502. https://doi.org/10.1037/a0016797
Brath, R. (2014). 3D InfoVis is here to stay: Deal with it. In 2014 IEEE VIS International Workshop on 3DVis (pp. 25–31). Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/3DVis.2014.7160096
Brenner, M. E., Mayer, R. E., Moseley, B., Brar, T., Durán, R., Reed, B. S., Webb, D. (1997). Learning by understanding: The role of multiple representations in learning algebra. American Educational Research Journal, 34(4), 663–689. https://doi.org/10.3102/00028312034004663
Brewer, C. A. (1994a). Color use guidelines for mapping and visualization. In MacEachren, A. M., Taylor, D. R. F. (Eds.), Visualization in modern cartography (pp. 123–147). Elsevier Science.
Brewer, C. A. (1994b). Guidelines for use of the perceptual dimensions of color for mapping and visualization. In Bares, J. (Ed.), IS&T/SPIE 1994 International Symposium on Electronic Imaging: Science and Technology. Color hard copy and graphic arts III (Vol. 2171, pp. 54–63). International Society for Optics and Photonics. https://doi.org/10.1117/12.175328
Brooks, J. L. (2015). Traditional and new principles of perceptual grouping. In Wagemans, J. (Ed.), The Oxford handbook of perceptual organization (pp. 57–87). Oxford University Press.
Bucher, H., Weinbacher, M., Gyr, K. (1994). Influence of method of reporting study results on decision of physicians to prescribe drugs to lower cholesterol concentration. BMJ, 309(6957), 761–764. https://doi.org/10.1136/bmj.309.6957.761
Burlinson, D., Subramanian, K., Goolkasian, P. (2017). Open vs. closed shapes: New perceptual categories? IEEE Transactions on Visualization and Computer Graphics, 24(1), 574–583. https://doi.org/10.1109/TVCG.2017.2745086
Burns, A., Xiong, C., Franconeri, S., Cairo, A., Mahyar, N. (2020). How to evaluate data visualizations across different levels of understanding. In 2020 IEEE Workshop on Evaluation and Beyond: Methodological Approaches to Visualization (BELIV) (pp. 19–28). Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/BELIV51497.2020.00010
Bylinskii, Z., Kim, N. W., O’Donovan, P., Alsheikh, S., Madan, S., Pfister, H., Durand, F., Russell, B., Hertzmann, A. (2017). Learning visual importance for graphic designs and data visualizations. In UIST ’17 Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology (pp. 57–69). Association for Computing Machinery. https://doi.org/10.1145/3126594.3126653
Cairo, A. (2016). The truthful art: Data, charts, and maps for communication. New Riders.
Cairo, A. (2019). How charts lie: Getting smarter about visual information. W.W. Norton.
Cairo, A., Klein, S. (2018). Our font is made of people. OpenNews.org. https://source.opennews.org/articles/our-font-made-people/
Camerer, C., Loewenstein, G., Weber, M. (1989). The curse of knowledge in economic settings: An experimental analysis. Journal of Political Economy, 97(5), 1232–1254. https://doi.org/10.1086/261651
Canham, M., Hegarty, M. (2010). Effects of knowledge and display design on comprehension of complex graphics. Learning and Instruction, 20(2), 155–166. https://doi.org/10.1016/j.learninstruc.2009.02.014
Carpenter, P. A., Shah, P. (1998). A model of the perceptual and conceptual processes in graph comprehension. Journal of Experimental Psychology: Applied, 4(2), 75–100. https://doi.org/10.1037/1076-898X.4.2.75
Carswell, C. M. (1992). Choosing specifiers: An evaluation of the basic tasks model of graphical perception. Human Factors, 34(5), 535–554. https://doi.org/10.1177/001872089203400503
Ceja, C. R., McColeman, C. M., Xiong, C., Franconeri, S. L. (2021). Truth or square: Aspect ratio biases recall of position encodings. IEEE Transactions on Visualization and Computer Graphics, 27(2), 1054–1062. https://doi.org/10.1109/TVCG.2020.3030422
Chance, B., delMas, R., Garfield, J. (2004). Reasoning about sampling distributions. In Ben-Zvi, D., Garfield, J. (Eds.), The challenge of developing statistical literacy, reasoning and thinking (pp. 295–323). Springer.
ChanLin, L. J. (1998). Animation to teach students of different knowledge levels. Journal of Instructional Psychology, 25(3), 166–175.
Chevalier, F., Dragicevic, P., Franconeri, S. (2014). The not-so-staggering effect of staggered animations on visual tracking. IEEE Transactions on Visualization and Computer Graphics, 20(12), 2241–2250. https://doi.org/10.1109/TVCG.2014.2346424
Cleveland, W. S., McGill, R. (1984). Graphical perception: Theory, experimentation, and application to the development of graphical methods. Journal of the American Statistical Association, 79(387), 531–554. https://doi.org/10.1080/01621459.1984.10478080
Cleveland, W. S., McGill, R. (1985). Graphical perception and graphical methods for analyzing scientific data. Science, 229(4716), 828–833. https://doi.org/10.1126/science.229.4716.828
Cohen, M. A., Dennett, D. C., Kanwisher, N. (2016). What is the bandwidth of perceptual experience? Trends in Cognitive Sciences, 20, 324–335. https://doi.org/10.1016/j.tics.2016.03.006
Correll, M., Albers, D., Franconeri, S., Gleicher, M. (2012). Comparing averages in time series data. In CHI ’12: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 1095–1104). Association for Computing Machinery. https://doi.org/10.1145/2207676.2208556
Correll, M., Bertini, E., Franconeri, S. (2020). Truncating the Y-axis: Threat or menace? In CHI ’20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery. https://doi.org/10.1145/3313831.3376222
Correll, M., Gleicher, M. (2014a). Bad for data, good for the brain: Knowledge-first axioms for visualization design. In Ellis, G. (Ed.), DECISIVe 2014: Workshop on Dealing With Cognitive Biases in Visualisations. IEEE VIS 2014. http://nbn-resolving.de/urn:nbn:de:bsz:352-0-329455
Correll, M., Gleicher, M. (2014b). Error bars considered harmful: Exploring alternate encodings for mean and error. IEEE Transactions on Visualization and Computer Graphics, 20(12), 2142–2151. https://doi.org/10.1109/TVCG.2014.2346298
Correll, M., Moritz, D., Heer, J. (2018). Value-suppressing uncertainty palettes. In CHI ’18: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery. https://doi.org/10.1145/3173574.3174216
Covey, J. (2007). A meta-analysis of the effects of presenting treatment benefits in different formats. Medical Decision Making, 27(5), 638–654. https://doi.org/10.1177/0272989X07306783
de Koning, B. B., Tabbers, H. K. (2011). Facilitating understanding of movements in dynamic visualizations: An embodied perspective. Educational Psychology Review, 23(4), 501–521. https://doi.org/10.1007/s10648-011-9173-8
Demiralp, Ç., Bernstein, M. S., Heer, J. (2014). Learning perceptual kernels for visualization design. IEEE Transactions on Visualization and Computer Graphics, 20(12), 1933–1942. https://doi.org/10.1109/TVCG.2014.2346978
Doerr, H. M., Zangor, R. (2000). Creating meaning for and with the graphing calculator. Educational Studies in Mathematics, 41(2), 143–163. https://doi.org/10.1023/A:1003905929557
Ehlschlaeger, C. R., Shortridge, A. M., Goodchild, M. F. (1997). Visualizing spatial data uncertainty using animation. Computers & Geosciences, 23(4), 387–395. https://doi.org/10.1016/S0098-3004(97)00005-8
Ellington, A. J. (2006). The effects of non-CAS graphing calculators on student achievement and attitude levels in mathematics: A meta-analysis. School Science and Mathematics, 106(1), 16–26. https://doi.org/10.1111/j.1949-8594.2006.tb18067.x
Engel, P. (2014, February 16). Gun deaths in Florida: Number of murders committed using firearms. Business Insider. https://www.businessinsider.com/gun-deaths-in-florida-increased-with-stand-your-ground-2014-2
Fagerlin, A., Wang, C., Ubel, P. A. (2005). Reducing the influence of anecdotal reasoning on people’s health care decisions: Is a picture worth a thousand statistics? Medical Decision Making, 25(4), 398–405. https://doi.org/10.1177/0272989X05278931
Fagerlin, A., Zikmund-Fisher, B. J., Ubel, P. A. (2011). Helping patients decide: Ten steps to better risk communication. Journal of the National Cancer Institute, 103(19), 1436–1443. https://doi.org/10.1093/jnci/djr318
Faraday, P., Sutcliffe, A. (1997). Designing effective multimedia presentations. In CHI ’97: Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (pp. 272–278). Association for Computing Machinery. https://doi.org/10.1145/258549.258753
Farrar, C. (2012). Assessing the impact participation in science journalism activities has on scientific literacy among high school students (ERIC 541183). https://eric.ed.gov/?id=ED541183
Feldman-Stewart, D., Brundage, M. D., Zotov, V. (2007). Further insight into the perception of quantitative information: Judgments of gist in treatment decisions. Medical Decision Making, 27(1), 34–43. https://doi.org/10.1177/0272989X06297101
Feng, D., Kwock, L., Lee, Y., Taylor, R. (2010). Matching visual saliency to confidence in plots of uncertain data. IEEE Transactions on Visualization and Computer Graphics, 16(6), 980–989. https://doi.org/10.1109/TVCG.2010.176
Fernandes, M., Walls, L., Munson, S., Hullman, J., Kay, M. (2018). Uncertainty displays using quantile dotplots or CDFs improve transit decision-making. In CHI ’18: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery. https://doi.org/10.1145/3173574.3173718
Few, S. (2004). Show me the numbers. Analytics Press.
Few, S. (2009). Now you see it: Simple visualization techniques for quantitative analysis. Analytics Press.
Fischhoff, B., Davis, A. L. (2014). Communicating scientific uncertainty. Proceedings of the National Academy of Sciences, USA, 111(Suppl. 4), 13664–13671. https://doi.org/10.1073/pnas.1317504111
Fisher, P. F. (1993). Visualizing uncertainty in soil maps by animation. Cartographica, 30(2–3), 20–27. https://doi.org/10.3138/B204-32P4-263L-76W0
Forrow, L., Taylor, W. C., Arnold, R. M. (1992). Absolutely relative: How research results are summarized can affect treatment decisions. The American Journal of Medicine, 92(2), 121–124. https://doi.org/10.1016/0002-9343(92)90100-P
Franconeri, S. L. (2013). The nature and status of visual resources. In Reisberg, D. (Ed.), Oxford handbook of cognitive psychology (pp. 147–162). Oxford University Press.
Franconeri, S. L. (2021). Three perceptual tools for seeing and understanding visualized data. Current Directions in Psychological Science, 30(5), 367–375. https://doi.org/10.1177/09637214211009512
Franconeri, S. L., Scimeca, J. M., Roth, J. C., Helseth, S. A., Kahn, L. E. (2012). Flexible visual processing of spatial relationships. Cognition, 122(2), 210–227. https://doi.org/10.1016/j.cognition.2011.11.002
Freedman, E. G., Smith, L. D. (1996). The role of data and theory in covariation assessment: Implications for the theory-ladenness of observation. The Journal of Mind and Behavior, 17(4), 321–343. https://www.jstor.org/stable/43853709
Gal, I. (2002). Adults’ statistical literacy: Meanings, components, responsibilities. International Statistical Review, 70(1), 1–25. https://doi.org/10.1111/j.1751-5823.2002.tb00336.x
Galesic, M., Garcia-Retamero, R., Gigerenzer, G. (2009). Using icon arrays to communicate medical risks: Overcoming low numeracy. Health Psychology, 28(2), 210–216. https://doi.org/10.1037/a0014474
Gao, T., Hullman, J. R., Adar, E., Hecht, B., Diakopoulos, N. (2014). NewsViews: An automated pipeline for creating custom geovisualizations for news. In CHI ’14: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 3005–3014). Association for Computing Machinery. https://doi.org/10.1145/2556288.2557228
Gapminder Foundation. (2007). Gapminder tools. https://www.gapminder.org/tools/#$chart-type=bubbles
Garcia-Retamero, R., Cokely, E. T. (2017). Designing visual aids that promote risk literacy: A systematic review of health research and evidence-based design heuristics. Human Factors, 59(4), 582–627. https://doi.org/10.1177/0018720817690634
Garcia-Retamero, R., Galesic, M. (2009). Communicating treatment risk reduction to people with low numeracy skills: A cross-cultural comparison. American Journal of Public Health, 99(12), 2196–2202. https://doi.org/10.2105/AJPH.2009.160234
Garcia-Retamero, R., Galesic, M., Gigerenzer, G. (2010). Do icon arrays help reduce denominator neglect? Medical Decision Making, 30(6), 672–684. https://doi.org/10.1177/0272989X10369000
Garcia-Retamero, R., Okan, Y., Cokely, E. T. (2012). Using visual aids to improve communication of risks about health: A review. The Scientific World Journal, 2012, Article 562637. https://doi.org/10.1100/2012/562637
Garner, W. (1974). The processing of information and structure. Erlbaum.
Gattis, M., Holyoak, K. J. (1996). Mapping conceptual to spatial relations in visual reasoning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(1), 231–239. https://doi.org/10.1037/0278-7393.22.1.231
Gibson, J. J. (1979). The ecological approach to visual perception. Houghton Mifflin.
Gillan, D. J., Richman, E. H. (1994). Minimalism and the syntax of graphs. Human Factors, 36(4), 619–644. https://doi.org/10.1177/001872089403600405
Gleicher, M., Albers, D., Walker, R., Jusufi, I., Hansen, C. D., Roberts, J. C. (2011). Visual comparison for information visualization. Information Visualization, 10(4), 289–309. https://doi.org/10.1177/1473871611416549
Gleicher, M., Correll, M., Nothelfer, C., Franconeri, S. (2013). Perception of average value in multiclass scatter plots. IEEE Transactions on Visualization and Computer Graphics, 19(12), 2316–2325. https://doi.org/10.1109/TVCG.2013.183
Goldin-Meadow, S. (1999). The role of gesture in communication and thinking. Trends in Cognitive Sciences, 3(11), 419–429. https://doi.org/10.1016/S1364-6613(99)01397-2
Goldstone, R. L., Hendrickson, A. T. (2010). Categorical perception. Wiley Interdisciplinary Reviews: Cognitive Science, 1(1), 69–78.
Goo, S. K. (2015, September 16). The art and science of the scatterplot. Pew Research Center. https://www.pewresearch.org/fact-tank/2015/09/16/the-art-and-science-of-the-scatterplot/
Gramazio, C. C., Laidlaw, D. H., Schloss, K. B. (2016). Colorgorical: Creating discriminable and preferable color palettes for information visualization. IEEE Transactions on Visualization and Computer Graphics, 23(1), 521–530. https://doi.org/10.1109/TVCG.2016.2598918
Grant, E. R., Spivey, M. J. (2003). Eye movements and problem solving: Guiding attention guides thought. Psychological Science, 14(5), 462–466. https://doi.org/10.1111/1467-9280.02454
Haass, M. J., Wilson, A. T., Matzen, L. E., Divis, K. M. (2016). Modeling human comprehension of data visualizations. In Lackey, S., Shumaker, R. (Eds.), VAMR 2016: Virtual, augmented and mixed reality (Lecture Notes in Computer Science, Vol. 9740). Springer.
Haberman, J., Whitney, D. (2012). Ensemble perception: Summarizing the scene and broadening the limits of visual processing. In Wolfe, J., Robertson, L. (Eds.), From perception to consciousness: Searching with Anne Treisman (pp. 339–349). Oxford University Press.
Halford, G. S., Phillips, S., Wilson, W. H., McCredden, J., Andrews, G., Birney, D., Baker, R., Bain, J. D. (2007). Relational processing is fundamental to the central executive and is limited to four variables. In Osaka, N., Logie, R. H., D’Esposito, M. (Eds.), The cognitive neuroscience of working memory (pp. 261–280). Oxford University Press.
Handy, T. C. (Ed.). (2005). Event-related potentials: A methods handbook. MIT Press.
Haroz, S., Kosara, R., Franconeri, S. L. (2015a). The connected scatter plot for presenting paired time series. IEEE Transactions on Visualization and Computer Graphics, 22(9), 2174–2186. https://doi.org/10.1109/TVCG.2015.2502587
Haroz, S., Kosara, R., Franconeri, S. L. (2015b). ISOTYPE visualization: Working memory, performance, and engagement with pictographs. In CHI ’15: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (pp. 1191–1200). Association for Computing Machinery. https://doi.org/10.1145/2702123.2702275
Harrison, L., Yang, F., Franconeri, S., Chang, R. (2014). Ranking visualizations of correlation using Weber’s law. IEEE Transactions on Visualization and Computer Graphics, 20(12), 1943–1952. https://doi.org/10.1109/TVCG.2014.2346979
Harrower, M., Brewer, C. A. (2003). ColorBrewer.org: An online tool for selecting colour schemes for maps. The Cartographic Journal, 40(1), 27–37. https://doi.org/10.1179/000870403235002042
Hasher, L., Zacks, R. T. (1984). Automatic processing of fundamental information: The case of frequency of occurrence. American Psychologist, 39(12), 1372–1388. https://doi.org/10.1037/0003-066X.39.12.1372
Hawley, S. T., Zikmund-Fisher, B., Ubel, P., Jancovic, A., Lucas, T., Fagerlin, A. (2008). The impact of the format of graphical presentation on health-related knowledge and treatment choices. Patient Education and Counseling, 73(3), 448–455. https://doi.org/10.1016/j.pec.2008.07.023
Hearst, M., Pedersen, E., Patil, L. P., Lee, E., Laskowski, P., Franconeri, S. (2020). An evaluation of semantically grouped word cloud designs. IEEE Transactions on Visualization and Computer Graphics, 26(9), 2748–2761. https://doi.org/10.1109/TVCG.2019.2904683
Heer, J., Bostock, M. (2010). Crowdsourcing graphical perception: Using Mechanical Turk to assess visualization design. In CHI ’10: Proceedings of the 28th International Conference on Human Factors in Computing Systems (pp. 203–212). Association for Computing Machinery. https://doi.org/10.1145/1753326.1753357
Heer, J., Bostock, M., Ogievetsky, V. (2010). A tour through the visualization zoo. Communications of the ACM, 53(6), 59–67. https://doi.org/10.1145/1743546.1743567
Hegarty, M. (2011). The cognitive science of visual-spatial displays: Implications for design. Topics in Cognitive Science, 3(3), 446–474. https://doi.org/10.1111/j.1756-8765.2011.01150.x
Helgeson, R. D., Moriarty, R. A. (1993). The effect of fill patterns on graphical interpretation and decision making (Accession No. ADA276274) [Master’s thesis]. Air Force Institute of Technology, Wright-Patterson Air Force Base. https://apps.dtic.mil/sti/citations/ADA276274
Henderson, S., Segal, E. H. (2013). Visualizing qualitative data in evaluation research. New Directions for Evaluation, 139, 53–71. https://doi.org/10.1002/ev.20067
Hill, S., Wray, B., Sibona, C. (2017). Minimalism in data visualization: Perceptions of beauty, clarity, effectiveness, and simplicity. Journal of Information Systems Applied Research, 11(1), 34–46. http://jisar.org/2018-11/n1/JISARv11n1p34.html
Hoekstra, R., Johnson, A., Kiers, H. A. (2012). Confidence intervals make a difference: Effects of showing confidence intervals on inferential reasoning. Educational and Psychological Measurement, 72(6), 1039–1052. https://doi.org/10.1177/0013164412450297
Hoffrage, U., Gigerenzer, G. (1998). Using natural frequencies to improve diagnostic inferences. Academic Medicine, 73(5), 538–540.
Hofman, J. M., Goldstein, D. G., Hullman, J. (2020). How visualizing inferential uncertainty can mislead readers about treatment effects in scientific results. In CHI ’20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery. https://doi.org/10.1145/3313831.3376454
Hollar, J. C., Norwood, K. (1999). The effects of a graphing-approach intermediate algebra curriculum on students’ understanding of function. Journal for Research in Mathematics Education, 30(2), 220–226. https://doi.org/10.2307/749612
Holmes, N. (1984). Designer’s guide to creating charts & diagrams. Watson-Guptill.
Huang, L. (2020). Space of preattentive shape features. Journal of Vision, 20(4), 1–20. https://doi.org/10.1167/jov.20.4.10
Huff, D. (1954). How to lie with statistics. W.W. Norton.
Hullman, J. (2019). How to get better at embracing unknowns. Scientific American. https://www.scientificamerican.com/article/how-to-get-better-at-embracing-unknowns/
Hullman, J., Adar, E., Shah, P. (2011). The impact of social information on visual judgments. In CHI ’11: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 1461–1470). Association for Computing Machinery. https://doi.org/10.1145/1978942.1979157
Hullman, J., Diakopoulos, N. (2011). Visualization rhetoric: Framing effects in narrative visualization. IEEE Transactions on Visualization and Computer Graphics, 17(12), 2231–2240. https://doi.org/10.1109/TVCG.2011.255
Hullman, J., Diakopoulos, N., Adar, E. (2013). Contextifier: Automatic generation of annotated stock visualizations. In CHI ’13: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 2707–2716). Association for Computing Machinery. https://doi.org/10.1145/2470654.2481374
Hullman, J., Drucker, S., Riche, N. H., Lee, B., Fisher, D., Adar, E. (2013). A deeper understanding of sequence in narrative visualization. IEEE Transactions on Visualization and Computer Graphics, 19(12), 2406–2415. https://doi.org/10.1109/TVCG.2013.119
Hullman, J., Kay, M., Kim, Y. S., Shrestha, S. (2017). Imagining replications: Graphical prediction & discrete visualizations improve recall & estimation of effect uncertainty. IEEE Transactions on Visualization and Computer Graphics, 24(1), 446–456. https://doi.org/10.1109/TVCG.2017.2743898
Hullman, J., Resnick, P., Adar, E. (2015). Hypothetical outcome plots outperform error bars and violin plots for inferences about reliability of variable ordering. PLOS ONE, 10(11), Article e0142444. https://doi.org/10.1371/journal.pone.0142444
Inbar, O., Tractinsky, N., Meyer, J. (2007). Minimalism in information visualization: Attitudes towards maximizing the data-ink ratio. In ECCE ’07: Proceedings of the 14th European Conference on Cognitive Ergonomics: Invent! Explore! (pp. 185–188). Association for Computing Machinery. https://doi.org/10.1145/1362550.1362587
Ioannidis, J. P. (2005). Why most published research findings are false. PLOS Medicine, 2(8), Article e124. https://doi.org/10.1371/journal.pmed.0020124
Jardine, N., Ondov, B. D., Elmqvist, N., Franconeri, S. (2019). The perceptual proxies of visual comparison. IEEE Transactions on Visualization and Computer Graphics, 26(1), 1012–1021. https://doi.org/10.1109/TVCG.2019.2934786
Joslyn, S., LeClerc, J. (2012). Uncertainty forecasts improve weather related decisions and attenuate the effects of forecast error. Journal of Experimental Psychology: Applied, 18(1), 126–140. https://doi.org/10.1037/a0025185
Kalchman, M., Koedinger, K. R. (2005). Teaching and learning functions. In Donovan, M. S., Bransford, J. D. (Eds.), How students learn: History, mathematics, and science in the classroom (pp. 351–393). National Academies Press.
Kale, A., Kay, M., Hullman, J. (2021). Visual reasoning strategies for effect size judgments and decisions. IEEE Transactions on Visualization and Computer Graphics, 27(1), 272–282. https://doi.org/10.1109/TVCG.2020.3030335
Kale, A., Nguyen, F., Kay, M., Hullman, J. (2019). Hypothetical outcome plots help untrained observers judge trends in ambiguous data. IEEE Transactions on Visualization and Computer Graphics, 25(1), 892–902. https://doi.org/10.1109/TVCG.2018.2864909
Kay, M., Kola, T., Hullman, J., Munson, S. (2016). When (ish) is my bus? User-centered visualizations of uncertainty in everyday, mobile predictive systems. In CHI ’16: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (pp. 5092–5103). Association for Computing Machinery. https://doi.org/10.1145/2858036.2858558
Kelly, J. D. (1989). The data-ink ratio and accuracy of newspaper graphs. Journalism Quarterly, 66(3), 632–639.
Kim, N. W., Bylinskii, Z., Borkin, M. A., Gajos, K. Z., Oliva, A., Durand, F., Pfister, H. (2017). BubbleView: An interface for crowdsourcing image importance maps and tracking visual attention. ACM Transactions on Computer-Human Interaction, 24(5), Article 36. https://doi.org/10.1145/3131275
Kim, Y., Heer, J. (2018). Assessing effects of task and data distribution on the effectiveness of visual encodings. Computer Graphics Forum, 37(3), 157–167. https://doi.org/10.1111/cgf.13409
Kirk, A. (2012). Data visualization: A successful design process. Packt Publishing Ltd.
Kirsh, D. (2005). Metacognition, distributed cognition and visual design. In Gärdenfors, P., Johansson, P. (Eds.), Cognition, education, and communication technology (pp. 147–180). Routledge.
Knaflic, C. N. (2015). Storytelling with data. John Wiley & Sons.
Kohlhammer, J., Nazemi, K., Ruppert, T., Burkhardt, D. (2012). Toward visualization in policy modeling. IEEE Computer Graphics and Applications, 32(5), 84–89. https://doi.org/10.1109/MCG.2012.107
Kong, H. K., Liu, Z., Karahalios, K. (2018). Frames and slants in titles of visualizations on controversial topics. In CHI ’18: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery. https://doi.org/10.1145/3173574.3174012
Kong, H. K., Liu, Z., Karahalios, K. (2019). Trust and recall of information across varying degrees of title-visualization misalignment. In CHI ’19: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery. https://doi.org/10.1145/3290605.3300576
Kosara, R. (2014, June 15). When bars point down. EagerEyes. https://eagereyes.org/journalism/when-bars-point-down
Kosslyn, S. (2010). Better PowerPoint®: Quick fixes based on how your audience thinks. Oxford University Press.
Kriz, S., Hegarty, M. (2007). Top-down and bottom-up influences on learning from animations. International Journal of Human-Computer Studies, 65(11), 911–930. https://doi.org/10.1016/j.ijhcs.2007.06.005
Lacy, C. R., Barone, J. A., Suh, D.-C., Malini, P. L., Bueno, M., Moylan, D. M., Kostis, J. B. (2001). Impact of presentation of research results on likelihood of prescribing medications to patients with left ventricular dysfunction. The American Journal of Cardiology, 87(2), 203–207. https://doi.org/10.1016/S0002-9149(00)01317-5
Lai, K., Cabrera, J., Vitale, J. M., Madhok, J., Tinker, R., Linn, M. C. (2016). Measuring graph comprehension, critique, and construction in science. Journal of Science Education and Technology, 25(4), 665–681. https://doi.org/10.1007/s10956-016-9621-9
Lambrechts, M. (n.d.). Xenographics: Weird but sometimes useful charts. https://xeno.graphics/
Lee, S., Kim, S. H., Kwon, B. C. (2016). VLAT: Development of a visualization literacy assessment test. IEEE Transactions on Visualization and Computer Graphics, 23(1), 551–560. https://doi.org/10.1109/TVCG.2016.2598920
Lehrer, R., English, L. (2018). Introducing children to modeling variability. In Ben-Zvi, D., Makar, K., Garfield, J. (Eds.), International handbook of research in statistics education (pp. 229–260). Springer International Publishing. https://doi.org/10.1007/978-3-319-66195-7_7
Lehrer, R., Romberg, T. (1996). Exploring children’s data modeling. Cognition and Instruction, 14(1), 69–108. https://doi.org/10.1207/s1532690xci1401_3
Lehrer, R., Schauble, L., Petrosino, M. (2000). Modeling in mathematics and science. In Glaser, R. (Ed.), Advances in instructional psychology: Education design and cognitive science (Vol. 5, pp. 101–159). Erlbaum.
Levy, E., Zacks, J., Tversky, B., Schiano, D. (1996). Gratuitous graphics? Putting preferences in perspective. In Tauber, M. J. (Ed.), Proceedings of the ACM Conference on Human Factors in Computing Systems (pp. 42–49). Association for Computing Machinery. https://doi.org/10.1145/238386.238400
Li, H., Moacdieh, N. (2014). Is “chart junk” useful? An extended examination of visual embellishment. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 58(1), 1516–1520. https://doi.org/10.1177/1541931214581316
Lin, S., Fortuna, J., Kulkarni, C., Stone, M., Heer, J. (2013). Selecting semantically-resonant colors for data visualization. Computer Graphics Forum, 32(3 Pt 4), 401–410. https://doi.org/10.1111/cgf.12127
Lipkus, I. M. (2007). Numeric, verbal, and visual formats of conveying health risks: Suggested best practices and future recommendations. Medical Decision Making, 27(5), 696–713. https://doi.org/10.1177/0272989X07307271
Lipkus, I. M., Hollands, J. G. (1999). The visual communication of risk. JNCI Monographs, 1999(25), 149–163. https://doi.org/10.1093/oxfordjournals.jncimonographs.a024191
Lipkus, I. M., Samsa, G., Rimer, B. K. (2001). General performance on a numeracy scale among highly educated samples. Medical Decision Making, 21(1), 37–44. https://doi.org/10.1177/0272989X0102100105
Liu, L., Padilla, L., Creem-Regehr, S. H., House, D. H. (2018). Visualizing uncertain tropical cyclone predictions using representative samples from ensembles of forecast tracks. IEEE Transactions on Visualization and Computer Graphics, 25(1), 882–891. https://doi.org/10.1109/TVCG.2018.2865193
Liu, Y., Heer, J. (2018). Somewhere over the rainbow: An empirical assessment of quantitative colormaps. In CHI ’18: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery. https://doi.org/10.1145/3173574.3174172
Logan, G. D., Compton, B. J. (1998). Attention and automaticity. In Wright, R. D. (Ed.), Visual attention (pp. 108–131). Oxford University Press.
Lohse, G. L. (1993). A cognitive model for understanding graphical perception. Human-Computer Interaction, 8(4), 353–388. https://doi.org/10.1207/s15327051hci0804_3
Luck, S. J., Kappenman, E. S. (2012). ERP components and selective attention. In Luck, S. J., Kappenman, E. S. (Eds.), Oxford Library of Psychology. The Oxford handbook of event-related potential components (pp. 295–327). Oxford University Press.
MacDonald-Ross, M. (1977). How numbers are shown: A review of research on the presentation of quantitative data in texts. AV Communication Review, 25, 359–409.
MacEachren, A. M., Roth, R. E., O’Brien, J., Li, B., Swingley, D., Gahegan, M. (2012). Visual semiotics & uncertainty visualization: An empirical study. IEEE Transactions on Visualization and Computer Graphics, 18(12), 2496–2505. https://doi.org/10.1109/TVCG.2012.279
Mackinlay, J. (1986). Automating the design of graphical presentations of relational information. ACM Transactions on Graphics, 5(2), 110–141. https://doi.org/10.1145/22949.22950
Malenka, D. J., Baron, J. A., Johansen, S., Wahrenberger, J. W., Ross, J. M. (1993). The framing effect of relative and absolute risk. Journal of General Internal Medicine, 8(10), 543–548. https://doi.org/10.1007/BF02599636
Mandler, J. M., Ritchey, G. H. (1977). Long-term memory for pictures. Journal of Experimental Psychology: Human Learning and Memory, 3(4), 386–396. https://doi.org/10.1037/0278-7393.3.4.386
Matejka, J., Fitzmaurice, G. (2017). Same stats, different graphs: Generating datasets with varied appearance and identical statistics through simulated annealing. In CHI ’17: Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (pp. 1290–1294). Association for Computing Machinery. https://doi.org/10.1145/3025453.3025912
Matuk, C., Zhang, J., Uk, I., Linn, M. C. (2019). Qualitative graphing in an authentic inquiry context: How construction and critique help middle school students to reason about cancer. Journal of Research in Science Teaching, 56(7), 905–936. https://doi.org/10.1002/tea.21533
Matzen, L. E., Haass, M. J., Divis, K. M., Wang, Z., Wilson, A. T. (2018). Data visualization saliency model: A tool for evaluating abstract data visualizations. IEEE Transactions on Visualization and Computer Graphics, 24(1), 563–573. https://doi.org/10.1109/TVCG.2017.2743939
Mautone, P. D., Mayer, R. E. (2007). Cognitive aids for guiding graph comprehension. Journal of Educational Psychology, 99(3), 640–652.
Mayer, R. E., Fiorella, L. (Eds.). (in press). The Cambridge handbook of multimedia learning (3rd ed.). Cambridge University Press.
Mayer, R. E., Moreno, R. (2003). Nine ways to reduce cognitive load in multimedia learning. Educational Psychologist, 38(1), 43–52. https://doi.org/10.1207/S15326985EP3801_6
McColeman, C. M., Harrison, L., Feng, M., Franconeri, S. (2021). No mark is an island: Precision and category repulsion biases in data reproductions. IEEE Transactions on Visualization and Computer Graphics, 27(2), 1063–1072. https://doi.org/10.1109/TVCG.2020.3030345
Mehta, R., Guzmán, L. D. (2018). Fake or visual trickery? Understanding the quantitative visual rhetoric in the news. Journal of Media Literacy Education, 10(2), 104–122.
Mesa, V. (2008). Solving problems on functions: Role of the graphing calculator. PNA. Revista de Investigación en Didáctica de la Matematica, 2(3), 109–135. https://doi.org/10.30827/pna.v2i3.6198
Michal, A., Shah, P., Uttal, D. H., Franconeri, S. (2018). Improving graph comprehension with a visuospatial intervention [Abstract]. In Rogers, T. T., Rau, M., Zhu, X., Kalish, C. W. (Eds.), Proceedings of the 40th Annual Conference of the Cognitive Science Society (p. 2115). Cognitive Science Society. https://cognitivesciencesociety.org/wp-content/uploads/2019/01/cogsci18_proceedings.pdf
Michal, A. L., Franconeri, S. L. (2017). Visual routines are associated with specific graph interpretations. Cognitive Research: Principles and Implications, 2(1), Article 20. https://doi.org/10.1186/s41235-017-0059-2
Moore, D. S., McCabe, G. P., Craig, B. A. (2017). Introduction to the practice of statistics. W.H. Freeman, Macmillan Learning.
Moreno, R., Mayer, R. E. (1999). Cognitive principles of multimedia learning: The role of modality and contiguity. Journal of Educational Psychology, 91(2), 358–368. https://doi.org/10.1037/0022-0663.91.2.358
Munzner, T. (2014). Visualization analysis and design. AK Peters/CRC Press.
National Governors Association Center for Best Practices and Council of Chief State School Officers. (2010). Common core state standards. NGAC and CCSSO. http://www.corestandards.org/read-the-standards/
National Research Council. (2013). Next generation science standards: For states, by states. The National Academies Press. https://doi.org/10.17226/18290
Natter, H. M., Berry, D. C. (2005). Effects of active information processing on the understanding of risk information. Applied Cognitive Psychology, 19(1), 123–135. https://doi.org/10.1002/acp.1068
Naylor, C. D., Chen, E., Strauss, B. (1992). Measured enthusiasm: Does the method of reporting trial results alter perceptions of therapeutic effectiveness? Annals of Internal Medicine, 117(11), 916–921. https://doi.org/10.7326/0003-4819-117-11-916
Newcombe, N., Huttenlocher, J., Sandberg, E., Lie, E., Johnson, S. (1999). What do misestimations and asymmetries in spatial judgement indicate about spatial representation? Journal of Experimental Psychology: Learning, Memory, and Cognition, 25(4), 986–996. https://doi.org/10.1037/0278-7393.25.4.986
Newman, G. E., Scholl, B. J. (2012). Bar graphs depicting averages are perceptually misinterpreted: The within-the-bar bias. Psychonomic Bulletin & Review, 19, 601–607. https://doi.org/10.3758/s13423-012-0247-5
Nistal, A. A., Van Dooren, W., Clarebout, G., Elen, J., Verschaffel, L. (2009). Conceptualising, investigating and stimulating representational flexibility in mathematical problem solving and learning: A critical review. ZDM Mathematics Education, 41(5), 627–636. https://doi.org/10.1007/s11858-009-0189-1
Nothelfer, C., Franconeri, S. (2019). Measures of the benefit of direct encoding of data deltas for data pair relation perception. IEEE Transactions on Visualization and Computer Graphics, 26(1), 311–320. https://doi.org/10.1109/TVCG.2019.2934801
Nothelfer, C., Gleicher, M., Franconeri, S. (2017). Redundant encoding strengthens segmentation and grouping in visual displays of data. Journal of Experimental Psychology: Human Perception and Performance, 43(9), 1667–1676. https://doi.org/10.1037/xhp0000314
Nyhan, B., Reifler, J. (2019). The roles of information deficits and identity threat in the prevalence of misperceptions. Journal of Elections, Public Opinion and Parties, 29(2), 222–244. https://doi.org/10.1080/17457289.2018.1465061
Okan, Y., Garcia-Retamero, R., Cokely, E. T., Maldonado, A. (2012). Individual differences in graph literacy: Overcoming denominator neglect in risk comprehension. Journal of Behavioral Decision Making, 25(4), 390–401. https://doi.org/10.1002/bdm.751
Ola, O., Sedig, K. (2016). Beyond simple charts: Design of visualizations for big health data. Online Journal of Public Health Informatics, 8(3), Article e195. https://doi.org/10.5210/ojphi.v8i3.7100
Oliva, A. (2005). Gist of the scene. In Itti, L., Rees, G., Tsotsos, J. K. (Eds.), Neurobiology of attention (pp. 251–256). Elsevier. https://doi.org/10.1016/B978-012375731-9/50045-8
Olson, J. M., Brewer, C. A. (1997). An evaluation of color selections to accommodate map users with color-vision impairments. Annals of the Association of American Geographers, 87(1), 103–134. https://doi.org/10.1111/0004-5608.00043
Ondov, B. D., Jardine, N., Elmqvist, N., Franconeri, S. (2019). Face to face: Evaluating visual comparison. IEEE Transactions on Visualization and Computer Graphics, 25(1), 861–871. https://doi.org/10.1109/TVCG.2018.2864884
Ondov, B. D., Yang, F., Kay, M., Elmqvist, N., Franconeri, S. (2021). Revealing perceptual proxies with adversarial examples. IEEE Transactions on Visualization and Computer Graphics, 27(2), 1073–1083. https://doi.org/10.1109/TVCG.2020.3030429
Otten, J. J., Cheng, K., Drewnowski, A. (2015). Infographics and public policy: Using data visualization to convey complex information. Health Affairs, 34(11), 1901–1907. https://doi.org/10.1377/hlthaff.2015.0642
Padilla, L. M. K., Creem-Regehr, S. H., Hegarty, M., Stefanucci, J. K. (2018). Decision making with visualizations: A cognitive framework across disciplines. Cognitive Research: Principles and Implications, 3(1), Article 29. https://doi.org/10.1186/s41235-018-0120-9
Padilla, L. M. K., Creem-Regehr, S. H., Thompson, W. (2020). The powerful influence of marks: Visual and knowledge-driven processing in hurricane track displays. Journal of Experimental Psychology: Applied, 26(1), 1–15. https://doi.org/10.1037/xap0000245
Padilla, L. M. K., Kay, M., Hullman, J. (2021). Uncertainty visualization. In Balakrishnan, N., Colton, T., Everitt, B., Piegorsch, W., Ruggeri, F., Teugels, J. L. (Eds.), Wiley StatsRef: Statistics reference online. John Wiley. https://doi.org/10.1002/9781118445112.stat08296
Padilla, L. M. K., Ruginski, I. T., Creem-Regehr, S. H. (2017). Effects of ensemble and summary displays on interpretations of geospatial uncertainty data. Cognitive Research: Principles and Implications, 2(1), Article 40. https://doi.org/10.1186/s41235-017-0076-1
Paik, E. S., Schraw, G. (2013). Learning with animation and illusions of understanding. Journal of Educational Psychology, 105(2), 278–289. https://doi.org/10.1037/a0030281
Palmer, J. (1995). Attention in visual search: Distinguishing four causes of a set-size effect. Current Directions in Psychological Science, 4(4), 118–123. https://doi.org/10.1111/1467-8721.ep10772534
Palmer, S. E. (1975). The effects of contextual scenes on the identification of objects. Memory & Cognition, 3, 519–526.
Pandey, A. V., Manivannan, A., Nov, O., Satterthwaite, M., Bertini, E. (2014). The persuasive power of data visualization. IEEE Transactions on Visualization and Computer Graphics, 20(12), 2211–2220. https://doi.org/10.1109/TVCG.2014.2346419
Pandey, A. V., Rall, K., Satterthwaite, M. L., Nov, O., Bertini, E. (2015). How deceptive are deceptive visualizations? An empirical analysis of common distortion techniques. In CHI ’15: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (pp. 1469–1478). Association for Computing Machinery. https://doi.org/10.1145/2702123.2702608
Parker, I. (2001, May 28). Absolute PowerPoint. The New Yorker. https://www.newyorker.com/magazine/2001/05/28/absolute-powerpoint
Peebles, D., Cheng, P. C. H. (2003). Modeling the effect of task and graphical representation on response latency in a graph reading task. Human Factors, 45(1), 28–46.
Peters, E., Hart, P. S., Fraenkel, L. (2011). Informing patients: The influence of numeracy, framing, and format of side effect information on risk perceptions. Medical Decision Making, 31(3), 432–436.
Plaisant, C. (2005). Information visualization and the challenge of universal usability. In Dykes, J., MacEachren, A. M., Kraak, M. J. (Eds.), Exploring geovisualization (pp. 53–82). Elsevier. https://doi.org/10.1016/B978-008044531-1/50421-8
Price, M. M., Crumley-Branyon, J. J., Leidheiser, W. R., Pak, R. (2016). Effects of information visualization on older adults’ decision-making performance in a Medicare plan selection task: A comparative usability study. JMIR Human Factors, 3(1), Article e16. https://doi.org/10.2196/humanfactors.5106
Purves, D., Williams, S. M., Nundy, S., Lotto, R. B. (2004). Perceiving the intensity of light. Psychological Review, 111(1), 142–158. https://doi.org/10.1037/0033-295X.111.1.142
Quinan, P. S., Padilla, L. M., Creem-Regehr, S. H., Meyer, M. (2019). Examining implicit discretization in spectral schemes. Computer Graphics Forum, 38(3), 363–374. https://doi.org/10.1111/cgf.13695
Rensink, R. A., Baldridge, G. (2010). The perception of correlation in scatterplots. Computer Graphics Forum, 29(3), 1203–1210. https://doi.org/10.1111/j.1467-8659.2009.01694.x
Robertson, G., Fernandez, R., Fisher, D., Lee, B., Stasko, J. (2008). Effectiveness of animation in trend visualization. IEEE Transactions on Visualization and Computer Graphics, 14(6), 1325–1332. https://doi.org/10.1109/TVCG.2008.125
Rodríguez, V., Andrade, A. D., García-Retamero, R., Anam, R., Rodríguez, R., Lisigurski, M., Sharit, J., Ruiz, J. G. (2013). Health literacy, numeracy, and graphical literacy among veterans in primary care and their effect on shared decision making and trust in physicians. Journal of Health Communication, 18(1), 273–289. https://doi.org/10.1080/10810730.2013.829137
Rosling, H. (2006, February). The best stats you’ve ever seen [Video]. TED Conferences. https://www.ted.com/talks/Hans_rosling_the_best_stats_you_ve_ever_seen/
Roth, J. C., Franconeri, S. L. (2012). Asymmetric coding of categorical spatial relations in both language and vision. Frontiers in Psychology, 3, Article 464. https://doi.org/10.3389/fpsyg.2012.00464
Ruginski, I. T., Boone, A. P., Padilla, L. M., Liu, L., Heydari, N., Kramer, H. S., Hegarty, M., Thompson, W. B., House, D. H., Creem-Regehr, S. H. (2016). Non-expert interpretations of hurricane forecast uncertainty visualizations. Spatial Cognition & Computation, 16(2), 154–172. https://doi.org/10.1080/13875868.2015.1137577
Rumelhart, D. E. (1980). On evaluating story grammars. Cognitive Science, 4(3), 313–316. https://doi.org/10.1207/s15516709cog0403_5
Schapira, M. M., Nattinger, A. B., McHorney, C. A. (2001). Frequency or probability? A qualitative study of risk communication formats used in health care. Medical Decision Making, 21(6), 459–467. https://doi.org/10.1177/0272989X0102100604
Schloss, K. B., Gramazio, C. C., Silverman, A. T., Parker, M. L., Wang, A. S. (2018). Mapping color to meaning in colormap data visualizations. IEEE Transactions on Visualization and Computer Graphics, 25(1), 810–819. https://doi.org/10.1109/TVCG.2018.2865147
Schwan, S., Riempp, R. (2004). The cognitive benefits of interactive videos: Learning to tie nautical knots. Learning and Instruction, 14(3), 293–305. https://doi.org/10.1016/j.learninstruc.2004.06.005
Scimeca, J. M., Franconeri, S. L. (2014). Selecting and tracking multiple objects. Wiley Interdisciplinary Reviews: Cognitive Science, 6(2), 109–118. https://doi.org/10.1002/wcs.1328
Segel, E., Heer, J. (2010). Narrative visualization: Telling stories with data. IEEE Transactions on Visualization and Computer Graphics, 16(6), 1139–1148. https://doi.org/10.1109/TVCG.2010.179
Setlur, V., Stone, M. C. (2015). A linguistic approach to categorical color assignment for data visualization. IEEE Transactions on Visualization and Computer Graphics, 22(1), 698–707. https://doi.org/10.1109/TVCG.2015.2467471
Shah, P., Carpenter, P. A. (1995). Conceptual limitations in comprehending line graphs. Journal of Experimental Psychology: General, 124(1), 43–61. https://doi.org/10.1037/0096-3445.124.1.43
Shah, P., Freedman, E. G. (2011). Bar and line graph comprehension: An interaction of top-down and bottom-up processes. Topics in Cognitive Science, 3(3), 560–578. https://doi.org/10.1111/j.1756-8765.2009.01066.x
Shah, P., Freedman, E. G., Vekiri, I. (2005). The comprehension of quantitative information in graphical displays. In Shah, P., Miyake, A. (Eds.), The Cambridge handbook of visuospatial thinking (pp. 426–476). Cambridge University Press.
Shah, P., Hoeffner, J. (2002). Review of graph comprehension research: Implications for instruction. Educational Psychology Review, 14(1), 47–69. https://doi.org/10.1023/A:1013180410169
Shah, P., Mayer, R. E., Hegarty, M. (1999). Graphs as aids to knowledge construction: Signaling techniques for guiding the process of graph comprehension. Journal of Educational Psychology, 91(4), 690–702. https://doi.org/10.1037/0022-0663.91.4.690
Shechter, S., Hochstein, S. (1992). Asymmetric interactions in the processing of the visual dimensions of position, width, and contrast of bar stimuli. Perception, 21(3), 297–312. https://doi.org/10.1068/p210297
Shneiderman, B. (1992). Tree visualization with tree-maps: 2-d space-filling approach. ACM Transactions on Graphics, 11(1), 92–99. https://doi.org/10.1145/102377.115768
Silva, S., Santos, B. S., Madeira, J. (2011). Using color in visualization: A survey. Computers & Graphics, 35(2), 320–333. https://doi.org/10.1016/j.cag.2010.11.015
Spence, I., Krizel, P. (1994). Children’s perception of proportion in graphs. Child Development, 65(4), 1193–1213. https://doi.org/10.1111/j.1467-8624.1994.tb00812.x
Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103, 677–680. https://www.jstor.org/stable/1671815
Stevens, S. S. (1957). On the psychophysical law. Psychological Review, 64(3), 153–181. https://doi.org/10.1037/h0046162
Stone, E. R., Sieck, W. R., Bull, B. E., Yates, J. F., Parks, S. C., Rush, C. J. (2003). Foreground:background salience: Explaining the effects of graphical displays on risk avoidance. Organizational Behavior and Human Decision Processes, 90(1), 19–36. https://doi.org/10.1016/S0749-5978(03)00003-7
Su, Y. S. (2008). It’s easy to produce chartjunk using Microsoft® Excel 2007 but hard to make good graphs. Computational Statistics & Data Analysis, 52(10), 4594–4601. https://doi.org/10.1016/j.csda.2008.03.007
Sweller, J., Ayres, P., Kalyuga, S. (2011). Emerging themes in cognitive load theory: The transient information and the collective working memory effects. In Sweller, J., Ayres, P., Kalyuga, S. (Eds.), Cognitive load theory (pp. 219–233). Springer.
Szafir, D. A. (2018). The good, the bad, and the biased: Five ways visualizations can mislead (and how to fix them). ACM Interactions, 25(4), 26–33. https://doi.org/10.1145/3231772
Szafir, D. A., Haroz, S., Gleicher, M., Franconeri, S. L. (2016). Four types of ensemble encoding in data visualizations. Journal of Vision, 16(5), Article 11. https://doi.org/10.1167/16.5.11
Tait, A. R., Voepel-Lewis, T., Zikmund-Fisher, B. J., Fagerlin, A. (2010). The effect of format on parents’ understanding of the risks and benefits of clinical research: A comparison between text, tables, and graphics. Journal of Health Communication, 15(5), 487–501. https://doi.org/10.1080/10810730.2010.492560
Tan, J. K., Benbasat, I. (1990). Processing of graphical information: A decomposition taxonomy to match data extraction tasks and graphical representations. Information Systems Research, 1(4), 416–439. https://www.jstor.org/stable/23010666
Tittle, J. S., Woods, D. D., Roesler, A., Howard, M., Phillips, F. (2001). The role of 2-D and 3-D task performance in the design and use of visual displays. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 45(4), 331–335. https://doi.org/10.1177/154193120104500414
Treisman, A. (1998). The perception of features and objects. In Wright, R. D. (Ed.), Visual attention (pp. 26–54). Oxford University Press.
Tufte, E. R. (1983). The visual display of quantitative information. Graphics Press.
Tversky, A., Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124–1131. https://doi.org/10.1126/science.185.4157.1124
Tversky, B. (2000). Remembering spaces. In Tulving, E., Craik, F. I. M. (Eds.), The Oxford handbook of memory (pp. 363–378). Oxford University Press.
Tversky, B. (2001). Spatial schemas in depictions. In Gattis, M. (Ed.), Spatial schemas and abstract thought (pp. 79–111). MIT Press.
Tversky, B., Heiser, J., Lee, P., Zacks, J. M. (2002). Diagrams to augment cognition. In Proceedings of the Annual Meeting of the Cognitive Science Society (Vol. 24, No. 24). https://escholarship.org/uc/item/3dt63840
Tversky, B., Kugelmass, S., Winter, A. (1991). Cross-cultural and developmental trends in graphic productions. Cognitive Psychology, 23(4), 515–557. https://doi.org/10.1016/0010-0285(91)90005-9
Tversky, B., Morrison, J. B., Betrancourt, M. (2002). Animation: Can it facilitate? International Journal of Human-Computer Studies, 57(4), 247–262. https://doi.org/10.1006/ijhc.2002.1017
Tversky, B., Schiano, D. J. (1989). Perceptual and conceptual factors in distortions in memory for graphs and maps. Journal of Experimental Psychology: General, 118(4), 387–398. https://doi.org/10.1037/0096-3445.118.4.387
Van Essen, D. C., Anderson, C. H., Felleman, D. J. (1992). Information processing in the primate visual system: An integrated systems perspective. Science, 255(5043), 419–423. https://doi.org/10.1126/science.1734518
Ware, C. (2010). Visual thinking for design. Elsevier.
Ware, C. (2019). Information visualization: Perception for design (4th ed.). Morgan Kaufmann.
Washburne, J. N. (1927). An experimental study of various graphs: Tabular and textual methods of presenting quantitative material. Journal of Educational Psychology, 18, 361–376, 465–476. https://doi.org/10.1037/h0074758
Waters, E. A., Fagerlin, A., Zikmund-Fisher, B. J. (2016). Overcoming the many pitfalls of communicating risk. In Diefenbach, M. A., Miller-Halegoua, S., Bowen, D. J. (Eds.), Handbook of health decision science (pp. 265–277). Springer.
Waters, E. A., Weinstein, N. D., Colditz, G. A., Emmons, K. (2006). Formats for improving risk communication in medical tradeoff decisions. Journal of Health Communication, 11(2), 167–182. https://doi.org/10.1080/10810730500526695
Whitacre, M. P., Saul, E. W. (2016). High school girls’ interpretations of science graphs: Exploring complex visual and natural language hybrid text. International Journal of Science and Mathematics Education, 14(8), 1387–1406. https://doi.org/10.1007/s10763-015-9677-7
Wickens, C. D. (1989). Attention and skilled performance. In Holding, D. (Ed.), Human skills (pp. 71–105). John Wiley & Sons.
Wiederkehr, A. (2020, August 13). How we designed the look of our 2020 forecast. FiveThirtyEight. https://fivethirtyeight.com/features/how-we-designed-the-look-of-our-2020-forecast/
Wilkinson, L. (1999). Dot plots. The American Statistician, 53(3), 276–281. https://doi.org/10.1080/00031305.1999.10474474
Witt, J. K. (2019). Graph construction: An empirical investigation on setting the range of the y-axis. Meta-Psychology, 3, Article MP.2018.895. https://doi.org/10.15626/MP.2018.895
Wolfe, J. M. (1998). What can 1 million trials tell us about visual search? Psychological Science, 9(1), 33–39. https://doi.org/10.1111/1467-9280.00006
Wolfe, J. M., Horowitz, T. S. (2017). Five factors that guide attention in visual search. Nature Human Behaviour, 1(3), Article 0058. https://doi.org/10.1038/s41562-017-0058
Wong, D. M. (2010). The Wall Street Journal guide to information graphics: The dos and don’ts of presenting data, facts, and figures. W.W. Norton.
Wongsuphasawat, K., Moritz, D., Anand, A., Mackinlay, J., Howe, B., Heer, J. (2015). Voyager: Exploratory analysis via faceted browsing of visualization recommendations. IEEE Transactions on Visualization and Computer Graphics, 22(1), 649–658. https://doi.org/10.1109/TVCG.2015.2467191
Xiong, C., Ceja, C. R., Ludwig, C. J. H., Franconeri, S. (2020). Biased average position estimates in line and bar graphs: Underestimation, overestimation, and perceptual pull. IEEE Transactions on Visualization and Computer Graphics, 26(1), 301–310. https://doi.org/10.1109/TVCG.2019.2934400
Xiong, C., Shapiro, J., Hullman, J., Franconeri, S. (2020). Illusion of causality in visualized data. IEEE Transactions on Visualization and Computer Graphics, 26(1), 853–862. https://doi.org/10.1109/TVCG.2019.2934399
Xiong, C., Van Weelden, L., Franconeri, S. (2020). The curse of knowledge in visual data communication. IEEE Transactions on Visualization and Computer Graphics, 26(10), 3051–3062. https://doi.org/10.1109/TVCG.2019.2917689
Xu, Y., Franconeri, S. L. (2015). Capacity for visual features in mental rotation. Psychological Science, 26(8), 1241–1251. https://doi.org/10.1177/0956797615585002
Yamagishi, K. (1997). When a 12.86% mortality is more dangerous than 24.14%: Implications for risk communication. Applied Cognitive Psychology, 11(6), 495–506. https://doi.org/10.1002/(SICI)1099-0720(199712)11:6%3C495::AID-ACP481%3E3.0.CO;2-J
Yang, B. W., Vargas-Restrepo, C., Stanley, M., Marsh, E. J. (2021). Truncating bar graphs persistently misleads viewers. Journal of Applied Research in Memory and Cognition, 10(2), 298–311. https://doi.org/10.1016/j.jarmac.2020.10.002
Yang, F., Harrison, L. T., Rensink, R. A., Franconeri, S. L., Chang, R. (2019). Correlation judgment and visualization features: A comparative study. IEEE Transactions on Visualization and Computer Graphics, 25(3), 1474–1488. https://doi.org/10.1109/TVCG.2018.2810918
Yuan, L., Haroz, S., Franconeri, S. (2019). Perceptual proxies for extracting averages in data visualizations. Psychonomic Bulletin & Review, 26(2), 669–676. https://doi.org/10.3758/s13423-018-1525-7
Zacks, J. M., Franconeri, S. L. (2020). Designing graphs for decision-makers. Policy Insights from the Behavioral and Brain Sciences, 7(1), 52–63. https://doi.org/10.1177/2372732219893712
Zacks, J. M., Tversky, B. (1999). Bars and lines: A study of graphic communication. Memory & Cognition, 27(6), 1073–1079. https://doi.org/10.3758/BF03201236
Zikmund-Fisher, B. J., Witteman, H. O., Dickson, M., Fuhrel-Forbis, A., Kahn, V. C., Exe, N. L., Valerio, M., Holtzman, L. G., Scherer, L. D., Fagerlin, A. (2014). Blocks, ovals, or people? Icon type affects risk perceptions and recall of pictographs. Medical Decision Making, 34(4), 443–453. https://doi.org/10.1177/0272989X13511706
Zikmund-Fisher, B. J., Witteman, H. O., Fuhrel-Forbis, A., Exe, N. L., Kahn, V. C., Dickson, M. (2012). Animated graphics for comparing two risks: A cautionary tale. Journal of Medical Internet Research, 14(4), Article e106. https://doi.org/10.2196/jmir.2030