Writing and talking about data is one of the core jobs of a scientist. But how do we give voice to our non-verbal graphs, tables, charts, photos, and so on?
Let me introduce “WALTER.” Each letter in the WALTER acronym represents an important element in a data commentary.
The idea is to include all of the WALTER elements — or as many as you think would be most helpful to a reader or listener — in any description of a graph or figure. You can use WALTER when you are writing a caption for a figure, when you are discussing your data in a paper, or when you are describing your data during a presentation.
Thank you to UCSB Computer Science Professor Tim Sherwood for telling me about WALTER and for providing the examples below from Computer Science.
|W – Why||Before you describe a figure, you need to set up the context. Sentences such as “To understand how our technique scales under heavy loads, we …” or “As performance is critical to the usability of our system, we …” set up the expectations for the figure.
It is critical that the reader be made aware of the new insight she will discover by studying your graph. None of this is obvious to the reader. This “why” part of your commentary is the most important part of describing a graph.
|A – Axes||The axes frame the results of a graph and are an opportunity to precisely describe the parameters of your experiment. Example 1: “We varied N, the number of virtual machines running our improved memory manager, from 1 to 64 as shown on the x-axis.” Example 2: “The y-axis shows the average power consumed by the devices as measured in watts.”
All axes should have units, and complex metrics or units should be fully motivated and described. If the axis shows a ratio (such as speedup), clearly indicate whether the ratio is presented as a fraction or as a percentage.
|L – Lines||Often we need to show several experiments on the same graph so that the results can be compared directly. Here “Lines” could refer to the lines on a multi-line graph or to the different types of bars in a bar chart. The point is to make sure that each one is clearly described. Example 1: “The solid black line shows the performance of the baseline system described in Section 1.2, while the dashed grey line shows system performance after our optimization is applied.” Example 2: “The solid grey portion of the bar shows the fraction of users that indicated they were satisfied with the user experience, while dissatisfied users are shown in black.”
Keep in mind that most publications are black and white only (there are exceptions), and that most reviewers print out the papers on non-color printers. Avoid colors that look the same when printed in grayscale (or avoid color altogether).
|T – Trends||Now that you have the stage properly set, including the motivation (the Why), the parameters (the Axes), and the types of data points (the Lines), you can begin to discuss the overall trends of the graph — the main points you want people to take away from the visual.
Do not assume that the trend is obvious. Tell your reader directly: “Looking across all of the applications we can see that in most cases an 8% to 10% reduction in memory footprint is achieved.”
|E – Exceptions/Anomalies||In most graphs of experimental data there are some outliers and exceptions. Your readers will notice them. Don’t try to hide them (you are a scientist after all), but do try to explain them. Example 1: “While we achieve near-linear scaling up to 64 processors, there is a short performance dip at 16 processors where the data structures can no longer be completely memory resident.” Example 2: “The only program for which this technique actually hurts performance is gcc. The complex control dependencies of that program are large enough that they overflow the small buffer in our design.”
The exceptions section can be made even stronger by including evidence that your theory behind the exceptions is true. Continuing Example 2: “If the buffer size is doubled for gcc, the overall speedup jumps to 5%.”
|R – Recap/Segue||Finally, now that the graph is described, discuss why these results are significant and segue to the next result. Example 1: “Even after simple optimizations are applied, a very large fraction of the execution time is being spent in memory copy. In the next section we evaluate a novel copy-free implementation that eliminates more than 70% of this overhead.” Example 2: “Now that we have demonstrated the stability of our routing scheme in the face of errors, we need to examine the performance of the algorithm across those same topologies.”
As you can see from these examples, the “Recap” comment can overlap nicely with the “Why” of the next data commentary in your paper or presentation.
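Two of the checks described under Axes and Lines can be sketched in a few lines of Python: stating whether a ratio is a fraction or a percentage, and avoiding colors that look the same in grayscale. This is only a minimal illustration and is not part of the WALTER guidance itself. The function names and the `min_gap` threshold of 50 are assumptions made for the example, and the gray level uses the common Rec. 601 luma weights as a rough approximation of how a color prints.

```python
def speedup_ratio(baseline_time, optimized_time):
    """Speedup as a ratio: 2.0 means the optimized run is twice as fast."""
    return baseline_time / optimized_time


def speedup_percent(baseline_time, optimized_time):
    """Speedup as a percentage improvement: 100.0 means twice as fast."""
    return (baseline_time / optimized_time - 1.0) * 100.0


def gray_level(rgb):
    """Approximate gray level (0 = black, 255 = white) of an (R, G, B) color.

    Uses the Rec. 601 luma weights; real printers will differ somewhat.
    """
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def distinguishable_in_grayscale(color_a, color_b, min_gap=50):
    """Return True if two colors differ by at least min_gap gray levels.

    min_gap=50 is an arbitrary threshold chosen for this sketch.
    """
    return abs(gray_level(color_a) - gray_level(color_b)) >= min_gap


if __name__ == "__main__":
    # The same result reads very differently as a ratio and as a percentage,
    # so the axis label has to say which one is plotted.
    print("Speedup (ratio):", speedup_ratio(10.0, 8.0))
    print("Speedup (percent):", speedup_percent(10.0, 8.0))

    # Pure red and a medium green have almost the same gray level.
    print(distinguishable_in_grayscale((255, 0, 0), (0, 128, 0)))
    # Black and light grey remain easy to tell apart.
    print(distinguishable_in_grayscale((0, 0, 0), (200, 200, 200)))
```

A run that falls from 10 seconds to 8 seconds is a speedup of 1.25 as a ratio but 25 as a percentage, which is why the axis label needs to say which form is plotted. The red and green pair fails the grayscale check, while black and light grey pass it.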