Maximum “information throughput” can still guide legal managers when graphs display data

The legendary Prof. Edward Tufte gave a keynote presentation in September 2016 at Microsoft’s Machine Learning and Data Summit.  Tufte’s ambitious subject was “The Future of Data Analysis”.   You can listen to the 50-minute talk online.  Early on he emphasized that you display data to assist reasoning (analytic thinking) and to enable smart comparisons.

Tufte frequently referred to data visualization as a method that aims to maximize “information throughput” while remaining interpretable by the reader.  I took information throughput to be engineering jargon for “lots of data presented.”

From the standpoint of legal managers, maximal information throughput has almost no relevance.  The data sets that they could analyze with AI or machine learning techniques, or visualize with Excel, Tableau, R and other software, are simply too small to justify that “Big Data” orientation and terminology.

That distinction understood, legal managers should still take away Tufte’s core recommendation: when you create a graph, strive to present as much of the underlying information as you can, as clearly as you can, so that the reader of the graph can come to her own interpretations.

Create a choropleth to display data by State, country, region

When legal managers want to present data by State or by country, they can make good use of what is called a “choropleth”.  Choropleths are maps that color their regions in proportion to the count or other statistic of the variable being displayed on the map, such as the number of pending lawsuits per State or amounts spent on outside counsel by country.  Darker colors typically indicate more in a region and lighter shades of the same color indicate fewer.

Below is an example of a choropleth that appears in Exterro’s 2016 Law Firm Benchmarking Report at page 8.  It shows how many of the 112 survey participants come from each state.

[Figure: choropleth of survey participants by state, from Exterro’s 2016 Law Firm Benchmarking Report, page 8]

California is the darkest with 21; the grey states had no participants.  The table below the map, which is truncated in this screenshot, gives the actual numbers by State, so someone could carp that the choropleth sweetens the eye but adds no nutritional information.  Still, it looks pretty good and it is an example of an effective, if unusual, graphical tool.
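For readers curious about the shading rule itself, here is a minimal sketch in Python, assuming a simple linear blend from white toward a dark blue and grey for regions with no data.  The function name `shade_for_count`, the base color, and all of the counts except California’s 21 are my own hypothetical choices for illustration; I do not know how the Exterro map was actually produced.

```python
def shade_for_count(count, max_count, base_rgb=(8, 48, 107)):
    """Return a hex color: grey for zero, else a blend from white toward base_rgb."""
    if count <= 0:
        return "#cccccc"  # grey marks regions with no participants
    fraction = count / max_count
    # Linear blend, channel by channel, between white (255) and the base color.
    channels = [round(255 - fraction * (255 - c)) for c in base_rgb]
    return "#{:02x}{:02x}{:02x}".format(*channels)


# HYPOTHETICAL counts, except California's 21, which the post mentions.
participants = {"CA": 21, "TX": 10, "NY": 7, "WY": 0}
max_count = max(participants.values())

shades = {state: shade_for_count(n, max_count) for state, n in participants.items()}
for state, color in shades.items():
    print(state, participants[state], color)
```

A mapping library would then fill each State’s outline with its shade.  The point of the sketch is only that the darkest color goes to the largest count and everything else scales in proportion.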

Law firms, their number of lawyers and mentions in court documents

Data analytics and visualization can sometimes tell you where your law firm stands relative to competitors and can therefore guide your positioning and selling efforts.  An illustration of this benefit comes from Corp. Counsel, Oct. 2016 at 44, where a table shows counts of U.S. law firms that “turn up the most in court documents.”

The table lists the firms by number of “mentions”.  Most often mentioned were Littler Mendelson and Ogletree Deakins at 100; least often mentioned were Fisher & Phillips, Gordon Rees, and Hunton & Williams at 24.

A more revelatory analysis matches the number of mentions of a firm to its number of lawyers.   After all, bigger firms are more likely to represent litigants than smaller firms, everything else held constant.

[Figure: scatter plot of each firm’s lawyer count against its mentions in court documents]

The plot above uses lawyer counts from a year or so ago, but the relative size differences among this group of 30 likely still hold.  It shows each firm’s lawyer count on the left axis and its mentions on the bottom axis, with the firm name next to its point on the scatter plot.

One possible conclusion from this plot is that firms specializing in employment litigation rack up the most mentions.
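The size adjustment can also be expressed as a single number per firm, such as mentions per 100 lawyers.  Here is a minimal sketch in Python.  The mention counts of 100 and 24 are the extremes from the table; the firm labels and every lawyer count are hypothetical placeholders, not the figures behind the plot.

```python
# Mentions of 100 and 24 are the table's extremes; the firm labels and
# lawyer counts are HYPOTHETICAL placeholders, not real data.
firms = [
    {"name": "Firm A", "mentions": 100, "lawyers": 1000},
    {"name": "Firm B", "mentions": 100, "lawyers": 800},
    {"name": "Firm C", "mentions": 24, "lawyers": 700},
    {"name": "Firm D", "mentions": 24, "lawyers": 350},
]


def mentions_per_100_lawyers(firm):
    """Normalize mentions by firm size so big and small firms are comparable."""
    return 100 * firm["mentions"] / firm["lawyers"]


# Rank firms by the size-adjusted rate rather than by raw mentions.
ranked = sorted(firms, key=mentions_per_100_lawyers, reverse=True)
for firm in ranked:
    print(f'{firm["name"]:<8} {mentions_per_100_lawyers(firm):5.1f}')
```

With these made-up sizes, two firms tied on raw mentions separate once size is taken into account, which is the kind of shift the scatter plot lets a reader see at a glance.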

A makeover of an ILTA graph, explaining the improvements in visualization

Legal managers who create data-analysis graphs should strive to make those graphs effective communicators.  Let’s pause for a teaching moment.  I wrote a post about the 2016 ILTA/InsideLegal Technology Purchasing Survey and its question about areas of practice where respondents foresaw AI software penetrating.   

The plot in the upper right portion of page 13 that summarizes the answers to that question could be improved in several ways.

[Figure: the original ILTA bar chart of AI application areas]

First, the bar colors are nothing but distracting eye-candy, since the colors do not convey any additional information.  If a couple of bars were colored to indicate something, that would be a different matter.

Second, it was good to add the percentages at the end of the bars, rather than force readers to look down at the horizontal axis and estimate them; however, if the graph states each bar’s percentage, the horizontal axis figures are unnecessary.  What is more, the vertical grey lines can be banished.

Third, most people care less about an alphabetical ordering of the bars than they do about how the applications compare on percentages.  It would have been more informative to order the bars in the conventional longest-at-the-top to shortest-at-the-bottom style.

On the plus side, it was good to put the application areas on the left rather than the bottom.  Almost always there is more room on the left than in the narrower bands at the bottom.

[Figure: the made-over version of the ILTA bar chart]

A makeover using the same data cures these problems and displays a few other visualization improvements.  The new plot removes the boundary lines around the plot, which gives a cleaner look.  It also enlarges the font on the percentages relative to the font on the applications, since those figures are likely to be the ones that readers care most about and want most emphasized.  Two final tweaks: the application names are on one line, and the axes have no “tick marks”, the tiny lines that mark intervals along an axis but that rarely add any value.
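Two of these fixes, ordering the bars from longest to shortest and labeling each bar directly with its percentage, can be sketched without any plotting software at all.  The Python below draws a plain-text version.  The application areas and percentages are hypothetical stand-ins, not the survey’s results, and `render_bars` is simply a name I made up for the sketch.

```python
# HYPOTHETICAL application areas and percentages, for illustration only;
# they are not the survey's actual results.
responses = {
    "Contract review": 31,
    "E-discovery": 48,
    "Legal research": 40,
    "Billing": 12,
}


def render_bars(data, width=40):
    """Return text bars sorted longest first, each labeled with its percentage."""
    label_width = max(len(name) for name in data)
    lines = []
    for name, pct in sorted(data.items(), key=lambda item: item[1], reverse=True):
        bar = "#" * round(pct / 100 * width)
        # The percentage sits at the end of the bar, so no axis is needed.
        lines.append(f"{name:>{label_width}} {bar} {pct}%")
    return lines


chart = render_bars(responses)
print("\n".join(chart))
```

The names sit on the left on one line each, the bars run in descending order, and there are no colors, gridlines, or axis figures competing for attention.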