“Cognitive computing” and a consortium that promotes understanding of it

Legal managers should be aware of the vogue term “cognitive computing” and know that an organization, the Cognitive Computing Consortium, promotes it as something far beyond traditional software that aids decision-making.  After all, even a spreadsheet helps someone make up their mind.

The Consortium’s definition of cognitive computing sprawls, but parts of it are useful for the legal industry.  Cognitive computing applications “identify and extract context features such as hour, location, task, history or profile to present an information set that is appropriate for an individual … engaged in a specific process at a specific time and place. They provide machine-aided serendipity by wading through massive collections of diverse information to find patterns and then apply those patterns to respond to the needs of the moment.”  The core idea is software that wades through masses of data about a legal practice and surfaces useful findings for managers.

By the way, if you follow this blog, you have to love “machine-aided serendipity”!

The Consortium’s website offers references, presentations, and other resources, plus a blog.  The blog even has a category for “legal issues,” but the posts address substantive legal topics, not management and operations.

An introduction to software that identifies relevant documents in litigation

Software that analyzes text documents collected electronically in litigation discovery has come a long way in the past few years. Variously referred to as “predictive coding,” “technology-assisted review” (TAR), and “computer-assisted review,” the software’s steps can be explained simply.

You train the software on a “seed set” of documents that humans have coded as relevant or not. You then aim the software at a random sample of other documents that humans have also coded for relevance, the “validation set.”

Assuming the software has been trained sufficiently well on the seed set, its effectiveness is judged by “recall,” which is the percentage of relevant documents in the random-sample validation set that the software accurately identified as such.  As pointed out in a recent article in LTN, Oct. 2016 at 40, by three lawyers at the law firm BuckleySandler, courts have paid attention to how the seed set was constructed, but haven’t paid as much attention to the accuracy of coding the validation set.
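To make the recall arithmetic concrete, here is a minimal sketch in Python; the ten-document validation set and its coding are invented for illustration.

```python
def recall(validation_labels, predicted_labels):
    """Recall: share of truly relevant documents the software flagged as relevant."""
    truly_relevant = [i for i, label in enumerate(validation_labels) if label]
    if not truly_relevant:
        return 0.0
    found = sum(1 for i in truly_relevant if predicted_labels[i])
    return found / len(truly_relevant)

# Hypothetical validation set of 10 documents: True = a human coded it relevant.
human_coding   = [True, True, False, True, False, False, True, False, True, False]
software_calls = [True, False, False, True, False, True, True, False, True, False]

# The software found 4 of the 5 documents humans coded relevant.
print(f"Recall: {recall(human_coding, software_calls):.0%}")  # Recall: 80%
```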

Once the level of recall satisfies both parties to the lawsuit, the software is unleashed on the full set of collected documents, dutifully identifying those the algorithm deems relevant and thus producible.
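For readers who like to see the moving parts, below is a bare-bones sketch of the train-then-apply loop. It assumes the scikit-learn library, and the seed set and document collection are invented; real TAR products are far more sophisticated, but the skeleton is the same.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical seed set: documents a human reviewer coded relevant (1) or not (0).
seed_docs = [
    "email discussing the disputed merger terms",
    "memo on pricing of the acquisition target",
    "lunch menu for the office party",
    "newsletter about the softball league",
]
seed_labels = [1, 1, 0, 0]

# Train a simple text classifier on the seed set.
vectorizer = TfidfVectorizer()
X_seed = vectorizer.fit_transform(seed_docs)
model = LogisticRegression().fit(X_seed, seed_labels)

# Once recall on the validation set satisfies both sides, score the full collection.
all_docs = [
    "draft term sheet for the merger",
    "reminder to water the plants",
]
X_all = vectorizer.transform(all_docs)
for doc, label in zip(all_docs, model.predict(X_all)):
    print("PRODUCE" if label == 1 else "withhold", ":", doc)
```

A toy seed set of four documents would never train a defensible model, of course; the point is only the sequence of train, validate, then classify everything.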

Data analytics (NLP) to boost knowledge management efforts

Knowledge management for law firms and law departments has been pursued for decades, but the overall return on the investment seems debatable.  It has proven difficult to collect lawyers’ unstructured text in a system that others find useful enough to justify the cost.

Perhaps machine learning and natural language processing will replace the older paradigm, in which lawyers contribute their work product, often with keywords extracted or sometimes with full-text searching, with a new one: software sifting through everything saved on a firm’s or law department’s servers, enriched by semantic networks or taxonomies that the software itself creates.  Natural language processing (NLP) can create the infrastructure of knowledge without lawyers giving up any of their time.  Stated differently, in the words of the lead article of a recent publication, data analytics is potentially a “powerful force for increasing knowledge management by amplifying existing data.”  If you can parse, organize, and enrich material collected in the ordinary course of legal business, you can boost KM efforts enormously.
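As a thought experiment, here is a minimal sketch of that new paradigm, again assuming scikit-learn and a handful of invented documents: topic modeling groups a firm’s saved text into rough subject clusters without any lawyer tagging a thing.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical work product pulled from a firm's servers; nobody tagged anything.
docs = [
    "motion to dismiss for lack of personal jurisdiction",
    "brief opposing the motion to dismiss",
    "stock purchase agreement indemnification clause",
    "escrow terms in the purchase agreement",
]

# Build a document-term matrix, then let topic modeling infer subject groupings.
vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(dtm)

# Print the top words per inferred topic, the seed of a machine-made taxonomy.
terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [terms[j] for j in topic.argsort()[-4:][::-1]]
    print(f"Topic {i}: {', '.join(top)}")
```

At real scale the same approach, run over millions of documents, is what would let the machine propose a taxonomy that lawyers merely review rather than build.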

These dots connected for me as I read KMWorld, Oct. 2016, at S18 of its white paper on cognitive computing best practices.