Assessor error and term model weights

In my last post, we saw that randomly swapping training labels, in a (simplistic) simulation of the effect of assessor error, leads as expected to a decline in classifier accuracy, with the decline being greater for lower prevalence topics (in part, we surmised, because of the primitive way we were simulating assessor errors). In this […] → Read More: Assessor error and term model weights
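For readers who want to experiment, the following is a minimal sketch of this kind of label-swapping simulation, not the post's own experiment: it flips a random fraction of training labels on synthetic scikit-learn data and compares classifier F1 at a higher- and a lower-prevalence setting. The data generator, parameters, and symmetric noise model are assumptions for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def noisy_label_f1(prevalence, flip_rate):
    # synthetic two-class data at the given prevalence of the positive class
    X, y = make_classification(n_samples=4000, n_features=50,
                               weights=[1 - prevalence, prevalence],
                               flip_y=0.0, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)
    # simplistic, symmetric assessor-error model: flip a random fraction of training labels
    noisy = y_tr.copy()
    flip = rng.random(len(noisy)) < flip_rate
    noisy[flip] = 1 - noisy[flip]
    clf = LogisticRegression(max_iter=1000).fit(X_tr, noisy)
    # score against the clean (authoritative) test labels
    return f1_score(y_te, clf.predict(X_te))

for prevalence in (0.25, 0.05):
    for flip_rate in (0.0, 0.1, 0.2):
        print(f"prevalence={prevalence:.2f}  flip={flip_rate:.1f}  "
              f"F1={noisy_label_f1(prevalence, flip_rate):.3f}")
```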

Annotator error and predictive reliability

There has been some interesting recent research on the effect of using unreliable annotators to train a text classification or predictive coding system. Why would you want to do such a thing? Well, the unreliable annotators may be much cheaper than a reliable expert, and by paying for a few more annotations, you might be […] → Read More: Annotator error and predictive reliability

Repeated testing does not necessarily invalidate stopping decision

Thinking recently about the question of sequential testing bias in e-discovery, I’ve realized an important qualification to my previous post on the topic. While repeatedly testing an iteratively trained classifier against a target threshold will lead to optimistic bias in the final estimate of effectiveness, it does not necessarily lead to an optimistic bias in […] → Read More: Repeated testing does not necessarily invalidate stopping decision
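The following Monte Carlo sketch is not the post's analysis, only an illustration of the distinction it draws: a classifier whose true recall is assumed to improve by a fixed amount each round is tested each round against a recall target using a fresh binomial sample, and training stops at the first estimate to reach the target. The estimate at stopping is optimistically biased, but whether the true recall at stopping falls short of the target is a separate question. The starting recall, per-round improvement, sample size, and stopping rule are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
TARGET, SAMPLE_SIZE, RUNS = 0.75, 100, 10_000

est_at_stop, true_at_stop = [], []
for _ in range(RUNS):
    true_recall = 0.50                                # assumed starting recall
    while True:
        # estimate recall from SAMPLE_SIZE sampled responsive documents
        est = rng.binomial(SAMPLE_SIZE, true_recall) / SAMPLE_SIZE
        if est >= TARGET:                             # stop at the first favourable estimate
            est_at_stop.append(est)
            true_at_stop.append(true_recall)
            break
        true_recall = min(1.0, true_recall + 0.05)    # assumed per-round improvement

print("mean estimate at stopping:   ", round(float(np.mean(est_at_stop)), 3))
print("mean true recall at stopping:", round(float(np.mean(true_at_stop)), 3))
print("proportion of runs stopping below target:",
      round(float(np.mean(np.array(true_at_stop) < TARGET)), 3))
```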

Sample-based estimation of depth for recall

In my previous post, I advocated the use of depth for recall as a classifier effectiveness metric in e-discovery, as it directly measures the review cost of proceeding to production with the current classifier. If we know where all the responsive documents are in the ranking, then calculating depth for Z recall is straightforward: it […] → Read More: Sample-based estimation of depth for recall
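The "straightforward" full-knowledge calculation can be sketched in a few lines; the sample-based estimator that is the post's actual subject is not reproduced here. The function name and toy ranking below are illustrative assumptions.

```python
def depth_for_recall(rel_in_rank_order, target_recall):
    """Smallest review depth achieving target_recall, given 0/1 relevance
    labels listed in the classifier's ranking order (full knowledge)."""
    total_relevant = sum(rel_in_rank_order)
    if total_relevant == 0:
        return 0
    needed = target_recall * total_relevant
    found = 0
    for depth, rel in enumerate(rel_in_rank_order, start=1):
        found += rel
        if found >= needed:
            return depth
    return len(rel_in_rank_order)

# depth needed for 80% recall over a toy ranking
print(depth_for_recall([1, 1, 0, 1, 0, 0, 1, 0, 0, 1], 0.8))   # -> 7
```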

Total annotation cost should guide automated review

One of the most difficult challenges for the manager of an automated e-discovery review is knowing when enough is enough; when it is time to stop training the classifier, and start reviewing the documents it predicts to be responsive. Unfortunately, the guidance the review manager receives from their system providers is not always as helpful […] → Read More: Total annotation cost should guide automated review
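As a rough illustration of the kind of bookkeeping the title suggests (not the post's method): the total annotation cost of stopping training now can be tallied as the labels already spent on training plus the documents that must then be reviewed to reach the recall target. The function name and all numbers below are made up for illustration; the point is only that the round with the lowest total, not the round with the prettiest abstract metric, marks the natural stopping point.

```python
def total_annotation_cost(n_training_labels, review_depth):
    """Hypothetical tally: annotations spent on training so far, plus the
    documents that must then be reviewed to reach the recall target."""
    return n_training_labels + review_depth

# Illustrative numbers only: training rounds of 100 labels each, with an
# assumed review depth to 80% recall after each round.
training_labels = [100, 200, 300, 400, 500]
review_depths   = [9000, 6000, 4500, 4000, 3900]
for n, d in zip(training_labels, review_depths):
    print(n, "training labels ->", total_annotation_cost(n, d), "total annotations")
```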

Relevance density affects assessor judgment

It is somewhat surprising to me that, having gone to the University of Maryland with the intention of working primarily on the question of assessor variability in relevance judgment, I did in fact end up working (or at least publishing) primarily on the question of assessor variability in relevance judgment. The last of these publications, […] → Read More: Relevance density affects assessor judgment

Measuring incremental cost-to-production in predictive coding

I had the opportunity on Monday of giving a talk on processes for predictive coding in e-discovery to the Victorian Society for Computers and the Law. The key novel suggestion of my talk was that the effectiveness of the iteratively-trained classifier should be measured not (only) by abstract metrics of effectiveness such as F score, […] → Read More: Measuring incremental cost-to-production in predictive coding

Change of career, change of name

This blog has followed my own research interests in becoming increasingly focused upon evaluation and technology questions in e-discovery, rather than on information retrieval more generally. Now my own career has followed my interests out the ivy-clad gates of academia and into private consulting in e-discovery. In recognition of these changes, I’ve also changed the […] → Read More: Change of career, change of name

The bias of sequential testing in predictive coding

Text classifiers (or predictive coders) are in general trained iteratively, with training data added until acceptable effectiveness is achieved. Some method of measuring or estimating effectiveness is required—or, more precisely, of predicting effectiveness on the remainder of the collection. The simplest way of measuring classifier effectiveness is with a random sample of documents, from which […] → Read More: The bias of sequential testing in predictive coding
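A minimal sketch of the random-sample measurement mentioned here, not the post's method: estimate the recall of the classifier's positive predictions from a simple random sample of the collection, with a rough normal-approximation interval, conditional on the number of relevant documents that happen to fall in the sample. The function name and example data are illustrative assumptions.

```python
import math

def estimate_recall(sample_relevant, sample_predicted_positive, z=1.96):
    """sample_relevant, sample_predicted_positive: parallel 0/1 lists for a
    simple random sample of the collection, authoritatively labelled."""
    retrieved_given_relevant = [p for r, p in
                                zip(sample_relevant, sample_predicted_positive) if r]
    n = len(retrieved_given_relevant)
    if n == 0:
        return None                      # no relevant documents drawn; no estimate
    recall = sum(retrieved_given_relevant) / n
    half_width = z * math.sqrt(recall * (1 - recall) / n)
    return recall, half_width

# toy sample: 4 relevant documents drawn, 3 of them predicted positive
print(estimate_recall([1, 0, 1, 1, 0, 0, 1, 0],
                      [1, 0, 1, 0, 0, 1, 1, 0]))   # -> (0.75, ~0.42)
```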

Non-authoritative relevance coding degrades classifier accuracy

There has been considerable attention paid to the high level of disagreement between assessors on the relevance of documents, not least on this blog. This level of disagreement has been cited to argue in favour of the use of automated text analytics (or predictive coding) in e-discovery: not only do humans make mistakes, but they […] → Read More: Non-authoritative relevance coding degrades classifier accuracy