Off to FTI: see you on the other side

Tomorrow I’m starting a new, full-time position as data scientist at FTI’s lab here in Melbourne. I’m excited to have the opportunity to contribute to the e-discovery community from another angle, as a builder-of-product. Unfortunately, this means the end of this blog, at least in its current form and at least for now. Thanks to […] → Read More: Off to FTI: see you on the other side

Confidence intervals on recall and eRecall

There is an ongoing discussion about methods of estimating the recall of a production, as well as estimating a confidence interval on that recall. One approach is to use the control set sample, drawn at the start of production to estimate collection richness and guide the predictive coding process, to also estimate the final confidence […] → Read More: Confidence intervals on recall and eRecall

Why training and review (partly) break control sets

A technology-assisted review (TAR) process frequently begins with the creation of a control set—a set of documents randomly sampled from the collection, and coded by a human expert for relevance. The control set can then be used to estimate the richness (proportion relevant) of the collection, and also to gauge the effectiveness of a predictive […] → Read More: Why training and review (partly) break control sets

Total assessment cost with different cost models

In my previous post, I found that relevance and uncertainty selection needed similar numbers of document relevance assessments to achieve a given level of recall. I summarized this by saying the two methods had similar cost. The number of documents assessed, however, is only a very approximate measure of the cost of a review process, […] → Read More: Total assessment cost with different cost models