
Explicit negative feedback comes to the web…somewhat

If you use Chrome, you can block results from certain sites. Even if this is equivalent to adding [-site:domain], it certainly makes the query easier to specify. Promoted as a way to filter content farms, it could easily provide data that goes well beyond simple results filtering.
Google is upfront about collecting the data:
If installed, [...] → Read More: Explicit negative feedback comes to the web…somewhat
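
To make the [-site:domain] equivalence concrete, here is a minimal sketch of what a per-user blocklist amounts to at query time. The function name `exclude_sites` and the example domains are hypothetical, chosen only for illustration; this is not how Chrome or the search engine implements blocking.

```python
def exclude_sites(query, blocked_domains):
    """Append one -site:domain term per blocked domain to a query string.

    Hypothetical helper for illustration only; it simply builds the
    query a user would otherwise have to type by hand.
    """
    # Normalize each domain and skip empty entries.
    terms = [f"-site:{d.strip().lower()}" for d in blocked_domains if d.strip()]
    return " ".join([query.strip(), *terms])


print(exclude_sites("python tutorial", ["example.com", "example.org"]))
# prints: python tutorial -site:example.com -site:example.org
```

The point of the sketch is that the blocklist adds nothing to the query language itself; what is new is that the list of blocked domains is stored, and so becomes explicit negative feedback.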

“Economic Impact Assessment of NIST’s Text REtrieval Conference (TREC) Program”

Thanks to your feedback,
“…this study estimates that TREC’s existence was responsible for approximately one-third of an improvement of more than 200% in web search products that was observed between 1999 and 2009.”
More here.
→ Read More: “Economic Impact Assessment of NIST’s Text REtrieval Conference (TREC) Program”

Query logs and information retrieval research

About one year ago, Bruce Croft asked the IR community for help with getting access to query logs for academia,
The goal of this project is to create a database of web search activity that will be provided to the information retrieval research community to use on current and future information retrieval research projects.
To accomplish [...] → Read More: Query logs and information retrieval research

Efficient and Effective Spam Filtering and Re-ranking for Large Web Datasets

Gordon V. Cormack, Mark D. Smucker, and Charles L. A. Clarke (University of Waterloo). The TREC 2009 web ad hoc and relevance feedback tasks used a new document collection, the ClueWeb09 dataset, which was crawled from the general Web in early 2009. This dataset contains 1 billion web pages, a substantial fraction of which are […] → Read More: Efficient and Effective Spam Filtering and Re-ranking for Large Web Datasets
