David Hawking, Bodo Billerbeck, Nick Craswell, Paul Thomas

Simulating Information Retrieval Test Collections

Language: English

Cover: Paperback

Published: 2020-09-04

Price: 74,24 € (list price 98,98 €)

-25% with code BOOKS

Available formats: Paperback (98,98 €), Hardcover

In stock at supplier

Delivery in 12-18 business days

30-day return policy

You may also like

All titles below are marked TOP, are in stock at the supplier, and are shown at the price with code BOOKS (-25%), followed by the list price.

If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All
Eliezer Yudkowsky, Nate Soares
17,29 € (list price 23,06 €)

The God Test
Robert Wright
17,29 € (list price 23,06 €)

Gödel, Escher, Bach: An Eternal Golden Braid
Douglas R. Hofstadter
24,06 € (list price 32,08 €)

AI Engineering: Building Applications with Foundation Models
Chip Huyen
83,89 € (list price 111,85 €)

Empire of AI: Dreams and Nightmares in Sam Altman's OpenAI
Karen Hao
19,06 € (list price 25,41 €)

The Rust Programming Language
Steve Klabnik, Carol Nichols, Chris Krycho
50,09 € (list price 66,79 €)

The Art of Game Design: A Book of Lenses
Jesse Schell
90,20 € (list price 120,27 €)

How To Think About AI: A Guide For The Perplexed
Richard Susskind
14,02 € (list price 18,70 €)

The Pragmatic Programmer: Your Journey to Mastery, 20th Anniversary Edition, 2/e
Andrew Hunt, David Thomas
50,72 € (list price 67,63 €)

HBR Guide to Generative AI for Managers
Elisa Farri, Gabriele Rosani
35,50 € (list price 47,33 €)

Hackers. 25th Anniversary Edition: Heroes of the Computer Revolution
Steven Levy
31,46 € (list price 41,95 €)

The Web Application Hacker's Handbook: Discovering and Exploiting Security Flaws
Marcus Pinto, Dafydd Stuttard
63,84 € (list price 85,12 €)

How Linux Works: What Every Superuser Should Know
Brian Ward
41,68 € (list price 55,57 €)

Fundamentals of Software Architecture: A Modern Engineering Approach
Mark Richards, Neal Ford
83,89 € (list price 111,85 €)

Speak Data: Artists, Scientists, Thinkers, and Dreamers on How We Live Our Lives in Numbers
Giorgia Lupi, Phillip Cox
32,57 € (list price 43,43 €)

Linux Basics for Hackers, 2nd Edition: Getting Started with Networking, Scripting, and Security in Kali
Occupytheweb
33,69 € (list price 44,92 €)

Deep Learning: Foundations and Concepts
Christopher M. Bishop, Hugh Bishop
98,98 € (list price 131,98 €)

Mathematics for Machine Learning
Cheng Soon Ong, A. Aldo Faisal, Marc Peter Deisenroth
64,37 € (list price 85,82 €)

The Art of Death Stranding
33,25 € (list price 44,33 €)

Artificial Intelligence: A Modern Approach, Global Edition
Peter Norvig, Stuart Russell
104,66 € (list price 139,55 €)
Description

Simulated test collections may find application in situations where real datasets cannot easily be accessed due to confidentiality concerns or practical inconvenience. They can potentially support Information Retrieval (IR) experimentation, tuning, validation, performance prediction, and hardware sizing. Naturally, the accuracy and usefulness of results obtained from a simulation depend upon the fidelity and generality of the models which underpin it. The fidelity of emulation of a real corpus is likely to be limited by the requirement that confidential information in the real corpus should not be extractable from the emulated version.

We present a range of methods exploring trade-offs between emulation fidelity and degree of preservation of privacy. We present three different simple types of text generator which work at a micro level: Markov models, neural net models, and substitution ciphers. We also describe macro level methods where we can engineer macro properties of a corpus, giving a range of models for each of the salient properties: document length distribution, word frequency distribution (for independent and non-independent cases), word length and textual representation, and corpus growth.

We present results of emulating existing corpora and for scaling up corpora by two orders of magnitude. We show that simulated collections generated with relatively simple methods are suitable for some purposes and can be generated very quickly. Indeed, it may sometimes be feasible to embed a simple lightweight corpus generator into an indexer for the purpose of efficiency studies.

Naturally, a corpus of artificial text cannot support IR experimentation in the absence of a set of compatible queries. We discuss and experiment with published methods for query generation and query log emulation.

We present a proof-of-the-pudding study in which we observe the predictive accuracy of efficiency and effectiveness results obtained on emulated versions of TREC corpora. The study includes three open-source retrieval systems and several TREC datasets. There is a trade-off between confidentiality and prediction accuracy and there are interesting interactions between retrieval systems and datasets. Our tentative conclusion is that there are emulation methods which achieve useful prediction accuracy while providing a level of confidentiality adequate for many applications.

Many of the methods described here have been implemented in the open-source project SynthaCorpus, accessible at: https://bitbucket.org/davidhawking/synthacorpus/

Additional information

Author	David Hawking, Bodo Billerbeck, Nick Craswell, Paul Thomas
Publisher	Springer Nature Switzerland
Series	Synthesis Lectures on Information Concepts, Retrieval, and Services
Year of publication	2020
Cover type	Paperback
EAN	9783031011955
