Artificial intelligence and librarianship [E-book] : notes for reading
Martin Frické
Book
English
9780473722944
3. ed.
Wanaka : SoftOption, 2024.
533 p.
Available at: https://softoption.us/AIandLibrarianship
Table of contents:
CHAPTER 1: INTELLECTUAL BACKGROUND, 18
1.1 Introduction to Artificial Intelligence, 18
1.2 A Genuine Great Leap Forward, 24
1.3 Digitization and Transcription, 26
1.4 A Paean to Text in Structured Digital Form, 29
1.4.1 Text-to-Speech, 29
1.4.2 Machine Translation, 30
1.4.3 Search and Navigation, 32
1.4.4 Preservation and Archiving, 33
1.4.5 Free Books!, 33
1.4.6 Natural Language Processing, 33
1.4.7 Processing by Computer Software, 34
1.5 Data and the Need for Good Data, 34
1.6 Types of Machine Learning, 37
1.6.1 Supervised, 37
1.6.2 Unsupervised, 39
1.6.3 Semi-Supervised, 40
1.6.4 Self-Supervised, 41
1.6.5 Reinforcement, 43
1.6.6 Reinforcement Learning from Human Feedback (RLHF), 45
1.7 The Concept of Algorithm, 46
1.8 Annotated Readings for Chapter 1, 48
CHAPTER 2: CHATBOTS, 50
2.1 Introduction, 50
2.2 Dialog Processing, 51
2.3 ELIZA to ALICE, 54
2.4 The Turing Test, 57
2.5 Machine Learning Chit-Chat Bots, 57
2.6 LaMDA, 58
2.7 ChatGPT, 59
2.8 Task-Oriented, 62
2.9 GPTs, 65
2.10 Annotated Readings for Chapter 2, 68
CHAPTER 3: LANGUAGE MODELS, 70
3.1 Introduction, 70
3.2 Markov Chains, 71
3.3 Hidden Markov Models, 75
3.4 Shannon's Guessing Game, 77
3.4.1 Introduction, 77
3.4.2 Shannon's Approximations as Markov Processes, 79
3.4.3 Training a Shannon-Markov Model to Produce 'A Baby GPT', 82
3.5 Taylor's Cloze Procedure, 86
3.6 nanoGPT and an Illustration of Training, 87
3.7 Embeddings, 89
3.8 Word Embeddings and Word2Vec, 92
3.9 Adding Knowledge to Language Models, 94
3.10 InstructGPT and the Insights it Provides, 96
3.11 Annotated Readings for Chapter 3, 100
CHAPTER 4: LARGE LANGUAGE MODELS, 101
4.1 Introduction, 101
4.2 Seq2Seq, Encoder-Decoder Architecture, and Attention, 102
4.3 Attention and Transformers, 104
4.4 Large Language Models and Foundation Models, 105
4.5 Foundation Models, 105
4.5.1 BERT, 106
4.5.2 GPT-3, GPT-3.5, GPT-4, 107
4.6 Bigger is Better and Switch Transformers, 109
4.7 Base Models to Assistants to Agents, 110
4.8 Concerns and Limitations, 117
4.8.1 Hallucinations, 117
4.8.2 Fakes and Deepfakes, 118
4.8.3 Source Training Data Intellectual Property, Privacy, and Bias, 119
4.8.4 Intellectual Property of the Generated Output, 121
4.8.5 Cybersecurity, 123
4.8.6 Apparent Conflict with Chomsky’s Theories, 123
4.8.7 Environmental Costs, 124
4.8.8 Lack of Transparency, 125
4.9 Adding Knowledge and Reasoning to LLMs, 126
4.10 Annotated Readings for Chapter 4, 127
CHAPTER 5: LARGE MULTIMODAL MODELS, 130
5.1 Introduction, 130
5.2 Built in Safety Restrictions for GPT-4V, 132
5.2.1 ‘Inherited’ Restrictions, 132
5.2.2 Privacy, 133
5.2.3 Stereotypes and Ungrounded Inferences, 133
5.2.4 Be My Eyes— Be My AI, 135
5.2.5 An Assessment of the Restrictions, 135
5.3 A General Sense of What GPT-4V Can Do, 136
5.3.1 Follow Textual Instructions, 136
5.3.2 Read Printed or Handwritten Text, 137
5.3.3 Read Some Mathematics, 143
5.3.4 Read Data and Reason with It, 143
5.3.5 Follow Visual Pointing in Images, 143
5.3.6 Analyze Images Including Medical Images, 145
5.3.7 Use Ordinary Common-Sense Knowledge and Reasoning Across Modes, 149
5.3.8 Be an Educational Tutor, 150
5.3.9 Use Visual Diagrams When Writing Computer Code, 151
5.3.10 Have Temporal and Video Understanding, 151
5.3.11 Answer Intelligence Quotient (IQ) Tests, 152
5.3.12 Avoid False Presuppositions, 153
5.3.13 Navigate Real and Virtual Spaces, 153
5.4 Yang et al.’s Conclusion on GPT-4V, 154
5.5 GPT-4 Turbo (Early 2024), 155
5.6 GPT-4o (Later 2024), 156
5.7 Google’s Gemini, 156
5.8 Anthropic’s Claude, 157
5.9 Meta’s LLaMa, 158
5.10 Voice, 159
5.11 Possible Applications for LMMs, 159
5.11.1 Smartphone Uses, 159
5.11.2 Spot the Difference, 160
5.11.3 Producing Reports from Medical Images, 160
5.11.4 Assist with Image Generation, 160
5.11.5 Extension with Plugins, 161
5.11.6 Retrieval-Augmented Generation (RAG), 161
5.11.7 Label and Categorize Images, 162
5.11.8 Identify Objects, 162
5.11.9 ‘Igor’, AI Advantage and AI Community, 162
5.12 Annotated Readings for Chapter 5, 163
CHAPTER 6: EVALUATION AND THE FUTURE, 164
6.1 Reliability, Trustworthiness, and Alignment, 164
6.2 System 1 and System 2, 166
6.3 Benchmarks, 167
6.3.1 Introduction, 167
6.3.2 Multi-turn dialogs, 167
6.3.3 Chatbots, 168
6.3.4 Reasoning, 168
6.3.5 Common sense reasoning, 169
6.3.6 MMLU, 170
6.3.7 Coding, 171
6.4 Artificial General Intelligence (AGI), 173
6.5 The ARC-AGI Benchmark, 175
6.6 Artificial Super Intelligence (ASI), 176
6.7 Annotated Readings for Chapter 6, 178
CHAPTER 7: BIAS AND UNFAIRNESS, 179
7.1 Algorithmic Pipeline + Data = Machine Learning, 179
7.2 Some Clarification of the Terms 'Bias' and 'Unfairness', 181
7.3 Forms of Bias in Wider Machine Learning, 186
7.4 Bias in Natural Language Processing, 187
7.5 Some Clarification of the Term 'Algorithm', 192
7.6 Computer Program Inadequacy, 194
7.7 Bias in the Context of Wider Machine Learning Programs, 197
7.7.1 Fairness ('Distributive Justice'), 198
7.7.2 Debiasing Representation, 208
7.7.3 Panopticon Bias, the Panopticon Gaze, 209
7.7.4 Bias in (Librarianship) Classification, 212
7.8 Stochastic Psittacosis: LLMs and Foundation Models, 212
7.9 Supplement: The Bias of Programmers, 216
7.9.1 The 'Biases' of Professional Programmers, 216
7.9.2 The Biases of All of Us as Programmers, 218
7.10 Annotated Readings for Chapter 7, 218
CHAPTER 8: BIAS IN MACHINE LEARNING AND LIBRARIANSHIP, 221
8.1 Introduction, 221
8.2 Harms of Omission, 223
8.3 What to Digitize, 223
8.4 Search, Primarily Using Search Engines, 224
8.5 Social Media, Dis-, Mis- and False-Information, 231
8.6 Bias in the Organization of Information, 231
8.6.1 Introduction, 231
8.6.2 Be Careful, and Sparing, with Emotive Content, 233
8.6.3 Warrant and Controlled Vocabularies, 233
8.6.4 The Act of Classification Has Consequences, 239
8.6.5 Taxonomies Have Consequences, 241
8.6.6 The Current State of Libraries and Their Organizational Systems, 243
8.6.7 Designing Information Taxonomies for Librarianship, 245
8.7 Navigation: Metadata Supported and Otherwise, 247
8.8 Ethical Arguments to Underpin Assertions of Harms of Bias, 249
8.9 Annotated Readings for Chapter 8, 250
CHAPTER 9: WHAT MIGHT NATURAL LANGUAGE PROCESSING (NLP) BRING TO LIBRARIANSHIP?, 251
9.1 Introduction, 251
9.2 The Pre-Processing Pipeline, 252
9.3 Text Embeddings and Similarity, 254
9.3.1 Searching by Meaning (Semantic Search), 256
9.3.2 Research Trails, 257
9.3.3 Classification, 258
9.3.4 One Style of Recommendation, 258
9.3.5 Plagiarism Detection, 258
9.4 Named Entity Recognition, 259
9.5 Topic Modeling, 260
9.6 Text Classification Problems, 261
9.6.1 Shelving and Subject Classification, 262
9.6.2 Sentiment Analysis, 262
9.6.3 Author or Genre Recognition, 263
9.7 Controlled Vocabularies, Thesauri, and Ontological Vocabularies, 264
9.8 Indexing and Automatic Indexing, 265
9.9 Abstracts, Extracts, Key Phrases, Keywords, and Summaries, 268
9.10 Text Mining and Question Answering, 271
9.11 Machine Translation, 271
9.12 Evidence, 271
9.13 This Is Not Magic, 272
9.14 Text Processing and Laws, 273
9.15 Annotated Readings for Chapter 9, 274
CHAPTER 10: WHAT ARE THE OPPORTUNITIES FOR LIBRARIANS?, 275
10.1 Introduction, 275
10.2 Librarians as Synergists, 279
10.3 Librarians as Sentries, 283
10.4 Librarians as Educators, 284
10.5 Librarians as Managers, 286
10.6 Librarians as Astronauts, 287
10.7 Annotated Readings for Chapter 10, 288
CHAPTER 11: LIBRARIANS AS SYNERGISTS, 290
11.1 Intellectual Freedom, 290
11.1.1 Text Recognition, 292
11.1.2 Speech to Text, 302
11.1.3 Sign Language to Text, and Text to Sign Language, 304
11.1.4 Helping Filter and Personalize, 305
11.1.5 Scholarly Publishing, 306
11.1.6 What Can Be Done With Computer Text, 306
11.1.7 ELI5 Translation, 306
11.2 Improving the Intermediation Between 'Users' and 'Information Resources', 307
11.2.1 Some Users Might Not Be Human, 307
11.2.2 Some Resources Might Not Be Resources, 308
11.2.3 Digital Archiving, 308
11.2.4 Enhanced Search Engines, 308
11.2.5 Personalization and Recommendation, 311
11.2.6 Recommender Systems, 312
11.2.7 Understanding What the User is Asking For, 315
11.2.8 Text Mining, 315
11.2.9 Information Assistants (and ‘GPTs’), 316
11.3 Improving Traditional Cataloging, Classification, and Retrieval Tools, 318
11.3.1 NLP Inspired Improvements, 321
11.3.2 Metadata Generation and Automatic Cataloging, 322
11.3.3 Some Retrieval Tools, 323
11.4 Chatbots, 330
11.4.1 Reference Interviews, 331
11.4.2 Virtual Services, 333
11.4.3 Chatbots as Continuous User Testing of a Library's Public Interface, 334
11.5 Release, Produce, or Curate Training Data, 334
11.6 Debunking, Disinformation, Misinformation, and Fakes, 336
11.7 Social Epistemology, 336
11.8 Robots, 339
11.9 Images, 341
11.10 Annotated Readings for Chapter 11, 342
CHAPTER 12: LIBRARIANS AS SENTRIES, 343
12.1 Copyright and Intellectual Property, 343
12.2 Intellectual Freedom, 343
12.3 Censorship and Algorithmic Curation, 344
12.4 Privacy, 346
12.5 Bias, 347
12.6 Social Epistemology, 347
12.6.1 Reliability, Validity, and Over Confidence, 347
12.6.2 Confirmation Bias and Poor Reasoning, 348
12.6.3 Misinformation, 348
12.6.4 Awareness of the Digital Literacy of Patrons, 348
12.7 Chatbots, 349
12.8 Personalization and Paternalism, 350
12.9 Images and Facial Recognition Technology, 352
12.10 Losing Jobs, 353
12.11 Annotated Readings for Chapter 12, 354
CHAPTER 13: LIBRARIANS AS EDUCATORS, 355
13.1 Information Literacy (for Consumers of Information), 355
13.2 Artificial Intelligence Literacy, 355
13.3 Data Information Literacy (for Producers of Information), 358
13.4 Changes in Learning and Teaching, 359
13.5 Scholarly Communication, 359
13.6 Academic Libraries Collaborating with other University Units, 360
13.7 AI Laboratories in the Library, 360
13.8 Automated Decision-Making, 361
13.9 Explainable Artificial Intelligence (XAI), 367
13.10 Annotated Readings for Chapter 13, 370
CHAPTER 14: LIBRARIANS AS MANAGERS, 372
14.1 Coming on Board, 372
14.2 Data and Analyses, 375
14.3 Evidence-Based Librarianship, 376
14.4 Data-Driven Decision Making, 377
14.4.1 Collection Building and Management, 377
14.4.2 Circulation and User Studies, 377
14.4.3 Processing in Libraries, 377
14.4.4 Research and Scholarship, 378
14.4.5 Service Quality, 378
14.5 Acquiring the Appropriate AI Tools, 378
14.6 Analysts and Staff, 379
14.7 Fear of AI, 379
14.8 Annotated Readings for Chapter 14, 380
CHAPTER 15: LIBRARIANS AS ASTRONAUTS, 381
15.1 Astronaut Training, 381
15.2 Why Should You Learn How To Do It?, 381
15.3 What are the Real Creative Possibilities, 382
15.4 Sitting in Your Tin Can, 384
15.5 Exploring World 3, 385
15.5.1 Undiscovered Public Knowledge (UPK), 385
15.5.2 Literature-Based Discovery (Text Based Informatics), 388
15.5.3 A Message to Librarian Astronauts, 388
15.6 Annotated Readings for Chapter 15, 389
APPENDIX A: SOME THEORETICAL BACKGROUND TO LIBRARIANSHIP, 390
A.1 Concepts, Classification, Taxonomies, and Items, 390
A.2 Controlled Vocabularies, and Thesauri, 391
A.3 Ontologies and Ontological Vocabularies, 393
A.4 Objective, Intersubjective, and Subjective, 395
A.5 Emotive and Descriptive Content, 397
A.6 Classification Schemes and the Act of Classification, 399
A.7 Annotated Readings for Appendix A, 401
APPENDIX B: WORKING WITH LLMS, 402
B.1 Introduction, 402
B.2 Prompts and Prompt Engineering, 403
B.2.1 Basic Examples of Zero-Shot Prompting, 405
B.2.2 Examples of Few-Shot Prompting, 411
B.2.3 Chain of Thought Prompting, 413
B.2.4 Tuning, or Configuring, the Models or Prompts, 415
B.3 Choices on Development, 418
B.4 Moving Forward With LangChain, 421
B.4.0 A Note on the Status of LangChain and Similar as of 11/6/2023, 421
B.4.1 What is LangChain?, 422
B.4.2 LangChain Experiments Displayed to a Web Page, 424
B.4.3 LangChain Using Jupyter, 435
B.4.4 Resources for LangChain using Jupyter, 438
B.5 Annotated Resources for Appendix B, 439
APPENDIX C: TWO IMPORTANT METHODOLOGICAL POINTS, 441
C.1 False Positives and False Negatives, 441
C.2 The Base-Rate Fallacy, 443
C.3 Annotated Readings for Appendix C, 447
APPENDIX D: CAUSAL DIAGRAMS, 449
D.1 Causation and Correlation, 449
D.2 Causal Diagrams, 451
D.3 Annotated Readings for Appendix D, 467
APPENDIX E: KNOWLEDGE GRAPHS, 468
E.1 Knowledge Graphs, 468
E.2 Annotated Readings for Appendix E, 470
GLOSSARY, 471
BIBLIOGRAPHY, 507