cover

Full Table of Contents for AI: A Modern Approach

Part I: Artificial Intelligence

Chapter 1   Introduction ... 1

    What Is AI? ... 1
        1.1.1   Acting humanly: The Turing test approach ... 2
        1.1.2   Thinking humanly: The cognitive modeling approach ... 2
        1.1.3   Thinking rationally: The ``laws of thought'' approach ... 3
        1.1.4   Acting rationally: The rational agent approach ... 3
        1.1.5   Beneficial machines ... 4
    1.2   The Foundations of Artificial Intelligence ... 5
        1.2.1   Philosophy ... 6
        1.2.2   Mathematics ... 8
        1.2.3   Economics ... 9
        1.2.4   Neuroscience ... 11
        1.2.5   Psychology ... 12
        1.2.6   Computer engineering ... 14
        1.2.7   Control theory and cybernetics ... 15
        1.2.8   Linguistics ... 16
    1.3   The History of Artificial Intelligence ... 17
        1.3.1   The inception of artificial intelligence (1943--1956) ... 17
        1.3.2   Early enthusiasm, great expectations (1952--1969) ... 18
        1.3.3   A dose of reality (1966--1973) ... 21
        1.3.4   Expert systems (1969--1986) ... 22
        1.3.5   The return of neural networks (1986--present) ... 24
        1.3.6   Probabilistic reasoning and machine learning (1987--present) ... 24
        1.3.7   Big data (2001--present) ... 26
        1.3.8   Deep learning (2011--present) ... 26
    1.4   The State of the Art ... 27
    1.5   Risks and Benefits of AI ... 31
    Summary ... 34
    Bibliographical and Historical Notes ... 35

Chapter 2   Intelligent Agents ... 36

    2.1   Agents and Environments ... 36
    2.2   Good Behavior: The Concept of Rationality ... 39
        2.2.1   Performance measures ... 39
        2.2.2   Rationality ... 40
        2.2.3   Omniscience, learning, and autonomy ... 40
    2.3   The Nature of Environments ... 42
        2.3.1   Specifying the task environment ... 42
        2.3.2   Properties of task environments ... 43
    2.4   The Structure of Agents ... 47
        2.4.1   Agent programs ... 48
        2.4.2   Simple reflex agents ... 49
        2.4.3   Model-based reflex agents ... 51
        2.4.4   Goal-based agents ... 53
        2.4.5   Utility-based agents ... 54
        2.4.6   Learning agents ... 56
        2.4.7   How the components of agent programs work ... 58
    Summary ... 60
    Bibliographical and Historical Notes ... 60

Part II: Problem-solving

Chapter 3   Solving Problems by Searching ... 63

    3.1   Problem-Solving Agents ... 63
        3.1.1   Search problems and solutions ... 65
        3.1.2   Formulating problems ... 66
    3.2   Example Problems ... 66
        3.2.1   Standardized problems ... 66
        3.2.2   Real-world problems ... 69
    3.3   Search Algorithms ... 71
        3.3.1   Best-first search ... 73
        3.3.2   Search data structures ... 73
        3.3.3   Redundant paths ... 74
        3.3.4   Measuring problem-solving performance ... 75
    3.4   Uninformed Search Strategies ... 76
        3.4.1   Breadth-first search ... 76
        3.4.2   Dijkstra's algorithm or uniform-cost search ... 77
        3.4.3   Depth-first search and the problem of memory ... 78
        3.4.4   Depth-limited and iterative deepening search ... 80
        3.4.5   Bidirectional search ... 82
        3.4.6   Comparing uninformed search algorithms ... 84
    3.5   Informed (Heuristic) Search Strategies ... 84
        3.5.1   Greedy best-first search ... 85
        3.5.2   A* search ... 85
        3.5.3   Search contours ... 89
        3.5.4   Satisficing search: Inadmissible heuristics and weighted A* ... 90
        3.5.5   Memory-bounded search ... 92
        3.5.6   Bidirectional heuristic search ... 96
    3.6   Heuristic Functions ... 97
        3.6.1   The effect of heuristic accuracy on performance ... 98
        3.6.2   Generating heuristics from relaxed problems ... 99
        3.6.3   Generating heuristics from subproblems: Pattern databases ... 100
        3.6.4   Generating heuristics with landmarks ... 102
        3.6.5   Learning to search better ... 103
        3.6.6   Learning heuristics from experience ... 104
    Summary ... 104
    Bibliographical and Historical Notes ... 106

Chapter 4   Search in Complex Environments ... 110

    4.1   Local Search and Optimization Problems ... 110
        4.1.1   Hill-climbing search ... 111
        4.1.2   Simulated annealing ... 114
        4.1.3   Local beam search ... 115
        4.1.4   Evolutionary algorithms ... 115
    4.2   Local Search in Continuous Spaces ... 119
    4.3   Search with Nondeterministic Actions ... 122
        4.3.1   The erratic vacuum world ... 122
        4.3.2   AND—OR search trees ... 123
        4.3.3   Try, try again ... 125
    4.4   Search in Partially Observable Environments ... 126
        4.4.1   Searching with no observation ... 126
        4.4.2   Searching in partially observable environments ... 130
        4.4.3   Solving partially observable problems ... 132
        4.4.4   An agent for partially observable environments ... 132
    4.5   Online Search Agents and Unknown Environments ... 134
        4.5.1   Online search problems ... 135
        4.5.2   Online search agents ... 137
        4.5.3   Online local search ... 138
        4.5.4   Learning in online search ... 140
    Summary ... 141
    Bibliographical and Historical Notes ... 142

Chapter 5   Adversarial Search and Games ... 146

    5.1   Game Theory ... 146
        5.1.1   Two-player zero-sum games ... 147
    5.2   Optimal Decisions in Games ... 148
        5.2.1   The minimax search algorithm ... 149
        5.2.2   Optimal decisions in multiplayer games ... 151
        5.2.3   Alpha--Beta Pruning ... 152
        5.2.4   Move ordering ... 153
    5.3   Heuristic Alpha--Beta Tree Search ... 156
        5.3.1   Evaluation functions ... 156
        5.3.2   Cutting off search ... 158
        5.3.3   Forward pruning ... 159
        5.3.4   Search versus lookup ... 160
    5.4   Monte Carlo Tree Search ... 161
    5.5   Stochastic Games ... 164
        5.5.1   Evaluation functions for games of chance ... 166
    5.6   Partially Observable Games ... 168
        5.6.1   Kriegspiel: Partially observable chess ... 168
        5.6.2   Card games ... 171
    5.7   Limitations of Game Search Algorithms ... 173
    Summary ... 174
    Bibliographical and Historical Notes ... 175

Chapter 6   Constraint Satisfaction Problems ... 180

    6.1   Defining Constraint Satisfaction Problems ... 180
        6.1.1   Example problem: Map coloring ... 181
        6.1.2   Example problem: Job-shop scheduling ... 182
        6.1.3   Variations on the CSP formalism ... 183
    6.2   Constraint Propagation: Inference in CSPs ... 185
        6.2.1   Node consistency ... 186
        6.2.2   Arc consistency ... 186
        6.2.3   Path consistency ... 187
        6.2.4   Kconsistency ... 188
        6.2.5   Global constraints ... 188
        6.2.6   Sudoku ... 189
    6.3   Backtracking Search for CSPs ... 191
        6.3.1   Variable and value ordering ... 193
        6.3.2   Interleaving search and inference ... 194
        6.3.3   Intelligent backtracking: Looking backward ... 195
        6.3.4   Constraint learning ... 196
    6.4   Local Search for CSPs ... 197
    6.5   The Structure of Problems ... 199
        6.5.1   Cutset conditioning ... 200
        6.5.2   Tree decomposition ... 201
        6.5.3   Value symmetry ... 203
    Summary ... 203
    Bibliographical and Historical Notes ... 204

Part III: Knowledge, reasoning, and planning

Chapter 7   Logical Agents ... 208

    7.1   Knowledge-Based Agents ... 209
    7.2   The Wumpus World ... 210
    7.3   Logic ... 214
    7.4   Propositional Logic: A Very Simple Logic ... 217
        7.4.1   Syntax ... 217
        7.4.2   Semantics ... 218
        7.4.3   A simple knowledge base ... 220
        7.4.4   A simple inference procedure ... 220
    7.5   Propositional Theorem Proving ... 222
        7.5.1   Inference and proofs ... 223
        7.5.2   Proof by resolution ... 225
            Conjunctive normal form ... 226
            A resolution algorithm ... 227
            Completeness of resolution ... 228
        7.5.3   Horn clauses and definite clauses ... 229
        7.5.4   Forward and backward chaining ... 230
    7.6   Effective Propositional Model Checking ... 232
        7.6.1   A complete backtracking algorithm ... 233
        7.6.2   Local search algorithms ... 235
        7.6.3   The landscape of random SAT problems ... 236
    7.7   Agents Based on Propositional Logic ... 237
        7.7.1   The current state of the world ... 237
        7.7.2   A hybrid agent ... 241
        7.7.3   Logical state estimation ... 241
        7.7.4   Making plans by propositional inference ... 244
    Summary ... 246
    Bibliographical and Historical Notes ... 247

Chapter 8   First-Order Logic ... 251

    8.1   Representation Revisited ... 251
        8.1.1   The language of thought ... 252
        8.1.2   Combining the best of formal and natural languages ... 254
    8.2   Syntax and Semantics of First-Order Logic ... 256
        8.2.1   Models for first-order logic ... 256
        8.2.2   Symbols and interpretations ... 257
        8.2.3   Terms ... 259
        8.2.4   Atomic sentences ... 260
        8.2.5   Complex sentences ... 260
        8.2.6   Quantifiers ... 260
            Universal quantification (∀) ... 260
            Existential quantification (∃) ... 262
            Nested quantifiers ... 263
            Connections between ∀ and ∃ ... 263
        8.2.7   Equality ... 264
        8.2.8   Database semantics ... 264
    8.3   Using First-Order Logic ... 265
        8.3.1   Assertions and queries in first-order logic ... 265
        8.3.2   The kinship domain ... 266
        8.3.3   Numbers, sets, and lists ... 268
        8.3.4   The wumpus world ... 270
    8.4   Knowledge Engineering in First-Order Logic ... 271
        8.4.1   The knowledge engineering process ... 272
        8.4.2   The electronic circuits domain ... 273
            Identify the questions ... 273
            Assemble the relevant knowledge ... 274
            Decide on a vocabulary ... 274
            Encode general knowledge of the domain ... 275
            Encode the specific problem instance ... 276
            Pose queries to the inference procedure ... 276
            Debug the knowledge base ... 277
    Summary ... 277
    Bibliographical and Historical Notes ... 278

Chapter 9   Inference in First-Order Logic ... 280

    9.1   Propositional vs. First-Order Inference ... 280
        9.1.1   Reduction to propositional inference ... 281
    9.2   Unification and First-Order Inference ... 282
        9.2.1   Unification ... 283
        9.2.2   Storage and retrieval ... 284
    9.3   Forward Chaining ... 286
        9.3.1   First-order definite clauses ... 286
        9.3.2   A simple forward-chaining algorithm ... 287
        9.3.3   Efficient forward chaining ... 289
            Matching rules against known facts ... 289
            Incremental forward chaining ... 291
            Irrelevant facts ... 292
    9.4   Backward Chaining ... 293
        9.4.1   A backward-chaining algorithm ... 293
        9.4.2   Logic programming ... 294
        9.4.3   Redundant inference and infinite loops ... 295
        9.4.4   Database semantics of Prolog ... 297
        9.4.5   Constraint logic programming ... 298
    9.5   Resolution ... 298
        9.5.1   Conjunctive normal form for first-order logic ... 299
        9.5.2   The resolution inference rule ... 300
        9.5.3   Example proofs ... 301
        9.5.4   Completeness of resolution ... 303
        9.5.5   Equality ... 306
        9.5.6   Resolution strategies ... 308
            Practical uses of resolution theorem provers ... 309
    Summary ... 309
    Bibliographical and Historical Notes ... 310

Chapter 10   Knowledge Representation ... 314

    10.1   Ontological Engineering ... 314
    10.2   Categories and Objects ... 317
        10.2.1   Physical composition ... 318
        10.2.2   Measurements ... 319
        10.2.3   Objects: Things and stuff ... 321
    10.3   Events ... 322
        10.3.1   Time ... 324
        10.3.2   Fluents and objects ... 325
    10.4   Mental Objects and Modal Logic ... 326
        10.4.1   Other modal logics ... 328
    10.5   Reasoning Systems for Categories ... 329
        10.5.1   Semantic networks ... 329
        10.5.2   Description logics ... 331
    10.6   Reasoning with Default Information ... 333
        10.6.1   Circumscription and default logic ... 333
        10.6.2   Truth maintenance systems ... 335
    Summary ... 337
    Bibliographical and Historical Notes ... 338

Chapter 11   Automated Planning ... 344

    11.1   Definition of Classical Planning ... 344
        11.1.1   Example domain: Air cargo transport ... 345
        11.1.2   Example domain: The spare tire problem ... 346
        11.1.3   Example domain: The blocks world ... 346
    11.2   Algorithms for Classical Planning ... 348
        11.2.1   Forward state-space search for planning ... 348
        11.2.2   Backward search for planning ... 350
        11.2.3   Planning as Boolean satisfiability ... 351
        11.2.4   Other classical planning approaches ... 352
    11.3   Heuristics for Planning ... 353
        11.3.1   Domain-independent pruning ... 354
        11.3.2   State abstraction in planning ... 355
    11.4   Hierarchical Planning ... 356
        11.4.1   High-level actions ... 357
        11.4.2   Searching for primitive solutions ... 358
        11.4.3   Searching for abstract solutions ... 360
    11.5   Planning and Acting in Nondeterministic Domains ... 365
        11.5.1   Sensorless planning ... 367
        11.5.2   Contingent planning ... 370
        11.5.3   Online planning ... 371
    11.6   Time, Schedules, and Resources ... 374
        11.6.1   Representing temporal and resource constraints ... 375
        11.6.2   Solving scheduling problems ... 376
    11.7   Analysis of Planning Approaches ... 378
    Summary ... 379
    Bibliographical and Historical Notes ... 380

Part IV: Uncertain knowledge and reasoning

Chapter 12   Quantifying Uncertainty ... 385

    12.1   Acting under Uncertainty ... 385
        12.1.1   Summarizing uncertainty ... 386
        12.1.2   Uncertainty and rational decisions ... 387
    12.2   Basic Probability Notation ... 388
        12.2.1   What probabilities are about ... 388
        12.2.2   The language of propositions in probability assertions ... 390
        12.2.3   Probability axioms and their reasonableness ... 393
    12.3   Inference Using Full Joint Distributions ... 395
    12.4   Independence ... 397
    12.5   Bayes' Rule and Its Use ... 399
        12.5.1   Applying Bayes' rule: The simple case ... 399
        12.5.2   Using Bayes' rule: Combining evidence ... 400
    12.6   Naive Bayes Models ... 402
        12.6.1   Text classification with naive Bayes ... 403
    12.7   The Wumpus World Revisited ... 404
    Summary ... 407
    Bibliographical and Historical Notes ... 408

Chapter 13   Probabilistic Reasoning ... 412

    13.1   Representing Knowledge in an Uncertain Domain ... 412
    13.2   The Semantics of Bayesian Networks ... 414
            A method for constructing Bayesian networks ... 415
            Compactness and node ordering ... 417
        13.2.1   Conditional independence relations in Bayesian networks ... 418
        13.2.2   Efficient Representation of Conditional Distributions ... 420
        13.2.3   Bayesian nets with continuous variables ... 422
        13.2.4   Case study: Car insurance ... 424
    13.3   Exact Inference in Bayesian Networks ... 427
        13.3.1   Inference by enumeration ... 427
        13.3.2   The variable elimination algorithm ... 430
            Operations on factors ... 431
            Variable ordering and variable relevance ... 432
        13.3.3   The complexity of exact inference ... 433
        13.3.4   Clustering algorithms ... 434
    13.4   Approximate Inference for Bayesian Networks ... 435
        13.4.1   Direct sampling methods ... 436
            Rejection sampling in Bayesian networks ... 437
            Importance sampling ... 439
        13.4.2   Inference by Markov chain simulation ... 441
            Gibbs sampling in Bayesian networks ... 442
            Analysis of Markov chains ... 443
            Why Gibbs sampling works ... 445
            Complexity of Gibbs sampling ... 446
            Metropolis--Hastings sampling ... 447
        13.4.3   Compiling approximate inference ... 448
    13.5   Causal Networks ... 449
        13.5.1   Representing actions: do-operator ... 451
        13.5.2   The back-door criterion ... 453
    Summary ... 453
    Bibliographical and Historical Notes ... 454

Chapter 14   Probabilistic Reasoning over Time ... 461

    14.1   Time and Uncertainty ... 461
        14.1.1   States and observations ... 462
        14.1.2   Transition and sensor models ... 463
    14.2   Inference in Temporal Models ... 465
        14.2.1   Filtering and prediction ... 466
        14.2.2   Smoothing ... 468
        14.2.3   Finding the most likely sequence ... 471
    14.3   Hidden Markov Models ... 473
        14.3.1   Simplified matrix algorithms ... 474
        14.3.2   Hidden Markov model example: Localization ... 476
    14.4   Kalman Filters ... 479
        14.4.1   Updating Gaussian distributions ... 479
        14.4.2   A simple one-dimensional example ... 480
        14.4.3   The general case ... 482
        14.4.4   Applicability of Kalman filtering ... 483
    14.5   Dynamic Bayesian Networks ... 485
        14.5.1   Constructing DBNs ... 486
        14.5.2   Exact inference in DBNs ... 489
        14.5.3   Approximate inference in DBNs ... 491
    Summary ... 496
    Bibliographical and Historical Notes ... 497

Chapter 15   Probabilistic Programming ... 500

    15.1   Relational Probability Models ... 501
        15.1.1   Syntax and semantics ... 502
        15.1.2   Example: Rating player skill levels ... 505
        15.1.3   Inference in relational probability models ... 506
    15.2   Open-Universe Probability Models ... 507
        15.2.1   Syntax and semantics ... 508
        15.2.2   Inference in open-universe probability models ... 510
        15.2.3   Examples ... 511
            Citation matching ... 511
            Nuclear treaty monitoring ... 512
    15.3   Keeping Track of a Complex World ... 514
        15.3.1   Example: Multitarget tracking ... 515
        15.3.2   Example: Traffic monitoring ... 518
    15.4   Programs as Probability Models ... 519
        15.4.1   Example: Reading text ... 519
        15.4.2   Syntax and semantics ... 520
        15.4.3   Inference results ... 522
        15.4.4   Improving the generative program to incorporate a Markov model ... 522
        15.4.5   Inference in generative programs ... 522
    Summary ... 523
    Bibliographical and Historical Notes ... 524

Chapter 16   Making Simple Decisions ... 528

    16.1   Combining Beliefs and Desires under Uncertainty ... 528
    16.2   The Basis of Utility Theory ... 529
        16.2.1   Constraints on rational preferences ... 530
        16.2.2   Rational preferences lead to utility ... 531
    16.3   Utility Functions ... 532
        16.3.1   Utility assessment and utility scales ... 533
        16.3.2   The utility of money ... 534
        16.3.3   Expected utility and post-decision disappointment ... 536
        16.3.4   Human judgment and irrationality ... 538
    16.4   Multiattribute Utility Functions ... 540
        16.4.1   Dominance ... 540
        16.4.2   Preference structure and multiattribute utility ... 543
            Preferences without uncertainty ... 543
            Preferences with uncertainty ... 544
    16.5   Decision Networks ... 544
        16.5.1   Representing a decision problem with a decision network ... 545
        16.5.2   Evaluating decision networks ... 546
    16.6   The Value of Information ... 547
        16.6.1   A simple example ... 547
        16.6.2   A general formula for perfect information ... 548
        16.6.3   Properties of the value of information ... 549
        16.6.4   Implementation of an information-gathering agent ... 550
        16.6.5   Nonmyopic information gathering ... 551
        16.6.6   Sensitivity analysis and robust decisions ... 552
    16.7   Unknown Preferences ... 553
        16.7.1   Uncertainty about one's own preferences ... 553
        16.7.2   Deference to humans ... 554
    Summary ... 557
    Bibliographical and Historical Notes ... 557

Chapter 17   Making Complex Decisions ... 562

    17.1   Sequential Decision Problems ... 562
        17.1.1   Utilities over time ... 564
        17.1.2   Optimal policies and the utilities of states ... 567
        17.1.3   Reward scales ... 569
        17.1.4   Representing MDPs ... 570
    17.2   Algorithms for MDPs ... 572
        17.2.1   Value Iteration ... 572
            Convergence of value iteration ... 574
        17.2.2   Policy iteration ... 576
        17.2.3   Linear programming ... 578
        17.2.4   Online algorithms for MDPs ... 578
    17.3   Bandit Problems ... 581
        17.3.1   Calculating the Gittins index ... 583
        17.3.2   The Bernoulli bandit ... 584
        17.3.3   Approximately optimal bandit policies ... 585
        17.3.4   Non-indexable variants ... 586
    17.4   Partially Observable MDPs ... 588
        17.4.1   Definition of POMDPs ... 588
    17.5   Algorithms for Solving POMDPs ... 590
        17.5.1   Value iteration for POMDPs ... 590
        17.5.2   Online algorithms for POMDPs ... 593
    Summary ... 595
    Bibliographical and Historical Notes ... 596

Chapter 18   Multiagent Decision Making ... 599

    18.1   Properties of Multiagent Environments ... 599
        18.1.1   One decision maker ... 599
        18.1.2   Multiple decision makers ... 600
        18.1.3   Multiagent planning ... 601
        18.1.4   Planning with multiple agents: Cooperation and coordination ... 604
    18.2   Non-Cooperative Game Theory ... 605
        18.2.1   Games with a single move: Normal form games ... 605
        18.2.2   Social welfare ... 609
            Computing equilibria ... 610
        18.2.3   Repeated games ... 614
        18.2.4   Sequential games: The extensive form ... 617
            Chance and simultaneous moves ... 619
            Capturing imperfect information ... 619
        18.2.5   Uncertain payoffs and assistance games ... 623
    18.3   Cooperative Game Theory ... 626
        18.3.1   Coalition structures and outcomes ... 626
        18.3.2   Strategy in cooperative games ... 627
        18.3.3   Computation in cooperative games ... 630
            Marginal contribution nets ... 630
            Coalition structures for maximum social welfare ... 631
    18.4   Making Collective Decisions ... 632
        18.4.1   Allocating tasks with the contract net ... 632
        18.4.2   Allocating scarce resources with auctions ... 634
            Common goods ... 637
        18.4.3   Voting ... 638
            Strategic manipulation ... 641
        18.4.4   Bargaining ... 641
            Bargaining with the alternating offers protocol ... 641
            Impatient agents ... 642
            Negotiation in task-oriented domains ... 643
            The monotonic concession protocol ... 644
            The Zeuthen strategy ... 644
    Summary ... 645
    Bibliographical and Historical Notes ... 646

Part V: Machine Learning

Chapter 19   Learning from Examples ... 651

    19.1   Forms of Learning ... 651
    19.2   Supervised Learning ... 653
        19.2.1   Example problem: Restaurant waiting ... 656
    19.3   Learning Decision Trees ... 657
        19.3.1   Expressiveness of decision trees ... 657
        19.3.2   Learning decision trees from examples ... 658
        19.3.3   Choosing attribute tests ... 661
        19.3.4   Generalization and overfitting ... 663
        19.3.5   Broadening the applicability of decision trees ... 664
    19.4   Model Selection and Optimization ... 665
        19.4.1   Model selection ... 667
        19.4.2   From error rates to loss ... 669
        19.4.3   Regularization ... 671
        19.4.4   Hyperparameter tuning ... 671
    19.5   The Theory of Learning ... 672
        19.5.1   PAC learning example: Learning decision lists ... 674
    19.6   Linear Regression and Classification ... 676
        19.6.1   Univariate linear regression ... 676
        19.6.2   Gradient descent ... 677
        19.6.3   Multivariable linear regression ... 679
        19.6.4   Linear classifiers with a hard threshold ... 682
        19.6.5   Linear classification with logistic regression ... 684
    19.7   Nonparametric Models ... 686
        19.7.1   Nearest-neighbor models ... 687
        19.7.2   Finding nearest neighbors with k-d trees ... 688
        19.7.3   Locality-sensitive hashing ... 689
        19.7.4   Nonparametric regression ... 691
        19.7.5   Support vector machines ... 692
        19.7.6   The kernel trick ... 695
    19.8   Ensemble Learning ... 696
        19.8.1   Bagging ... 697
        19.8.2   Random forests ... 697
        19.8.3   Stacking ... 699
        19.8.4   Boosting ... 699
        19.8.5   Gradient boosting ... 701
        19.8.6   Online learning ... 702
    19.9   Developing Machine Learning Systems ... 704
        19.9.1   Problem formulation ... 704
        19.9.2   Data collection, assessment, and management ... 705
            Feature engineering ... 707
            Exploratory data analysis and visualization ... 708
        19.9.3   Model selection and training ... 709
        19.9.4   Trust, interpretability, and explainability ... 710
        19.9.5   Operation, monitoring, and maintenance ... 712
    Summary ... 714
    Bibliographical and Historical Notes ... 715

Chapter 20   Learning Probabilistic Models ... 721

    20.1   Statistical Learning ... 721
    20.2   Learning with Complete Data ... 724
        20.2.1   Maximum-likelihood parameter learning: Discrete models ... 725
        20.2.2   Naive Bayes models ... 727
        20.2.3   Generative and discriminative models ... 727
        20.2.4   Maximum-likelihood parameter learning: Continuous models ... 728
        20.2.5   Bayesian parameter learning ... 730
        20.2.6   Bayesian linear regression ... 732
        20.2.7   Learning Bayes net structures ... 734
        20.2.8   Density estimation with nonparametric models ... 736
    20.3   Learning with Hidden Variables: The EM Algorithm ... 737
        20.3.1   Unsupervised clustering: Learning mixtures of Gaussians ... 738
        20.3.2   Learning Bayes net parameter values for hidden variables ... 741
        20.3.3   Learning hidden Markov models ... 744
        20.3.4   The general form of the EM algorithm ... 744
        20.3.5   Learning Bayes net structures with hidden variables ... 745
    Summary ... 746
    Bibliographical and Historical Notes ... 747

Chapter 21   Deep Learning ... 750

    21.1   Simple Feedforward Networks ... 751
        21.1.1   Networks as complex functions ... 751
        21.1.2   Gradients and learning ... 754
    21.2   Computation Graphs for Deep Learning ... 756
        21.2.1   Input encoding ... 756
        21.2.2   Output layers and loss functions ... 757
        21.2.3   Hidden layers ... 759
    21.3   Convolutional Networks ... 760
        21.3.1   Pooling and downsampling ... 762
        21.3.2   Tensor operations in CNNs ... 763
        21.3.3   Residual networks ... 764
    21.4   Learning Algorithms ... 765
        21.4.1   Computing gradients in computation graphs ... 766
        21.4.2   Batch normalization ... 768
    21.5   Generalization ... 768
        21.5.1   Choosing a network architecture ... 768
        21.5.2   Neural architecture search ... 770
        21.5.3   Weight decay ... 771
        21.5.4   Dropout ... 772
    21.6   Recurrent Neural Networks ... 772
        21.6.1   Training a basic RNN ... 773
        21.6.2   Long short-term memory RNNs ... 775
    21.7   Unsupervised Learning and Transfer Learning ... 775
        21.7.1   Unsupervised learning ... 776
            Probabilistic PCA: A simple generative model ... 776
            Autoencoders ... 778
            Deep autoregressive models ... 779
            Generative adversarial networks ... 780
            Unsupervised translation ... 780
        21.7.2   Transfer learning and multitask learning ... 781
    21.8   Applications ... 782
        21.8.1   Vision ... 782
        21.8.2   Natural language processing ... 783
        21.8.3   Reinforcement learning ... 783
    Summary ... 784
    Bibliographical and Historical Notes ... 785

Chapter 22   Reinforcement Learning ... 789

    22.1   Learning from Rewards ... 789
    22.2   Passive Reinforcement Learning ... 791
        22.2.1   Direct utility estimation ... 792
        22.2.2   Adaptive dynamic programming ... 793
        22.2.3   Temporal-difference learning ... 793
    22.3   Active Reinforcement Learning ... 797
        22.3.1   Exploration ... 797
        22.3.2   Safe exploration ... 799
        22.3.3   Temporal-difference Q-learning ... 801
    22.4   Generalization in Reinforcement Learning ... 803
        22.4.1   Approximating direct utility estimation ... 804
        22.4.2   Approximating temporal-difference learning ... 805
        22.4.3   Deep reinforcement learning ... 806
        22.4.4   Reward shaping ... 807
        22.4.5   Hierarchical reinforcement learning ... 807
    22.5   Policy Search ... 810
    22.6   Apprenticeship and Inverse Reinforcement Learning ... 812
    22.7   Applications of Reinforcement Learning ... 815
        22.7.1   Applications in game playing ... 815
        22.7.2   Application to robot control ... 816
    Summary ... 818
    Bibliographical and Historical Notes ... 819

Part VI: Communicating, perceiving, and acting

Chapter 23   Natural Language Processing ... 823

    23.1   Language Models ... 823
        23.1.1   The bag-of-words model ... 824
        23.1.2   N-gram word models ... 826
        23.1.3   Other n-gram models ... 826
        23.1.4   Smoothing n-gram models ... 827
        23.1.5   Word representations ... 828
        23.1.6   Part-of-speech (POS) tagging ... 829
        23.1.7   Comparing language models ... 832
    23.2   Grammar ... 833
        23.2.1   The lexicon of E0 ... 835
    23.3   Parsing ... 835
        23.3.1   Dependency parsing ... 838
        23.3.2   Learning a parser from examples ... 839
    23.4   Augmented Grammars ... 841
        23.4.1   Semantic interpretation ... 843
        23.4.2   Learning semantic grammars ... 845
    23.5   Complications of Real Natural Language ... 845
    23.6   Natural Language Tasks ... 849
    Summary ... 850
    Bibliographical and Historical Notes ... 851

Chapter 24   Deep Learning for Natural Language Processing ... 856

    24.1   Word Embeddings ... 856
    24.2   Recurrent Neural Networks for NLP ... 860
        24.2.1   Language models with recurrent neural networks ... 860
        24.2.2   Classification with recurrent neural networks ... 862
        24.2.3   LSTMs for NLP tasks ... 863
    24.3   Sequence-to-Sequence Models ... 864
        24.3.1   Attention ... 865
        24.3.2   Decoding ... 867
    24.4   The Transformer Architecture ... 868
        24.4.1   Self-attention ... 868
        24.4.2   From self-attention to transformer ... 869
    24.5   Pretraining and Transfer Learning ... 871
        24.5.1   Pretrained word embeddings ... 871
        24.5.2   Pretrained contextual representations ... 873
        24.5.3   Masked language models ... 873
    24.6   State of the art ... 875
    Summary ... 878
    Bibliographical and Historical Notes ... 878

Chapter 25   Computer Vision ... 881

    25.1   Introduction ... 881
    25.2   Image Formation ... 882
        25.2.1   Images without lenses: The pinhole camera ... 882
        25.2.2   Lens systems ... 884
        25.2.3   Scaled orthographic projection ... 885
        25.2.4   Light and shading ... 886
        25.2.5   Color ... 888
    25.3   Simple Image Features ... 888
        25.3.1   Edges ... 889
        25.3.2   Texture ... 892
        25.3.3   Optical flow ... 893
        25.3.4   Segmentation of natural images ... 894
    25.4   Classifying Images ... 895
        25.4.1   Image classification with convolutional neural networks ... 896
        25.4.2   Why convolutional neural networks classify images well ... 897
    25.5   Detecting Objects ... 899
    25.6   The 3D World ... 901
        25.6.1   3D cues from multiple views ... 901
        25.6.2   Binocular stereopsis ... 902
        25.6.3   3D cues from a moving camera ... 904
        25.6.4   3D cues from one view ... 905
    25.7   Using Computer Vision ... 906
        25.7.1   Understanding what people are doing ... 906
        25.7.2   Linking pictures and words ... 909
        25.7.3   Reconstruction from many views ... 911
        25.7.4   Geometry from a single view ... 911
        25.7.5   Making pictures ... 913
        25.7.6   Controlling movement with vision ... 917
    Summary ... 919
    Bibliographical and Historical Notes ... 920

Chapter 26   Robotics ... 925

    26.1   Robots ... 925
    26.2   Robot Hardware ... 926
        26.2.1   Types of robots from the hardware perspective ... 926
        26.2.2   Sensing the world ... 927
        26.2.3   Producing motion ... 929
    26.3   What kind of problem is robotics solving? ... 930
    26.4   Robotic Perception ... 931
        26.4.1   Localization and mapping ... 932
        26.4.2   Other types of perception ... 935
        26.4.3   Supervised and unsupervised learning in robot perception ... 937
    26.5   Planning and Control ... 938
        26.5.1   Configuration space ... 939
        26.5.2   Motion planning ... 942
            Visibility graphs ... 943
            Voronoi diagrams ... 943
            Cell decomposition ... 945
            Randomized motion planning ... 946
            Rapidly-exploring random trees ... 947
            Trajectory optimization for kinematic planning ... 948
        26.5.3   Trajectory tracking control ... 950
            Plans versus policies ... 954
        26.5.4   Optimal control ... 954
    26.6   Planning Uncertain Movements ... 956
    26.7   Reinforcement Learning in Robotics ... 958
        26.7.1   Exploiting models ... 959
        26.7.2   Exploiting other information ... 960
    26.8   Humans and Robots ... 961
        26.8.1   Coordination ... 961
            Humans as approximately rational agents ... 961
            Predicting human action ... 962
            Human predictions about the robot ... 964
            Humans as black box agents ... 965
        26.8.2   Learning to do what humans want ... 965
            Preference learning: Learning cost functions ... 965
            Learning policies directly via imitation ... 966
    26.9   Alternative Robotic Frameworks ... 968
        26.9.1   Reactive controllers ... 968
        26.9.2   Subsumption architectures ... 969
    26.10   Application Domains ... 971
    Summary ... 974
    Bibliographical and Historical Notes ... 975

Part VII: Conclusions

Chapter 27   Philosophy, Ethics, and Safety of AI ... 981

    27.1   The Limits of AI ... 981
        27.1.1   The argument from informality ... 981
        27.1.2   The argument from disability ... 982
        27.1.3   The mathematical objection ... 983
        27.1.4   Measuring AI ... 984
    27.2   Can Machines Really Think? ... 984
        27.2.1   The Chinese room ... 985
        27.2.2   Consciousness and qualia ... 985
    27.3   The Ethics of AI ... 986
        27.3.1   Lethal autonomous weapons ... 987
        27.3.2   Surveillance, security, and privacy ... 990
        27.3.3   Fairness and bias ... 992
        27.3.4   Trust and transparency ... 996
        27.3.5   The future of work ... 998
        27.3.6   Robot rights ... 1000
        27.3.7   AI Safety ... 1001
    Summary ... 1005
    Bibliographical and Historical Notes ... 1006

Chapter 28   The Future of AI ... 1012

    28.1   AI Components ... 1012
            Sensors and actuators ... 1012
            Representing the state of the world ... 1013
            Selecting actions ... 1014
            Deciding what we want ... 1014
            Learning ... 1015
            Resources ... 1017
    28.2   AI Architectures ... 1018
            General AI ... 1020
            AI engineering ... 1021
            The future ... 1022

Appendix A:   Mathematical Background ... 1023

    A.1   Complexity Analysis and O() Notation ... 1023
        A.1.1   Asymptotic analysis ... 1023
        A.1.2   NP and inherently hard problems ... 1024
    A.2   Vectors, Matrices, and Linear Algebra ... 1025
    A.3   Probability Distributions ... 1027
    Bibliographical and Historical Notes ... 1029

Appendix B:   Notes on Languages and Algorithms ... 1030

    B.1   Defining Languages with Backus--Naur Form (BNF) ... 1030
    B.2   Describing Algorithms with Pseudocode ... 1031
    B.3   Online Supplemental Material ... 1032

Bibliography ... 1033

Index ... 1089