From the TED Talk by Brian Christian: How to get better at video games, according to babies


Unscramble the Blue Letters


Spoiler alert: babies. We’ll come back to that in a minute.

Playing Atari games with AI involves what’s called reinforcement lanienrg, where the system is designed to maximize some kind of numerical rewards. In this case, those rewards were simply the game's points. This underlying goal dievrs the system to learn which buttons to press and when to press them to get the most points. Some systems use model-based aehppracos, where they have a model of the environment that they can use to predict what will happen next once they take a certain aicton. DQN, however, is model free. Instead of eitpicxlly modeling its eoinrnenvmt, it just lnraes to predict, based on the images on screen, how many future ptnios it can expect to earn by pressing different buttons. For instance, “if the ball is here and I move left, more points, but if I move right, no more points.”

Open Cloze


Spoiler alert: babies. We’ll come back to that in a minute.

Playing Atari games with AI involves what’s called reinforcement ________, where the system is designed to maximize some kind of numerical rewards. In this case, those rewards were simply the game's points. This underlying goal ______ the system to learn which buttons to press and when to press them to get the most points. Some systems use model-based __________, where they have a model of the environment that they can use to predict what will happen next once they take a certain ______. DQN, however, is model free. Instead of __________ modeling its ___________, it just ______ to predict, based on the images on screen, how many future ______ it can expect to earn by pressing different buttons. For instance, “if the ball is here and I move left, more points, but if I move right, no more points.”

Solution


  1. learning
  2. drives
  3. approaches
  4. action
  5. explicitly
  6. environment
  7. learns
  8. points

Original Text


Spoiler alert: babies. We’ll come back to that in a minute.

Playing Atari games with AI involves what’s called reinforcement learning, where the system is designed to maximize some kind of numerical rewards. In this case, those rewards were simply the game's points. This underlying goal drives the system to learn which buttons to press and when to press them to get the most points. Some systems use model-based approaches, where they have a model of the environment that they can use to predict what will happen next once they take a certain action. DQN, however, is model free. Instead of explicitly modeling its environment, it just learns to predict, based on the images on screen, how many future points it can expect to earn by pressing different buttons. For instance, “if the ball is here and I move left, more points, but if I move right, no more points.”
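
The model-free idea in the passage can be illustrated with a small sketch. This is not DQN itself: DQN uses a neural network that reads the images on screen, while the sketch below swaps in a plain lookup table (tabular Q-learning) on a made-up game so it stays short. The game, the names `step` and `train`, and all parameter values are assumptions chosen for illustration. The learner never builds a model of the game; it only learns to predict how many future points each button press is worth from each position.

```python
import random

# Hypothetical toy game: a paddle sits at positions 0..4 and the ball always
# lands at position 0. Reaching position 0 scores 1 point and ends the round.
N_POSITIONS = 5
LEFT, RIGHT = 0, 1


def step(position, action):
    """Return (next_position, reward, done) for the toy game."""
    move = -1 if action == LEFT else 1
    next_position = min(max(position + move, 0), N_POSITIONS - 1)
    reward = 1.0 if next_position == 0 else 0.0
    return next_position, reward, next_position == 0


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn q[position][action]: predicted future points for each button."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_POSITIONS)]
    for _ in range(episodes):
        position = rng.randrange(1, N_POSITIONS)
        for _ in range(50):  # cap the length of a round
            # Mostly press the button with the higher prediction,
            # but sometimes press a random one to keep exploring.
            if rng.random() < epsilon:
                action = rng.choice((LEFT, RIGHT))
            else:
                action = LEFT if q[position][LEFT] >= q[position][RIGHT] else RIGHT
            next_position, reward, done = step(position, action)
            # Target: points earned now plus discounted predicted future points.
            target = reward if done else reward + gamma * max(q[next_position])
            q[position][action] += alpha * (target - q[position][action])
            position = next_position
            if done:
                break
    return q


if __name__ == "__main__":
    for position, (left, right) in enumerate(train()):
        print(position, round(left, 3), round(right, 3))
```

After training, the prediction for moving left should be higher than the prediction for moving right at every starting position, which mirrors the passage's example: "if I move left, more points, but if I move right, no more points."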

Important Words


  1. action
  2. ai
  3. approaches
  4. atari
  5. babies
  6. ball
  7. based
  8. buttons
  9. called
  10. case
  11. designed
  12. dqn
  13. drives
  14. earn
  15. environment
  16. expect
  17. explicitly
  18. free
  19. future
  20. games
  21. goal
  22. happen
  23. images
  24. instance
  25. involves
  26. kind
  27. learn
  28. learning
  29. learns
  30. left
  31. maximize
  32. minute
  33. model
  34. modeling
  35. move
  36. numerical
  37. playing
  38. points
  39. predict
  40. press
  41. pressing
  42. reinforcement
  43. rewards
  44. screen
  45. simply
  46. spoiler
  47. system
  48. systems
  49. underlying