full transcript
From the Ted Talk by Brian Christian: How to get better at video games, according to babies
Unscramble the Blue Letters
Spoiler alert: babies. We’ll come back to that in a minute.
Playing Atari games with AI involves what’s called reinforcement lanienrg, where the system is designed to maximize some kind of numerical rewards. In this case, those rewards were simply the game's points. This underlying goal dievrs the system to learn which buttons to press and when to press them to get the most points. Some systems use model-based aehppracos, where they have a model of the environment that they can use to predict what will happen next once they take a certain aicton. DQN, however, is model free. Instead of eitpicxlly modeling its eoinrnenvmt, it just lnraes to predict, based on the images on screen, how many future ptnios it can expect to earn by pressing different buttons. For instance, “if the ball is here and I move left, more points, but if I move right, no more points.”
Open Cloze
Spoiler alert: babies. We’ll come back to that in a minute.
Playing Atari games with AI involves what’s called reinforcement ________, where the system is designed to maximize some kind of numerical rewards. In this case, those rewards were simply the game's points. This underlying goal ______ the system to learn which buttons to press and when to press them to get the most points. Some systems use model-based __________, where they have a model of the environment that they can use to predict what will happen next once they take a certain ______. DQN, however, is model free. Instead of __________ modeling its ___________, it just ______ to predict, based on the images on screen, how many future ______ it can expect to earn by pressing different buttons. For instance, “if the ball is here and I move left, more points, but if I move right, no more points.”
Solution
- learns
- environment
- drives
- explicitly
- learning
- points
- action
- approaches
Original Text
Spoiler alert: babies. We’ll come back to that in a minute.
Playing Atari games with AI involves what’s called reinforcement learning, where the system is designed to maximize some kind of numerical rewards. In this case, those rewards were simply the game's points. This underlying goal drives the system to learn which buttons to press and when to press them to get the most points. Some systems use model-based approaches, where they have a model of the environment that they can use to predict what will happen next once they take a certain action. DQN, however, is model free. Instead of explicitly modeling its environment, it just learns to predict, based on the images on screen, how many future points it can expect to earn by pressing different buttons. For instance, “if the ball is here and I move left, more points, but if I move right, no more points.”
Frequently Occurring Word Combinations
Important Words
- action
- ai
- approaches
- atari
- babies
- ball
- based
- buttons
- called
- case
- designed
- dqn
- drives
- earn
- environment
- expect
- explicitly
- free
- future
- games
- goal
- happen
- images
- instance
- involves
- kind
- learn
- learning
- learns
- left
- maximize
- minute
- model
- modeling
- move
- numerical
- playing
- points
- predict
- press
- pressing
- reinforcement
- rewards
- screen
- simply
- spoiler
- system
- systems
- underlying