The idea behind reinforcement learning is you don't necessarily

The idea behind reinforcement learning is you don't necessarily know the actions you might take, so you explore the sequence of actions you should take by taking one that you think is a good idea and then observing how the world reacts. Like in a board game where you can react to how your opponent plays.

- Jeff Dean -


In the counsel of the elders we hear a new parable for our age: “The idea behind reinforcement learning” is that a traveler sets forth without a perfect map. The paths of actions are not yet known; they are discovered by walking. One chooses a step that seems wise, and then listens—deeply—to how the world reacts. Thus the way is carved by trial and error, as a river learns the bend of the earth not from prophecy but from flowing. In this saying, Jeff Dean gathers the logic of machines and the intuition of wanderers into one teaching.

The ancients would nod: wisdom is forged where experience strikes choice like flint on steel. In reinforcement learning, a seeker observes the state of things, takes an action, receives a reward or a rebuke, and adjusts the policy—the inner rule for choosing—so the next step is wiser. This is not the brittle certainty of a script; it is the flexible courage of a sailor trimming his sail to the shifting wind. The lesson is humble and heroic at once: do not demand that the world be simple; become skillful at reading its reply.
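The cycle described above (observe the state, act, receive a reward, adjust the policy) can be sketched in a few lines. What follows is only an illustration under stated assumptions, not a method taken from Dean's remarks: the five-cell corridor, the reward of 1.0 at the far end, the function names `train_corridor` and `greedy_policy`, and every parameter value are hypothetical. The update rule is standard tabular Q-learning.

```python
import random


def train_corridor(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                   epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical corridor.

    The traveler starts in cell 0 and earns a reward of 1.0 only on
    reaching the last cell. All names and numbers are illustrative.
    """
    rng = random.Random(seed)
    moves = (-1, +1)                      # action 0 = left, action 1 = right
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Choose: usually the action that currently looks best,
            # sometimes a random one so that unknown paths get tried.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                ties = [i for i, v in enumerate(q[state]) if v == best]
                action = rng.choice(ties)

            # Act, then observe how the world replies.
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0

            # Adjust: nudge the estimate toward reward plus discounted
            # value of the best action available in the next state.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Read off the preferred move in every non-goal cell."""
    return ["left" if row[0] > row[1] else "right" for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train_corridor()))
```

Nothing in the code tells the traveler which way the goal lies; the preference for one direction emerges only from repeated replies of the environment.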

Consider the image of the board game that Dean invokes. Across the table sits an opponent, not as an enemy but as a mirror held by fate. You move a stone; the board replies; your sequence of actions takes shape under pressure. In play, as in life, the plan survives only if it can be revised. Strategy, then, is not a monument but a dance: sense the feedback, revise the policy, seek the next best move, again and again, until the pattern of victory reveals itself.

We have seen this parable written in our own century upon the grid of 19 lines by 19 lines. When AlphaGo faced Lee Sedol, the program did not triumph through a fixed script but through countless rehearsals of exploration and reward, shaping a sense of value for positions no master had fully charted. Move 37—quiet as snowfall, shocking as thunder—was not an oracle’s decree; it was the fruit of a policy trained to trust unlikely actions when the hidden returns were rich. And when Lee answered with his own brilliant hand in Game 4, it was the human echo of the same law: attend to the world’s reply, adapt your line, discover the move that was invisible before. Thus machine and grandmaster together testified to the old-new wisdom Dean describes.

Yet this teaching is older than silicon. The craftsman who perfects a blade, reheating and quenching until it sings; the physician who tests a remedy, measuring rewards in steady pulses; the general who scouts the ground and feints to learn where the enemy leans—each practices reinforcement learning in flesh and time. They do not cling to first guesses. They cultivate a rhythm: act, observe, update. This is the drumbeat beneath every durable triumph.

What, then, is the heart of the saying? It is the courage to explore before we exploit, to seek knowledge not only by thinking but by doing, to let feedback correct pride, and to let small rewards accumulate into great gains. The world is not a scroll to be read once; it is an opponent who plays back. If we harden into certainty, we grow brittle; if we listen, we grow strong. The wise sharpen their policy the way a gardener prunes a tree—cutting what withers so that what lives may bear fruit.
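The balance between exploring and exploiting has a classic small-scale form, the multi-armed bandit. The sketch below is a hedged illustration rather than anything from the quoted source: the function `run_bandit`, the three payout probabilities, and the exploration rate `epsilon` are all assumptions chosen for the example. With probability `epsilon` the learner tries a random arm; otherwise it pulls the arm whose average reward so far is highest.

```python
import random


def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy play on a hypothetical Bernoulli bandit.

    true_means holds each arm's (hidden) probability of paying 1.0.
    Returns how often each arm was pulled and the learned estimates.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick any arm, even one that looks poor so far.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best running average.
            arm = max(range(n_arms), key=lambda i: estimates[i])

        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental mean: let the feedback correct the estimate.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return counts, estimates


if __name__ == "__main__":
    counts, estimates = run_bandit([0.2, 0.5, 0.8])
    print("pulls per arm:", counts)
    print("estimated payouts:", [round(e, 2) for e in estimates])
```

Setting `epsilon` to 0 removes exploration entirely, and the learner can then settle on whichever arm happened to pay first; a small nonzero value keeps a little curiosity alive at a modest cost in reward.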

Take this lesson to your own road. First, define your state: name where you are without flattery. Second, choose a modest action you judge promising. Third, observe the world’s reaction with honesty. Fourth, update your policy—change your rule for choosing—and repeat. In practice: (1) set one measurable goal per week, (2) try one new move that could raise your reward (a sales script, a study tactic, a training routine), (3) record results the same day, and (4) keep only what improves the score by your own true metric. In this way you will convert chance into learning, learning into mastery, and mastery into service. For the path is walked as it is found, and the game is won one answer to the world at a time.

Jeff Dean

American - Computer Scientist

More from the author

Traditionally computers have not been that good at interacting with people in ways that people feel natural interacting with.

Jeff Dean

I think that is one of the main goals of pushing forward in machine learning: having computers provide the wisdom that a human companion would be able to provide in offering advice, looking up more information when necessary and those kinds of things.

Jeff Dean

AI can help solve some of the most difficult social and environmental challenges in areas like healthcare, disaster prediction, environmental conservation, agriculture, or cultural preservation.

Jeff Dean

Computers don't usually have a sense of if you have a picture of something what is in that image. And if we can do a good job of understanding what is in an image, that can bring along a lot of new things you can do in applications.

Jeff Dean

I think true artificial general intelligence would be a system that is able to perform human-level reasoning, understanding, and accomplishing of complicated tasks.

Jeff Dean

Reinforcement learning is the idea of being able to assign credit or blame to all the actions you took along the way while you were getting that reward signal.

Jeff Dean

Computers can see, and understand what people say via speech recognition.

Jeff Dean

Same category

Waiting is a period of learning. The longer we wait, the more we hear about him for whom we are waiting.

Henri Nouwen

People want stardom or fame or whatever - instant gratification as opposed to learning one's craft, which, when I was starting out, was the most important thing: that you are as fully equipped for your job or your art as possible.

Joshua Sasse

The greatest gift that Oxford gives her sons is, I truly believe, a genial irreverence toward learning, and from that irreverence love may spring.

Robertson Davies

Learning never exhausts the mind.

Leonardo da Vinci

When I was learning to creep, my mother set me down on the beach to see what I thought of it. I crawled straight for the coming wave and was just through the wall of green when she caught my heels.

Sylvia Plath

I know from my days working on education reform in government that it's almost impossible to exaggerate how little those who work on education policy think about 'how to improve learning.'

Dominic Cummings

Until Ranveer was born in August 2005, three years into our marriage, I was working in Hindi or South Indian films. After marriage, I began learning how to run a house. My mother wanted to teach me the basics, but I was never home. So when my mother-in-law taught me chores, it was hard to adjust.

Sonali Bendre

One of our key strategies has been to restructure traditional high schools into small learning communities with personalized attention and a range of options.

Thomas Menino

Related Topics


Liam Neeson Quotes

Mark Twain Quotes

Conrad Hall Quotes

Brett Young Quotes

Judd Gregg Quotes

Parenting Quotes

Shane Leslie Quotes

Ryan Holiday Quotes

Bobby Seale

Zachary Taylor

Philip Rosenthal

Marcus Samuelsson

Yitzhak Navon

KRS-One

Tulsi Kumar

Morten Andersen