Reinforcement Learning

Reinforcement learning refers to goal-oriented algorithms, which learn how to attain a complex objective (goal) or maximize along a particular dimension over many steps; for example, maximize the points won in a game over many moves. They can start from a blank slate, and under the right conditions they achieve superhuman performance. Like a child incentivized by spankings and candy, these algorithms are penalized when they make the wrong decisions and rewarded when they make the right ones – this is reinforcement.
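The reward-and-penalty idea above can be sketched in a few lines of Python. This is only an illustrative toy, not any particular published algorithm: the function name `train_agent`, the two action labels, and the parameters `epsilon` and `learning_rate` are all assumptions made for this example. The agent starts from a blank slate (both value estimates at zero), is rewarded with +1 for the "right" action and penalized with -1 for the "wrong" one, and gradually comes to prefer the rewarded action.

```python
import random


def train_agent(steps=500, epsilon=0.1, learning_rate=0.1, seed=0):
    """Toy learner: estimate the value of two actions from rewards and penalties."""
    rng = random.Random(seed)
    # Blank slate: the agent has no preference at the start.
    values = {"wrong": 0.0, "right": 0.0}
    for _ in range(steps):
        if rng.random() < epsilon:
            # Occasionally explore a random action.
            action = rng.choice(list(values))
        else:
            # Otherwise pick the action currently believed to be best.
            action = max(values, key=values.get)
        # Hypothetical environment: reward the right decision, penalize the wrong one.
        reward = 1.0 if action == "right" else -1.0
        # Move the estimate a small step toward the observed reward.
        values[action] += learning_rate * (reward - values[action])
    return values


if __name__ == "__main__":
    print(train_agent())
```

After training, the estimate for "right" should sit close to +1 and the estimate for "wrong" should be at or below zero, so the greedy choice is the rewarded action.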

Reinforcement learning solves the difficult problem of correlating immediate actions with the delayed returns they produce. Like humans, reinforcement learning algorithms sometimes have to wait a while to see the fruit of their decisions. They operate in a delayed return environment, where it can be difficult to understand which action leads to which outcome over many time steps.
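One common way to connect an early action with a reward that only arrives later is the discounted return, where each step is credited with its own reward plus a shrinking share of everything that follows. The sketch below assumes that standard formulation; the function name `discounted_returns` and the discount factor `gamma` are illustrative choices, not something taken from the text above.

```python
def discounted_returns(rewards, gamma=0.9):
    """Compute G_t = r_t + gamma * G_{t+1} for every time step t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    # The only reward arrives at the last step, yet earlier steps get partial credit.
    print(discounted_returns([0.0, 0.0, 0.0, 1.0], gamma=0.5))
    # prints [0.125, 0.25, 0.5, 1.0]
```

The earlier an action is relative to the reward, the smaller its share of the credit, which is one simple way of expressing how hard it is to tell which action led to which outcome.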

Reinforcement learning algorithms can be expected to perform better and better in more ambiguous, real-life environments, where they choose from an arbitrary number of possible actions rather than from the limited options of a video game.

Google AlphaZero: From the game of “Go” to the game of Elections

Google AlphaZero is “a single algorithm that taught itself from scratch how to master the games of chess, shogi and Go, convincingly beating a world champion program in each case. AlphaZero’s ability to learn each game by itself results in a distinctive, creative and dynamic playing style.”

It is this kind of reinforcement learning algorithm that solved the problem of building an optimization loop for human behavior (behavior modification).

We saw how AlphaZero started learning from scratch and in a very short time became the best player, winning against the best players in the world in one of the most complex games that exist. Similar algorithms are now used to play with our psychology and with our political views. We are entering a post-democratic world where technology has enabled complete control of democratic processes and of the psychology of entire nations.

Mental state observation (political opinions): From the 2013 research paper “Private traits and attributes are predictable from digital records of human behavior”, we learned how much information Facebook has about all of us. We also learned that Facebook can infer our political views from that data. The same is true of Google and other companies that develop artificial intelligence: the amount of data they hold about us is vast.

Information management: Management of the information that is shown to us, amplification, the manipulation of likes and dislikes …

Reinforcement learning loop used in the latest Brazilian elections

The same model and algorithm can be used for many other purposes, for example the spread of radicalism and secessionism or the destabilization of states; the possibilities are endless. Even a nuclear war could be a consequence if this technology were used to spread hatred between two nuclear states. This shows how dangerous artificial intelligence can be.

HENRY A. KISSINGER  – JUNE 2018 – in the text on artificial intelligence…

“The internet’s purpose is to ratify knowledge through the accumulation and manipulation of ever expanding data. Human cognition loses its personal character. Individuals turn into data, and data become regnant.”

