What Is Google’s New ‘Reinforcement Learning Algorithm’?

Rory Hope ● June 25, 2018 | Blog, SEO

In May 2018, Google Research published a research paper titled “Ask the right questions: Active question reformulation with reinforcement learning”. It describes a Reinforcement Learning algorithm with a new system for answering queries.

Google calls this approach ‘Active Question Answering’. It uses an agent, known as an ‘AQA agent’, that acts as an intermediary between the user and a ‘black box’ QA system (Google refers to the QA system as ‘the environment’).

This AQA agent employs an ‘active question answering’ strategy, which aims to increase the chance of providing the correct answer to a query by sending the ranking algorithm reformulated versions of the question. Essentially, it improves Google’s ability to judge how relevant content is to a search query.

This algorithm is likely to change the impact of traditional ranking factors. Google Zurich’s Jannis Bulian and Neil Houlsby discussed this new framework, which uses ‘deep reinforcement learning’ to ask the right questions, at the International Conference on Learning Representations 2018.

As stated in the introduction, the research paper details an approach to query reformulation which uses a new method of presenting queries to a ranking engine.

This machine learning algorithm uses ‘Reinforcement Learning’. The first component of the AQA agent is a sequence-to-sequence model trained with reinforcement learning: the agent receives a reward based on the answer returned by the environment (the backend QA system).
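To make the idea of a reward concrete, here is a minimal sketch of how an answer could be scored. It assumes a token-overlap F1 score against a known correct answer, which is a common way to evaluate QA systems; the function name `token_f1` and the whitespace tokenisation are illustrative choices, not details taken from Google’s code.

```python
from collections import Counter


def token_f1(predicted: str, gold: str) -> float:
    """Score a returned answer against a reference answer (0.0 to 1.0).

    Assumption: simple lowercase, whitespace tokenisation. A real system
    would likely normalise punctuation and articles as well.
    """
    pred_tokens = predicted.lower().split()
    gold_tokens = gold.lower().split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

During training, a score like this would be fed back to the agent after each reformulated question, so reformulations that lead the environment to better answers are reinforced.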

The second component of the ‘active question answering’ agent then combines the evidence gathered from its interactions with the environment, using a ‘convolutional neural network’ to select an answer.

The algorithm has no prior knowledge of how a ranking system should function. This ‘black box algorithm’ uses a learning system that reformulates the user query, and asks the ranking engine numerous questions, to then select the most suitable answers from many sets of answers.

So, for example, the user inputs a question. The machine learning algorithm then reformulates that question into multiple questions to put to the ranking algorithm. The ranking algorithm returns sets of results, and the AQA agent selects the most suitable answer, completing the process of ‘active question answering’.
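The flow described above can be sketched in a few lines of Python. Everything here is a toy stand-in: `reformulate` uses hand-written rules in place of the learned sequence-to-sequence model, `toy_environment` is a hypothetical black-box QA system backed by a two-entry lookup table, and the selection step simply takes the highest-scoring answer rather than using a convolutional neural network.

```python
from typing import List, Tuple

# Hypothetical data for the toy environment; not from the paper.
STOPWORDS = {"what", "is", "the", "of", "a", "who", "in"}
KNOWLEDGE = {
    "capital france": "Paris",
    "capital germany": "Berlin",
}


def reformulate(question: str) -> List[str]:
    """Stand-in for the learned reformulation model: return several rewrites."""
    stripped = question.rstrip("?")
    keywords = " ".join(
        word for word in stripped.split() if word.lower() not in STOPWORDS
    )
    return [question, stripped, keywords]


def toy_environment(query: str) -> Tuple[str, float]:
    """Stand-in for the black-box QA system: return (answer, confidence)."""
    query_words = set(query.lower().rstrip("?").split())
    best_answer, best_score = "", 0.0
    for key, answer in KNOWLEDGE.items():
        key_words = set(key.split())
        # Jaccard overlap between the query and the stored key.
        score = len(query_words & key_words) / len(query_words | key_words)
        if score > best_score:
            best_answer, best_score = answer, score
    return best_answer, best_score


def active_answer(question: str) -> str:
    """Ask every reformulation, then keep the most confident answer."""
    candidates = [toy_environment(query) for query in reformulate(question)]
    best_answer, _ = max(candidates, key=lambda pair: pair[1])
    return best_answer


print(active_answer("What is the capital of France?"))
```

In this toy version the keyword-only rewrite (“capital France”) matches the stored key most closely, so its answer wins. The point is only to show the shape of the loop: one user question goes in, several reformulated questions reach the environment, and a single answer comes back out.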

Google states that this new algorithm approach is inspired by a human’s ability to ask the right questions through reformulation.

Google’s research paper specifies: “In the face of complex information needs, humans overcome uncertainty by reformulating questions, issuing multiple searches, and aggregating responses. Inspired by humans’ ability to ask the right questions, we present an agent that learns to carry out this process for the user. The agent sits between the user and a backend QA system that we refer to as ‘the environment’.”

The ‘active question answering’ agent manages machine-to-machine communication: it adapts the language of a search query to improve the response from another system, the QA environment.

Google’s new Reinforcement Learning algorithm will be positioned between the user and the regular ranking algorithm.

The key takeaway here is that the regular ranking algorithm is no longer deciding what to show in the SERPs. Instead, the new Reinforcement Learning algorithm does so via the AQA agent, which aims to make smarter decisions when providing answers.

The Reinforcement Learning algorithm will take into account traditional ranking factors like links, content relevancy and UX signals, but then decide whether those traditional signals are important to providing the best answer to the query. That decision will no doubt vary considerably depending on the keyword segment or topical sphere.

Google’s qualitative analysis found that the AQA agent’s question reformulations ‘diverge significantly from natural language paraphrases’, and that the agent is able to learn non-trivial and transparent policies.

This is significant for Google as it incentivises relevance, and enables more than deep language understanding.

We have heard Google use the term ‘relevance’ often, for most core algorithm updates in fact. Check out this article I wrote on Google’s content relevance ranking factor last year.

This Reinforcement Learning algorithm is likely going to improve the accuracy of providing the most relevant answer for a searcher, meaning SEOs will need to focus a greater amount of resource on maximising the relevance of their content.
