Buenos Aires, June 2016. Eduardo Poggi

Poggi analytics - ebl - 1



Page 1: Poggi   analytics - ebl - 1


Page 2: Poggi   analytics - ebl - 1

Agenda

EBL
Inductive vs Analytical learning
Perfect Domain Theories
Examples
Problems
Summary

Page 3: Poggi   analytics - ebl - 1

¿EBL?

Learning general problem-solving techniques by observing and analyzing human solutions to specific problems.

EBL attempts to formulate a generalization after observing only a single example (or a few).

Introduced by Gerald De Jong in 1981. Originally applied to robotics and problem solving.

Page 4: Poggi   analytics - ebl - 1

The EBL Hypothesis

EBL is based on the hypothesis that an intelligent system can learn a general concept after observing only a single example.

By understanding why an example is a member of a concept, the system can learn the essential properties of the concept.

EBL uses prior knowledge to analyze or explain each training example in order to infer what properties are relevant to the target function and which are irrelevant.

Page 5: Poggi   analytics - ebl - 1

Learning by Generalizing Explanations

Given:
Goal concept (e.g., some predicate calculus statement)
Training example (facts)
Domain theory (inference rules)
Operationality criterion

Given these four inputs, the task is to determine a generalization of the training example that is a sufficient concept definition for the goal concept and that satisfies the operationality criterion.

The operationality criterion requires that the final concept definition be described in terms of the predicates used to describe the training example.
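The four inputs above can be sketched as plain data; a minimal illustration assuming a tuple encoding of literals, with predicate names taken from the SafeToStack example used later in the deck:

```python
# Minimal sketch of the four EBL inputs, encoding each literal as a
# tuple: (predicate, arg1, arg2, ...). The encoding is an assumption.

goal_concept = ("SafeToStack", "x", "y")

training_example = [
    ("On", "Obj1", "Obj2"),
    ("Type", "Obj2", "Endtable"),
    ("Density", "Obj1", 0.3),
    ("Volume", "Obj1", 2),
]

# Domain theory: (head, body) pairs read as "head <- body".
domain_theory = [
    (("Lighter", "x", "y"),
     [("Weight", "x", "wx"), ("Weight", "y", "wy"), ("LessThan", "wx", "wy")]),
]

def operational(literal, example):
    """Operationality criterion: the literal's predicate must be one of
    the predicates used to describe the training example."""
    return literal[0] in {fact[0] for fact in example}
```

Under this criterion, Density(x, d) is operational while Lighter(x, y) is not, since Lighter never appears among the example's facts.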

Page 6: Poggi   analytics - ebl - 1

Learning algorithms like neural networks, decision trees, inductive logic programming, etc. require a large number of examples to make good predictions.

dataset → Learning Algorithm → hypothesis

The dataset must be sufficiently large.

Inductive learning

Page 7: Poggi   analytics - ebl - 1

We don't need a large dataset if, besides the examples, the learning algorithm can also take prior knowledge as input.

dataset + prior knowledge → Learning Algorithm → hypothesis

The dataset does not need to be large.

Analytical learning

Page 8: Poggi   analytics - ebl - 1

Prior knowledge is used to reduce the size of the hypothesis space.

It analyzes each example to infer which features are relevant and which ones are irrelevant.

Hypothesis space HS → (prior knowledge) → reduced HS

Explanation-based learning

Page 9: Poggi   analytics - ebl - 1

Example, learning to play chess

Suppose we want to learn a concept like “what is a board position in which black will lose the queen in x moves?”.

Chess is a complex game. Each piece can occupy many positions.

We would need many examples to learn this concept.

But humans can learn this type of concept with very few examples. Why?

Page 10: Poggi   analytics - ebl - 1

Example, learning to play chess

Humans can analyze an example and use prior knowledge related to legal moves.

From there, they can generalize from only a few examples.

Reasoning like: "Because white is attacking both the black king and queen, black must avoid check, letting white capture the queen."

Page 11: Poggi   analytics - ebl - 1

Example, learning to play chess

What is the prior knowledge involved in playing chess?

It is knowledge about the rules of chess:
Legal moves for the pieces.
Players alternate moves in games.
To win you must capture the opponent's king.

Page 12: Poggi   analytics - ebl - 1

Inductive learning
Input: HS, D
Output: hypothesis h such that h is consistent with D

Analytical learning
Input: HS, D, BK
Output: hypothesis h such that h is consistent with D and BK (¬h is not inferred from BK)

Where:
HS: hypothesis space
D: training set
BK: background knowledge (domain theory)

Inductive and Analytical Learning

Page 13: Poggi   analytics - ebl - 1

Standard Approach to EBL

Before learning: facts → an explanation (detailed proof of the goal) → goal

After learning (go directly from facts to the solution): facts → goal

Page 14: Poggi   analytics - ebl - 1

The EBL Process

Page 15: Poggi   analytics - ebl - 1

Example: Analytical Learning

Examples: a dataset where each instance is a pair of objects represented by the following predicates: Color, Volume, Owner, Material, Density, On. Example:

On(Obj1,Obj2), Type(Obj1,Box), Owner(Obj2,Louise), Density(Obj1,0.3), Color(Obj1,Red), Material(Obj1,Cardboard), Type(Obj2,Endtable), Color(Obj2,Blue), Material(Obj2,Wood), Volume(Obj1,2), Owner(Obj1,Fred)

Page 16: Poggi   analytics - ebl - 1

Example: Analytical Learning

Domain theory:

SafeToStack(x,y) <- ¬Fragile(y)
SafeToStack(x,y) <- Lighter(x,y)
Lighter(x,y) <- Weight(x,wx) ^ Weight(y,wy) ^ LessThan(wx,wy)
Weight(p,w) <- Density(p,d) ^ Volume(p,v) ^ Equal(w, times(d,v))
…
Fragile(x) <- Material(x,Glass)

Page 17: Poggi   analytics - ebl - 1

Example: Analytical Learning

Hypothesis space: the set of Horn clause rules. The head of each rule has the predicate SafeToStack. The body of each rule is built from the predicates of the instances and the predicates LessThan, Equal, Greater and the functions plus, minus and times. Example:

SafeToStack(x,y) <- Volume(x,vx) ^ Volume(y,vy) ^ LessThan(vx,vy)

Page 18: Poggi   analytics - ebl - 1

Example: Analytical Learning

Determine: a hypothesis consistent with both the training examples and the domain theory.

Notes:

The domain theory refers to predicates not contained in the examples.

The domain theory is sufficient to prove that the example is true.

Page 19: Poggi   analytics - ebl - 1

Perfect Domain Theories

A domain theory is correct if each statement is true.

A domain theory is complete if it covers every positive example of the instance space (i.e., with respect to a given target concept and instance space).

A perfect domain theory is correct and complete.

Page 20: Poggi   analytics - ebl - 1

Perfect Domain Theories

Examples of where to find perfect domain theories: the rules of chess.

Examples of where not to find perfect domain theories: the SafeToStack problem.

In general, only learning problems with perfect domain theories are considered.

Page 21: Poggi   analytics - ebl - 1

EBL Algorithm

We consider an algorithm with the following properties:

It is a sequential covering algorithm that considers the data incrementally.

For each positive example not covered by the current rules, it forms a new rule by:

Explaining how the training example satisfies the target concept, in terms of the domain theory.

Analyzing the explanation to find the most general conditions under which the explanation holds.

Refining the current hypothesis by adding a new Horn clause rule to cover the example.

Page 22: Poggi   analytics - ebl - 1

Explanation

Page 23: Poggi   analytics - ebl - 1

Explanation

There might be more than one explanation for the example. In that case, one or all of the explanations may be used.

An explanation is obtained using a backward-chaining search, as done by Prolog. Prolog-EBG stops when it finds the first proof.
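The backward-chaining search just described can be sketched as a toy Prolog-style prover. This is a simplified illustration, not the Prolog-EBG implementation: there is no variable renaming between rule uses and no occurs check, so rules must use distinct variable names. Like Prolog-EBG, a caller can stop at the first proof yielded.

```python
# Toy backward chainer in the spirit of Prolog-EBG. Literals are
# tuples; lowercase strings act as variables.

def is_var(t):
    return isinstance(t, str) and t[:1].islower()

def unify(a, b, s):
    """Extend substitution s so literals a and b match, or return None."""
    if a[0] != b[0] or len(a) != len(b):
        return None
    s = dict(s)
    for x, y in zip(a[1:], b[1:]):
        x, y = s.get(x, x), s.get(y, y)
        if x == y:
            continue
        if is_var(x):
            s[x] = y
        elif is_var(y):
            s[y] = x
        else:
            return None
    return s

def prove(goal, facts, rules, s):
    """Yield substitutions under which `goal` follows from facts and rules."""
    for fact in facts:                       # leaves of the explanation
        s2 = unify(goal, fact, s)
        if s2 is not None:
            yield s2
    for head, body in rules:                 # inner nodes: chain backward
        s2 = unify(goal, head, s)
        if s2 is None:
            continue
        def prove_all(goals, s3):
            if not goals:
                yield s3
            else:
                for s4 in prove(goals[0], facts, rules, s3):
                    yield from prove_all(goals[1:], s4)
        yield from prove_all(body, s2)
```

Taking only the first substitution yielded by `prove` mirrors Prolog-EBG stopping at the first proof found.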

Page 24: Poggi   analytics - ebl - 1

Analyze

Many features appear in an example. Of them, how many are truly relevant?

We consider as relevant those features that appear in the explanation.

Example: relevant feature: Density; irrelevant feature: Owner.

Page 25: Poggi   analytics - ebl - 1

Analyze

Taking the leaf nodes of the explanation and substituting variables x and y for Obj1 and Obj2:

SafeToStack(x,y) <- Volume(x,2) ^ Density(x,0.3) ^ Type(y,Endtable)

Remove features that are independent of x and y such as Equal(0.6,times(2,0.3)) and LessThan(0.6,5).

The rule is now more general and can serve to explain other instances matching the rule.

A more general form of generalization (called “regression”) finds the most general rule explaining the example.
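The two moves described above, variabilizing the leaves and dropping literals independent of x and y, can be sketched as follows. This is a simplification of full goal regression; the tuple encoding of literals is an assumption:

```python
# Sketch of the analyze/generalize step: substitute variables for the
# example's objects in the explanation's leaf literals, then drop any
# literal that no longer mentions a rule variable (e.g. the fully
# ground arithmetic facts Equal and LessThan).

def generalize(leaves, bindings):
    """bindings maps constants to variables, e.g. {"Obj1": "x"}."""
    kept = []
    for lit in leaves:
        new = tuple(bindings.get(t, t) for t in lit)
        if any(t in bindings.values() for t in new[1:]):
            kept.append(new)     # still constrains x or y: relevant
    return kept

leaves = [("Volume", "Obj1", 2), ("Density", "Obj1", 0.3),
          ("Type", "Obj2", "Endtable"),
          ("Equal", 0.6, "times(2,0.3)"), ("LessThan", 0.6, 5)]
body = generalize(leaves, {"Obj1": "x", "Obj2": "y"})
```

The resulting body keeps the Volume, Density and Type literals and drops Equal and LessThan, matching the rule on the slide.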

Page 26: Poggi   analytics - ebl - 1

Refine

The current hypothesis is the set of Horn clauses that we have constructed up to this point.

Using sequential covering we keep adding more rules, thus refining our hypothesis.

A new instance is negative if it is not covered by any rule.
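This decision rule can be sketched with learned rules modeled as plain predicates over a feature dict; the encoding is an assumption, and the sample rule mirrors the running example's weight comparison:

```python
# Refine step's classifier: under sequential covering, the hypothesis
# is the set of learned rules, and an instance is negative exactly
# when no rule covers it.

def classify(instance, rules):
    """Positive iff at least one learned rule covers the instance."""
    return any(rule(instance) for rule in rules)

# One learned rule from the running example: x is safe to stack when
# its weight (Density * Volume) is less than 5, the endtable weight
# used in the explanation's LessThan(0.6, 5) leaf.
learned_rules = [lambda f: f["Density"] * f["Volume"] < 5]
```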

Page 27: Poggi   analytics - ebl - 1

Computing the weakest preimage of explanation
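In a single step, the weakest preimage of a goal through a Horn rule is the rule's body with the head's variables bound to the goal's arguments. A minimal sketch, under the simplifying assumption that the head's arguments are distinct variables:

```python
# Regress `goal` through one rule head <- body: bind the head's
# argument variables to the goal's arguments and return the
# instantiated body, i.e. the weakest conditions from which this
# rule yields the goal. Literals are (predicate, arg, ...) tuples.

def regress(goal, head, body):
    s = dict(zip(head[1:], goal[1:]))
    return [(lit[0],) + tuple(s.get(t, t) for t in lit[1:]) for lit in body]
```

For instance, regressing SafeToStack(x,y) through SafeToStack(u,v) <- Lighter(u,v) yields [Lighter(x,y)].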

Page 28: Poggi   analytics - ebl - 1

Discovering new features

The PROLOG-EBG system described can formulate new features that are not in the training examples:

Example: Volume * Density > 5, derived from the domain theory.
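The derived quantity can be computed directly; a tiny sketch, with the threshold 5 taken from the explanation's LessThan(0.6, 5) leaf:

```python
# The new feature PROLOG-EBG composes from the domain theory rule
# Weight(p, w) <- Density(p, d) ^ Volume(p, v) ^ Equal(w, times(d, v)),
# then compares against the endtable's weight 5.

def derived_weight(density, volume):
    return density * volume

def exceeds_endtable_weight(density, volume, endtable_weight=5):
    """The discovered feature: does Volume * Density exceed 5?"""
    return derived_weight(density, volume) > endtable_weight
```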

Page 29: Poggi   analytics - ebl - 1

Inductive Bias in Explanation-Based Learning

What is the inductive bias of explanation based learning?

The learned hypothesis h follows deductively from the training data D and the background knowledge B.

Bias: prefer small sets of maximally general Horn Clauses

Page 30: Poggi   analytics - ebl - 1

Problems with EBL

The number of control rules that must be learned is very large.

If the control rules are many, much time will be spent looking for the best rule.

Utility analysis is used to determine what rules to keep and what rules to forget.

Another problem with EBL is that it is sometimes difficult to create an explanation for the target concept.

For example, in chess, learning a concept like: “states for which operator A leads to a solution”

The search here grows exponentially.

Page 31: Poggi   analytics - ebl - 1

Summary

Unlike inductive learning, analytical learning looks for a hypothesis that fits the background knowledge and covers the training examples.

Explanation-based learning is one kind of analytical learning that divides into three steps:

Explain the target value for the current example.
Analyze the explanation (generalize).
Refine the hypothesis.

PROLOG-EBG constructs intermediate features after analyzing examples.

Explanation based learning can be used to find search control rules.

It depends on a perfect domain theory.

Page 32: Poggi   analytics - ebl - 1

Components of EBL

Page 33: Poggi   analytics - ebl - 1

Components of EBL

Problem solving:
Use the domain theory and the example to reach the goal.

Trace analysis:
Analyze the solution to obtain an explanation, using two criteria:
Relevance criterion: the information that takes part in the path to the solution.
Operationality criterion: the rules that fire directly.

Filtering:
Filter the information in the trace to arrive at the explanation.

Generalization:
Replace the constants with variables so that the expression remains valid.
The most widely used method is the goal-regression algorithm. Regressing a formula f through a rule r consists of determining the necessary and sufficient conditions under which rule r can be used to obtain f.

Building new information:
The information we build can be:
Domain rules expressing new concept definitions, where the right-hand side of the rule is the necessary and sufficient conditions of the generalized explanation tree and the root of that tree is the left-hand side of the rule.
Control rules, built in a similar way.

Incorporation:
The new rules must be added to the knowledge base, but sometimes too many rules are obtained to introduce into the system.
A trade-off must be reached (inference speed versus a larger assertion base).

Page 34: Poggi   analytics - ebl - 1

Example 1

Page 35: Poggi   analytics - ebl - 1

Example 1

Page 36: Poggi   analytics - ebl - 1

Example 2

Target concept: TAZA (cup)

Functional definition:

TAZA(x) <- RECIPIENTE_ABIERTO(x) ^ ESTABLE(x) ^ ALZABLE(x)

Training example:
TIENE_PARTE(OBJ1, CONCAVIDAD1), COLOR(OBJ1, ROJO), CONCAVIDAD(CONCAVIDAD1), ORIENTADA_HACIA_ARRIBA(CONCAVIDAD1), TIENE_DUEÑO(OBJ1, SAM), TIENE_PARTE(OBJ1, FONDO1), FONDO(FONDO1), PLANO(FONDO1), LIGERO(OBJ1), TIENE_PARTE(OBJ1, ASA1), ASA(ASA1), LONGITUD(ASA1, 5)

Page 37: Poggi   analytics - ebl - 1

Example 2

Domain theory:

ESTABLE(OBJ) <- TIENE_PARTE(OBJ, F) ^ FONDO(F) ^ PLANO(F)
RECIPIENTE_ABIERTO(OBJ) <- TIENE_PARTE(OBJ, C) ^ CONCAVIDAD(C) ^ ORIENTADA_HACIA_ARRIBA(C)
ALZABLE(OBJ) <- TIENE_PARTE(OBJ, A) ^ ASA(A) ^ LIGERO(OBJ)

Operationality criterion: the concept must be defined in terms of the predicates used in the example.

Page 38: Poggi   analytics - ebl - 1

Example 2

The learning process has two steps:

Use the domain theory to build an explanation (proof) that the training example is a positive example of the target concept. The leaf nodes of the explanation tree must be operational.

Transform the leaf nodes into a set of sufficient conditions under which the proof remains valid (generalize the explanation according to the domain theory).

Page 39: Poggi   analytics - ebl - 1

Example 2

Result of the learning process:

TIENE_PARTE(X, Y) ^ CONCAVIDAD(Y) ^ ORIENTADA_HACIA_ARRIBA(Y) ^ TIENE_PARTE(X, Z) ^ FONDO(Z) ^ PLANO(Z) ^ TIENE_PARTE(X, W) ^ ASA(W) ^ LIGERO(X)
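The learned body can be checked against the training facts with a small conjunctive matcher. A sketch under the assumption that the single-letter names X, Y, Z, W are the variables; the predicate names are kept from the slides:

```python
# Check whether some binding of the rule variables makes every literal
# of the learned body a fact of the training example. Backtracks over
# candidate facts; a toy matcher, not the EBL system itself.

VARS = {"X", "Y", "Z", "W"}

CUP_BODY = [("TIENE_PARTE", "X", "Y"), ("CONCAVIDAD", "Y"),
            ("ORIENTADA_HACIA_ARRIBA", "Y"), ("TIENE_PARTE", "X", "Z"),
            ("FONDO", "Z"), ("PLANO", "Z"), ("TIENE_PARTE", "X", "W"),
            ("ASA", "W"), ("LIGERO", "X")]

FACTS = {("TIENE_PARTE", "OBJ1", "CONCAVIDAD1"),
         ("CONCAVIDAD", "CONCAVIDAD1"),
         ("ORIENTADA_HACIA_ARRIBA", "CONCAVIDAD1"),
         ("TIENE_PARTE", "OBJ1", "FONDO1"), ("FONDO", "FONDO1"),
         ("PLANO", "FONDO1"), ("TIENE_PARTE", "OBJ1", "ASA1"),
         ("ASA", "ASA1"), ("LIGERO", "OBJ1")}

def match(body, facts, s=None):
    """Return a satisfying substitution, or None if none exists."""
    s = dict(s or {})
    if not body:
        return s
    lit, rest = body[0], body[1:]
    for fact in facts:
        if fact[0] != lit[0] or len(fact) != len(lit):
            continue
        s2 = dict(s)
        for t, c in zip(lit[1:], fact[1:]):
            if t in VARS:
                if s2.setdefault(t, c) != c:
                    break
            elif t != c:
                break
        else:                      # literal matched: try the rest
            result = match(rest, facts, s2)
            if result is not None:
                return result
    return None
```

Matching CUP_BODY against FACTS binds X to OBJ1, Y to the concavity, Z to the bottom and W to the handle, confirming the generalized rule covers the training example.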

Page 40: Poggi   analytics - ebl - 1

Problems with EBL

When EBL is applied to real domains, a set of problems arises that must be solved; they can be grouped into two classes:

Problems derived from the new knowledge incorporated into the theory (theory reformulation).

Problems derived from the quality of the theory (theory revision).

Page 41: Poggi   analytics - ebl - 1

Problems with EBL – Theory reformulation

Systematically incorporating new knowledge leads to degradation of the theory for two reasons:

Low application frequency: the new knowledge may be used only rarely.

High rule-matching cost: checking whether the new knowledge is useful is very expensive.

This reduces the efficiency of the system.

As a solution, the theory can be rewritten to reduce the cost of matching rules, and the usage frequency of the rules can be evaluated so that rarely used rules are removed.

Page 42: Poggi   analytics - ebl - 1

Problems with EBL – Theory revision

The theory in use does not solve problems correctly and must be corrected:

Incomplete theories: the theory cannot solve every problem because it lacks adequate rules. The theory can be completed by inductive methods.

Incorrect theories: the theory gives incorrect solutions to some problems. One can try to determine which rules are wrong and remove or correct them inductively.

Inconsistent theories: they reach contradictory solutions.

Intractable theories: solving a problem requires more resources than are available.

Page 43: Poggi   analytics - ebl - 1

[email protected]

eduardo-poggi

http://ar.linkedin.com/in/eduardoapoggi

https://www.facebook.com/eduardo.poggi

@eduardoapoggi