{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": { "papermill": { "duration": 0.670894, "end_time": "2019-10-23T16:01:12.210448", "exception": false, "start_time": "2019-10-23T16:01:11.539554", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import matplotlib.pyplot as plt\n", "\n", "import gym\n", "from gym.envs.registration import register" ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.016798, "end_time": "2019-10-23T16:01:12.244111", "exception": false, "start_time": "2019-10-23T16:01:12.227313", "status": "completed" }, "tags": [] }, "source": [ "# ACS2 in Frozen Lake\n", "\n", "About the environment\n", "> The agent controls the movement of a character in a grid world. Some tiles of the grid are walkable, and others lead to the agent falling into the water. Additionally, the movement direction of the agent is uncertain and only partially depends on the chosen direction. The agent is rewarded for finding a walkable path to a goal tile." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "papermill": { "duration": 0.72504, "end_time": "2019-10-23T16:01:12.990433", "exception": false, "start_time": "2019-10-23T16:01:12.265393", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\u001b[41mS\u001b[0mFFF\n", "FHFH\n", "FFFH\n", "HFFG\n" ] } ], "source": [ "fl_env = gym.make('FrozenLake-v0')\n", "\n", "# Reset the state\n", "state = fl_env.reset()\n", "\n", "# Render the environment\n", "fl_env.render()" ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.022612, "end_time": "2019-10-23T16:01:13.030417", "exception": false, "start_time": "2019-10-23T16:01:13.007805", "status": "completed" }, "tags": [] }, "source": [ "Each state takes one of the following values: `{S, F, H, G}`, which refer to:\n", "```\n", "SFFF (S: starting point, safe)\n", "FHFH (F: frozen surface, safe)\n", "FFFH (H: hole, fall to your doom)\n", "HFFG (G: goal, where the frisbee is located)\n", "```\n", "\n", "When interacting with the environment, the agent can perform 4 actions, which map as follows:\n", "- 0 - left\n", "- 1 - down\n", "- 2 - right\n", "- 3 - up\n", "\n", "> FrozenLake-v0 defines \"solving\" as getting average reward of 0.78 over 100 consecutive trials.\n", "\n", "We will also define a second version of the same environment, but with the `is_slippery=False` parameter. That makes the agent's movement deterministic." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "papermill": { "duration": 0.027484, "end_time": "2019-10-23T16:01:13.075413", "exception": false, "start_time": "2019-10-23T16:01:13.047929", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\u001b[41mS\u001b[0mFFF\n", "FHFH\n", "FFFH\n", "HFFG\n" ] } ], "source": [ "register(\n", "    id='FrozenLakeNotSlippery-v0',\n", "    entry_point='gym.envs.toy_text:FrozenLakeEnv',\n", "    kwargs={'map_name': '4x4', 'is_slippery': False},\n", "    max_episode_steps=100,\n", "    reward_threshold=0.78, # optimum = .8196\n", ")\n", "\n", "fl_ns_env = gym.make('FrozenLakeNotSlippery-v0')\n", "\n", "# Reset the state\n", "state = fl_ns_env.reset()\n", "\n", "# Render the environment\n", "fl_ns_env.render()" ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.017077, "end_time": "2019-10-23T16:01:13.109843", "exception": false, "start_time": "2019-10-23T16:01:13.092766", "status": "completed" }, "tags": [] }, "source": [ "## ACS2" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "papermill": { "duration": 0.076895, "end_time": "2019-10-23T16:01:13.203862", "exception": false, "start_time": "2019-10-23T16:01:13.126967", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "# Import PyALCS code from local path\n", "import sys, os\n", "sys.path.append(os.path.abspath('../'))\n", "\n", "from lcs.agents import EnvironmentAdapter\n", "from lcs.agents.acs2 import ACS2, Configuration\n", "\n", "# Enable automatic module reload\n", "%load_ext autoreload\n", "%autoreload 2\n", "\n", "CLASSIFIER_LENGTH = 16 # Because we are operating in 4x4 grid\n", "POSSIBLE_ACTIONS = fl_env.action_space.n # 4" ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.016884, "end_time": "2019-10-23T16:01:13.237853", "exception": false, "start_time": "2019-10-23T16:01:13.220969", "status": "completed" }, "tags": [] }, 
"source": [ "### Encoding perception\n", "The only information returned by the environment is the agent's current position (not its perception). Therefore, our agent's task will be to predict where it will land after executing each action.\n", "\n", "To do so, we will represent the state as a one-hot encoded vector." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "papermill": { "duration": 0.047926, "end_time": "2019-10-23T16:01:13.303017", "exception": false, "start_time": "2019-10-23T16:01:13.255091", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "class FrozenLakeAdapter(EnvironmentAdapter):\n", "    @classmethod\n", "    def to_genotype(cls, phenotype):\n", "        # One-hot encode the position: 'X' marks the agent's tile, '0' all others\n", "        genotype = ['0' for i in range(CLASSIFIER_LENGTH)]\n", "        genotype[phenotype] = 'X'\n", "        return ''.join(genotype)" ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.017365, "end_time": "2019-10-23T16:01:13.337377", "exception": false, "start_time": "2019-10-23T16:01:13.320012", "status": "completed" }, "tags": [] }, "source": [ "`X` corresponds to the agent's current position. 
State 4 is encoded as follows:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "papermill": { "duration": 0.043785, "end_time": "2019-10-23T16:01:13.398491", "exception": false, "start_time": "2019-10-23T16:01:13.354706", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "'0000X00000000000'" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "FrozenLakeAdapter().to_genotype(4)" ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.018894, "end_time": "2019-10-23T16:01:13.434852", "exception": false, "start_time": "2019-10-23T16:01:13.415958", "status": "completed" }, "tags": [] }, "source": [ "### Environment metrics\n", "We will also need a function for evaluating whether the agent finished a trial successfully." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "papermill": { "duration": 0.043406, "end_time": "2019-10-23T16:01:13.499052", "exception": false, "start_time": "2019-10-23T16:01:13.455646", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "from lcs.metrics import population_metrics\n", "\n", "\n", "# We assume the agent found the reward if the trial ended in state 15 (the goal tile); otherwise it did not.\n", "def fl_metrics(pop, env):\n", "    metrics = {\n", "        'found_reward': env.env.s == 15,\n", "    }\n", "\n", "    # Add basic population metrics\n", "    metrics.update(population_metrics(pop, env))\n", "\n", "    return metrics" ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.018531, "end_time": "2019-10-23T16:01:13.535569", "exception": false, "start_time": "2019-10-23T16:01:13.517038", "status": "completed" }, "tags": [] }, "source": [ "### Performance evaluation" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "papermill": { "duration": 0.04482, "end_time": "2019-10-23T16:01:13.601254", "exception": false, "start_time": "2019-10-23T16:01:13.556434", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "def print_performance(population, metrics):\n", "    population.sort(key=lambda cl: -cl.fitness)\n", "    population_count = len(population)\n", "    reliable_count = len([cl for cl in population if cl.is_reliable()])\n", "    successful_trials = sum(m['found_reward'] for m in metrics)\n", "\n", "    print(\"Number of classifiers: {}\".format(population_count))\n", "    print(\"Number of reliable classifiers: {}\".format(reliable_count))\n", "    # One metrics record is collected per trial, so len(metrics) is the number of trials in this phase\n", "    print(\"Percentage of successful trials: {:.2f}%\".format(successful_trials / len(metrics) * 100))\n", "    print(\"\\nTop 10 classifiers:\")\n", "    for cl in population[:10]:\n", "        print(\"{!r} \\tq: {:.2f} \\tr: {:.2f} \\tir: {:.2f} \\texp: {}\".format(cl, cl.q, cl.r, cl.ir, cl.exp))" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "papermill": { "duration": 0.046577, "end_time": "2019-10-23T16:01:13.665121", "exception": false, "start_time": "2019-10-23T16:01:13.618544", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "def plot_success_trials(metrics, ax=None):\n", "    if ax is None:\n", "        ax = plt.gca()\n", "\n", "    trials = [m['trial'] for m in metrics]\n", "    success = [m['found_reward'] for m in metrics]\n", "\n", "    ax.plot(trials, success)\n", "    ax.set_title(\"Successful Trials\")\n", "    ax.set_xlabel(\"Trial\")\n", "    ax.set_ylabel(\"Agent found reward\")" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "papermill": { "duration": 0.044081, "end_time": "2019-10-23T16:01:13.726645", "exception": false, "start_time": "2019-10-23T16:01:13.682564", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "def plot_population(metrics, ax=None):\n", "    if ax is None:\n", "        ax = plt.gca()\n", "\n", "    trials = [m['trial'] for m in metrics]\n", "\n", "    population_size = [m['numerosity'] for m in metrics]\n", "    reliable_size = [m['reliable'] for m in metrics]\n", "\n", "    ax.plot(trials, population_size, 'b', label='all')\n", "    ax.plot(trials, reliable_size, 'r', label='reliable')\n", "\n", "    ax.set_title(\"Population size\")\n", "    ax.set_xlabel(\"Trial\")\n", "    ax.set_ylabel(\"Number of macroclassifiers\")\n", "    ax.legend(loc='best')" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "papermill": { "duration": 0.045741, "end_time": "2019-10-23T16:01:13.789752", "exception": false, "start_time": "2019-10-23T16:01:13.744011", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "def plot_performance(metrics):\n", "    plt.figure(figsize=(13, 10), dpi=100)\n", "    plt.suptitle('Performance Visualization')\n", "\n", "    ax1 = plt.subplot(221)\n", "    plot_success_trials(metrics, ax1)\n", "\n", "    ax2 = plt.subplot(222)\n", "    plot_population(metrics, ax2)\n", "\n", "    plt.show()" ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.017168, "end_time": "2019-10-23T16:01:13.824416", "exception": false, "start_time": "2019-10-23T16:01:13.807248", "status": "completed" }, "tags": [] }, "source": [ "### Default ACS2 configuration\n", "We are now ready to configure the ACS2 agent, providing some defaults." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "papermill": { "duration": 0.04294, "end_time": "2019-10-23T16:01:13.884941", 
"exception": false, "start_time": "2019-10-23T16:01:13.842001", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ACS2Configuration:\n", "\t- Classifier length: [16]\n", "\t- Number of possible actions: [4]\n", "\t- Classifier wildcard: [#]\n", "\t- Environment adapter function: [<__main__.FrozenLakeAdapter object at 0x117bd0b00>]\n", "\t- Fitness function: [None]\n", "\t- Do GA: [False]\n", "\t- Do subsumption: [True]\n", "\t- Do Action Planning: [False]\n", "\t- Beta: [0.05]\n", "\t- ...\n", "\t- Epsilon: [0.7]\n", "\t- U_max: [100000]\n" ] } ], "source": [ "cfg = Configuration(\n", "    classifier_length=CLASSIFIER_LENGTH,\n", "    number_of_possible_actions=POSSIBLE_ACTIONS,\n", "    environment_adapter=FrozenLakeAdapter(),\n", "    metrics_trial_frequency=1,\n", "    user_metrics_collector_fcn=fl_metrics,\n", "    theta_i=0.3,\n", "    epsilon=0.7)\n", "\n", "print(cfg)" ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.020047, "end_time": "2019-10-23T16:01:13.922588", "exception": false, "start_time": "2019-10-23T16:01:13.902541", "status": "completed" }, "tags": [] }, "source": [ "## Experiments" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "papermill": { "duration": 0.044259, "end_time": "2019-10-23T16:01:13.985648", "exception": false, "start_time": "2019-10-23T16:01:13.941389", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "EXPLORE_TRIALS = 2000\n", "EXPLOIT_TRIALS = 100\n", "\n", "\n", "def perform_experiment(cfg, env):\n", "    # explore phase\n", "    agent = ACS2(cfg)\n", "    population_explore, metrics_explore = agent.explore(env, EXPLORE_TRIALS)\n", "\n", "    # exploit phase, reinitialize agent with population above\n", "    agent = ACS2(cfg, population=population_explore)\n", "    population_exploit, metrics_exploit = agent.exploit(env, EXPLOIT_TRIALS)\n", "\n", "    return (population_explore, metrics_explore), (population_exploit, metrics_exploit)" ] }, { 
"cell_type": "markdown", "metadata": { "papermill": { "duration": 0.018167, "end_time": "2019-10-23T16:01:14.022350", "exception": false, "start_time": "2019-10-23T16:01:14.004183", "status": "completed" }, "tags": [] }, "source": [ "### FrozenLake-v0 environment (baseline)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "papermill": { "duration": 46.358754, "end_time": "2019-10-23T16:02:00.401837", "exception": false, "start_time": "2019-10-23T16:01:14.043083", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 46.1 s, sys: 138 ms, total: 46.2 s\n", "Wall time: 46.3 s\n" ] } ], "source": [ "%%time\n", "explore_results, exploit_results = perform_experiment(cfg, fl_env) " ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.018931, "end_time": "2019-10-23T16:02:00.445959", "exception": false, "start_time": "2019-10-23T16:02:00.427028", "status": "completed" }, "tags": [] }, "source": [ "Learn some behaviour during exploration phase" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "papermill": { "duration": 0.046931, "end_time": "2019-10-23T16:02:00.514419", "exception": false, "start_time": "2019-10-23T16:02:00.467488", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of classifiers: 441\n", "Number of reliable classifiers: 0\n", "Percentage of successul trials: 42.00%\n", "\n", "Top 10 classifiers:\n", "##############X0 1 ##############0X (00000000000000##) q: 0.602 r: 0.3693 ir: 0.2825 f: 0.2224 exp: 41 tga: 1105 talp: 15265 tav: 2.99e+02 num: 1 \tq: 0.60 \tr: 0.37 \tir: 0.28 \texp: 41\n", "#0############X0 1 ##############0X (0#000000000000##) q: 0.577 r: 0.3633 ir: 0.2765 f: 0.2096 exp: 32 tga: 3961 talp: 15265 tav: 3.26e+02 num: 1 \tq: 0.58 \tr: 0.36 \tir: 0.28 \texp: 32\n", "#0###########0X# 3 #############X0# (0#00000000000##0) q: 0.471 r: 0.3723 ir: 0.3615 f: 0.1753 
exp: 43 tga: 3660 talp: 14982 tav: 2.96e+02 num: 1 \tq: 0.47 \tr: 0.37 \tir: 0.36 \texp: 43\n", "##############X0 3 ##############0X (00000000000000##) q: 0.442 r: 0.3876 ir: 0.377 f: 0.1715 exp: 45 tga: 2040 talp: 14982 tav: 3.13e+02 num: 1 \tq: 0.44 \tr: 0.39 \tir: 0.38 \texp: 45\n", "#############0X# 3 #############X0# (0000000000000##0) q: 0.446 r: 0.3829 ir: 0.3723 f: 0.1709 exp: 45 tga: 3117 talp: 14982 tav: 2.98e+02 num: 1 \tq: 0.45 \tr: 0.38 \tir: 0.37 \texp: 45\n", "#############0X# 1 #############X0# (0000000000000##0) q: 0.452 r: 0.3676 ir: 0.2807 f: 0.1661 exp: 33 tga: 3251 talp: 15265 tav: 3.33e+02 num: 1 \tq: 0.45 \tr: 0.37 \tir: 0.28 \texp: 33\n", "#0###########0X# 1 #############X0# (0#00000000000##0) q: 0.451 r: 0.3633 ir: 0.2765 f: 0.1639 exp: 33 tga: 3251 talp: 15265 tav: 3.26e+02 num: 1 \tq: 0.45 \tr: 0.36 \tir: 0.28 \texp: 33\n", "##############X# 1 ################ (00000000000000#0) q: 0.368 r: 0.3676 ir: 0.2807 f: 0.1351 exp: 40 tga: 1748 talp: 15265 tav: 2.93e+02 num: 1 \tq: 0.37 \tr: 0.37 \tir: 0.28 \texp: 40\n", "#0############X# 1 ################ (0#000000000000#0) q: 0.355 r: 0.3633 ir: 0.2765 f: 0.129 exp: 36 tga: 3119 talp: 15265 tav: 2.97e+02 num: 1 \tq: 0.36 \tr: 0.36 \tir: 0.28 \texp: 36\n", "##########0###X# 2 ##########X###0# (0000000000#000#0) q: 0.412 r: 0.2099 ir: 0.1545 f: 0.0865 exp: 18 tga: 631 talp: 14255 tav: 7.17e+02 num: 1 \tq: 0.41 \tr: 0.21 \tir: 0.15 \texp: 18\n" ] } ], "source": [ "# exploration\n", "print_performance(explore_results[0], explore_results[1])" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "papermill": { "duration": 0.75281, "end_time": "2019-10-23T16:02:01.286164", "exception": false, "start_time": "2019-10-23T16:02:00.533354", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plot_performance(explore_results[1])" ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.021037, "end_time": "2019-10-23T16:02:01.326396", "exception": false, "start_time": "2019-10-23T16:02:01.305359", "status": "completed" }, "tags": [] }, "source": [ "Metrics from exploitation" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "papermill": { "duration": 0.044591, "end_time": "2019-10-23T16:02:01.391606", "exception": false, "start_time": "2019-10-23T16:02:01.347015", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of classifiers: 441\n", "Number of reliable classifiers: 0\n", "Percentage of successul trials: 6.00%\n", "\n", "Top 10 classifiers:\n", "##############X0 1 ##############0X (00000000000000##) q: 0.602 r: 0.3693 ir: 0.2825 f: 0.2224 exp: 41 tga: 1105 talp: 15265 tav: 2.99e+02 num: 1 \tq: 0.60 \tr: 0.37 \tir: 0.28 \texp: 41\n", "#0############X0 1 ##############0X (0#000000000000##) q: 0.577 r: 0.3633 ir: 0.2765 f: 0.2096 exp: 32 tga: 3961 talp: 15265 tav: 3.26e+02 num: 1 \tq: 0.58 \tr: 0.36 \tir: 0.28 \texp: 32\n", "#0###########0X# 3 #############X0# (0#00000000000##0) q: 0.471 r: 0.3723 ir: 0.3615 f: 0.1753 exp: 43 tga: 3660 talp: 14982 tav: 2.96e+02 num: 1 \tq: 0.47 \tr: 0.37 \tir: 0.36 \texp: 43\n", "##############X0 3 ##############0X (00000000000000##) q: 0.442 r: 0.3876 ir: 0.377 f: 0.1715 exp: 45 tga: 2040 talp: 14982 tav: 3.13e+02 num: 1 \tq: 0.44 \tr: 0.39 \tir: 0.38 \texp: 45\n", "#############0X# 3 #############X0# (0000000000000##0) q: 0.446 r: 0.3829 ir: 0.3723 f: 0.1709 exp: 45 tga: 3117 talp: 14982 tav: 2.98e+02 num: 1 \tq: 0.45 \tr: 0.38 \tir: 0.37 \texp: 45\n", "#############0X# 1 #############X0# (0000000000000##0) q: 0.452 r: 0.3676 ir: 0.2807 f: 0.1661 exp: 33 tga: 3251 talp: 15265 tav: 3.33e+02 num: 1 \tq: 0.45 \tr: 0.37 \tir: 
0.28 \texp: 33\n", "#0###########0X# 1 #############X0# (0#00000000000##0) q: 0.451 r: 0.3633 ir: 0.2765 f: 0.1639 exp: 33 tga: 3251 talp: 15265 tav: 3.26e+02 num: 1 \tq: 0.45 \tr: 0.36 \tir: 0.28 \texp: 33\n", "##############X# 1 ################ (00000000000000#0) q: 0.368 r: 0.3676 ir: 0.2807 f: 0.1351 exp: 40 tga: 1748 talp: 15265 tav: 2.93e+02 num: 1 \tq: 0.37 \tr: 0.37 \tir: 0.28 \texp: 40\n", "#0############X# 1 ################ (0#000000000000#0) q: 0.355 r: 0.3633 ir: 0.2765 f: 0.129 exp: 36 tga: 3119 talp: 15265 tav: 2.97e+02 num: 1 \tq: 0.36 \tr: 0.36 \tir: 0.28 \texp: 36\n", "##########0###X# 2 ##########X###0# (0000000000#000#0) q: 0.412 r: 0.2099 ir: 0.1545 f: 0.0865 exp: 18 tga: 631 talp: 14255 tav: 7.17e+02 num: 1 \tq: 0.41 \tr: 0.21 \tir: 0.15 \texp: 18\n" ] } ], "source": [ "# exploitation\n", "print_performance(exploit_results[0], exploit_results[1])" ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.019329, "end_time": "2019-10-23T16:02:01.430115", "exception": false, "start_time": "2019-10-23T16:02:01.410786", "status": "completed" }, "tags": [] }, "source": [ "### FrozenLakeNotSlippery-v0 environment" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "papermill": { "duration": 10.20416, "end_time": "2019-10-23T16:02:11.656905", "exception": false, "start_time": "2019-10-23T16:02:01.452745", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 9.99 s, sys: 136 ms, total: 10.1 s\n", "Wall time: 10.2 s\n" ] } ], "source": [ "%%time\n", "explore_results_2, exploit_results_2 = perform_experiment(cfg, fl_ns_env)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "papermill": { "duration": 0.050992, "end_time": "2019-10-23T16:02:11.727616", "exception": false, "start_time": "2019-10-23T16:02:11.676624", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of 
classifiers: 89\n", "Number of reliable classifiers: 89\n", "Percentage of successul trials: 192.00%\n", "\n", "Top 10 classifiers:\n", "##############X0 2 ##############0X (empty) q: 1.0 r: 1.0 ir: 1.0 f: 1.0 exp: 191 tga: 237 talp: 14842 tav: 79.9 num: 1 \tq: 1.00 \tr: 1.00 \tir: 1.00 \texp: 191\n", "##########X###0# 1 ##########0###X# (empty) q: 1.0 r: 0.95 ir: 0.0 f: 0.95 exp: 245 tga: 324 talp: 15005 tav: 73.1 num: 1 \tq: 1.00 \tr: 0.95 \tir: 0.00 \texp: 245\n", "#############X0# 2 #############0X# (empty) q: 0.999 r: 0.9459 ir: 0.0 f: 0.9453 exp: 130 tga: 236 talp: 14842 tav: 1.38e+02 num: 1 \tq: 1.00 \tr: 0.95 \tir: 0.00 \texp: 130\n", "##############X# 1 ################ (empty) q: 0.997 r: 0.9274 ir: 0.0 f: 0.9245 exp: 87 tga: 572 talp: 15006 tav: 2.14e+02 num: 1 \tq: 1.00 \tr: 0.93 \tir: 0.00 \texp: 87\n", "#########X0##### 2 #########0X##### (empty) q: 1.0 r: 0.9025 ir: 0.0 f: 0.9025 exp: 399 tga: 117 talp: 15035 tav: 50.2 num: 1 \tq: 1.00 \tr: 0.90 \tir: 0.00 \texp: 399\n", "######X###0##### 1 ######0###X##### (empty) q: 1.0 r: 0.8974 ir: 0.0 f: 0.8971 exp: 137 tga: 26 talp: 14979 tav: 89.2 num: 1 \tq: 1.00 \tr: 0.90 \tir: 0.00 \texp: 137\n", "#########X###0## 1 #########0###X## (empty) q: 1.0 r: 0.8892 ir: 0.0 f: 0.8892 exp: 168 tga: 115 talp: 14994 tav: 93.6 num: 1 \tq: 1.00 \tr: 0.89 \tir: 0.00 \texp: 168\n", "####0#####0###X# 3 ##########X###0# (empty) q: 0.997 r: 0.8828 ir: 0.0 f: 0.8797 exp: 98 tga: 485 talp: 14543 tav: 1.62e+02 num: 1 \tq: 1.00 \tr: 0.88 \tir: 0.00 \texp: 98\n", "##########0###X# 3 ##########X###0# (empty) q: 0.997 r: 0.8828 ir: 0.0 f: 0.8797 exp: 97 tga: 485 talp: 14543 tav: 1.62e+02 num: 1 \tq: 1.00 \tr: 0.88 \tir: 0.00 \texp: 97\n", "########X0###### 2 ########0X###### (empty) q: 1.0 r: 0.8573 ir: 0.0 f: 0.8573 exp: 740 tga: 114 talp: 15034 tav: 28.7 num: 1 \tq: 1.00 \tr: 0.86 \tir: 0.00 \texp: 740\n" ] } ], "source": [ "# exploration\n", "print_performance(explore_results_2[0], explore_results_2[1])" ] }, { "cell_type": 
"code", "execution_count": 20, "metadata": { "papermill": { "duration": 0.742154, "end_time": "2019-10-23T16:02:12.489456", "exception": false, "start_time": "2019-10-23T16:02:11.747302", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plot_performance(explore_results_2[1])" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "papermill": { "duration": 0.050858, "end_time": "2019-10-23T16:02:12.561596", "exception": false, "start_time": "2019-10-23T16:02:12.510738", "status": "completed" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of classifiers: 89\n", "Number of reliable classifiers: 89\n", "Percentage of successul trials: 100.00%\n", "\n", "Top 10 classifiers:\n", "##############X0 2 ##############0X (empty) q: 1.0 r: 1.0 ir: 1.0 f: 1.0 exp: 191 tga: 237 talp: 14842 tav: 79.9 num: 1 \tq: 1.00 \tr: 1.00 \tir: 1.00 \texp: 191\n", "##########X###0# 1 ##########0###X# (empty) q: 1.0 r: 0.95 ir: 0.0 f: 0.95 exp: 245 tga: 324 talp: 15005 tav: 73.1 num: 1 \tq: 1.00 \tr: 0.95 \tir: 0.00 \texp: 245\n", "#############X0# 2 #############0X# (empty) q: 0.999 r: 0.9459 ir: 0.0 f: 0.9453 exp: 130 tga: 236 talp: 14842 tav: 1.38e+02 num: 1 \tq: 1.00 \tr: 0.95 \tir: 0.00 \texp: 130\n", "##############X# 1 ################ (empty) q: 0.997 r: 0.9274 ir: 0.0 f: 0.9245 exp: 87 tga: 572 talp: 15006 tav: 2.14e+02 num: 1 \tq: 1.00 \tr: 0.93 \tir: 0.00 \texp: 87\n", "#########X0##### 2 #########0X##### (empty) q: 1.0 r: 0.9025 ir: 0.0 f: 0.9025 exp: 399 tga: 117 talp: 15035 tav: 50.2 num: 1 \tq: 1.00 \tr: 0.90 \tir: 0.00 \texp: 399\n", "######X###0##### 1 ######0###X##### (empty) q: 1.0 r: 0.8974 ir: 0.0 f: 0.8971 exp: 137 tga: 26 talp: 14979 tav: 89.2 num: 1 \tq: 1.00 \tr: 0.90 \tir: 0.00 \texp: 137\n", "#########X###0## 1 #########0###X## (empty) q: 1.0 r: 0.8892 ir: 0.0 f: 0.8892 exp: 168 tga: 115 talp: 14994 tav: 93.6 num: 1 \tq: 1.00 \tr: 0.89 \tir: 0.00 \texp: 168\n", "####0#####0###X# 3 ##########X###0# (empty) q: 0.997 r: 0.8828 ir: 0.0 f: 0.8797 exp: 98 tga: 485 talp: 14543 tav: 1.62e+02 num: 1 \tq: 1.00 \tr: 0.88 \tir: 0.00 \texp: 98\n", 
"##########0###X# 3 ##########X###0# (empty) q: 0.997 r: 0.8828 ir: 0.0 f: 0.8797 exp: 97 tga: 485 talp: 14543 tav: 1.62e+02 num: 1 \tq: 1.00 \tr: 0.88 \tir: 0.00 \texp: 97\n", "########X0###### 2 ########0X###### (empty) q: 1.0 r: 0.8573 ir: 0.0 f: 0.8573 exp: 740 tga: 114 talp: 15034 tav: 28.7 num: 1 \tq: 1.00 \tr: 0.86 \tir: 0.00 \texp: 740\n" ] } ], "source": [ "# exploitation\n", "print_performance(exploit_results_2[0], exploit_results_2[1])" ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.021458, "end_time": "2019-10-23T16:02:12.604545", "exception": false, "start_time": "2019-10-23T16:02:12.583087", "status": "completed" }, "tags": [] }, "source": [ "## Comparison" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "papermill": { "duration": 0.051526, "end_time": "2019-10-23T16:02:12.677018", "exception": false, "start_time": "2019-10-23T16:02:12.625492", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "def plot_population(metrics, ax=None):\n", " if ax is None:\n", " ax = plt.gca()\n", " \n", " trials = [m['trial'] for m in metrics]\n", " \n", " population_size = [m['numerosity'] for m in metrics]\n", " reliable_size = [m['reliable'] for m in metrics]\n", " \n", " ax.plot(trials, population_size, 'b', label='all')\n", " ax.plot(trials, reliable_size, 'r', label='reliable')\n", " \n", " ax.set_title(\"Population size\")\n", " ax.set_xlabel(\"Trial\")\n", " ax.set_ylabel(\"Number of macroclassifiers\")\n", " ax.legend(loc='best')" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "papermill": { "duration": 0.258639, "end_time": "2019-10-23T16:02:12.956342", "exception": false, "start_time": "2019-10-23T16:02:12.697703", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "original = explore_results[1]\n", "modified = explore_results_2[1]\n", "\n", "ax = plt.gca()\n", "\n", "trials = [m['trial'] for m in original]\n", "\n", "original_numerosity = [m['numerosity'] for m in original]\n", "modified_numerosity = [m['numerosity'] for m in modified]\n", "\n", "ax.plot(trials, original_numerosity, 'r')\n", "ax.text(1000, 350, \"Original environment\", color='r')\n", "\n", "ax.plot(trials, modified_numerosity, 'b')\n", "ax.text(1000, 40, 'No-slippery setting', color='b')\n", "\n", "\n", "ax.set_title('Classifier numerosity in FrozenLake environment')\n", "ax.set_xlabel('Trial')\n", "ax.set_ylabel('Number of macroclassifiers')\n", "\n", "plt.show()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.0" }, "papermill": { "duration": 62.784299, "end_time": "2019-10-23T16:02:13.395984", "environment_variables": {}, "exception": null, "input_path": "notebooks/FrozenLake.ipynb", "output_path": "docs/source/notebooks/FrozenLake.ipynb", "parameters": {}, "start_time": "2019-10-23T16:01:10.611685", "version": "1.1.0" } }, "nbformat": 4, "nbformat_minor": 4 }