Using a Limited Memory to Retain the Best Recent Action in XCS Learning Classifier Systems for Maze Problems
Subject areas: Information and Communication Technology
Ali Yousefi 1, Kambiz Badie 2, Mohammad Mehdi Ebadzadeh 3, Arash Sharifi 4
1 - PhD student in Artificial Intelligence and Robotics, Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
2 - Faculty member, Information and Communication Technology Research Institute
3 - Professor, Faculty of Computer Engineering and Information Technology, Amirkabir University of Technology, Tehran, Iran
4 - Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
Keywords: learning classifier systems, XCS algorithm, limited memory, maze problems
Abstract:
Learning classifier systems have attracted attention in a variety of robotics applications, such as sensory robots, humanoid robots, intelligent rescue systems, and the control of physical robots in discrete and continuous environments. Typically, an evolutionary algorithm or a heuristic method is combined with a learning process to search the space of available rules and assign an appropriate action to each classifier. An important challenge in increasing the speed and accuracy of reaching the goal in maze problems is to select and apply an action that places the agent on the right path instead of letting it collide repeatedly with the surrounding obstacles. To this end, this paper uses an accuracy-based learning classifier system (XCS) augmented with a limited memory. Based on the input, the actions applied to the environment, and the agent's response, the best rules are identified and added to the XCS algorithm as a new classifier set, with priority so that they are selected with higher probability in subsequent steps. Compared with the baseline XCS algorithm, the reported benefits of this method are a reduction in the number of steps required and an increase in the speed with which the agent reaches the goal.
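The limited-memory idea in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the abstract does not give the memory size, the form of the priority, or how remembered rules are re-inserted into XCS. The class `RecentBestActionMemory`, the functions `selection_weights` and `select_action`, and the parameters `capacity` and `bias` are all hypothetical names and values chosen for illustration. The sketch assumes a bounded memory of recently successful (state, action) pairs and a roulette-wheel choice in which the remembered action gets a larger weight.

```python
import random
from collections import deque


class RecentBestActionMemory:
    """Bounded memory of (state, action) pairs that recently worked well.

    Hypothetical helper: the capacity and the eviction policy (oldest first)
    are assumptions, not details taken from the paper.
    """

    def __init__(self, capacity=8):
        self.entries = deque(maxlen=capacity)  # oldest entries fall off first

    def remember(self, state, action):
        # Keep at most one entry per state; the newest entry goes last.
        kept = (e for e in self.entries if e[0] != state)
        self.entries = deque(kept, maxlen=self.entries.maxlen)
        self.entries.append((state, action))

    def recall(self, state):
        # Return the remembered action for this state, or None if absent.
        for s, a in self.entries:
            if s == state:
                return a
        return None


def selection_weights(predictions, remembered, bias=3.0):
    """Weight each action by its predicted payoff; boost the remembered one."""
    return {
        a: max(p, 1e-9) * (bias if a == remembered else 1.0)
        for a, p in predictions.items()
    }


def select_action(predictions, remembered, bias=3.0, rng=random):
    """Roulette-wheel selection over the biased weights."""
    weights = selection_weights(predictions, remembered, bias)
    actions = list(weights)
    return rng.choices(actions, weights=[weights[a] for a in actions], k=1)[0]


if __name__ == "__main__":
    memory = RecentBestActionMemory(capacity=2)
    memory.remember("wall-north", "E")  # moving east avoided the obstacle
    payoffs = {"N": 1.0, "E": 1.0, "S": 1.0, "W": 1.0}
    print(select_action(payoffs, memory.recall("wall-north")))
```

With equal predicted payoffs and `bias=3.0`, the remembered action is drawn with probability 3/6 while each of the other three actions has probability 1/6, so the agent still explores but favors the action that recently kept it off the obstacles.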