An Overview of Data Replication Consistency in Distributed Systems
Mahsa Beigrezaei 1
(1 University Lecturer)
Keywords: data replication, consistency, consistency models, distributed systems, cloud, data grid
Abstract:
Distributed systems such as grids and clouds use data replication to address performance problems, guarantee quality of service, and increase data accessibility. Despite its many advantages, replication also incurs management costs, and keeping replicas consistent is among the most significant of them. The trade-off between the cost of replica consistency and the benefits of replication is a hotly debated topic among researchers in this field; attention to replica consistency therefore plays an effective role in the performance of these systems. Researchers have proposed many strategies for replica consistency. Each strategy tries to reduce consistency costs and offer effective solutions by considering various parameters, such as read rate, write rate, tolerance for stale data, number of replicas, and communication bandwidth, when determining replica consistency levels. This article reviews the concepts of replication and replica consistency, surveys the existing classifications and consistency methods in this area, and compares prior work on replica consistency from several perspectives, including system type, decision parameters, simulation tool, consistency model, and improved parameters. Finally, open issues in this area are discussed.
English Abstract:
Nowadays, applications generate huge amounts of data, in the range of several terabytes or petabytes, which is shared among many users around the world. Distributed systems such as grids and clouds provide a suitable platform for these applications, enabling these diverse, data-intensive applications to operate in a distributed manner. These systems use data replication to address performance problems, guarantee quality of service, and increase data accessibility. Despite its many advantages, replication also incurs administrative costs. The balance between the consistency cost of replication and the benefits of replication is a hotly debated topic among researchers in this field; attention to replica consistency therefore plays an effective role in the efficiency of these systems. Many strategies have been proposed by researchers in the field of data replication consistency. Each of these strategies tries to reduce consistency costs and provide effective solutions by considering various parameters, such as read rate, write rate, stale-data tolerance, number of replicas, and communication bandwidth, when determining the consistency levels of replicas. In this article, we examine the concepts related to replication and replica consistency, categorize its types, and review previous works in this field. The surveyed works are compared from the perspectives of system type, decision parameters, consistency model, and improved parameters. Finally, the open issues in this field are raised.
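The core idea behind the rate-based adaptive strategies surveyed here (e.g. Harmony [41]) can be illustrated with a minimal sketch: estimate the chance that a read observes stale data from the workload's read and write rates and the replica-synchronization delay, and escalate to a stronger consistency level only when that risk exceeds the application's staleness tolerance. All function names, the probability estimate, and the thresholds below are illustrative assumptions, not the method of any specific surveyed paper.

```python
# Hypothetical sketch of an adaptive consistency-level selector.
# Rates are in operations per second; sync_delay is the replica
# propagation window in seconds. All names/formulas are illustrative.

def stale_read_probability(read_rate, write_rate, sync_delay):
    """Rough estimate: a read may be stale if a write landed within the
    unpropagated synchronization window just before it."""
    if read_rate <= 0:
        return 0.0
    # Expected fraction of reads overlapping an unpropagated write,
    # capped at 1.0.
    return min(1.0, write_rate * sync_delay)

def choose_consistency(read_rate, write_rate, sync_delay, staleness_tolerance):
    """Return 'eventual' when the stale-read risk is within the
    application's tolerance, otherwise 'strong'."""
    p_stale = stale_read_probability(read_rate, write_rate, sync_delay)
    return "eventual" if p_stale <= staleness_tolerance else "strong"

# Read-heavy workload with few writes: eventual consistency suffices.
print(choose_consistency(read_rate=100, write_rate=0.5, sync_delay=0.1,
                         staleness_tolerance=0.1))   # eventual
# Write-heavy workload under the same tolerance: escalate to strong.
print(choose_consistency(read_rate=100, write_rate=20, sync_delay=0.1,
                         staleness_tolerance=0.1))   # strong
```

Real strategies refine this decision with further inputs mentioned in the survey, such as the number of replicas and the available communication bandwidth, and may offer intermediate levels rather than a binary choice.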