• Open Access
    • List of Articles data

      • Open Access Article

        1 - A method for clustering customers using the RFM model and grey numbers under uncertainty
        azime mozafari
        The purpose of this study is to present a method for clustering bank customers based on the RFM model under uncertainty. In the proposed framework, after determining the values of the RFM model parameters, namely recency of exchange (R), frequency of exchange (F), and monetary value of exchange (M), grey theory is used to handle the uncertainty, and the customers are segmented using a different approach. Bank customers are thus clustered into three main segments of good, ordinary, and bad customers. After validating the clusters using the Dunn and Davies-Bouldin indices, the characteristics of the customers in each segment are identified. Finally, recommendations are offered to improve the customer relationship management system.
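
        The cluster-validation step named above can be sketched briefly. The snippet below is an illustrative simplification, not the paper's code: k-means stands in for the grey-number-based segmentation, the Davies-Bouldin index comes from scikit-learn, and the Dunn index is computed directly.

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.metrics import davies_bouldin_score, pairwise_distances

        def dunn_index(X, labels):
            """Dunn index: smallest inter-cluster distance / largest intra-cluster diameter."""
            clusters = [X[labels == c] for c in np.unique(labels)]
            max_diam = max(pairwise_distances(c).max() for c in clusters)
            min_sep = min(pairwise_distances(a, b).min()
                          for i, a in enumerate(clusters)
                          for b in clusters[i + 1:])
            return min_sep / max_diam

        # rfm rows = customers, columns = (R, F, M); the values are purely illustrative
        rfm = np.array([[10, 5, 200.0], [200, 1, 20.0], [15, 7, 350.0], [300, 2, 15.0]])
        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(rfm)
        print("Davies-Bouldin:", davies_bouldin_score(rfm, labels))
        print("Dunn:", dunn_index(rfm, labels))
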
      • Open Access Article

        2 - Modified orthogonal chaotic colonial competition algorithm and its application in improving pattern recognition in multilayer perceptron neural network
        Payman Moallem mehrdad sadeghi hariri MAHDI hashemi
        Despite the success of the Colonial Competition Algorithm (ICA) in solving optimization problems, this algorithm still suffers from repeated entrapment in local minima and low convergence speed. In this paper, a new version of this algorithm, called Modified Orthogonal Chaotic Colonial Competition (COICA), is proposed. In the absorption policy of the proposed version, each colony searches the space to move towards its colonizer through the definition of a new orthogonal vector. Also, the probability of selecting powerful empires is defined through the Boltzmann distribution function, and the selection operation is performed with the roulette-wheel method. The proposed algorithm is used to train a multilayer perceptron (MLP) neural network that classifies standard datasets, including Ionosphere and Sonar. To evaluate the performance of this algorithm and the generalizability of the neural network trained with the proposed version, K-fold cross-validation has been used. The simulation results confirm the reduction of the network training error as well as the improved generalizability of the proposed algorithm.
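
        The empire-selection step described above (Boltzmann probabilities drawn with a roulette wheel) can be illustrated with a small sketch; this is a generic simplification with assumed cost values, not the authors' implementation.

        import numpy as np

        def boltzmann_roulette(empire_costs, temperature=1.0, rng=None):
            """Pick an empire index; lower cost gives a higher Boltzmann probability."""
            rng = rng if rng is not None else np.random.default_rng(0)
            costs = np.asarray(empire_costs, dtype=float)
            weights = np.exp(-costs / temperature)        # Boltzmann weights
            probs = weights / weights.sum()               # normalized selection probabilities
            return int(np.searchsorted(np.cumsum(probs), rng.random()))   # roulette wheel

        costs = [3.2, 1.1, 2.5, 0.9]                      # illustrative total costs of four empires
        rng = np.random.default_rng(42)
        picks = [boltzmann_roulette(costs, temperature=0.5, rng=rng) for _ in range(1000)]
        print("selection frequencies:", np.bincount(picks, minlength=len(costs)) / 1000)
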
      • Open Access Article

        3 - Integrating Data Envelopment Analysis and Decision Tree Models in Order to Evaluate Information Technology-Based Units
        Amir Amini ali alinezhad somaye shafaghizade
        In order to evaluate the performance and desirability of the activities of its units, each organization needs an evaluation system, and this is even more important for financial institutions, including information technology-based companies. Data envelopment analysis (DEA) is a non-parametric method for measuring the effectiveness and efficiency of decision-making units (DMUs). On the other hand, data mining techniques allow DMUs to explore and discover meaningful information that was previously hidden in large databases. This paper presents a general framework for combining DEA and regression trees to evaluate the effectiveness and efficiency of DMUs. The resulting hybrid model is a set of rules that can be used by policy makers to discover the reasons behind efficient and inefficient DMUs. To examine the factors related to productivity with the proposed method, a sample of 18 branches of Iran Insurance in Tehran was selected as a case study. After modeling, the input-oriented LVM model with weak disposability in data envelopment analysis was calculated using the undesirable output, and a decision tree technique was used to extract and discover rules explaining the causes of increased and decreased productivity.
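
        To make the DEA side concrete, here is a compact input-oriented CCR envelopment model solved with SciPy's linprog. This is only a generic sketch of the mechanics; the paper's input-oriented LVM model with weak disposability and an undesirable output is more involved, and the data below are invented.

        import numpy as np
        from scipy.optimize import linprog

        def ccr_input_efficiency(X, Y, o):
            """Efficiency of DMU o. X: (n_dmu, n_inputs), Y: (n_dmu, n_outputs)."""
            n, m = X.shape
            s = Y.shape[1]
            c = np.r_[1.0, np.zeros(n)]                   # minimize theta; variables = [theta, lambdas]
            A_in = np.hstack([-X[[o]].T, X.T])            # sum_j lambda_j x_ij - theta x_io <= 0
            A_out = np.hstack([np.zeros((s, 1)), -Y.T])   # -sum_j lambda_j y_rj <= -y_ro
            A_ub = np.vstack([A_in, A_out])
            b_ub = np.r_[np.zeros(m), -Y[o]]
            bounds = [(0, None)] * (n + 1)
            res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
            return res.fun

        X = np.array([[20.0, 300], [30, 200], [40, 100], [20, 200]])   # 4 branches, 2 inputs
        Y = np.array([[100.0], [80], [90], [96]])                      # 1 output
        for o in range(len(X)):
            print(f"DMU {o}: efficiency = {ccr_input_efficiency(X, Y, o):.3f}")
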
      • Open Access Article

        4 - A Combinational Model for Evaluating Organizational Readiness for Data Warehouse Implementation by Using Analytical Hierarchical Process
        jafar bagherinejad zhinoos adibi
        An enterprise data warehouse initiative is a high-investment project, and the adoption of a data warehouse will differ significantly depending on the level of readiness of an organization. Before implementing a data warehouse system in a firm, it is therefore necessary to evaluate the firm's level of readiness. A successful data warehouse assessment model requires a deep understanding of the opportunities, challenges, and influential factors that a typical firm's data warehouse (DW) may involve. A data warehouse system is in fact one of the tools of knowledge management and decision support systems: with this system, the data distributed throughout an organization can be collected, extracted, and integrated, and with knowledge discovery and data mining the latent information can be extracted and analyzed. In this paper, after reviewing the relevant literature and comparatively analyzing assessment models of organizational readiness for data warehouse implementation, a conceptual framework was designed and its validity was confirmed by hypothesis testing. Then, using the analytical hierarchical process technique and the Expert Choice software, the criteria and sub-criteria of the influential factors were assessed and weighted. The validity and effectiveness of the model, comprising six criteria and 23 sub-criteria with the main influential factors named information needs, data structure, organizational processes, organizational factors, technical structure, and project management, were confirmed by a field study and the relevant statistical analysis.
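
        The AHP weighting step can be sketched with the standard principal-eigenvector method and a consistency check. The 3x3 pairwise comparison matrix below is an assumed example, not the paper's six-criteria hierarchy.

        import numpy as np

        def ahp_weights(pairwise):
            """Priority weights and consistency ratio from a pairwise comparison matrix."""
            A = np.asarray(pairwise, dtype=float)
            n = A.shape[0]
            eigvals, eigvecs = np.linalg.eig(A)
            k = np.argmax(eigvals.real)                               # principal eigenvalue
            w = np.abs(eigvecs[:, k].real)
            w /= w.sum()                                              # normalized priority vector
            ci = (eigvals[k].real - n) / (n - 1)                      # consistency index
            ri = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24}[n]   # Saaty's random index (up to n=6)
            return w, (ci / ri if ri else 0.0)

        pairwise = [[1, 3, 5],
                    [1 / 3, 1, 2],
                    [1 / 5, 1 / 2, 1]]
        weights, cr = ahp_weights(pairwise)
        print("weights:", weights.round(3), "consistency ratio:", round(cr, 3))
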
      • Open Access Article

        5 - Learning to Rank for the Persian Web Using the Layered Genetic Programming
        Amir Hosein Keyhanipour
        Learning to rank (L2R) has emerged as a promising approach for handling the existing challenges of Web search engines. However, there are major drawbacks with the present learning-to-rank techniques. Current L2R algorithms do not take into account the search behavior of the users embedded in their search session logs. On the other hand, machine learning, as a data-intensive process, requires a large volume of data about users' queries as well as Web documents. This situation has made the usage of L2R techniques questionable in real-world applications. Recently, using the click-through data model and the generation of click-through features, a novel approach named MGP-Rank has been proposed. Using the layered genetic-programming model, MGP-Rank has achieved noticeable performance in ranking English Web content. In this study, with respect to the specific characteristics of the Persian language, suitable scenarios are presented for the generation of the click-through features, and in this way a customized version of MGP-Rank is proposed for Persian Web retrieval. The evaluation results of this algorithm on the dotIR dataset indicate a considerable improvement in comparison with major ranking methods. The improvement is particularly noticeable in the top part of the result lists, which are most frequently visited by Web users.
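
        As a small illustration of the click-through idea, the sketch below derives simple per (query, URL) click-through features from a session log with pandas. The column names and aggregates are assumptions for illustration, not the MGP-Rank feature set.

        import pandas as pd

        # hypothetical search-session log: one row per click
        log = pd.DataFrame({
            "query":   ["iran map", "iran map", "iran map", "weather", "weather"],
            "url":     ["a.ir", "b.ir", "a.ir", "c.ir", "a.ir"],
            "rank":    [1, 2, 1, 1, 3],
            "dwell_s": [40, 5, 60, 12, 90],
        })

        features = (
            log.groupby(["query", "url"])
               .agg(click_count=("url", "size"),        # how often the pair was clicked
                    mean_rank=("rank", "mean"),          # average position when clicked
                    mean_dwell=("dwell_s", "mean"))      # rough proxy for user satisfaction
               .reset_index()
        )
        features["click_share"] = (features["click_count"]
                                   / features.groupby("query")["click_count"].transform("sum"))
        print(features)
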
      • Open Access Article

        6 - Providing a method for customer segmentation using the RFM model under uncertainty
        mohammadreza gholamian azime mozafari
        The purpose of this study is to provide a method for segmenting the customers of a private bank in Shiraz based on the RFM model in the face of uncertainty about customer data. In the proposed framework of this study, the values of the RFM model indicators, namely recency of exchange (R), number of exchanges (F), and monetary value of exchange (M), were first extracted from the customer database and preprocessed. Given the breadth of the data, it is not possible to set an exact number that determines whether a customer is good or bad; therefore, to handle this uncertainty, grey number theory was used, which describes a customer's situation as a range. In this way, using a different method, the bank's customers were segmented, and according to the results the customers were divided into three main sections or clusters of good, ordinary, and bad customers. After validating the clusters using the Dunn and Davies-Bouldin indices, the customer characteristics in each section were identified, and at the end suggestions were made to improve the customer relationship management system.
      • Open Access Article

        7 - Increasing the value of collected data and reducing energy consumption by using network coding and mobile sinks in wireless sensor networks
        ehsan kharati
        A wireless sensor network consists of a number of fixed sensor nodes among which mobile sink nodes move to collect data. To reduce energy consumption and increase the value of the collected data, it is necessary to determine the optimal routes and residence locations of the mobile sinks, which increases the lifetime of the wireless sensor network. Using network coding, this paper presents a mixed integer linear programming model to determine the optimal multicast routing from source sensor nodes to mobile sinks in wireless sensor networks; the model determines the time and location of the sinks so as to collect the maximum amount of coded data and to reduce sink movement delay and energy consumption. Solving this problem in polynomial time is not possible due to the many parameters involved and the constrained resources of wireless sensor networks. Therefore, several heuristic, greedy, and fully distributed algorithms are proposed to determine the movement of the sinks and their residence locations based on maximizing the value of the coded data and the data expiration times. Simulations show that the optimal method and the use of coding together with the proposed algorithms reduce the runtime and energy consumption and increase the value of the collected data and the network lifetime compared with non-coding methods.
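
        The greedy flavor of such algorithms can be illustrated with a toy sketch (my own simplification, not the paper's MILP or its distributed variants): at each step the sink picks the candidate residence point that adds the most not-yet-collected data value within radio range.

        import math

        def greedy_residence_points(sensors, candidates, radio_range, max_stops):
            """sensors: {id: (x, y, data_value)}, candidates: {id: (x, y)}."""
            covered, chosen = set(), []
            for _ in range(max_stops):
                best, best_gain = None, 0.0
                for cid, (cx, cy) in candidates.items():
                    gain = sum(v for sid, (sx, sy, v) in sensors.items()
                               if sid not in covered and math.hypot(sx - cx, sy - cy) <= radio_range)
                    if gain > best_gain:
                        best, best_gain = cid, gain
                if best is None:                          # nothing new can be collected
                    break
                chosen.append(best)
                bx, by = candidates[best]
                covered |= {sid for sid, (sx, sy, v) in sensors.items()
                            if math.hypot(sx - bx, sy - by) <= radio_range}
            return chosen

        sensors = {1: (0, 0, 5.0), 2: (3, 4, 2.0), 3: (10, 0, 8.0)}
        candidates = {"A": (1, 1), "B": (9, 1)}
        print(greedy_residence_points(sensors, candidates, radio_range=5.0, max_stops=2))
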
      • Open Access Article

        8 - An Improved Method for Detecting Phishing Websites Using Data Mining on Web Pages
        mahdiye baharloo Alireza Yari
        Phishing plays a negative role in reducing the trust among users in business networks based on the e-commerce framework. Therefore, in this research we tried to detect phishing websites using data mining. The detection of discriminative features of phishing is regarded as one of the important prerequisites for designing an accurate detection system. Therefore, in order to detect phishing features, a list of 30 features previously suggested for phishing websites was first prepared. Then, a two-stage feature reduction method based on feature selection and feature extraction was proposed to enhance the efficiency of phishing detection systems, which was able to reduce the number of features significantly. Finally, the performance of the J48 decision tree, random forest, and naïve Bayes methods was evaluated on the reduced features. The results indicated that the accuracy of the model created to identify phishing websites using the two-stage feature reduction based on the Wrapper and Principal Component Analysis (PCA) algorithms reached 96.58% with the random forest method, which is a desirable outcome compared with the other methods.
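
        A hedged sketch of the two-stage reduction followed by classification: scikit-learn's RFE stands in here for a generic wrapper-style selector, followed by PCA and a random forest. The dataset is synthetic and the parameter choices are assumptions, so this only shows the shape of the pipeline, not the paper's exact configuration.

        from sklearn.datasets import make_classification
        from sklearn.decomposition import PCA
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.feature_selection import RFE
        from sklearn.model_selection import cross_val_score
        from sklearn.pipeline import Pipeline

        # synthetic stand-in for the 30 phishing indicators
        X, y = make_classification(n_samples=1000, n_features=30, n_informative=12, random_state=0)

        pipeline = Pipeline([
            ("wrapper", RFE(RandomForestClassifier(n_estimators=50, random_state=0),
                            n_features_to_select=15)),    # stage 1: wrapper-style selection
            ("pca", PCA(n_components=8)),                  # stage 2: feature extraction
            ("clf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ])

        scores = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy")
        print("mean CV accuracy:", scores.mean().round(4))
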
      • Open Access Article

        9 - An access control model for online social networks using user-to-user relationships
        Mohamad Javad Piran mahmud deypir
        With the pervasiveness of social networks and the growing amount of information shared on them, the users of these networks are exposed to potential threats to data security and privacy. The privacy settings included in these networks do not give users complete control over managing and restricting access to their shared information by other users. In this article, using the concept of the social graph, a new user-to-user access control model is proposed that allows the expression of privacy policies and more precise, fine-grained access control in terms of the pattern and depth of relationships between users in social networks. Using the regular index method, indirect relationships among users are examined and analyzed, and more precise policies than those of previous models are presented. The evaluation of the results showed that, for 10 neighbors per user, the cumulative probability of finding a qualified path for the first three values of the hop counter is 1%, 10.5%, and 67.3%, respectively, and it reaches 100% at the fourth value. As the defined counting limit increases, the average execution time of the proposed algorithm and of previously proposed algorithms increases; however, for higher values of the counting limit, the proposed algorithm performs better than the previous ones.
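
        A small illustrative sketch (not the paper's regular-index implementation) of the underlying check: is the requester reachable from the resource owner through a relationship path of allowed types within a maximum depth? networkx is used for the graph, and the users and policy are invented.

        import networkx as nx

        def has_qualified_path(graph, owner, requester, allowed_types, max_depth):
            """True if some owner-to-requester path of <= max_depth hops uses only allowed edge types."""
            for path in nx.all_simple_paths(graph, owner, requester, cutoff=max_depth):
                if all(graph[u][v]["type"] in allowed_types for u, v in zip(path, path[1:])):
                    return True
            return False

        G = nx.DiGraph()
        G.add_edge("alice", "bob", type="friend")
        G.add_edge("bob", "carol", type="friend")
        G.add_edge("alice", "dave", type="colleague")

        # policy: only friend-of-friend chains of at most 2 hops may view the resource
        print(has_qualified_path(G, "alice", "carol", {"friend"}, max_depth=2))   # True
        print(has_qualified_path(G, "alice", "dave", {"friend"}, max_depth=2))    # False
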
      • Open Access Article

        10 - A Decision Support System based on Rough sets for Enterprise Planning under uncertainty
        Seyed Amirhadi Minoofam Hassan Rashidi
        The increasing rate of new technology in global marketing raises challenges in economic enterprise planning. One of the appropriate approaches to resolving these challenges is using rough set theory along with decision making. In this paper, a decision support system with an algorithm based on rough set theory is provided. The proposed algorithm is implemented for a product line in one of the organizations under the supervision of the Ministry of Industry, Mine and Trade. The effects of the variables on the enterprise's aims are evaluated by analyzing the strength and support criteria of the rough sets. The rules are classified into three different classes: 3 out of 12 have a reasonably high average, while another 3 have a relatively high violation probability; the remaining rules have a heterogeneous distribution and are not certain. The advantages of the proposed system are the avoidance of wasting enterprise capital, the prevention of errors due to data uncertainty, and the high precision of decisions. The decision makers in the enterprise confirmed the increased simplicity and speed of vital decision making achieved by using the proposed system.
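
        The strength and support criteria mentioned above can be illustrated over a toy decision table; the attribute names, values, and the rule below are hypothetical, not the organization's data. Here support is the fraction of all objects matching the rule and its decision, and strength is read as the rule's confidence.

        # toy decision table: (condition attributes) -> decision
        rows = [
            ({"demand": "high", "stock": "low"},  "increase_production"),
            ({"demand": "high", "stock": "low"},  "increase_production"),
            ({"demand": "high", "stock": "high"}, "keep"),
            ({"demand": "low",  "stock": "high"}, "reduce_production"),
            ({"demand": "high", "stock": "low"},  "keep"),
        ]

        def rule_metrics(rows, condition, decision):
            n = len(rows)
            matched = [d for c, d in rows if all(c.get(k) == v for k, v in condition.items())]
            hits = sum(d == decision for d in matched)
            support = hits / n
            confidence = hits / len(matched) if matched else 0.0
            return support, confidence

        sup, conf = rule_metrics(rows, {"demand": "high", "stock": "low"}, "increase_production")
        print(f"support={sup:.2f}, strength/confidence={conf:.2f}")
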
      • Open Access Article

        11 - An Intelligent Model for Multidimensional Personality Recognition of Users using Deep Learning Methods
        Hossein Sadr fatemeh mohades deilami morteza tarkhan
        Due to the significant growth of textual information and data generated by humans on social networks, there is a need for systems that can automatically analyze these data and extract valuable information from them. Among the most important textual data are people's opinions about a particular topic, which are expressed in the form of text; the text published by users on social networks can represent their personality. Although machine learning based methods can be considered a good choice for analyzing these data, there is also a remarkable need for deep learning based methods to overcome the complexity and dispersion of the content and syntax of textual data during the training process. In this regard, the purpose of this paper is to employ deep learning based methods for personality recognition. Accordingly, a convolutional neural network is combined with the AdaBoost algorithm to exploit the contribution of various filter lengths and grasp their potential in the final classification, by combining classifiers with different filter sizes using AdaBoost. The proposed model was evaluated on the Essays and YouTube datasets. Based on the empirical results, the proposed model presented superior performance compared to other existing models on both datasets.
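
        A heavily simplified sketch of combining convolutional classifiers with different filter lengths: each Keras text CNN uses one filter size, and their predictions are mixed with AdaBoost-style voting weights derived from their error rates. The data, sizes, and weighting shortcut are assumptions, not the paper's training procedure.

        import numpy as np
        import tensorflow as tf

        VOCAB, MAXLEN, CLASSES = 5000, 200, 2     # illustrative vocabulary, sequence length, labels

        def text_cnn(filter_size):
            """One convolutional text classifier with a single filter length."""
            return tf.keras.Sequential([
                tf.keras.layers.Embedding(VOCAB, 64),
                tf.keras.layers.Conv1D(128, filter_size, activation="relu"),
                tf.keras.layers.GlobalMaxPooling1D(),
                tf.keras.layers.Dense(CLASSES, activation="softmax"),
            ])

        rng = np.random.default_rng(0)            # dummy data standing in for encoded essays
        X = rng.integers(0, VOCAB, size=(512, MAXLEN))
        y = rng.integers(0, CLASSES, size=512)

        members, weights = [], []
        for fs in (3, 4, 5):                      # one base learner per filter length
            model = text_cnn(fs)
            model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
            model.fit(X, y, epochs=1, batch_size=64, verbose=0)
            err = 1.0 - model.evaluate(X, y, verbose=0)[1]
            members.append(model)
            weights.append(np.log((1.0 - err) / max(err, 1e-6)))   # AdaBoost-style voting weight

        probs = sum(w * m.predict(X, verbose=0) for w, m in zip(weights, members))
        print("ensemble predictions:", probs.argmax(axis=1)[:10])
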
      • Open Access Article

        12 - Use of conditional generative adversarial network to produce synthetic data with the aim of improving the classification of users who publish fake news
        arefeh esmaili Saeed Farzi
        For many years, fake news and messages have been spread in human societies, and today, with the spread of social networks among people, the possibility of spreading false information has increased more than before. Therefore, detecting fake news and messages has become a prominent issue in the research community, and it is also important to detect the users who generate this false information and publish it on the network. This paper detects users who publish incorrect information on the Persian-language Twitter social network. In this regard, a system has been built that combines context-user and context-network features, with the help of a conditional generative adversarial network (CGAN) for balancing the data set. The system detects users who publish fake news by modeling the Twitter social network as a graph of user interactions and embedding each node into a feature vector with Node2vec. In several experiments, the proposed system improved the evaluation metrics by up to 11%, 13%, 12%, and 12% in precision, recall, F-measure, and accuracy, respectively, compared with its competitors, and achieved about 99% precision in detecting users who publish fake news.
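
        A partial sketch of the graph-embedding side only: the interaction graph is built with networkx and node vectors are learned with the node2vec package (assumed installed). The CGAN used in the paper for balancing is out of scope here; imblearn's RandomOverSampler appears purely as a stand-in, and the toy graph and labels are invented.

        import networkx as nx
        import numpy as np
        from node2vec import Node2Vec
        from imblearn.over_sampling import RandomOverSampler
        from sklearn.ensemble import RandomForestClassifier

        # toy user-interaction graph (e.g. retweets / replies between users)
        G = nx.Graph()
        G.add_edges_from([("u1", "u2"), ("u2", "u3"), ("u3", "u4"), ("u1", "u4"), ("u4", "u5")])

        n2v = Node2Vec(G, dimensions=16, walk_length=5, num_walks=20, workers=1)
        emb = n2v.fit(window=3, min_count=1)
        X = np.array([emb.wv[n] for n in G.nodes()])
        y = np.array([0, 0, 0, 1, 0])             # 1 = known publisher of fake news (toy labels)

        X_bal, y_bal = RandomOverSampler(random_state=0).fit_resample(X, y)   # stand-in for CGAN balancing
        clf = RandomForestClassifier(random_state=0).fit(X_bal, y_bal)
        print(clf.predict(X))
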
      • Open Access Article

        13 - A framework for establishing a national data vault for Data Governance institution
        Nader naghshineh fatima fahimnia hamidreza Ahmadian chashmi
        The main goal of this research is to present a framework for national data, concentrating on data governance parameters, in order to design an effective and comprehensive pattern for all points that interact with national data. A descriptive approach and a mixed method were adopted for this research. In the first step, the articles on national data organization were extracted and then matched with articles on technology ecosystem design patterns, from which 10 key components were formed as the main modules. Thereafter, for each module, indexes and sub-indexes were identified by considering the literature and also taking advantage of interviews and the Delphi method. By designing two questionnaires, one strategy-and-management oriented and the other technical-and-legal oriented, a total of 22 indexes and 154 sub-indexes were collected. The research can serve as a scientific reference for the national data vault, and it is recommended that the development of technical infrastructure and data governance patterns at the national level be aligned with the indexes and sub-indexes identified in this research.
      • Open Access Article

        14 - Investigating the Information and Communication Technology Deployment Impact on Energy Expenditures of Iranian households (A Provincial Approach)
        Elham Hosseinzadeh َAmir Hossein Mozayani
        Nowadays, investing in information and communication technology (ICT) is inevitable, because it affects various aspects of human life, including the economy. Due to the rapid growth of population, increasing energy demand, and limited energy resources, one of the basic measures for achieving sustainable development in countries is the optimization and reform of energy consumption structures. Given that the household sector is one of the main sectors of energy consumption, one of the effective approaches to reducing and managing household energy expenditures is to use ICT capabilities. In this regard, this study analyzes the effect of ICT expansion on the energy expenditures of urban households in Iran using the panel data method and a GLS model over the period 2008-2015, in the form of provincial data. The results indicate that in some models a significant reducing effect of ICT on energy expenditure was observed; however, in most of the estimated models there is no significant reducing effect of ICT on household energy expenditure. The main reasons for this appear to be the subsidy structure governing energy prices, the low share of energy in total household consumption expenditures, and the lack of a proper consumption culture.
      • Open Access Article

        15 - Presenting a model for opinion mining at the document feature level for hotel users' reviews
        ELHAM KHALAJJ shahriyar mohammadi
        Nowadays, online reviews of users' sentiments and opinions on the Internet are an important part of the process by which people decide whether to choose a product or use a service. Thanks to the Internet platform and easy access to opinion blogs in the field of tourism and the hotel industry, there are huge and rich sources of opinions in the form of text, and text mining methods can be used to discover the opinions they contain. Due to the importance of users' sentiments and opinions in industry, especially in the tourism and hotel industry, opinion mining, sentiment analysis, and the exploration of texts written by users have received considerable attention. In this research, a new hybrid method is presented based on a common approach in sentiment analysis: the use of word lexicons to produce features for classifying reviews. To this end, two methods of lexicon construction are developed, one using statistical methods and the other using a genetic algorithm. These lexicons are combined with a general sentiment vocabulary and Bing Liu's standard list of opinion words to increase the classification accuracy.
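
        A tiny sketch of the lexicon-based feature idea: counting matches of positive and negative lexicon entries in each review and using the counts as classifier features. The lexicons below are toy stand-ins for the statistically and genetically constructed vocabularies described above.

        # toy sentiment lexicons (stand-ins for the constructed vocabularies)
        POSITIVE = {"clean", "friendly", "comfortable", "great"}
        NEGATIVE = {"dirty", "noisy", "rude", "broken"}

        def lexicon_features(review):
            """Return (positive hits, negative hits, net polarity) for one review."""
            tokens = review.lower().split()
            pos = sum(t in POSITIVE for t in tokens)
            neg = sum(t in NEGATIVE for t in tokens)
            return pos, neg, pos - neg

        reviews = [
            "great location and friendly staff but the room was noisy",
            "dirty bathroom and rude reception",
        ]
        for r in reviews:
            pos, neg, net = lexicon_features(r)
            print((pos, neg, net), "->", "positive" if net >= 0 else "negative")
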
      • Open Access Article

        16 - Presenting a novel solution to choose a proper database for storing big data in national network services
        Mohammad Reza Ahmadi davood maleki ehsan arianyan
        The increasing development of data-producing tools in different services, the need to store the results of large-scale processing produced by various activities in the national information network services, and the data produced by the private sector and social networks have made migration to new database solutions with appropriate features inevitable. With the expansion and change in the size and composition of data and the formation of big data, traditional practices and patterns no longer meet the new requirements; therefore, it has become necessary to use information storage systems in new and scalable formats and models. In this paper, the basic structural dimensions and different functions of both traditional databases and modern storage systems are reviewed, and a new technical solution for migrating from traditional to modern databases is presented. The basic considerations for connecting traditional and modern databases for storing and processing data obtained from the comprehensive services of the national information network are also presented, and the parameters and capabilities of databases in the standard and Hadoop contexts are examined. In addition, as a practical example, a solution for combining traditional and modern databases is presented, evaluated, and compared using the BSC method, and it is shown that for data sets with different data volumes a combined use of both traditional and modern databases can be the most efficient solution.
      • Open Access Article

        17 - Identifying and ranking factors affecting the digital transformation strategy in Iran's road freight transportation industry focusing on the Internet of Things and data analytics
        Mehran Ehteshami Mohammad Hasan Cheraghali Bita Tabrizian Maryam Teimourian sefidehkhan
        This research was carried out with the aim of identifying and ranking the factors affecting the digital transformation strategy in Iran's road freight transportation industry, focusing on the Internet of Things and data analytics. After reviewing the literature, semi-structured interviews were conducted with 20 academic and road freight transportation industry experts in Iran, who were selected using the purposive sampling method and the saturation principle. In the quantitative part, the opinions of 170 employees of this industry, who were selected based on Cochran's formula and the stratified sampling method, were collected using a researcher-made questionnaire. The Delphi technique, literature review, and coding were used to analyze the data in the qualitative part; in the quantitative part, inferential statistics and the SPSS and SmartPLS software packages were used. Finally, 40 indicators were extracted in the form of 8 factors, and the indicators and factors were ranked using factor analysis. The results show that internal factors have the highest rank, and that software infrastructure, hardware infrastructure, economic, external, legal, cultural, and penetration factors occupy the next ranks, respectively. It is therefore suggested that organizations align their human resource empowerment programs with the use of technology and digital tools.
      • Open Access Article

        18 - Anomaly and Intrusion Detection Through Data Mining and Feature Selection using PSO Algorithm
        Fereidoon Rezaei Mohamad Ali Afshar Kazemi Mohammad Ali Keramati
        Today, considering technological development, the increased use of the Internet in business, and the movement of businesses from physical to virtual and Internet-based forms, attacks and anomalies have also changed from physical to virtual: instead of robbing a store or market, individuals intrude into websites and virtual markets through cyberattacks and disrupt them. The detection of attacks and anomalies is one of the new challenges in promoting e-commerce technologies. Detecting network anomalies and destructive activities in e-commerce can be done by analyzing the behavior of network traffic. Data mining techniques are used extensively in intrusion detection systems (IDS) in order to detect anomalies. Reducing the number of feature dimensions plays an important role in intrusion detection, since detecting anomalies in high-dimensional network traffic features is a time-consuming process, and choosing suitable and accurate features influences the speed of the analysis, resulting in improved detection speed. In this article, by using data mining algorithms such as J48 and PSO, we were able to significantly improve the accuracy of detecting anomalies and attacks.
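
        A compact sketch of PSO-based feature selection with a decision tree (a J48-like learner) as the fitness function. The binary-PSO update, the synthetic data, and all parameter values are assumptions for illustration only.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.model_selection import cross_val_score
        from sklearn.tree import DecisionTreeClassifier

        rng = np.random.default_rng(0)
        X, y = make_classification(n_samples=600, n_features=20, n_informative=8, random_state=0)
        n_particles, n_features, n_iter = 10, X.shape[1], 15

        def fitness(mask):
            if not mask.any():
                return 0.0
            clf = DecisionTreeClassifier(random_state=0)          # J48-like base learner
            return cross_val_score(clf, X[:, mask], y, cv=3).mean()

        # binary PSO: positions are boolean feature masks, velocities are real-valued
        pos = rng.random((n_particles, n_features)) > 0.5
        vel = rng.normal(0, 1, (n_particles, n_features))
        pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
        gbest = pbest[pbest_fit.argmax()].copy()

        for _ in range(n_iter):
            r1, r2 = rng.random((2, n_particles, n_features))
            vel = (0.7 * vel
                   + 1.5 * r1 * (pbest.astype(float) - pos.astype(float))
                   + 1.5 * r2 * (gbest.astype(float) - pos.astype(float)))
            pos = rng.random((n_particles, n_features)) < 1 / (1 + np.exp(-vel))   # sigmoid transfer
            fit = np.array([fitness(p) for p in pos])
            improved = fit > pbest_fit
            pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
            gbest = pbest[pbest_fit.argmax()].copy()

        print("selected features:", np.flatnonzero(gbest), "CV accuracy:", round(pbest_fit.max(), 4))
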
      • Open Access Article

        19 - Synthesizing an image dataset for text detection and recognition in images
        Fatemeh Alimoradi Farzaneh Rahmani Leila Rabiei Mohammad Khansari Mojtaba Mazoochi
        Text detection in images is one of the most important sources of information for image recognition. Although much research has been conducted on text detection and recognition, and on end-to-end models (models that perform detection and recognition in a single model) based on deep learning for languages such as English and Chinese, the main obstacle to developing such models for the Persian language is the lack of a large training data set. In this paper, we design and build the tools required for synthesizing a data set of scene-text images with parameters such as color, size, font, and text rotation for Persian. These tools are used to generate a large yet varied data set for training deep learning models. Due to the considerations applied in the synthesizing tools and the resulting variety of texts, models do not depend on the synthesis parameters and can generalize. As a sample data set, 7,603 scene-text images and 39,660 cropped word images are synthesized. The advantage of our method over real images is the ability to synthesize any arbitrary number of images without the need for manual annotation. As far as we know, this is the first open-source large data set of scene-text images for the Persian language.
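
        A minimal sketch of the word-image synthesis idea using Pillow; the font path, sizes, and rotation range are assumptions, and correct Persian rendering in practice additionally needs text shaping (for example with arabic_reshaper and python-bidi), which is omitted here.

        import random
        from PIL import Image, ImageDraw, ImageFont

        def synthesize_word_image(text, font_path, out_path):
            """Render one word on a plain background with random size, color and rotation."""
            size = random.randint(28, 64)
            font = ImageFont.truetype(font_path, size)          # a Persian-capable TTF is assumed
            img = Image.new("RGB", (size * max(len(text), 1), size * 2), color=(255, 255, 255))
            draw = ImageDraw.Draw(img)
            color = tuple(random.randint(0, 120) for _ in range(3))
            draw.text((size // 2, size // 2), text, font=font, fill=color)
            img = img.rotate(random.uniform(-10, 10), expand=True, fillcolor=(255, 255, 255))
            img.save(out_path)
            return out_path, text                                # image plus its ground-truth label

        # example: one cropped-word sample (the font file is system dependent)
        print(synthesize_word_image("تهران", "Vazir.ttf", "sample_0001.png"))
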
      • Open Access Article

        20 - Data-driven Marketing in Digital Businesses from Dynamic Capabilities View
        Maede  Amini vlashani ayoub mohamadian Seyed Mohammadbagher Jafari
        Despite the enormous volume of data and the benefits it can bring to marketing activities, it is unclear in the literature how to use it, and very few studies have been conducted in this field. In this regard, this study uses the dynamic capabilities view to identify the dynamic capabilities of data-driven marketing, with the aim of focusing on data in the development of marketing strategies, making effective decisions, and improving efficiency in marketing processes and operations. The research was carried out with a qualitative method utilizing a content analysis strategy and interviews with specialists. The subjects were 18 professionals in the field of data analytics and marketing, selected by the purposeful sampling method. The study identifies data-driven marketing dynamic capabilities including the ability to absorb marketing data, to aggregate and analyze marketing data, to make data-driven decisions, and to improve the data-driven customer experience, as well as capabilities for data-driven innovation, networking, agility, and data-driven transformation. The results of this study can be a step towards developing the theory of dynamic capabilities in the field of marketing with a data-driven approach. They can therefore be used in training and in creating new organizational capabilities for using big data in the marketing activities of organizations, developing and improving data-driven products and services, and improving the customer experience.
      • Open Access Article

        21 - Survey on the Applications of the Graph Theory in the Information Retrieval
        Maryam Piroozmand Amir Hosein Keyhanipour Ali Moeini
        Due to its power in modeling complex relations between entities, graph theory has been widely used in dealing with real-world problems. On the other hand, information retrieval has emerged as one of the major problems in the area of algorithms and computation. As graph-based information retrieval algorithms have proven efficient and effective, this paper aims to provide an analytical review of these algorithms and to propose a categorization of them. Briefly speaking, graph-based information retrieval algorithms can be divided into three major classes: the first category includes algorithms that use a graph representation of the corresponding dataset within the information retrieval process; the second category contains semantic retrieval algorithms that utilize graph theory; and the third category is associated with the application of graph theory to the learning-to-rank problem. The set of reviewed research works is analyzed based on both frequency and publication time. An interesting finding of this review is that the third category is a relatively hot research topic in which only a limited number of recent research works have been conducted.
      • Open Access Article

        22 - Fuzzy Multicore Clustering of Big Data in the Hadoop MapReduce Framework
        Seyed Omid Azarkasb Seyed Hossein Khasteh Mostafa  Amiri
        A logical way to account for the overlap of clusters is to assign a set of membership degrees to each data point. Fuzzy clustering, due to its reduced partitions and decreased search space, generally incurs lower computational overhead and easily handles ambiguous, noisy, and outlier data, and is therefore considered an advanced clustering method. However, fuzzy clustering methods often struggle with non-linear data relationships. This paper proposes a method, based on feasible ideas, that utilizes multicore learning within the Hadoop MapReduce framework to identify linearly non-separable clusters in complex big data structures. The multicore learning model is capable of capturing complex relationships among data, while Hadoop enables us to interact with a logical cluster of processing and data storage nodes instead of interacting with individual operating systems and processors. In summary, the paper presents the modeling of non-linear data relationships using multicore learning, the determination of appropriate values for the fuzzification and feasibility parameters, and an algorithm within the Hadoop MapReduce model. The experiments were conducted on one of the commonly used datasets from the UCI Machine Learning Repository, as well as on the implemented CloudSim dataset simulator, and satisfactory results were obtained. According to published studies, the UCI Machine Learning Repository is suitable for regression and clustering purposes in analyzing large-scale datasets, while the CloudSim dataset is specifically designed for simulating cloud computing scenarios, calculating time delays, and task scheduling.
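
        For orientation, here is a single-machine numpy sketch of plain fuzzy c-means, with comments marking which parts would become map and reduce tasks; the kernelized, multicore variant and the feasibility handling of the paper are not reproduced here, and the data are synthetic.

        import numpy as np

        def fuzzy_c_means(X, c=3, m=2.0, iters=30, seed=0):
            """Plain fuzzy c-means; per-point work maps, the weighted sums reduce."""
            rng = np.random.default_rng(seed)
            centers = X[rng.choice(len(X), c, replace=False)]
            for _ in range(iters):
                # map phase: each point computes its distances and membership degrees
                d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
                u = 1.0 / (d ** (2 / (m - 1)))
                u /= u.sum(axis=1, keepdims=True)
                # reduce phase: aggregate weighted sums per cluster to update the centers
                w = u ** m
                centers = (w.T @ X) / w.sum(axis=0)[:, None]
            return u, centers

        rng = np.random.default_rng(1)
        X = np.vstack([rng.normal(loc, 0.3, size=(50, 2)) for loc in (0, 3, 6)])
        memberships, centers = fuzzy_c_means(X, c=3)
        print("centers:", centers.round(2))
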
      • Open Access Article

        23 - The main components of evaluating the credibility of users according to organizational goals in the life cycle of big data
        Sogand Dehghan shahriyar mohammadi rojiar pirmohamadiani
        Social networks have become one of the most important decision-making factors in organizations due to the speed at which events are published and the large amount of information involved; for this reason, they are among the most important factors in the process of assessing the validity of information, since the accuracy, reliability, and value of information are clarified through these networks. The validity of information can be checked using the features of these networks at three levels: user, content, and event. The user level is the most reliable level in this field, because a valid user usually publishes valid content. Despite the importance of this topic and the various studies conducted in the field, important components of the process of evaluating the validity of social network information have received less attention. Hence, this research identifies, collects, and examines the related components through a narrative review of 30 important and original articles in this field. The articles in this field are typically comparable along three dimensions: credibility analysis approaches, content topic detection, and feature selection methods; these dimensions have therefore been investigated and categorized. Finally, an initial framework is presented that focuses on evaluating the credibility of users as information sources. This article is a suitable guide for calculating the credibility of users in the decision-making process.
      • Open Access Article

        24 - Predicting the workload of virtual machines in order to reduce energy consumption in cloud data centers using the combination of deep learning models
        Zeinab Khodaverdian Hossein Sadr Mojdeh Nazari Soleimandarabi Seyed Ahmad Edalatpanah
        Cloud computing service models are growing rapidly, and the inefficient use of resources in cloud data centers leads to high energy consumption and increased costs. Resource allocation plans aiming to reduce energy consumption in cloud data centers have relied on the live migration of virtual machines (VMs) and their consolidation onto a small number of physical machines (PMs). However, the selection of the appropriate VM for migration is an important challenge. To solve this issue, VMs can be classified according to the pattern of user requests into delay-sensitive (interactive) or delay-insensitive classes, and thereafter suitable VMs can be selected for migration. This is made possible by predicting the workload of virtual machines; in fact, workload prediction and its analysis is a pre-migration step. In this paper, in order to classify VMs in the Microsoft Azure cloud service, a hybrid model based on a Convolutional Neural Network (CNN) and a Gated Recurrent Unit (GRU) is proposed. The Microsoft Azure dataset is a labeled dataset in which the workloads of virtual machines carry one of two labels, delay-sensitive (interactive) or delay-insensitive, but the distribution of samples in this dataset is unbalanced: most samples belong to the delay-insensitive class. Therefore, the Random Over-Sampling (ROS) method is used in this paper to overcome this challenge. Based on the empirical results, the proposed model obtained an accuracy of 94.42%, which clearly demonstrates its superiority over other existing models.
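
        A hedged sketch of a CNN-GRU-plus-ROS pipeline with Keras and imbalanced-learn; the input shape, layer sizes, and the random stand-in for the Azure traces are assumptions, not the paper's architecture or data.

        import numpy as np
        import tensorflow as tf
        from imblearn.over_sampling import RandomOverSampler

        TIMESTEPS, CHANNELS = 48, 1               # e.g. 48 utilization readings per VM (assumed)

        rng = np.random.default_rng(0)            # dummy, imbalanced stand-in for the Azure traces
        X = rng.random((1000, TIMESTEPS))
        y = (rng.random(1000) < 0.15).astype(int)         # 1 = delay-sensitive, minority class

        X_bal, y_bal = RandomOverSampler(random_state=0).fit_resample(X, y)
        X_bal = X_bal.reshape(-1, TIMESTEPS, CHANNELS)

        model = tf.keras.Sequential([
            tf.keras.Input(shape=(TIMESTEPS, CHANNELS)),
            tf.keras.layers.Conv1D(32, 3, activation="relu"),
            tf.keras.layers.MaxPooling1D(2),
            tf.keras.layers.GRU(32),              # recurrent layer over the convolved sequence
            tf.keras.layers.Dense(1, activation="sigmoid"),
        ])
        model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
        model.fit(X_bal, y_bal, epochs=2, batch_size=64, validation_split=0.2, verbose=0)
        print("loss, accuracy:", model.evaluate(X_bal, y_bal, verbose=0))
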
      • Open Access Article

        25 - Presenting a web recommender system for predicting users' next pages using the DBSCAN clustering algorithm and the SVM machine learning method
        reza molaee fard Mohammad mosleh
        Recommender systems can predict future user requests and then generate a list of the user's favorite pages. In other words, recommender systems can obtain an accurate profile of users' behavior and predict the page that the user will choose in the next move, which can solve the cold-start problem and improve the quality of search. In this research, a new method is presented for improving recommender systems on the web, which uses the DBSCAN clustering algorithm to cluster the data; this algorithm obtained an efficiency score of 99%. The PageRank algorithm is then used to weight the user's favorite pages, and the SVM method is used to categorize the data and build a hybrid recommender system that generates predictions; finally, this recommender system provides the user with a list of pages that may be of interest. The evaluation of the results indicated that the proposed method achieves a score of 95% for recall and 99% for accuracy, which shows that this recommender system detects the user's intended pages correctly in more than 90% of cases and largely resolves the weaknesses of previous systems.
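
        A toy end-to-end sketch of the three ingredients named above, with invented session features, transitions, and labels: DBSCAN groups similar navigation behavior, PageRank weights the pages, and an SVM classifies using both as features. This only illustrates how the pieces could be wired together, not the paper's pipeline.

        import networkx as nx
        import numpy as np
        from sklearn.cluster import DBSCAN
        from sklearn.svm import SVC

        # toy per-visit features: (seconds on page, position in session)
        X = np.array([[30, 1], [32, 2], [200, 1], [210, 2], [31, 3], [205, 3]], dtype=float)
        page_of_visit = ["home", "news", "sport", "sport", "home", "sport"]

        clusters = DBSCAN(eps=20, min_samples=2).fit_predict(X)       # group similar behavior

        # weight pages by PageRank over the observed page-to-page transitions
        G = nx.DiGraph([("home", "news"), ("news", "sport"), ("home", "sport"), ("sport", "home")])
        pr = nx.pagerank(G)

        # classify visits into next-page classes, adding cluster id and page weight as features
        features = np.column_stack([X, clusters, [pr[p] for p in page_of_visit]])
        next_page = np.array([0, 1, 1, 0, 1, 0])                      # toy target labels
        clf = SVC(kernel="rbf").fit(features, next_page)
        print(clf.predict(features))
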
      • Open Access Article

        26 - Noor Analysis: A Benchmark Dataset for Evaluating Morphological Analysis Engines
        Huda Al-Shohayyeb Behrooz Minaei Mohammad Ebrahim Shenassa Sayyed Ali Hossayni
        The Arabic language has a very rich and complex morphology, and morphological analysis is very useful for processing Arabic, especially traditional Arabic texts such as historical and religious texts, and helps in understanding the meaning of the texts. In a morphological dataset, the variety of labels and the number of data samples help in evaluating morphological methods. The morphological dataset presented in this research includes about 223,690 words from the book Sharia al-Islam, which have been labeled by experts; this dataset is the largest in terms of volume and superior in the variety of labels compared with the other data provided for Arabic morphological analysis. To evaluate the data, we applied the Farasa system to the texts and report the annotation quality through four evaluations of the Farasa system.
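
        The evaluation of analyzer output against the gold annotations can be sketched as a simple tag-level comparison; the tags below are an invented toy example, not drawn from the Noor Analysis data or Farasa's actual tag set.

        from sklearn.metrics import accuracy_score, classification_report

        # aligned gold labels vs. an analyzer's output for a handful of tokens
        gold      = ["NOUN", "VERB", "PREP", "NOUN", "ADJ", "NOUN", "VERB"]
        predicted = ["NOUN", "VERB", "PREP", "ADJ",  "ADJ", "NOUN", "NOUN"]

        print("token-level accuracy:", accuracy_score(gold, predicted))
        print(classification_report(gold, predicted, zero_division=0))
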
      • Open Access Article

        27 - Providing a New Solution in Selecting Suitable Databases for Storing Big Data in the National Information Network
        Mohammad Reza Ahmadi davood maleki ehsan arianyan
        With the development of infrastructure and applications, especially public services in the form of cloud computing, traditional models of database services and their storage methods have faced severe limitations and challenges. The increasing development of data-producing tools, the need to store the results of large-scale processing resulting from various activities in the national information network, and the data produced by the private sector and pervasive social networks have made the process of migrating to new databases with appropriate features inevitable. With the expansion and change in the size and composition of data and the formation of big data, traditional practices and patterns do not meet the new needs; therefore, it is necessary to use data storage systems in new and scalable formats and models. This paper reviews the essential structural dimensions and different functions of traditional databases and modern storage systems, together with technical solutions for migrating from traditional databases to modern ones suitable for big data. The basic considerations for connecting traditional and modern databases for storing and processing data obtained from the national information network are also presented, and the parameters and capabilities of databases in the standard platform context and the Hadoop context are examined. As a practical example, a combination of traditional and modern databases is presented, evaluated, and compared using the balanced scorecard method.
      • Open Access Article

        28 - Identifying and Ranking Factors Affecting the Digital Transformation Strategy in Iran's Road Freight Transportation Industry Focusing on the Internet of Things and Data Analytics
        Mehran Ehteshami Mohammad Hasan Cheraghali Bita Tabrizian Maryam Teimourian sefidehkhan
        This research was carried out with the aim of identifying and ranking the factors affecting the digital transformation strategy in Iran's road freight transportation industry, focusing on the Internet of Things and data analytics. After reviewing the literature, semi-structured interviews were conducted with 20 academic and road freight transportation industry experts in Iran, who were selected using the purposive sampling method and the saturation principle. In the quantitative part, the opinions of 170 employees of this industry, who were selected based on Cochran's formula and the stratified sampling method, were collected using a researcher-made questionnaire. The Delphi technique, literature review, and coding were used to analyze the data in the qualitative part; in the quantitative part, inferential statistics and the SPSS and SmartPLS software packages were used. Finally, 40 indicators were extracted in the form of 8 factors, and the indicators and factors were ranked using factor analysis. The results show that internal factors have the highest rank, and that software infrastructure, hardware infrastructure, economic, external, legal, cultural, and penetration factors occupy the next ranks, respectively. It is therefore suggested that organizations align their human resource empowerment programs with the use of technology and digital tools.
      • Open Access Article

        29 - Anomaly and Intrusion Detection Through Data Mining and Feature Selection using PSO Algorithm
        Fereidoon Rezaei Mohamad Ali Afshar Kazemi Mohammad Ali Keramati
        Today, considering technological development, the increased use of the Internet in business, and the movement of businesses from physical to virtual and Internet-based forms, attacks and anomalies have also changed from physical to virtual: instead of robbing a store or market, individuals intrude into websites and virtual markets through cyberattacks and disrupt them. The detection of attacks and anomalies is one of the new challenges in promoting e-commerce technologies. Detecting network anomalies and destructive activities in e-commerce can be done by analyzing the behavior of network traffic. Data mining techniques are used extensively in intrusion detection systems (IDS) in order to detect anomalies. Reducing the number of feature dimensions plays an important role in intrusion detection, since detecting anomalies in high-dimensional network traffic features is a time-consuming process, and choosing suitable and accurate features influences the speed of the analysis, resulting in improved detection speed. In this article, by using data mining algorithms such as Bayesian, Multilayer Perceptron, CFS, Best First, J48, and PSO, we were able to increase the accuracy of detecting anomalies and attacks to 0.996 and reduce the error rate to 0.004.
      • Open Access Article

        30 - Survey on the Applications of the Graph Theory in the Information Retrieval
        Maryam Piroozmand Amir Hosein Keyhanipour Ali Moeini
        Due to its power in modeling complex relations between entities, graph theory has been widely used in dealing with real-world problems. On the other hand, information retrieval has emerged as one of the major problems in the area of algorithms and computation. As graph-based information retrieval algorithms have proven efficient and effective, this paper aims to provide an analytical review of these algorithms and to propose a categorization of them. Briefly speaking, graph-based information retrieval algorithms can be divided into three major classes: the first category includes algorithms that use a graph representation of the corresponding dataset within the information retrieval process; the second category contains semantic retrieval algorithms that utilize graph theory; and the third category is associated with the application of graph theory to the learning-to-rank problem. The set of reviewed research works is analyzed based on both frequency and publication time. An interesting finding of this review is that the third category is a relatively hot research topic in which only a limited number of recent research works have been conducted.