    • List of Articles: Decision Tree

      • Open Access Article

        1 - Classifying Two Class data using Hyper Rectangle Parallel to the Coordinate Axes
        zahra moslehi palhang palhang
        One of the main tasks in machine learning is supervised learning, in which a function is inferred from labeled training data. The goal of a supervised learning algorithm is to learn a good hypothesis that minimizes the sum of the errors. A wide range of supervised algorithms is available, such as decision trees, SVM, and KNN methods. In this paper we focus on decision tree algorithms, which partition the data with axis-aligned hyperplanes. The geometric view of decision tree algorithms is related to separability problems in computational geometry. One well-known separability problem is computing the maximum bichromatic discrepancy. There exists an -time algorithm to compute the maximum bichromatic discrepancy in d dimensions. This problem is closely related to decision trees in machine learning. We implement an algorithm for this problem in 1, 2, 3 and d dimensions, and we also implement the C4.5 algorithm. The experiments show that the results of this algorithm and of C4.5 are comparable.
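To make the maximum bichromatic discrepancy problem concrete, here is a minimal sketch of its one-dimensional case: among all closed intervals on a line, find the one that maximizes the absolute difference between the number of positive and negative points. This is not the authors' implementation. It is a plain Kadane-style scan, and the function name `max_bichromatic_discrepancy_1d` and the `(x, label)` input format are assumptions made for illustration.

```python
from itertools import groupby


def max_bichromatic_discrepancy_1d(points):
    """Return (best, lo, hi) for labeled points on a line.

    points: iterable of (x, label) pairs with label +1 or -1 (assumed format).
    best is the largest |#positive - #negative| over closed intervals [lo, hi].
    """
    # Points that share a coordinate are always taken together by an interval,
    # so merge them into one weight per distinct x.
    ordered = sorted(points)
    grouped = [
        (x, sum(label for _, label in group))
        for x, group in groupby(ordered, key=lambda p: p[0])
    ]

    best = (0, None, None)
    # First look for positive-heavy intervals, then for negative-heavy ones.
    for sign in (+1, -1):
        run, start = 0, None
        for x, weight in grouped:
            weight *= sign
            if run <= 0:
                # A non-positive prefix never helps: restart the interval here.
                run, start = weight, x
            else:
                run += weight
            if run > best[0]:
                best = (run, start, x)
    return best


# Hypothetical toy data: coordinates 1..6 with alternating-ish labels.
sample = [(1, +1), (2, -1), (3, +1), (4, +1), (5, -1), (6, +1)]
print(max_bichromatic_discrepancy_1d(sample))
```

In higher dimensions the interval becomes an axis-parallel hyper-rectangle, and this simple scan no longer suffices on its own.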
      • Open Access Article

        2 - Integrating Data Envelopment Analysis and Decision Tree Models in Order to Evaluate Information Technology-Based Units
        Amir Amini ali alinezhad somaye shafaghizade
        To evaluate the performance and desirability of its units' activities, every organization needs an evaluation system, and this is even more important for financial institutions, including information technology-based companies. Data envelopment analysis (DEA) is a non-parametric method for measuring the effectiveness and efficiency of decision-making units (DMUs). Data mining techniques, in turn, allow DMUs to explore and discover meaningful information that was previously hidden in large databases. This paper presents a general framework that combines DEA and regression trees to evaluate the effectiveness and efficiency of DMUs. The resulting hybrid model is a set of rules that policy makers can use to discover the reasons behind efficient and inefficient DMUs. To examine the factors related to productivity with the proposed method, a sample of 18 branches of Iran Insurance in Tehran was selected as a case study. After modeling, the advanced input-oriented LVM model with weak disposability was solved in DEA using undesirable outputs, and the decision tree technique was then used to extract rules explaining increases and decreases in productivity.
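The rule-extraction half of such a hybrid can be illustrated with a single regression-tree split. The sketch below assumes that efficiency scores have already been computed by some DEA model (that step is not shown), and searches for the one threshold rule that most reduces the squared error of those scores. The function name `best_split`, the feature names, and the toy numbers are hypothetical and are not taken from the paper.

```python
def best_split(rows, scores, feature_names):
    """Find the single rule `feature <= threshold` that best explains the scores.

    rows: list of feature tuples, one per DMU (assumed layout).
    scores: efficiency score per DMU, assumed to come from a DEA model.
    Returns (cost, feature, threshold, mean_left, mean_right), or None
    if no feature has two distinct values.
    """
    def sse(values):
        # Sum of squared errors around the mean, as used by regression trees.
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best = None
    for j, name in enumerate(feature_names):
        distinct = sorted({row[j] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values,
        # so both sides of every split are non-empty.
        for lo, hi in zip(distinct, distinct[1:]):
            threshold = (lo + hi) / 2
            left = [s for row, s in zip(rows, scores) if row[j] <= threshold]
            right = [s for row, s in zip(rows, scores) if row[j] > threshold]
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, name, threshold,
                        sum(left) / len(left), sum(right) / len(right))
    return best


# Hypothetical toy data: (staff, it_budget) per unit, with made-up scores.
toy_rows = [(10, 2), (12, 3), (30, 2), (35, 3)]
toy_scores = [0.9, 1.0, 0.5, 0.6]
rule = best_split(toy_rows, toy_scores, ["staff", "it_budget"])
print(f"if {rule[1]} <= {rule[2]}: mean score {rule[3]:.2f}, else {rule[4]:.2f}")
```

A full regression tree applies this search recursively to each side of the split, which is what turns the scores into a readable set of nested rules.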
      • Open Access Article

        3 - Integration of data envelopment analysis model and decision tree in order to evaluate units based on information technology
        Amir Amini علی رضا علی نژاد سمیه شفقی زاده
        Every organization needs an evaluation system to measure the performance and usefulness of its units, and this issue is more important for financial institutions, including companies based on information technology. Data envelopment analysis is a non-parametric method for measuring the efficiency and productivity of decision-making units (DMUs). On the other hand, data mining techniques allow DMUs to explore and discover meaningful information that was previously hidden in large databases. This paper proposes a general framework combining data envelopment analysis with regression trees to evaluate the efficiency and productivity of DMUs. The result of the hybrid model is a set of rules that policy makers can use to discover the reasons for efficient and inefficient DMUs. As a case study for investigating the factors related to productivity with the proposed method, a sample of 18 branches of Iran Insurance in Tehran was selected. After modeling, the advanced input-oriented LVM model with weak disposability in data envelopment analysis was computed with undesirable outputs, and the decision tree technique was used to extract rules that explain productivity increases and decreases.
      • Open Access Article

        4 - Classification of two-level data with hyperrectangles parallel to the coordinate axes
        zahra moslehi palhang palhang
        One of the learning methods in machine learning and pattern recognition is supervised learning. In supervised learning for two-class problems, the labels of the available training data fall into positive and negative classes. The goal of the supervised learning algorithm is to compute a hypothesis that separates positive and negative data with the least error. In this article, among all supervised learning algorithms, we focus on the performance of decision trees. The geometric view of the decision tree brings us closer to the concept of separability in computational geometry. Among the available separability algorithms related to the decision tree, we consider the problem of computing the rectangle with the maximum difference between the two colors, and we implement the algorithm in one, two, three and m dimensions, where m is the number of data features. The implementation results show that this algorithm is competitive with the well-known C4.5 algorithm.
      • Open Access Article

        5 - Automatic Lung Diseases Identification using Discrete Cosine Transform-based Features in Radiography Images
        Shamim Yousefi Samad Najjar-Ghabel
        Using raw radiography results to identify lung disease does not give acceptable performance. Machine learning can help identify diseases more accurately. Extensive studies have been performed on classical and deep learning-based disease identification, but these methods either lack acceptable accuracy and efficiency or require large amounts of training data. In this paper, a new method is presented for automatic interstitial lung disease identification on radiography images to address these challenges. In the first step, patient information is removed from the images, and the remaining pixels are standardized for more precise processing. In the second step, the reliability of the proposed method is improved by the Radon transform, extra data is removed using the Top-hat filter, and the detection rate is increased by the Discrete Wavelet Transform and the Discrete Cosine Transform. Then, the number of final features is reduced with Locality Sensitive Discriminant Analysis. In the third step, the processed images are divided into training and test sets, and different models are created using the training data. Finally, the best model is selected using the test data. Simulation results on the NIH dataset show that the decision tree provides the most accurate model, improving the harmonic mean of sensitivity and accuracy by up to 1.09 times compared to similar approaches.
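As a rough illustration of what a Discrete Cosine Transform-based feature looks like, the sketch below computes an unnormalized DCT-II of a one-dimensional signal and keeps only the first few coefficients. It is a simplification under stated assumptions: the paper works on two-dimensional radiography images and combines several transforms, whereas this example treats a single row of pixel values, and the names `dct2` and `dct_features` are hypothetical.

```python
import math


def dct2(signal):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
        for k in range(n)
    ]


def dct_features(signal, keep):
    """Keep the first `keep` (lowest-frequency) coefficients as a feature vector."""
    return dct2(signal)[:keep]


# Hypothetical row of pixel intensities; a flat row has energy only at k = 0.
flat_row = [1.0, 1.0, 1.0, 1.0]
print(dct_features(flat_row, 2))
```

Low-frequency coefficients summarize the coarse structure of the signal, which is why truncating the transform is a common way to obtain a compact feature vector before classification.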
      • Open Access Article

        6 - Design and implementation of a survival model for patients with melanoma based on data mining algorithms
        farinaz sanaei Seyed Abdollah  Amin Mousavi Abbas Toloie Eshlaghy ali rajabzadeh ghotri
        Background/Purpose: Among the most commonly diagnosed cancers, melanoma is the second leading cause of cancer-related death. A growing number of people are becoming victims of melanoma. Melanoma is also the most malignant and rare form of skin cancer. Advanced cases of the disease may cause death due to the spread of the disease to internal organs. The National Cancer Institute reported that approximately 99,780 people were diagnosed with melanoma in 2022, and approximately 7,650 died. Therefore, this study aims to develop an optimization algorithm for predicting melanoma patients' survival. Methodology: This applied research was a descriptive-analytical and retrospective study. The study population included patients with melanoma identified from the National Cancer Research Center at Shahid Beheshti University between 2008 and 2013, with a follow-up period of five years. An optimization model was selected for melanoma survival prognosis based on the evaluation metrics of data mining algorithms. Findings: A neural network algorithm, a Naïve Bayes network, a Bayesian network, a combination of decision tree and Naïve Bayes network, logistic regression, J48, and ID3 were selected as the models applied to the national database. Statistically, the studied neural network outperformed the other selected algorithms in all evaluation metrics. Conclusion: The results of the present study showed that the neural network, with a value of 0.97, has optimal performance in terms of reliability. The predictive model of melanoma survival therefore showed better performance in terms of both discrimination power and reliability, and this algorithm was proposed as a melanoma survival prediction model.