Text detection in images is one of the most important prerequisites for image-based text recognition. Although many studies have addressed text detection, text recognition, and end-to-end models (models that perform detection and recognition jointly) based on deep learning for languages such as English and Chinese, the main obstacle to developing such models for Persian is the lack of a large training data set. In this paper, we design and build the tools required to synthesize a data set of scene text images for Persian, with controllable parameters such as color, size, font, and text rotation. These tools are used to generate a large yet varied data set for training deep learning models. Owing to the design of the synthesis tools and the resulting variety of the texts, trained models do not depend on the synthesis parameters and can generalize. As a sample data set, 7603 scene text images and 39660 cropped word images are synthesized. The advantage of our method over collecting real images is that any number of images can be synthesized without manual annotation. As far as we know, this is the first open-source, large-scale data set of scene text images for the Persian language.
The purpose of this research is to present a framework for the national plan for transparency and information release. The research employs a mixed (qualitative and quantitative) approach with grounded theory as its methodology. In the qualitative part, through an in-depth, exploratory review of upstream laws and documents, models, theories, plans, and white papers of different countries related to transparency and information release, data were analyzed to theoretical saturation through three stages of open, axial, and selective coding. To derive the dimensions, components, and subcomponents of the framework, 129 concepts were extracted from 620 primary codes, which were reduced to 593 secondary codes by removing duplicates. Finally, 24 subcategories were placed under the five main components based on the paradigm model. In the quantitative part, analysis of the questionnaire indicated that its validity across the different dimensions ranged from 0.87 to 0.92 and its reliability coefficient from 0.73 to 0.78. Based on the data analysis, establishing a supranational management institution for transparency and information release, precisely defining exceptions, network governance, demanding transparency, adherence to frameworks, maximum disclosure and support for legitimate disclosure, and establishing a data governance center are among the subcategories emphasized in this framework.