Towards Developing Uniform Lexicon Based Sorting Algorithm for Three Prominent Indo-Aryan Languages

Abstract

Three Indo-Aryan languages, Bengali, Hindi, and Nepali, are explored here at the character level to identify their similarities and dissimilarities. Sharing the same root, Sanskrit, the Indic languages bear common characteristics, which gives computer and language scientists the opportunity to develop common Natural Language Processing (NLP) techniques and algorithms. With this in mind, we compare and analyze these three languages character by character. As an application of the hypothesis, we also developed a uniform sorting algorithm in two steps: first for Bengali and Nepali only, and then extended to Hindi in the second step. Our thorough investigation with more than 30,000 words from each language suggests that the algorithm attains the full accuracy prescribed by the local language authorities of the respective languages while remaining efficient.
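The abstract does not reproduce the paper's actual collation tables, but the core idea of lexicon-based sorting, ordering words by an explicit character table rather than by raw Unicode code points, can be sketched as follows. The alphabet below is a toy Latin stand-in, not the Bengali/Hindi/Nepali order defined by the language authorities:

```python
# Sketch of dictionary-order sorting driven by an explicit character table.
# The table below is a hypothetical toy alphabet, NOT the collation
# prescribed by the paper's language authorities.
COLLATION = {ch: i for i, ch in enumerate("aāiīuūeokgcjtdnpbmyrlvsh")}

def lexicon_key(word):
    """Map a word to a tuple of ranks from the collation table.

    Characters missing from the table sort after all known ones.
    """
    fallback = len(COLLATION)
    return tuple(COLLATION.get(ch, fallback) for ch in word)

def lexicon_sort(words):
    """Sort words by the custom collation order instead of Unicode order."""
    return sorted(words, key=lexicon_key)
```

A uniform algorithm for several languages would then amount to sharing one such key function, with per-language tables agreeing wherever the scripts' orders agree.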

Similar Papers
  • Book Chapter
  • Cited by 2
  • 10.1007/978-3-319-62521-8_20
MainIndex Sorting Algorithm
  • Sep 2, 2017
  • Adeel Ahmed + 5 more

Sorting has been a hot topic in computer science since the field's birth, driven by the pursuit of maximum performance. Much of that performance has come from good, fast sorting algorithms such as heap sort, merge sort, and radix sort, and research into still more efficient algorithms continues. Sorting algorithms commonly use array and linked-list data structures: arrays are efficient when contiguous storage is needed, while linked lists are useful when items must be added and removed, so each structure has its own merits and demerits. Our MainIndex sorting algorithm therefore uses both kinds of data structure: arrays as the MainIndex and linked lists as sorting cells. For each number to be sorted, the algorithm needs only two pieces of information: the length of the number, which selects its MainIndex slot, and the value of the number, which positions it within the sorting cell.
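A loose reconstruction of the scheme described above, from the abstract alone (the names and details are guesses, and plain Python lists with `bisect` stand in for the paper's linked lists), might look like:

```python
import bisect

def mainindex_sort(numbers):
    """Sketch of the MainIndex idea for non-negative integers: an array
    indexed by digit length (the MainIndex), each slot holding an ordered
    sorting cell. Shorter numbers are always smaller, so concatenating the
    cells in order of length yields the sorted result."""
    max_len = max(len(str(n)) for n in numbers)
    cells = [[] for _ in range(max_len + 1)]   # the MainIndex array
    for n in numbers:
        bisect.insort(cells[len(str(n))], n)   # ordered insert into its cell
    result = []
    for cell in cells:
        result.extend(cell)
    return result
```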

  • Research Article
  • Cited by 13
  • 10.5121/ijcsea.2012.2306
Proposal of a Two Way Sorting Algorithm and Performance Comparison with Existing Algorithms
  • Jun 30, 2012
  • International Journal of Computer Science, Engineering and Applications
  • Eshan Kapur

An algorithm is any well-defined procedure or set of instructions that takes some values as input, processes them, and produces values as output. Sorting involves rearranging information into either ascending or descending order and is considered a fundamental operation in computer science, used as an intermediate step in many other operations. A new sorting algorithm, the End-to-End Bi-directional Sorting (EEBS) algorithm, is proposed to address the shortcomings of the current popular sorting algorithms. The goal of this research is to perform an extensive empirical analysis of the newly developed algorithm and present its functionality. The results of the analysis show that EEBS is much more efficient than the other algorithms of O(n²) complexity, such as bubble, selection, and insertion sort.
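The end-to-end bi-directional idea can be illustrated with the classic cocktail-shaker pattern: each round makes a forward pass that fixes the largest remaining element at the right end and a backward pass that fixes the smallest at the left end. This is an illustration of the general technique, not the paper's exact EEBS algorithm:

```python
def bidirectional_pass_sort(a):
    """Sketch of end-to-end bi-directional sorting: alternate forward and
    backward bubble passes, shrinking the unsorted window from both ends.
    Illustrative only; EEBS as published may differ in detail."""
    a = list(a)
    lo, hi = 0, len(a) - 1
    while lo < hi:
        for i in range(lo, hi):          # forward: bubble max to a[hi]
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
        hi -= 1
        for i in range(hi, lo, -1):      # backward: bubble min to a[lo]
            if a[i] < a[i - 1]:
                a[i], a[i - 1] = a[i - 1], a[i]
        lo += 1
    return a
```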

  • Book Chapter
  • Cited by 38
  • 10.1007/978-3-030-63820-7_20
Detecting Alzheimer’s Disease by Exploiting Linguistic Information from Nepali Transcript
  • Jan 1, 2020
  • Surendrabikram Thapa + 5 more

Alzheimer’s disease (AD) is the most common form of neurodegenerative disorder, accounting for 60–80% of all dementia cases. The lack of effective clinical treatment options to completely cure or even slow the progression of the disease makes it even more serious. Treatment options are available for the milder stage of the disease to provide symptomatic short-term relief and improve quality of life. Early diagnosis is key in the treatment and management of AD, as advanced stages cause severe cognitive decline and permanent brain damage. This has prompted researchers to explore innovative ways to detect AD early on. Changes in speech are one of the main signs in AD patients: as the brain deteriorates, the language-processing ability of the patient deteriorates too. Previous research using Natural Language Processing (NLP) techniques for early detection of AD has been done in the English language. However, research using local and low-resource languages like Nepali still lags behind. NLP is an important tool in Artificial Intelligence for deciphering human language and performing various tasks. In this paper, various classifiers are discussed for the early detection of Alzheimer’s in the Nepali language. The proposed study makes a convincing case that the difficulty AD patients have in processing information is reflected in their speech while describing a picture. The study uses the speech decline of AD patients to classify subjects as control subjects or AD patients with various classifiers and NLP techniques. Furthermore, a new dataset consisting of transcripts of AD patients and control normal (CN) subjects in the Nepali language was created for this experiment. In addition, this paper sets a baseline for the early detection of AD using NLP in the Nepali language.

  • Research Article
  • Cited by 1
  • 10.9734/ajrcos/2024/v17i10510
Modular Co-attention Networks in Nepali Visual Question Answering Systems
  • Oct 8, 2024
  • Asian Journal of Research in Computer Science
  • Aashish Gyanwali + 3 more

Visual question answering (VQA) is regarded as a challenging task requiring a perfect blend of computer vision and natural language processing. As no dataset was available to train such a model for the Nepali language, a new dataset was developed during the research by translating the VQAv2 dataset. The resulting dataset of 202,577 images and 886,560 questions, consisting of yes/no, counting, and other questions with primarily one-word answers, was used to train an attention-based VQA model. A Modular Co-attention Network (MCAN) was applied to visual features extracted using the Faster R-CNN framework and question embeddings extracted using the Nepali GloVe model. After co-attending the visual and language features through a few cascaded MCAN layers, the features are fused to train the whole network. During evaluation, an overall accuracy of 69.87% was obtained, with 81.09% accuracy on yes/no questions. The results surpassed the performance of models developed for the Hindi and Bengali languages. Overall, this novel research in the Nepali-language VQA domain paves the way for further advancements.

  • Research Article
  • Cited by 8
  • 10.1186/s42787-019-0004-2
Complexity analysis and performance of double hashing sort algorithm
  • Apr 4, 2019
  • Journal of the Egyptian Mathematical Society
  • Hazem M Bahig

Sorting an array of n elements is one of the leading problems in fields of computer science such as databases, graphs, computational geometry, and bioinformatics. A large number of sorting algorithms have been proposed based on different strategies. Recently, a sequential algorithm called the double hashing sort (DHS) algorithm was shown to exceed the quick sort algorithm in performance by 10–25%. In this paper, we study this technique from the standpoints of complexity analysis and practical performance. We propose a new complexity analysis for the DHS algorithm based on the relation between the size of the input and the domain of the input elements; our results reveal that the previous complexity analysis was not accurate. We also show experimentally that the counting sort algorithm performs significantly better than the DHS algorithm. Our experimental studies are based on six benchmarks; the percentage of improvement was roughly 46% on average over all cases studied.
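The counting sort baseline the authors compare against is a standard technique whose advantage depends on exactly the size-versus-domain relation the paper analyzes. A textbook form (not the paper's code) looks like:

```python
def counting_sort(a, k):
    """Counting sort for integers in the range [0, k).

    Runs in O(n + k) time and O(k) extra space, which is why it wins
    when the element domain k is small relative to the input size n.
    """
    counts = [0] * k
    for x in a:
        counts[x] += 1           # tally each value
    out = []
    for value, c in enumerate(counts):
        out.extend([value] * c)  # emit each value as many times as seen
    return out
```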

  • Research Article
  • 10.30871/jaic.v7i2.6409
Comparative Analysis of OpenMP and MPI Parallel Computing Implementations in Team Sort Algorithm
  • Nov 29, 2023
  • Journal of Applied Informatics and Computing
  • Eko Dwi Nugroho + 4 more

Tim Sort is a sorting algorithm that combines Merge Sort and Binary Insertion Sort. Parallel computing is a processing technique in which a computation is divided into several parts that are carried out simultaneously; applying parallel computing to an algorithm is called parallelization. The purpose of parallelization is to reduce processing time, although not every parallelization achieves this. Our research analyzes the effect of parallel computing on the processing time of the Tim Sort algorithm. The algorithm is parallelized by dividing the data into several parts, sorting each part, and recombining them. The libraries we use are OpenMP and MPI, and tests are carried out using up to 16 processor cores and up to 4,194,304 numbers. By comparing the application of OpenMP and MPI to Tim Sort, we aim to determine which library is better for this case study, so that similar cases can use it as a reference for solving the problem. For tests using 16 processor cores and the given data, the results show that the OpenMP parallelization of Tim Sort is better, with a speedup of up to 8.48 times, compared with 8.4 times for MPI. In addition, speedup and efficiency increase as the amount of data increases; however, the efficiency gained from adding processor cores decreases.
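The divide/sort/recombine scheme described above can be sketched in a few lines, using Python threads as a stand-in for the paper's OpenMP/MPI parallelism (the paper's C-level implementation details are not reproduced here):

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def parallel_chunk_sort(data, workers=4):
    """Sketch of the parallel scheme: split the data into chunks, sort each
    chunk concurrently, then merge the sorted runs. Python's built-in sorted()
    is itself a Timsort, so each worker mirrors the sequential algorithm."""
    if not data:
        return []
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        runs = list(pool.map(sorted, chunks))   # sort chunks in parallel
    return list(heapq.merge(*runs))             # k-way merge of sorted runs
```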

  • Research Article
  • 10.17776/csj.93659
Enhancing and Optimization Sorting Algorithms: An Empirical Study
  • Jan 1, 2015
  • Cumhuriyet Science Journal
  • Mohammad Mehdi Karimizadeh + 3 more

Sorting algorithms are used to sort a list of data. Sorting is also used in other computer operations such as searching, merging, and normalization. Since sorting is considered one of the key operations in computer science, recognizing optimization approaches can advance the field considerably. Optimizing sorting algorithms, even on a small scale, can save a great deal of time. The main discussion of the paper concerns algorithms that present optimized versions of classical sorting algorithms. We studied the classical and optimized versions of several sorting algorithms, including Selection sort, Bubble sort, Insertion sort, Quick sort, and Heap sort, and compared the algorithms in terms of running time when sorting arrays of integers.

  • Conference Article
  • Cited by 20
  • 10.1109/csnt.2015.98
Analysis and Testing of Sorting Algorithms on a Standard Dataset
  • Apr 1, 2015
  • Neetu Faujdar + 1 more

Sorting is a heavily researched area in computer science. One of its fundamental issues is how to order data, for example lexicographically, and practical computing applications frequently require data to be in order, so the performance of many computations depends on sorting algorithms. Many sorting algorithms have been developed to improve performance, and most authors analyze their time complexity and auxiliary space complexity, but the total space complexity of the algorithms has not yet been evaluated. Total space complexity comprises the primary and secondary memory required to store input and output data, the memory required to hold the code, and the working space. The goal of this paper is to test various existing sorting algorithms and evaluate their total space complexity on a standard dataset. The algorithms are evaluated on four cases of the dataset: random with repeated data, reverse sorted with repeated data, sorted with repeated data, and nearly sorted with repeated data, and we measure the performance of each sorting algorithm in each case.

  • Research Article
  • Cited by 18
  • 10.4314/njbas.v29i1.5
Comparative Analysis between Selection Sort and Merge Sort Algorithms
  • Feb 8, 2022
  • Nigerian Journal of Basic and Applied Sciences
  • A.M Rabiu + 3 more

Sorting and merging are two problems that commonly arise in computer science, especially in data-processing tasks, and several algorithms have been developed to solve them; existing merge and sorting algorithms have likewise been improved to give more efficient and accurate results. In this paper, selection sort and merge sort were implemented on an octa-core machine and timed using Java's System.nanoTime method to compare their running times. The results show that merge sort performs far better than selection sort when implementations carefully take advantage of the test machine's multiple processing cores and Java's concurrency utilities. It was concluded that implementing the algorithms on a machine with multiple cores in its Central Processing Unit (CPU) results in a significant improvement in the performance of both algorithms.
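The comparison above amounts to bracketing each sort with a nanosecond clock. A minimal Python analogue of the Java System.nanoTime setup, with textbook versions of the two algorithms (a sketch, not the paper's code), could look like:

```python
import time

def selection_sort(a):
    """Textbook selection sort, O(n^2)."""
    a = list(a)
    for i in range(len(a)):
        m = min(range(i, len(a)), key=a.__getitem__)
        a[i], a[m] = a[m], a[i]
    return a

def merge_sort(a):
    """Textbook top-down merge sort, O(n log n)."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

def time_ns(sort_fn, data):
    """Nanosecond timing bracket, analogous to System.nanoTime in Java."""
    start = time.perf_counter_ns()
    sort_fn(data)
    return time.perf_counter_ns() - start
```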

  • Conference Article
  • Cited by 8
  • 10.1109/iccit57492.2022.10055404
Enhancement of Bubble and Insertion Sort Algorithm Using Block Partitioning
  • Dec 17, 2022
  • Tithi Paul

A sorting algorithm, a fundamental concept in computer science, arranges a list of items in a certain order. The time complexity of the two fundamental and widely used sorting algorithms, Bubble sort and Insertion sort, is O(N²), where N is the total number of items. They perform well when sorting a small number of items but lose efficiency as the input grows because of their quadratic complexity. For this reason they are less frequently employed in practical, real-world applications, despite being widely used as subroutines elsewhere. Numerous enhancement techniques for insertion sort and bubble sort have been proposed in the literature, but none of them combines the two into a hybrid algorithm as ours does. This study modifies bubble and insertion sort into a combined method whose computational complexity is estimated at O(N√N). The technique begins by dividing the input array into blocks, sorting each block using a modified bubble sort, and then merging all of the blocks using a modified insertion sort. The proposed bubble and insertion sort outperforms traditional bubble and insertion sort as well as all other sorting algorithms of O(N²) computational complexity.
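The block-partitioning scheme described above can be sketched as follows: split the input into roughly √N-sized blocks, bubble-sort each block, then feed the blocks' elements into an insertion-style merge that benefits from each block already being sorted. This is illustrative only; the paper's modified variants are not reproduced here:

```python
import math

def bubble_sort_block(block):
    """Plain bubble sort with early exit, used on each small block."""
    b = list(block)
    for end in range(len(b) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if b[i] > b[i + 1]:
                b[i], b[i + 1] = b[i + 1], b[i]
                swapped = True
        if not swapped:
            break
    return b

def block_sort(a):
    """Sketch: ~sqrt(N)-sized blocks, bubble-sorted, combined via an
    insertion-style merge into the growing result."""
    if len(a) <= 1:
        return list(a)
    size = max(1, math.isqrt(len(a)))
    result = []
    for i in range(0, len(a), size):
        for x in bubble_sort_block(a[i:i + size]):
            j = len(result)
            result.append(x)
            while j > 0 and result[j - 1] > x:  # slide x left into place
                result[j] = result[j - 1]
                j -= 1
            result[j] = x
    return result
```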

  • Research Article
  • Cited by 1
  • 10.6092/polito/porto/2506355
General Purpose Computation on Graphics Processing Units Using OpenCL
  • Jan 1, 2013
  • Politecnico di Torino
  • Fiaz Gul Khan

Computational Science has emerged as a third pillar of science alongside theory and experiment, with parallelization for scientific computing offered by shared- and distributed-memory architectures such as supercomputers, grid- and cluster-based systems, and multi-core and multiprocessor systems. In recent years, the use of GPUs (Graphics Processing Units) for general-purpose computing, commonly known as GPGPU, has made them an exciting addition to high-performance computing (HPC) systems with respect to price-to-performance ratio. Current GPUs consist of several hundred computing cores arranged in streaming multiprocessors, so the degree of parallelism is promising. Moreover, the development of new, easy-to-use interfacing tools and programming languages such as OpenCL and CUDA has made GPUs suitable for computation-demanding applications such as micromagnetic simulations. In micromagnetic simulations, studying magnetic behavior at very small time and space scales demands huge computation time; the calculation of the magnetostatic field, with complexity O(N log N) using an FFT algorithm for discrete convolution, is the main contributor to the whole simulation time and is computed many times at each time step. Observing magnetization behavior at sub-nanosecond timescales is crucial to a number of areas, such as magnetic sensors, non-volatile storage devices, and magnetic nanowires. Since micromagnetic codes are generally suitable for parallel programming, as they can easily be divided into independent parts that run in parallel, the current trend is to shift the computationally intensive parts to GPUs. My PhD work mainly focuses on developing a highly parallel magnetostatic field solver for micromagnetic simulators on GPUs.
I use OpenCL for the GPU implementation, since it is an open standard for cross-platform parallel programming of heterogeneous systems. The magnetostatic field calculation is dominated by multidimensional FFT (Fast Fourier Transform) computations, so I developed a specialized OpenCL-based 3D-FFT library for the magnetostatic field calculation, which makes it possible to fully exploit the zero-padded input data without transposition, along with the symmetries inherent in the field calculation; it also provides a common interface for different vendors' GPUs. To fully utilize the GPU's parallel architecture, the code must handle many hardware-specific technicalities, such as coalesced memory access, data-transfer overhead between GPU and CPU, GPU global-memory utilization, arithmetic computation, and batch execution. In a second step, to further increase parallelism and performance, I developed a parallel magnetostatic field solver on multiple GPUs; using multiple GPUs avoids many single-GPU limitations (e.g., on-chip memory resources) by exploiting the combined resources of multiple on-board GPUs. The GPU implementation has shown an impressive speedup over an equivalent OpenMP-based parallel implementation on the CPU, meaning that micromagnetic simulations which required weeks of computation on a CPU can now be performed in hours or even minutes on GPUs. In parallel, I also worked on ordered queue management on GPUs. Ordered queue management is used in many applications, including real-time systems, operating systems, and discrete event simulations; in most cases, the efficiency of an application depends on the sorting algorithm used for its priority queues, and the use of graphics cards for general-purpose computing has prompted a revisiting of sorting algorithms.
In this work I present an analysis of different sorting algorithms with respect to sorting time, sorting rate, and speedup on different GPU and CPU architectures, and provide a new sorting technique on GPUs.

  • Research Article
  • 10.33395/sinkron.v8i2.12153
Theoretical Analysis of Standard Selection Sort Algorithm
  • Apr 4, 2023
  • SinkrOn
  • Rakhmat Purnomo + 1 more

Sorting algorithms play an important role in computer science, and many applications use them. Several sorting algorithms have been proposed by experts, namely bubble sort, exchange sort, insertion sort, heap sort, quick sort, merge sort, and standard selection sort. One well-known sorting algorithm is selection sort; in this journal, the standard selection sort is discussed with a thorough analysis. Sorting is a very important data-structure concept with an important role in memory management, file management, computer science in general, and many real-life applications. Sorting algorithms differ in time complexity, memory use, efficiency, and other factors; each has its benefits and limitations, with a trade-off between execution time and the complexity of the algorithm itself. The method used here is theoretical analysis. Three theoretical analyses are given with deep explanation, each on a six-element array, with the numbers sorted in ascending order. Pseudocode is also given, to help readers understand the algorithm more thoroughly. It is concluded that this theoretical analysis, using hand-traced iterations, explains the algorithm clearly.
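The hand-traced iterations described above can be reproduced programmatically: standard selection sort with a record of the array after each outer-loop pass. The six-element example values below are hypothetical, chosen only to mirror the paper's six-index arrays:

```python
def selection_sort_trace(a):
    """Standard selection sort that also records the array state after
    each outer-loop pass, mirroring an iteration-by-hand trace."""
    a = list(a)
    passes = []
    for i in range(len(a) - 1):
        m = i
        for j in range(i + 1, len(a)):   # find index of smallest remainder
            if a[j] < a[m]:
                m = j
        a[i], a[m] = a[m], a[i]          # place it at position i
        passes.append(list(a))           # snapshot after this pass
    return a, passes
```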

  • Research Article
  • Cited by 5
  • 10.20469/ijaps.5.50004-1
Bidirectional Enhanced Selection Sort Algorithm Technique
  • Mar 8, 2019
  • International Journal of Applied and Physical Sciences
  • Ramcis N Vilchez

A sorting algorithm arranges numerical or character data in statistical order (ascending or descending) and plays a vital role in searching and in the field of data science. Most sorting algorithms with O(n²) time complexity are efficient for small lists of elements but very inefficient for large data. This study presents a remedy for the noted deficiencies of O(n²) sorting algorithms on large data. Among the O(n²) algorithms, selection sort was chosen as the subject of the study for its simplicity: although it is regarded as the most straightforward algorithm, it is also considered the second worst in time complexity for large data. Several enhancements have been proposed to address the inefficiencies of selection sort, but the procedures in all of them can still lead to unnecessary comparisons and iterations that cause poor sorting performance. The modified selection sort algorithm utilizes a bidirectional enhanced selection sort technique to reduce the number of comparisons and iterations that cause sorting delays. The modified algorithm was tested using varied data to validate its performance, and the result was compared with the other O(n²) algorithms. The results show that the modified algorithm has a significant running-time improvement over the other O(n²) algorithms. This study makes a significant contribution to the field of data structures in computer science and to the field of data science.
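One common way to make selection sort bidirectional, which may or may not match the enhancement above, is to fix both the minimum and the maximum of the remaining window on every pass, halving the number of outer iterations:

```python
def bidirectional_selection_sort(a):
    """Sketch of a bidirectional selection sort: each pass places the
    smallest remaining element at the left end and the largest at the
    right end. Illustrative; the paper's technique may differ."""
    a = list(a)
    lo, hi = 0, len(a) - 1
    while lo < hi:
        mn = min(range(lo, hi + 1), key=a.__getitem__)
        a[lo], a[mn] = a[mn], a[lo]
        # recompute the max AFTER the first swap, so a max that sat at
        # position lo is still found at its new location
        mx = max(range(lo, hi + 1), key=a.__getitem__)
        a[hi], a[mx] = a[mx], a[hi]
        lo += 1
        hi -= 1
    return a
```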

  • Conference Article
  • Cited by 1
  • 10.1145/3421766.3421879
Pipelined implementation of serial comparison based iterative sort on FPGA
  • Oct 15, 2020
  • Jingyang Zhou + 2 more

Sorting is a classic problem in computer science, and different application scenarios require different kinds of sorting algorithms. For real-time data-processing applications implemented on FPGAs, a sorting algorithm with higher throughput and greater resource efficiency is needed, and a pipelined implementation is essential for improving overall throughput. In this paper, a serial-comparison-based iterative sort algorithm is proposed and its implementation on an FPGA is elaborated. To take advantage of the parallel characteristics of the FPGA, the pipelined sorting module is realized by concatenating multiple serial-comparison sorting submodules. Compared to other sorting algorithms implemented on FPGAs, the proposed algorithm consumes fewer resources, takes less execution time, and delivers faster overall data throughput. The algorithm and its pipelined implementation have been successfully applied to the median filter of OS-CFAR processing in a millimetre-wave MIMO radar, and their performance has been validated.

  • Conference Article
  • Cited by 53
  • 10.1109/mace.2011.5987184
Experimental study on the five sort algorithms
  • Jul 1, 2011
  • You Yang + 2 more

The sorting algorithm is one of the most basic research fields in computer science; its goal is to make records easier to search, insert, and delete. Through a description of five sorting algorithms (bubble, selection, insertion, merge, and quick), their time and space complexity is summarized, and two categories, O(n²) and O(n log n), can be distinguished. Considering input sequence scale and input randomness, several results were obtained from the experiments: when the number of records is small, insertion sort or selection sort performs well; when the sequence is nearly ordered, insertion sort or bubble sort performs well; and when the number of records is large, quick sort or merge sort performs well. Applications can select an appropriate sorting algorithm according to these rules.
