Copyright Infringement Detection Research Articles

Clone detection has received much attention in many fields, such as malicious code detection, vulnerability hunting, and code copyright infringement detection. However, cyber criminals may obfuscate code to impede the detection of violations. To date, few studies have investigated the robustness of clone detectors against obfuscation, especially the currently popular deep learning-based ones. Moreover, most of these studies only measure the difference between a code snippet and its obfuscated version. In reality, attackers may modify the original code before obfuscating it. The evaluation should therefore target obfuscated code derived from cloned code, not from the original code. To this end, we conduct a comprehensive study evaluating 3 popular deep learning-based clone detectors and 6 commonly used traditional ones. For the data, we collect 6512 clone pairs of five types from the BigCloneBench dataset and obfuscate one program of each pair using 64 strategies from 6 state-of-the-art commercial obfuscators. We also collect 1424 non-clone pairs to evaluate false positives. In total, we generate a benchmark of 524,148 code pairs (clone and non-clone), which are passed to the clone detectors for evaluation. To automate the evaluation, we develop a uniform evaluation framework that integrates the clone detectors and obfuscators. The results yield interesting findings on how obfuscation affects clone detection performance and how traditional and deep learning-based clone detectors differ. In addition, we conduct manual code reviews to uncover the root causes of these effects and give suggestions to users from different perspectives.
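The evaluation loop described in this abstract (obfuscate one program of each pair, run a detector, count hits on clone pairs and false alarms on non-clone pairs) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: `jaccard_detector` is a toy token-overlap detector, `rename_identifiers` is a toy stand-in for a commercial obfuscation strategy, and `evaluate` is a hypothetical harness. None of these are the detectors, obfuscators, or framework used in the study.

```python
import re
from typing import Callable, Dict, Iterable, Tuple

Pair = Tuple[str, str]


def tokens(code: str) -> set:
    """Split code into a set of identifiers and single non-space symbols."""
    return set(re.findall(r"[A-Za-z_]\w*|\S", code))


def jaccard_detector(a: str, b: str, threshold: float = 0.7) -> bool:
    """Toy detector (assumption): report a clone if token overlap is high."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return True
    return len(ta & tb) / len(ta | tb) >= threshold


def rename_identifiers(code: str) -> str:
    """Toy obfuscation (assumption): rename non-keyword identifiers to v0, v1, ..."""
    keywords = {"int", "return", "for", "if", "else", "while"}
    mapping: Dict[str, str] = {}

    def substitute(match: "re.Match") -> str:
        name = match.group(0)
        if name in keywords:
            return name
        return mapping.setdefault(name, f"v{len(mapping)}")

    return re.sub(r"[A-Za-z_]\w*", substitute, code)


def evaluate(
    detector: Callable[[str, str], bool],
    obfuscate: Callable[[str], str],
    clone_pairs: Iterable[Pair],
    non_clone_pairs: Iterable[Pair],
) -> Dict[str, float]:
    """Obfuscate the second program of every pair, then score the detector."""
    clones, non_clones = list(clone_pairs), list(non_clone_pairs)
    hits = sum(detector(a, obfuscate(b)) for a, b in clones)
    false_alarms = sum(detector(a, obfuscate(b)) for a, b in non_clones)
    return {
        "recall": hits / len(clones) if clones else 0.0,
        "false_positive_rate": false_alarms / len(non_clones) if non_clones else 0.0,
    }


if __name__ == "__main__":
    src = "int add(int a, int b) { return a + b; }"
    other = "while (x) { x = x - 1; }"
    # Baseline (no obfuscation) versus the toy renaming strategy.
    print(evaluate(jaccard_detector, lambda c: c, [(src, src)], [(src, other)]))
    print(evaluate(jaccard_detector, rename_identifiers, [(src, src)], [(src, other)]))
```

Passing the detector and the obfuscation strategy as plain callables is one way to run many detector and strategy combinations through the same scoring code, which is the role the abstract assigns to its uniform evaluation framework.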

Logo classification has gained increasing attention for its various applications, such as copyright infringement detection, product recommendation, and contextual advertising. Compared with other types of object images, real-world logo images show greater variety in logo appearance and more complex backgrounds. Recognizing logos in images is therefore challenging. To support efforts towards scalable logo classification, we have curated Logo-2K+, a new large-scale, publicly available real-world logo dataset with 2,341 categories and 167,140 images. Compared with existing popular logo datasets, such as FlickrLogos-32 and LOGO-Net, Logo-2K+ has more comprehensive coverage of logo categories and a larger quantity of logo images. Moreover, we propose a Discriminative Region Navigation and Augmentation Network (DRNA-Net), which is capable of discovering more informative logo regions and augmenting these image regions for logo classification. DRNA-Net consists of four sub-networks. The navigator sub-network first selects informative logo-relevant regions, guided by the teacher sub-network, which evaluates the confidence that each region belongs to the ground-truth logo class. The data augmentation sub-network then augments the selected regions via both region cropping and region dropping. Finally, the scrutinizer sub-network fuses features from the augmented regions and the whole image for logo classification. Comprehensive experiments on Logo-2K+ and three other existing benchmark datasets demonstrate the effectiveness of the proposed method. Logo-2K+ and the proposed strong baseline DRNA-Net are expected to further the development of scalable logo image recognition. The Logo-2K+ dataset can be found at https://github.com/msn199959/Logo-2k-plus-Dataset.
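The two augmentation operations named in this abstract, region cropping and region dropping, can be sketched as follows. This is a minimal illustration under stated assumptions, not the DRNA-Net implementation: the image is a plain nested list of pixel values, the boxes are assumed to be given already (in the paper they come from the navigator sub-network), and the function names, the box layout, and the zero fill value are hypothetical choices.

```python
from typing import List, Tuple

Image = List[List[float]]
# Assumed box layout: (top, left, bottom, right), with bottom/right exclusive.
Box = Tuple[int, int, int, int]


def crop_region(image: Image, box: Box) -> Image:
    """Region cropping: keep only the pixels inside the box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def drop_region(image: Image, box: Box, fill: float = 0.0) -> Image:
    """Region dropping: return a copy with the pixels inside the box masked."""
    top, left, bottom, right = box
    return [
        [fill if top <= r < bottom and left <= c < right else value
         for c, value in enumerate(row)]
        for r, row in enumerate(image)
    ]


def augment(image: Image, boxes: List[Box]) -> List[Image]:
    """For each selected box, emit one cropped view and one dropped view."""
    views: List[Image] = []
    for box in boxes:
        views.append(crop_region(image, box))
        views.append(drop_region(image, box))
    return views


if __name__ == "__main__":
    # A 4x4 toy "image" whose pixel value encodes its position.
    image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
    for view in augment(image, [(1, 1, 3, 3)]):
        print(view)
```

Cropping gives a zoomed-in view of the selected region, while dropping hides that region so that the rest of the image has to carry the prediction. A real pipeline would typically also resize the cropped views to a fixed input size, which this sketch omits.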

Articles published on Copyright Infringement Detection

Using knowledge graphs for audio retrieval: a case study on copyright infringement detection

User-Generated Content and Copyright Liability: Assessing the Role of User in Copyright Infringement Detection

Are our clone detectors good enough? An empirical study of code effects by obfuscation

LogoDet-3K: A Large-scale Image Dataset for Logo Detection

Logo Detection Using Deep Learning with Pretrained CNN Models

Logo-2K+: A Large-Scale Logo Dataset for Scalable Logo Classification

Commentary on "Transpositions within user-posted YouTube lyric videos: A corpus study"

Audio Music Monitoring: Analyzing Current Techniques for Song Recognition and Identification
