Blasted Cell Line Names

Jing Wang, Michael D Story, Wenbin Liu, John V Heymach, Keith A Baggerly, Jeffrey N Myers, Uma Giri, Luc Girard, Li Shen, John D Minna, K Kian Ang, Kevin R Coombes, John S Yordy, Lauren A Byers

doi:10.4137/cin.s5613

Abstract

Background: While trying to integrate multiple data sets collected by different researchers, we noticed that the sample names were frequently entered inconsistently. Most of the variations appeared to involve punctuation, white space, or their absence, at the juncture between alphabetic and numeric portions of the cell line name.

Results: Reasoning that the variant names could be described in terms of mutations or deletions of character strings, we implemented a simple version of the Needleman-Wunsch global sequence alignment algorithm and applied it to the cell line names. All correct matches were found by this procedure. Incorrect matches occurred only when a cell line was present in one data set but not in the other. The raw match scores tended to be substantially worse for the incorrect matches.

Conclusions: A simple application of the Needleman-Wunsch global sequence alignment algorithm provides a useful first pass at matching sample names from different data sets.
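The matching idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the scoring values (match = +1, mismatch = -1, gap = -1), the case folding, and the function names `nw_score` and `best_matches` are all assumptions chosen for illustration, and the example cell line names are hypothetical inputs.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score between two names.

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    # Assumption: compare names case-insensitively.
    a, b = a.upper(), b.upper()
    # Row 0 of the dynamic-programming table: align a prefix of b against gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: align a prefix of a against gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


def best_matches(names_a, names_b):
    """Map each name in names_a to its highest-scoring candidate in names_b."""
    return {
        name: max(names_b, key=lambda cand: nw_score(name, cand))
        for name in names_a
    }


if __name__ == "__main__":
    # Hypothetical name lists from two data sets.
    set_a = ["H1299", "HCC-827"]
    set_b = ["NCI-H1299", "HCC827", "A549"]
    for name, match_name in best_matches(set_a, set_b).items():
        print(name, "->", match_name, "score:", nw_score(name, match_name))
```

Because every name is forced to take its best-scoring candidate, a cell line missing from the second data set still receives a match. As the abstract notes, such incorrect matches tend to have much worse raw scores, so in practice the raw score would be inspected or thresholded before accepting a match.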

Journal: Cancer Informatics	Publication Date: Jan 1, 2010
Citations: 1	License type: cc-by-nc

Similar Papers

Application of Needleman-Wunsch Algorithm in Image Comparison
Kavya Duvvuri ... H N Vishwas
25 Nov 2022

Sequence Alignment Using Machine Learning-Based Needleman–Wunsch Algorithm
Amr Ezz El-Din Rashed ... Mervat El-Seddek
IEEE Access | VOL. 9
01 Jan 2020

Parallel Cache Efficient Algorithm and Implementation of Needleman-Wunsch Global Sequence Alignment
Marek Pałkowski ... Krzysztof Siedlecki
01 Jan 2018

A scalable parallel algorithm for global sequence alignment with customizable scoring scheme
Muhammad Umair Sadiq ... Muhammad Murtaza Yousaf
Concurrency and Computation: Practice and Experience | VOL. 35
07 Aug 2023
