Data Extraction from Tables on Web Pages Using a Document Object Model Tree (Ekstraksi Data pada Tabel dari Halaman Web Menggunakan Pohon Document Object Model)

Memen Akbar, Dini Nurmalasari, Cici Patmala

doi:10.22146/jnteti.v5i4.273

Abstract

Data on web pages is available in various formats, including tables. As the number of web pages grows, so does the need to extract data from tables. The extracted data can be integrated with other web tables or stored in a database. This study discusses the extraction of data from a table on a web page using a Document Object Model (DOM) tree. The extraction process first transforms the HTML document into a DOM tree. A Depth First Search (DFS) over the tree is then applied, and the data portion of the table is extracted and stored in a CSV file. An extraction engine was developed in Visual Basic. The results show that the engine can automatically extract data from tables with the following characteristics: an unlimited number of rows and columns, any table orientation layout, and merged cells.
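The pipeline described in the abstract (HTML to DOM tree, DFS over the tree, CSV output) can be illustrated with a minimal sketch. This is not the authors' Visual Basic engine: it is written in Python using only the standard library, and the names `Node`, `DomBuilder`, `dfs`, `table_to_rows`, and `extract_first_table_csv` are hypothetical. It assumes reasonably well-formed HTML with explicit closing tags, reads only the first table, and copies the text of a merged cell into every grid slot it spans, which is one possible convention and not necessarily the one used in the paper.

```python
import csv
import io
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "input", "meta", "link", "col", "wbr"}


class Node:
    """One element of a simplified DOM tree (hypothetical helper)."""
    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []  # child Nodes and text strings, in document order


class DomBuilder(HTMLParser):
    """Builds the tree; assumes elements are closed explicitly."""
    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this tag and close it.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        self.current.children.append(data)


def dfs(node, tag):
    """Yield descendants with `tag` in depth-first pre-order.
    Matched nodes are not searched further, so nested tables are skipped."""
    stack = list(reversed(node.children))
    while stack:
        item = stack.pop()
        if isinstance(item, Node):
            if item.tag == tag:
                yield item
            else:
                stack.extend(reversed(item.children))


def text_of(node):
    """Concatenate all text below `node`, with whitespace collapsed."""
    parts, stack = [], [node]
    while stack:
        item = stack.pop()
        if isinstance(item, Node):
            stack.extend(reversed(item.children))
        else:
            parts.append(item)
    return " ".join("".join(parts).split())


def table_to_rows(table):
    """Expand rowspan/colspan so every row has the same number of columns."""
    grid = {}  # (row, col) -> cell text
    for r, tr in enumerate(dfs(table, "tr")):
        c = 0
        for cell in tr.children:
            if not isinstance(cell, Node) or cell.tag not in ("td", "th"):
                continue
            while (r, c) in grid:  # slot already filled by an earlier rowspan
                c += 1
            rowspan = int(cell.attrs.get("rowspan") or 1)
            colspan = int(cell.attrs.get("colspan") or 1)
            value = text_of(cell)
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = value
            c += colspan
    if not grid:
        return []
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


def extract_first_table_csv(html):
    """Return the first table in `html` as CSV text, or "" if there is none."""
    builder = DomBuilder()
    builder.feed(html)
    builder.close()
    table = next(dfs(builder.root, "table"), None)
    if table is None:
        return ""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(table_to_rows(table))
    return out.getvalue()


SAMPLE = """
<html><body><table>
  <tr><th rowspan="2">Name</th><th colspan="2">Score</th></tr>
  <tr><th>Math</th><th>Physics</th></tr>
  <tr><td>Ani</td><td>80</td><td>75</td></tr>
</table></body></html>
"""

if __name__ == "__main__":
    print(extract_first_table_csv(SAMPLE), end="")
```

For the sample table, the merged header cells "Name" (two rows) and "Score" (two columns) are repeated in each slot they cover, so every CSV row has three fields. Writing to a file instead of a string only requires passing an open file object to `csv.writer`.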

Journal: Jurnal Nasional Teknik Elektro dan Teknologi Informasi (JNTETI)
Publication Date: Dec 27, 2016
License type: cc-by-sa

