158743 scrape HTML table data for DB
N/A
Payment on delivery
Summary
Ruby or Python script for Win32 that converts table data in HTML source files into a
database-consumable file format (XML). Creating a relational structure for the data is also welcome.
Steps
1) Write an HTML file parser (Python or Ruby preferred) that grabs all the relevant table data from a
set of HTML files. Repeat this step across a series of sequentially numbered files.
2) After grabbing all the table data in a file, the program analyzes certain HTML tables. A table
can be populated in several variations, so the code should recognize each table structure
(not that difficult, since there are just a couple of slight variations).
3) After parsing the table, write the relevant element data to an XML text file (or some other format
that can be easily imported into a MySQL database).
4) (Optional) Create a relational structure for the data in the MySQL database.
Speed and efficiency are not a priority. This is basically a one-time data port.
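Steps 1 and 2 above could be sketched roughly as follows, using only Python's standard library. This is a minimal illustration, not a finished solution: the file pattern `page_*.html`, the names `TableExtractor`, `to_records`, and `parse_files`, and the two table layouts (a header row followed by data rows, or two-column "Label: value" rows) are all assumptions, since the posting does not describe the actual files or variations. Nested tables are not handled.

```python
from html.parser import HTMLParser
from pathlib import Path


class TableExtractor(HTMLParser):
    """Collect each <table> as a list of rows; each row is a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []    # finished and in-progress tables
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def _close_cell(self):
        if self._cell is not None:
            # Collapse runs of whitespace inside the cell text.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._close_row()       # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()      # tolerate a missing </td>
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def to_records(table):
    """Normalise two ASSUMED layouts into a list of dicts (one dict per record)."""
    rows = [r for r in table if any(r)]     # drop rows with no text at all
    if not rows:
        return []
    # Assumed variant: every row is a "Label:" / value pair, forming one record.
    if all(len(r) == 2 and r[0].endswith(":") for r in rows):
        return [{r[0].rstrip(":").strip(): r[1] for r in rows}]
    # Assumed default: first row holds column names, the rest hold data.
    header, *body = rows
    return [dict(zip(header, r)) for r in body]


def parse_files(directory, pattern="page_*.html"):
    """Parse every matching file in name order (assumes zero-padded numbering)."""
    records = []
    for path in sorted(Path(directory).glob(pattern)):
        parser = TableExtractor()
        parser.feed(path.read_text(encoding="utf-8", errors="replace"))
        parser.close()
        for table in parser.tables:
            records.extend(to_records(table))
    return records
```

A call such as `parse_files("html_dump")` (a hypothetical directory name) would return one flat list of dictionaries covering all files. The layout check in `to_records` is the part most likely to need adjusting once the real table variations are known.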
Skill Requirements
Ruby or Python (Ruby preferred). Basic MySQL knowledge would help (how to import XML or other
data into a database).
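As one possible shape for the XML output, the sketch below writes records as `<row>` elements containing `<field name="...">` children. This mirrors the layout that `mysqldump --xml` produces and that MySQL's `LOAD XML` statement is documented to read, but that compatibility should be verified against the MySQL version in use. The function name `records_to_xml`, the root element `resultset`, and the sample column names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def records_to_xml(records, path=None):
    """Serialise a list of dicts as <resultset><row><field name=...>...</field></row>...

    Returns the XML as a string; also writes it to `path` when one is given.
    """
    root = ET.Element("resultset")
    for record in records:
        row = ET.SubElement(root, "row")
        for column, value in record.items():
            field = ET.SubElement(row, "field", name=column)
            field.text = "" if value is None else str(value)
    if path is not None:
        ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)
    return ET.tostring(root, encoding="unicode")


# Hypothetical import on the MySQL side (table and file names are placeholders):
#   LOAD XML LOCAL INFILE 'tables.xml' INTO TABLE scraped ROWS IDENTIFIED BY '<row>';
```

ElementTree escapes characters such as `&` and `<` in cell text, so values scraped from HTML do not need manual escaping before being written.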
Project ID: #1904932
About the project
Awarded to:
This can be done easily. I suggest bypassing the XML and entering the data directly from the parser into the database. I could accomplish it faster with PHP, but can also do it with no problem in Ruby or Python.
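The direct-insert approach suggested in this bid might look something like the sketch below. It uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs without a server; with a MySQL driver the connection call and the placeholder style (typically `%s` rather than `?`) would differ. The function name `insert_records` and the table name `scraped` are hypothetical, and table and column names are assumed to come from trusted input because they are interpolated into the SQL text.

```python
import sqlite3


def insert_records(conn, table, records):
    """Create `table` if needed and insert each dict in `records` as one row.

    Column names are taken from the first record; keys missing from later
    records are stored as NULL.
    """
    if not records:
        return 0
    columns = list(records[0])
    col_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_sql})')
    conn.executemany(
        f'INSERT INTO "{table}" ({col_sql}) VALUES ({placeholders})',
        [tuple(r.get(c) for c in columns) for r in records],
    )
    conn.commit()
    return len(records)


# Usage with an in-memory database:
conn = sqlite3.connect(":memory:")
insert_records(conn, "scraped", [{"id": "1", "name": "Widget"},
                                 {"id": "2", "name": "Gadget"}])
rows = conn.execute('SELECT id, name FROM "scraped" ORDER BY id').fetchall()
```

Values are passed as bound parameters, so cell text containing quotes or other special characters is stored as-is.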