Web scrap project
$250-750 USD
ชำระเงินเมื่อจัดส่ง
script language: PHP
front end: html/javascript
database: mysql
NO FRAME WORKS
Table Structure:
ALL TABLES:
id (auto insert)
createddate (auto insert)
modifieddate (auto update)
Table A (entities)
name varchar
state (varchar) 2 letter us state code
type varchar(city, county, school district, university, college)
url varchar
table B (url)
datasource (url or query where the data came from)
url varchar
googleposition
maxlayers (defaults 2)
statusname (values would include "found match", "no match")
table C (curl data)
url varchar
retrievedhtml largetext
match varchar
statusname (values would include "undefined" (default), "good", "bad", "review")
The 1st script will:
1 parse the LEA_NAME column for unique values for "school district" names from here - [login to view URL], get the state, school district name, & url. 25,000 results
2 parse the "county names" from here - [login to view URL], get state name, convert it to 2 letter code, 3,098 records
3 parse the "city names" from here - [login to view URL] grab city name, usps (state). 29,000 records.
4 parse the "US college/university" from here - [login to view URL] - & grab college/university name, & url. 2,073 records.
5 populate table A with the name, type, state code (2 digit) while skipping duplicate. convert the state name to 2 digit.
The 1st script will be a one time script, run from linux cli.
The 2nd script will:
1 Loop through table A, & attempt to find the url that matches with a google search, if one was not present from the datasource. The logic must skip certain false positives such as a domain with the word "weather" or "census" or "zillow" or "google" in it or url with ".jpg" or ".asp"
2 populate the record in table B, with :
datasource = the url of the data source above
url = url (skip duplicate)
statusname = null
googleposition = 1-20 (first page of google results only)
The 2nd script will return 35,000 - 200,000 results.
The 2nd script will run periodically from linux cli, on a crontab, & will be rerun, in the future, when additional excemptions are added.
The 2nd script should be multi threaded, & should cap out above a 100mb/second connection
The 3rd php script will:
1 Loop through table B, use curl to retrieve the web page
2 Loop through each of the child pages, for the value in the maxlayers column
3 Look for a particular pattern of text, including a case insensitive search for "bids" "request for proposal" "rfp" "rfq" "request for bids" "proposals"
4 Compare the curl returned html against the keywords
if there is match - insert a record into table C with the url (skip non unique url), the retrieved html, what keyword caused a match, & update table B statusname to "found match"
if there is no match, updated table B maxlayers count upward 1, & updated the statusname to "no match"
Each record from table B may have multiple records in table C
The 3rd script will run be run periodically from the linux cli
The 3rd script should be multi threaded, & should cap out above a 100mb/second connection
The 4th script will be ANDROID PHONE FRIENDLY:
1 Define an sql query which should return the top 10 selection from table C, sorted by modifieddate ASC, WHERE type != "bad" & != "good"
2 Provide a simple html table view front end to review each of the url, which should have columns for all values from table B.
3 An additional column will show a update status button, which when pressed shows the values (as buttons) from table C, which when pressed, update record in table C
The 4th script view is intended for a quality check employee to review all results from, & log if url matches our ultimate criteria or not.
หมายเลขโปรเจค: #18227988
เกี่ยวกับโปรเจกต์
freelancer 15 คน กำลังเสนอราคาในงานนี้ โดยมีราคาเฉลี่ยอยู่ที่ $550
Hello, Dear How are you? I have check your project description and am ready for discussing with you about project for now. I have experienced in PHHP and WebScraping , MySQL. I will work very hard and best for y เพิ่มเติม
Hello Sir, I am the expert freelancer here. I am on the 6th position through out the world to deliver the quality job. I have deliver here more than 400 + projects with 100% client satisfaction. I have more than 5 เพิ่มเติม
Hey? How are you? I have reviewed "Web scrap project" .I have good skills for these (MySQL, PHP, Web Scraping). I have been working for 7 yrs in this scope. While we contract and work in our jobs, I will get paid o เพิ่มเติม
How are you today? I am a super expert in this area. If you contact me, I can show you my past work too. Please contact me. Thank you.
I can scrape any data you require. Please contact me and we can discuss getting started, I'm eager to begin working for you.
"Hi, Hope you are doing well! Thanks for sharing your project requirement with us. It will be our great pleasure to work on your project. I have checked your requirement, yes we can do it, because we already work on si เพิ่มเติม