I need a program to help crawl and download images from various websites. Here's the rundown on what I need the PHP program to do.
The program will run from the command line via cron. It will be hosted on a standard LAMP web server (Linux, Apache, MySQL, PHP).
1. IMPORTANT: Crawl sites and download ONLY pictures that contain at least one face. Use OpenCV or something similar to determine whether a face is present.
2. IMPORTANT: Program must download the same picture only ONCE per site. Program will be run multiple times!
3. Create separate directories for each site.
4. Insert a record into a MySQL database table for each image downloaded. Fields needed include:
1) filename
2) website URL where the image was downloaded
3) timestamp of when image was downloaded
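As a rough sketch of how requirements 1, 2 and 4 could fit together: the code below stores one row per saved image and uses a hash of the image bytes to skip anything already saved for that site. The table name `images`, its column names, and the `$hasFace` callback are all assumptions made for illustration. The callback stands in for whatever face detector ends up being used (for example a wrapper around an OpenCV command-line tool), since no specific PHP binding is assumed here.

```php
<?php
// Sketch only: the table layout and the $hasFace callback are assumptions.

/** Create the tracking table if it does not exist yet. */
function createImagesTable(PDO $db): void
{
    $db->exec(
        'CREATE TABLE IF NOT EXISTS images (
            site          VARCHAR(64)   NOT NULL,
            content_hash  CHAR(40)      NOT NULL,
            filename      VARCHAR(255)  NOT NULL,
            source_url    VARCHAR(2048) NOT NULL,
            downloaded_at DATETIME      NOT NULL,
            PRIMARY KEY (site, content_hash)
        )'
    );
}

/**
 * Save image bytes under $siteDir and record them, unless the same bytes
 * were already saved for this site or no face is detected.
 * Returns true only when a new file was written and recorded.
 */
function saveImageIfNew(
    PDO $db,
    string $site,
    string $sourceUrl,
    string $bytes,
    string $siteDir,
    callable $hasFace
): bool {
    $hash = sha1($bytes);

    // Requirement 2: skip anything already stored for this site.
    $check = $db->prepare('SELECT 1 FROM images WHERE site = ? AND content_hash = ?');
    $check->execute([$site, $hash]);
    if ($check->fetchColumn() !== false) {
        return false;
    }

    // Requirement 1: keep only pictures with at least one face.
    if (!$hasFace($bytes)) {
        return false;
    }

    // Take the extension from the URL path, falling back to "jpg".
    $path = (string) parse_url($sourceUrl, PHP_URL_PATH);
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    if (!in_array($ext, ['jpg', 'jpeg', 'png', 'gif'], true)) {
        $ext = 'jpg';
    }
    $filename = $hash . '.' . $ext;

    if (file_put_contents($siteDir . '/' . $filename, $bytes) === false) {
        throw new RuntimeException("Could not write $filename");
    }

    // Requirement 4: filename, source URL, and download timestamp.
    $insert = $db->prepare(
        'INSERT INTO images (site, content_hash, filename, source_url, downloaded_at)
         VALUES (?, ?, ?, ?, ?)'
    );
    $insert->execute([$site, $hash, $filename, $sourceUrl, date('Y-m-d H:i:s')]);

    return true;
}
```

Hashing the content rather than the URL means the same picture served from two different URLs on one site is still stored once. Images rejected by the face check are not recorded in this sketch, so they would be fetched again on the next run; a separate "seen" table could avoid that.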
Here are the ways it needs to be able to grab images:
1) URL LIST1 - Have an array (or database table) of website URLs that the program needs to crawl, downloading all images from each site. This needs to crawl public image sites like Imgur, photobucket, tinypic, flickr, tumblr, [login to view URL], yfrog, twitpic, [login to view URL], [login to view URL], [login to view URL], pixhost.org. I would like it to crawl as many of these as possible.
So the program would create a directory structure like the following:
./all_images/tinypic
./all_images/yfrog
./all_images/photobucket
etc
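One possible way to derive those directory names from a site URL is sketched below. The function names `siteName` and `ensureSiteDirectory` are made up for this example, and the host parsing is deliberately naive: it takes the label just before the last dot, so it would give the wrong name for hosts under multi-part suffixes such as `.co.uk`.

```php
<?php
// Sketch only: function names are hypothetical and host parsing is naive.

/** Turn a URL into a short directory-safe site name, e.g. "tinypic". */
function siteName(string $url): string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        throw new InvalidArgumentException("Cannot determine host from URL: $url");
    }

    // "i.imgur.com" -> ["i", "imgur", "com"] -> "imgur"
    $parts = explode('.', strtolower($host));
    $name = count($parts) >= 2 ? $parts[count($parts) - 2] : $parts[0];

    // Keep only characters that are safe in a directory name.
    $name = preg_replace('/[^a-z0-9_-]/', '', $name);
    if ($name === '' || $name === null) {
        throw new InvalidArgumentException("Cannot derive a site name from: $url");
    }

    return $name;
}

/** Create ./all_images/<site> if needed and return its path. */
function ensureSiteDirectory(string $url, string $baseDir = './all_images'): string
{
    $dir = rtrim($baseDir, '/') . '/' . siteName($url);

    // The second is_dir() check covers a parallel run creating it first.
    if (!is_dir($dir) && !mkdir($dir, 0755, true) && !is_dir($dir)) {
        throw new RuntimeException("Could not create directory: $dir");
    }

    return $dir;
}
```

Calling `ensureSiteDirectory()` on every run is harmless because it only creates the directory when it is missing.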
LIST1 sample
$LIST1=array('[login to view URL]','[login to view URL]','[login to view URL]','[login to view URL]','[login to view URL]');
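For each LIST1 page that is fetched, pulling out the image URLs could look roughly like the sketch below. It uses PHP's built-in `DOMDocument`; the function name `extractImageUrls` is invented for this example. It only reads `<img src>` attributes and does only simple relative-URL handling (no `../` normalization), so sites that load images through JavaScript or an API would need something more.

```php
<?php
// Sketch only: reads <img src="..."> tags and resolves simple relative URLs.

/** Return the unique absolute image URLs found in $html. */
function extractImageUrls(string $html, string $pageUrl): array
{
    if (trim($html) === '') {
        return [];
    }

    // Real-world HTML is rarely valid, so silence libxml parse warnings.
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $base = parse_url($pageUrl);
    if (!is_array($base) || !isset($base['scheme'], $base['host'])) {
        throw new InvalidArgumentException("Page URL must be absolute: $pageUrl");
    }
    $origin = $base['scheme'] . '://' . $base['host']
        . (isset($base['port']) ? ':' . $base['port'] : '');

    // Directory part of the page path: "/gallery/index.html" -> "/gallery".
    $path = $base['path'] ?? '/';
    $dir = substr($path, 0, (int) strrpos($path, '/'));

    $urls = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = trim($img->getAttribute('src'));
        if ($src === '' || strpos($src, 'data:') === 0) {
            continue; // skip empty and inline data URIs
        }

        if (preg_match('#^https?://#i', $src)) {
            $url = $src;                          // already absolute
        } elseif (strpos($src, '//') === 0) {
            $url = $base['scheme'] . ':' . $src;  // protocol-relative
        } elseif ($src[0] === '/') {
            $url = $origin . $src;                // root-relative
        } else {
            $url = $origin . $dir . '/' . $src;   // relative to the page
        }

        $urls[$url] = true; // array keys de-duplicate within the page
    }

    return array_keys($urls);
}
```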
Sites that require logins:
2) URL LIST2 - This needs to be able to crawl forums that require a username and password. It needs to follow forum threads that span multiple pages: if a thread has 20 pages, it needs to crawl all 20 pages and download the images. Example forums include phpBB, bbPress, and vBulletin.
The program needs to be able to handle usernames and passwords for specific forums. Some forums don't let you view images unless you are logged in. The program needs to be able to log in and then scan.
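One way to walk every page of a thread is to keep following the "next page" link until there is none. The sketch below assumes the forum marks that link with `rel="next"` and that its `href` is an absolute URL; both are assumptions that may not hold for a given phpBB, bbPress, or vBulletin theme, so a real crawler would likely need per-forum rules and relative-URL resolution. The function names and the `$fetch` callback (which should return the HTML for a URL using the logged-in session) are hypothetical.

```php
<?php
// Sketch only: assumes pagination links carry rel="next" with absolute hrefs.

/** Return the href of the first rel="next" link in $html, or null. */
function findNextPageUrl(string $html): ?string
{
    if (trim($html) === '') {
        return null;
    }

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Match <a> or <link> elements whose rel attribute contains the token "next".
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query(
        '//*[(self::a or self::link) and @href and '
        . 'contains(concat(" ", normalize-space(@rel), " "), " next ")]'
    );
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    $href = trim($nodes->item(0)->getAttribute('href'));
    return $href === '' ? null : $href;
}

/**
 * Fetch a thread page by page, following rel="next" links.
 * Returns an array of page URL => HTML, in crawl order.
 */
function crawlThreadPages(string $startUrl, callable $fetch, int $maxPages = 200): array
{
    $pages = [];
    $url = $startUrl;

    // Stop when there is no next link, a page repeats, or the cap is hit.
    while ($url !== null && !isset($pages[$url]) && count($pages) < $maxPages) {
        $html = $fetch($url);
        $pages[$url] = $html;
        $url = findNextPageUrl($html);
    }

    return $pages;
}
```

The page cap and the repeated-URL check are there so a forum with circular pagination links cannot keep a cron run going forever.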
LIST2 sample as an array (could also be a database table instead)
$array2[0]['url']="[login to view URL]";
$array2[0]['username']="myusername";
$array2[0]['password']="mypassword";
$array2[1]['url']="[login to view URL]";
$array2[1]['username']="myusername2";
$array2[1]['password']="mypassword2";
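Logging in with an entry from that array could be done with PHP's cURL extension and a cookie file, roughly as sketched below. The login URL, the form field names (`username`, `password`), and any extra hidden fields are assumptions here: they differ between phpBB, bbPress, and vBulletin, and some forums also require a token scraped from the login page first. The function names are hypothetical.

```php
<?php
// Sketch only: login URL and form field names vary per forum software.

/** Build the cURL options for posting a login form. */
function buildLoginOptions(
    array $forum,
    string $loginUrl,
    string $cookieFile,
    array $extraFields = []
): array {
    $fields = array_merge($extraFields, [
        'username' => $forum['username'],
        'password' => $forum['password'],
    ]);

    return [
        CURLOPT_URL            => $loginUrl,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields),
        CURLOPT_COOKIEJAR      => $cookieFile, // write session cookies here
        CURLOPT_COOKIEFILE     => $cookieFile, // and send them back later
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_TIMEOUT        => 30,
    ];
}

/** Post the login form; returns the response body, or throws on failure. */
function loginToForum(array $forum, string $loginUrl, string $cookieFile, array $extraFields = []): string
{
    $ch = curl_init();
    curl_setopt_array($ch, buildLoginOptions($forum, $loginUrl, $cookieFile, $extraFields));
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Login request failed: ' . curl_error($ch));
    }
    return $body;
}

/** Fetch a page while reusing the cookies saved by loginToForum(). */
function fetchWithSession(string $url, string $cookieFile): string
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_COOKIEJAR      => $cookieFile,
        CURLOPT_COOKIEFILE     => $cookieFile,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_TIMEOUT        => 30,
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Fetch failed: ' . curl_error($ch));
    }
    return $body;
}
```

Using one cookie file per forum entry keeps the sessions separate. A successful HTTP response does not prove the login worked, so checking the returned page for a logged-in marker (such as the username or a logout link) would still be needed.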
3) The program needs to be able to crawl images posted publicly on Facebook.
Thanks!
Hi! The project you described can be either straightforward or quite challenging depending on your detailed requirements. It won't be a problem for me either way.