I need a program that automatically collects (scrapes) information from three websites and writes it, formatted, to a text file.
The first site accepts a mobile number and reports whether it is valid or not. The numbers can either be generated by the program, by incrementing through a range in a loop, or be read from an input file with one mobile number per line. This site has a Captcha, which is there to prevent automated verifications.
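The two ways of supplying numbers described above could be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the class name `NumberSource`, its method names and the zero-padded number format are hypothetical and not part of the brief, and the sample range is a fictional one.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: the names and the number format are assumptions.
public class NumberSource {

    /**
     * Generates `count` consecutive numbers starting at `start`,
     * left-padded with zeros to `width` digits.
     */
    public static List<String> fromRange(long start, int count, int width) {
        List<String> numbers = new ArrayList<>(count);
        for (long n = start; n < start + count; n++) {
            numbers.add(String.format("%0" + width + "d", n));
        }
        return numbers;
    }

    /**
     * Reads one number per line, trimming whitespace and skipping blank lines.
     * The caller is responsible for closing the reader.
     */
    public static List<String> fromReader(Reader source) {
        return new BufferedReader(source).lines()
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .collect(Collectors.toList());
    }
}
```

For example, `NumberSource.fromRange(7700900000L, 3, 11)` yields 07700900000, 07700900001 and 07700900002. For several million numbers it would probably be better to process each number as it is produced rather than hold the whole list in memory.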
The other two sites do not have a Captcha.
All the information needs to be collected, which potentially covers several million numbers. The program will therefore have to be written in Java, or use a technique that avoids downloading the whole page, as the speed and resource requirements on the host would otherwise be too high.
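One possible reading of "a technique where the whole page is not downloaded" is to stop reading the response once enough bytes have arrived to find the result. The sketch below shows only that reading step on a generic `InputStream`. The class name `PartialRead` and the byte limit are assumptions, and whether this saves much in practice depends on how the stream is obtained and on the server.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: reads at most maxBytes from a stream and then stops.
public class PartialRead {

    public static byte[] readPrefix(InputStream in, int maxBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        try {
            while (out.size() < maxBytes) {
                // Never ask for more than the remaining allowance.
                int wanted = Math.min(buffer.length, maxBytes - out.size());
                int read = in.read(buffer, 0, wanted);
                if (read == -1) {
                    break; // End of stream reached before the limit.
                }
                out.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

In a real run the stream would come from the HTTP connection (for example `URLConnection.getInputStream()`), and the connection would be closed once the prefix has been read.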
The work needs to be completed within 1 week.
Payment will be made once all three sites are completed and a sample of the collected numbers has been verified as correct.
The programs need to be as generic as possible, so that they can be used with other sites with minimal modification. Site-specific details should be parameterised rather than hard-coded into the programs.
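As a rough illustration of the kind of parameterisation meant here, the site-specific details could live in a properties file that is loaded at start-up. Everything named below is a hypothetical example and not a requirement of the brief: the class `SiteConfig`, the keys `site.urlTemplate`, `site.successMarker` and `site.delayMillis`, the `{number}` placeholder and the example.com address.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical per-site settings; key names are assumptions for illustration.
public class SiteConfig {
    public final String urlTemplate;   // e.g. https://example.com/check?n={number}
    public final String successMarker; // text expected in the result when a number is valid
    public final int delayMillis;      // pause between requests

    private SiteConfig(String urlTemplate, String successMarker, int delayMillis) {
        this.urlTemplate = urlTemplate;
        this.successMarker = successMarker;
        this.delayMillis = delayMillis;
    }

    /** Loads the settings from java.util.Properties syntax (key=value per line). */
    public static SiteConfig load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new SiteConfig(
                required(props, "site.urlTemplate"),
                required(props, "site.successMarker"),
                Integer.parseInt(props.getProperty("site.delayMillis", "1000")));
    }

    private static String required(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing setting: " + key);
        }
        return value.trim();
    }

    /** Substitutes the number into the configured URL template. */
    public String urlFor(String number) {
        return urlTemplate.replace("{number}", number);
    }
}
```

With a design like this, pointing the program at another site would mean supplying a different properties file rather than changing the code.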
The code will need to be commented and written so it is easily maintainable.