Screen scraping

Completed, posted Oct 24, 2015, paid on delivery

The task is to create a Windows console application in C# that extracts data from a website (screen scraping) and writes it to two separate semicolon-separated files. It must be possible to run the program multiple times, and each run should append its data to the existing output files.
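The append behaviour described above could be sketched roughly as follows. The class name `SemicolonWriter`, its method names, and the CSV-style quoting rule are assumptions made for illustration; the posting does not say how fields containing a semicolon should be handled.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper: none of these names come from the posting.
public static class SemicolonWriter
{
    // Assumed convention: quote a field only when it contains a semicolon,
    // a double quote, or a line break, and double any embedded quotes.
    public static string Escape(string field)
    {
        if (field == null) return "";
        bool needsQuotes = field.IndexOfAny(new[] { ';', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Appends one record as a single line. File.AppendAllText creates the
    // file on the first run and adds to the end of it on later runs.
    public static void AppendRow(string path, IEnumerable<string> fields)
    {
        string line = string.Join(";", fields.Select(Escape));
        File.AppendAllText(path, line + Environment.NewLine);
    }
}
```

A call such as `SemicolonWriter.AppendRow("output1.csv", new[] { "1321", "some title" })` would add one line to a file; the file name here is only a placeholder.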

Every day the website posts a lot of new information, and the posts are always numbered sequentially by id. So if the current post is something like [url removed, login to view], the next post will have the same URL, only with 1322 at the end instead of 1321.

The intention is to run the program as a batch job that fetches all data posted since the last time the program ran. The program therefore needs a configuration parameter with a field such as LastStoredId.

So the program should start by looking at that id; say that it is 100. It should then fetch all data from the site like this:

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

When the program then tries to fetch id 106, the site may return a handled error saying something like “Oops, the page you searched for was not found. Please go back and try again”. When that happens, you know that there is no more information to fetch. You then store the last retrieved id, in this case 105, in the configuration parameter file as LastStoredId. When the program runs again the next day or the next week, there may be more data to fetch, so you store data for id 106, 107… and so on until you once again can’t find any more info.
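The stop-on-error loop described above might look like the sketch below. The names `SequentialFetcher`, `FetchNew`, and `NotFoundMarker` are hypothetical, the marker text is only a guess based on the "something like" wording of the error message, and starting at LastStoredId + 1 is an assumption because the example leaves open whether the stored id is fetched again. The download and storage steps are passed in as delegates so the loop itself stays independent of the real site.

```csharp
using System;

// Hypothetical helper: none of these names come from the posting.
public static class SequentialFetcher
{
    // Assumed marker: the real error text on the site may differ.
    public const string NotFoundMarker = "the page you searched for was not found";

    public static bool IsNotFound(string html)
    {
        return html == null
            || html.IndexOf(NotFoundMarker, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // fetchPage downloads the page for one id; handlePage parses and stores it.
    // Returns the last id that was fetched successfully, which is the value
    // to write back as LastStoredId.
    public static int FetchNew(int lastStoredId,
                               Func<int, string> fetchPage,
                               Action<int, string> handlePage)
    {
        int lastGood = lastStoredId;
        while (true)
        {
            int id = lastGood + 1;          // assumption: resume after the stored id
            string html = fetchPage(id);
            if (IsNotFound(html)) break;    // no more posts for now
            handlePage(id, html);
            lastGood = id;
        }
        return lastGood;
    }
}
```

If nothing new has been posted, the loop stops on its first request and returns the id it was given, so LastStoredId stays unchanged.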

The site to collect data from also uses forms authentication. The program should be able to sign in with a username and password, both configurable in the same configuration parameters file that stores the LastStoredId field.
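One possible way to sign in is to post the login form with an `HttpClient` that keeps cookies, as sketched below. The class name `SiteSession`, the form field names `username` and `password`, and the login URL passed by the caller are all assumptions; the real names would have to be read from the site's login page. Some login forms also require hidden fields copied from the page, which this sketch does not handle.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper: none of these names come from the posting.
public static class SiteSession
{
    // Creates a client that remembers cookies, so the authentication cookie
    // set at login is sent automatically with later page requests.
    public static HttpClient CreateClient()
    {
        var handler = new HttpClientHandler
        {
            CookieContainer = new CookieContainer(),
            UseCookies = true
        };
        return new HttpClient(handler);
    }

    // Builds the login form body. The field names are assumed, not known.
    public static Dictionary<string, string> BuildLoginForm(string username, string password)
    {
        return new Dictionary<string, string>
        {
            { "username", username },
            { "password", password }
        };
    }

    // Posts the form to the login URL supplied by the caller. A success
    // status code alone does not prove the credentials were accepted.
    public static async Task<bool> SignInAsync(HttpClient client, Uri loginUrl,
                                               string username, string password)
    {
        using (var content = new FormUrlEncodedContent(BuildLoginForm(username, password)))
        using (HttpResponseMessage response = await client.PostAsync(loginUrl, content))
        {
            return response.IsSuccessStatusCode;
        }
    }
}
```

The same `HttpClient` instance would then be reused for every page request in the run, since the cookies live in its handler.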

So the suggestion is a configuration file with only three lines: line one holds a single number, the value of LastStoredId; line two contains the username; and line three contains the password.
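Reading and rewriting that three-line file could be sketched as below. The class name `ScraperConfig` and its members are illustrative only, and leaving the password line untrimmed is an assumption.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical helper: none of these names come from the posting.
public sealed class ScraperConfig
{
    public int LastStoredId { get; set; }
    public string Username { get; set; }
    public string Password { get; set; }

    // Line 1: LastStoredId, line 2: username, line 3: password.
    public static ScraperConfig Load(string path)
    {
        string[] lines = File.ReadAllLines(path);
        if (lines.Length < 3)
            throw new InvalidDataException(
                "Expected three lines: LastStoredId, username, password.");

        return new ScraperConfig
        {
            LastStoredId = int.Parse(lines[0].Trim(), CultureInfo.InvariantCulture),
            Username = lines[1].Trim(),
            Password = lines[2]   // assumption: kept exactly as written
        };
    }

    // Rewrites the whole file so the new LastStoredId is kept for the next run.
    public void Save(string path)
    {
        File.WriteAllLines(path, new[]
        {
            LastStoredId.ToString(CultureInfo.InvariantCulture),
            Username,
            Password
        });
    }
}
```

A run would call `Load` at start-up, update `LastStoredId` once fetching stops, and then call `Save` with the same path.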

C# Programming

Project ID: #8758682

About the project

8 proposals, remote project, active Oct 24, 2015

Awarded to:

mananraja

Hi, I have done many web scraping projects in C# & Python...I have read the description & would like to discuss more...Looking forward to your response...Check my portfolio... https://www.freelancer.pk/projects/Web-

$150 USD in 2 days
(16 reviews)
3.9

8 freelancers are bidding an average of $169 for this job

mdkass

A proposal has not yet been provided

$131 USD in 5 days
(15 reviews)
4.4
vinod150987

Hi, hope you are well! I am Vinod Kumar. I am working as a Senior Software Developer in an IT company. I have 5+ years of experience in .NET with C#, VB, ASP.NET MVC, IIS, Web Service, WCF, JavaScript, jQuery, SQL, XML, XSLT

$244 USD in 3 days
(16 reviews)
4.0
cnharry

Hi, I have been working as a C# developer for 7 years. I am very good at writing web scrapers and can do this job!

$155 USD in 3 days
(0 reviews)
0.0
niyazmohaideen

A proposal has not yet been provided

$100 USD in 2 days
(0 reviews)
0.0
devAzhar

I have past experience writing similar software to scrape and parse data. I am familiar with tools like EDF and I can also do custom scraping and parsing in C# (console or desktop applications). Before proceeding

$166 USD in 3 days
(0 reviews)
0.0
Kamalkishover

Hi, we have done a similar project and will provide quality work as per your requirements. We have an expert team for search and also have software for web scraping. Thanks

$155 USD in 3 days
(0 reviews)
1.0