Delphi HTML parser
$495-500 USD
Paid on delivery
The goal of this project is to build an "intelligent" HTML parser that extracts data from HTML pages.
The parser should be able to automatically extract fields such as:
companyName, address, email, fax, tel, website
The parser must be able to extract this set of fields N times per page, since the HTML pages will contain tabular data (N records per page).
[url removed, login to view]();
while ([url removed, login to view]()) do begin;
data:=[url removed, login to view]();
// data should be an object or type like
// [url removed, login to view], [url removed, login to view], [url removed, login to view], [url removed, login to view], [url removed, login to view], [url removed, login to view]
end;
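A minimal sketch of what the returned data type could look like, assuming a plain record is acceptable. The name TContactData is hypothetical; the field names are taken from the list of fields above, and the sample values are made up.

```pascal
program ContactRecordSketch;
{$APPTYPE CONSOLE}

type
  // Hypothetical record type; one field per item in the requested list.
  TContactData = record
    CompanyName: string;
    Address: string;
    Email: string;
    Fax: string;
    Tel: string;
    Website: string;
  end;

var
  Data: TContactData;
begin
  // Made-up sample values, only to show how a caller would read the fields.
  Data.CompanyName := 'Example Ltd';
  Data.Email := 'info@example.com';
  WriteLn(Data.CompanyName, ' <', Data.Email, '>');
end.
```

A record keeps the sketch free of memory management; a class would work as well if the parser needs to hand out objects.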
I think a good knowledge of the DOM and of regex is necessary.
Of course it will not work on ALL websites, but it should be universal enough.
It should work with data from:
[url removed, login to view]
[url removed, login to view]
[url removed, login to view]
[url removed, login to view]
etc.
I think a good strategy would be:
1) find a repeating fragment in the DOM (when a page contains 20 results, it should extract 20 HTML blocks)
2) apply a parser to each block that contains data to be extracted
The code must be Delphi 6 compatible.
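The two steps above can be sketched roughly as follows. This is a simplified illustration, not a full solution. It assumes the repeating tag is already known (here '<tr') rather than detected automatically from the DOM, and it extracts only the email field with a plain character scan, since Delphi 6 ships without a regex unit. All names (FindFrom, SplitBlocks, ExtractEmail) and the sample page are hypothetical.

```pascal
program BlockSplitSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Returns the 1-based position of Sub in S at or after From, or 0 if absent.
function FindFrom(const Sub, S: string; From: Integer): Integer;
var
  P: Integer;
begin
  P := Pos(Sub, Copy(S, From, MaxInt));
  if P = 0 then
    Result := 0
  else
    Result := P + From - 1;
end;

// Step 1 (simplified): cut the page at every occurrence of OpenTag, so each
// entry in Blocks holds one repeated fragment. The match is a plain prefix
// match, so '<tr' would also match a tag such as '<track'.
procedure SplitBlocks(const Html, OpenTag: string; Blocks: TStrings);
var
  LowerHtml, LowerTag: string;
  StartPos, NextPos: Integer;
begin
  Blocks.Clear;
  LowerHtml := LowerCase(Html);
  LowerTag := LowerCase(OpenTag);
  StartPos := FindFrom(LowerTag, LowerHtml, 1);
  while StartPos > 0 do
  begin
    NextPos := FindFrom(LowerTag, LowerHtml, StartPos + Length(LowerTag));
    if NextPos = 0 then
      Blocks.Add(Copy(Html, StartPos, MaxInt))
    else
      Blocks.Add(Copy(Html, StartPos, NextPos - StartPos));
    StartPos := NextPos;
  end;
end;

// Step 2 (one field only): find the first '@' in the block and widen the
// selection left and right over characters commonly allowed in addresses.
function ExtractEmail(const Block: string): string;
const
  EmailChars = ['A'..'Z', 'a'..'z', '0'..'9', '.', '_', '-', '+'];
var
  AtPos, L, R: Integer;
begin
  Result := '';
  AtPos := Pos('@', Block);
  if AtPos = 0 then
    Exit;
  L := AtPos;
  while (L > 1) and (Block[L - 1] in EmailChars) do
    Dec(L);
  R := AtPos;
  while (R < Length(Block)) and (Block[R + 1] in EmailChars) do
    Inc(R);
  // Require at least one character on each side of the '@'.
  if (L < AtPos) and (R > AtPos) then
    Result := Copy(Block, L, R - L + 1);
end;

const
  // Made-up sample page with two result rows.
  SamplePage =
    '<table>' +
    '<tr><td>Acme Ltd</td><td>sales@acme.example</td></tr>' +
    '<tr><td>Globex</td><td>info@globex.example</td></tr>' +
    '</table>';

var
  Blocks: TStringList;
  I: Integer;
begin
  Blocks := TStringList.Create;
  try
    SplitBlocks(SamplePage, '<tr', Blocks);
    for I := 0 to Blocks.Count - 1 do
      WriteLn(I + 1, ': ', ExtractEmail(Blocks[I]));
  finally
    Blocks.Free;
  end;
end.
```

A real implementation would probably need to detect the repeating tag itself, for example by counting sibling elements with the same structure, and would need one extractor per field.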
Project No.: #3451768