Modify Python's Scrapy for a specific use (JavaScript rendering using Splash/PhantomJS, matching some regexes, etc.)

Closed Posted 6 years ago Paid on delivery

Hi guys, this is a simple project. The architecture and Scrapy logic are already designed, and the detailed specification will be given later. Some important points:

1. Read seed URLs from a txt file, one URL per line.

2. Scrape webpage content for 2 levels (hops). JavaScript rendering is needed, e.g. with Splash or PhantomJS.

3. Convert the scraped 2-level content from HTML to text and match it against 2 predefined regexes.

4. Output the result in CSV format, comma separated. Other columns include the title, keyword info and description info of the root page, etc.

5. As the output format is CSV, some filtering of non-printable characters and special characters such as punctuation is needed. As webpages use different encodings, the final output will use UTF-8, and the characters will be converted to lower case. Please kindly handle this in the program.

6. It will be much appreciated if the program/script can be done within 1~2 days. As I do not live in a well-developed country and this project has a limited budget (10~20 USD), a lower quote is much appreciated. The detailed specification will be given later.
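
For illustration only, here is a minimal standard-library sketch of points 1, 3, 4 and 5. It is not the project's actual design, which has not been shared yet. It assumes the crawling and JavaScript rendering of point 2 (Scrapy with Splash or PhantomJS) happen elsewhere and hand over already rendered HTML. The two patterns, the column names and all function names are hypothetical placeholders, and only the root page title is carried over as an extra column.

```python
import csv
import io
import re
import unicodedata
from html.parser import HTMLParser

# Hypothetical stand-ins for the 2 predefined regexes (not given in the post).
PATTERNS = {
    "pattern_a": re.compile(r"\bprice\s+\d+", re.IGNORECASE),
    "pattern_b": re.compile(r"\bcontact\b", re.IGNORECASE),
}


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def read_seed_urls(lines):
    """Point 1: one URL per line; blank lines are ignored."""
    return [line.strip() for line in lines if line.strip()]


def html_to_text(html):
    """Point 3: reduce already rendered HTML to whitespace-normalized text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def clean(text):
    """Point 5: lower-case, then blank out punctuation (P), symbols (S)
    and control/other non-printable characters (C) by Unicode category."""
    kept = (
        " " if unicodedata.category(ch)[0] in "PSC" else ch
        for ch in text.lower()
    )
    return " ".join("".join(kept).split())


def build_row(url, title, html):
    """Match the patterns on the plain text, then clean values for output."""
    text = html_to_text(html)
    row = {"url": url, "title": clean(title)}
    for name, pattern in PATTERNS.items():
        row[name] = clean(" ".join(pattern.findall(text)))
    return row


def to_csv(rows, fieldnames):
    """Point 4: comma-separated CSV, returned as UTF-8 encoded bytes."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")
```

In this sketch the regexes run on the plain text before cleaning, so patterns that rely on punctuation still match, and only the values written to the CSV are filtered and lower-cased. A caller would pass an open seed file to `read_seed_urls`, call `build_row` once per rendered page, and write the bytes from `to_csv` to the output file.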

Regards

JavaScript Linux Python VPS Web Scraping

Project ID: #14058611

About the project

9 proposals Remote project Active 6 years ago

9 freelancers are bidding an average of $75 for this job

mantislin

Hi sir, this is Kimi and I am a scraping expert. I have done many scraping projects; please check my profile page to see them: https://www.freelancer.com/u/mantislin.html Can you tell me …

$109 USD in 3 days
(195 reviews)
7.1
schoudhary1553

Greetings sir, I am an expert freelancer for this job, and your 100% satisfaction is assured if you allow me to serve. Here is why you should pick me: a) I am very experienced and have the same kind of ex …

$150 USD in 1 day
(18 reviews)
5.2
stevegtdbz

Hello sir, I have completed many similar projects in the past. I mainly use Python + Selenium + PhantomJS to scrape data. I can provide a very powerful Python script using PhantomJS - multithreading - proxy support - …

$25 USD in 2 days
(16 reviews)
4.4
shaliniramadass

Hi, we are a 1000+ employee firm charging $6 an hour. We can start on any technology immediately. Direct access to developers via Skype, Google Talk and hotline, with 24/7 availability for all 1000+ resources. We can bet you that no …

$25 USD in 1 day
(0 reviews)
0.0
techcrunch2

Hello sir, we have gone through the details you provided. We have already worked on a similar project and can deliver as you have mentioned, and we would be pleased to work on this with you to deliver the resu …

$28 USD in 6 days
(0 reviews)
0.0
iqranabi

Hello, I am Iqra, a data entry/data processing expert who knows the value of time, works very hard and always delivers on time. My motive is to make my employer happy without adding additional charges. …

$22 USD in 3 days
(0 reviews)
0.0
ishtiyaqlone

Hello, my name is Ishtiyaq. I am a certified Python expert with 6+ years of experience in the Python language, and I have completed 100+ projects using Python. Expertise: Python, Django, Django REST Framework and many pyth …

$10 USD in 5 days
(0 reviews)
0.0