
Extract CommonCrawl data (EC2, Cluster Operations) - great big data project for your CV

$30-250 USD

Closed
Posted about 4 years ago

Paid on delivery
We'd like you to build a set of functions that lets us work with Common Crawl on a monthly basis (or whenever a new crawl is released). [login to view URL] Here is an article that explains this for PHP or Python (see the comment section): [login to view URL]@paulrim/mining-common-crawl-with-php-39e14082c55c

Basically, we need to process all crawl data on an EC2 Ubuntu Spot Instance, working from the list of all files. The list (which can live in a central database) must be worked through completely, and processing must be resumable, because spot instances do not live very long. We have a small PHP (or Python) script that parses HTML and extracts data (just like in the article). Once a file is done, the next file should be worked on. We recommend periodic reboots to reclaim swap space and keep machine usage in check.

We expect a set of tools that runs without interruptions and produces a folder in S3 with the parsed data. We will then get all the files in that folder and import them (extraction from S3 is not part of this project; we will cover this).
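
The resumable file list described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the client's design: SQLite stands in for the "central database", the table layout, the `LEASE_SECONDS` value, and all function names are hypothetical, and the `process` callback is a placeholder for the existing parsing script plus the S3 upload. A file claimed by an instance that was terminated becomes available again once its lease expires, which is what lets a replacement spot instance pick up where the previous one stopped.

```python
import sqlite3
import time

# Hypothetical lease: how long a claimed file is reserved before it may be retried.
LEASE_SECONDS = 3600


def open_queue(path=":memory:"):
    """Open (and create if needed) the file list. SQLite is a stand-in here."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS files (
               path TEXT PRIMARY KEY,
               status TEXT NOT NULL DEFAULT 'pending',
               claimed_at REAL
           )"""
    )
    conn.commit()
    return conn


def add_files(conn, paths):
    """Register crawl file paths; paths already present are left untouched."""
    with conn:
        conn.executemany(
            "INSERT OR IGNORE INTO files (path) VALUES (?)",
            [(p,) for p in paths],
        )


def claim_next(conn, now=None):
    """Return the next pending (or lease-expired) path, or None if nothing is left."""
    now = time.time() if now is None else now
    with conn:
        row = conn.execute(
            """SELECT path FROM files
               WHERE status = 'pending'
                  OR (status = 'claimed' AND claimed_at < ?)
               ORDER BY path LIMIT 1""",
            (now - LEASE_SECONDS,),
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE files SET status = 'claimed', claimed_at = ? WHERE path = ?",
            (now, row[0]),
        )
    return row[0]


def mark_done(conn, path):
    """Record that a file was fully processed and its output stored."""
    with conn:
        conn.execute(
            "UPDATE files SET status = 'done', claimed_at = NULL WHERE path = ?",
            (path,),
        )


def run_worker(conn, process):
    """Process files one by one until the list is exhausted; return the count."""
    count = 0
    while True:
        path = claim_next(conn)
        if path is None:
            return count
        process(path)  # placeholder: download, parse, upload result to S3
        mark_done(conn, path)
        count += 1
```

On a fresh instance, a startup script would call `run_worker` with the same database, so no separate "resume" step is needed. Note that the select-then-update claim shown here is not safe for several workers hitting one database at the same time; a shared database would need an atomic claim (for example a conditional update or row locking).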
Project ID: 24255131

About the project

6 proposals
Remote project
Active 4 years ago

6 freelancers are bidding an average of $183 USD for this job
Hi there, I have read your project description and I'm an expert in Python and machine learning, so I can do this project for you. I still have a few questions. Please leave a message on my chat so we can discuss the budget and deadline of the project. Thanks.
$250 USD in 5 days
4.7 (85 reviews)
7.5
Your project budget is too low. If you are open to negotiating, I will be happy to help you out. You can also take a look at my profile to verify my AWS proficiency. Thanks, Jay
$140 USD in 7 days
4.9 (87 reviews)
6.8
***AWS EXPERT*** Hi, hope you are doing great! I have extensive work experience in server administration and project management. AWS services: EC2, S3, RDS, CloudFront, and many more. I provide all kinds of solutions related to network and system administration: Linux, Unix, Apache, servers, Lambda, EC2, OpenSSL. I am grateful for your time and consideration, and I look forward to speaking with you further about this position. I am willing to work 40 hrs per week on your project if you hire me. Warm regards, Ranu
$140 USD in 7 days
4.9 (62 reviews)
6.0
Hi, I have read your job description carefully and I am very interested in this job. I am a full-stack web developer with strong experience in web crawlers (PHP, Python), AWS EC2, and S3. I can start working right away and am looking for a good long-term relationship. Regards, Ionel.
$200 USD in 5 days
4.9 (28 reviews)
5.6
I have built a platform that profiles companies fully automatically by crawling data from the internet and using machine learning for data merging and scoring. The platform runs entirely on AWS and makes heavy use of spot instances, as we have many crawlers and weekly updates for all data points. The platform is https://www.seekr.ai. I believe the skills I used there are a good match for your project's requirements. We can discuss this in more detail if you like. Thanks.
$250 USD in 7 days
0.0 (0 reviews)
0.0

About this client

San Francisco, United States
5.0
231
Payment method verified
Member since Apr 4, 2010
