Data Engineering

₹75000-150000 INR

Closed
Posted 5 months ago

Paid on delivery
Project Brief: Secondary Sales Data Engineering

1. Introduction

Our company currently utilizes a network of channel partners (Super Stockists) across various Indian states to sell products with specific SKUs to a downstream distribution network. However, due to the diverse software used by these Super Stockists, product names and SKUs often diverge from our standardized format, hindering the collection and analysis of secondary sales data. This data arrives in various formats such as CSV, Excel, and PDF, further complicating the process.

2. Project Objective

To address this challenge, we propose a data engineering project focused on transforming and homogenizing secondary sales data received from Super Stockists. This project aims to achieve the following:

Standardize product names and SKUs: Map non-uniform names and SKUs used by Super Stockists to our standardized format.
Transform data format: Convert data from various formats (CSV, Excel, PDF) into a single, unified format.
Clean and validate data: Identify and correct any inconsistencies or errors within the data.
Aggregate and structure data: Organize the transformed data into a readily analyzable format for further utilization.

3. Expected Deliverables

Data pipeline: An automated pipeline for ingesting, transforming, and cleaning secondary sales data.
Standardized dataset: A clean and consistent dataset with uniform product names, SKUs, and format.
Data quality report: A detailed report outlining the data cleaning process, identified issues, and applied corrections.
Documentation: Comprehensive documentation outlining the data pipeline, data transformation steps, and data format specifications.

4. Key Success Factors

Accuracy: The standardized dataset must reflect accurate and consistent product information.
Completeness: The pipeline should capture and process all secondary sales data received from Super Stockists.
Efficiency: The pipeline should operate efficiently to minimize processing time and resource consumption.
Scalability: The solution should be scalable to accommodate future growth in data volume.
Maintainability: The pipeline and code should be well-documented and easy to maintain for future updates and modifications.

5. Next Steps

Detailed project proposal: Prepare a detailed project proposal outlining the proposed methodology, resources required, timelines, and project costs.
Data source review: Conduct a comprehensive review of the data sources (formats, content, etc.) from Super Stockists.
Data quality assessment: Evaluate the initial data quality and identify potential challenges and cleaning requirements.
Prototype development: Develop and test a prototype of the data pipeline to demonstrate feasibility and address any technical hurdles.
Project kickoff meeting: Convene a kickoff meeting with key stakeholders to finalize project scope, deliverables, and timeline.

We mostly use Microsoft in our organization, so that would be the preference.
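
To make the "standardize product names and SKUs" and "clean and validate data" objectives concrete, here is a minimal sketch of one possible approach. It is not the client's method. It assumes a stockist file has already been read as CSV text with two hypothetical columns, `product` and `qty`, and that a master catalogue of standardized SKUs exists. The catalogue entries, column names, function names, and the 0.8 similarity cutoff are all illustrative assumptions. Python's standard library is used only for illustration; the same logic could be built with Microsoft tooling, which the brief prefers.

```python
import csv
import difflib
import io

# Hypothetical master catalogue: standardized SKU -> standardized product name.
MASTER_CATALOGUE = {
    "SKU-1001": "Herbal Shampoo 200ml",
    "SKU-1002": "Herbal Conditioner 200ml",
    "SKU-2001": "Neem Soap 100g",
}


def normalize(text):
    """Lowercase and collapse whitespace so cosmetic differences do not block a match."""
    return " ".join(text.lower().split())


def match_sku(raw_name, cutoff=0.8):
    """Return the standardized SKU whose name is closest to raw_name, or None."""
    by_name = {normalize(name): sku for sku, name in MASTER_CATALOGUE.items()}
    hits = difflib.get_close_matches(
        normalize(raw_name), list(by_name), n=1, cutoff=cutoff
    )
    return by_name[hits[0]] if hits else None


def standardize(csv_text):
    """Split stockist rows into clean rows and rejected rows.

    Rejected rows (unmatched product or invalid quantity) are kept so they
    can feed a data quality report instead of being silently dropped.
    """
    clean, rejected = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        sku = match_sku(row.get("product") or "")
        try:
            qty = int(row.get("qty"))
        except (TypeError, ValueError):
            qty = None
        if sku is None or qty is None or qty < 0:
            rejected.append(row)
        else:
            clean.append({"sku": sku, "product": MASTER_CATALOGUE[sku], "qty": qty})
    return clean, rejected
```

Fuzzy matching of this kind can produce wrong matches for near-identical product names (for example, pack sizes that differ by one digit), so in practice a reviewed, per-stockist mapping table would likely be needed alongside it. Excel and PDF inputs would require a separate extraction step before reaching this stage.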
Project ID: 37527200

About the project

26 proposals
Remote project
Active 4 months ago

About this client

Madurai, India
5.0
3
Payment method verified
Member since Apr 30, 2021
