Big Data (Cassandra & Spark)

$250-750 USD

Closed

Posted

about 9 years ago

Paid on delivery

We need to build a Cassandra cluster and an Apache Spark / Hadoop cluster in AWS, both to investigate the technology as a proof of concept (POC) and to demo it to our clients.

- Build the Cassandra cluster (3 nodes) so that it can expand as needed. Scaling the cluster should be a matter of minutes with zero or minimal configuration (you need to build the image for the x-node, that is, any additional node).
- Build the initial app. This app will create the initial data structure and add the dummy data (randomly generated by you):
  - Users table: [UserID, Username, FirstName, LastName]
  - Accounts table (User --[one-to-many]--> Account): [AccountNo, Currency, Balance]
  - Transaction table (Account --[one-to-many]--> Transaction): [TransactionID, time-stamp, details (String, 256 char), category (String, 12 char)]
- Build the Spark cluster (same requirements as for Cassandra).
- Build the update app. This app will continuously update the Transaction table with dummy transaction data (100-1000 transactions per second).
- Install and configure a driver/pipe to make the data in Cassandra available to Spark.
- Create a query app (Java + SQL) that will connect and execute some queries on both Cassandra and Spark.

TERMS & CONDITIONS

- Please bid only if you already have experience with Cassandra / Apache Spark / Hadoop clusters in AWS.
- Only SQL or Java are allowed (no Python or other scripts unless they are used for configuration; for example, a bash script is welcome).
- Linux only.
- Provide documentation and screenshots of the steps taken to create the cluster, plus the scripts and any configuration made on the nodes or in AWS.
- We will provide an AWS account (for that we need a legal document from you, e.g. passport, ID, certificate, and a signed NDA).
- "App" does not mean a mobile app; it means a small application. It will be used by skilled developers, so a command line is welcome (don't waste your time making a fancy UI).
- While in this step Spark reads data from Cassandra, in the near future Spark should be able to read from other sources, covering both structured and unstructured data (e.g. log files).
- The idea of this project is to show the speed and scalability of Cassandra / Spark. Speed is the key factor for a successful implementation, e.g. retrieving the last 3 years of transactions in a matter of 100-1000 milliseconds.
- This is a POC for us and for you (the contractor), so future work could be needed.
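To make the update-app requirement more concrete, here is a minimal Java sketch of how one second of dummy transaction load might be generated. It is only an illustration under stated assumptions: the keyspace `poc`, the column names, the partitioning by account with newest rows first, and the class and method names are all hypothetical and not part of the brief. Only the field limits (256 and 12 characters) and the 100-1000 transactions per second range come from the brief. The sketch uses the standard library only and does not talk to Cassandra; writing the rows through a driver is left out.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.UUID;

/** Sketch of a dummy-transaction generator; all names here are illustrative assumptions. */
final class TransactionGenerator {

    // Assumed CQL layout (not from the brief): one partition per account,
    // newest rows first, so recent transactions of an account sit together.
    static final String CREATE_TABLE =
        "CREATE TABLE IF NOT EXISTS poc.transactions ("
        + " account_no text, ts timestamp, transaction_id uuid,"
        + " details text, category text,"
        + " PRIMARY KEY ((account_no), ts, transaction_id))"
        + " WITH CLUSTERING ORDER BY (ts DESC, transaction_id ASC)";

    static final int MAX_DETAILS = 256;  // limit stated in the brief
    static final int MAX_CATEGORY = 12;  // limit stated in the brief
    static final String[] CATEGORIES =
        {"groceries", "rent", "salary", "transfer", "utilities"};

    /** One row of the Transaction table. */
    static final class Transaction {
        final String accountNo;
        final UUID transactionId;
        final Instant timestamp;
        final String details;
        final String category;

        Transaction(String accountNo, UUID transactionId, Instant timestamp,
                    String details, String category) {
            // Enforce the column limits before anything is written.
            if (details.length() > MAX_DETAILS || category.length() > MAX_CATEGORY) {
                throw new IllegalArgumentException("field exceeds column limit");
            }
            this.accountNo = accountNo;
            this.transactionId = transactionId;
            this.timestamp = timestamp;
            this.details = details;
            this.category = category;
        }
    }

    /** Builds one random transaction for a randomly chosen account. */
    static Transaction randomTransaction(Random rng, List<String> accounts, Instant at) {
        String account = accounts.get(rng.nextInt(accounts.size()));
        String category = CATEGORIES[rng.nextInt(CATEGORIES.length)];
        String details = "Dummy " + category + " payment #" + rng.nextInt(1_000_000);
        return new Transaction(account, UUID.randomUUID(), at, details, category);
    }

    /** Builds the rows for one second of load, spread evenly across that second. */
    static List<Transaction> oneSecondOfLoad(Random rng, List<String> accounts,
                                             int perSecond, Instant start) {
        if (perSecond < 100 || perSecond > 1000) {
            throw new IllegalArgumentException("rate must be 100-1000 per second");
        }
        List<Transaction> rows = new ArrayList<>(perSecond);
        for (int i = 0; i < perSecond; i++) {
            Instant at = start.plusMillis(i * 1000L / perSecond);
            rows.add(randomTransaction(rng, accounts, at));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> accounts = List.of("ACC-0001", "ACC-0002", "ACC-0003");
        List<Transaction> rows = oneSecondOfLoad(new Random(42), accounts, 500, Instant.now());
        System.out.println(CREATE_TABLE);
        System.out.println("generated " + rows.size() + " rows");
    }
}
```

A real update app would call `oneSecondOfLoad` in a loop, write each row to Cassandra, and sleep for the remainder of each second. Whether per-account partitioning can meet the 100-1000 millisecond read target for three years of data would have to be measured on the actual cluster.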

Project ID: 7261424

About the project

12 proposals

Remote project

Active 9 years ago

12 freelancers are bidding an average of $1,926 USD for this job

@superior5

George Bailey here, from Los Angeles, USA. I have already done a similar project and am an expert in the skills this project requires. Please get back to me to discuss further and finalize the agreement. Regards,

$2,577 USD in 10 days

5.0

(10 reviews)

5.6

@nabusteam

Hi, we are ready to help. Could you provide us with more details about your project? Best regards, Dmytro Usenko

$1,444 USD in 18 days

5.0

(1 review)

4.1

@pbqeu

Hi, I am very interested in your project. I am a highly skilled Java developer with over 10 years of experience, and I am currently exploring big data technologies: Hadoop, Storm, Kafka, Spark, Cassandra, etc. I want to take this challenge to the next level.

$1,250 USD in 20 days

4.9

(2 reviews)

4.0

@williamgranzotto

Hi, we are a team specialized in web development. We can do your project. For more information, visit our portfolio. Kind regards, William.

$1,250 USD in 20 days

5.0

(1 review)

0.7

@PhDStandard

A proposal has not yet been provided

$1,200 USD in 5 days

0.0

(0 reviews)

0.0

@orbitbizsol

Hello sir, I have read your requirements carefully and am ready to start work right away. I hope you will give me a chance to work with you. Waiting for your reply. Thank you, Bhadresh. Skype: bd_orbit

$2,500 USD in 25 days

0.0

(0 reviews)

0.0

@naveen824127

I have 4+ years of working experience in the data mining and machine learning domain and hold a master's degree in computer science. I have worked on many projects, mainly in predictive analytics, natural language, text mining, web mining, etc. Expertise in R, Python, Hadoop, MapReduce, HBase, Hive, Pig, etc.

$1,500 USD in 30 days

0.0

(0 reviews)

0.0

@kksuicmez

A proposal has not yet been provided

$5,555 USD in 5 days

0.0

(0 reviews)

0.0

@sancsvision

Hi, I have a total of 5 years of experience in Hadoop and Big Data processing application development, as well as Hadoop administration, Hadoop cluster configuration, security configuration, HA cluster setup, disaster recovery cluster setup, YARN configuration, etc., using Cloudera, Hortonworks, and PivotalHD. I also have very good experience with cluster setup in the cloud (a different approach), cluster creation, and automatic deployment. Please contact me for more details about me, and let me know when we can schedule a demo session. I am open to answering your questions before taking the assignment. Thanks for your time! Hadoop developer freelancer bid: I have done various industrial projects in Hadoop and Big Data using MapReduce, Pig, Hive, Sqoop, Oozie, Flume, Kafka, Cassandra, HBase, and Spark for reputed companies in the UK, France, and the USA. I have a total of 5 years of experience in Hadoop and Big Data processing system development, as well as Hadoop administration, Hadoop cluster configuration, security configuration, HA cluster setup, disaster recovery cluster setup, etc. I have good knowledge and experience of structured and unstructured data, and exposure to data migration from RDBMS to Hadoop, Hadoop to RDBMS, and Big Data ETL frameworks. I am a certified Hadoop developer and I consider myself a good candidate for the project. Please contact me for more details about me. I am open to answering your questions before taking the project. Looking forward to getting in touch with you for further discus

$833 USD in 10 days

0.0

(0 reviews)

0.0

@snark1974

The installation/setup is trivial, as we just recently went through this exercise to set up an environment similar to yours: AWS + Cassandra + Spark. We tried quite a few different configurations, including Mesos, but ended up with a standalone cluster managed by Datastax Max, which really simplifies the integration. I have a few questions to clarify:

- You mentioned Java. Is Scala OK, given that it is Spark's native language and runs on the JVM?
- You mentioned SQL. Did you mean that SparkSQL needs to be used? In one of our projects, we have a setup identical to yours: AWS + Cassandra + Spark. JDBC over SparkSQL is a pain, and performance is not great; yet the Cassandra-Spark connector works quite nicely.
- Which version of Cassandra/Spark are you planning to use? Open-source Apache, or Datastax Max, which includes Cassandra/Spark/Solr?
- Your last requirement is about performance/scalability. Spark is lightning fast once the data is in memory, but fetching it from disk or the database takes time, of course. So when you require "X rows to be processed in N seconds", did you mean the first access, or the performance once the data has been loaded?

Also, I must tell you that Spark is one of my "sweet spots": I'm working on a number of side projects related to Spark, in particular the Spark + Cassandra ecosystem. My goal is to expose the Spark / Spark Streaming API to analysts, not developers, so that they can create data flows easily. If you're interested, I could share more detail

$2,777 USD in 10 days

0.0

(0 reviews)

0.0

@Orb0

I have extensive experience in both cloud computing and Big Data. As you can see from my resume, I have been working in the industry for large corporate clients and governments, solving big data problems for massive organizations. I have experience with Hadoop: deploying the solution, configuring the infrastructure, and the overall solution architecture. I also have strong experience with AWS and deployment automation. I propose designing a mostly automated solution on AWS for your prototype/POC, as outlined in the requirements you have provided. Please see my proposed milestones for an outline of the key deliverables as the project progresses, with controlled risk for you throughout the process. I always strive towards a complete, holistic solution, so an early detailed assessment (under NDA and an awarded contract) of your entire requirements and needs is important to me. I see this as a first step towards a potentially long-term, fruitful relationship between myself and your organization. Thank you for your time in reviewing my bid, and please don't hesitate to contact me with any further questions or clarifications. I look forward to hearing from you.

$1,222 USD in 15 days