Apache Flume Jobs in Chennai

11+ Apache Flume Jobs in Chennai | Apache Flume Job openings in Chennai

Apply to 11+ Apache Flume Jobs in Chennai on CutShort.io. Explore the latest Apache Flume Job opportunities across top companies like Google, Amazon & Adobe.

GeakMinds Technologies Pvt Ltd
Posted by John Richardson
Chennai
1 - 5 yrs
₹1L - ₹6L / yr
Hadoop
Big Data
HDFS
Apache Sqoop
Apache Flume
+2 more
  • Looking for Big Data Engineer with 3+ years of experience.
  • Hands-on experience with MapReduce-based platforms, like Pig, Spark, Shark.
  • Hands-on experience with data pipeline tools like Kafka, Storm, Spark Streaming.
  • Store and query data with Sqoop, Hive, MySQL, HBase, Cassandra, MongoDB, Drill, Phoenix, and Presto.
  • Hands-on experience in managing Big Data on a cluster with HDFS and MapReduce.
  • Handle streaming data in real time with Kafka, Flume, Spark Streaming, Flink, and Storm.
  • Experience with Azure cloud, Cognitive Services, Databricks is preferred.
Tier 1 MNC

Agency job
Chennai, Pune, Bengaluru (Bangalore), Noida, Gurugram, Kochi (Cochin), Coimbatore, Hyderabad, Mumbai, Navi Mumbai
3 - 12 yrs
₹3L - ₹15L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+1 more
Greetings,
We are hiring a software developer with good knowledge of Spark, Hadoop, and Scala for a Tier 1 MNC.
Ganit Business Solutions

3 recruiters
Posted by Viswanath Subramanian
Chennai, Bengaluru (Bangalore), Mumbai
4 - 6 yrs
₹7L - ₹15L / yr
SQL
Amazon Web Services (AWS)
Data Warehouse (DWH)
Informatica
ETL
+1 more

Responsibilities:

  • Must be able to write quality code and build secure, highly available systems.
  • Assemble large, complex datasets that meet functional / non-functional business requirements.
  • Identify, design, and implement internal process improvements, with guidance: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
  • Monitor performance and advise on any necessary infrastructure changes.
  • Define data retention policies.
  • Implement the ETL process and optimal data pipeline architecture.
  • Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
  • Create design documents that describe the functionality, capacity, architecture, and process.
  • Develop, test, and implement data solutions based on finalized design documents.
  • Work with data and analytics experts to strive for greater functionality in our data systems.
  • Proactively identify potential production issues and recommend and implement solutions.

Skillsets:

  • Good understanding of optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
  • Proficient understanding of distributed computing principles.
  • Experience working with batch processing / real-time systems using various open-source technologies like NoSQL, Spark, Pig, Hive, Apache Airflow.
  • Has implemented complex projects dealing with considerable data size (PB).
  • Optimization techniques (performance, scalability, monitoring, etc.).
  • Experience with integration of data from multiple data sources.
  • Experience with NoSQL databases, such as HBase, Cassandra, MongoDB, etc.
  • Knowledge of various ETL techniques and frameworks, such as Flume.
  • Experience with various messaging systems, such as Kafka or RabbitMQ.
  • Good understanding of Lambda Architecture, along with its advantages and drawbacks.
  • Creation of DAGs for data engineering.
  • Expert at Python / Scala programming, especially for data engineering / ETL purposes.
Chennai, Hyderabad
5 - 10 yrs
₹10L - ₹25L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

Big Data with cloud:

Experience: 5-10 years

Location: Hyderabad/Chennai

Notice period: 15-20 days max

1.  Expertise in building AWS data engineering pipelines with AWS Glue -> Athena -> QuickSight

2.  Experience in developing Lambda functions with AWS Lambda

3.  Expertise with Spark/PySpark: the candidate should be hands-on with PySpark code and able to do transformations with Spark

4.  Should be able to code in Python and Scala

5.  Snowflake experience will be a plus

A leading global information technology and business process

Agency job
via Jobdost by Mamatha A
Chennai
5 - 14 yrs
₹13L - ₹21L / yr
Python
Java
PySpark
JavaScript
Hadoop

Python + Data Scientist:

• Hands-on and sound knowledge of Python, PySpark, JavaScript

• Build data-driven models to understand the characteristics of engineering systems

• Train, tune, validate, and monitor predictive models

• Sound knowledge of statistics

• Experience in developing data processing tasks using PySpark, such as reading, merging, enrichment, and loading of data from external systems to target data destinations

• Working knowledge of Big Data and/or Hadoop environments

• Experience creating CI/CD pipelines using Jenkins or similar tools

• Practiced in eXtreme Programming (XP) disciplines

Bungee Tech India
Posted by Abigail David
Remote, NCR (Delhi | Gurgaon | Noida), Chennai
5 - 10 yrs
₹10L - ₹30L / yr
Big Data
Hadoop
Apache Hive
Spark
ETL
+3 more

Company Description

At Bungee Tech, we help retailers and brands meet customers everywhere and on every occasion they are in. We believe that accurate, high-quality data matched with compelling market insights empowers retailers and brands to keep their customers at the center of all the innovation and value they deliver.

 

We provide a clear and complete omnichannel picture of their competitive landscape to retailers and brands. We collect billions of data points every day and multiple times in a day from publicly available sources. Using high-quality extraction, we uncover detailed information on products or services, which we automatically match, and then proactively track for price, promotion, and availability. Plus, anything we do not match helps to identify a new assortment opportunity.

 

Empowered with this unrivalled intelligence, we unlock compelling analytics and insights that, once blended with verified partner data from trusted sources such as Nielsen, paint a complete, consolidated picture of the competitive landscape.

We are looking for a Big Data Engineer who will work on the collecting, storing, processing, and analyzing of huge sets of data. The primary focus will be on choosing optimal solutions to use for these purposes, then maintaining, implementing, and monitoring them.

You will also be responsible for integrating them with the architecture used in the company.

 

We're working on the future. If you are seeking an environment where you can drive innovation, if you want to apply state-of-the-art software technologies to solve real-world problems, and if you want the satisfaction of providing visible benefit to end users in an iterative, fast-paced environment, this is your opportunity.

 

Responsibilities

As an experienced member of the team, in this role, you will:

  • Contribute to evolving the technical direction of analytical systems and play a critical role in their design and development.
  • Research, design, code, troubleshoot, and support. What you create is also what you own.
  • Develop the next generation of automation tools for monitoring and measuring data quality, with associated user interfaces.
  • Be able to broaden your technical skills and work in an environment that thrives on creativity, efficient execution, and product innovation.

BASIC QUALIFICATIONS

  • Bachelor’s degree or higher in an analytical area such as Computer Science, Physics, Mathematics, Statistics, Engineering or similar.
  • 5+ years relevant professional experience in Data Engineering and Business Intelligence
  • 5+ years of experience with advanced SQL (analytical functions), ETL, and data warehousing.
  • Strong knowledge of data warehousing concepts, including data warehouse technical architectures, infrastructure components, ETL/ ELT and reporting/analytic tools and environments, data structures, data modeling and performance tuning.
  • Ability to effectively communicate with both business and technical teams.
  • Excellent coding skills in Java, Python, C++, or equivalent object-oriented programming language
  • Understanding of relational and non-relational databases and basic SQL
  • Proficiency with at least one of these scripting languages: Perl / Python / Ruby / shell script

 

PREFERRED QUALIFICATIONS

 

  • Experience with building data pipelines from application databases.
  • Experience with AWS services - S3, Redshift, Spectrum, EMR, Glue, Athena, ELK etc.
  • Experience working with Data Lakes.
  • Experience providing technical leadership and mentoring other engineers on best practices in the data engineering space
  • Sharp problem-solving skills and ability to resolve ambiguous requirements
  • Experience working with Big Data
  • Knowledge of and experience working with Hive and the Hadoop ecosystem
  • Knowledge of Spark
  • Experience working with Data Science teams
TVS Credit Services

2 recruiters
Posted by Vinodhkumar Panneerselvam
Chennai
4 - 10 yrs
₹10L - ₹20L / yr
Data Science
R Programming
Python
Machine Learning (ML)
Hadoop
+3 more
Job Description:

  • Be responsible for scaling our analytics capability across all internal disciplines and guide our strategic direction with regard to analytics.
  • Organize and analyze large, diverse data sets across multiple platforms.
  • Identify key insights and leverage them to inform and influence product strategy.
  • Handle technical interactions with vendors or partners on scope, approach, and deliverables.
  • Develop proofs of concept to prove or disprove the validity of a concept.
  • Work with all parts of the business to identify analytical requirements and formalize an approach for reliable, relevant, accurate, efficient reporting on those requirements.
  • Design and implement advanced statistical testing for customized problem solving.
  • Deliver concise verbal and written explanations of analyses to senior management that elevate findings into strategic recommendations.

Desired Candidate Profile:

  • MTech / BE / BTech / MSc in CS, Stats, Maths, Operations Research, Econometrics, or any quantitative field.
  • Experience in using Python, R, SAS.
  • Experience in working with large data sets and big data systems (SQL, Hadoop, Hive, etc.).
  • Keen aptitude for large-scale data analysis with a passion for identifying key insights from data.
  • Expert working knowledge of various machine learning algorithms such as XGBoost, SVM, etc.

We are looking for candidates with the following backgrounds:

  • Experience in Unsecured Loans & SME Loans analytics (cards, installment loans), risk-based pricing analytics.
  • Experience in differential pricing / selection analytics (retail, airlines / travel, etc.).
  • Experience in digital product companies or digital eCommerce, with a product mindset.
  • Experience in Fraud / Risk from Banks, NBFC / Fintech / Credit Bureau.
  • Experience in online media with knowledge of media, online ads & sales (agencies); knowledge of DMP, DFP, Adobe/Omniture tools, Cloud.
  • Experience in Consumer Durable Loans lending companies (experience in Credit Cards, Personal Loan optional).
  • Experience in Tractor Loans lending companies (experience in Farm).
  • Experience in Recovery, Collections analytics.
  • Experience in Marketing Analytics with Digital Marketing, Market Mix modelling, Advertising Technology.
Telecom Client

Agency job
via Eurka IT SOL by Srikanth a
Chennai
5 - 13 yrs
₹9L - ₹28L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+6 more
  • Demonstrable experience owning and developing big data solutions, using Hadoop, Hive/HBase, Spark, Databricks, ETL/ELT for 5+ years
  • 10+ years of Information Technology experience, preferably with Telecom / wireless service providers
  • Experience in designing data solutions following Agile practices (SAFe methodology); designing for testability, deployability and releasability; rapid prototyping, data modeling, and decentralized innovation
  • DataOps mindset: allowing the architecture of a system to evolve continuously over time, while simultaneously supporting the needs of current users
  • Create and maintain Architectural Runway and Non-Functional Requirements
  • Design for a Continuous Delivery Pipeline (CI/CD data pipeline) and enable Built-in Quality & Security from the start
  • Demonstrable understanding, and ideally use, of at least one recognised architecture framework or standard, e.g. TOGAF, Zachman Architecture Framework, etc.
  • The ability to apply data, research, and professional judgment and experience to ensure our products are making the biggest difference to consumers
  • Demonstrated ability to work collaboratively
  • Excellent written, verbal and social skills. You will be interacting with all types of people (user experience designers, developers, managers, marketers, etc.)
  • Ability to work in a fast-paced, multiple-project environment on an independent basis and with minimal supervision
  • Technologies: .NET, AWS, Azure; Azure Synapse, NiFi, RDS, Apache Kafka, Azure Databricks, Azure Data Lake Storage, Power BI, Reporting Analytics, QlikView, SQL on-prem data warehouse; BSS, OSS & Enterprise Support Systems

netmedscom

3 recruiters
Posted by Vijay Hemnath
Chennai
2 - 5 yrs
₹6L - ₹25L / yr
Big Data
Hadoop
Apache Hive
Scala
Spark
+12 more

We are looking for an outstanding Big Data Engineer with experience setting up and maintaining a Data Warehouse and Data Lakes for an organization. This role will collaborate closely with the Data Science team and help the team build and deploy machine learning and deep learning models on big data analytics platforms.

Roles and Responsibilities:

  • Develop and maintain scalable data pipelines and build out new integrations and processes required for optimal extraction, transformation, and loading of data from a wide variety of data sources using 'Big Data' technologies.
  • Develop programs in Scala and Python as part of data cleaning and processing.
  • Assemble large, complex data sets that meet functional / non-functional business requirements, and foster data-driven decision making across the organization.
  • Design and develop distributed, high-volume, high-velocity, multi-threaded event processing systems.
  • Implement processes and systems to validate data and monitor data quality, ensuring production data is always accurate and available for the key stakeholders and business processes that depend on it.
  • Perform root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
  • Provide operational excellence, guaranteeing high availability and platform stability.
  • Collaborate closely with the Data Science team and help the team build and deploy machine learning and deep learning models on big data analytics platforms.

Skills:

  • Experience with Big Data pipeline, Big Data analytics, Data warehousing.
  • Experience with SQL/No-SQL, schema design and dimensional data modeling.
  • Strong understanding of Hadoop architecture and the HDFS ecosystem, and experience with the Big Data technology stack such as HBase, Hadoop, Hive, MapReduce.
  • Experience in designing systems that process structured as well as unstructured data at large scale.
  • Experience in AWS/Spark/Java/Scala/Python development.
  • Strong skills in PySpark (Python and Spark). Ability to create, manage and manipulate Spark DataFrames. Expertise in Spark query tuning and performance optimization.
  • Experience in developing efficient software code/frameworks for multiple use cases leveraging Python and big data technologies.
  • Prior exposure to streaming data sources such as Kafka.
  • Knowledge of shell scripting and Python scripting.
  • High proficiency in database skills (e.g., Complex SQL), for data preparation, cleaning, and data wrangling/munging, with the ability to write advanced queries and create stored procedures.
  • Experience with NoSQL databases such as Cassandra / MongoDB.
  • Solid experience in all phases of Software Development Lifecycle - plan, design, develop, test, release, maintain and support, decommission.
  • Experience with DevOps tools (GitHub, Travis CI, and JIRA) and methodologies (Lean, Agile, Scrum, Test Driven Development).
  • Experience building and deploying applications on on-premise and cloud-based infrastructure.
  • Good understanding of the machine learning landscape and concepts.

 

Qualifications and Experience:

Engineering graduates and postgraduates, preferably in Computer Science, from premier institutions, with 3-5 years of proven work experience as a Big Data Engineer or in a similar role.

Certifications:

Good to have at least one of the Certifications listed here:

    AZ 900 - Azure Fundamentals

    DP 200, DP 201, DP 203, AZ 204 - Data Engineering

    AZ 400 - Devops Certification

Lymbyc

1 video
2 recruiters
Posted by Venky Thiriveedhi
Bengaluru (Bangalore), Chennai
4 - 8 yrs
₹9L - ₹14L / yr
Apache Spark
Apache Kafka
Druid Database
Big Data
Apache Sqoop
+5 more
Key skill set: Apache NiFi, Kafka Connect (Confluent), Sqoop, Kylo, Spark, Druid, Presto, RESTful services, Lambda / Kappa architectures

Responsibilities:

  • Build a scalable, reliable, operable and performant big data platform for both streaming and batch analytics
  • Design and implement data aggregation, cleansing and transformation layers

Skills:

  • Around 4+ years of hands-on experience designing and operating large data platforms
  • Experience in Big Data ingestion, transformation and stream/batch processing technologies using Apache NiFi, Apache Kafka, Kafka Connect (Confluent), Sqoop, Spark, Storm, Hive, etc.
  • Experience in designing and building streaming data platforms in Lambda, Kappa architectures
  • Should have working experience in one of the NoSQL or OLAP data stores like Druid, Cassandra, Elasticsearch, Pinot, etc.
  • Experience in one of the data warehousing tools like Redshift, BigQuery, Azure SQL Data Warehouse
  • Exposure to other data ingestion, data lake and querying frameworks like Marmaray, Kylo, Drill, Presto
  • Experience in designing and consuming microservices
  • Exposure to security and governance tools like Apache Ranger, Apache Atlas
  • Any contributions to open source projects a plus
  • Experience in performance benchmarks will be a plus
Mobile Programming LLC

1 video
34 recruiters
Posted by vandana chauhan
Remote, Chennai
3 - 7 yrs
₹12L - ₹18L / yr
Big Data
Amazon Web Services (AWS)
Hadoop
SQL
Python
+5 more
Position: Data Engineer  
Location: Chennai- Guindy Industrial Estate
Duration: Full time role
Company: Mobile Programming (https://www.mobileprogramming.com/)
Client Name: Samsung 


We are looking for a Data Engineer to join our growing team of analytics experts. The hire will be responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing data flow and collection for cross-functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up. The Data Engineer will support our software developers, database architects, data analysts and data scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems and products.

Responsibilities for Data Engineer

  • Create and maintain optimal data pipeline architecture.
  • Assemble large, complex data sets that meet functional / non-functional business requirements.
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS big data technologies.
  • Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
  • Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
  • Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
  • Work with data and analytics experts to strive for greater functionality in our data systems.

Qualifications for Data Engineer

  • Experience building and optimizing big data ETL pipelines, architectures and data sets.
  • Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
  • Strong analytic skills related to working with unstructured datasets.
  • Build processes supporting data transformation, data structures, metadata, dependency and workload management.
  • A successful history of manipulating, processing and extracting value from large disconnected datasets.
  • Working knowledge of message queuing, stream processing and highly scalable ‘big data’ data stores.
  • Strong project management and organizational skills.
  • Experience supporting and working with cross-functional teams in a dynamic environment.

We are looking for a candidate with 3-6 years of experience in a Data Engineer role, who has attained a graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field. They should also have experience using the following software/tools:

  • Experience with big data tools: Spark, Kafka, HBase, Hive, etc.
  • Experience with relational SQL and NoSQL databases
  • Experience with AWS cloud services: EC2, EMR, RDS, Redshift
  • Experience with stream-processing systems: Storm, Spark Streaming, etc.
  • Experience with object-oriented/object function scripting languages: Python, Java, Scala, etc.

Skills: Big Data, AWS, Hive, Spark, Python, SQL
 
Why apply via Cutshort?
Connect with actual hiring teams and get their fast response. No spam.