EMC GreenPlum Jobs in Hyderabad

11+ EMC GreenPlum Jobs in Hyderabad | EMC GreenPlum Job openings in Hyderabad

Apply to 11+ EMC GreenPlum Jobs in Hyderabad on CutShort.io. Explore the latest EMC GreenPlum Job opportunities across top companies like Google, Amazon & Adobe.

Data Engineer

consulting & implementation services in the area of Oil & Gas, Mining and Manufacturing Industry

Agency job

via Jobdost by Sathish Kumar

Ahmedabad, Hyderabad, Pune, Delhi

5 - 7 yrs

₹18L - ₹25L / yr

AWS Lambda

AWS Simple Notification Service (SNS)

AWS Simple Queuing Service (SQS)

Python

PySpark

+9 more

Data Engineer

Required skill set: AWS GLUE, AWS LAMBDA, AWS SNS/SQS, AWS ATHENA, SPARK, SNOWFLAKE, PYTHON

Mandatory Requirements 

Experience in AWS Glue
Experience in Apache Parquet 
Proficient in AWS S3 and data lake 
Knowledge of Snowflake
Understanding of file-based ingestion best practices.
Scripting language - Python & pyspark

CORE RESPONSIBILITIES

Create and manage cloud resources in AWS 
Data ingestion from different data sources which exposes data using different technologies, such as: RDBMS, REST HTTP API, flat files, Streams, and Time series data based on various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies 
Data processing/transformation using various technologies such as Spark and Cloud Services. You will need to understand your part of business logic and implement it using the language supported by the base data platform 
Develop automated data quality check to make sure right data enters the platform and verifying the results of the calculations 
Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
Define process improvement opportunities to optimize data collection, insights and displays.
Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible 
Identify and interpret trends and patterns from complex data sets 
Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders. 
Key participant in regular Scrum ceremonies with the agile teams  
Proficient at developing queries, writing reports and presenting findings 
Mentor junior members and bring best industry practices

 QUALIFICATIONS

5-7+ years’ experience as data engineer in consumer finance or equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales) 
Strong background in math, statistics, computer science, data science or related discipline
Advanced knowledge one of language: Java, Scala, Python, C# 
Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake  
Proficient with
Data mining/programming tools (e.g. SAS, SQL, R, Python)
Database technologies (e.g. PostgreSQL, Redshift, Snowflake. and Greenplum)
Data visualization (e.g. Tableau, Looker, MicroStrategy)
Comfortable learning about and deploying new technologies and tools. 
Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines. 
Good written and oral communication skills and ability to present results to non-technical audiences 
Knowledge of business intelligence and analytical tools, technologies and techniques.

Familiarity and experience in the following is a plus: 

AWS certification
Spark Streaming 
Kafka Streaming / Kafka Connect 
ELK Stack 
Cassandra / MongoDB 
CI/CD: Jenkins, GitLab, Jira, Confluence other related tools

Data Engineer

Required skill set: AWS GLUE, AWS LAMBDA, AWS SNS/SQS, AWS ATHENA, SPARK, SNOWFLAKE, PYTHON

Mandatory Requirements 

Experience in AWS Glue
Experience in Apache Parquet 
Proficient in AWS S3 and data lake 
Knowledge of Snowflake
Understanding of file-based ingestion best practices.
Scripting language - Python & pyspark

CORE RESPONSIBILITIES

Create and manage cloud resources in AWS 
Data ingestion from different data sources which exposes data using different technologies, such as: RDBMS, REST HTTP API, flat files, Streams, and Time series data based on various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies 
Data processing/transformation using various technologies such as Spark and Cloud Services. You will need to understand your part of business logic and implement it using the language supported by the base data platform 
Develop automated data quality check to make sure right data enters the platform and verifying the results of the calculations 
Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
Define process improvement opportunities to optimize data collection, insights and displays.
Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible 
Identify and interpret trends and patterns from complex data sets 
Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders. 
Key participant in regular Scrum ceremonies with the agile teams  
Proficient at developing queries, writing reports and presenting findings 
Mentor junior members and bring best industry practices

 QUALIFICATIONS

5-7+ years’ experience as data engineer in consumer finance or equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales) 
Strong background in math, statistics, computer science, data science or related discipline
Advanced knowledge one of language: Java, Scala, Python, C# 
Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake  
Proficient with
Data mining/programming tools (e.g. SAS, SQL, R, Python)
Database technologies (e.g. PostgreSQL, Redshift, Snowflake. and Greenplum)
Data visualization (e.g. Tableau, Looker, MicroStrategy)
Comfortable learning about and deploying new technologies and tools. 
Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines. 
Good written and oral communication skills and ability to present results to non-technical audiences 
Knowledge of business intelligence and analytical tools, technologies and techniques.

Familiarity and experience in the following is a plus: 

AWS certification
Spark Streaming 
Kafka Streaming / Kafka Connect 
ELK Stack 
Cassandra / MongoDB 
CI/CD: Jenkins, GitLab, Jira, Confluence other related tools

Senior Data Engineer (L2)

at Publicis Sapient

10 recruiters

Posted by Mohit Singh

Bengaluru (Bangalore), Pune, Hyderabad, Gurugram, Noida

5 - 11 yrs

₹20L - ₹36L / yr

PySpark

Data engineering

Big Data

Hadoop

Spark

+7 more

Publicis Sapient Overview:

The Senior Associate People Senior Associate L1 in Data Engineering, you will translate client requirements into technical design, and implement components for data engineering solution. Utilize deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions. You will independently drive design discussions to insure the necessary health of the overall solution

Job Summary:

As Senior Associate L2 in Data Engineering, you will translate client requirements into technical design, and implement components for data engineering solution. Utilize deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions. You will independently drive design discussions to insure the necessary health of the overall solution

The role requires a hands-on technologist who has strong programming background like Java / Scala / Python, should have experience in Data Ingestion, Integration and data Wrangling, Computation, Analytics pipelines and exposure to Hadoop ecosystem components. You are also required to have hands-on knowledge on at least one of AWS, GCP, Azure cloud platforms.

Role & Responsibilities:

Your role is focused on Design, Development and delivery of solutions involving:

• Data Integration, Processing & Governance

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Implement scalable architectural models for data processing and storage

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time mode

• Build functionality for data analytics, search and aggregation

Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1.Overall 5+ years of IT experience with 3+ years in Data related technologies

2.Minimum 2.5 years of experience in Big Data technologies and working exposure in at least one cloud platform on related data services (AWS / Azure / GCP)

3.Hands-on experience with the Hadoop stack – HDFS, sqoop, kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, hive, oozie, airflow and other components required in building end to end data pipeline.

4.Strong experience in at least of the programming language Java, Scala, Python. Java preferable

5.Hands-on working knowledge of NoSQL and MPP data platforms like Hbase, MongoDb, Cassandra, AWS Redshift, Azure SQLDW, GCP BigQuery etc

6.Well-versed and working knowledge with data platform related services on at least 1 cloud platform, IAM and data security

Preferred Experience and Knowledge (Good to Have):

# Competency

1.Good knowledge of traditional ETL tools (Informatica, Talend, etc) and database technologies (Oracle, MySQL, SQL Server, Postgres) with hands on experience

2.Knowledge on data governance processes (security, lineage, catalog) and tools like Collibra, Alation etc

3.Knowledge on distributed messaging frameworks like ActiveMQ / RabbiMQ / Solace, search & indexing and Micro services architectures

4.Performance tuning and optimization of data pipelines

5.CI/CD – Infra provisioning on cloud, auto build & deployment pipelines, code quality

6.Cloud data specialty and other related Big data technology certifications

Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes

Publicis Sapient Overview:

Job Summary:

Role & Responsibilities:

Your role is focused on Design, Development and delivery of solutions involving:

• Data Integration, Processing & Governance

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Implement scalable architectural models for data processing and storage

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time mode

• Build functionality for data analytics, search and aggregation

Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1.Overall 5+ years of IT experience with 3+ years in Data related technologies

2.Minimum 2.5 years of experience in Big Data technologies and working exposure in at least one cloud platform on related data services (AWS / Azure / GCP)

4.Strong experience in at least of the programming language Java, Scala, Python. Java preferable

5.Hands-on working knowledge of NoSQL and MPP data platforms like Hbase, MongoDb, Cassandra, AWS Redshift, Azure SQLDW, GCP BigQuery etc

6.Well-versed and working knowledge with data platform related services on at least 1 cloud platform, IAM and data security

Preferred Experience and Knowledge (Good to Have):

# Competency

1.Good knowledge of traditional ETL tools (Informatica, Talend, etc) and database technologies (Oracle, MySQL, SQL Server, Postgres) with hands on experience

2.Knowledge on data governance processes (security, lineage, catalog) and tools like Collibra, Alation etc

3.Knowledge on distributed messaging frameworks like ActiveMQ / RabbiMQ / Solace, search & indexing and Micro services architectures

4.Performance tuning and optimization of data pipelines

5.CI/CD – Infra provisioning on cloud, auto build & deployment pipelines, code quality

6.Cloud data specialty and other related Big data technology certifications

Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes

Data Engineer

Product and Service based company

Agency job

via Jobdost by Sathish Kumar

Hyderabad, Ahmedabad

4 - 8 yrs

₹15L - ₹30L / yr

Amazon Web Services (AWS)

Apache

Snow flake schema

Python

Spark

+13 more

Job Description

Mandatory Requirements

Experience in AWS Glue
Experience in Apache Parquet
Proficient in AWS S3 and data lake
Knowledge of Snowflake
Understanding of file-based ingestion best practices.
Scripting language - Python & pyspark

CORE RESPONSIBILITIES

Create and manage cloud resources in AWS
Data ingestion from different data sources which exposes data using different technologies, such as: RDBMS, flat files, Streams, and Time series data based on various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies
Data processing/transformation using various technologies such as Spark and Cloud Services. You will need to understand your part of business logic and implement it using the language supported by the base data platform
Develop automated data quality check to make sure right data enters the platform and verifying the results of the calculations
Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
Define process improvement opportunities to optimize data collection, insights and displays.
Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible
Identify and interpret trends and patterns from complex data sets
Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders.
Key participant in regular Scrum ceremonies with the agile teams
Proficient at developing queries, writing reports and presenting findings
Mentor junior members and bring best industry practices.

QUALIFICATIONS

5-7+ years’ experience as data engineer in consumer finance or equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales)
Strong background in math, statistics, computer science, data science or related discipline
Advanced knowledge one of language: Java, Scala, Python, C#
Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
Proficient with
Data mining/programming tools (e.g. SAS, SQL, R, Python)
Database technologies (e.g. PostgreSQL, Redshift, Snowflake. and Greenplum)
Data visualization (e.g. Tableau, Looker, MicroStrategy)
Comfortable learning about and deploying new technologies and tools.
Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines.
Good written and oral communication skills and ability to present results to non-technical audiences
Knowledge of business intelligence and analytical tools, technologies and techniques.

Familiarity and experience in the following is a plus:

AWS certification
Spark Streaming
Kafka Streaming / Kafka Connect
ELK Stack
Cassandra / MongoDB
CI/CD: Jenkins, GitLab, Jira, Confluence other related tools

Job Description

Mandatory Requirements

Experience in AWS Glue
Experience in Apache Parquet
Proficient in AWS S3 and data lake
Knowledge of Snowflake
Understanding of file-based ingestion best practices.
Scripting language - Python & pyspark

CORE RESPONSIBILITIES

Create and manage cloud resources in AWS
Data ingestion from different data sources which exposes data using different technologies, such as: RDBMS, flat files, Streams, and Time series data based on various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies
Data processing/transformation using various technologies such as Spark and Cloud Services. You will need to understand your part of business logic and implement it using the language supported by the base data platform
Develop automated data quality check to make sure right data enters the platform and verifying the results of the calculations
Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
Define process improvement opportunities to optimize data collection, insights and displays.
Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible
Identify and interpret trends and patterns from complex data sets
Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders.
Key participant in regular Scrum ceremonies with the agile teams
Proficient at developing queries, writing reports and presenting findings
Mentor junior members and bring best industry practices.

QUALIFICATIONS

5-7+ years’ experience as data engineer in consumer finance or equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales)
Strong background in math, statistics, computer science, data science or related discipline
Advanced knowledge one of language: Java, Scala, Python, C#
Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
Proficient with
Data mining/programming tools (e.g. SAS, SQL, R, Python)
Database technologies (e.g. PostgreSQL, Redshift, Snowflake. and Greenplum)
Data visualization (e.g. Tableau, Looker, MicroStrategy)
Comfortable learning about and deploying new technologies and tools.
Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines.
Good written and oral communication skills and ability to present results to non-technical audiences
Knowledge of business intelligence and analytical tools, technologies and techniques.

Familiarity and experience in the following is a plus:

AWS certification
Spark Streaming
Kafka Streaming / Kafka Connect
ELK Stack
Cassandra / MongoDB
CI/CD: Jenkins, GitLab, Jira, Confluence other related tools

Architect - Analytics / K8s

Product Development

Agency job

via Purple Hirez by Aditya K

Hyderabad

12 - 20 yrs

₹15L - ₹50L / yr

Analytics

Data Analytics

Kubernetes

PySpark

Python

+1 more

Job Description

We are looking for an experienced engineer with superb technical skills. Primarily be responsible for architecting and building large scale data pipelines that delivers AI and Analytical solutions to our customers. The right candidate will enthusiastically take ownership in developing and managing a continuously improving, robust, scalable software solutions.

Although your primary responsibilities will be around back-end work, we prize individuals who are willing to step in and contribute to other areas including automation, tooling, and management applications. Experience with or desire to learn Machine Learning a plus.

Skills

Bachelors/Masters/Phd in CS or equivalent industry experience
Demonstrated expertise of building and shipping cloud native applications
5+ years of industry experience in administering (including setting up, managing, monitoring) data processing pipelines (both streaming and batch) using frameworks such as Kafka Streams, Py Spark, and streaming databases like druid or equivalent like Hive
Strong industry expertise with containerization technologies including kubernetes (EKS/AKS), Kubeflow
Experience with cloud platform services such as AWS, Azure or GCP especially with EKS, Managed Kafka
5+ Industry experience in python
Experience with popular modern web frameworks such as Spring boot, Play framework, or Django
Experience with scripting languages. Python experience highly desirable. Experience in API development using Swagger
Implementing automated testing platforms and unit tests
Proficient understanding of code versioning tools, such as Git
Familiarity with continuous integration, Jenkins

Responsibilities

Architect, Design and Implement Large scale data processing pipelines using Kafka Streams, PySpark, Fluentd and Druid
Create custom Operators for Kubernetes, Kubeflow
Develop data ingestion processes and ETLs
Assist in dev ops operations
Design and Implement APIs
Identify performance bottlenecks and bugs, and devise solutions to these problems
Help maintain code quality, organization, and documentation
Communicate with stakeholders regarding various aspects of solution.
Mentor team members on best practices

Job Description

Skills

Bachelors/Masters/Phd in CS or equivalent industry experience
Demonstrated expertise of building and shipping cloud native applications
5+ years of industry experience in administering (including setting up, managing, monitoring) data processing pipelines (both streaming and batch) using frameworks such as Kafka Streams, Py Spark, and streaming databases like druid or equivalent like Hive
Strong industry expertise with containerization technologies including kubernetes (EKS/AKS), Kubeflow
Experience with cloud platform services such as AWS, Azure or GCP especially with EKS, Managed Kafka
5+ Industry experience in python
Experience with popular modern web frameworks such as Spring boot, Play framework, or Django
Experience with scripting languages. Python experience highly desirable. Experience in API development using Swagger
Implementing automated testing platforms and unit tests
Proficient understanding of code versioning tools, such as Git
Familiarity with continuous integration, Jenkins

Responsibilities

Architect, Design and Implement Large scale data processing pipelines using Kafka Streams, PySpark, Fluentd and Druid
Create custom Operators for Kubernetes, Kubeflow
Develop data ingestion processes and ETLs
Assist in dev ops operations
Design and Implement APIs
Identify performance bottlenecks and bugs, and devise solutions to these problems
Help maintain code quality, organization, and documentation
Communicate with stakeholders regarding various aspects of solution.
Mentor team members on best practices

Data Engineer

at Hammoq

1 recruiter

Posted by Nikitha Muthuswamy

Remote, Indore, Ujjain, Hyderabad, Bengaluru (Bangalore)

5 - 8 yrs

₹5L - ₹15L / yr

pandas

NumPy

Data engineering

Data Engineer

Apache Spark

+6 more

Does analytics to extract insights from raw historical data of the organization.
Generates usable training dataset for any/all MV projects with the help of Annotators, if needed.
Analyses user trends, and identifies their biggest bottlenecks in Hammoq Workflow.
Tests the short/long term impact of productized MV models on those trends.
Skills - Numpy, Pandas, SPARK, APACHE SPARK, PYSPARK, ETL mandatory.

Does analytics to extract insights from raw historical data of the organization.
Generates usable training dataset for any/all MV projects with the help of Annotators, if needed.
Analyses user trends, and identifies their biggest bottlenecks in Hammoq Workflow.
Tests the short/long term impact of productized MV models on those trends.
Skills - Numpy, Pandas, SPARK, APACHE SPARK, PYSPARK, ETL mandatory.

Data Scientist

at Material Depot

1 recruiter

Posted by Sarthak Agrawal

Hyderabad, Bengaluru (Bangalore)

2 - 6 yrs

₹12L - ₹20L / yr

Data Science

Machine Learning (ML)

Deep Learning

R Programming

Python

+1 more

Job Description

Do you have a passion for computer vision and deep learning problems? We are looking for someone who thrives on collaboration and wants to push the boundaries of what is possible today! Material Depot (materialdepot.in) is on a mission to be India’s largest tech company in the Architecture, Engineering and Construction space by democratizing the construction ecosystem and bringing stakeholders onto a common digital platform. Our engineering team is responsible for developing Computer Vision and Machine Learning tools to enable digitization across the construction ecosystem. The founding team includes people from top management consulting firms and top colleges in India (like BCG, IITB), and have worked extensively in the construction space globally and is funded by top Indian VCs.

Our team empowers Architectural and Design Businesses to effectively manage their day to day operations. We are seeking an experienced, talented Data Scientist to join our team. You’ll be bringing your talents and expertise to continue building and evolving our highly available and distributed platform.

Our solutions need complex problem solving in computer vision that require robust, efficient, well tested, and clean solutions. The ideal candidate will possess the self-motivation, curiosity, and initiative to achieve those goals. Analogously, the candidate is a lifelong learner who passionately seeks to improve themselves and the quality of their work. You will work together with similar minds in a unique team where your skills and expertise can be used to influence future user experiences that will be used by millions.

In this role, you will:

Extensive knowledge in machine learning and deep learning techniques
Solid background in image processing/computer vision
Experience in building datasets for computer vision tasks
Experience working with and creating data structures / architectures
Proficiency in at least one major machine learning framework
Experience visualizing data to stakeholders
Ability to analyze and debug complex algorithms
Good understanding and applied experience in classic 2D image processing and segmentation
Robust semantic object detection under different lighting conditions
Segmentation of non-rigid contours in challenging/low contrast scenarios
Sub-pixel accurate refinement of contours and features
Experience in image quality assessment
Experience with in depth failure analysis of algorithms
Highly skilled in at least one scripting language such as Python or Matlab and solid experience in C++
Creativity and curiosity for solving highly complex problems
Excellent communication and collaboration skills
Mentor and support other technical team members in the organization
Create, improve, and refine workflows and processes for delivering quality software on time and with carefully calculated debt
Work closely with product managers, customer support representatives, and account executives to help the business move fast and efficiently through relentless automation.

How you will do this:

You’re part of an agile, multidisciplinary team.
You bring your own unique skill set to the table and collaborate with others to accomplish your team’s goals.
You prioritize your work with the team and its product owner, weighing both the business and technical value of each task.
You experiment, test, try, fail, and learn continuously.
You don’t do things just because they were always done that way, you bring your experience and expertise with you and help the team make the best decisions.

For this role, you must have:

Strong knowledge of and experience with the functional programming paradigm.
Experience conducting code reviews, providing feedback to other engineers.
Great communication skills and a proven ability to work as part of a tight-knit team.

Job Description

In this role, you will:

Extensive knowledge in machine learning and deep learning techniques
Solid background in image processing/computer vision
Experience in building datasets for computer vision tasks
Experience working with and creating data structures / architectures
Proficiency in at least one major machine learning framework
Experience visualizing data to stakeholders
Ability to analyze and debug complex algorithms
Good understanding and applied experience in classic 2D image processing and segmentation
Robust semantic object detection under different lighting conditions
Segmentation of non-rigid contours in challenging/low contrast scenarios
Sub-pixel accurate refinement of contours and features
Experience in image quality assessment
Experience with in depth failure analysis of algorithms
Highly skilled in at least one scripting language such as Python or Matlab and solid experience in C++
Creativity and curiosity for solving highly complex problems
Excellent communication and collaboration skills
Mentor and support other technical team members in the organization
Create, improve, and refine workflows and processes for delivering quality software on time and with carefully calculated debt
Work closely with product managers, customer support representatives, and account executives to help the business move fast and efficiently through relentless automation.

How you will do this:

You’re part of an agile, multidisciplinary team.
You bring your own unique skill set to the table and collaborate with others to accomplish your team’s goals.
You prioritize your work with the team and its product owner, weighing both the business and technical value of each task.
You experiment, test, try, fail, and learn continuously.
You don’t do things just because they were always done that way, you bring your experience and expertise with you and help the team make the best decisions.

For this role, you must have:

Strong knowledge of and experience with the functional programming paradigm.
Experience conducting code reviews, providing feedback to other engineers.
Great communication skills and a proven ability to work as part of a tight-knit team.

Data Engineer

at Mobile Programming LLC

1 video

34 recruiters

Posted by Apurva kalsotra

Mohali, Gurugram, Bengaluru (Bangalore), Chennai, Hyderabad, Pune

3 - 8 yrs

₹3L - ₹9L / yr

Data Warehouse (DWH)

Big Data

Spark

Apache Kafka

Data engineering

+14 more

Day-to-day Activities
Develop complex queries, pipelines and software programs to solve analytics and data mining problems
Interact with other data scientists, product managers, and engineers to understand business problems, technical requirements to deliver predictive and smart data solutions
Prototype new applications or data systems
Lead data investigations to troubleshoot data issues that arise along the data pipelines
Collaborate with different product owners to incorporate data science solutions
Maintain and improve data science platform
Must Have
BS/MS/PhD in Computer Science, Electrical Engineering or related disciplines
Strong fundamentals: data structures, algorithms, database
5+ years of software industry experience with 2+ years in analytics, data mining, and/or data warehouse
Fluency with Python
Experience developing web services using REST approaches.
Proficiency with SQL/Unix/Shell
Experience in DevOps (CI/CD, Docker, Kubernetes)
Self-driven, challenge-loving, detail oriented, teamwork spirit, excellent communication skills, ability to multi-task and manage expectations
Preferred
Industry experience with big data processing technologies such as Spark and Kafka
Experience with machine learning algorithms and/or R a plus
Experience in Java/Scala a plus
Experience with any MPP analytics engines like Vertica
Experience with data integration tools like Pentaho/SAP Analytics Cloud

Data Engineer

at Mobile Programming LLC

1 video

34 recruiters

Posted by Apurva kalsotra

Mohali, Gurugram, Pune, Bengaluru (Bangalore), Hyderabad, Chennai

3 - 8 yrs

₹2L - ₹9L / yr

Data engineering

Data engineer

Spark

Apache Spark

Apache Kafka

+13 more

Responsibilities for Data Engineer

Create and maintain optimal data pipeline architecture,
Assemble large, complex data sets that meet functional / non-functional business requirements.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
Work with data and analytics experts to strive for greater functionality in our data systems.

Qualifications for Data Engineer

Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
Strong analytic skills related to working with unstructured datasets.
Build processes supporting data transformation, data structures, metadata, dependency and workload management.
A successful history of manipulating, processing and extracting value from large disconnected datasets.
Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
Strong project management and organizational skills.
Experience supporting and working with cross-functional teams in a dynamic environment.
We are looking for a candidate with 5+ years of experience in a Data Engineer role, who has attained a Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field. They should also have experience using the following software/tools:

Experience with big data tools: Hadoop, Spark, Kafka, etc.
Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
Experience with AWS cloud services: EC2, EMR, RDS, Redshift
Experience with stream-processing systems: Storm, Spark-Streaming, etc.
Experience with object-oriented/object function scripting languages: Python, Java, C++, Scala, etc.

Responsibilities for Data Engineer

Create and maintain optimal data pipeline architecture,
Assemble large, complex data sets that meet functional / non-functional business requirements.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
Work with data and analytics experts to strive for greater functionality in our data systems.

Qualifications for Data Engineer

Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
Strong analytic skills related to working with unstructured datasets.
Build processes supporting data transformation, data structures, metadata, dependency and workload management.
A successful history of manipulating, processing and extracting value from large disconnected datasets.
Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
Strong project management and organizational skills.
Experience supporting and working with cross-functional teams in a dynamic environment.
We are looking for a candidate with 5+ years of experience in a Data Engineer role, who has attained a Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field. They should also have experience using the following software/tools:

Experience with big data tools: Hadoop, Spark, Kafka, etc.
Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
Experience with AWS cloud services: EC2, EMR, RDS, Redshift
Experience with stream-processing systems: Storm, Spark-Streaming, etc.
Experience with object-oriented/object function scripting languages: Python, Java, C++, Scala, etc.

Pyspark Developer

at Virtusa

2 recruiters

Agency job

via Devenir by Rakesh Kumar

Chennai, Hyderabad

4 - 6 yrs

₹10L - ₹20L / yr

PySpark

Amazon Web Services (AWS)

Python

Hands-on experience in Development
4-6 years of Hands on experience with Python scripts
2-3 years of Hands on experience in PySpark coding. Worked in spark cluster computing technology.
3-4 years of Hands on end to end data pipeline experience working on AWS environments
3-4 years of Hands on experience working on AWS services – Glue, Lambda, Step Functions, EC2, RDS, SES, SNS, DMS, CloudWatch etc.
2-3 years of Hands on experience working on AWS redshift
6+ years of Hands on experience with writing Unix Shell scripts
Good communication skills

Hands-on experience in Development
4-6 years of Hands on experience with Python scripts
2-3 years of Hands on experience in PySpark coding. Worked in spark cluster computing technology.
3-4 years of Hands on end to end data pipeline experience working on AWS environments
3-4 years of Hands on experience working on AWS services – Glue, Lambda, Step Functions, EC2, RDS, SES, SNS, DMS, CloudWatch etc.
2-3 years of Hands on experience working on AWS redshift
6+ years of Hands on experience with writing Unix Shell scripts
Good communication skills

Machine Learning Engineer

Centime

Agency job

via FlexAbility by srikanth voona

Hyderabad

8 - 14 yrs

₹15L - ₹35L / yr

Machine Learning (ML)

Artificial Intelligence (AI)

Deep Learning

Java

Python

Required skill

Around 6- 8.5 years of experience and around 4+ years in AI / Machine learning space
Extensive experience in designing large scale machine learning solution for the ML use case, large scale deployments and establishing continues automated improvement / retraining framework.
Strong experience in Python and Java is required.
Hands on experience on Scikit-learn, Pandas, NLTK
Experience in Handling of Timeseries data and associated techniques like Prophet, LSTM
Experience in Regression, Clustering, classification algorithms
Extensive experience in buildings traditional Machine Learning SVM, XGBoost, Decision tree and Deep Neural Network models like RNN, Feedforward is required.
Experience in AutoML like TPOT or other
Must have strong hands on experience in Deep learning frameworks like Keras, TensorFlow or PyTorch
Knowledge of Capsule Network or reinforcement learning, SageMaker is a desirable skill
Understanding of Financial domain is desirable skill

Responsibilities

Design and implementation of solutions for ML Use cases
Productionize System and Maintain those
Lead and implement data acquisition process for ML work
Learn new methods and model quickly and utilize those in solving use cases

Required skill

Around 6- 8.5 years of experience and around 4+ years in AI / Machine learning space
Extensive experience in designing large scale machine learning solution for the ML use case, large scale deployments and establishing continues automated improvement / retraining framework.
Strong experience in Python and Java is required.
Hands on experience on Scikit-learn, Pandas, NLTK
Experience in Handling of Timeseries data and associated techniques like Prophet, LSTM
Experience in Regression, Clustering, classification algorithms
Extensive experience in buildings traditional Machine Learning SVM, XGBoost, Decision tree and Deep Neural Network models like RNN, Feedforward is required.
Experience in AutoML like TPOT or other
Must have strong hands on experience in Deep learning frameworks like Keras, TensorFlow or PyTorch
Knowledge of Capsule Network or reinforcement learning, SageMaker is a desirable skill
Understanding of Financial domain is desirable skill