AWS Simple Queuing Service (SQS) Jobs in Pune

Apply to 11+ AWS Simple Queuing Service (SQS) Jobs in Pune on CutShort.io. Explore the latest AWS Simple Queuing Service (SQS) Job opportunities across top companies like Google, Amazon & Adobe.

Data Engineer

consulting & implementation services in the area of Oil & Gas, Mining and Manufacturing Industry

Agency job

via Jobdost by Sathish Kumar

Ahmedabad, Hyderabad, Pune, Delhi

5 - 7 yrs

₹18L - ₹25L / yr

AWS Lambda

AWS Simple Notification Service (SNS)

AWS Simple Queuing Service (SQS)

Python

PySpark

+9 more

Data Engineer

Required skill set: AWS GLUE, AWS LAMBDA, AWS SNS/SQS, AWS ATHENA, SPARK, SNOWFLAKE, PYTHON

Mandatory Requirements 

Experience in AWS Glue
Experience in Apache Parquet 
Proficient in AWS S3 and data lake 
Knowledge of Snowflake
Understanding of file-based ingestion best practices.
Scripting language - Python & pyspark

CORE RESPONSIBILITIES

Create and manage cloud resources in AWS 
Ingest data from different data sources that expose data using different technologies, such as RDBMS, REST HTTP APIs, flat files, streams, and time-series data based on various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies
Data processing/transformation using various technologies such as Spark and Cloud Services. You will need to understand your part of business logic and implement it using the language supported by the base data platform 
Develop automated data quality checks to make sure the right data enters the platform and to verify the results of the calculations
Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
Define process improvement opportunities to optimize data collection, insights and displays.
Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible 
Identify and interpret trends and patterns from complex data sets 
Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders. 
Key participant in regular Scrum ceremonies with the agile teams  
Proficient at developing queries, writing reports and presenting findings 
Mentor junior members and bring best industry practices

QUALIFICATIONS

5-7+ years’ experience as data engineer in consumer finance or equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales) 
Strong background in math, statistics, computer science, data science or related discipline
Advanced knowledge of at least one of the following languages: Java, Scala, Python, C#
Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake  
Proficient with
Data mining/programming tools (e.g. SAS, SQL, R, Python)
Database technologies (e.g. PostgreSQL, Redshift, Snowflake, and Greenplum)
Data visualization (e.g. Tableau, Looker, MicroStrategy)
Comfortable learning about and deploying new technologies and tools. 
Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines. 
Good written and oral communication skills and ability to present results to non-technical audiences 
Knowledge of business intelligence and analytical tools, technologies and techniques.

Familiarity and experience in the following is a plus: 

AWS certification
Spark Streaming 
Kafka Streaming / Kafka Connect 
ELK Stack 
Cassandra / MongoDB 
CI/CD: Jenkins, GitLab, Jira, Confluence, and other related tools

Machine Learning Ops Engineer

at Concinnity Media Technologies

2 candid answers

Posted by Anirban Biswas

Pune

6 - 10 yrs

₹18L - ₹27L / yr

Machine Learning (ML)

Data Science

Natural Language Processing (NLP)

Computer Vision

recommendation algorithm

+9 more

Develop, train, and optimize machine learning models using Python, ML algorithms, deep learning frameworks (e.g., TensorFlow, PyTorch), and other relevant technologies.
Implement MLOps best practices, including model deployment, monitoring, and versioning.
Utilize Vertex AI, MLFlow, KubeFlow, TFX, and other relevant MLOps tools and frameworks to streamline the machine learning lifecycle.
Collaborate with cross-functional teams to design and implement CI/CD pipelines for continuous integration and deployment using tools such as GitHub Actions, TeamCity, and similar platforms.
Conduct research and stay up-to-date with the latest advancements in machine learning, deep learning, and MLOps technologies.
Provide guidance and support to data scientists and software engineers on best practices for machine learning development and deployment.
Assist in developing tooling strategies by evaluating various options, vendors, and product roadmaps to enhance the efficiency and effectiveness of our AI and data science initiatives.

Data Engineer

MNC Company - Product Based

Agency job

via Bharat Headhunters by Ranjini C. N

Bengaluru (Bangalore), Chennai, Hyderabad, Pune, Delhi, Gurugram, Noida, Ghaziabad, Faridabad

5 - 9 yrs

₹10L - ₹15L / yr

Data Warehouse (DWH)

Informatica

ETL

Python

Google Cloud Platform (GCP)

+2 more

Job Responsibilities

Design, build & test ETL processes using Python & SQL for the corporate data warehouse
Inform, influence, support, and execute our product decisions
Maintain advertising data integrity by working closely with R&D to organize and store data in a format that provides accurate data and allows the business to quickly identify issues.
Evaluate and prototype new technologies in the area of data processing
Think quickly, communicate clearly and work collaboratively with product, data, engineering, QA and operations teams
High energy level, strong team player and good work ethic
Data analysis, understanding of business requirements and translation into logical pipelines & processes
Identification, analysis & resolution of production & development bugs
Support the release process including completing & reviewing documentation
Configure data mappings & transformations to orchestrate data integration & validation
Provide subject matter expertise
Document solutions, tools & processes
Create & support test plans with hands-on testing
Peer reviews of work developed by other data engineers within the team
Establish good working relationships & communication channels with relevant departments

Skills and Qualifications we look for

University degree 2.1 or higher (or equivalent) in a relevant subject. Master’s degree in any data subject will be a strong advantage.
4 - 6 years experience with data engineering.
Strong coding ability and software development experience in Python.
Strong hands-on experience with SQL and Data Processing.
Google cloud platform (Cloud composer, Dataflow, Cloud function, Bigquery, Cloud storage, dataproc)
Good working experience in any one of the ETL tools (Airflow would be preferable).
Should possess strong analytical and problem solving skills.
Good to have skills - Apache pyspark, CircleCI, Terraform
Motivated, self-directed, able to work with ambiguity and interested in emerging technologies, agile and collaborative processes.
Understanding & experience of agile / scrum delivery methodology

Snowflake Developer- Architect/Manager

at Tredence

Posted by Jyoti Chetry

Bengaluru (Bangalore), Pune, Gurugram, Chennai

8 - 12 yrs

₹12L - ₹30L / yr

Snow flake schema

Snowflake

SQL

Data modeling

Data engineering

+1 more

JOB DESCRIPTION: THE IDEAL CANDIDATE WILL:

• Ensure new features and subject areas are modelled to integrate with existing structures and provide a consistent view. Develop and maintain documentation of the data architecture, data flow and data models of the data warehouse appropriate for various audiences. Provide direction on adoption of Cloud technologies (Snowflake) and industry best practices in the field of data warehouse architecture and modelling.

• Provide technical leadership to large enterprise-scale projects. You will also be responsible for preparing estimates and defining technical solutions to proposals (RFPs). This role requires a broad range of skills and the ability to step into different roles depending on the size and scope of the project.

ELIGIBILITY CRITERIA: Desired Experience/Skills:
• Must have total 5+ yrs. in IT and 2+ years' experience working as a snowflake Data Architect and 4+ years in Data warehouse, ETL, BI projects.
• Must have experience with at least two end-to-end implementations of Snowflake cloud data warehouse and three end-to-end on-premise data warehouse implementations, preferably on Oracle.

• Expertise in Snowflake – data modelling, ELT using Snowflake SQL, implementing complex stored Procedures and standard DWH and ETL concepts
• Expertise in Snowflake advanced concepts like setting up resource monitors, RBAC controls, virtual warehouse sizing, query performance tuning, Zero copy clone, time travel and understand how to use these features
• Expertise in deploying Snowflake features such as data sharing, events and lake-house patterns
• Hands-on experience with Snowflake utilities, SnowSQL, SnowPipe, Big Data model techniques using Python
• Experience in Data Migration from RDBMS to Snowflake cloud data warehouse
• Deep understanding of relational as well as NoSQL data stores, methods and approaches (star and snowflake, dimensional modelling)
• Experience with data security and data access controls and design
• Experience with AWS or Azure data storage and management technologies such as S3 and ADLS
• Build processes supporting data transformation, data structures, metadata, dependency and workload management
• Proficiency in RDBMS, complex SQL, PL/SQL, Unix Shell Scripting, performance tuning and troubleshooting
• Provide resolution to an extensive range of complicated data pipeline related problems, proactively and as issues surface
• Must have expertise in AWS or Azure Platform as a Service (PAAS)
• Certified Snowflake cloud data warehouse Architect (Desirable)
• Should be able to troubleshoot problems across infrastructure, platform and application domains.
• Must have experience of Agile development methodologies
• Strong written communication skills. Is effective and persuasive in both written and oral communication

Nice to have Skills/Qualifications: Bachelor's and/or master’s degree in computer science or equivalent experience.
• Strong communication, analytical and problem-solving skills with a high attention to detail.

About you:
• You are self-motivated, collaborative, eager to learn, and hands on
• You love trying out new apps, and find yourself coming up with ideas to improve them
• You stay ahead with all the latest trends and technologies
• You are particular about following industry best practices and have high standards regarding quality

Big Data Engineer

Hiring for one of the MNC for India location

Agency job

via Natalie Consultants by Rahul Kumar

Gurugram, Pune, Bengaluru (Bangalore), Delhi, Noida, Ghaziabad, Faridabad

2 - 9 yrs

₹8L - ₹20L / yr

Python

Hadoop

Big Data

Spark

Data engineering

+3 more

Key Responsibilities : ( Data Developer Python, Spark)

Exp : 2 to 9 Yrs

Development of data platforms, integration frameworks, processes, and code.

Develop and deliver APIs in Python or Scala for Business Intelligence applications built using a range of web languages

Develop comprehensive automated tests for features via end-to-end integration tests, performance tests, acceptance tests and unit tests.

Elaborate stories in a collaborative agile environment (SCRUM or Kanban)

Familiarity with cloud platforms like GCP, AWS or Azure.

Experience with large data volumes.

Familiarity with writing rest-based services.

Experience with distributed processing and systems

Experience with Hadoop / Spark toolsets

Experience with relational database management systems (RDBMS)

Experience with Data Flow development

Knowledge of Agile and associated development techniques including:

Azure Data Engineer

at Fragma Data Systems

8 recruiters

Posted by Evelyn Charles

Remote, Bengaluru (Bangalore), Hyderabad, Chennai, Mumbai, Pune

8 - 15 yrs

₹16L - ₹28L / yr

PySpark

SQL Azure

azure synapse

Windows Azure

Azure Data Engineer

+3 more

Technology Skills:

Building and operationalizing large scale enterprise data solutions and applications using one or more of AZURE data and analytics services in combination with custom solutions - Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsights, Databricks, CosmosDB, EventHub/IOTHub.
Experience in migrating on-premise data warehouses to data platforms on AZURE cloud.
Designing and implementing data engineering, ingestion, and transformation functions

Good to Have:

Experience with Azure Analysis Services
Experience in Power BI
Experience with third-party solutions like Attunity/Stream sets, Informatica
Experience with PreSales activities (Responding to RFPs, Executing Quick POCs)
Capacity Planning and Performance Tuning on Azure Stack and Spark.

Data Engineer

Intergral Add Science

Agency job

via Vipsa Talent Solutions by Prashma S R

Pune

5 - 8 yrs

₹9L - ₹25L / yr

Java

Hadoop

Apache Spark

Scala

Python

+3 more

6+ years of recent hands-on Java development
Developing data pipelines in AWS or Google Cloud
Java, Python, JavaScript programming languages
Great understanding of designing for performance, scalability, and reliability of data intensive application
Hadoop MapReduce, Spark, Pig. Understanding of database fundamentals and advanced SQL knowledge.
In-depth understanding of object oriented programming concepts and design patterns
Ability to communicate clearly to technical and non-technical audiences, verbally and in writing
Understanding of full software development life cycle, agile development and continuous integration
Experience in Agile methodologies including Scrum and Kanban

Data Engineer

Fast paced Startup

Agency job

via Kavayah People Consulting by Kavita Singh

Pune

3 - 6 yrs

₹15L - ₹22L / yr

Big Data

Data engineering

Hadoop

Spark

Apache Hive

+6 more

Years of Exp: 3-6+ Years
Skills: Scala, Python, Hive, Airflow, Spark

Languages: Java, Python, Shell Scripting

GCP: BigTable, DataProc, BigQuery, GCS, Pubsub

OR
AWS: Athena, Glue, EMR, S3, Redshift

MongoDB, MySQL, Kafka

Platforms: Cloudera / Hortonworks
AdTech domain experience is a plus.
Job Type - Full Time

Sr Data Engineer

at Infogain

Agency job

via Technogen India PvtLtd by RAHUL BATTA

Bengaluru (Bangalore), Pune, Noida, NCR (Delhi | Gurgaon | Noida)

7 - 10 yrs

₹20L - ₹25L / yr

Data engineering

Python

SQL

Spark

PySpark

+10 more

Sr. Data Engineer:

Core Skills – Data Engineering, Big Data, Pyspark, Spark SQL and Python

Candidate with prior Palantir Cloud Foundry OR Clinical Trial Data Model background is preferred

Major accountabilities:

Responsible for Data Engineering, Foundry Data Pipeline Creation, Foundry Analysis & Reporting, Slate Application development, re-usable code development & management and Integrating Internal or External System with Foundry for data ingestion with high quality.
Have a good understanding of the Foundry Platform landscape and its capabilities
Performs data analysis required to troubleshoot data related issues and assist in the resolution of data issues.
Defines company data assets (data models), Pyspark, spark SQL, jobs to populate data models.
Designs data integrations and data quality framework.
Design & Implement integration with Internal, External Systems, F1 AWS platform using Foundry Data Connector or Magritte Agent
Collaboration with data scientists, data analyst and technology teams to document and leverage their understanding of the Foundry integration with different data sources - Actively participate in agile work practices
Coordinating with the Quality Engineer to ensure that all quality controls, naming conventions & best practices have been followed

Desired Candidate Profile :

Strong data engineering background
Experience with Clinical Data Model is preferred
Experience in

SQL Server, Postgres, Cassandra, Hadoop, and Spark for distributed data storage and parallel computing
Java and Groovy for our back-end applications and data integration tools
Python for data processing and analysis
Cloud infrastructure based on AWS EC2 and S3

7+ years IT experience, 2+ years’ experience in Palantir Foundry Platform, 4+ years’ experience in Big Data platform
5+ years of Python and Pyspark development experience
Strong troubleshooting and problem solving skills
BTech or master's degree in computer science or a related technical field
Experience designing, building, and maintaining big data pipelines systems
Hands-on experience on Palantir Foundry Platform and Foundry custom Apps development
Able to design and implement data integration between Palantir Foundry and external Apps based on Foundry data connector framework
Hands-on in programming languages primarily Python, R, Java, Unix shell scripts
Hands-on experience in AWS / Azure cloud platform and stack
Strong in API based architecture and concept, able to do quick PoC using API integration and development
Knowledge of machine learning and AI
Skill and comfort working in a rapidly changing environment with dynamic objectives and iteration with users.

Demonstrated ability to continuously learn, work independently, and make decisions with minimal supervision

Data Scientist

at Foghorn Systems

1 recruiter

Posted by Abhishek Vijayvargia

Pune

0 - 7 yrs

₹15L - ₹50L / yr

R Programming

Python

Data Science

Role and Responsibilities

Execute data mining projects, training and deploying models over a typical duration of 2 -12 months.
The ideal candidate should be able to innovate, analyze the customer requirement, develop a solution in the time box of the project plan, execute and deploy the solution.
Integrate the data mining projects as embedded data mining applications in the FogHorn platform (on Docker or Android).

Core Qualifications
Candidates must meet ALL of the following qualifications:

Have analyzed, trained and deployed at least three data mining models in the past. If the candidate did not directly deploy their own models, they will have worked with others who have put their models into production. The models should have been validated as robust over at least an initial time period.
Three years of industry work experience, developing data mining models which were deployed and used.
Programming experience in Python is core using data mining related libraries like Scikit-Learn. Other relevant Python mining libraries include NumPy, SciPy and Pandas.
Data mining algorithm experience in at least 3 algorithms across: prediction (statistical regression, neural nets, deep learning, decision trees, SVM, ensembles), clustering (k-means, DBSCAN or other) or Bayesian networks

Bonus Qualifications
Any of the following extra qualifications will make a candidate more competitive:

Soft Skills
- Sets expectations, develops project plans and meets expectations.
- Experience adapting technical dialogue to the right level for the audience (i.e. executives) or specific jargon for a given vertical market and job function.
Technical skills
- Commonly, candidates have a MS or Ph.D. in Computer Science, Math, Statistics or an engineering technical discipline. BS candidates with experience are considered.
- Have managed past models in production over their full life cycle until model replacement is needed. Have developed automated model refreshing on newer data. Have developed frameworks for model automation as a prototype for product.
- Training or experience in Deep Learning, such as TensorFlow, Keras, convolutional neural networks (CNN) or Long Short Term Memory (LSTM) neural network architectures. If you don’t have deep learning experience, we will train you on the job.
- Shrinking deep learning models, optimizing to speed up execution time of scoring or inference.
- OpenCV or other image processing tools or libraries
- Cloud computing: Google Cloud, Amazon AWS or Microsoft Azure. We have integration with Google Cloud and are working on other integrations.
- Experience with decision-tree ensembles such as XGBoost or Random Forests is helpful.
- Complex Event Processing (CEP) or other streaming data as a data source for data mining analysis
- Time series algorithms from ARIMA to LSTM to Digital Signal Processing (DSP).
- Bayesian Networks (BN), a.k.a. Bayesian Belief Networks (BBN) or Graphical Belief Networks (GBN)
- Experience with PMML is of interest (see www.DMG.org).
Vertical experience in Industrial Internet of Things (IoT) applications:
- Energy: Oil and Gas, Wind Turbines
- Manufacturing: Motors, chemical processes, tools, automotive
- Smart Cities: Elevators, cameras on population or cars, power grid
- Transportation: Cars, truck fleets, trains

About FogHorn Systems
FogHorn is a leading developer of “edge intelligence” software for industrial and commercial IoT application solutions. FogHorn’s Lightning software platform brings the power of advanced analytics and machine learning to the on-premise edge environment enabling a new class of applications for advanced monitoring and diagnostics, machine performance optimization, proactive maintenance and operational intelligence use cases. FogHorn’s technology is ideally suited for OEMs, systems integrators and end customers in manufacturing, power and water, oil and gas, renewable energy, mining, transportation, healthcare, retail, as well as Smart Grid, Smart City, Smart Building and connected vehicle applications.

Press: https://www.foghorn.io/press-room/

Awards: https://www.foghorn.io/awards-and-recognition/

2019 Edge Computing Company of the Year – Compass Intelligence
2019 Internet of Things 50: 10 Coolest Industrial IoT Companies – CRN
2018 IoT Platforms Leadership Award & Edge Computing Excellence – IoT Evolution World Magazine
2018 10 Hot IoT Startups to Watch – Network World. (Gartner estimated 20 billion connected things in use worldwide by 2020)
2018 Winner in Artificial Intelligence and Machine Learning – Globe Awards
2018 Ten Edge Computing Vendors to Watch – ZDNet & 451 Research
2018 The 10 Most Innovative AI Solution Providers – Insights Success
2018 The AI 100 – CB Insights
2017 Cool Vendor in IoT Edge Computing – Gartner
2017 20 Most Promising AI Service Providers – CIO Review

Our Series A round was for $15 million. Our Series B round, in October 2017, was for $30 million. Investors include: Saudi Aramco Energy Ventures, Intel Capital, GE, Dell, Bosch, Honeywell and The Hive.

About the Data Science Solutions team
In 2018, our Data Science Solutions team grew from 4 to 9. We are growing again from 11. We work on revenue-generating projects for clients, such as predictive maintenance, time to failure and manufacturing defects. About half of our projects have been related to vision recognition or deep learning. We are not only working on consulting projects but also developing vertical solution applications with embedded data mining that run on our Lightning platform.

Our data scientists like our team because:

We care about “best practices”
Have a direct impact on the company’s revenue
Give or receive mentoring as part of the collaborative process
Questioning and challenging the status quo with data is safe
Intellectual curiosity balanced with humility
Present papers or projects in our “Thought Leadership” meeting series, to support continuous learning

Role and Responsibilities

Execute data mining projects, training and deploying models over a typical duration of 2-12 months.

Bigdata Lead

at Saama Technologies

6 recruiters

Posted by Sandeep Chaudhary

Pune

2 - 5 yrs

₹1L - ₹18L / yr

Hadoop

Spark

Apache Hive

Apache Flume

Java

+5 more

Description

Deep experience with and understanding of Apache Hadoop and surrounding technologies is required, including experience with Spark, Impala, Hive, Flume, Parquet and MapReduce.
Strong understanding of development languages including Java, Python, Scala and shell scripting.
Expertise in Apache Spark 2.x framework principles and usage.
Should be proficient in developing Spark batch and streaming jobs in Python, Scala or Java.
Should have proven experience in performance tuning of Spark applications, from both the application code and configuration perspectives.
Should be proficient in Kafka and its integration with Spark.
Should be proficient in Spark SQL and data warehousing techniques using Hive.
Should be very proficient in Unix shell scripting and in operating on Linux.
Should have knowledge of any cloud-based infrastructure.
Good experience in tuning Spark applications and improving performance.
Strong understanding of data profiling concepts and the ability to operationalize analyses into design and development activities.
Experience with software development best practices: version control systems, automated builds, etc.
Experienced in and able to lead the following phases of the Software Development Life Cycle on any project: feasibility planning, analysis, development, integration, test and implementation.
Capable of working within a team or as an individual.
Experience creating technical documentation.
