Data steward Jobs in Pune

Apply to 11+ Data steward Jobs in Pune on CutShort.io. Explore the latest Data steward Job opportunities across top companies like Google, Amazon & Adobe.

Data Steward

at Infogain

Agency job

via Technogen India PvtLtd by RAHUL BATTA

NCR (Delhi | Gurgaon | Noida), Bengaluru (Bangalore), Mumbai, Pune

7 - 8 yrs

₹15L - ₹16L / yr

Data steward

MDM

Tamr

Reltio

Data engineering

Data Steward:

The Data Steward will collaborate and work closely within the group's software engineering and business division. The Data Steward has overall accountability for the group's/division's data and reporting posture by responsibly managing data assets, data lineage, and data access, and by supporting sound data analysis. This role requires a focus on data strategy, execution, and support for projects, programs, application enhancements, and production data fixes. The Data Steward makes well-thought-out decisions on complex or ambiguous data issues, establishes the data stewardship and information management strategy and direction for the group, and communicates effectively with individuals at various levels of the technical and business communities. This individual will become part of the corporate Data Quality and Data Management/Entity Resolution team supporting various systems across the board.

Primary Responsibilities:

Responsible for data quality and data accuracy across all group/division delivery initiatives.
Responsible for data analysis, data profiling, data modeling, and data mapping capabilities.
Responsible for reviewing and governing data queries and DML.
Accountable for the assessment, delivery, quality, accuracy, and tracking of any production data fixes.
Accountable for the performance, quality, and alignment to requirements for all data query design and development.
Responsible for defining standards and best practices for data analysis, modeling, and queries.
Responsible for understanding end-to-end data flows and identifying data dependencies in support of delivery, release, and change management.
Responsible for the development and maintenance of an enterprise data dictionary that is aligned to data assets and the business glossary for the group.
Responsible for the definition and maintenance of the group's data landscape, including overlays with the technology landscape, end-to-end data flow/transformations, and data lineage.
Responsible for rationalizing the group's reporting posture through the definition and maintenance of a reporting strategy and roadmap.
Partners with the data governance team to ensure data solutions adhere to the organization’s data principles and guidelines.
Owns group's data assets including reports, data warehouse, etc.
Understand customer business use cases and be able to translate them to technical specifications and vision on how to implement a solution.
Accountable for defining the performance tuning needs for all group data assets and managing the implementation of those requirements within the context of group initiatives as well as steady-state production.
Partners with others in test data management and masking strategies and the creation of a reusable test data repository.
Responsible for solving data-related issues and communicating resolutions with other solution domains.
Actively and consistently support all efforts to simplify and enhance the Clinical Trial Prediction use cases.
Apply knowledge in analytic and statistical algorithms to help customers explore methods to improve their business.
Contribute toward analytical research projects through all stages including concept formulation, determination of appropriate statistical methodology, data manipulation, research evaluation, and final research report.
Visualize and report data findings creatively in a variety of visual formats that appropriately provide insight to the stakeholders.
Achieve defined project goals within customer deadlines; proactively communicate status and escalate issues as needed.

Additional Responsibilities:

Strong understanding of the Software Development Life Cycle (SDLC) with Agile Methodologies
Knowledge and understanding of industry-standard/best practices requirements gathering methodologies.
Knowledge and understanding of Information Technology systems and software development.
Experience with data modeling and test data management tools.
Experience in data integration projects.
Good problem-solving and decision-making skills.
Good communication skills within the team, site, and with the customer.

Knowledge, Skills and Abilities

Technical expertise in data architecture principles and design aspects of various DBMS and reporting concepts.
Solid understanding of key DBMS platforms like SQL Server, Azure SQL
Results-oriented and diligent; works with a sense of urgency. Assertive and self-directed, taking responsibility for their own work, with a strong affinity for defining work in deliverables and a willingness to commit to deadlines.
Experience in MDM tools like MS DQ, SAS DM Studio, Tamr, Profisee, Reltio, etc.
Experience in Report and Dashboard development
Statistical and Machine Learning models
Python (sklearn, numpy, pandas, gensim)
Nice to Have:
1 year of ETL experience
Natural Language Processing
Neural networks and Deep learning
Experience with the Keras, TensorFlow, spaCy, NLTK, and LightGBM Python libraries

Interaction: Frequently interacts with subordinate supervisors.

Education: Bachelor’s degree, preferably in Computer Science, B.E., or another quantitative field related to the area of assignment. Professional certification related to the area of assignment may be required.

Experience: 7 years of pharmaceutical/biotech/life sciences experience; 5 years of clinical trials experience and knowledge; excellent documentation, communication, and presentation skills, including PowerPoint.

AWS Data Engineer (Contractual)

at Forward Eye Technologies

Posted by Jaya S

Bengaluru (Bangalore), Mumbai, Delhi, Gurugram, Pune, Hyderabad, Ahmedabad, Chennai

3 - 7 yrs

₹8L - ₹15L / yr

AWS Lambda

Amazon S3

Amazon VPC

Amazon EC2

Amazon Redshift

Technical Skills:

Ability to understand and translate business requirements into design.
Proficient in AWS infrastructure components such as S3, IAM, VPC, EC2, and Redshift.
Experience in creating ETL jobs using Python/PySpark.
Proficiency in creating AWS Lambda functions for event-based jobs.
Knowledge of automating ETL processes using AWS Step Functions.
Competence in building data warehouses and loading data into them.

Responsibilities:

Understand business requirements and translate them into design.
Assess AWS infrastructure needs for development work.
Develop ETL jobs using Python/PySpark to meet requirements.
Implement AWS Lambda for event-based tasks.
Automate ETL processes using AWS Step Functions.
Build data warehouses and manage data loading.
Engage with customers and stakeholders to articulate the benefits of proposed solutions and frameworks.

Senior GCP Data Lead

at Arahas Technologies

Posted by Nidhi Shivane

Pune

3 - 8 yrs

₹10L - ₹20L / yr

PySpark

Data engineering

Big Data

Hadoop

Spark

Role Description

This is a full-time hybrid role as a GCP Data Engineer. As a GCP Data Engineer, you will be responsible for managing large sets of structured and unstructured data and developing processes to convert data into insights, information, and knowledge.

Skill Name: GCP Data Engineer

Experience: 7-10 years

Notice Period: 0-15 days

Location: Pune

If you have a passion for data engineering and possess the following, we would love to hear from you:

🔹 7 to 10 years of experience working on Software Development Life Cycle (SDLC)

🔹 At least 4 years of experience in Google Cloud Platform, with a focus on BigQuery

🔹 Proficiency in Java and Python, along with experience in Google Cloud SDK & API Scripting

🔹 Experience in the Finance/Revenue domain would be considered an added advantage

🔹 Familiarity with GCP migration activities and the dbt tool would also be beneficial

You will play a crucial role in developing and maintaining our data infrastructure on the Google Cloud Platform.

Your expertise in SDLC, BigQuery, Java, Python, and Google Cloud SDK & API Scripting will be instrumental in ensuring the smooth operation of our data systems.

Join our dynamic team and contribute to our mission of harnessing the power of data to make informed business decisions.