11+ Data cleansing Jobs in Chennai | Data cleansing Job openings in Chennai
Apply to 11+ Data cleansing Jobs in Chennai on CutShort.io. Explore the latest Data cleansing Job opportunities across top companies like Google, Amazon & Adobe.
- Partnering with internal business owners (product, marketing, edit, etc.) to understand needs and develop custom analysis to optimize for user engagement and retention
- Good understanding of the underlying business and workings of cross functional teams for successful execution
- Design and develop analyses based on business requirement needs and challenges.
- Leveraging statistical analysis on consumer research and data mining projects, including segmentation, clustering, factor analysis, multivariate regression, predictive modeling, etc.
- Providing statistical analysis on custom research projects and consult on A/B testing and other statistical analysis as needed. Other reports and custom analysis as required.
- Identify and use appropriate investigative and analytical technologies to interpret and verify results.
- Apply and learn a wide variety of tools and languages to achieve results
- Use best practices to develop statistical and/ or machine learning techniques to build models that address business needs.
Requirements
- 2 - 4 years of relevant experience in Data science.
- Preferred education: Bachelor's degree in a technical field or equivalent experience.
- Experience in advanced analytics, model building, statistical modeling, optimization, and machine learning algorithms.
- Machine Learning Algorithms: Crystal clear understanding, coding, implementation, error analysis, model tuning knowledge on Linear Regression, Logistic Regression, SVM, shallow Neural Networks, clustering, Decision Trees, Random forest, XGBoost, Recommender Systems, ARIMA and Anomaly Detection. Feature selection, hyper parameters tuning, model selection and error analysis, boosting and ensemble methods.
- Strong with programming languages like Python and data processing using SQL or equivalent and ability to experiment with newer open source tools.
- Experience in normalizing data to ensure it is homogeneous and consistently formatted to enable sorting, query and analysis.
- Experience designing, developing, implementing and maintaining a database and programs to manage data analysis efforts.
- Experience with big data and cloud computing viz. Spark, Hadoop (MapReduce, PIG, HIVE).
- Experience in risk and credit score domains preferred.
Position Overview: We are seeking a talented Data Engineer with expertise in Power BI to join our team. The ideal candidate will be responsible for designing and implementing data pipelines, as well as developing insightful visualizations and reports using Power BI. Additionally, the candidate should have strong skills in Python, data analytics, PySpark, and Databricks. This role requires a blend of technical expertise, analytical thinking, and effective communication skills.
Key Responsibilities:
- Design, develop, and maintain data pipelines and architectures using PySpark and Databricks.
- Implement ETL processes to extract, transform, and load data from various sources into data warehouses or data lakes.
- Collaborate with data analysts and business stakeholders to understand data requirements and translate them into actionable insights.
- Develop interactive dashboards, reports, and visualizations using Power BI to communicate key metrics and trends.
- Optimize and tune data pipelines for performance, scalability, and reliability.
- Monitor and troubleshoot data infrastructure to ensure data quality, integrity, and availability.
- Implement security measures and best practices to protect sensitive data.
- Stay updated with emerging technologies and best practices in data engineering and data visualization.
- Document processes, workflows, and configurations to maintain a comprehensive knowledge base.
Requirements:
- Bachelor’s degree in Computer Science, Engineering, or related field. (Master’s degree preferred)
- Proven experience as a Data Engineer with expertise in Power BI, Python, PySpark, and Databricks.
- Strong proficiency in Power BI, including data modeling, DAX calculations, and creating interactive reports and dashboards.
- Solid understanding of data analytics concepts and techniques.
- Experience working with Big Data technologies such as Hadoop, Spark, or Kafka.
- Proficiency in programming languages such as Python and SQL.
- Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud.
- Excellent analytical and problem-solving skills with attention to detail.
- Strong communication and collaboration skills to work effectively with cross-functional teams.
- Ability to work independently and manage multiple tasks simultaneously in a fast-paced environment.
Preferred Qualifications:
- Advanced degree in Computer Science, Engineering, or related field.
- Certifications in Power BI or related technologies.
- Experience with data visualization tools other than Power BI (e.g., Tableau, QlikView).
- Knowledge of machine learning concepts and frameworks.
Job Description – Data Science
Basic Qualification:
- ME/MS from premier institute with a background in Mechanical/Industrial/Chemical/Materials engineering.
- Strong Analytical skills and application of Statistical techniques to problem solving
- Expertise in algorithms, data structures and performance optimization techniques
- Proven track record of demonstrating end to end ownership involving taking an idea from incubator to market
- Minimum years of experience in data analysis (2+), statistical analysis, data mining, algorithms for optimization.
Responsibilities
The Data Engineer/Analyst will
- Work with stakeholders throughout the organization to identify opportunities for leveraging company data to drive business solutions.
- Clear interaction with Business teams including product planning, sales, marketing, finance for defining the projects, objectives.
- Mine and analyze data from company databases to drive optimization and improvement of product and process development, marketing techniques and business strategies
- Coordinate with different R&D and Business teams to implement models and monitor outcomes.
- Mentor team members towards developing quick solutions for business impact.
- Skilled at all stages of the analysis process including defining key business questions, recommending measures, data sources, methodology and study design, dataset creation, analysis execution, interpretation and presentation and publication of results.
- 4+ years’ experience in MNC environment with projects involving ML, DL and/or DS
- Experience in Machine Learning, Data Mining or Machine Intelligence (Artificial Intelligence)
- Knowledge on Microsoft Azure will be desired.
- Expertise in machine learning such as Classification, Data/Text Mining, NLP, Image Processing, Decision Trees, Random Forest, Neural Networks, Deep Learning Algorithms
- Proficient in Python and its various libraries such as Numpy, MatPlotLib, Pandas
- Superior verbal and written communication skills, ability to convey rigorous mathematical concepts and considerations to Business Teams.
- Experience in infra development / building platforms is highly desired.
- A drive to learn and master new technologies and techniques.
Company - Tekclan Software Solutions
Position – SQL Developer
Experience – Minimum 4+ years of experience in MS SQL server, SQL Programming, ETL development.
Location - Chennai
We are seeking a highly skilled SQL Developer with expertise in MS SQL Server, SSRS, SQL programming, writing stored procedures, and proficiency in ETL using SSIS. The ideal candidate will have a strong understanding of database concepts, query optimization, and data modeling.
Responsibilities:
1. Develop, optimize, and maintain SQL queries, stored procedures, and functions for efficient data retrieval and manipulation.
2. Design and implement ETL processes using SSIS for data extraction, transformation, and loading from various sources.
3. Collaborate with cross-functional teams to gather business requirements and translate them into technical specifications.
4. Create and maintain data models, ensuring data integrity, normalization, and performance.
5. Generate insightful reports and dashboards using SSRS to facilitate data-driven decision making.
6. Troubleshoot and resolve database performance issues, bottlenecks, and data inconsistencies.
7. Conduct thorough testing and debugging of SQL code to ensure accuracy and reliability.
8. Stay up-to-date with emerging trends and advancements in SQL technologies and provide recommendations for improvement.
9. Should be an independent and individual contributor.
Requirements:
1. Minimum of 4+ years of experience in MS SQL server, SQL Programming, ETL development.
2. Proven experience as a SQL Developer with a strong focus on MS SQL Server.
3. Proficiency in SQL programming, including writing complex queries, stored procedures, and functions.
4. In-depth knowledge of ETL processes and hands-on experience with SSIS.
5. Strong expertise in creating reports and dashboards using SSRS.
6. Familiarity with database design principles, query optimization, and data modeling.
7. Experience with performance tuning and troubleshooting SQL-related issues.
8. Excellent problem-solving skills and attention to detail.
9. Strong communication and collaboration abilities.
10. Ability to work independently and handle multiple tasks simultaneously.
Preferred Skills:
1. Certification in MS SQL Server or related technologies.
2. Knowledge of other database systems such as Oracle or MySQL.
3. Familiarity with data warehousing concepts and tools.
4. Experience with version control systems.
Responsibilities:
- Must be able to write quality code and build secure, highly available systems.
- Assemble large, complex datasets that meet functional / non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing datadelivery, re-designing infrastructure for greater scalability, etc with the guidance.
- Create datatools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Monitoring performance and advising any necessary infrastructure changes.
- Defining dataretention policies.
- Implementing the ETL process and optimal data pipeline architecture
- Build analytics tools that utilize the datapipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
- Create design documents that describe the functionality, capacity, architecture, and process.
- Develop, test, and implement datasolutions based on finalized design documents.
- Work with dataand analytics experts to strive for greater functionality in our data
- Proactively identify potential production issues and recommend and implement solutions
Skillsets:
- Good understanding of optimal extraction, transformation, and loading of datafrom a wide variety of data sources using SQL and AWS ‘big data’ technologies.
- Proficient understanding of distributed computing principles
- Experience in working with batch processing/ real-time systems using various open-source technologies like NoSQL, Spark, Pig, Hive, Apache Airflow.
- Implemented complex projects dealing with the considerable datasize (PB).
- Optimization techniques (performance, scalability, monitoring, etc.)
- Experience with integration of datafrom multiple data sources
- Experience with NoSQL databases, such as HBase, Cassandra, MongoDB, etc.,
- Knowledge of various ETL techniques and frameworks, such as Flume
- Experience with various messaging systems, such as Kafka or RabbitMQ
- Good understanding of Lambda Architecture, along with its advantages and drawbacks
- Creation of DAGs for dataengineering
- Expert at Python /Scala programming, especially for dataengineering/ ETL purposes
Skills and requirements
- Experience analyzing complex and varied data in a commercial or academic setting.
- Desire to solve new and complex problems every day.
- Excellent ability to communicate scientific results to both technical and non-technical team members.
Desirable
- A degree in a numerically focused discipline such as, Maths, Physics, Chemistry, Engineering or Biological Sciences..
- Hands on experience on Python, Pyspark, SQL
- Hands on experience on building End to End Data Pipelines.
- Hands on Experience on Azure Data Factory, Azure Data Bricks, Data Lake - added advantage
- Hands on Experience in building data pipelines.
- Experience with Bigdata Tools, Hadoop, Hive, Sqoop, Spark, SparkSQL
- Experience with SQL or NoSQL databases for the purposes of data retrieval and management.
- Experience in data warehousing and business intelligence tools, techniques and technology, as well as experience in diving deep on data analysis or technical issues to come up with effective solutions.
- BS degree in math, statistics, computer science or equivalent technical field.
- Experience in data mining structured and unstructured data (SQL, ETL, data warehouse, Machine Learning etc.) in a business environment with large-scale, complex data sets.
- Proven ability to look at solutions in unconventional ways. Sees opportunities to innovate and can lead the way.
- Willing to learn and work on Data Science, ML, AI.
Experience – 3 – 12 yrs
Budget - Open
Location - PAN India (Noida/Bangaluru/Hyderabad/Chennai)
Presto Developer (4)
Understanding of distributed SQL query engine running on Hadoop
Design and develop core components for Presto
Contribute to the ongoing Presto development by implementing new features, bug fixes, and other improvements
Develop new and extend existing Presto connectors to various data sources
Lead complex and technically challenging projects from concept to completion
Write tests and contribute to ongoing automation infrastructure development
Run and analyze software performance metrics
Collaborate with teams globally across multiple time zones and operate in an Agile development environment
Hands-on experience and interest with Hadoop
Job Summary
As a Data Science Lead, you will manage multiple consulting projects of varying complexity and ensure on-time and on-budget delivery for clients. You will lead a team of data scientists and collaborate across cross-functional groups, while contributing to new business development, supporting strategic business decisions and maintaining & strengthening client base
- Work with team to define business requirements, come up with analytical solution and deliver the solution with specific focus on Big Picture to drive robustness of the solution
- Work with teams of smart collaborators. Be responsible for their appraisals and career development.
- Participate and lead executive presentations with client leadership stakeholders.
- Be part of an inclusive and open environment. A culture where making mistakes and learning from them is part of life
- See how your work contributes to building an organization and be able to drive Org level initiatives that will challenge and grow your capabilities.
Role & Responsibilities
- Serve as expert in Data Science, build framework to develop Production level DS/AI models.
- Apply AI research and ML models to accelerate business innovation and solve impactful business problems for our clients.
- Lead multiple teams across clients ensuring quality and timely outcomes on all projects.
- Lead and manage the onsite-offshore relation, at the same time adding value to the client.
- Partner with business and technical stakeholders to translate challenging business problems into state-of-the-art data science solutions.
- Build a winning team focused on client success. Help team members build lasting career in data science and create a constant learning/development environment.
- Present results, insights, and recommendations to senior management with an emphasis on the business impact.
- Build engaging rapport with client leadership through relevant conversations and genuine business recommendations that impact the growth and profitability of the organization.
- Lead or contribute to org level initiatives to build the Tredence of tomorrow.
Qualification & Experience
- Bachelor's /Master's /PhD degree in a quantitative field (CS, Machine learning, Mathematics, Statistics, Data Science) or equivalent experience.
- 6-10+ years of experience in data science, building hands-on ML models
- Expertise in ML – Regression, Classification, Clustering, Time Series Modeling, Graph Network, Recommender System, Bayesian modeling, Deep learning, Computer Vision, NLP/NLU, Reinforcement learning, Federated Learning, Meta Learning.
- Proficient in some or all of the following techniques: Linear & Logistic Regression, Decision Trees, Random Forests, K-Nearest Neighbors, Support Vector Machines ANOVA , Principal Component Analysis, Gradient Boosted Trees, ANN, CNN, RNN, Transformers.
- Knowledge of programming languages SQL, Python/ R, Spark.
- Expertise in ML frameworks and libraries (TensorFlow, Keras, PyTorch).
- Experience with cloud computing services (AWS, GCP or Azure)
- Expert in Statistical Modelling & Algorithms E.g. Hypothesis testing, Sample size estimation, A/B testing
- Knowledge in Mathematical programming – Linear Programming, Mixed Integer Programming etc , Stochastic Modelling – Markov chains, Monte Carlo, Stochastic Simulation, Queuing Models.
- Experience with Optimization Solvers (Gurobi, Cplex) and Algebraic programming Languages(PulP)
- Knowledge in GPU code optimization, Spark MLlib Optimization.
- Familiarity to deploy and monitor ML models in production, delivering data products to end-users.
- Experience with ML CI/CD pipelines.
15 years US based Product Company
- Should have good hands-on experience in Informatica MDM Customer 360, Data Integration(ETL) using PowerCenter, Data Quality.
- Must have strong skills in Data Analysis, Data Mapping for ETL processes, and Data Modeling.
- Experience with the SIF framework including real-time integration
- Should have experience in building C360 Insights using Informatica
- Should have good experience in creating performant design using Mapplets, Mappings, Workflows for Data Quality(cleansing), ETL.
- Should have experience in building different data warehouse architecture like Enterprise,
- Federated, and Multi-Tier architecture.
- Should have experience in configuring Informatica Data Director in reference to the Data
- Governance of users, IT Managers, and Data Stewards.
- Should have good knowledge in developing complex PL/SQL queries.
- Should have working experience on UNIX and shell scripting to run the Informatica workflows and to control the ETL flow.
- Should know about Informatica Server installation and knowledge on the Administration console.
- Working experience with Developer with Administration is added knowledge.
- Working experience in Amazon Web Services (AWS) is an added advantage. Particularly on AWS S3, Data pipeline, Lambda, Kinesis, DynamoDB, and EMR.
- Should be responsible for the creation of automated BI solutions, including requirements, design,development, testing, and deployment