11+ Natural Language Toolkit (NLTK) Jobs in Hyderabad | Natural Language Toolkit (NLTK) Job openings in Hyderabad
Apply to 11+ Natural Language Toolkit (NLTK) Jobs in Hyderabad on CutShort.io. Explore the latest Natural Language Toolkit (NLTK) Job opportunities across top companies like Google, Amazon & Adobe.
Publicis Sapient Overview:
As Senior Associate L1 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will use a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions. You will independently drive design discussions to ensure the necessary health of the overall solution.
Job Summary:
As Senior Associate L2 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will use a deep understanding of data integration and big data design principles to create custom solutions or implement package solutions. You will independently drive design discussions to ensure the necessary health of the overall solution.
The role requires a hands-on technologist with a strong programming background in Java / Scala / Python, experience in data ingestion, integration and data wrangling, computation, and analytics pipelines, and exposure to Hadoop ecosystem components. You are also required to have hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms.
Role & Responsibilities:
Your role is focused on the design, development, and delivery of solutions involving:
• Data Integration, Processing & Governance
• Data Storage and Computation Frameworks, Performance Optimizations
• Analytics & Visualizations
• Infrastructure & Cloud Computing
• Data Management Platforms
• Implement scalable architectural models for data processing and storage
• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time mode
• Build functionality for data analytics, search and aggregation
Experience Guidelines:
Mandatory Experience and Competencies:
# Competency
1. Overall 5+ years of IT experience, with 3+ years in data-related technologies
2. Minimum 2.5 years of experience in Big Data technologies and working exposure to related data services on at least one cloud platform (AWS / Azure / GCP)
3. Hands-on experience with the Hadoop stack: HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow, and other components required to build an end-to-end data pipeline
4. Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferred
5. Hands-on working knowledge of NoSQL and MPP data platforms such as HBase, MongoDB, Cassandra, AWS Redshift, Azure SQLDW, GCP BigQuery, etc.
6. Well-versed in, and with working knowledge of, data platform related services on at least one cloud platform, including IAM and data security
Preferred Experience and Knowledge (Good to Have):
# Competency
1. Good knowledge of traditional ETL tools (Informatica, Talend, etc.) and database technologies (Oracle, MySQL, SQL Server, Postgres), with hands-on experience
2. Knowledge of data governance processes (security, lineage, catalog) and tools such as Collibra, Alation, etc.
3. Knowledge of distributed messaging frameworks such as ActiveMQ / RabbitMQ / Solace, search and indexing, and microservices architectures
4. Performance tuning and optimization of data pipelines
5. CI/CD: infrastructure provisioning on cloud, automated build and deployment pipelines, code quality
6. Cloud data specialty and other related Big Data technology certifications
Personal Attributes:
• Strong written and verbal communication skills
• Articulation skills
• Good team player
• Self-starter who requires minimal oversight
• Ability to prioritize and manage multiple tasks
• Process orientation and the ability to define and set up processes
A Bachelor’s degree in data science, statistics, computer science, or a similar field
2+ years of industry experience working in a data science role, such as statistics, machine learning, deep learning, quantitative financial analysis, data engineering, or natural language processing
Domain experience in Financial Services (banking, insurance, risk, funds) is preferred
Experience producing and rapidly delivering minimum viable products; results-focused, with the ability to prioritize the most impactful deliverables
Strong applied statistics capabilities, including an excellent understanding of machine learning techniques and algorithms
Hands-on experience, preferably in implementing scalable machine learning solutions using Python / Scala / Java on Azure, AWS, or Google Cloud Platform
Experience with storage frameworks like Hadoop, Spark, Kafka, etc.
Experience in building and deploying unsupervised, semi-supervised, and supervised models, and knowledge of various ML algorithms such as regression models, tree-based algorithms, ensemble learning techniques, distance-based ML algorithms, etc.
Ability to track down complex data quality and data integration issues, evaluate different algorithmic approaches, and analyse data to solve problems
Experience in implementing parallel processing and in-memory frameworks such as H2O.ai
We are looking for a Snowflake developer for one of our premium clients for their PAN India location
CORE RESPONSIBILITIES
- Create and manage cloud resources in AWS
- Ingest data from different sources that expose data through different technologies, such as RDBMS, REST HTTP APIs, flat files, streams, and time series data from various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies
- Data processing/transformation using various technologies such as Spark and Cloud Services. You will need to understand your part of business logic and implement it using the language supported by the base data platform
- Develop automated data quality checks to make sure the right data enters the platform and to verify the results of the calculations
- Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
- Define process improvement opportunities to optimize data collection, insights and displays.
- Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible
- Identify and interpret trends and patterns from complex data sets
- Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders.
- Key participant in regular Scrum ceremonies with the agile teams
- Proficient at developing queries, writing reports and presenting findings
- Mentor junior members and bring best industry practices
QUALIFICATIONS
- 5-7+ years’ experience as data engineer in consumer finance or equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales)
- Strong background in math, statistics, computer science, data science or related discipline
- Advanced knowledge of one of the following languages: Java, Scala, Python, C#
- Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
- Proficient with:
- Data mining/programming tools (e.g. SAS, SQL, R, Python)
- Database technologies (e.g. PostgreSQL, Redshift, Snowflake, and Greenplum)
- Data visualization (e.g. Tableau, Looker, MicroStrategy)
- Comfortable learning about and deploying new technologies and tools.
- Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines.
- Good written and oral communication skills and ability to present results to non-technical audiences
- Knowledge of business intelligence and analytical tools, technologies and techniques.
Mandatory Requirements
- Experience in AWS Glue
- Experience in Apache Parquet
- Proficient in AWS S3 and data lake
- Knowledge of Snowflake
- Understanding of file-based ingestion best practices.
- Scripting languages: Python and PySpark
Job Title: Data Engineer
Job Summary: As a Data Engineer, you will be responsible for designing, building, and maintaining the infrastructure and tools necessary for data collection, storage, processing, and analysis. You will work closely with data scientists and analysts to ensure that data is available, accessible, and in a format that can be easily consumed for business insights.
Responsibilities:
- Design, build, and maintain data pipelines to collect, store, and process data from various sources.
- Create and manage data warehousing and data lake solutions.
- Develop and maintain data processing and data integration tools.
- Collaborate with data scientists and analysts to design and implement data models and algorithms for data analysis.
- Optimize and scale existing data infrastructure to ensure it meets the needs of the business.
- Ensure data quality and integrity across all data sources.
- Develop and implement best practices for data governance, security, and privacy.
- Monitor data pipeline performance and errors, and troubleshoot issues as needed.
- Stay up-to-date with emerging data technologies and best practices.
Requirements:
Bachelor's degree in Computer Science, Information Systems, or a related field.
Experience with ETL tools like Matillion, SSIS, and Informatica.
Experience with SQL and relational databases such as SQL Server, MySQL, PostgreSQL, or Oracle.
Experience in writing complex SQL queries
Strong programming skills in languages such as Python, Java, or Scala.
Experience with data modeling, data warehousing, and data integration.
Strong problem-solving skills and ability to work independently.
Excellent communication and collaboration skills.
Familiarity with big data technologies such as Hadoop, Spark, or Kafka.
Familiarity with data warehouse / data lake technologies like Snowflake or Databricks.
Familiarity with cloud computing platforms such as AWS, Azure, or GCP.
Familiarity with Reporting tools
Teamwork / growth contribution
- Helping the team conduct interviews and identify the right candidates
- Adhering to timelines
- Timely status communication and upfront communication of any risks
- Teach, train, and share knowledge with peers.
- Good Communication skills
- Proven abilities to take initiative and be innovative
- Analytical mind with a problem-solving aptitude
Good to have:
Master's degree in Computer Science, Information Systems, or a related field.
Experience with NoSQL databases such as MongoDB or Cassandra.
Familiarity with data visualization and business intelligence tools such as Tableau or Power BI.
Knowledge of machine learning and statistical modeling techniques.
If you are passionate about data and want to work with a dynamic team of data scientists and analysts, we encourage you to apply for this position.
• Hadoop Ecosystem (HBase, Hive, MapReduce, HDFS, Pig, Sqoop etc)
• Good hands-on experience with Spark (Spark with Java / PySpark)
• Hive
• Must be good with SQL (Spark SQL / HiveQL)
• Application design, software development and automated testing
Environment Experience:
• Experience implementing integrated automated release management using tools, technologies, and frameworks such as Maven, Git, code/security review tools, Jenkins, automated testing, and JUnit.
• Demonstrated experience with Agile or other rapid application development methods
• Cloud development (AWS/Azure/GCP)
• Unix / Shell scripting
• Web services, open API development, and REST concepts
at Altimetrik
- Expertise in building AWS data engineering pipelines with AWS Glue -> Athena -> QuickSight
- Experience in developing lambda functions with AWS Lambda
- Expertise with Spark/PySpark – Candidate should be hands on with PySpark code and should be able to do transformations with Spark
- Should be able to code in Python and Scala.
- Snowflake experience will be a plus
Tiger Analytics is a global AI & analytics consulting firm. With data and technology at the core of our solutions, we are solving some of the toughest problems out there. Our culture is modeled around expertise and mutual respect with a team first mindset. Working at Tiger, you’ll be at the heart of this AI revolution. You’ll work with teams that push the boundaries of what-is-possible and build solutions that energize and inspire.
We are headquartered in Silicon Valley and have delivery centres across the globe. The role below is for our Chennai or Bangalore office, or you can choose to work remotely.
About the Role:
As an Associate Director - Data Science at Tiger Analytics, you will lead data science aspects of end-to-end client AI & analytics programs. Your role will be a combination of hands-on contribution, technical team management, and client interaction.
• Work closely with internal teams and client stakeholders to design analytical approaches to solve business problems
• Develop and enhance solutions to a broad range of cutting-edge data analytics and machine learning problems across a variety of industries.
• Work on various aspects of the ML ecosystem: model building, ML pipelines, logging & versioning, documentation, scaling, deployment, monitoring, maintenance, etc.
• Lead a team of data scientists and engineers to embed AI and analytics into the client business decision processes.
Desired Skills:
• High level of proficiency in a structured programming language, e.g. Python, R.
• Experience designing data science solutions to business problems
• Deep understanding of ML algorithms for common use cases in both structured and unstructured data ecosystems.
• Comfortable with large scale data processing and distributed computing
• Excellent written and verbal communication skills
• 10+ years of experience, including 8 years of relevant data science experience with hands-on programming.
Designation will be commensurate with expertise/experience. Compensation packages are among the best in the industry.