11+ XGBoost Jobs in Pune | XGBoost Job openings in Pune
Apply to 11+ XGBoost Jobs in Pune on CutShort.io. Explore the latest XGBoost Job opportunities across top companies like Google, Amazon & Adobe.
XressBees – a logistics company started in 2015 – is amongst the fastest growing companies of its sector. Our
vision to evolve into a strong full-service logistics organization reflects itself in the various lines of business like B2C
logistics 3PL, B2B Xpress, Hyperlocal and Cross border Logistics.
Our strong domain expertise and constant focus on innovation has helped us rapidly evolve as the most trusted
logistics partner of India. XB has progressively carved our way towards best-in-class technology platforms, an
extensive logistics network reach, and a seamless last mile management system.
While on this aggressive growth path, we seek to become the one-stop-shop for end-to-end logistics solutions. Our
big focus areas for the very near future include strengthening our presence as service providers of choice and
leveraging the power of technology to drive supply chain efficiencies.
Job Overview
XpressBees would enrich and scale its end-to-end logistics solutions at a high pace. This is a great opportunity to join
the team working on forming and delivering the operational strategy behind Artificial Intelligence / Machine Learning
and Data Engineering, leading projects and teams of AI Engineers collaborating with Data Scientists. In your role, you
will build high performance AI/ML solutions using groundbreaking AI/ML and BigData technologies. You will need to
understand business requirements and convert them to a solvable data science problem statement. You will be
involved in end to end AI/ML projects, starting from smaller scale POCs all the way to full scale ML pipelines in
production.
Seasoned AI/ML Engineers would own the implementation and productionzation of cutting-edge AI driven algorithmic
components for search, recommendation and insights to improve the efficiencies of the logistics supply chain and
serve the customer better.
You will apply innovative ML tools and concepts to deliver value to our teams and customers and make an impact to
the organization while solving challenging problems in the areas of AI, ML , Data Analytics and Computer Science.
Opportunities for application:
- Route Optimization
- Address / Geo-Coding Engine
- Anomaly detection, Computer Vision (e.g. loading / unloading)
- Fraud Detection (fake delivery attempts)
- Promise Recommendation Engine etc.
- Customer & Tech support solutions, e.g. chat bots.
- Breach detection / prediction
An Artificial Intelligence Engineer would apply himself/herself in the areas of -
- Deep Learning, NLP, Reinforcement Learning
- Machine Learning - Logistic Regression, Decision Trees, Random Forests, XGBoost, etc..
- Driving Optimization via LPs, MILPs, Stochastic Programs, and MDPs
- Operations Research, Supply Chain Optimization, and Data Analytics/Visualization
- Computer Vision and OCR technologies
The AI Engineering team enables internal teams to add AI capabilities to their Apps and Workflows easily via APIs
without needing to build AI expertise in each team – Decision Support, NLP, Computer Vision, for Public Clouds and
Enterprise in NLU, Vision and Conversational AI.Candidate is adept at working with large data sets to find
opportunities for product and process optimization and using models to test the effectiveness of different courses of
action. They must have knowledge using a variety of data mining/data analysis methods, using a variety of data tools,
building, and implementing models, using/creating algorithms, and creating/running simulations. They must be
comfortable working with a wide range of stakeholders and functional teams. The right candidate will have a passion
for discovering solutions hidden in large data sets and working with stakeholders to improve business outcomes.
Roles & Responsibilities
● Develop scalable infrastructure, including microservices and backend, that automates training and
deployment of ML models.
● Building cloud services in Decision Support (Anomaly Detection, Time series forecasting, Fraud detection,
Risk prevention, Predictive analytics), computer vision, natural language processing (NLP) and speech that
work out of the box.
● Brainstorm and Design various POCs using ML/DL/NLP solutions for new or existing enterprise problems.
● Work with fellow data scientists/SW engineers to build out other parts of the infrastructure, effectively
communicating your needs and understanding theirs and address external and internal shareholder's
product challenges.
● Build core of Artificial Intelligence and AI Services such as Decision Support, Vision, Speech, Text, NLP, NLU,
and others.
● Leverage Cloud technology –AWS, GCP, Azure
● Experiment with ML models in Python using machine learning libraries (Pytorch, Tensorflow), Big Data,
Hadoop, HBase, Spark, etc
● Work with stakeholders throughout the organization to identify opportunities for leveraging company data to
drive business solutions.
● Mine and analyze data from company databases to drive optimization and improvement of product
development, marketing techniques and business strategies.
● Assess the effectiveness and accuracy of new data sources and data gathering techniques.
● Develop custom data models and algorithms to apply to data sets.
● Use predictive modeling to increase and optimize customer experiences, supply chain metric and other
business outcomes.
● Develop company A/B testing framework and test model quality.
● Coordinate with different functional teams to implement models and monitor outcomes.
● Develop processes and tools to monitor and analyze model performance and data accuracy.
● Develop scalable infrastructure, including microservices and backend, that automates training and
deployment of ML models.
● Brainstorm and Design various POCs using ML/DL/NLP solutions for new or existing enterprise problems.
● Work with fellow data scientists/SW engineers to build out other parts of the infrastructure, effectively
communicating your needs and understanding theirs and address external and internal shareholder's
product challenges.
● Deliver machine learning and data science projects with data science techniques and associated libraries
such as AI/ ML or equivalent NLP (Natural Language Processing) packages. Such techniques include a good
to phenomenal understanding of statistical models, probabilistic algorithms, classification, clustering, deep
learning or related approaches as it applies to financial applications.
● The role will encourage you to learn a wide array of capabilities, toolsets and architectural patterns for
successful delivery.
What is required of you?
You will get an opportunity to build and operate a suite of massive scale, integrated data/ML platforms in a broadly
distributed, multi-tenant cloud environment.
● B.S., M.S., or Ph.D. in Computer Science, Computer Engineering
● Coding knowledge and experience with several languages: C, C++, Java,JavaScript, etc.
● Experience with building high-performance, resilient, scalable, and well-engineered systems
● Experience in CI/CD and development best practices, instrumentation, logging systems
● Experience using statistical computer languages (R, Python, SLQ, etc.) to manipulate data and draw insights
from large data sets.
● Experience working with and creating data architectures.
● Good understanding of various machine learning and natural language processing technologies, such as
classification, information retrieval, clustering, knowledge graph, semi-supervised learning and ranking.
● Knowledge and experience in statistical and data mining techniques: GLM/Regression, Random Forest,
Boosting, Trees, text mining, social network analysis, etc.
● Knowledge on using web services: Redshift, S3, Spark, Digital Ocean, etc.
● Knowledge on creating and using advanced machine learning algorithms and statistics: regression,
simulation, scenario analysis, modeling, clustering, decision trees, neural networks, etc.
● Knowledge on analyzing data from 3rd party providers: Google Analytics, Site Catalyst, Core metrics,
AdWords, Crimson Hexagon, Facebook Insights, etc.
● Knowledge on distributed data/computing tools: Map/Reduce, Hadoop, Hive, Spark, MySQL, Kafka etc.
● Knowledge on visualizing/presenting data for stakeholders using: Quicksight, Periscope, Business Objects,
D3, ggplot, Tableau etc.
● Knowledge of a variety of machine learning techniques (clustering, decision tree learning, artificial neural
networks, etc.) and their real-world advantages/drawbacks.
● Knowledge of advanced statistical techniques and concepts (regression, properties of distributions,
statistical tests, and proper usage, etc.) and experience with applications.
● Experience building data pipelines that prep data for Machine learning and complete feedback loops.
● Knowledge of Machine Learning lifecycle and experience working with data scientists
● Experience with Relational databases and NoSQL databases
● Experience with workflow scheduling / orchestration such as Airflow or Oozie
● Working knowledge of current techniques and approaches in machine learning and statistical or
mathematical models
● Strong Data Engineering & ETL skills to build scalable data pipelines. Exposure to data streaming stack (e.g.
Kafka)
● Relevant experience in fine tuning and optimizing ML (especially Deep Learning) models to bring down
serving latency.
● Exposure to ML model productionzation stack (e.g. MLFlow, Docker)
● Excellent exploratory data analysis skills to slice & dice data at scale using SQL in Redshift/BigQuery.
Job Description:
1.Be a hands on problem solver with consultative approach, who can apply Machine Learning & Deep Learning algorithms to solve business challenges
a. Use the knowledge of wide variety of AI/ML techniques and algorithms to find what combinations of these techniques can best solve the problem
b. Improve Model accuracy to deliver greater business impact
c.Estimate business impact due to deployment of model
2.Work with the domain/customer teams to understand business context , data dictionaries and apply relevant Deep Learning solution for the given business challenge
3.Working with tools and scripts for sufficiently pre-processing the data & feature engineering for model development – Python / R / SQL / Cloud data pipelines
4.Design , develop & deploy Deep learning models using Tensorflow / Pytorch
5.Experience in using Deep learning models with text, speech, image and video data
a.Design & Develop NLP models for Text Classification, Custom Entity Recognition, Relationship extraction, Text Summarization, Topic Modeling, Reasoning over Knowledge Graphs, Semantic Search using NLP tools like Spacy and opensource Tensorflow, Pytorch, etc
b.Design and develop Image recognition & video analysis models using Deep learning algorithms and open source tools like OpenCV
c.Knowledge of State of the art Deep learning algorithms
6.Optimize and tune Deep Learnings model for best possible accuracy
7.Use visualization tools/modules to be able to explore and analyze outcomes & for Model validation eg: using Power BI / Tableau
8.Work with application teams, in deploying models on cloud as a service or on-prem
a.Deployment of models in Test / Control framework for tracking
b.Build CI/CD pipelines for ML model deployment
9.Integrating AI&ML models with other applications using REST APIs and other connector technologies
10.Constantly upskill and update with the latest techniques and best practices. Write white papers and create demonstrable assets to summarize the AIML work and its impact.
· Technology/Subject Matter Expertise
- Sufficient expertise in machine learning, mathematical and statistical sciences
- Use of versioning & Collaborative tools like Git / Github
- Good understanding of landscape of AI solutions – cloud, GPU based compute, data security and privacy, API gateways, microservices based architecture, big data ingestion, storage and processing, CUDA Programming
- Develop prototype level ideas into a solution that can scale to industrial grade strength
- Ability to quantify & estimate the impact of ML models.
· Softskills Profile
- Curiosity to think in fresh and unique ways with the intent of breaking new ground.
- Must have the ability to share, explain and “sell” their thoughts, processes, ideas and opinions, even outside their own span of control
- Ability to think ahead, and anticipate the needs for solving the problem will be important
· Ability to communicate key messages effectively, and articulate strong opinions in large forums
· Desirable Experience:
- Keen contributor to open source communities, and communities like Kaggle
- Ability to process Huge amount of Data using Pyspark/Hadoop
- Development & Application of Reinforcement Learning
- Knowledge of Optimization/Genetic Algorithms
- Operationalizing Deep learning model for a customer and understanding nuances of scaling such models in real scenarios
- Optimize and tune deep learning model for best possible accuracy
- Understanding of stream data processing, RPA, edge computing, AR/VR etc
- Appreciation of digital ethics, data privacy will be important
- Experience of working with AI & Cognitive services platforms like Azure ML, IBM Watson, AWS Sagemaker, Google Cloud will all be a big plus
- Experience in platforms like Data robot, Cognitive scale, H2O.AI etc will all be a big plus
1.Advanced knowledge of statistical techniques, NLP, machine learning algorithms and deep
learning
frameworks like Tensorflow, Theano, Keras, Pytorch
2. Proficiency with modern statistical modeling (regression, boosting trees, random forests,
etc.),
machine learning (text mining, neural network, NLP, etc.), optimization (linear
optimization,
nonlinear optimization, stochastic optimization, etc.) methodologies.
3. Build complex predictive models using ML and DL techniques with production quality
code and jointly
own complex data science workflows with the Data Engineering team.
4. Familiar with modern data analytics architecture and data engineering technologies
(SQL and No-SQL databases)
5. Knowledge of REST APIs and Web Services
6. Experience with Python, R, sh/bash
Required Skills (Non-Technical):-
1. Fluent in English Communication (Spoken and verbal)
2. Should be a team player
3. Should have a learning aptitude
4. Detail-oriented, analytical and inquisitive
5. Ability to work independently and with others
6. Extremely organized with strong time-management skills
7. Problem Solving & Critical Thinking
Required Experience Level : Junior level - 2+ years || Senior level- 4+Years
Work Location : Pune preferred, Remote option available
Work Timing : 2:30 PM to 11:30 PM IST
• Project Planning and Management
o Take end-to-end ownership of multiple projects / project tracks
o Create and maintain project plans and other related documentation for project
objectives, scope, schedule and delivery milestones
o Lead and participate across all the phases of software engineering, right from
requirements gathering to GO LIVE
o Lead internal team meetings on solution architecture, effort estimation, manpower
planning and resource (software/hardware/licensing) planning
o Manage RIDA (Risks, Impediments, Dependencies, Assumptions) for projects by
developing effective mitigation plans
• Team Management
o Act as the Scrum Master
o Conduct SCRUM ceremonies like Sprint Planning, Daily Standup, Sprint Retrospective
o Set clear objectives for the project and roles/responsibilities for each team member
o Train and mentor the team on their job responsibilities and SCRUM principles
o Make the team accountable for their tasks and help the team in achieving them
o Identify the requirements and come up with a plan for Skill Development for all team
members
• Communication
o Be the Single Point of Contact for the client in terms of day-to-day communication
o Periodically communicate project status to all the stakeholders (internal/external)
• Process Management and Improvement
o Create and document processes across all disciplines of software engineering
o Identify gaps and continuously improve processes within the team
o Encourage team members to contribute towards process improvement
o Develop a culture of quality and efficiency within the team
Must have:
• Minimum 08 years of experience (hands-on as well as leadership) in software / data engineering
across multiple job functions like Business Analysis, Development, Solutioning, QA, DevOps and
Project Management
• Hands-on as well as leadership experience in Big Data Engineering projects
• Experience developing or managing cloud solutions using Azure or other cloud provider
• Demonstrable knowledge on Hadoop, Hive, Spark, NoSQL DBs, SQL, Data Warehousing, ETL/ELT,
DevOps tools
• Strong project management and communication skills
• Strong analytical and problem-solving skills
• Strong systems level critical thinking skills
• Strong collaboration and influencing skills
Good to have:
• Knowledge on PySpark, Azure Data Factory, Azure Data Lake Storage, Synapse Dedicated SQL
Pool, Databricks, PowerBI, Machine Learning, Cloud Infrastructure
• Background in BFSI with focus on core banking
• Willingness to travel
Work Environment
• Customer Office (Mumbai) / Remote Work
Education
• UG: B. Tech - Computers / B. E. – Computers / BCA / B.Sc. Computer Science
at Persistent Systems
Location: Pune/Nagpur,Goa,Hyderabad/
Job Requirements:
- 9 years and above of total experience preferably in bigdata space.
- Creating spark applications using Scala to process data.
- Experience in scheduling and troubleshooting/debugging Spark jobs in steps.
- Experience in spark job performance tuning and optimizations.
- Should have experience in processing data using Kafka/Pyhton.
- Individual should have experience and understanding in configuring Kafka topics to optimize the performance.
- Should be proficient in writing SQL queries to process data in Data Warehouse.
- Hands on experience in working with Linux commands to troubleshoot/debug issues and creating shell scripts to automate tasks.
- Experience on AWS services like EMR.
Technical/Core skills
- Minimum 3 yrs of exp in Informatica Big data Developer(BDM) in Hadoop environment.
- Have knowledge of informatica Power exchange (PWX).
- Minimum 3 yrs of exp in big data querying tool like Hive and Impala.
- Ability to designing/development of complex mappings using informatica Big data Developer.
- Create and manage Informatica power exchange and CDC real time implementation
- Strong Unix knowledge skills for writing shell scripts and troubleshoot of existing scripts.
- Good knowledge of big data platforms and its framework.
- Good to have an experience in cloudera data platform (CDP)
- Experience with building stream processing systems using Kafka and spark
- Excellent SQL knowledge
Soft skills :
- Ability to work independently
- Strong analytical and problem solving skills
- Attitude of learning new technology
- Regular interaction with vendors, partners and stakeholders
Advanced degree in computer science, math, statistics or a related discipline ( Must have master degree )
Extensive data modeling and data architecture skills
Programming experience in Python, R
Background in machine learning frameworks such as TensorFlow or Keras
Knowledge of Hadoop or another distributed computing systems
Experience working in an Agile environment
Advanced math skills (Linear algebra
Discrete math
Differential equations (ODEs and numerical)
Theory of statistics 1
Numerical analysis 1 (numerical linear algebra) and 2 (quadrature)
Abstract algebra
Number theory
Real analysis
Complex analysis
Intermediate analysis (point set topology)) ( important )
Strong written and verbal communications
Hands on experience on NLP and NLG
Experience in advanced statistical techniques and concepts. ( GLM/regression, Random forest, boosting, trees, text mining ) and experience with application.
- 3+ years of experience in Machine Learning
- Bachelors/Masters in Computer Engineering/Science.
- Bachelors/Masters in Engineering/Mathematics/Statistics with sound knowledge of programming and computer concepts.
- 10 and 12th acedemics 70 % & above.
Skills :
- Strong Python/ programming skills
- Good conceptual understanding of Machine Learning/Deep Learning/Natural Language Processing
- Strong verbal and written communication skills.
- Should be able to manage team, meet project deadlines and interface with clients.
- Should be able to work across different domains and quickly ramp up the business processes & flows & translate business problems into the data solutions
Responsibilities for Data Scientist/ NLP Engineer
Work with customers to identify opportunities for leveraging their data to drive business
solutions.
• Develop custom data models and algorithms to apply to data sets.
• Basic data cleaning and annotation for any incoming raw data.
• Use predictive modeling to increase and optimize customer experiences, revenue
generation, ad targeting and other business outcomes.
• Develop company A/B testing framework and test model quality.
• Deployment of ML model in production.
Qualifications for Junior Data Scientist/ NLP Engineer
• BS, MS in Computer Science, Engineering, or related discipline.
• 3+ Years of experience in Data Science/Machine Learning.
• Experience with programming language Python.
• Familiar with at least one database query language, such as SQL
• Knowledge of Text Classification & Clustering, Question Answering & Query Understanding,
Search Indexing & Fuzzy Matching.
• Excellent written and verbal communication skills for coordinating acrossteams.
• Willing to learn and master new technologies and techniques.
• Knowledge and experience in statistical and data mining techniques:
GLM/Regression, Random Forest, Boosting, Trees, text mining, NLP, etc.
• Experience with chatbots would be bonus but not required