11+ MLS Jobs in Delhi, NCR and Gurgaon
Apply to 11+ MLS Jobs in Delhi, NCR and Gurgaon on CutShort.io. Explore the latest MLS Job opportunities across top companies like Google, Amazon & Adobe.
Object-oriented languages (e.g. Python, PySpark, Java, C#, C++) and frameworks (e.g. J2EE or .NET)
- Mandatory: hands-on experience in Python and PySpark.
- Build PySpark applications using Spark DataFrames in Python, in Jupyter Notebook and PyCharm (IDE); a minimal sketch of such a job follows this list.
- Experience optimizing Spark jobs that process huge volumes of data.
- Hands-on experience with version control tools like Git.
- Experience with Amazon's analytics services like Amazon EMR, Lambda functions, etc.
- Experience with Amazon's compute services like AWS Lambda and Amazon EC2, its storage service S3, and a few other services like SNS.
- Experience/knowledge of bash/shell scripting is a plus.
- Experience working with fixed-width, delimited, and multi-record file formats.
- Hands-on experience with tools like Jenkins to build, test, and deploy applications.
- Awareness of DevOps concepts and the ability to work in an automated release pipeline environment.
- Excellent debugging skills.
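As a rough illustration of the PySpark DataFrame work this role describes, here is a minimal sketch of a batch job; the S3 paths, column names, and aggregation are hypothetical, not taken from the posting:

```python
# A minimal PySpark batch job: read a delimited file, aggregate,
# write partitioned Parquet. All paths and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Read a delimited file into a DataFrame (a fixed-width file would
# instead need substring-based parsing of each line).
df = spark.read.csv("s3://example-bucket/input/events.csv",
                    header=True, inferSchema=True)

# A typical transformation: filter, derive a column, aggregate.
daily = (df.filter(F.col("status") == "OK")
           .withColumn("event_date", F.to_date("event_ts"))
           .groupBy("event_date")
           .agg(F.count("*").alias("events")))

# Partitioned Parquet keeps downstream scans cheap.
(daily.write.mode("overwrite")
      .partitionBy("event_date")
      .parquet("s3://example-bucket/output/daily_events/"))
```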
Exciting Opportunity: Data Engineer Position in Gurugram
Hello!
We are actively seeking a talented and experienced Data Engineer to join our dynamic team at Reality Motivational Venture in Gurugram (Gurgaon). If you're passionate about data, thrive in a collaborative environment, and possess the skills we're looking for, we want to hear from you!
Position: Data Engineer
Location: Gurugram (Gurgaon)
Experience: 5+ years
Key Skills:
- Python
- Spark, PySpark
- Data Governance
- Cloud (AWS/Azure/GCP)
Main Responsibilities:
- Define and set up analytics environments for "Big Data" applications in collaboration with domain experts.
- Implement ETL processes for telemetry-based and stationary test data (a sketch of such a step follows this list).
- Support in defining data governance, including data lifecycle management.
- Develop large-scale data processing engines and real-time search and analytics based on time series data.
- Ensure technical, methodological, and quality aspects.
- Support CI/CD processes.
- Foster know-how development and transfer, and the continuous improvement of leading technologies within Data Engineering.
- Collaborate with solution architects on the development of complex on-premise, hybrid, and cloud solution architectures.
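The telemetry ETL responsibility above could look roughly like the following pandas sketch; the column names, the 10-second resampling window, and the file locations are assumptions for illustration, not the actual pipeline:

```python
# Extract-transform-load for telemetry-style time series; the column
# names, 10-second window, and file locations are assumptions.
import pandas as pd

def etl_telemetry(src: str, dst: str) -> None:
    # Extract: raw telemetry with a timestamp column.
    raw = pd.read_csv(src, parse_dates=["timestamp"])

    # Transform: drop malformed rows, then downsample each signal
    # onto a uniform 10-second grid.
    clean = raw.dropna(subset=["timestamp", "signal_value"])
    resampled = (clean.set_index("timestamp")
                      .resample("10s")["signal_value"]
                      .mean())

    # Load: persist in a columnar format for the analytics environment.
    resampled.to_frame().to_parquet(dst)

etl_telemetry("telemetry_raw.csv", "telemetry_10s.parquet")
```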
Qualification Requirements:
- BSc, MSc, MEng, or PhD in Computer Science, Informatics/Telematics, Mathematics/Statistics, or a comparable engineering degree.
- Proficiency in Python and the PyData stack (pandas/NumPy).
- Experience in high-level programming languages (C#/C++/Java).
- Familiarity with scalable processing environments like Dask (or Spark); a minimal Dask sketch follows this list.
- Proficiency in Linux and scripting languages (Bash scripts).
- Experience in containerization and orchestration of containerized services (Kubernetes).
- Education in database technologies (SQL/OLAP and NoSQL).
- Interest in Big Data storage technologies (Elastic, ClickHouse).
- Familiarity with cloud technologies (Azure, AWS, GCP).
- Fluent English communication skills (speaking and writing).
- Ability to work constructively with a global team.
- Willingness to travel for business trips during development projects.
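Since the requirements name Dask as a scalable processing environment, here is a minimal sketch of the pattern it enables; the glob path and column names are hypothetical:

```python
# Dask reads many CSV partitions lazily rather than loading one big file.
import dask.dataframe as dd

ddf = dd.read_csv("data/telemetry-*.csv", parse_dates=["timestamp"])

# Operations build a task graph; nothing executes until .compute().
per_sensor_mean = ddf.groupby("sensor_id")["signal_value"].mean()
print(per_sensor_mean.compute())
```

The appeal of this model is that the groupby reads exactly like pandas but scales past single-machine memory.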
Preferable:
- Working knowledge of vehicle architectures, communication, and components.
- Experience in additional programming languages (C#/C++/Java, R, Scala, MATLAB).
- Experience in time-series processing.
How to Apply:
Interested candidates, please share your updated CV/resume with me.
Thank you for considering this exciting opportunity.
Responsibilities:
- Designing and implementing fine-tuned, production-ready data/ML pipelines on the Hadoop platform.
- Driving optimization, testing, and tooling to improve quality.
- Reviewing and approving high-level and detailed designs to ensure that the solution meets the business needs and aligns with the data & analytics architecture principles and roadmap.
- Understanding business requirements and solution design to develop and implement solutions that adhere to big data architectural guidelines and address business requirements.
- Following proper SDLC (code review, sprint process).
- Identifying, designing, and implementing internal process improvements: automating manual processes, optimizing data delivery, etc.
- Building robust and scalable data infrastructure (both batch processing and real-time) to support the needs of internal and external users.
- Understanding various data security standards and using secure data security tools to apply and adhere to the required data controls for user access on the Hadoop platform.
- Supporting and contributing to development guidelines and standards for data ingestion.
- Working with the data scientist and business analytics teams to assist with data ingestion and data-related technical issues.
- Designing and documenting the development & deployment flow.
Requirements:
- Experience in developing REST API services using one of the Scala frameworks.
- Ability to troubleshoot and optimize complex queries on the Spark platform (a sketch follows this list).
- Expertise in building and optimizing "big data" data/ML pipelines, architectures, and data sets.
- Knowledge of modelling unstructured data into structured data designs.
- Experience in Big Data access and storage techniques.
- Experience in cost estimation based on design and development.
- Excellent debugging skills for the technical stack mentioned above, including analyzing server logs and application logs.
- Highly organized, self-motivated, proactive, and able to propose the best design solutions.
- Good time management and multitasking skills to meet deadlines, working both independently and as part of a team.
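As a sketch of the Spark query troubleshooting asked for above, the snippet below inspects a physical plan and hints a broadcast join; the table names are hypothetical:

```python
# Inspect a slow join and force a broadcast; table names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("tuning-sketch").getOrCreate()

facts = spark.table("events")       # large fact table (assumed)
dims = spark.table("event_types")   # small dimension table (assumed)

# Broadcasting the small side avoids shuffling the large one.
joined = facts.join(broadcast(dims), "event_type_id")

# explain() prints the physical plan, the first thing to read when a
# query is slow: scans, shuffles, and the chosen join strategy.
joined.explain()
```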
The client is a Machine Learning company based in New Delhi.
Job Responsibilities
- Design machine learning systems
- Research and implement appropriate ML algorithms and tools
- Develop machine learning applications according to requirements
- Select appropriate datasets and data representation methods
- Run machine learning tests and experiments
- Perform statistical analysis and fine-tuning using test results
- Train and retrain systems when necessary
Requirements for the Job
- Bachelor's/Master's/PhD in Computer Science, Mathematics, Statistics, or an equivalent field from a tier-one college, and a minimum of 2 years of overall experience
- Minimum 1 year of experience working as a Data Scientist deploying ML at scale in production
- Experience in machine learning techniques (e.g. NLP, Computer Vision, BERT, LSTM, etc.) and frameworks (e.g. TensorFlow, PyTorch, Scikit-learn, etc.)
- Working knowledge of deploying Python systems (using Flask, TensorFlow Serving); a minimal Flask sketch follows this list
- Previous experience in the following areas is preferred: Natural Language Processing (NLP) using LSTM and BERT; chatbots or dialogue systems; machine translation; text comprehension; text summarization.
- Computer Vision: deep neural networks/CNNs for object detection and image classification, transfer learning pipelines, and object detection/instance segmentation (Mask R-CNN, YOLO, SSD).
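For the Flask deployment point above, a minimal serving sketch might look like this; the model file and request layout are assumptions, not the client's actual stack:

```python
# A bare-bones model-serving endpoint; model.pkl and the request
# layout are hypothetical.
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

with open("model.pkl", "rb") as f:  # a previously trained model (assumed)
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body like {"features": [[1.0, 2.0, 3.0]]}.
    features = request.get_json()["features"]
    predictions = model.predict(features)
    return jsonify({"predictions": predictions.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
```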
Job Description:
The data science team is responsible for solving business problems with complex data. Data complexity can be characterized in terms of volume, dimensionality, and multiple touchpoints/sources. We understand the data, ask fundamental first-principles questions, and apply our analytical and machine learning skills to solve the problem in the best way possible.
Our ideal candidate
The role is a client-facing one, so good communication skills are a must.
The candidate should be able to communicate complex models and analysis in a clear and precise manner.
The candidate would be responsible for:
- Comprehending business problems properly: what to predict, how to build the DV, what value addition he/she is bringing to the client, etc.
- Understanding and analyzing large, complex, multi-dimensional datasets and building features relevant to the business
- Understanding the math behind algorithms and choosing one over another
- Understanding approaches like stacking and ensembling and applying them correctly to increase accuracy (a sketch follows this list)
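As a sketch of the stacking approach mentioned in the last point, here is a minimal scikit-learn example; the dataset and base learners are illustrative choices only:

```python
# Stacking: base learners feed out-of-fold predictions to a meta-learner.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100)),
                ("svc", SVC(probability=True))],
    final_estimator=LogisticRegression(),  # meta-learner on base outputs
    cv=5,  # out-of-fold predictions guard against leakage into the meta-learner
)

print(cross_val_score(stack, X, y, cv=5).mean())
```

Applying stacking "correctly" is mostly about that `cv` argument: the meta-learner must never see predictions made on the data a base learner was trained on.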
Desired technical requirements
- Proficiency with Python and the ability to write production-ready code.
- Experience in PySpark, machine learning, and deep learning
- Big data experience, e.g. familiarity with Spark or Hadoop, is highly preferred
- Familiarity with SQL or other databases.
- KSQL
- Data Engineering spectrum (Java/Spark)
- Spark Scala / Kafka Streaming (a PySpark structured-streaming sketch follows this list)
- Confluent Kafka components
- Basic understanding of Hadoop
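A hedged sketch of the Spark-with-Kafka streaming pattern listed above, written in PySpark (the listing also names Scala); the broker address and topic are placeholders, and the job needs the spark-sql-kafka connector package available at submit time:

```python
# Consume a Kafka topic with Spark Structured Streaming and count
# events per minute. Broker and topic are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kafka-sketch").getOrCreate()

# An unbounded DataFrame over the topic.
stream = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "events")
          .load())

# Kafka values arrive as bytes; cast, then window by event time.
counts = (stream.selectExpr("CAST(value AS STRING) AS value", "timestamp")
          .groupBy(F.window("timestamp", "1 minute"))
          .count())

query = (counts.writeStream.outputMode("complete")
         .format("console").start())
query.awaitTermination()
```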
- Sr. Data Engineer:
Core Skills: Data Engineering, Big Data, PySpark, Spark SQL, and Python
Candidates with a prior Palantir Cloud Foundry or Clinical Trial Data Model background are preferred.
Major accountabilities:
- Responsible for data engineering: Foundry data pipeline creation, Foundry analysis & reporting, Slate application development, reusable code development & management, and integrating internal or external systems with Foundry for high-quality data ingestion.
- Have a good understanding of the Foundry platform landscape and its capabilities.
- Perform the data analysis required to troubleshoot data-related issues and assist in their resolution.
- Define company data assets (data models) and the PySpark and Spark SQL jobs that populate them.
- Design data integrations and the data quality framework.
- Design & implement integration with internal and external systems and the F1 AWS platform using Foundry Data Connector or Magritte agent.
- Collaborate with data scientists, data analysts, and technology teams to document and leverage their understanding of the Foundry integration with different data sources; actively participate in agile work practices.
- Coordinate with the Quality Engineer to ensure that all quality controls, naming conventions, and best practices have been followed.
Desired Candidate Profile:
- Strong data engineering background
- Experience with Clinical Data Model is preferred
- Experience in:
  - SQL Server, Postgres, Cassandra, Hadoop, and Spark for distributed data storage and parallel computing
  - Java and Groovy for our back-end applications and data integration tools
  - Python for data processing and analysis
  - Cloud infrastructure based on AWS EC2 and S3
- 7+ years of IT experience, 2+ years' experience on the Palantir Foundry Platform, and 4+ years' experience with Big Data platforms
- 5+ years of Python and PySpark development experience
- Strong troubleshooting and problem-solving skills
- BTech or Master's degree in Computer Science or a related technical field
- Experience designing, building, and maintaining big data pipeline systems
- Hands-on experience with the Palantir Foundry Platform and Foundry custom app development
- Able to design and implement data integration between Palantir Foundry and external apps based on the Foundry data connector framework
- Hands-on in programming languages, primarily Python, R, Java, and Unix shell scripts
- Hands-on experience with the AWS / Azure cloud platform and stack
- Strong in API-based architecture and concepts; able to do a quick PoC using API integration and development (a sketch follows this list)
- Knowledge of machine learning and AI
- Skill and comfort working in a rapidly changing environment with dynamic objectives and iteration with users
- Demonstrated ability to continuously learn, work independently, and make decisions with minimal supervision
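As a sketch of the quick API-integration PoC skill mentioned above, the snippet below uses the requests library; the endpoint, path, and token are placeholders:

```python
# A quick API-integration PoC pattern; the endpoint and token are
# placeholders, not a real system.
import requests

BASE_URL = "https://api.example.com"  # hypothetical external system

def fetch_records(endpoint: str, token: str) -> list:
    resp = requests.get(
        f"{BASE_URL}/{endpoint}",
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()  # fail loudly in a PoC rather than silently
    return resp.json()

records = fetch_records("v1/datasets", token="REDACTED")
print(len(records), "records fetched")
```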
Responsibilities:
- Exploring and visualizing data to gain an understanding of it, then identifying differences in data distribution that could affect performance when deploying the model in the real world (a sketch of such a check follows this list).
- Verifying data quality, and/or ensuring it via data cleaning.
- Able to adapt and work fast, producing outputs that improve stakeholders' decision-making using ML.
- To design and develop Machine Learning systems and schemes.
- To perform statistical analysis and fine-tune models using test results.
- To train and retrain ML systems and models as and when necessary.
- To deploy ML models in production and manage the cost of cloud infrastructure.
- To develop Machine Learning apps according to client and data scientist requirements.
- To analyze the problem-solving capabilities and use cases of ML algorithms and rank them by how successfully they meet the objective.
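The first responsibility, spotting distribution differences before deployment, could start with a simple two-sample test like this sketch; the data here is synthetic for illustration:

```python
# Compare a feature's training distribution against production data
# with a two-sample Kolmogorov-Smirnov test; the data is synthetic.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)
prod_feature = rng.normal(loc=0.3, scale=1.0, size=5_000)  # shifted

stat, p_value = ks_2samp(train_feature, prod_feature)
# A tiny p-value flags a shift that could degrade the deployed model.
print(f"KS statistic={stat:.3f}, p-value={p_value:.2e}")
```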
Technical Knowledge:
- Worked on real-world problems, solved them using ML and deep learning models deployed in real time, and has some impressive projects to showcase.
- Proficiency in Python and experience working with the Jupyter framework, Google Colab, and cloud-hosted notebooks such as AWS SageMaker, Databricks, etc.
- Proficiency in working with libraries such as scikit-learn, TensorFlow, OpenCV, PySpark, pandas, NumPy, and related libraries.
- Expertise in visualising and manipulating complex datasets.
- Proficiency in working with visualisation libraries such as seaborn, Plotly, matplotlib, etc.
- Proficiency in the linear algebra, statistics, and probability required for Machine Learning.
- Proficiency in ML algorithms, for example gradient boosting, stacked machine learning, classification algorithms, and deep learning algorithms, with experience in hyperparameter tuning of various models and comparing the results of algorithm performance (a tuning sketch follows this list).
- Big data technologies such as the Hadoop stack and Spark.
- Basic use of clouds (VMs, e.g. EC2).
- Brownie points for Kubernetes and task queues.
- Strong written and verbal communication skills.
- Experience working in an Agile environment.
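As a sketch of the hyperparameter-tuning experience the list asks for, here is a minimal grid search over a gradient-boosting model; the dataset is synthetic and the search space is an illustrative choice, not a recommendation:

```python
# Grid search over a gradient-boosting classifier; the data is
# synthetic and the grid is deliberately small.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=2_000, n_features=20, random_state=0)

grid = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid={"n_estimators": [100, 300],
                "learning_rate": [0.05, 0.1],
                "max_depth": [2, 3]},
    cv=5,
    scoring="accuracy",
)
grid.fit(X, y)
print(grid.best_params_, f"cv accuracy={grid.best_score_:.3f}")
```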