Available for opportunities

Hi, I'm
Sourav Halder

I architect scalable cloud solutions. 
AWS Certified
Data Science
Deep Learning
Development
API Development
Impact Telemetry

System Metrics & Operations

Telemetry ledger of automated sync operations, pipeline efficiency, and service-level metrics.

telemetry_check.sh
sourav@dwh-orchestrator ~ % ./check_pipelines.sh
>>> Loading connection parameters SAP_ERP...
>>> Syncing Delta lake layers (Apache Iceberg)...
>>> Status: 93.75% extraction traffic reduction [OK]
>>> Validating Redshift clusters (14 active marts)...
>>> Checking ServiceNow MTTR integration...
>>> Status: MTTR reduced by 40% [OK]
>>> EventBridge & AWS Lambda active. SLA: 99.5%
--------------------------------------------------
SYSTEM STATUS: SECURE & STABLE. METRICS SYNCED.
sourav@dwh-orchestrator ~ %
Job Count

50+

ETL Pipelines

Active production pipelines extracting SAP ERP, SQL Server, and REST API endpoints into a unified Redshift warehouse.

Service Level

99.5%

Operational SLA

Availability verified via automated event-driven checks, retry logic, and event alert routing.
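As an illustration of the retry logic mentioned above, here is a minimal sketch of retrying a task with exponential backoff. The function name `run_with_retry` and its parameters are hypothetical and chosen for this example; it is not the production implementation.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retry(
    task: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `task`, retrying on any exception with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                # Out of attempts: re-raise so alert routing can pick it up.
                raise
            # Delay doubles each time: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the backoff schedule can be checked in tests without actually waiting.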

Replication Volume

93.75%

Traffic Reduction

Migrating from legacy batch databases to a delta-load layer using Apache Iceberg, avoiding SAP database bottlenecks.

Alert Resolution

-40%

MTTR Improvement

Automated integration linking pipeline alerts to ServiceNow API, replacing manual triaging and routing.
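The alert-to-ticket step above can be sketched roughly as follows. This example only builds the JSON body for an incident record; the HTTP call is omitted, though a body like this would typically be POSTed to the incident table of the ServiceNow Table API. The routing rules, group names, and the `PipelineAlert` type are hypothetical, and a real instance would usually expect the assignment group's sys_id rather than a display name.

```python
import json
from dataclasses import dataclass

# Hypothetical mapping from pipeline-name prefix to an assignment group.
ROUTING_RULES = {
    "sap_": "SAP Data Engineering",
    "sales_": "Sales Analytics Support",
}
DEFAULT_GROUP = "Data Platform Operations"


@dataclass
class PipelineAlert:
    pipeline: str
    error: str
    failed_runs: int


def assignment_group(pipeline: str) -> str:
    """Pick a team by the first matching prefix, else the default group."""
    for prefix, group in ROUTING_RULES.items():
        if pipeline.startswith(prefix):
            return group
    return DEFAULT_GROUP


def build_incident(alert: PipelineAlert) -> dict:
    """Build the JSON body for an incident record."""
    return {
        "short_description": f"Pipeline failure: {alert.pipeline}",
        "description": alert.error,
        "assignment_group": assignment_group(alert.pipeline),
        # Assumption for this sketch: three or more failed runs escalate
        # urgency to "1" (high); anything else stays at "3" (low).
        "urgency": "1" if alert.failed_runs >= 3 else "3",
    }


alert = PipelineAlert("sap_gl_extract", "Glue job timed out", failed_runs=3)
print(json.dumps(build_incident(alert), indent=2))
```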

What I Do

Expertise & Services

Bridging the gap between raw data systems, machine learning models, and secure cloud orchestration.

01 // INFRASTRUCTURE LAYER

Cloud Architecture

Designing and implementing highly scalable, resilient, and cost-effective cloud solutions. Expert in infrastructure automation, serverless computing, and secure networking layouts.

  • Infrastructure as Code
  • High Availability Clustering
  • Serverless Design Patterns
  • Cost Optimization Analytics
AWS, ECS/EC2, Lambda, S3, VPC/IAM, RDS, DynamoDB, CloudWatch
02 // ORCHESTRATION LAYER

Data Engineering

Building production-grade ETL/ELT pipelines and delta lakehouses. Specializing in delta-load mechanics, data warehousing, and automated operational observability.

  • Incremental Delta Loads
  • Medallion Lakehouse Architectures
  • Orchestration & Retry Logic
  • Incident Telemetry Pipelines
AWS Glue, Redshift, Databricks, Apache Iceberg, PostgreSQL
03 // INTELLIGENCE LAYER

Data Science

Extracting actionable insights from complex datasets using advanced mathematics, statistical validation, and intelligent machine learning pipeline orchestration.

  • Predictive Modeling
  • LLM & Agent Architectures
  • Feature Engineering
  • Neural Network Design
Python, TensorFlow, Scikit-Learn, MLflow, LangChain
Professional Career

Professional Experience

Detailed operational domains and systems implemented during my consulting career.

Infosys Limited
Oct 2021 - Present

Senior Associate Consultant

I serve as a Senior Associate Consultant at Infosys Limited, specializing in enterprise data pipeline architecture, serverless cloud orchestration, data lakehouse migrations, and automated systems telemetry.

Contributions Ledger

Key Contributions & Achievements

  • Developed and maintained 50+ production ETL pipelines using Python, SQL, and AWS Glue, orchestrating data extraction from multiple sources (SAP ERP, SQL Server, and REST APIs) into a Redshift data warehouse supporting 14 data marts, ensuring timely data availability for business dashboards and analytics teams.
  • Implemented robust pipeline orchestration with event-driven and schedule-based triggers (Lambda, EventBridge, S3), incorporating automated retry logic and data validation checks; this reduced synchronization failures by 85%, resolved data mismatches, and achieved 99.5% pipeline reliability.
  • Optimized data delivery cycles to support monthly Sales & Operations Planning (S&OP) processes for leadership and regional sales heads, cutting data delays by 60% and ensuring accurate, timely insights for strategic sales planning and forecasting.
  • Re-architected the data extraction pipeline by migrating from direct SQL database extraction to an Apache Iceberg-based live layer with incremental delta loads, achieving a 93.75% reduction in extraction traffic and eliminating performance bottlenecks caused by high-volume ERP replication traffic.
  • Automated incident management by integrating the ServiceNow API for pipeline monitoring, enabling automatic ticket creation and intelligent team assignment for job failures. This decreased mean time to resolution (MTTR) by 40% and eliminated manual triaging overhead.
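The data validation checks mentioned in the list above are not described in detail, so the following is only a sketch of one common approach: reconciling a load by row count, key coverage, and a control total. The field names `id` and `amount`, and the `validate_load` helper itself, are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ValidationResult:
    passed: bool
    issues: list


def validate_load(source_rows, target_rows, key, amount_field):
    """Compare source and target by row count, key set, and control total."""
    issues = []
    if len(source_rows) != len(target_rows):
        issues.append(
            f"row count mismatch: source={len(source_rows)} target={len(target_rows)}"
        )
    # Keys present in the source but never loaded into the target.
    missing = sorted({r[key] for r in source_rows} - {r[key] for r in target_rows})
    if missing:
        issues.append(f"keys missing in target: {missing}")
    # Control total; integer amounts (e.g. cents) avoid float comparison noise.
    source_total = sum(r[amount_field] for r in source_rows)
    target_total = sum(r[amount_field] for r in target_rows)
    if source_total != target_total:
        issues.append(f"total mismatch: source={source_total} target={target_total}")
    return ValidationResult(passed=not issues, issues=issues)


source = [{"id": 1, "amount": 100}, {"id": 2, "amount": 250}]
target = [{"id": 1, "amount": 100}]
result = validate_load(source, target, key="id", amount_field="amount")
print(result.passed, result.issues)
```

A failed result like this one could feed the retry or ticketing steps instead of letting a partial load reach the dashboards.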
Academic Background

Education Records

Academic qualifications, specialized engineering domains, and formal credentials.

02U // JU_ROBOTICS_NODE
M.Tech // Intelligent Automation and Robotics
Jadavpur University
GRAD_SCORE:[ 90.36% ]
01U // BW_POWER_NODE
B.Tech // Electrical Engineering
Brainware Group of Institutions
FINAL_CGPA:[ 8.49 ]
Capabilities

Technical Skills

Organized the way I think about engineering and automation, directly inside a simulated code editor.

data_science.ipynb
# Jupyter Notebook - Machine Learning & AI Workflows
In [1]: # Machine Learning
class MachineLearning:
    NumPy, Pandas, Scikit-Learn, MLflow, Jupyter
In [2]: # Deep Learning & GenAI
class DeepLearningGenAI:
    TensorFlow, Keras, Hugging Face, LangChain, Portkey AI
In [3]: # Visualization
class Visualization:
    Matplotlib, Seaborn, Plotly
Portfolio

Featured Projects

Architecture-first implementations focused on cloud automation, data pipelines, and machine learning structures.

Delta Lakehouse Pipeline
DATA-ENGINEERING // SQL
93.75% Data

SAP ERP → Glue → Iceberg → Redshift

Optimized a mission-critical financial ledger ETL pipeline by introducing partition-aware incremental delta-load extraction, cutting extracted data volume by 93.75%.

AWS Glue, Apache Iceberg, Amazon Redshift, S3, +2
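The incremental delta-load idea behind the Delta Lakehouse Pipeline can be illustrated with a watermark: each run extracts only rows modified since the previous run. This is a plain-Python sketch, not the Glue or Iceberg implementation, and it is not partition-aware. The `modified_at` column and the 16-row table are assumptions, contrived so the arithmetic matches the 93.75% figure (1 - 0.9375 = 1/16); they say nothing about the real data volumes.

```python
from datetime import datetime, timedelta


def extract_delta(rows, watermark):
    """Return the rows changed after `watermark` plus the new watermark."""
    changed = [r for r in rows if r["modified_at"] > watermark]
    # If nothing changed, the watermark stays where it was.
    new_watermark = max((r["modified_at"] for r in changed), default=watermark)
    return changed, new_watermark


base = datetime(2024, 1, 1)
# Sixteen rows, each modified one day after the previous one.
rows = [{"id": i, "modified_at": base + timedelta(days=i)} for i in range(16)]

# The previous run already covered everything up to day 14.
watermark = base + timedelta(days=14)
changed, new_watermark = extract_delta(rows, watermark)

reduction = 1 - len(changed) / len(rows)
print(f"{len(changed)} of {len(rows)} rows extracted, {reduction:.2%} less than a full load")
```

Persisting `new_watermark` between runs is what makes the next extraction start where this one stopped.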
ML-Powered Anomaly Detection
DATA-SCIENCE // DATA-ENGINEERING
40% MTTR

Real-time anomaly pipeline on streaming data

Designed and deployed a streaming telemetry ingest engine that processes high-frequency IoT readings to detect operational anomalies in real-time.

Apache Kafka, Apache Spark, Python, Scikit-Learn, +2
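To give a feel for anomaly detection on a stream of readings, here is a small standard-library sketch using a rolling z-score. This is a deliberately simpler technique swapped in for illustration; it is not the Scikit-Learn model or the Kafka/Spark pipeline from the project, and the class name, window size, and threshold are all assumptions.

```python
from collections import deque
from statistics import mean, stdev


class RollingZScoreDetector:
    """Flag a reading that lies far from the mean of the recent window."""

    def __init__(self, window: int = 5, threshold: float = 3.0):
        self.values = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, x: float) -> bool:
        """Score `x` against the current window, then add it to the window."""
        is_anomaly = False
        if len(self.values) >= 2:  # stdev needs at least two points
            mu = mean(self.values)
            sigma = stdev(self.values)
            if sigma > 0 and abs(x - mu) / sigma > self.threshold:
                is_anomaly = True
        self.values.append(x)
        return is_anomaly


detector = RollingZScoreDetector(window=5, threshold=3.0)
readings = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.1, 25.0, 10.0]
flags = [detector.observe(r) for r in readings]
print([r for r, flagged in zip(readings, flags) if flagged])
```

Note that a flagged reading still enters the window, so a spike temporarily widens the band for the readings that follow; a production detector might exclude flagged values instead.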
Cloud Infrastructure Automation
CLOUD
99.5% Uptime

Terraform + ECS + CI/CD orchestration

Standardized cloud operations across production clusters using declarative infrastructure definitions and secure automated deployment gates.

Terraform, GitHub Actions, AWS ECS, AWS VPC, +2
AWS Three-Tier Web Architecture
CLOUD
100% Infrastructure

Auto-Scaled, Load-Balanced Infrastructure

A public Application Load Balancer routes client traffic to auto-scaling Nginx web servers, which communicate with internal application servers and RDS databases.

AWS, EC2, ELB, VPC, +3
Spotify Big Data Analytics Pipeline
DATA-ENGINEERING // CLOUD
Daily Automated

Serverless Ingest & Aggregation ETL

Built a serverless data pipeline to extract, transform, and load Spotify top songs data into a visual analysis dashboard using AWS serverless analytics services.

AWS Glue, S3, Athena, Amazon QuickSight, +2
DynamoDB Real-Time Audit Table
CLOUD
Real-time Data

NoSQL Database Mutation Capturing

Designed an automated audit trail system for DynamoDB databases, tracking data mutations in real-time for compliance and operational analytics.

AWS Lambda, DynamoDB Streams, S3, Amazon Kinesis, +1
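A mutation-capturing audit trail of this kind usually hinges on a Lambda function that reads DynamoDB Streams records. The sketch below flattens each stream record into an audit entry; the entry layout and the function names are assumptions for this example, and persisting the entries (for instance to S3) is left out. It assumes the stream is configured to include both old and new images.

```python
from datetime import datetime, timezone


def to_audit_entries(event: dict) -> list:
    """Flatten DynamoDB Streams records into audit entries."""
    entries = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        entries.append({
            "action": record["eventName"],      # INSERT, MODIFY, or REMOVE
            "keys": change["Keys"],
            "before": change.get("OldImage"),   # absent for INSERT
            "after": change.get("NewImage"),    # absent for REMOVE
            "changed_at": datetime.fromtimestamp(
                change["ApproximateCreationDateTime"], tz=timezone.utc
            ).isoformat(),
        })
    return entries


def handler(event, context):
    """Lambda entry point; storage of the entries is omitted in this sketch."""
    entries = to_audit_entries(event)
    return {"audited": len(entries)}


# Hand-written sample event in the stream record shape, for local runs.
sample_event = {
    "Records": [{
        "eventName": "MODIFY",
        "dynamodb": {
            "Keys": {"pk": {"S": "user#1"}},
            "OldImage": {"pk": {"S": "user#1"}, "plan": {"S": "free"}},
            "NewImage": {"pk": {"S": "user#1"}, "plan": {"S": "pro"}},
            "ApproximateCreationDateTime": 1700000000,
        },
    }]
}
print(handler(sample_event, None))
```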
WildRydes Serverless App
CLOUD
Serverless Zero-Idle

Complete Serverless App Ingestion

Built an end-to-end serverless ride-sharing web application leveraging AWS serverless services to manage user authentication, ride requests, and backend processing.

AWS Amplify, Amazon Cognito, AWS Lambda, API Gateway, +2
Medallion Data Warehouse Schema
SQL // DATA-ENGINEERING
3-Layer Medallion

Layered Relational Database Design

A data warehousing solution that catalogs and explores relational datasets, using SQL scripts to structure Bronze, Silver, and Gold layers.

SQL, PostgreSQL, Data Modeling, ETL, +1
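The Bronze, Silver, and Gold layering can be shown in miniature with a few SQL statements. To keep the sketch self-contained it runs on an in-memory SQLite database through Python's `sqlite3` module rather than PostgreSQL, and the table names, columns, and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Bronze: raw data as ingested, everything stored as text.
CREATE TABLE bronze_orders (order_id TEXT, customer TEXT, amount TEXT);
INSERT INTO bronze_orders VALUES
    ('1', ' alice ', '10.50'),
    ('2', 'BOB', '4.00'),
    ('2', 'BOB', '4.00'),
    ('3', 'alice', NULL);

-- Silver: typed, trimmed, de-duplicated, rows without an amount dropped.
CREATE TABLE silver_orders AS
SELECT DISTINCT
    CAST(order_id AS INTEGER) AS order_id,
    LOWER(TRIM(customer))     AS customer,
    CAST(amount AS REAL)      AS amount
FROM bronze_orders
WHERE amount IS NOT NULL;

-- Gold: business-level aggregate ready for reporting.
CREATE TABLE gold_customer_revenue AS
SELECT customer, COUNT(*) AS orders, SUM(amount) AS revenue
FROM silver_orders
GROUP BY customer;
""")

gold = conn.execute(
    "SELECT customer, orders, revenue FROM gold_customer_revenue ORDER BY customer"
).fetchall()
print(gold)
```

Each layer reads only from the one before it, so a cleaning rule can change in Silver without touching the raw Bronze data.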
Credentials

Certifications & Recognition

Professional cloud architecture, data engineering, and governance credentials. Click any badge to verify.

  • Databricks Certified Machine Learning Engineer Associate
  • Databricks Certified Data Engineer Professional
  • Databricks Certified Data Engineer Associate
  • AWS Certified Solutions Architect – Associate
  • AWS Certified Developer – Associate
  • AWS Certified AI Practitioner
  • AWS Certified Cloud Practitioner
  • Collibra Solution Architect Certification
  • Collibra Workflow Engineer Certification
  • Collibra Integration Engineer Certification
  • Collibra Data Steward Certification
  • Collibra AI Governance-Ready
  • Infosys RISE INSTA Awards
Connect

Get In Touch

Let's collaborate on data pipelines, cloud architecture, or intelligent software automation.

Communication Details

Contact Information

Location

Kolkata, West Bengal, India

Birthday

July 17, 1996