A Data Engineer with expertise in building data assets (data warehouses, data lakes) through scalable data pipelines, optimized data systems, and efficient data solutions that enable actionable insights for BFSI/eComm. Currently, I am working as a Senior Business Intelligence Engineer at Amazon. Prior to this, I worked as a Senior Consultant (BIE) at Deloitte, SE (DE) at SAS R&D, Data Engineer at Citigroup and Associate Consultant (DE) at Capgemini. I have now spent 9 years working with data. I have also pursued a PG Diploma in Machine Learning & AI from IIIT-Bangalore.
My passion for digging business insights out of raw data gives me an edge in completing tasks swiftly. I'm accountable and highly focused on delivering results, and I tend to go the extra mile to improve the process and the quality of the solution. Being a quick learner, with experience of working both in collaborative teams and independently, has enabled successful project outcomes. I am resilient to change: the world is continuously evolving, and so should we.
I bring expertise across these 8 aspects of Data Engineering & Science, which will help ensure that the products, solutions and services you offer remain robust, on par with industry standards and market-leading.
Key Focus: Acquiring data from diverse sources into a centralized system. Core Activities: Designing and implementing batch and real-time data pipelines. Handling structured, semi-structured, and unstructured data formats. Ensuring reliable, low-latency data ingestion using tools like Apache Kafka, Flume, or AWS Kinesis.
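For illustration, a minimal Python sketch of streaming ingestion into AWS Kinesis might look like the following; the stream name, region and record layout are hypothetical assumptions for the example, not a description of any specific pipeline I have built.

```python
"""Minimal sketch of near-real-time ingestion into AWS Kinesis (illustrative only)."""
import json
import time

import boto3

# The region and stream name below are placeholder assumptions.
kinesis = boto3.client("kinesis", region_name="us-east-1")

def ingest_event(event: dict, stream_name: str = "txn-events") -> None:
    # Partition by customer id so records for one customer stay ordered.
    kinesis.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=str(event["customer_id"]),
    )

if __name__ == "__main__":
    # Simulated trickle of semi-structured transaction events.
    for i in range(3):
        ingest_event({"customer_id": i % 2, "amount": 100 + i, "ts": time.time()})
```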
Data preparation forms the core of any BI and Data Science product. I have been exposed to its different phases: fetching, integration, transformation, cleansing, quality enhancement & standardization, loading & archiving. Data is integrated from different DBs, raw files and other applications to get a unified view of the business; because data is inconsistent across sources, preparation is an integral part of the work. The data is then loaded into a central repository in a DB to be used subsequently for different analytical use cases. I have implemented these steps with SAS coding including macros, SQL, Python, DI Studio (self-service ETL tool), Dataflux Studio (DQ tool), and VA & Tableau (visualization tools). With correct & consistent data, model performance improves. Put simply, data preparation is the process of taking raw data and getting it ready for ingestion into an analytics platform.
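As a simple sketch of the cleansing, standardization and loading steps described above, here is what a small pandas version could look like; the column names, CSV file and SQLite target are placeholders, since the actual implementations used SAS, DI Studio and Dataflux against enterprise sources.

```python
"""A minimal data-preparation sketch in pandas (illustrative only)."""
import sqlite3

import pandas as pd

raw = pd.read_csv("transactions_raw.csv")  # hypothetical raw extract

# Cleansing & standardization: trim text, normalize case, fix types.
raw["customer_id"] = raw["customer_id"].astype(str).str.strip()
raw["country"] = raw["country"].str.upper().replace({"UK": "GB"})
raw["txn_date"] = pd.to_datetime(raw["txn_date"], errors="coerce")

# Quality enhancement: drop duplicates and records failing basic checks.
clean = raw.drop_duplicates(subset=["txn_id"]).dropna(subset=["txn_date", "amount"])
clean = clean[clean["amount"] > 0]

# Load to a central repository table for downstream analytical use cases.
with sqlite3.connect("central_repo.db") as conn:
    clean.to_sql("transactions", conn, if_exists="append", index=False)
```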
So far, I have been part of building an analytics platform, an MIS and Campaign Reporting solution, and a Regulatory Risk Management and Reporting solution, mainly focused on the BFSI industry. I implemented an e-DWH and dashboarding solution from scratch using SAS, SQL, DI Studio and Visual Analytics, which helped generate actionable insights for my client. The MIS and Campaign Reporting was implemented for segmentation, retention, churn and similar metrics; these analyses enabled Product Managers to roll out offers and campaigns. The Regulatory Risk solution measured banking health by calculating required capital and analysing credit defaulters to understand the driving features. The Regulatory Reporting product enabled financial institutions to submit their financial and common reporting to the authorities. These solutions have laid a strong foundation in the BFSI domain and credit risk management.
In the present world, we are removing manual intervention from the technological processes that support business decisions. To that end, I developed a Reporting Framework which automates the MIS reporting of a Credit Card Portfolio and Campaign management. The process starts with fetching transactional data, followed by data preparation, trend analysis and report generation in multiple formats. The main features are OS independence, password-protected content for exchange, dynamic report creation and automated e-mails. At SAS R&D, I built a robust dev-test pipeline to generate and reconcile regulatory reports across different configurations. It interfaced with different applications and servers such as IRM, RFW and Metadata using SAS, SQL, Python & batch scripting, which provided a holistic view of the features impacted by ongoing development.
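A condensed Python sketch of that kind of automated flow (trend analysis, report file, e-mail) is shown below; the SMTP host, addresses and file names are placeholder assumptions, and scheduling plus password protection of the attachment are omitted here.

```python
"""A condensed sketch of an automated MIS reporting flow (illustrative only)."""
import smtplib
from email.message import EmailMessage

import pandas as pd

def build_report(portfolio: pd.DataFrame) -> str:
    # Trend analysis: month-on-month spend per card segment.
    trend = (portfolio
             .groupby(["segment", "month"], as_index=False)["spend"].sum()
             .sort_values(["segment", "month"]))
    path = "mis_report.csv"
    trend.to_csv(path, index=False)  # further formats (Excel, PDF) could be added here
    return path

def email_report(path: str, recipients: list[str]) -> None:
    # Placeholder sender, recipients and SMTP host; not real endpoints.
    msg = EmailMessage()
    msg["Subject"] = "Credit Card Portfolio - MIS Report"
    msg["From"] = "reports@example.com"
    msg["To"] = ", ".join(recipients)
    msg.set_content("Please find the latest MIS report attached.")
    with open(path, "rb") as f:
        msg.add_attachment(f.read(), maintype="text", subtype="csv", filename=path)
    with smtplib.SMTP("smtp.example.com") as server:
        server.send_message(msg)
```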
Key Focus: Storing data in systems optimized for performance, scalability, and cost-efficiency. Core Activities: Setting up and managing data lakes (e.g., S3, Azure Data Lake) and data warehouses (e.g., Snowflake, Redshift, BigQuery). Designing data models and schemas for efficient storage and retrieval. Implementing strategies for data partitioning, indexing, and compression to improve storage and query performance.
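As an illustration of partitioning and compression for efficient retrieval, a minimal pandas/PyArrow sketch could look like this; the local path stands in for an S3 prefix and the column names are assumptions.

```python
"""A minimal sketch of partitioned, compressed storage in a data-lake layout (illustrative)."""
import pandas as pd

events = pd.DataFrame({
    "event_date": ["2024-01-01", "2024-01-01", "2024-01-02"],
    "country":    ["IN", "US", "IN"],
    "amount":     [120.0, 85.5, 42.0],
})

# Partition by date and country and compress with snappy, so that query engines
# can prune partitions instead of scanning the full dataset.
events.to_parquet(
    "datalake/events",  # would be an s3://bucket/events prefix in practice
    partition_cols=["event_date", "country"],
    compression="snappy",
    index=False,
)
```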
Key Focus: Making data available for analysis while ensuring security and compliance. Core Activities: Enabling seamless access to data through APIs, BI tools, or direct queries. Implementing role-based access controls (RBAC) and encryption. Ensuring compliance with data privacy regulations (e.g., GDPR, CCPA). Monitoring data pipelines to maintain reliability and address bottlenecks.
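To give a flavour of role-based access control and data minimisation, here is a toy Python sketch; the roles, permissions and masking rule are illustrative assumptions, whereas in practice this is enforced through warehouse grants, IAM policies and the BI layer.

```python
"""A toy sketch of role-based access control in front of a data API (illustrative only)."""

# Hypothetical role-to-permission mapping.
ROLE_PERMISSIONS = {
    "analyst":  {"read"},
    "engineer": {"read", "write"},
    "auditor":  {"read_masked"},
}

def fetch_customer_row(role: str, row: dict) -> dict:
    perms = ROLE_PERMISSIONS.get(role, set())
    if "read" in perms:
        return row
    if "read_masked" in perms:
        # GDPR/CCPA-style minimisation: mask direct identifiers before returning.
        return {**row, "email": "***", "phone": "***"}
    raise PermissionError(f"role '{role}' cannot read customer data")

print(fetch_customer_row("auditor", {"id": 1, "email": "a@b.com", "phone": "123"}))
```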
Key Focus: To improve the ongoing DE process in terms of efficiency, scalability, infra maintenance, customer support and documentation. Core Activities: Building generic utilities for repetitive tasks like cluster cleanup, tackling bottlenecks and long-running queries. Writing well-commented code to explain the business logic. Regular broadcasts to downstream users regarding schema updates, business logic updates and data quality issues. Regular updates to the documentation repo for all re-designs, new programs, proofs-of-concept, etc.
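A small example of the "generic utility" idea, here a cleanup of stale scratch files, is sketched below; the directory, file pattern and age threshold are hypothetical.

```python
"""A small utility sketch of a cluster-cleanup style task (illustrative only)."""
import time
from pathlib import Path

def clean_stale_files(root: str, pattern: str = "*.tmp", max_age_days: int = 7) -> int:
    """Delete matching files older than max_age_days; return how many were removed."""
    cutoff = time.time() - max_age_days * 86400
    removed = 0
    for path in Path(root).rglob(pattern):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed += 1
    return removed

if __name__ == "__main__":
    print(clean_stale_files("/tmp/etl_scratch"))  # hypothetical scratch area
```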
I have an understanding of customer analytics, using behavioural and demographic information to run customer segmentation and campaigns. I have worked with techniques like linear & logistic regression, random forests and clustering to solve prediction and classification use-cases such as sales prediction, loan default, churn prediction and customer segmentation for cross-selling, loyalty rewards programs and exclusive campaigns. During data prep, hundreds of datasets and libraries are analysed for their metadata through automation.
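As a compact illustration of the classification side, here is a scikit-learn sketch of a churn model on synthetic data; the features and labelling rule are fabricated purely for demonstration and do not reflect any client data.

```python
"""A compact churn-classification sketch with scikit-learn (illustrative only)."""
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
# Synthetic behavioural/demographic features: age, monthly spend, months inactive.
X = np.column_stack([
    rng.integers(18, 70, n),
    rng.exponential(500, n),
    rng.integers(0, 24, n),
])
# Toy label: churn is more likely with low spend and long inactivity.
churn = ((X[:, 1] < 300) & (X[:, 2] > 12)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, churn, test_size=0.25, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```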