
Balachandra
Data Engineer with GCP, AWS, Databricks, Snowflake, dbt, Informatica
Skills

Portfolio
Work experience
Senior Data Engineer
Equinox • Full-time
Jun 2022 - Dec 2024 • 2 yrs 6 mos
• Architected end-to-end data solutions leveraging GCP services such as BigQuery, Dataflow, Dataproc, Data Fusion, Pub/Sub, Cloud Functions, and Cloud Composer to build scalable, secure, and cost-optimized data platforms.
• Designed and standardized data integration frameworks to unify data from diverse sources (APIs, flat files, Smartsheets, SharePoint, Kafka) into centralized, analytics-ready datasets.
• Implemented event-driven and streaming architectures using Pub/Sub, GCS triggers, and Cloud Functions to enable near real-time data ingestion and processing.
• Established CI/CD and version-controlled data workflows using GitHub and Dataform to ensure reliable, repeatable, and auditable data deployments.
• Defined orchestration and scheduling strategies with Airflow (Cloud Composer) and Control-M, optimizing job dependencies and resource utilization across workloads.
• Partnered with cross-functional teams to define data governance, quality, and performance standards, enhancing data reliability and driving a measurable 20% increase in business revenue.
Senior Data Engineer
Apartment Geofencing • Full-time
Oct 2019 - Jun 2022 • 2 yrs 8 mos
• Designed and implemented cloud-native ETL pipelines using Dataflow, BigQuery, and Apache Spark (RDD/DataFrame), enabling scalable processing of large datasets and reducing data processing time by up to 40%.
• Developed Lua- and Python-based automation frameworks for staging and dimensional data management in Exasol, improving data consistency, reducing manual intervention, and accelerating data warehouse load cycles.
• Streamlined ETL development through GCP Data Fusion, creating reusable integration templates and reducing development effort by 25%, resulting in faster onboarding of new data sources.
• Performed complex data validation, cleansing, and transformation using PySpark, Pandas, and Spark SQL, improving data accuracy and producing high-quality datasets for analytics and reporting.
• Optimized Spark transformations and BigQuery workloads, improving query performance by 20% and reducing overall pipeline execution times while maintaining SLA compliance.
• Collaborated with business and analytics teams to translate data requirements into scalable solutions, enabling faster access to actionable insights and supporting data-driven decision-making.
• Mentored 10+ junior engineers, conducted code reviews, and established development best practices, improving code quality, reducing production defects, and increasing team productivity.
• Contributed to the successful delivery of multiple data engineering initiatives by driving technical excellence, knowledge sharing, and adherence to data governance standards.
ETL Lead
BI-2 • Full-time
Oct 2013 - Oct 2019 • 6 yrs
• Led the migration and modernization of legacy Informatica workflows to GCP-based data platforms, improving scalability, reducing infrastructure dependency, and accelerating data processing capabilities.
• Designed and implemented Change Data Capture (CDC) frameworks and reusable ETL components in Informatica PowerCenter, enabling efficient incremental data processing and reducing overall pipeline execution times.
• Developed highly parameterized workflows and automated SCD Type 1 and Type 2 processing using MD5-based change detection, reducing development effort by over 40% and ensuring consistent handling of historical data.
• Automated workflow configuration, parameter management, and job scheduling processes, significantly reducing manual intervention, improving operational efficiency, and enhancing batch processing reliability.
• Standardized ETL design patterns and reusable frameworks, accelerating onboarding of new data integrations and reducing maintenance overhead across multiple projects.
• Optimized ETL workflows through performance tuning and parallel processing techniques, resulting in improved throughput and reduced batch window durations.
• Implemented robust monitoring, exception handling, and recovery mechanisms, improving job success rates and reducing production support incidents by 20%.
• Collaborated with business and data stakeholders to streamline data integration processes, improving data accuracy, timeliness, and overall trust in enterprise reporting and analytics platforms.