
Lam Tran

Skills

  • Confident: Python, SQL, Spark, Polars, Snowflake, AWS, Databricks, Kafka
  • Familiar: Java, Scala, dbt, Hadoop, Airflow, MySQL, Postgres, MongoDB, Docker, Impala, StreamSets, GitLab CI, Web Development (ReactJS, Spring Boot), Kubernetes, Helm

Experience

Lead Data Engineer - Techcombank
Feb 2025 - Present
Data Architect - FPT Software
Mar 2023 - Feb 2025
Senior Staff Software Engineer (L5) - Episource
  • Owned and re-engineered the Salesforce-to-Snowflake ingestion pipeline, cutting processing cost by 30% compared with the legacy pipeline while improving robustness and reducing processing time severalfold.
  • Developed a custom data validation framework that retrieves Salesforce data via asynchronous HTTP calls and runs data quality checks in Polars, making validation twice as fast as the existing framework (a sketch of the pattern follows this role's bullets).
  • Drove the delivery of the Standard Data Pipeline on an enterprise-grade Snowflake data platform and designed data models for business entities such as Claims, Members, Providers, MMR, and MOR for risk calculation, directly enabling high-stakes Medicaid and Medicare analytics across the organization.
  • Led high-impact engineering teams in the design and deployment of enterprise-grade pipelines toward strategic objectives.
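A minimal sketch of the retrieval-plus-validation pattern mentioned above: SOQL queries are issued concurrently against the Salesforce REST API and the results are checked with Polars. The API version, queries, and quality checks are illustrative assumptions, not the production framework.

```python
import asyncio

import aiohttp
import polars as pl

API_VERSION = "v58.0"  # assumed Salesforce REST API version


async def fetch_records(session: aiohttp.ClientSession, base_url: str, soql: str) -> list[dict]:
    """Run one SOQL query against the Salesforce REST query endpoint."""
    async with session.get(
        f"{base_url}/services/data/{API_VERSION}/query",
        params={"q": soql},
    ) as resp:
        resp.raise_for_status()
        payload = await resp.json()
        return payload["records"]


async def fetch_all(base_url: str, token: str, queries: list[str]) -> pl.DataFrame:
    """Issue all SOQL queries concurrently and stack the results into one DataFrame."""
    headers = {"Authorization": f"Bearer {token}"}
    async with aiohttp.ClientSession(headers=headers) as session:
        batches = await asyncio.gather(
            *(fetch_records(session, base_url, q) for q in queries)
        )
    return pl.concat([pl.DataFrame(batch) for batch in batches if batch])


def validate(df: pl.DataFrame) -> pl.DataFrame:
    """Example data-quality checks: nulls in the key column and duplicate record IDs."""
    return pl.DataFrame(
        {
            "rows": [df.height],
            "null_ids": [df["Id"].null_count()],
            "duplicate_ids": [df.height - df["Id"].n_unique()],
        }
    )


if __name__ == "__main__":
    soql_queries = ["SELECT Id, Name FROM Account", "SELECT Id, Name FROM Contact"]  # hypothetical queries
    records = asyncio.run(fetch_all("https://example.my.salesforce.com", "TOKEN", soql_queries))
    print(validate(records))
```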
Data Architect - Sysmex
  • Re-architected micro-batch workflows, migrating them from AWS Glue (Spark) to AWS Lambda (Polars), slashing job latency from 10 minutes to sub-second and reducing cloud costs by 90% (see the sketch after this role's bullets).
  • Systematized a modular, high-quality codebase for Lambda and Glue pipelines, aligning with Sonar standards to achieve ~95% unit test coverage and significantly improving maintainability.
  • Steered the technical roadmap and execution for data engineering teams, ensuring high-velocity delivery.
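A minimal sketch of the Glue-to-Lambda pattern referenced above: a Lambda handler that processes one micro-batch with Polars instead of a Spark job. Bucket names, column names, and the aggregation are illustrative assumptions.

```python
import boto3
import polars as pl

s3 = boto3.client("s3")


def handler(event, context):
    """Triggered per micro-batch (e.g. by an S3 put event); reads, transforms, writes Parquet."""
    record = event["Records"][0]["s3"]
    src = f"s3://{record['bucket']['name']}/{record['object']['key']}"

    df = (
        pl.scan_parquet(src)                      # lazy scan keeps memory usage small
        .filter(pl.col("status") == "COMPLETED")  # hypothetical business filter
        .group_by("device_id")
        .agg(pl.col("reading").mean().alias("avg_reading"))
        .collect()
    )

    out_path = "/tmp/batch.parquet"
    df.write_parquet(out_path)
    # Assumed curated-layer bucket and prefix.
    s3.upload_file(out_path, "curated-bucket", f"aggregates/{record['object']['key']}")
    return {"rows_written": df.height}
```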
Data Core Team contributions
  • Demonstrated technical expertise by delivering 5 data workshops at the Business Unit level.
  • Conducted 30+ technical interviews.
  • Formalized the technical onboarding and training roadmap for associate engineers, accelerating time-to-productivity for new hires.
  • Architected technical solutions for high-stakes client proposals.
Big Data Engineer - Giaohangtietkiem
Dec 2020 - Mar 2023
  • Owned high-velocity ingestion pipelines streaming 5TB of data daily from Kafka to HDFS via Spark Streaming, ensuring data integrity for critical near-real-time reporting across 1,000+ tables (a sketch of the pattern follows this role's bullets).
  • Engineered a high-performance analytical SQL engine by deploying a customized Spark Thrift Server v3.3.0 on Cloudera data platform, achieving a 40% performance increase over legacy versions while hardening security through Kerberos and Ranger integrations.
  • Designed data marts for each business domain, for example a pick/deliver/return packages mart used in 20+ reports, cutting end-to-end processing time by hours every day. Used dbt (data build tool) for transformations in the data warehouse and Airflow for scheduling, integrated with dbt.
  • Bridged open-source gaps by building a customized dbt-impala adapter over a Kerberized JDBC connection before an official adapter existed. Engineered internal tools such as a web application that surfaces all Impala queries for easier monitoring and a RESTful client that uses the SPNEGO protocol to call Kerberized Ranger APIs.
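A minimal sketch of the Kafka-to-HDFS ingestion mentioned above, written here with Spark Structured Streaming. The topic name, payload schema, and HDFS paths are illustrative assumptions rather than the production pipeline.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("kafka_to_hdfs_ingest").getOrCreate()

# Hypothetical payload schema for one topic.
schema = StructType([
    StructField("order_id", StringType()),
    StructField("status", StringType()),
    StructField("updated_at", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")  # assumed broker
    .option("subscribe", "orders")                       # assumed topic
    .option("startingOffsets", "latest")
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("payload"))
    .select("payload.*")
)

query = (
    events.writeStream.format("parquet")
    .option("path", "hdfs:///warehouse/raw/orders")              # assumed landing path
    .option("checkpointLocation", "hdfs:///checkpoints/orders")  # enables exactly-once sink writes
    .trigger(processingTime="1 minute")
    .start()
)
query.awaitTermination()
```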
Big Data Engineer - Viettel Group
Mar 2019 - Dec 2020
  • Owned and maintained prepaid and postpaid data pipelines processing billions of VND per day, and traced down bugs whenever the daily report showed abnormal revenue.
  • Orchestrated ETL jobs using SparkSQL, Spark, Pentaho, and NiFi to streamline cross-subsidiary data integration, enabling seamless operational continuity across Viettel Group (a sketch of a representative SparkSQL job follows).
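A minimal sketch of a SparkSQL-style ETL job of the kind described above: join charging events with subscriber data and write a daily revenue summary. Table names, columns, and the output path are illustrative assumptions.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("daily_revenue_etl")
    .enableHiveSupport()
    .getOrCreate()
)

# Hypothetical source tables: charging_events and subscribers.
daily_revenue = spark.sql("""
    SELECT s.subsidiary,
           c.charge_date,
           SUM(c.amount_vnd) AS revenue_vnd
    FROM charging_events c
    JOIN subscribers s ON s.msisdn = c.msisdn
    WHERE c.charge_date = date_sub(current_date(), 1)
    GROUP BY s.subsidiary, c.charge_date
""")

(daily_revenue.write
    .mode("overwrite")
    .partitionBy("charge_date")
    .parquet("hdfs:///warehouse/marts/daily_revenue"))  # assumed output path
```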

Education

Hanoi University of Science and Technology
Talent Program of Electronics and Telecommunication

Licenses & Certifications

Awards