Lam Tran
LinkedIn: ltrandata
Phone: (+84) 962007024
Email: lamtran.data@gmail.com
Blog: lam-tran.dev
GitHub: LTranData
Skills
- Confident: Python, SQL, Spark, Polars, Snowflake, AWS, Kafka, MySQL
- Familiar: Java, Scala, GCP, Hadoop, Airflow, Postgres, MongoDB, Dbt, Docker, Impala, StreamSets, GitLab CI, Web Development (ReactJS, Spring Boot), Kubernetes, Helm
Experience
Lead Data Engineer - Techcombank
Feb 2025 - Present
Data Architect - FPT Software
Mar 2023 - Feb 2025
Senior Staff Software Engineer (L5) - Episource
- Own the Salesforce-to-Snowflake sync pipeline for hundreds of tables, meeting the data latency SLA ~100% of the time. Reduce cost by 30% with an S3 mid-layer and speed up the sync several times over with parallel asynchronous Salesforce API calls.
- Develop a data validation script that uses multithreaded processing with Polars and asynchronous HTTP calls for Salesforce data retrieval, reducing the time taken for data quality checks by 84% compared to the legacy tool.
- Share ownership of the Standard Data Pipeline for the Unified Data Platform on Snowflake, a centralized hub that brings together information from various sources and supports business entities such as Claims, Members, Providers, MMR, MOR,… for Medicare and Medicaid membership and risk calculation. Ingest and transform files from external clients into the data platform, and provide data to various value streams within the enterprise. Implement the testing framework for end-to-end functional testing and data quality checks.
- Lead the Enterprise Pipeline teams and guide them toward their goals and objectives.
Data Architect - Sysmex
- Redesign the Analyzer Statistics and Rule Validation micro-batch job workflow, moving it from Glue (Spark) to Lambda (Polars), which cuts processing time per job from 8-10 minutes to seconds and reduces cost by a factor of at least 10.
- Restructure the source code of the Lambda micro-batch jobs and daily Glue jobs into well-organized, modular code aligned with Sonar rules, improving code quality, making unit tests (UTs) easier to write, and reaching ~90% UT coverage in both projects.
- Lead the development of the data project and data engineering teams.
Data Core Team contribution
- Main technical content creator of 5 data workshops at BU level.
- Main interviewer in several technical interviews for the headcount preparation OKR.
- Contribute to the team's international certifications and build a training roadmap for mentees.
- Architecture design for bidding on customer projects.
Big Data Engineer - Giaohangtietkiem
Dec 2020 - Mar 2023
- Develop an ingestion pipeline that streams data from Kafka topics to HDFS using Spark Streaming and Sqoop, syncing around 1,000 tables with a daily data volume of 5 TB and ensuring data quality for near-real-time reports.
- Install, customize, and maintain a Spark Thrift Server v3.3.0 that works with the Cloudera platform, integrates with other services (Kerberos, Ranger, Kudu), and acts as a SQL engine on Hadoop for complex ETL jobs with complex data types. Tune the server to run 40% faster than the old version (v2.4.3).
- Design data marts for each business domain, saving the whole system hours of processing every day (for example, a pick, deliver, and return packages mart that is used in 20+ reports). Use Dbt (Data build tool) as the data transformation tool in the data warehouse and Airflow as the scheduling tool, integrated with Dbt.
- Customize Dbt to work with Impala through a kerberized JDBC connection, since Impala was not officially supported at that time. Create internal data tools, such as a web application that shows all Impala queries for easier monitoring, a RESTful client that uses the SPNEGO protocol to make requests to kerberized Ranger,…
Big Data Engineer - Viettel Group
Mar 2019 - Dec 2020
- Create several ETL jobs using Spark SQL, Spark, Pentaho, and NiFi to ingest, aggregate, and provide data on request for other subsidiaries of Viettel Group, helping those businesses operate smoothly.
- Maintain and extend the formulas that compute telecom subscription charges in the prepaid and postpaid pipelines, which process billions of VND per day. Track down bugs when the daily report shows abnormal revenue.
Education
Hanoi University of Science and Technology
Talent Program of Electronics and Telecommunication
Licenses & Certifications
- AWS Certified Solutions Architect Associate - AWS
- SnowPro® Core Certification - Snowflake
- Machine Learning, Deep Learning Specialization - Coursera
- CEFR C1 547 points - British Council
Awards
- Best Performance Employee - FPT Software
- Employee Of The Year - FPT Software
- Best Employee Of The Quarter - Techcombank