Blog | Lam Tran

MySQL series - Multiversion concurrency control

October 7, 2022 · 3 min read

Data Engineer

Storage engines usually do not rely on a simple row-locking mechanism. To achieve good performance under highly concurrent reads and writes, they implement row locking with some added sophistication, and the most commonly used method is multiversion concurrency control (MVCC).

MVCC Overall

MySQL series - Transaction In MySQL

October 6, 2022 · 6 min read

Lam Tran

Data Engineer

Poster

The next article in the MySQL series is about transactions, a very common operation in MySQL in particular and in relational databases in general. Let's go to the article.

MySQL series - MySQL Architecture Overview

October 4, 2022 · 5 min read

Lam Tran

Data Engineer

Hello everyone. Recently, I did some research on MySQL because I think anyone doing data engineering should go in depth with at least one relational database. Once you have a deep understanding of one RDBMS, you can easily learn others, since they have many similarities. The next few posts will form a series about MySQL, and this is the first article.

Create A Data Streaming Pipeline With Spark Streaming, Kafka And Docker

September 11, 2022 · 9 min read

Lam Tran

Data Engineer

Architecture

Hi guys, I'm back after a long time without writing anything. Today, I want to share how to create a Spark Streaming pipeline that consumes data from Kafka, with everything built on Docker.

Create A Standalone Spark Cluster With Docker

January 1, 2022 · 7 min read

Lam Tran

Data Engineer

Cluster Overview

Lately, I've spent a lot of time teaching myself how to build Hadoop clusters, Spark, Hive integration, and more. This article shows how to build a Spark cluster for data processing using Docker, with 1 master node and 2 worker nodes, running in standalone mode (upcoming articles may cover a Hadoop cluster with YARN as the integrated resource manager). Let's go to the article.

Some AI/ML interview questions

July 24, 2021 · 9 min read

Lam Tran

Data Engineer

Intro

Recently, AI/ML has become a trend: everyone, everywhere, is doing AI. Students are flocking to study AI, and universities are gradually opening courses on machine learning, artificial intelligence, and computer vision to "catch up".

Receptive field in computer vision

July 24, 2021 · 4 min read

Lam Tran

Data Engineer

Convolution

In this article, I want to talk about the receptive field, a very important concept in computer vision that anyone studying the field needs to know in order to explain why people want to build deeper networks. Let's go to the article.

Basic linear algebra - Part 1

July 10, 2021 · 26 min read

Lam Tran

Data Engineer

Linear Algebra

What follows is a series of articles on linear algebra, which I relearned by reading the book Mathematics for Machine Learning while studying Machine Learning and AI. This is the first part of the series.

Basic probability and statistics

July 4, 2021 · 10 min read

Lam Tran

Data Engineer

Probability

This article reviews some basic concepts in probability. It will not contain the very complex, fast-paced math you meet at school. Instead, the content focuses on the probability knowledge that supports artificial intelligence and data statistics.

AVL Tree, AVL Sorting Algorithm

February 24, 2021 · 8 min read

Lam Tran

Data Engineer

Intuition

In the previous post, I talked about the binary search tree, whose search, insert, and delete operations run in logarithmic time ( ${\Theta(\log n)}$ ) in the average case. In this article, I will talk about the AVL tree, a type of binary search tree that guarantees the same ${\Theta(\log n)}$ time complexity for these operations in all cases.
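
To make the "in all cases" claim concrete, here is a minimal sketch of AVL insertion in Python. It is not the code from the post: the names `Node`, `insert`, `rotate_left`, `rotate_right`, and `inorder` are chosen only for illustration, and deletion is left out. Keys are inserted in sorted order, the input that turns a plain binary search tree into a chain.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    height: int = 1  # height of the subtree rooted here (a leaf has height 1)


def height(node: Optional[Node]) -> int:
    return node.height if node else 0


def update(node: Node) -> None:
    node.height = 1 + max(height(node.left), height(node.right))


def rotate_right(y: Node) -> Node:
    # The left child x becomes the new subtree root; y becomes its right child.
    x = y.left
    y.left, x.right = x.right, y
    update(y)
    update(x)
    return x


def rotate_left(x: Node) -> Node:
    # The right child y becomes the new subtree root; x becomes its left child.
    y = x.right
    x.right, y.left = y.left, x
    update(x)
    update(y)
    return y


def insert(node: Optional[Node], key: int) -> Node:
    # Ordinary BST insertion, then rebalancing on the way back up.
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    else:
        return node  # duplicates are ignored in this sketch
    update(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:  # left subtree is too tall
        if key > node.left.key:  # left-right case: rotate the child first
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if balance < -1:  # right subtree is too tall
        if key < node.right.key:  # right-left case: rotate the child first
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


def inorder(node: Optional[Node]) -> list:
    # In-order traversal returns the keys in sorted order.
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


# Sorted insertion: the worst case for a plain binary search tree.
root = None
for key in range(1, 1024):
    root = insert(root, key)
print(height(root))
```

With the same 1023 sorted keys, an unbalanced binary search tree would end up with height 1023, while the AVL height stays within roughly $1.44 \log_2 n$, so a lookup visits only a handful of nodes.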