Bengio: A Neural Probabilistic Language Model
Introduction: This time, we will skim through Yoshua Bengio’s seminal paper A Neural Probabilistic Language Model (2003), which laid the foundation of: Statistical language modeling that addresse...
Introduction: Probabilistic language models are trained on the statistics of the training corpus. However, no matter how large the training corpus is, there is always a possibility that the mod...
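To make that sparsity problem concrete, here is a minimal count-based sketch (the toy corpus and the words queried are hypothetical, not taken from the post): a purely statistical bigram model assigns zero probability to any word pair it never saw in training, however plausible the pair is.

```python
from collections import Counter

# Hypothetical toy corpus; a real corpus has the same problem at scale.
corpus = "the cat sat on the mat".split()

# Count adjacent word pairs (bigrams) and how often each context word occurs.
bigram_counts = Counter(zip(corpus, corpus[1:]))
context_counts = Counter(corpus[:-1])

def bigram_prob(w1, w2):
    # P(w2 | w1), estimated purely from corpus statistics.
    if context_counts[w1] == 0:
        return 0.0
    return bigram_counts[(w1, w2)] / context_counts[w1]

print(bigram_prob("the", "cat"))  # 0.5 (observed in training)
print(bigram_prob("the", "dog"))  # 0.0 (plausible, but never observed)
```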
Introduction: In our previous post, we explored a bigram language model that predicts the next character in a sequence based on probability distributions. At the heart of this model was the negativ...
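As a refresher, here is a minimal sketch of a negative log-likelihood computation (the probability values are made up for illustration; this is not the previous post's exact code):

```python
import torch

# Illustrative probabilities the model assigns to the correct next
# character at four positions in a sequence.
probs = torch.tensor([0.2, 0.6, 0.1, 0.7])

# Negative log-likelihood: log each probability, average, negate.
# Assigning low probability to a correct target inflates the loss.
nll = -probs.log().mean()
print(nll)  # tensor(1.1949)
```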
Introduction: Today, we will build a Bigram Language Model that takes in a text file as training data and generates output text similar to the training data. More specifically, this post is ...
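The core idea, as a condensed sketch (a hypothetical minimal version, not the post's full code): count every adjacent character pair in the training text, normalize the counts into probabilities, and sample the next character from them.

```python
import torch

text = "hello world"  # stand-in for the training text file's contents
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}

# Count every adjacent character pair in the training text.
N = torch.zeros((len(chars), len(chars)))
for a, b in zip(text, text[1:]):
    N[stoi[a], stoi[b]] += 1

# Normalize each row into a probability distribution over next characters
# (clamp avoids division by zero for characters with no observed successor).
P = N / N.sum(dim=1, keepdim=True).clamp(min=1)

# Sample a next character given the context 'l'.
ix = stoi["l"]
next_ix = torch.multinomial(P[ix], num_samples=1).item()
print(itos[next_ix])  # e.g. 'l', 'o', or 'd', drawn from P('l' -> .)
```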
Introduction: Broadcasting is a fundamental feature in PyTorch that enables element-wise operations between tensors of different shapes. When performing these operations, PyTorch automatically expa...
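A quick sketch of the rule in action (the shapes and values are chosen purely for illustration):

```python
import torch

# A (3, 1) column and a (1, 4) row: neither shape matches the other,
# but broadcasting expands both to (3, 4) before the element-wise add.
col = torch.tensor([[1.0], [2.0], [3.0]])       # shape (3, 1)
row = torch.tensor([[10.0, 20.0, 30.0, 40.0]])  # shape (1, 4)

out = col + row
print(out.shape)  # torch.Size([3, 4])
print(out[0])     # tensor([11., 21., 31., 41.])
```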
Introduction: The past two posts have laid the groundwork for understanding the mathematical underpinnings of neural networks. In each post, we briefly covered: Gradient and Derivative: The conce...
Introduction: In our previous post, we explored the fundamental concept of derivatives and their application in neural networks. We manually performed backpropagation using the chain rule, adjustin...
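The flavor of that manual process, as a tiny sketch (the expression and values are illustrative, not the post's exact example):

```python
# Manual backpropagation through L = (a * b + c) ** 2 via the chain rule.
a, b, c = 2.0, -3.0, 10.0
d = a * b + c        # d = 4.0
L = d ** 2           # L = 16.0

# Chain rule, applied backwards from the output:
dL_dd = 2 * d        # dL/dd = 8.0
dL_da = dL_dd * b    # dL/da = dL/dd * dd/da = 8 * (-3) = -24.0
dL_db = dL_dd * a    # dL/db = 8 * 2 = 16.0
dL_dc = dL_dd * 1    # dL/dc = 8.0

# Nudge each input against its gradient so L decreases (a gradient step).
lr = 0.01
a, b, c = a - lr * dL_da, b - lr * dL_db, c - lr * dL_dc
```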
Introduction: This post revisits the fundamental concepts of derivatives and highlights their crucial role in training neural networks. We will begin by methodically calculating the derivative of a...
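For instance, a derivative can be estimated numerically straight from its limit definition (the example function here is arbitrary, not necessarily the one the post uses):

```python
# Numerical estimate of f'(x) from the limit definition (f(x+h) - f(x)) / h.
def f(x):
    return 3 * x**2 - 4 * x + 5

h = 1e-6
x = 3.0
approx = (f(x + h) - f(x)) / h
print(approx)  # ~14.0, matching the analytic derivative f'(x) = 6x - 4 at x = 3
```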