Definition

MobileNet is a lightweight deep Convolutional Neural Network architecture.

Architecture

MobileNet V1

MobileNet v1 is built on depthwise separable convolutions, each of which consists of two operations: a depthwise convolution and a pointwise convolution. This factorization significantly reduces computational cost and model size while largely maintaining model performance.
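
As a rough illustration of the savings, the sketch below counts multiply-add operations for a standard convolution and for a depthwise separable one. The function names and layer sizes are assumptions chosen for the example, not values from a specific MobileNet layer.

```python
def standard_conv_madds(k, c_in, c_out, h, w):
    """Multiply-adds for a k x k standard convolution producing an h x w map."""
    return k * k * c_in * c_out * h * w


def separable_conv_madds(k, c_in, c_out, h, w):
    """Multiply-adds for a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in * h * w   # one k x k filter per input channel
    pointwise = c_in * c_out * h * w   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative layer sizes (assumed for this example).
k, c_in, c_out, h, w = 3, 64, 128, 56, 56
standard = standard_conv_madds(k, c_in, c_out, h, w)
separable = separable_conv_madds(k, c_in, c_out, h, w)
ratio = separable / standard  # algebraically equal to 1 / c_out + 1 / k**2

print(f"standard:  {standard:,}")
print(f"separable: {separable:,}")
print(f"ratio:     {ratio:.3f}")
```

For a $3 \times 3$ kernel the ratio is $1/c_{out} + 1/9$, so the separable version needs roughly 8 to 9 times fewer multiply-adds when the number of output channels is large.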

Inspired by the grouped convolutions of ResNeXt, the depthwise convolution applies a single filter to each input channel, so it aggregates spatial information only. The pointwise convolution uses $1 \times 1$ convolutions to combine the outputs of the depthwise step, so it mixes information across channels.
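
The two steps can be sketched in plain Python on nested lists. This is a minimal illustration, assuming stride 1, no padding, and no bias; the function names and the toy input are hypothetical, and a real model would use a tensor library instead.

```python
def depthwise_conv(x, filters):
    """Apply one k x k filter per channel (stride 1, no padding).

    x:       list of C channels, each an H x W list of lists
    filters: list of C filters, each a k x k list of lists
    """
    out = []
    for channel, kernel in zip(x, filters):
        k = len(kernel)
        h_out = len(channel) - k + 1
        w_out = len(channel[0]) - k + 1
        out.append([
            [
                sum(channel[i + di][j + dj] * kernel[di][dj]
                    for di in range(k) for dj in range(k))
                for j in range(w_out)
            ]
            for i in range(h_out)
        ])
    return out


def pointwise_conv(x, weights):
    """1 x 1 convolution: each output channel is a weighted sum of input channels.

    weights: list of C_out rows, each holding C_in coefficients
    """
    h, w = len(x[0]), len(x[0][0])
    return [
        [[sum(wc * x[c][i][j] for c, wc in enumerate(row)) for j in range(w)]
         for i in range(h)]
        for row in weights
    ]


# Two 3 x 3 input channels, one 2 x 2 filter per channel, three output channels.
x = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
]
filters = [
    [[1, 0], [0, 1]],  # sums the main diagonal of each 2 x 2 patch
    [[1, 1], [1, 1]],  # sums every value in each 2 x 2 patch
]
weights = [[1, 0], [0, 1], [1, 1]]

spatial = depthwise_conv(x, filters)      # 2 channels of size 2 x 2
mixed = pointwise_conv(spatial, weights)  # 3 channels of size 2 x 2
print(spatial)
print(mixed)
```

Note that the depthwise step never reads from more than one channel at a time, and the pointwise step never reads from more than one spatial position at a time.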

MobileNet V2

MobileNet v2 introduced the inverted residual block, which consists of three layers: an expansion layer, a depthwise convolution, and a projection layer. The expansion layer expands the input to a higher dimension, $h \times w \times c \to h \times w \times kc$. The projection layer reduces the channel count again, $h \times w \times kc \to h \times w \times c'$, where $c' \ll kc$. To prevent information loss, the narrow projection layer uses a linear activation instead of ReLU, and the other layers use ReLU6.
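
The shape changes through the block can be traced with a small helper. This is only a sketch: the function names, the $3 \times 3$ depthwise kernel, stride 1, and the example sizes are assumptions made for illustration.

```python
def relu6(x):
    """ReLU clipped at 6."""
    return min(max(x, 0.0), 6.0)


def inverted_residual_shapes(h, w, c, k, c_out):
    """Trace (height, width, channels) through the three layers at stride 1.

    k is the expansion factor and c_out plays the role of c' in the text.
    """
    return [
        ("input", (h, w, c)),
        ("expansion 1x1 + ReLU6", (h, w, k * c)),
        ("depthwise 3x3 + ReLU6", (h, w, k * c)),
        ("projection 1x1, linear", (h, w, c_out)),
    ]


# Illustrative sizes (assumed): a 56 x 56 x 24 input with expansion factor 6.
shapes = inverted_residual_shapes(56, 56, 24, 6, 24)
for name, shape in shapes:
    print(f"{name:<24} {shape}")
```

The block is wide in the middle and narrow at both ends, which is the reverse of a classic bottleneck residual block and the reason for the name "inverted".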

MobileNet V3

MobileNet v3 adds a squeeze-and-excitation (SE) block inside the inverted residual block.
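
For readers unfamiliar with SE blocks, here is a minimal plain-Python sketch of the general idea: squeeze each channel to a scalar, compute a per-channel gate, and rescale the channels. The function name, the omission of biases, and the toy weights are assumptions; the gate uses the standard sigmoid of the original SE design.

```python
import math


def se_block(x, w1, w2):
    """Squeeze-and-excitation on x, a list of C channels (each H x W).

    w1: C_mid x C weights of the reduction layer (followed by ReLU)
    w2: C x C_mid weights of the expansion layer (followed by sigmoid)
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation: two small fully connected layers produce one gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[[g * v for v in row] for row in ch] for ch, g in zip(x, gates)]


# Toy input (assumed): two 2 x 2 channels, reduced to one hidden unit.
x = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
w1 = [[0.5, 0.5]]
w2 = [[1.0], [-1.0]]
print(se_block(x, w1, w2))
```

With these weights the first channel receives a gate above 0.5 and the second a gate below 0.5, so the block amplifies one channel relative to the other.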

The sigmoid function used in the SE block is replaced with the computationally cheaper hard sigmoid, $\frac{\text{ReLU6}(x + 3)}{6} \approx \sigma(x)$. The ReLU used in MobileNet v2 is likewise replaced with the hard swish activation function, $\text{h-swish}(x) = x \frac{\text{ReLU6}(x + 3)}{6} \approx x \sigma(x)$.
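
The two approximations are easy to check numerically. The sketch below implements the formulas above with the standard library only; the function names are chosen for this example.

```python
import math


def relu6(x):
    """ReLU clipped at 6."""
    return min(max(x, 0.0), 6.0)


def hard_sigmoid(x):
    """Piecewise-linear approximation of the sigmoid: ReLU6(x + 3) / 6."""
    return relu6(x + 3.0) / 6.0


def hard_swish(x):
    """Approximation of swish: x * ReLU6(x + 3) / 6."""
    return x * hard_sigmoid(x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"x={x:+.1f}  sigmoid={sigmoid(x):.3f}  hard_sigmoid={hard_sigmoid(x):.3f}  "
          f"swish={x * sigmoid(x):+.3f}  hard_swish={hard_swish(x):+.3f}")
```

The hard variants need only an addition, a clip, and a multiplication, with no exponential, which is what makes them cheaper on mobile hardware.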

The model architecture is optimized with neural architecture search (NAS), an AutoML technique.
