Neural Network Packages


Image Dimension

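| | Keras | Tensorflow | PyTorch |
|---|---|---|---|
| Dimension | (B,H,W,C) | (B,H,W,C) | (B,C,H,W) |

Here B is the batch size, H the height, W the width and C the number of channels. The Keras entry is the default `channels_last` layout.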
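Moving an image batch between the TensorFlow and PyTorch layouts is a permutation of axes. The following is a minimal sketch, assuming a batch of 8 RGB images of size 32x48; the tensor names are only illustrative.

```python
import torch

# A batch in TensorFlow layout: (B, H, W, C)
nhwc = torch.zeros(8, 32, 48, 3)

# Move the channel axis forward to get the PyTorch layout: (B, C, H, W)
nchw = nhwc.permute(0, 3, 1, 2).contiguous()

# And back again to (B, H, W, C)
back = nchw.permute(0, 2, 3, 1)

print(nchw.shape)  # torch.Size([8, 3, 32, 48])
print(back.shape)  # torch.Size([8, 32, 48, 3])
```

`permute` only changes the strides, so `.contiguous()` is called here to get a tensor whose memory matches the new layout.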

Procedure for replicating a model: read the input parser → main → network → rebuild the network layer by layer, checking each layer with a fake input of known dimension → write the loss → complete the dataloader → optionally add TensorBoard → complete the training procedure, then train and deploy.
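The "fake input" step can be sketched as follows. This is only an illustration: the two-convolution network and the 64x64 input are assumptions, not part of any particular model.

```python
import torch
import torch.nn as nn

# A small hypothetical network, built layer by layer
layers = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
)

# Fake input of known dimension (B, C, H, W)
x = torch.randn(1, 3, 64, 64)

shapes = []
with torch.no_grad():
    for layer in layers:
        x = layer(x)
        shapes.append(tuple(x.shape))
        # Print the output shape after every layer to compare with the reference model
        print(type(layer).__name__, tuple(x.shape))
```

If a printed shape differs from the one in the model being replicated, the mismatch is located at that layer.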

Installation

Installing PyTorch is easier than installing tensorflow-gpu. Be aware that TensorFlow does not usually support the latest version of CUDA. For example, at the time of writing, PyTorch supports CUDA 11, but TensorFlow only supports CUDA 10.1.

Tensorflow

Getting TensorFlow to work with a GPU can be complicated. Pay attention to the software requirements of tensorflow-gpu.

For example, tensorflow-gpu == 2.3.0 on Ubuntu 18.04 requires CUDA 10.1, libcudnn 7.6, an NVIDIA GPU driver > 418.x, etc. See the official website, Tensorflow-GPU, for details.


# Add NVIDIA package repositories
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.1.243-1_amd64.deb
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo dpkg -i cuda-repo-ubuntu1804_10.1.243-1_amd64.deb
sudo apt-get update
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt-get update

# Install NVIDIA driver
sudo apt-get install --no-install-recommends nvidia-driver-450
# Reboot. Check that GPUs are visible using the command: nvidia-smi

# Install development and runtime libraries (~4GB)
sudo apt-get install --no-install-recommends \
    cuda-10-1 \
    libcudnn7=7.6.5.32-1+cuda10.1  \
    libcudnn7-dev=7.6.5.32-1+cuda10.1


# Install TensorRT. Requires that libcudnn7 is installed above.
sudo apt-get install -y --no-install-recommends libnvinfer6=6.0.1-1+cuda10.1 \
    libnvinfer-dev=6.0.1-1+cuda10.1 \
    libnvinfer-plugin6=6.0.1-1+cuda10.1
    

Common issues during installation:

  1. Why do nvidia-smi and nvcc --version show different CUDA versions? Has CUDA 10.1 been installed properly? Answer: trust the version from nvcc. nvidia-smi reports the highest CUDA version the driver supports, while nvcc reports the toolkit that is actually installed.
  2. tensorflow-gpu is installed, but it cannot detect the GPU? Answer: check whether the software requirements are satisfied, then reinstall either tensorflow-gpu or CUDA. See "TensorFlow 2.1 doesn’t recognize my GPU, though Cuda 10.1. (with Solution)".
  3. Error message: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64. What does it mean? Answer: this may indicate that your tensorflow-gpu requires CUDA 10.0, but you only have 10.1. This is a version mismatch. Upgrade or downgrade either tensorflow-gpu or CUDA.
  4. UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D] Answer: add the line os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true' to your code, before TensorFlow initializes the GPU.
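A quick check for issues 2 and 4 might look like the sketch below. It assumes TensorFlow 2.1 or later; the import is guarded so the script still runs where TensorFlow is not installed.

```python
import os

# Set this before TensorFlow is imported so it is in place when the GPU is initialized
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow is not installed in this environment

if tf is not None:
    print("TensorFlow version:", tf.__version__)
    # False here means a CPU-only build was installed
    print("Built with CUDA:", tf.test.is_built_with_cuda())
    # An empty list means no GPU was detected (driver/CUDA/cuDNN mismatch is a common cause)
    print("Visible GPUs:", tf.config.list_physical_devices("GPU"))
```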
PyTorch

PyTorch installation is simpler. It only requires the proper CUDA version. Please refer to the official website, PyTorch Distribution.
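After installing, a short check such as the following sketch shows which CUDA version the PyTorch build targets and whether a GPU is usable. The variable names are only illustrative.

```python
import torch

print("PyTorch version:", torch.__version__)
# CUDA version this build was compiled against (None for CPU-only builds)
print("CUDA build:", torch.version.cuda)
print("GPU available:", torch.cuda.is_available())

# Fall back to the CPU when no GPU is visible
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.zeros(2, 3, device=device)
print(x.device)
```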

About Convolution

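$$n_{i+1} = \left\lfloor\frac{n_i+2p-d(f-1)-1}{s}\right\rfloor+1$$

where \(n_i\) is the input size, \(f\) the kernel size, \(p\) the padding, \(d\) the dilation and \(s\) the stride.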

The main implementation difference between TensorFlow and PyTorch is whether the 'same' padding option is supported.

In a TensorFlow implementation, padding='SAME' together with stride=2 usually means the input size is cut in half, \(n_{i+1}=\frac{n_i}{2}\) (for even \(n_i\)). In that case, assuming \(d=1\), the PyTorch implementation should use \(p=\frac{f-2}{2}\). If \(f\) is odd this leaves a .5; round it up, for example \(f=5,p=2\) and \(f=3,p=1\).

If padding='SAME' and stride=1 appear, it means \(n_{i+1}=n_i\), which for \(d=1\) and odd \(f\) corresponds to \(p=\frac{f-1}{2}\) in PyTorch.
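The output-size formula and the padding rule can be checked numerically. The helpers below are a plain-Python sketch; `conv_out_size` and `same_like_padding` are hypothetical names, and the padding helper assumes \(d=1\).

```python
def conv_out_size(n, f, s=1, p=0, d=1):
    """Output size of a convolution: floor((n + 2p - d(f-1) - 1) / s) + 1."""
    return (n + 2 * p - d * (f - 1) - 1) // s + 1

def same_like_padding(f):
    """Symmetric padding that mimics 'SAME' for d=1: (f-2)/2 rounded up."""
    return (f - 1) // 2

# stride 2: the size is halved for even inputs
print(conv_out_size(64, f=5, s=2, p=same_like_padding(5)))  # 32
print(conv_out_size(64, f=3, s=2, p=same_like_padding(3)))  # 32
print(conv_out_size(64, f=4, s=2, p=same_like_padding(4)))  # 32

# stride 1 with an odd kernel: the size is preserved
print(conv_out_size(64, f=7, s=1, p=same_like_padding(7)))  # 64
```

This only matches the output shape. TensorFlow may pad asymmetrically under 'SAME', so the output values can differ from a symmetrically padded PyTorch convolution.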

Syntax Comparison

| tf | torch | tf | torch | tf | torch |
|---|---|---|---|---|---|
| tf.layers.conv2d(x,f,k,s,p) | nn.Conv2d(f_,f,k,s,p)(x) | tf.reduce_sum | torch.sum | tf.reduce_mean | torch.mean |
| tf.reshape(x,) | x.view(,) | tf.layers.dense | nn.Linear | tf.image.resize_nearest_neighbor | F.interpolate |
| tf.gradients | torch.autograd.grad | tf.clip_by_value | torch.clamp | tf.concat | torch.cat |
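The PyTorch side of these pairs can be tried out on a small tensor. This is an illustrative sketch; the tensor values and layer sizes are arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.arange(12, dtype=torch.float32).reshape(3, 4)

total = torch.sum(x)                  # tf.reduce_sum
mean = torch.mean(x)                  # tf.reduce_mean
flat = x.view(-1)                     # tf.reshape(x, [-1])
clipped = torch.clamp(x, 2.0, 5.0)    # tf.clip_by_value(x, 2.0, 5.0)
joined = torch.cat([x, x], dim=0)     # tf.concat([x, x], axis=0)
dense = nn.Linear(4, 2)(x)            # tf.layers.dense(x, 2)

# tf.image.resize_nearest_neighbor; F.interpolate expects (B, C, H, W)
up = F.interpolate(x.view(1, 1, 3, 4), scale_factor=2, mode="nearest")

# tf.gradients(y, w): derivative of w**2 at w = 3
w = torch.tensor(3.0, requires_grad=True)
(grad,) = torch.autograd.grad(w ** 2, w)

print(total.item(), mean.item(), grad.item())  # 66.0 5.5 6.0
```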

Feature Extractions from pretrained network


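import torch.nn as nn
from torchvision import models

def _initializeVGG(self,pretrained,freeze):
    # Load VGG16, optionally with ImageNet weights
    # (newer torchvision releases replace `pretrained` with a `weights` argument)
    encmodel = models.vgg16(pretrained=pretrained)
    if freeze:
        # Freeze every parameter so the encoder is not updated during training
        for param in encmodel.parameters():
            param.requires_grad = False
    # Keep the convolutional part (the 31 modules of encmodel.features)
    features = list(encmodel.features)[:31]
    self.features = nn.ModuleList(features)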
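Once the layers are stored in an nn.ModuleList, intermediate features can be collected by running the input through the list and keeping selected outputs. The sketch below is an assumption about how such a list might be used; a small stand-in network replaces the VGG16 layers so that it runs without downloading weights, and `extract` is a hypothetical helper.

```python
import torch
import torch.nn as nn

# Stand-in for list(models.vgg16().features): conv / ReLU / pool blocks
features = nn.ModuleList([
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
])

def extract(x, layers, keep):
    """Run x through the layers and return the outputs at the indices in `keep`."""
    outputs = {}
    for idx, layer in enumerate(layers):
        x = layer(x)
        if idx in keep:
            outputs[idx] = x
    return outputs

with torch.no_grad():
    feats = extract(torch.randn(1, 3, 32, 32), features, keep={1, 4})

for idx, f in feats.items():
    print(idx, tuple(f.shape))
```

With the real VGG16 layers, `keep` would hold the indices of the activations of interest.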

    

Custom Kernel


import torch
import torch.nn as nn
import torch.nn.functional as F

def constant_kernel(self,shape,value=1,diag=False,
        flip=False,trainable=False):
    # shape is (out_channels, in_channels, kH, kW)
    if not diag:
        # Every weight equals `value`
        w = torch.ones(shape)*value
    else:
        # Diagonal kernel; flipping the rows gives the anti-diagonal
        w = torch.eye(shape[2],shape[3])
        if flip:
            w = w.flip(0)
        # Repeat the 2-D kernel over the channel dimensions
        w = w.expand(*shape).clone()
    return nn.Parameter(w,requires_grad=trainable)

def context_conv2d(self,t,dim=1,size=7,diag=False,
        flip=False,stride=1,trainable=False):
    in_dim = t.size(1)
    size = size if isinstance(size,(tuple,list)) else [size,size]
    shape = [dim,in_dim,size[0],size[1]]
    w = self.constant_kernel(shape,diag=diag,flip=flip,trainable=trainable)
    # Padding that keeps the spatial size for odd kernels at stride 1
    pad = ((size[0]-1)//2,(size[1]-1)//2)
    # Apply the fixed kernel directly; the weight shape sets the channel counts
    return F.conv2d(t,w.to(t.device),stride=stride,padding=pad)
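As a standalone illustration of a constant kernel, the sketch below sums each pixel's neighbourhood with a fixed all-ones kernel. `box_filter` is a hypothetical helper, not part of the code above.

```python
import torch
import torch.nn.functional as F

def box_filter(t, size=3):
    """Sum each pixel's size x size neighbourhood with a fixed all-ones kernel."""
    channels = t.size(1)
    # One output channel that sums over all input channels
    w = torch.ones(1, channels, size, size)
    pad = (size - 1) // 2
    return F.conv2d(t, w, padding=pad)

x = torch.ones(1, 1, 5, 5)
y = box_filter(x, size=3)

print(tuple(y.shape))        # (1, 1, 5, 5)
print(y[0, 0, 2, 2].item())  # 9.0, a full 3x3 window
print(y[0, 0, 0, 0].item())  # 4.0, a corner window, the rest is zero padding
```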

    

Stacking Layers

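Layers can be stacked with nn.ModuleList and nn.Sequential: nn.Sequential chains the modules of one block, and nn.ModuleList holds the list of blocks so that their parameters are registered with the model.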


import torch
import torch.nn as nn

class pix2pixDMap(nn.Module):
    def __init__(self,units=64):
        super(pix2pixDMap,self).__init__()
        layers = []
        f = units
        # Channels per block: 6 input channels, 1 output channel
        nodelist = [6,f,2*f,4*f,8*f,1]
        last = len(nodelist)-2
        for idx in range(len(nodelist)-1):
            # No batch norm on the first and last blocks
            norm = idx not in (0,last)
            # No activation on the last block; sigmoid is applied in forward
            act = 'leaky' if idx != last else None
            # 4x4 stride-2 convolutions, then a final 3x3 stride-1 convolution
            kernel = 3 if idx == last else 4
            stride = 1 if idx == last else 2
            layers.append(self._discriminate(nodelist[idx],
                nodelist[idx+1],k=kernel,s=stride,
                batchnorm=norm,activation=act))
        # nn.ModuleList registers the blocks so their parameters are tracked
        self.layers = nn.ModuleList(layers)
    def _discriminate(self,in_,out,k=4,s=2,p=1,
            batchnorm=True,activation='leaky'):
        # One block: convolution, optional batch norm, optional LeakyReLU
        block = [nn.Conv2d(in_,out,k,s,p)]
        if batchnorm:
            block.append(nn.BatchNorm2d(out))
        if activation == 'leaky':
            block.append(nn.LeakyReLU(.2))
        return nn.Sequential(*block)
    def forward(self,x):
        for layer in self.layers:
            x = layer(x)
        x = torch.sigmoid(x)
        return x
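The reason for wrapping the list in nn.ModuleList can be seen by comparing it with a plain Python list. The two classes below are hypothetical minimal examples.

```python
import torch.nn as nn

class PlainList(nn.Module):
    def __init__(self):
        super().__init__()
        # A plain Python list: the layers are NOT registered as submodules
        self.layers = [nn.Linear(4, 4), nn.Linear(4, 2)]

class Registered(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ModuleList: the layers are registered and their parameters are tracked
        self.layers = nn.ModuleList([nn.Linear(4, 4), nn.Linear(4, 2)])

n_plain = len(list(PlainList().parameters()))
n_registered = len(list(Registered().parameters()))

print(n_plain)       # 0, an optimizer would see nothing to train
print(n_registered)  # 4, a weight and a bias for each Linear layer
```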

    
