Introduction to 3D Deep Learning (Part 2) - Deep learning on point clouds

Preface

Following the previous post, Introduction to 3D Deep Learning (Part 1) - Deep learning on regular structures, today covers the second part: how to do deep learning on 3D point clouds.

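Why do point clouds matter?

Point clouds are very close to the raw data a sensor outputs. If we can learn useful information directly from a point cloud, we get end-to-end learning (that is, learning directly from raw data and ground truth) with no extra conversion step in between. Compare that with the multi-view method introduced last time, where you still have to decide which angles to shoot from, how many images to take, how far away the camera should be, and so on.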

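PointNet - a seminal work that can do classification and segmentation

The first paper today is PointNet, which counts as a milestone. What makes it representative is that it proposes an architecture that learns directly from point clouds, and with small changes the same architecture handles multiple tasks: classification and segmentation.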

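The difficulty of point cloud data - orderless

Point cloud data has no fixed order, and its density in space is also very uneven, so handling point cloud data is very different from handling 3D data with a regular structure. Being unordered means your function (or neural net) has to give the same answer for all N! permutations of the N input points.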

PointNet's intuitive solution - approximate a symmetric function

A symmetric function, by definition, does not care about the order of its input variables, so the idea is to build an architecture that approximates a symmetric function.

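This leads to the basic PointNet architecture:

The symmetric function here is max pooling. Taking the max directly over the raw points would predictably not work well, so they put an MLP in front of it, applied to each point, to learn features before pooling.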
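To make the "MLP, then max pooling" idea concrete, here is a minimal sketch in plain Python (no dependencies). It is not the authors' implementation: the single linear layer with ReLU, the random weights, and the sizes `in_dim` and `out_dim` are all assumptions chosen for illustration. The only thing it is meant to show is that the pooled feature does not change when the points are shuffled.

```python
import random


def shared_mlp(point, weights, biases):
    """One linear layer + ReLU applied to a single point.

    The same weights are reused for every point in the cloud.
    """
    return [
        max(0.0, sum(w * x for w, x in zip(row, point)) + b)
        for row, b in zip(weights, biases)
    ]


def global_feature(points, weights, biases):
    """Per-point features followed by an element-wise max over all points."""
    per_point = [shared_mlp(p, weights, biases) for p in points]
    # zip(*per_point) iterates over feature channels; the max of each channel
    # does not depend on the order in which the points were listed.
    return [max(channel) for channel in zip(*per_point)]


rng = random.Random(0)
in_dim, out_dim = 3, 8  # hypothetical sizes, not taken from the paper
weights = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]
biases = [rng.uniform(-1, 1) for _ in range(out_dim)]

points = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(16)]
shuffled = points[:]
rng.shuffle(shuffled)

feat_a = global_feature(points, weights, biases)
feat_b = global_feature(shuffled, weights, biases)
print(feat_a == feat_b)  # the two global features are identical
```

Because each point goes through the same function independently, the only place the points interact is the max, and max is symmetric. That is the whole trick.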

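Besides the orderless problem, the pose of the point cloud also needs to be aligned

To keep differences in point cloud pose from affecting the recognition result, an extra T-Net is used to align the point cloud to a common pose: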
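As a rough sketch of what "align" means mechanically: the T-Net outputs a transformation matrix, and every point is multiplied by it. The code below only shows that last step. The matrix `predicted_transform` is a hard-coded stand-in for what a trained T-Net would output (here, simply the inverse of a known rotation), and the helper names are hypothetical.

```python
import math


def apply_transform(points, matrix):
    """Multiply every 3D point (as a row vector) by a 3x3 matrix."""
    return [
        [sum(p[k] * matrix[k][j] for k in range(3)) for j in range(3)]
        for p in points
    ]


def rotation_z(theta):
    """Rotation about the z axis, written for row-vector multiplication."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


# A tiny cloud in its "canonical" pose, and the same cloud seen rotated.
canonical = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
observed = apply_transform(canonical, rotation_z(math.pi / 6))

# Stand-in for the T-Net output (assumption: a perfect prediction).
predicted_transform = rotation_z(-math.pi / 6)
aligned = apply_transform(observed, predicted_transform)

for row in aligned:
    print([round(v, 6) for v in row])
```

In the real network the matrix is predicted from the input cloud itself and trained jointly with the rest of the model; nothing here is learned.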

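PointNet classification architecture

Putting the pieces above together gives the following architecture:

One reminder: classification here assumes that the whole input point cloud comes from a single object, so a single global feature is taken at the end to produce the classification result.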

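PointNet segmentation architecture

Segmentation needs to cut out the parts inside a point cloud, so the approach here is to concatenate each point's local feature with the global feature, and then use one more function to compute the class of each point.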
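The concatenation step can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the per-point features are made-up numbers, and the "one more function" is reduced to a single linear layer with hypothetical weights `class_weights`, whereas the paper uses a deeper per-point network.

```python
def concat_local_global(per_point_features):
    """Append the max-pooled global feature to every point's local feature."""
    global_feat = [max(channel) for channel in zip(*per_point_features)]
    return [local + global_feat for local in per_point_features]


def per_point_scores(combined, weights):
    """Linear classifier applied to each point independently."""
    return [
        [sum(w * x for w, x in zip(row, feat)) for row in weights]
        for feat in combined
    ]


# Four points with 2-dimensional local features (made-up values).
local_features = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5], [0.3, 0.7]]
combined = concat_local_global(local_features)  # 4 points x (2 local + 2 global)

# Hypothetical weights for 2 classes over the 4-dimensional combined feature.
class_weights = [[1.0, -1.0, 0.5, 0.0], [-1.0, 1.0, 0.0, 0.5]]
scores = per_point_scores(combined, class_weights)
labels = [max(range(len(s)), key=lambda c: s[c]) for s in scores]
print(labels)
```

Every point sees the same global part, so the classifier can combine "what this point looks like" with "what the whole object looks like".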

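Results

The results show that PointNet is on par with 3D CNN-style methods, and even beats them in some cases.

There are many more experimental results that I won't include; I only show the ones I think help build intuition.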

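Drawbacks of PointNet

Although max pooling solves the N! permutation problem, PointNet learns the global feature directly from individual points, so it lacks local features:

This makes it hard for PointNet to generalize to different configurations (for example, if the input point cloud is not mean-normalized first, the results get noticeably worse), so the authors went on to propose PointNet++.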
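For reference, mean normalization just means subtracting the centroid from every point. A minimal sketch follows; the extra step of scaling the cloud into the unit sphere is an assumption on my part (a common preprocessing choice), not something stated above.

```python
import math


def mean_normalize(points):
    """Center a point cloud at the origin, then scale it into the unit sphere."""
    n = len(points)
    dim = len(points[0])
    centroid = [sum(p[i] for p in points) / n for i in range(dim)]
    centered = [[p[i] - centroid[i] for i in range(dim)] for p in points]
    # Assumption: also rescale so the farthest point lies on the unit sphere.
    max_norm = max(math.sqrt(sum(c * c for c in p)) for p in centered)
    scale = max_norm if max_norm > 0 else 1.0
    return [[c / scale for c in p] for p in centered]


cloud = [[1.0, 2.0, 3.0], [3.0, 2.0, 3.0], [2.0, 4.0, 3.0], [2.0, 0.0, 3.0]]
normalized = mean_normalize(cloud)
for p in normalized:
    print(p)
```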

PointNet++

Main ideas

Find a way to do hierarchical learning, so that some local features are learned along the way
Find a way to handle the non-uniform distribution of point clouds

Hierarchical learning

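First, given N points, each carrying a (d+C)-dimensional vector (d is the coordinate dimension, d == 2 in the figure below; C is the dimension of the other features, such as RGB, normal vectors, and so on), sample some points and use N1 small balls to group them (points inside the same ball belong to the same group). This starts to give a sense of local groups: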
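The sample-then-group step can be sketched as follows, using 2D coordinates to match d == 2. As far as I know, PointNet++ picks the ball centers with farthest point sampling and groups with a ball query, and that is what this sketch does. The function names, the toy points, the radius, and the choice to start from index 0 are my own, and the cap on the number of points per ball used in practice is left out.

```python
import math


def farthest_point_sampling(points, num_samples):
    """Pick num_samples indices so that the chosen points are spread out."""
    chosen = [0]  # arbitrary starting point
    # Distance from every point to its nearest already-chosen point.
    min_dist = [math.dist(p, points[0]) for p in points]
    while len(chosen) < num_samples:
        idx = max(range(len(points)), key=lambda i: min_dist[i])
        chosen.append(idx)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[idx]))
    return chosen


def ball_query(points, center_indices, radius):
    """For each center, list the indices of all points within the radius."""
    return [
        [i for i, p in enumerate(points) if math.dist(p, points[c]) <= radius]
        for c in center_indices
    ]


# Two well-separated clusters in 2D (toy data).
points = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.2],
]
centers = farthest_point_sampling(points, num_samples=2)  # N1 == 2 here
groups = ball_query(points, centers, radius=0.5)
print(centers)
print(groups)
```

Each group would then be fed through a small PointNet to produce one feature vector per ball, which is where the local features come from.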