Abstract
Semantic segmentation of 3D point clouds is a crucial task in scene understanding and is also fundamental to indoor scene applications such as indoor navigation, mobile robotics, augmented reality. Recently, deep learning frameworks have been successfully adopted to point clouds but are limited by the size of data. While most existing works focus on individual sampling points, we use surface patches as a more efficient representation and propose a novel indoor scene segmentation framework called patch graph convolution network (PGCNet). This framework treats patches as input graph nodes and subsequently aggregates neighboring node features by dynamic graph U-Net (DGU) module, which consists of dynamic edge convolution operation inside U-shaped encoder–decoder architecture. The DGU module dynamically update graph structures at each level to encode hierarchical edge features. Incorporating PGCNet, we can segment the input scene into two types, i.e., room layout and indoor objects, which is afterward utilized to carry out final rich semantic labeling of various indoor scenes. With considerable speedup training, the proposed framework achieves effective performance equivalent to state-of-the-art for segmenting standard indoor scene dataset.