2D or Not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition
Gonda, Wei, Parag, Pfister: Parallel Separable 3D Convolution

[Figure 1: Illustration of our parallel separable convolution (PmSCn). Panels: (a) structured 2D and 1D convolution, comparing P3D (Types A, B, C) with PmSCn (ours); (b) 3D convolution replacement. An inset table lists method, number of streams, number of 2D convolutions, and type for (2+1)D [39], P3D [29], and PmSCn.] The goal is to replace 3D convolution...

Adaptive 3D Convolution Selection for Efficient Video Recognition
Hengduo Li, Zuxuan Wu, Abhinav Shrivastava, Larry S. Davis
University of Maryland, College Park
{hdli,zxwu,abhinav,lsd}@cs.umd.edu

Abstract: 3D convolutional networks are prevalent for video recognition. While achieving excellent recognition perfor...

Following the recent trend of developing efficient 3D convolution models for video classification [lin19tsm, qiu17learning, tran18closer, xie18rethinking], we also propose a new efficient pseudo-3D convolution module wherein the standard 3D convolution is decoupled into 2D and 1D convolution. To further enhance motion features, we uti...
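To make the saving from decoupling a 3D convolution into a 2D (spatial) and a 1D (temporal) convolution concrete, the following is a minimal sketch that only counts weights. It is not the module from either excerpt: the serial spatial-then-temporal arrangement, the absence of bias terms, the helper names `conv3d_params` and `decoupled_params`, and the 64-channel widths are all assumptions chosen for illustration.

```python
def conv3d_params(c_in, c_out, k_t, k_s):
    """Weights in a full 3D convolution with a k_t x k_s x k_s kernel (no bias)."""
    return c_in * c_out * k_t * k_s * k_s


def decoupled_params(c_in, c_mid, c_out, k_t, k_s):
    """Weights in a 2D spatial conv followed by a 1D temporal conv (no bias).

    Assumed layout: a 1 x k_s x k_s convolution mapping c_in -> c_mid channels,
    then a k_t x 1 x 1 convolution mapping c_mid -> c_out channels.
    """
    spatial = c_in * c_mid * k_s * k_s   # 1 x k_s x k_s kernel
    temporal = c_mid * c_out * k_t       # k_t x 1 x 1 kernel
    return spatial + temporal


# Hypothetical layer sizes, chosen only for this example.
full = conv3d_params(c_in=64, c_out=64, k_t=3, k_s=3)
split = decoupled_params(c_in=64, c_mid=64, c_out=64, k_t=3, k_s=3)

print(f"full 3D conv weights:   {full}")
print(f"decoupled 2D+1D weights: {split}")
print(f"ratio: {split / full:.3f}")
```

With these assumed sizes the decoupled pair uses fewer than half the weights of the full 3x3x3 kernel, because the per-channel-pair cost drops from k_t * k_s^2 to k_s^2 + k_t. Parallel or multi-stream variants would change the count and are not modeled here.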