Fig. 6
From: 3D human pose estimation model using location-maps for distorted and disconnected images by a wearable omnidirectional camera
The network architecture of our model, based on HRNet-W24. The stem net convolves the input image into a feature map of 256 (channels) × 24 (input height / 4) × 48 (input width / 4), regardless of the value of W. After the stem net, the network branches according to W. The network outputs 48 maps: 4 maps (H, X, Y, and Z) for each of the 12 joints in our setting.
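The shape arithmetic in the caption can be checked with a minimal sketch. The function names below are hypothetical and are not taken from the authors' code. The 96 × 192 input size is not stated in the caption; it is inferred from 24 = input height / 4 and 48 = input width / 4.

```python
# Minimal sketch of the shape arithmetic described in the caption.
# Function names are hypothetical; they are not from the authors' code.

def stem_output_shape(input_height, input_width, channels=256, downsample=4):
    """Return (channels, height, width) of the stem net output.

    The caption states that the stem net reduces height and width by a
    factor of 4 and produces 256 channels, independent of W.
    """
    return (channels, input_height // downsample, input_width // downsample)


def num_output_maps(num_joints=12, maps_per_joint=("H", "X", "Y", "Z")):
    """Return the total number of output maps: one H, X, Y, and Z map per joint."""
    return num_joints * len(maps_per_joint)


if __name__ == "__main__":
    # Input size assumed to be 96 x 192, inferred from the caption.
    print(stem_output_shape(96, 192))
    print(num_output_maps())
```

With the assumed 96 × 192 input, the first call gives the 256 × 24 × 48 stem output from the caption, and the second gives the 48 output maps (4 maps × 12 joints).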