Fig. 2 | IPSJ Transactions on Computer Vision and Applications

Fig. 2

From: Visual saliency detection for RGB-D images under a Bayesian framework

Architecture for supervision transfer. a Architecture of the Clarifai model, where Relu denotes the rectified linear function relu(x) = max(x, 0), which rectifies the feature maps so that they are always non-negative; lrn denotes a local response normalization layer; and Dropout is applied in the fully connected layers at a rate of 0.5 to prevent the CNN from overfitting. b Upper branch: deep CNN-based global-context modelling for RGB saliency detection, using a superpixel-centred window padded with the mean pixel value of the RGB training dataset. Lower branch: deep CNN-based global-context modelling for depth saliency detection, using a superpixel-centred window padded with the mean pixel value of the depth training dataset. We train a CNN model for depth images by teaching the network to reproduce the mid-level semantic representation learned from the RGB images with which the depth images are paired. The supervision transfer occurs at the penultimate layer of the global-context model. For the loss function, we use the L2 distance.
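The two formulas named in the caption can be illustrated with a minimal Python sketch. It is an illustration only, not the authors' implementation: the function names, the three-element feature vectors, and the choice of the plain (unsquared) Euclidean form of the L2 distance are assumptions made here, since the caption does not specify them.

```python
import math


def relu(x):
    """Rectified linear function from the caption: relu(x) = max(x, 0)."""
    return max(x, 0.0)


def l2_distance(teacher_features, student_features):
    """Euclidean (L2) distance between two equal-length feature vectors.

    Assumption: the unsquared form is used; the caption only says "L2 distance".
    """
    if len(teacher_features) != len(student_features):
        raise ValueError("feature vectors must have the same length")
    squared = sum((t - s) ** 2 for t, s in zip(teacher_features, student_features))
    return math.sqrt(squared)


# Hypothetical penultimate-layer activations for one paired RGB/depth window.
rgb_features = [relu(v) for v in [0.5, -1.0, 4.0]]    # teacher: RGB branch
depth_features = [relu(v) for v in [0.5, 3.0, -2.0]]  # student: depth branch

# Supervision-transfer loss: how far the depth features are from the RGB ones.
transfer_loss = l2_distance(rgb_features, depth_features)
print(transfer_loss)
```

In training, a loss of this kind would be minimised with respect to the depth branch only, so that the depth network learns to reproduce the representation of the RGB network.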