3D Obstacle Detection Application

[Image: output of the 3D obstacle detection (e-Stop) demo visualized in RViz]

This demonstrates the 3D obstacle detection application, which uses the disparity map from the stereo depth engine (SDE) and the semantic segmentation map from a deep-learning network. As shown in Figure 1, the application consists of the following three main processes:

Figure 1. 3D obstacle detection demo: block diagram

  • Stereo Vision Processing

    • This is the same process described in the Stereo Vision Application, without the point-cloud generation step. The output disparity map is fed to the 3D obstacle detection process as an input.

  • Semantic Segmentation Processing

    • This is the same process described in the Semantic Segmentation Application. The output tensor of the CNN is fed to the 3D obstacle detection process as another input.

  • 3D Obstacle Detection Processing

    • This process outputs the 3D bounding-box coordinates of the detected obstacles. First, it creates a 3D point cloud using the disparity map and the camera parameters. Note that only pixels belonging to particular classes (e.g., car, pedestrian, bicycle, rider) are mapped into 3D space. Then it projects the 3D point cloud onto a 2D occupancy grid map. Finally, it detects individual obstacles by grouping closely located occupied cells of an identical class using a "connected component analysis" algorithm. A simplified sketch of this pipeline is shown below.
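
The following NumPy sketch illustrates the three steps under simplified assumptions: a plain single-channel disparity map, illustrative camera intrinsics (fx, baseline_mm), hypothetical class IDs, and grouping that does not separate cells by class. It is a conceptual outline, not the SDK implementation.

# Conceptual sketch of the 3D obstacle detection pipeline (NumPy only).
# Camera values, class IDs, and thresholds are illustrative, not SDK defaults.
import numpy as np

def detect_obstacles(disparity, class_map, fx=800.0, baseline_mm=120.0,
                     grid_mm=50, x_range=(-5000, 5000), z_range=(0, 10000),
                     min_pix_grid=10, obstacle_ids=(11, 12, 13)):
    w = disparity.shape[1]
    valid = (disparity > 0) & np.isin(class_map, obstacle_ids)
    # Step 1: back-project valid pixels into 3D using Z = fx * b / d.
    v, u = np.nonzero(valid)
    z = fx * baseline_mm / disparity[v, u]
    x = (u - w / 2.0) * z / fx
    # Step 2: project (x, z) onto the 2D occupancy grid, counting hits per cell.
    gx = ((x - x_range[0]) / grid_mm).astype(int)
    gz = ((z - z_range[0]) / grid_mm).astype(int)
    nx = (x_range[1] - x_range[0]) // grid_mm
    nz = (z_range[1] - z_range[0]) // grid_mm
    inb = (gx >= 0) & (gx < nx) & (gz >= 0) & (gz < nz)
    counts = np.zeros((nz, nx), dtype=int)
    np.add.at(counts, (gz[inb], gx[inb]), 1)
    occupied = counts >= min_pix_grid
    # Step 3: group occupied cells into obstacles via 8-connected components.
    labels = np.zeros_like(counts)
    label = 0
    for seed in zip(*np.nonzero(occupied)):
        if labels[seed]:
            continue
        label += 1
        stack = [seed]
        while stack:
            cz, cx = stack.pop()
            if not (0 <= cz < nz and 0 <= cx < nx):
                continue
            if labels[cz, cx] or not occupied[cz, cx]:
                continue
            labels[cz, cx] = label
            stack += [(cz + dz, cx + dx)
                      for dz in (-1, 0, 1) for dx in (-1, 0, 1)]
    return labels, label  # per-cell obstacle IDs and the obstacle count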

How to Run the Application in ROS 1

[TDA4] For setting up the ROS 1 environment on the TDA4 host, please follow Docker Setup for ROS 1. To launch the ti_estop node while playing back a ROSBAG file, run the following inside the Docker container on the TDA4 target:

roslaunch ti_estop bag_estop.launch

To process the image stream from a ZED stereo camera, replace the launch file with zed_estop.launch:

roslaunch ti_estop zed_estop.launch zed_sn:=SNxxxxx

[Visualization on Ubuntu PC] For setting up the ROS 1 environment on the remote PC, please follow Docker Setup for ROS 1.

As shown in the "Launch File Parameters" section below, this application publishes topics for the 3D bounding-box coordinates, the semantic segmentation output tensor, and the disparity map. Using this information, we can produce a color-coded disparity map, a color-coded semantic segmentation map, and 3D bounding boxes overlaid on the image. To visualize them on the PC, run

roslaunch ti_estop rviz_estop.launch
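
If you prefer to colorize the raw disparity map yourself rather than using the visualization nodes, a minimal sketch is shown below. The raw disparity actually published by this application is packed in an SDK-specific format (including confidence bits), so the plain single-channel input here is an assumption for illustration; disp_max=191 corresponds to the largest disparity_max option listed below.

# Illustrative disparity colorization with OpenCV; assumes a plain
# single-channel disparity image, not the SDK's packed raw format.
import cv2
import numpy as np

def colorize_disparity(disp, disp_max=191.0):
    norm = np.clip(disp.astype(np.float32) / disp_max, 0.0, 1.0)
    return cv2.applyColorMap((norm * 255).astype(np.uint8), cv2.COLORMAP_JET)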

The ego-centric occupancy grid (OG) map is created based on the 3D bounding boxes. To visualize the OG map along with the 3D bounding-box image, run

roslaunch ti_estop rviz_estop_ogmap.launch

How to Run the Application in ROS 2

[TDA4] For setting up the ROS 2 environment on the TDA4 host, please follow Docker Setup for ROS 2. To launch the ti_estop node and process the image stream from a ZED stereo camera, run the following with zed_estop_launch.py inside the Docker container on the TDA4 target:

ros2 launch ti_estop zed_estop_launch.py zed_sn:=SNxxxxx

[Visualization on Ubuntu PC] For setting up the ROS 2 environment on the remote PC, please follow Docker Setup for ROS 2.

To visualize the outputs on the PC, run

ros2 launch ti_viz_nodes rviz_estop_launch.py

The ego-centric occupancy grid map is created based on the 3D bounding boxes. To visualize the OG map along with the 3D bounding-box image, run

ros2 launch ti_viz_nodes rviz_estop_ogmap_launch.py

Launch File Parameters

| Parameter | Description | Value |
|---|---|---|
| rosparam file | Algorithm configuration parameters (see the "ROSPARAM Parameters" section) | config/params.yaml |
| left_lut_file_path | LDC rectification table path for the left image | String |
| right_lut_file_path | LDC rectification table path for the right image | String |
| dl_model_path | Deep-learning model path | String |
| left_input_topic_name | Input topic name for reading left images from a stereo camera | camera/left/image_raw |
| right_input_topic_name | Input topic name for reading right images from a stereo camera | camera/right/image_raw |
| camera_info_topic | Right camera_info topic name for reading the relevant camera parameters | camera/right/camera_info |
| semseg_cnn_tensor_topic | Publish topic name for the semantic segmentation tensor | semseg_cnn/tensor |
| rectified_image_topic | Publish topic name for the rectified right image | camera/right/image_rect_mono |
| bounding_box_topic | Publish topic name for the 3D bounding-box coordinates of detected obstacles | detection3D/BB3D |
| raw_disparity_topic_name | Publish topic name for the raw disparity map | camera/disparity/raw |
| ogmap_topic_name | Publish topic name for the ego-centric occupancy grid map | detection3D/ogmap |
| estop_topic_name | Publish topic name for the binary emergency-stop message, indicating whether any obstacle is in proximity to the robot | detection3D/estop |
| exportPerfStats | Flag for exporting the performance data to a file (0: disable, 1: enable) | 0, 1 |

ROSPARAM Parameters

Basic input, LDC and SDE Parameters

| Parameter | Description | Value |
|---|---|---|
| input_format | Input image format (0: U8, 1: YUV422) | 0, 1 |
| sde_algo_type | SDE algorithm type (0: single-layer SDE, 1: multi-layer SDE) | 0, 1 |
| num_layers | Number of layers in multi-layer SDE | 2, 3 |
| sde_confidence_threshold | Disparity with confidence less than this value is invalidated | 0 ~ 7 |
| disparity_min | Minimum disparity to search (0: 0, 1: -3) | 0, 1 |
| disparity_max | Maximum disparity to search (0: 63, 1: 127, 2: 191) | 0 ~ 2 |

Camera Parameters

| Parameter | Description | Value |
|---|---|---|
| camera_height | Camera mounting height | Float32 |
| camera_pitch | Camera pitch angle in radians | Float32 |

Occupancy Grid Map Parameters

| Parameter | Description | Value |
|---|---|---|
| grid_x_size | Horizontal width of an OG-map grid cell in millimeters | Integer |
| grid_y_size | Vertical length of an OG-map grid cell in millimeters | Integer |
| min_x_range | Minimum horizontal range in millimeters covered by the OG map | Integer |
| max_x_range | Maximum horizontal range in millimeters covered by the OG map | Integer |
| min_y_range | Minimum vertical range in millimeters covered by the OG map | Integer |
| max_y_range | Maximum vertical range in millimeters covered by the OG map | Integer |

The number of grid cells in one row is defined by (max_x_range - min_x_range) / grid_x_size. Likewise, the number of grid cells in one column is defined by (max_y_range - min_y_range) / grid_y_size.
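
For example, with illustrative values min_x_range = -10000, max_x_range = 10000, and grid_x_size = 50 (all in millimeters), each row of the OG map contains (10000 - (-10000)) / 50 = 400 grid cells.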

Obstacle Detection Parameters

| Parameter | Description | Value |
|---|---|---|
| min_pixel_count_grid | Minimum number of pixels for a grid cell to be considered occupied | Integer |
| min_pixel_count_object | Minimum number of pixels for connected grid cells to form an object | Integer |
| max_object_to_detect | Maximum number of objects to detect in a frame | Integer |
| num_neighbor_grid | Number of neighboring grid cells to check in connected component analysis | 8, 24 |
| enable_spatial_obj_merge | Flag to enable merging of spatially close objects | 0, 1 |
| enable_temporal_obj_merge | Flag to enable use of temporal information | 0, 1 |
| enable_temporal_obj_smoothing | Flag to enable use of the corresponding object in the previous frame to compute an object's position | 0, 1 |
| object_distance_mode | Method to compute the distance between objects (0: distance between centers, 1: distance between corners) | 0, 1 |

e-Stop Parameters

| Parameter | Description | Value |
|---|---|---|
| min_estop_distance | Minimum distance of the e-Stop area; should be 0 | 0 |
| max_estop_distance | Maximum distance of the e-Stop area in millimeters | Integer |
| min_estop_width | Width of the e-Stop area in millimeters at min_estop_distance | Integer |
| max_estop_width | Width of the e-Stop area in millimeters at max_estop_distance | Integer |
| min_free_frame_run | Minimum number of consecutive frames without any obstacle in the e-Stop area for it to be declared free | Integer |
| min_obs_frame_run | Minimum number of consecutive frames with an obstacle in the e-Stop area for it to be declared infringed | Integer |

The e-Stop area forms a trapezoid defined by the first four parameters. When obstacles are detected in the e-Stop area, the detection3D/estop topic is set to 1 so that the robot can be forced to stop.
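
For reference, a minimal ROS 2 node that reacts to the e-Stop flag might look like the sketch below. The std_msgs/msg/Int32 message type and the cmd_vel velocity topic are assumptions for illustration; check the ti_estop message definitions for the actual types.

# Hypothetical e-Stop listener: stops the robot when detection3D/estop is 1.
# The Int32 message type is an assumption; verify against ti_estop's output.
import rclpy
from rclpy.node import Node
from std_msgs.msg import Int32
from geometry_msgs.msg import Twist

class EStopListener(Node):
    def __init__(self):
        super().__init__('estop_listener')
        self.cmd_pub = self.create_publisher(Twist, 'cmd_vel', 10)
        self.create_subscription(Int32, 'detection3D/estop',
                                 self.on_estop, 10)

    def on_estop(self, msg):
        if msg.data == 1:
            # Obstacle inside the e-Stop trapezoid: command zero velocity.
            self.cmd_pub.publish(Twist())
            self.get_logger().warn('e-Stop engaged: obstacle in proximity')

def main():
    rclpy.init()
    rclpy.spin(EStopListener())

if __name__ == '__main__':
    main()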

Camera Setup

LDC Rectification Table

To create the LDC-format LUT for a ZED camera, please refer to zed_capture/README.md.

Camera Mounting

For accurate obstacle detection, it is important to mount the camera properly and to provide correct values of camera_height and camera_pitch. For example, an incorrect camera pitch angle causes 3D bounding boxes to be overlaid in front of or behind obstacles in the images. It is recommended to install the stereo camera parallel to the ground plane or slightly tilted downward, e.g., between 0° and 10°. In general, the camera pitch angle should be close to 0 when the camera is mounted low, while it can be somewhat larger when the camera is mounted higher. The sketch below illustrates how these two values enter the geometry.
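
As an illustration, the following applies the conventional pinhole-camera ground transform using camera_height and camera_pitch; the sign conventions here are assumptions and the exact convention used by the application may differ.

# Illustrative camera-to-ground transform based on camera_height and
# camera_pitch; conventions are assumptions, not the SDK's definition.
import numpy as np

def camera_to_ground(points_cam, camera_height_mm, camera_pitch_rad):
    """Map camera-frame points (x right, y down, z forward) to a
    gravity-aligned frame; positive pitch means tilted downward."""
    c, s = np.cos(camera_pitch_rad), np.sin(camera_pitch_rad)
    rot = np.array([[1.0, 0.0, 0.0],   # Rx(-pitch): down-pitched camera
                    [0.0,   c,   s],   # frame -> gravity-aligned, y-down
                    [0.0,  -s,   c]])  # frame
    pts = points_cam @ rot.T
    pts[:, 1] = camera_height_mm - pts[:, 1]  # y becomes height above ground
    return pts

# Example: a point 1 m straight ahead of a camera mounted 0.5 m high and
# pitched 5 degrees downward -> approximately [[0., 412.8, 996.2]].
print(camera_to_ground(np.array([[0.0, 0.0, 1000.0]]), 500.0, np.deg2rad(5.0)))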