Abstract: Capturing cross-pose correlations across a sequence of frame-level 2D poses is essential for video-based 3D human pose estimation (3D-HPE). Recent studies have shown the promising potential of ...