Gesture Recognition Algorithm


REVISION HISTORY

Revision No.  Description      Date
101           Initial release  03/22/2024

    1. Algorithm Description

    The gesture recognition algorithm primarily recognizes 13 types of gestures:

    • Static gestures (10 types): L, dislike, ok, fist, stop, like, yes, one, call, rock
    • Dynamic gestures (3 types): wave_up, wave_down, grip

    When in use, follow the standard gestures shown in the figure below so that the algorithm recognizes them reliably:

    Model Introduction:

    Model                Function                     Resolution (w*h)  Input Format
    handpose_det36m.img  Gesture Detection (Medium)   640*352           yuvsp420_nv12
    handpose_det48m.img  Gesture Detection (Medium)   800*480           yuvsp420_nv12
    handpose_det36l.img  Gesture Detection (Large)    640*352           yuvsp420_nv12
    handpose_det48l.img  Gesture Detection (Large)    800*480           yuvsp420_nv12
    handpose_cls.img     Gesture Pose Classification  224*224           yuvsp420_nv12
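
The resolutions and the NV12 input format in the model table determine how large each input buffer has to be. The following is a minimal sketch of that arithmetic, assuming tightly packed rows with no stride padding (the SDK may require additional alignment). The helper name nv12_frame_bytes is hypothetical and not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in one NV12 (YUV 4:2:0 semi-planar) frame: a full-resolution Y plane
 * followed by one interleaved UV plane that is subsampled by two in both
 * directions. Assumes tightly packed rows; width and height must be even. */
size_t nv12_frame_bytes(size_t width, size_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    size_t y_bytes  = width * height;      /* one byte per pixel          */
    size_t uv_bytes = width * height / 2;  /* U and V, each a quarter size */
    return y_bytes + uv_bytes;
}
```

Under that assumption, the three resolutions in the table (640*352, 800*480 and 224*224) need 337920, 576000 and 75264 bytes respectively.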

    2. Interface Calling Process

    The algorithm interfaces are called in the following order: ALGO_HandPose_CreateHandle -> ALGO_HandPose_InitHandle -> ALGO_HandPose_GetInputAttr -> ALGO_HandPose_SetParams -> ALGO_HandPose_Detect -> ALGO_HandPose_Cls -> ALGO_HandPose_DeinitHandle -> ALGO_HandPose_ReleaseHandle.
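
The calling order above can be sketched as a small wrapper that tears the handle down in reverse order, including when a step fails. This is an illustration under stated assumptions, not the vendor's reference code: the stand-in section replaces the SDK with logging stubs so the sketch builds on its own, and pose_session, frame_fn, log_call, g_call_log and g_init_result are hypothetical names. Skipping ALGO_HandPose_DeinitHandle after a failed ALGO_HandPose_InitHandle is also an assumption. ALGO_HandPose_GetInputAttr is left out to keep the stand-ins short; it would sit between InitHandle and SetParams.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ---- Stand-ins so the sketch builds without the SDK (assumptions). ----
 * In a real build, remove this part and include sgs_pose_api.h instead. */
typedef int32_t MI_S32;
typedef struct PoseInit_s PoseInit_t;      /* kept opaque in this sketch */
typedef struct PoseParams_s PoseParams_t;  /* kept opaque in this sketch */

char g_call_log[16];    /* one letter per stubbed call, in call order */
MI_S32 g_init_result;   /* value the stubbed InitHandle returns       */

static void log_call(char c)
{
    size_t n = strlen(g_call_log);
    if (n + 1 < sizeof g_call_log)
        g_call_log[n] = c;
}

MI_S32 ALGO_HandPose_CreateHandle(void **handle)
{
    static int dummy;
    log_call('C');
    *handle = &dummy;
    return 0;
}

MI_S32 ALGO_HandPose_InitHandle(void *handle, const PoseInit_t *init)
{
    (void)handle; (void)init;
    log_call('I');
    return g_init_result;
}

MI_S32 ALGO_HandPose_SetParams(void *handle, const PoseParams_t *params)
{
    (void)handle; (void)params;
    log_call('P');
    return 0;
}

MI_S32 ALGO_HandPose_DeinitHandle(void *handle)
{
    (void)handle;
    log_call('D');
    return 0;
}

MI_S32 ALGO_HandPose_ReleaseHandle(void *handle)
{
    (void)handle;
    log_call('R');
    return 0;
}
/* ---- End of stand-ins. ---- */

/* Per-frame step supplied by the caller; this is where
 * ALGO_HandPose_Detect and ALGO_HandPose_Cls would be called. */
typedef MI_S32 (*frame_fn)(void *handle, void *user);

/* Runs create -> init -> set params -> per-frame work -> deinit -> release
 * and returns the first non-zero status, or 0 on success. */
MI_S32 pose_session(const PoseInit_t *init, const PoseParams_t *params,
                    frame_fn per_frame, void *user)
{
    void *handle = NULL;
    MI_S32 ret = ALGO_HandPose_CreateHandle(&handle);
    if (ret != 0)
        return ret;

    ret = ALGO_HandPose_InitHandle(handle, init);
    if (ret != 0)
        goto release;           /* nothing was initialized, so no deinit */

    ret = ALGO_HandPose_SetParams(handle, params);
    if (ret == 0 && per_frame != NULL)
        ret = per_frame(handle, user);

    ALGO_HandPose_DeinitHandle(handle);
release:
    ALGO_HandPose_ReleaseHandle(handle);
    return ret;
}
```

Keeping the teardown in one place means every exit path releases the handle exactly once.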

    3. Functional Module API

    API Name                     Function
    ALGO_HandPose_CreateHandle   Create Handle
    ALGO_HandPose_InitHandle     Initialize Handle
    ALGO_HandPose_GetInputAttr   Get Input Attributes
    ALGO_HandPose_SetParams      Set Parameters
    ALGO_HandPose_Detect         Hand Detection
    ALGO_HandPose_Cls            Gesture Recognition
    ALGO_HandPose_DeinitHandle   Handle Deinitialization
    ALGO_HandPose_ReleaseHandle  Release Handle

    3.1. ALGO_HandPose_CreateHandle

    • Function

      Create Handle

    • Syntax

      MI_S32 ALGO_HandPose_CreateHandle(void** handle);
      
    • Parameters

      Parameter Name  Description  Input/Output
      handle          Handle       Output

    • Return Value

      0: Success.

      Non-zero: Failure.

    • Dependency

      Header File: sgs_pose_api.h

      Library File: libsgsalgo_pose.so, libsgsalgo_pose.a

    3.2. ALGO_HandPose_InitHandle

    • Function

      Initialize Handle

    • Syntax

      MI_S32 ALGO_HandPose_InitHandle(void *handle, const PoseInit_t *init);
      
    • Parameters

      Parameter Name  Description                Input/Output
      handle          Handle                     Input
      init            Initialization Parameters  Input

    • Return Value

      0: Success.

      Non-zero: Failure.

    • Dependency

      Header File: sgs_pose_api.h

      Library File: libsgsalgo_pose.so, libsgsalgo_pose.a

    3.3. ALGO_HandPose_GetInputAttr

    • Function

      Get Input Attributes

    • Syntax

      MI_S32 ALGO_HandPose_GetInputAttr(void *handle, PoseInputAttr_t *det_input_attr, PoseInputAttr_t *cls_input_attr);
      
    • Parameters

      Parameter Name  Description                                                       Input/Output
      handle          Handle                                                            Input
      det_input_attr  Detection Model Input Attributes (Width, Height, Image Format)    Output
      cls_input_attr  Recognition Model Input Attributes (Width, Height, Image Format)  Output

    • Return Value

      0: Success.

      Non-zero: Failure.

    • Dependency

      Header File: sgs_pose_api.h

      Library File: libsgsalgo_pose.so, libsgsalgo_pose.a

    3.4. ALGO_HandPose_SetParams

    • Function

      Set Parameters

    • Syntax

      MI_S32 ALGO_HandPose_SetParams(void *handle, const PoseParams_t* params);
      
    • Parameters

      Parameter Name  Description  Input/Output
      handle          Handle       Input
      params          Parameters   Input

    • Return Value

      0: Success.

      Non-zero: Failure.

    • Dependency

      Header File: sgs_pose_api.h

      Library File: libsgsalgo_pose.so, libsgsalgo_pose.a

    3.5. ALGO_HandPose_Detect

    • Function

      Hand Detection

    • Syntax

      MI_S32 ALGO_HandPose_Detect(void *handle, const PoseInput_t *input, PoseBox_t boxes[MAX_POSE_OBJECT], MI_S32 *num_boxes);
      
    • Parameters

      Parameter Name  Description                                                                                         Input/Output
      handle          Handle                                                                                              Input
      input           Detection Model Input Data (yuvsp420_nv12, at the loaded detection model's resolution, e.g. 800*480)  Input
      boxes           Detection Box Results                                                                               Output
      num_boxes       Number of Detection Boxes                                                                           Output

    • Return Value

      0: Success.

      Non-zero: Failure.

    • Dependency

      Header File: sgs_pose_api.h

      Library File: libsgsalgo_pose.so, libsgsalgo_pose.a

    3.6. ALGO_HandPose_Cls

    • Function

      Hand Pose Recognition: identifies which of the 13 gestures the detected hand is making

    • Syntax

      MI_S32 ALGO_HandPose_Cls(void *handle,  const PoseInput_t *input, PoseBox_t* box, PoseCls_t* cls);
      
    • Parameters

      Parameter Name  Description                                            Input/Output
      handle          Handle                                                 Input
      input           Recognition Model Input Data (224*224, yuvsp420_nv12)  Input
      box             Hand Detection Box                                     Input
      cls             Hand Pose Result                                       Output

    • Return Value

      0: Success.

      Non-zero: Failure.

    • Dependency

      Header File: sgs_pose_api.h

      Library File: libsgsalgo_pose.so, libsgsalgo_pose.a

    3.7. ALGO_HandPose_DeinitHandle

    • Function

      Handle Deinitialization

    • Syntax

      MI_S32 ALGO_HandPose_DeinitHandle(void *handle);
      
    • Parameters

      Parameter Name  Description  Input/Output
      handle          Handle       Input

    • Return Value

      0: Success.

      Non-zero: Failure.

    • Dependency

      Header File: sgs_pose_api.h

      Library File: libsgsalgo_pose.so, libsgsalgo_pose.a

    3.8. ALGO_HandPose_ReleaseHandle

    • Function

      Release Handle

    • Syntax

      MI_S32 ALGO_HandPose_ReleaseHandle(void *handle);
      
    • Parameters

      Parameter Name  Description  Input/Output
      handle          Handle       Input

    • Return Value

      0: Success.

      Non-zero: Failure.

    • Dependency

      Header File: sgs_pose_api.h

      Library File: libsgsalgo_pose.so, libsgsalgo_pose.a

    4. Data Types

    Data Type        Definition
    PoseInit_t       Initialization Parameter Structure
    PoseInput_t      Input Image Data Structure
    PoseInputAttr_t  Model Input Attribute Structure
    PoseCls_t        Gesture Recognition Result Structure
    PoseBox_t        Hand Detection Box Structure
    PoseParams_t     Input Parameters Structure
    HandPoseClass_e  Gesture Category Information
    HandKeyPoint_e   Hand Key Point Category Information

    4.1. PoseInit_t

    • Description

      Initialization Parameter Structure

    • Definition

      typedef struct
      {
          char ipu_firmware_path[MAX_POSE_STRLEN];
          char pose_model[MAX_POSE_STRLEN];
          char cls_model[MAX_POSE_STRLEN];
          MI_BOOL create_device;
          MI_BOOL destroy_device;
      } PoseInit_t;
      
    • Members

      Member Name        Description
      ipu_firmware_path  Path to the IPU firmware binary (ipu_firmware_bin)
      pose_model         Detection model path (e.g., ./models/handpose_det48y_20240313.img)
      cls_model          Recognition model path (e.g., ./models/handpose_cls_20240222.img)
      create_device      Whether the library creates the device: if true, the algorithm library creates it automatically; if false, the caller must create it externally
      destroy_device     Whether the library destroys the device: if true, the algorithm library destroys it automatically; if false, the caller must destroy it externally

    • Related Interfaces

      ALGO_HandPose_InitHandle
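
Because the three path members are fixed-size character arrays, an over-long path has to be rejected rather than silently truncated. The sketch below shows one way to fill PoseInit_t defensively. The value given to MAX_POSE_STRLEN and the type used for MI_BOOL are placeholders so the block builds without the SDK; set_path and pose_init_fill are hypothetical helpers, not library functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Stand-ins (assumptions): the real definitions come from sgs_pose_api.h. */
#define MAX_POSE_STRLEN 256   /* placeholder value */
typedef int MI_BOOL;          /* placeholder type  */

typedef struct
{
    char ipu_firmware_path[MAX_POSE_STRLEN];
    char pose_model[MAX_POSE_STRLEN];
    char cls_model[MAX_POSE_STRLEN];
    MI_BOOL create_device;
    MI_BOOL destroy_device;
} PoseInit_t;

/* Copies src into a fixed-size field. Returns 0 on success, or -1 when src
 * is NULL or does not fit together with its terminating NUL. */
static int set_path(char *dst, size_t dst_size, const char *src)
{
    if (src == NULL)
        return -1;
    int n = snprintf(dst, dst_size, "%s", src);
    return (n >= 0 && (size_t)n < dst_size) ? 0 : -1;
}

/* Fills *init and lets the library create and destroy the device itself.
 * Returns 0 on success, -1 if any path is missing or too long. */
int pose_init_fill(PoseInit_t *init, const char *firmware,
                   const char *det_model, const char *cls_model)
{
    assert(init != NULL);
    memset(init, 0, sizeof *init);
    if (set_path(init->ipu_firmware_path, sizeof init->ipu_firmware_path, firmware) != 0)
        return -1;
    if (set_path(init->pose_model, sizeof init->pose_model, det_model) != 0)
        return -1;
    if (set_path(init->cls_model, sizeof init->cls_model, cls_model) != 0)
        return -1;
    init->create_device = 1;
    init->destroy_device = 1;
    return 0;
}
```

If the application manages the device itself, both flags would be set to false instead, as described in the member table.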

    4.2. PoseInput_t

    • Description

      Input Image Data Structure

    • Definition

      typedef struct
      {
          void *p_vir_addr;
          MI_PHY phy_addr;
          MI_U32 buf_size;
          MI_U64 pts;
          MI_U16 width;
          MI_U16 height;
      } PoseInput_t;
      
    • Members

      Member Name  Description
      p_vir_addr   Virtual Address
      phy_addr     Physical Address
      buf_size     Data Size
      pts          Timestamp (can be left unassigned)
      width        Image Width
      height       Image Height

    • Related Data Types and Interfaces

      ALGO_HandPose_Detect

      ALGO_HandPose_Cls

    4.3. PoseInputAttr_t

    • Description

      Model Input Attribute Structure

    • Definition

      typedef struct
      {
          MI_U32 width;
          MI_U32 height;
          MI_IPU_ELEMENT_FORMAT format;
      } PoseInputAttr_t;
      
    • Members

      Member Name  Description
      width        Input Image Width
      height       Input Image Height
      format       Input Image Format

    • Related Data Types and Interfaces

      ALGO_HandPose_GetInputAttr
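
The attributes returned by ALGO_HandPose_GetInputAttr can be used to check a frame before it is passed to a model. The sketch below compares only width and height, on the assumption that each model expects frames at exactly its own input resolution. The typedefs for MI_PHY and MI_IPU_ELEMENT_FORMAT are placeholders so the block builds without the SDK, and pose_input_matches is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins (assumptions): the real definitions come from the SDK headers. */
typedef uint16_t MI_U16;
typedef uint32_t MI_U32;
typedef uint64_t MI_U64;
typedef uint64_t MI_PHY;             /* placeholder type */
typedef int MI_IPU_ELEMENT_FORMAT;   /* placeholder type */

typedef struct
{
    MI_U32 width;
    MI_U32 height;
    MI_IPU_ELEMENT_FORMAT format;
} PoseInputAttr_t;

typedef struct
{
    void *p_vir_addr;
    MI_PHY phy_addr;
    MI_U32 buf_size;
    MI_U64 pts;
    MI_U16 width;
    MI_U16 height;
} PoseInput_t;

/* Returns 1 when the frame has the width and height the model reports,
 * otherwise 0. The pixel format is not checked here. */
int pose_input_matches(const PoseInput_t *in, const PoseInputAttr_t *attr)
{
    assert(in != NULL && attr != NULL);
    return in->width == attr->width && in->height == attr->height;
}
```

A caller would typically run this check once per stream against det_input_attr and cls_input_attr rather than on every frame.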

    4.4. PoseCls_t

    • Description

      Gesture Recognition Result Structure

    • Definition

      typedef struct
      {
          MI_S32 cls;
          MI_FLOAT score;
      } PoseCls_t;
      
    • Members

      Member Name  Description
      cls          Gesture Category (see HandPoseClass_e for the mapping)
      score        Gesture Category Score

    • Related Data Types and Interfaces

      ALGO_HandPose_Cls

    4.5. PoseBox_t

    • Description

      Hand Detection Box Structure

    • Definition

      typedef struct
      {
          MI_U32 x;
          MI_U32 y;
          MI_U32 width;
          MI_U32 height;
          MI_S32 cls;
          MI_FLOAT score;
          MI_U64 pts;
          MI_FLOAT keypts[MAX_POSE_NUM_KEYPTS][2];
          MI_U64 track_id;
      } PoseBox_t;
      
    • Members

      Member Name  Description
      x            Box's top-left X coordinate
      y            Box's top-left Y coordinate
      width        Box's Width
      height       Box's Height
      cls          Box's Category
      score        Box's Score
      pts          Timestamp (can be left unassigned)
      keypts       Key Point Sequence
      track_id     Tracking ID

    • Related Data Types and Interfaces

      ALGO_HandPose_Detect

      ALGO_HandPose_Cls
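
ALGO_HandPose_Detect can return several boxes, while ALGO_HandPose_Cls takes a single one, so the application has to decide which box to classify. One simple policy, shown here only as an illustration, is to take the highest-scoring box. The value of MAX_POSE_NUM_KEYPTS and the MI_* typedefs are placeholders so the block builds without the SDK, and pose_best_box is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins (assumptions): the real definitions come from the SDK headers. */
#define MAX_POSE_NUM_KEYPTS 21   /* placeholder value */
typedef int32_t  MI_S32;
typedef uint32_t MI_U32;
typedef uint64_t MI_U64;
typedef float    MI_FLOAT;

typedef struct
{
    MI_U32 x;
    MI_U32 y;
    MI_U32 width;
    MI_U32 height;
    MI_S32 cls;
    MI_FLOAT score;
    MI_U64 pts;
    MI_FLOAT keypts[MAX_POSE_NUM_KEYPTS][2];
    MI_U64 track_id;
} PoseBox_t;

/* Returns the index of the box with the highest score, or -1 when there are
 * no boxes. On ties the earliest box wins. */
int pose_best_box(const PoseBox_t *boxes, MI_S32 num_boxes)
{
    assert(boxes != NULL || num_boxes <= 0);
    int best = -1;
    for (MI_S32 i = 0; i < num_boxes; ++i) {
        if (best < 0 || boxes[i].score > boxes[best].score)
            best = (int)i;
    }
    return best;
}
```

Other policies, such as following one track_id across frames or classifying every box, fit the same interface.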

    4.6. PoseParams_t

    • Description

      Input Parameters Structure

    • Definition

      typedef struct
      {
          MI_S32 disp_width;
          MI_S32 disp_height;
          MI_FLOAT min_width;
          MI_FLOAT min_height;
          MI_FLOAT det_threshold;
          MI_FLOAT cls_threshold;
          MI_FLOAT attr_threshold[MAX_POSE_NUM_ATTR];
      } PoseParams_t;
      
    • Members

      Member Name     Description
      disp_width      Display Width
      disp_height     Display Height
      min_width       Minimum Width for Detection
      min_height      Minimum Height for Detection
      det_threshold   Detection Threshold
      cls_threshold   Recognition Threshold (the higher the threshold, the stricter the recognition criteria)
      attr_threshold  Attribute Threshold (unused in gesture recognition)

    • Related Data Types and Interfaces

      ALGO_HandPose_SetParams

    4.7. HandPoseClass_e

    • Description

      Gesture Category Information

    • Definition

      typedef enum
      {
          E_HAND_POSE_NONE = 0,
          E_HAND_POSE_CALL,
          E_HAND_POSE_DISLIKE,
          E_HAND_POSE_FIST,
          E_HAND_POSE_FOUR,
          E_HAND_POSE_LIKE,
          E_HAND_POSE_MUTE,
          E_HAND_POSE_OK,
          E_HAND_POSE_ONE,
          E_HAND_POSE_PALM,
          E_HAND_POSE_PEACE,
          E_HAND_POSE_ROCK,
          E_HAND_POSE_STOP,
          E_HAND_POSE_STOP_INV,
          E_HAND_POSE_THREE,
          E_HAND_POSE_TWO_UP,
          E_HAND_POSE_TWO_UP_INV,
          E_HAND_POSE_THREE2,
          E_HAND_POSE_PEACE_INV,
          E_HAND_POSE_DOWN_INV,
          E_HAND_POSE_L,
          E_HAND_POSE_WAVE_UP,
          E_HAND_POSE_WAVE_DOWN,
          E_HAND_POSE_GRIP,
          E_NUM_HAND_POSE
      } HandPoseClass_e;
      
    • Related Data Types and Interfaces

      PoseCls_t

      ALGO_HandPose_Cls
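
PoseCls_t reports the gesture as an integer, so applications usually want a readable label and a way to tell static from dynamic gestures. The sketch below repeats the enum so that it builds on its own. The label strings are illustrative (derived from the enumerator names), hand_pose_name and hand_pose_is_dynamic are hypothetical helpers, and the dynamic set follows the wave_up, wave_down and grip gestures listed in the algorithm description.

```c
#include <assert.h>
#include <stddef.h>

/* Copy of HandPoseClass_e; with the SDK, include sgs_pose_api.h instead. */
typedef enum
{
    E_HAND_POSE_NONE = 0, E_HAND_POSE_CALL, E_HAND_POSE_DISLIKE,
    E_HAND_POSE_FIST, E_HAND_POSE_FOUR, E_HAND_POSE_LIKE, E_HAND_POSE_MUTE,
    E_HAND_POSE_OK, E_HAND_POSE_ONE, E_HAND_POSE_PALM, E_HAND_POSE_PEACE,
    E_HAND_POSE_ROCK, E_HAND_POSE_STOP, E_HAND_POSE_STOP_INV,
    E_HAND_POSE_THREE, E_HAND_POSE_TWO_UP, E_HAND_POSE_TWO_UP_INV,
    E_HAND_POSE_THREE2, E_HAND_POSE_PEACE_INV, E_HAND_POSE_DOWN_INV,
    E_HAND_POSE_L, E_HAND_POSE_WAVE_UP, E_HAND_POSE_WAVE_DOWN,
    E_HAND_POSE_GRIP, E_NUM_HAND_POSE
} HandPoseClass_e;

/* Illustrative labels, indexed by enumerator. */
static const char *const k_pose_names[] = {
    [E_HAND_POSE_NONE] = "none",           [E_HAND_POSE_CALL] = "call",
    [E_HAND_POSE_DISLIKE] = "dislike",     [E_HAND_POSE_FIST] = "fist",
    [E_HAND_POSE_FOUR] = "four",           [E_HAND_POSE_LIKE] = "like",
    [E_HAND_POSE_MUTE] = "mute",           [E_HAND_POSE_OK] = "ok",
    [E_HAND_POSE_ONE] = "one",             [E_HAND_POSE_PALM] = "palm",
    [E_HAND_POSE_PEACE] = "peace",         [E_HAND_POSE_ROCK] = "rock",
    [E_HAND_POSE_STOP] = "stop",           [E_HAND_POSE_STOP_INV] = "stop_inv",
    [E_HAND_POSE_THREE] = "three",         [E_HAND_POSE_TWO_UP] = "two_up",
    [E_HAND_POSE_TWO_UP_INV] = "two_up_inv", [E_HAND_POSE_THREE2] = "three2",
    [E_HAND_POSE_PEACE_INV] = "peace_inv", [E_HAND_POSE_DOWN_INV] = "down_inv",
    [E_HAND_POSE_L] = "L",                 [E_HAND_POSE_WAVE_UP] = "wave_up",
    [E_HAND_POSE_WAVE_DOWN] = "wave_down", [E_HAND_POSE_GRIP] = "grip",
};

/* Compile-time check that the table has one slot per gesture class. */
_Static_assert(sizeof k_pose_names / sizeof k_pose_names[0] == E_NUM_HAND_POSE,
               "name table size must match E_NUM_HAND_POSE");

/* Returns a label for a PoseCls_t.cls value, or "unknown" if out of range. */
const char *hand_pose_name(int cls)
{
    if (cls < 0 || cls >= E_NUM_HAND_POSE)
        return "unknown";
    return k_pose_names[cls];
}

/* Returns 1 for the three dynamic gestures, 0 for everything else. */
int hand_pose_is_dynamic(int cls)
{
    return cls == E_HAND_POSE_WAVE_UP || cls == E_HAND_POSE_WAVE_DOWN ||
           cls == E_HAND_POSE_GRIP;
}
```

Indexing the table by enumerator rather than by position keeps the labels correct if the enum order changes in a later SDK release.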

    4.8. HandKeyPoint_e

    • Description

      Hand Key Point Category Information

    • Definition

      typedef enum
      {
          E_WRIST = 0,
          E_THUMB_CMC,
          E_THUMB_MCP,
          E_THUMB_IP,
          E_THUMB_TIP,
          E_INDEX_FINGER_MCP,
          E_INDEX_FINGER_PIP,
          E_INDEX_FINGER_DIP,
          E_INDEX_FINGER_TIP,
          E_MIDDLE_FINGER_MCP,
          E_MIDDLE_FINGER_PIP,
          E_MIDDLE_FINGER_DIP,
          E_MIDDLE_FINGER_TIP,
          E_RING_FINGER_MCP,
          E_RING_FINGER_PIP,
          E_RING_FINGER_DIP,
          E_RING_FINGER_TIP,
          E_PINKY_MCP,
          E_PINKY_PIP,
          E_PINKY_DIP,
          E_PINKY_TIP,
          E_NUM_HAND_KEYPTS
      } HandKeyPoint_e;
      
    • Related Data Types and Interfaces

      PoseBox_t

      ALGO_HandPose_Detect

      ALGO_HandPose_Cls
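
HandKeyPoint_e is most useful as an index into PoseBox_t.keypts. The sketch below measures the squared distance between two key points, for example thumb tip and index fingertip as a rough pinch measure. It assumes that keypts rows follow the HandKeyPoint_e order, which the document implies but does not state. The value of MAX_POSE_NUM_KEYPTS and the MI_* typedefs are placeholders, and hand_keypt_dist_sq is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins (assumptions): the real definitions come from the SDK headers. */
#define MAX_POSE_NUM_KEYPTS 21   /* placeholder; must cover E_NUM_HAND_KEYPTS */
typedef int32_t  MI_S32;
typedef uint32_t MI_U32;
typedef uint64_t MI_U64;
typedef float    MI_FLOAT;

typedef enum
{
    E_WRIST = 0, E_THUMB_CMC, E_THUMB_MCP, E_THUMB_IP, E_THUMB_TIP,
    E_INDEX_FINGER_MCP, E_INDEX_FINGER_PIP, E_INDEX_FINGER_DIP, E_INDEX_FINGER_TIP,
    E_MIDDLE_FINGER_MCP, E_MIDDLE_FINGER_PIP, E_MIDDLE_FINGER_DIP, E_MIDDLE_FINGER_TIP,
    E_RING_FINGER_MCP, E_RING_FINGER_PIP, E_RING_FINGER_DIP, E_RING_FINGER_TIP,
    E_PINKY_MCP, E_PINKY_PIP, E_PINKY_DIP, E_PINKY_TIP,
    E_NUM_HAND_KEYPTS
} HandKeyPoint_e;

typedef struct
{
    MI_U32 x;
    MI_U32 y;
    MI_U32 width;
    MI_U32 height;
    MI_S32 cls;
    MI_FLOAT score;
    MI_U64 pts;
    MI_FLOAT keypts[MAX_POSE_NUM_KEYPTS][2];
    MI_U64 track_id;
} PoseBox_t;

/* Squared Euclidean distance between key points a and b of one hand, in the
 * same units as keypts. The squared form avoids a dependency on libm. */
MI_FLOAT hand_keypt_dist_sq(const PoseBox_t *box, int a, int b)
{
    assert(box != NULL);
    assert(a >= 0 && a < E_NUM_HAND_KEYPTS);
    assert(b >= 0 && b < E_NUM_HAND_KEYPTS);
    MI_FLOAT d0 = box->keypts[a][0] - box->keypts[b][0];
    MI_FLOAT d1 = box->keypts[a][1] - box->keypts[b][1];
    return d0 * d0 + d1 * d1;
}
```

Comparing this value against a squared threshold scaled by the box size would make such a check independent of how far the hand is from the camera.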