This page describes how to write a custom interface or extend the client API for the FlightGoggles binary. The client and the renderer exchange messages over TCP using ZeroMQ. Both the outgoing and the incoming messages are serialized as JSON.
Outgoing Message
To request a render or to modify the positions of objects (including cameras), the client sends an outgoing state message. The state message has the following fields:
sceneIsInternal: Type boolean. This must always be set to true. The field could be used to load external scenes in the Unity editor through TriLib, but this is currently unsupported.
sceneFilename: Type string. The scene that should be loaded. This can currently be set to Abandoned_Factory_Sunset, Stata_GroundFloor, or Stata_Basement.
ntime: Type int64_t. The timestamp of the requested render in nanoseconds, specified as a 64-bit integer.
camWidth: Type int. The width in pixels of the requested render.
camHeight: Type int. The height in pixels of the requested render.
camFOV: Type float. The camera field of view.
camDepthScale: Type double. The depth resolution of the camera, e.g. 0.02 corresponds to a resolution of 2 cm.
cameras: Type List<Camera> (see Camera Type). The list of cameras that are required from the render binary.
objects: Type List<Object> (see Object Type). The list of objects and their poses in the environment.
An example of this struct in C++ is provided below:

    struct StateMessage_t {
        bool sceneIsInternal = true;
        std::string sceneFilename = "Museum_Day_Small";
        int64_t ntime;
        int camWidth = 1024;
        int camHeight = 768;
        float camFOV = 70.0f;
        double camDepthScale = 0.20;
        std::vector<Camera_t> cameras;
        std::vector<Object_t> objects;
    };
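Because the state message travels as JSON, the struct has to be turned into text before it is sent. The following is a minimal sketch of that step for the scalar fields only. It assumes that the JSON keys match the field names listed above; the names StateHeader_t, nowNanoseconds, and toJson are hypothetical and are not part of the FlightGoggles client. A real client would more likely use a JSON library and would also serialize the cameras and objects lists.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <sstream>
#include <string>

// Scalar part of the state message. The cameras and objects lists are
// left out to keep the sketch short (hypothetical type, for illustration).
struct StateHeader_t {
  bool sceneIsInternal = true;
  std::string sceneFilename = "Stata_Basement";
  int64_t ntime = 0;
  int camWidth = 1024;
  int camHeight = 768;
  float camFOV = 70.0f;
  double camDepthScale = 0.20;
};

// Hypothetical helper: current wall-clock time in nanoseconds.
inline int64_t nowNanoseconds() {
  using namespace std::chrono;
  return duration_cast<nanoseconds>(
             system_clock::now().time_since_epoch()).count();
}

// Hypothetical helper: writes the scalar fields as one JSON object.
// Assumes sceneFilename needs no JSON escaping and that the key names
// equal the field names; both are assumptions of this sketch.
inline std::string toJson(const StateHeader_t& s) {
  assert(s.camWidth > 0 && s.camHeight > 0);
  std::ostringstream out;
  out << "{"
      << "\"sceneIsInternal\":" << (s.sceneIsInternal ? "true" : "false") << ","
      << "\"sceneFilename\":\"" << s.sceneFilename << "\","
      << "\"ntime\":" << s.ntime << ","
      << "\"camWidth\":" << s.camWidth << ","
      << "\"camHeight\":" << s.camHeight << ","
      << "\"camFOV\":" << s.camFOV << ","
      << "\"camDepthScale\":" << s.camDepthScale << ","
      << "\"cameras\":[],"   // empty placeholders in this sketch
      << "\"objects\":[]"
      << "}";
  return out.str();
}
```

A caller would set ntime from nowNanoseconds() (or from its simulation clock) just before calling toJson, then hand the resulting string to whatever transport it uses.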
Camera Type
The camera type has the following fields:
ID: Type string. The unique ID of the camera.
position: Type List<double>. A list of 3 elements specifying the translation of the camera in the environment. Note that this is specified in Unity co-ordinates (X right, Y up, Z forward).
rotation: Type List<double>. A list of 4 elements specifying the rotation of the camera in the environment as a quaternion.
channels: Type int. The number of channels. This can be set to 1 for grayscale and 3 for RGB or semantic cameras.
isDepth: Type boolean. Specifies whether this camera is a depth camera.
outputIndex: Type int. The output index of the camera in the incoming data packet.
hasCollisionCheck: Type boolean. Specifies whether this camera should check for collisions.
doesLandmarkVisCheck: Type boolean. Specifies whether this camera should check for visible infrared beacons.
An example struct in C++ is shown below:

    struct Camera_t {
        std::string ID;
        std::vector<double> position;
        std::vector<double> rotation;
        int channels;
        bool isDepth;
        int outputIndex;
        bool hasCollisionCheck = true;
        bool doesLandmarkVisCheck = false;
    };
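Since positions are expected in Unity co-ordinates (X right, Y up, Z forward), a client that keeps its poses in another convention has to remap the axes before filling in the position field. The sketch below shows the remapping for one possible source frame: X forward, Y left, Z up. That source frame is an assumption made for illustration and is not specified on this page, and the function name toUnityPosition is hypothetical. The rotation quaternion needs a matching conversion, which is not shown here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: remaps a position from an assumed X-forward,
// Y-left, Z-up frame to Unity co-ordinates (X right, Y up, Z forward).
inline std::vector<double> toUnityPosition(const std::vector<double>& p) {
  assert(p.size() == 3);
  const double forward = p[0];
  const double left = p[1];
  const double up = p[2];
  // Unity X points right, which is the negative of "left".
  return {-left, up, forward};
}
```

The same remapping would apply to the position field of the object type, which uses the same co-ordinate convention.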
Object Type
The object type has the following fields:
ID: Type string. The unique object ID.
prefabID: Type string. The name of the prefab object to instantiate and place. This must match a prefab in the Resources folder of the binary.
position: Type List<double>. The 3 translation elements of the object in the environment. Note that this is specified in Unity co-ordinates (X right, Y up, Z forward).
rotation: Type List<double>. The rotation of the object in the environment as a quaternion.
size: Type List<double>. The scaling of the object along the x, y, and z axes.
An example of the object structure in C++ is shown below:

    struct Object_t {
        std::string ID;
        std::string prefabID;
        std::vector<double> position;
        std::vector<double> rotation;
        std::vector<double> size;
    };
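The list-valued fields carry an implied length: 3 for position, 4 for a quaternion rotation, and 3 for the per-axis size. A client may want to check these before sending a message. The following is a minimal sketch of such a check; the function name isValidObject is hypothetical, and the requirement that both IDs be non-empty is an assumption of the sketch rather than a documented rule.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Object_t {
  std::string ID;
  std::string prefabID;
  std::vector<double> position;
  std::vector<double> rotation;
  std::vector<double> size;
};

// Hypothetical helper: checks the list lengths implied by the field
// descriptions (3 translation elements, 4 quaternion elements, 3 scale
// elements). Requiring non-empty IDs is an assumption of this sketch.
inline bool isValidObject(const Object_t& o) {
  return !o.ID.empty() && !o.prefabID.empty() &&
         o.position.size() == 3 &&
         o.rotation.size() == 4 &&
         o.size.size() == 3;
}
```

The check does not verify that prefabID actually matches a prefab in the Resources folder; only the renderer can confirm that.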
Incoming Message
The incoming message from Unity has the following fields:
renderMetadata: Type RenderMetadata_t. All of the returned render metadata.
images: Type List<Mat>. All of the requested renders, returned as a list.
An example of the struct in C++ is shown below:

    struct RenderOutput_t {
        RenderMetadata_t renderMetadata;
        std::vector<cv::Mat> images;
    };
RenderMetadata_t Type
The render metadata type structure has the following fields:
ntime: Type int64_t. The timestamp of the render in nanoseconds, as a 64-bit integer.
camWidth: Type int. The camera width of the rendered image.
camHeight: Type int. The camera height of the rendered image.
camDepthScale: Type double. The camera depth scale.
cameraIDs: Type List<string>. The list of rendered camera IDs.
channels: Type List<int>. The number of channels in each rendered camera, as a list.
hasCameraCollision: Type boolean. The collision state of the camera.
lidarReturn: Type float. The height measured by the downward-facing lidar.
An example C++ struct is shown below:

    struct RenderMetadata_t {
        int64_t ntime;
        int camWidth;
        int camHeight;
        double camDepthScale;
        std::vector<std::string> cameraIDs;
        std::vector<int> channels;
        bool hasCameraCollision;
        float lidarReturn;
    };
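The metadata is what lets a client interpret the returned images, because it carries the image dimensions and the channel count of every camera. The sketch below computes the number of bytes a client could expect for each image. It assumes tightly packed pixels with one byte per channel, which is an assumption of this sketch and is not stated on this page; the function name expectedImageBytes is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct RenderMetadata_t {
  int64_t ntime;
  int camWidth;
  int camHeight;
  double camDepthScale;
  std::vector<std::string> cameraIDs;
  std::vector<int> channels;
  bool hasCameraCollision;
  float lidarReturn;
};

// Hypothetical helper: expected byte count per returned image, assuming
// one byte per channel and no row padding (assumptions of this sketch).
inline std::vector<std::size_t> expectedImageBytes(const RenderMetadata_t& m) {
  // One channel count is expected for every rendered camera ID.
  assert(m.cameraIDs.size() == m.channels.size());
  std::vector<std::size_t> sizes;
  sizes.reserve(m.channels.size());
  for (int c : m.channels) {
    sizes.push_back(static_cast<std::size_t>(m.camWidth) *
                    static_cast<std::size_t>(m.camHeight) *
                    static_cast<std::size_t>(c));
  }
  return sizes;
}
```

Comparing these values with the size of each received buffer is a cheap way to catch a mismatch between the requested cameras and the returned data before the buffers are wrapped as images.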