MPlatform SDK: using advanced statistics for application monitoring

We have made it possible to monitor extended statistics for MPlatform objects in your application. The main purpose of this feature is to let you monitor the application state in real time, so that if something goes wrong, you get a signal immediately. This is especially useful for applications that run 24/7.

 

See this documentation page for how to enable and receive the extended object statistics.


Below are some tips that will help you make sure everything is running smoothly in your application.


Monitoring the state of objects


The basic approach to monitoring your application is to check the object state periodically. It does not even require extended statistics. All you need is to call the ObjectStateGet method of each of your objects repeatedly. The method returns the object's current state. "eMS_Running" means everything is fine. "eMS_Error", "eMS_Closed" or "eMS_Stopped" may mean that something is not working the way it should.
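
The polling idea can be sketched in a few lines of Python. This is only an illustration, not SDK code: the "readers" are hypothetical callables that you would implement by calling ObjectStateGet on your own objects and converting the result to the state name. Note that an object you stopped on purpose will also be reported, so filter the result according to what your application expects.

```python
import time
from typing import Callable, Dict, List

# State name quoted in the article as the healthy one.
HEALTHY_STATE = "eMS_Running"


def check_states(readers: Dict[str, Callable[[], str]]) -> Dict[str, str]:
    """Return {object_name: state} for every object that is not running."""
    problems = {}
    for name, read_state in readers.items():
        state = read_state()  # hypothetical wrapper around ObjectStateGet
        if state != HEALTHY_STATE:
            problems[name] = state
    return problems


def poll(readers, interval_s=1.0, cycles=3, sleep=time.sleep) -> List[Dict[str, str]]:
    """Check all objects `cycles` times, pausing `interval_s` between checks."""
    history = []
    for _ in range(cycles):
        history.append(check_states(readers))
        sleep(interval_s)
    return history


if __name__ == "__main__":
    # Stand-ins for real objects; replace the lambdas with real state readers.
    readers = {
        "playlist": lambda: "eMS_Running",
        "writer": lambda: "eMS_Error",
    }
    print(check_states(readers))
```

In a real application you would run the check on a timer and raise an alert when the returned dictionary is not empty.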


Monitoring frame loss


The extended statistics engine will provide you with the following data.


breaks - counts the situations when the frame flow was interrupted, for example, when an MLive object was expecting to receive a frame but did not get it on time. That means that something happened to the frame source and it could not deliver the frame on time. This value should stay low or increase only occasionally. If it increases constantly, something is wrong with the frame flow.


Please note that breaks can also appear in common situations:


  • when the playlist switches from one item to another;
  • when the operator performs seeking;
  • when the object starts and the first frame arrives.


drops (for MWriter and MRenderer objects) - counts frames that were dropped. A frame is dropped when the writer or renderer cannot deliver it to the final destination (for example, in cases of CPU overload). This value should be low and should not increase over time. If it does, something is wrong.


dups (for an MLive object working with a DirectShow device) - counts the situations when MLive was expecting to receive a frame from the device, but did not receive it on time. So, instead of sending a new frame to the output, MLive had to repeat the previous frame to provide a smoother viewing experience. This value should be low. If it isn't, something is wrong with the DirectShow device, as it is not able to provide frames consistently. (Yes, dups is a linguistic typo from dupes.)
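
Since breaks, drops and dups are cumulative counters, what matters is how they grow, not their absolute value. Below is a minimal sketch that separates "increases from time to time" from "increases constantly". The class name, the window size and the 0.5 threshold are our own illustrative choices, not SDK values; you feed it the counter value each time you read the statistics.

```python
from collections import deque


class CounterTrend:
    """Tracks a cumulative counter and reports how often it grows."""

    def __init__(self, window=10, max_growing_fraction=0.5):
        # window + 1 samples give `window` consecutive steps to compare.
        self.samples = deque(maxlen=window + 1)
        self.max_growing_fraction = max_growing_fraction

    def add(self, value):
        self.samples.append(value)

    def growing_fraction(self):
        """Fraction of recent steps in which the counter increased."""
        values = list(self.samples)
        steps = len(values) - 1
        if steps < 1:
            return 0.0
        increases = sum(1 for a, b in zip(values, values[1:]) if b > a)
        return increases / steps

    def is_suspicious(self):
        # Judge only once the window is full, to avoid alarms at startup.
        full = len(self.samples) == self.samples.maxlen
        return full and self.growing_fraction() > self.max_growing_fraction


if __name__ == "__main__":
    trend = CounterTrend(window=10)
    for breaks in [0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2]:  # occasional breaks
        trend.add(breaks)
    print(trend.growing_fraction(), trend.is_suspicious())
```

You may want to reset the tracker around playlist switches and seeks, because breaks are expected in those situations.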


Frame rate stability monitoring


For nice and smooth output, the frequency of frames that flow through MPlatform's objects should be more or less constant. By frame rate we mean the time between the arrival of frame N-1 and frame N. If something is wrong (like too much CPU load), the frame rate changes.


Here are the parameters that we use to monitor frame rate stability:

 

jitter - the average deviation of the frame rate from the normal value over the last 128 frames. This value should stay small; a large value means that the frame rate is not constant.


fps_avg - the average frame rate value for the last 128 frames.


max_time - the maximum frame length value received in the last 128 frames.


min_time - the minimum frame length value received in the last 128 frames.


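These values should be close to what you expect for your stream: fps_avg close to the nominal frame rate, and max_time and min_time close to the corresponding frame length. If they differ too much, the frame rate is not constant and the output might not be smooth.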
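
To make these four values more tangible, here is a sketch that computes similar numbers from frame arrival timestamps. It is an assumption about how such values could be derived, not the SDK's actual formula: in particular, jitter is taken here as the mean absolute deviation of the frame interval from the nominal one. The class and method names are hypothetical.

```python
from collections import deque


class FrameTimingWindow:
    """Keeps the last `window` frame intervals and derives timing statistics."""

    def __init__(self, nominal_fps, window=128):
        self.nominal_ms = 1000.0 / nominal_fps  # expected time between frames
        self.intervals = deque(maxlen=window)
        self.last_ts = None

    def on_frame(self, timestamp_ms):
        """Record the arrival time of a frame, in milliseconds."""
        if self.last_ts is not None:
            self.intervals.append(timestamp_ms - self.last_ts)
        self.last_ts = timestamp_ms

    def stats(self):
        if not self.intervals:
            return None
        n = len(self.intervals)
        mean = sum(self.intervals) / n
        return {
            "fps_avg": 1000.0 / mean,
            "min_time": min(self.intervals),
            "max_time": max(self.intervals),
            # Assumed definition: mean absolute deviation from nominal interval.
            "jitter": sum(abs(i - self.nominal_ms) for i in self.intervals) / n,
        }


if __name__ == "__main__":
    w = FrameTimingWindow(nominal_fps=25)
    for ts in [0, 40, 100, 120]:  # one late frame followed by an early one
        w.on_frame(ts)
    print(w.stats())
```

In this example the average frame rate still looks perfect, and only min_time, max_time and jitter reveal that the frames arrived unevenly. This is why it makes sense to watch all four values together.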


Monitoring audio/video sync


To be sure that audio and video are in sync you need to monitor this parameter:


av_sync - shows the difference between video and audio times in milliseconds. This value should be less than 100 to be sure that audio and video are in sync.
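
A single reading above the limit is not necessarily worth an alarm, so one option is to look at how long the value stays out of range. The sketch below is illustrative only: the function name is ours, and taking the absolute value is an assumption, because the sign convention of av_sync is not described here.

```python
AV_SYNC_LIMIT_MS = 100  # threshold mentioned in the article


def longest_out_of_sync_run(samples_ms, limit_ms=AV_SYNC_LIMIT_MS):
    """Return the longest run of consecutive readings at or above the limit."""
    longest = current = 0
    for value in samples_ms:
        # abs() is an assumption: the value may be reported as signed.
        if abs(value) >= limit_ms:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


if __name__ == "__main__":
    readings = [10, 120, 130, 20, -150]
    print(longest_out_of_sync_run(readings))
```

You could then alert only when the run is longer than a few consecutive readings.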


Monitoring Blackmagic devices


Here are the parameters that will help you to be sure that your Blackmagic output device is working properly:


bmd.missed - counts the situations when frames did not arrive at the MRenderer object from the source on time (similar to drops in other objects). This value should be small or equal to 0 and should not increase during playback. If it does, something is wrong.


bmd.wait_overtime_cnt - counts the situations when a frame arrives at MRenderer too late and cannot be sent to the output. Just like bmd.missed, this value should be small and should not increase during playback.


bmd.hw_correct_cnt - counts corrections for the internal hardware clock of the Blackmagic device. When we play out to a Blackmagic device, we use its hardware clock to control the output frame rate. Sometimes (for unknown reasons) this clock fails and its "ticking" is interrupted. We track these situations automatically and fix them using the system clock. Each time this fix happens, the bmd.hw_correct_cnt value increases. This parameter should be 0. If it increases all the time, your card's clock is unreliable, and we recommend that you replace the card.
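
The rules for these three counters can be written down as a simple comparison of two statistics snapshots. In this sketch the snapshots are plain dictionaries keyed by parameter name; how you fill them from the extended statistics is up to your application, and the function name is hypothetical.

```python
def check_bmd(previous, current):
    """Compare two snapshots of Blackmagic counters and return warnings."""
    warnings = []

    # These two counters may be non-zero but should not grow during playback.
    for key in ("bmd.missed", "bmd.wait_overtime_cnt"):
        if current[key] > previous[key]:
            warnings.append(
                f"{key} increased from {previous[key]} to {current[key]}"
            )

    # The hardware clock correction counter is expected to stay at 0.
    if current["bmd.hw_correct_cnt"] > 0:
        warnings.append(
            f"bmd.hw_correct_cnt is {current['bmd.hw_correct_cnt']}, expected 0"
        )
    return warnings


if __name__ == "__main__":
    before = {"bmd.missed": 1, "bmd.wait_overtime_cnt": 0, "bmd.hw_correct_cnt": 0}
    after = {"bmd.missed": 4, "bmd.wait_overtime_cnt": 0, "bmd.hw_correct_cnt": 2}
    for line in check_bmd(before, after):
        print(line)
```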


There are two more parameters that we use to monitor time corrections:


bmd.skipped - counts frames that were skipped during time sync correction.


bmd.stream_offset - the time offset between current Blackmagic time and original time.


When a frame is dropped and does not arrive at the renderer on time, the renderer spends some time waiting for it. When this happens, a small delay appears between the stream time and the real-time clock. This delay increases during playback each time a frame is dropped (does not arrive on time). Usually, this delay does not matter at all. But sometimes (for example, when 2 playlists have to be output to the renderer in sync) it can be critical to keep the stream time in sync with the real-time clock. The "output.time_sync" parameter enables special logic that corrects the playout time according to the original clock in case of frame drops.
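
The effect of this accumulating delay can be illustrated with a toy model. Please treat it strictly as a sketch: the real correction logic is internal to the renderer, and the rule used here (skip one frame each time the offset reaches a full frame length) is only our assumption, made to show how an offset and a skipped-frames counter could relate to each other.

```python
def simulate_offset(wait_ms_per_frame, frame_ms=40.0, time_sync=False):
    """Toy model of stream time drifting behind the real-time clock.

    wait_ms_per_frame: extra time the renderer waited for each frame
                       (0 when the frame arrived on time).
    Returns (offset_ms, skipped_frames).
    """
    offset = 0.0
    skipped = 0
    for wait in wait_ms_per_frame:
        offset += wait  # every late frame adds to the delay
        if time_sync:
            # Assumed correction rule: skip a frame per full frame of delay.
            while offset >= frame_ms:
                offset -= frame_ms
                skipped += 1
    return offset, skipped


if __name__ == "__main__":
    waits = [0, 25, 0, 25, 0, 25]  # three late frames, 25 ms each
    print(simulate_offset(waits))                  # delay keeps growing
    print(simulate_offset(waits, time_sync=True))  # delay stays under one frame
```

Without correction the delay only grows, while with correction it stays bounded at the cost of skipped frames.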