Why do we need a Dynamic Baseline and Thresholding Tool?

With the introduction of opCharts v4.2.5, richer and more meaningful data can be used in decision making. “Forewarned is forearmed,” the proverb goes; a quick Google search tells me it means “prior knowledge of possible dangers or problems gives one a tactical advantage”. The reason we want to baseline and threshold our data is so that we can receive alerts forewarning us of issues in our environment and act to resolve smaller issues before they become bigger ones. Being proactive increases our Mean Time Between Failures. If you are interested in accessing the Dynamic Baseline and Thresholding Tool, please Contact Us.

Types of Metrics

When analysing time series data you quickly start to identify a common trend in what you are seeing. Some of the metrics you are monitoring will be “stable”, that is, they will have very repeatable patterns and change in a similar way over time, while other metrics will be more chaotic, with any discernible pattern difficult to identify. Take for example two metrics, response time and route number (the number of routes in the routing table). You can see from the charts below that the response time is more chaotic, with some pattern but really little stability, while the route number metric is solid and unwavering.
[Chart: meatball response time]
[Chart: meatball route number]

Comparing Metrics with Themselves

This router, meatball, is a small office router with little variation in its routing. A WAN distribution router would be generally stable too, but with a little more variability. How could I get an alarm from either of these without configuring some complex static thresholds?

The answer is to baseline the metric as it is and compare your current value against the baseline. This method is very useful for values which are very different on different devices, where you want to know when the metric changes. Examples are route number, number of users logged in, number of processes running on Linux, and response time in general, but especially the response time of a service.

The opCharts Dynamic Baseline and Threshold Tool

Overall this is what opTrend does. The sophisticated statistical model it builds is very powerful and helps spot these trends with the baseline tool. We have extended opTrend with some additional functionality so that you can quickly get alerts from the metrics which are important to you.

What is really key here is that the baseline tool will detect downward changes as well as upward changes, so if your traffic was reducing outside the baseline you would be alerted.

Establishing a Dynamic Baseline

Current Value

First I want to calculate my current value. I could use the last value collected, but depending on the stability of the metric this might cause false positives. As NMIS has always supported, using a larger threshold period when calculating the current value can give more relevant results.

For very stable metrics using a small threshold period is no problem, but for wilder values a longer period is advised. For response time alerting, using a threshold period of 15 minutes or greater would be a good idea. That means there is some sustained issue and not just a one-off internet blip. However, with our route number we might be very happy to use the last value and get warned sooner.
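The effect of the threshold period can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `current_value`, the list of (timestamp, value) pairs and the sample numbers are all hypothetical, and the tool itself may calculate the value differently.

```python
from datetime import datetime, timedelta
from statistics import mean

def current_value(samples, now, threshold_minutes):
    """Average the samples collected in the last threshold_minutes.

    samples is a list of (timestamp, value) pairs; this layout is an
    assumption for illustration, not the tool's data format.
    """
    if threshold_minutes <= 0:
        # No threshold period: fall back to the most recent sample.
        return max(samples, key=lambda s: s[0])[1]
    start = now - timedelta(minutes=threshold_minutes)
    window = [value for ts, value in samples if start < ts <= now]
    # Return None when nothing was collected inside the period.
    return mean(window) if window else None

now = datetime(2020, 6, 1, 16, 40)
# Hypothetical 5-minute response-time polls (ms) ending in one brief spike.
samples = [
    (now - timedelta(minutes=10), 20.0),
    (now - timedelta(minutes=5), 20.0),
    (now, 80.0),
]
print(current_value(samples, now, 0))   # last value only, spike included
print(current_value(samples, now, 15))  # 15-minute average, spike dampened
```

With the last value alone the spike is reported in full, while the 15-minute average reports a much smaller value, which is why a longer period suits noisy metrics such as response time.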

Multi-Day Baseline

Currently two types of baseline are supported by the baseline tool. The first is what I would call opTrend Lite, which is based on the work of Igor Trubin’s SEDS and SEDS lite. This method calculates the average value for a small window of time, looking back the configured number of weeks. So if my baseline was 1 hour for the last 4 weeks and the time now is 16:40 on 1 June 2020, it would look back and gather the following:

  • Week 1: 15:40 to 16:40 on 25 May 2020
  • Week 2: 15:40 to 16:40 on 18 May 2020
  • Week 3: 15:40 to 16:40 on 11 May 2020
  • Week 4: 15:40 to 16:40 on 4 May 2020

With the average of each of these windows of time calculated, I can now build my baseline and compare my current value against that baseline’s value.
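The window selection described above can be sketched as follows. The function names are hypothetical, and combining the weekly averages with a plain mean is an assumption made for this sketch; the statistical model the tool builds may be more involved.

```python
from datetime import datetime, timedelta
from statistics import mean

def baseline_windows(now, window_hours, weeks):
    """Return (start, end) pairs, one per week looking back from now."""
    windows = []
    for week in range(1, weeks + 1):
        end = now - timedelta(weeks=week)
        start = end - timedelta(hours=window_hours)
        windows.append((start, end))
    return windows

def multi_day_baseline(window_averages):
    """Combine the per-week window averages into one baseline value.

    Taking the mean of the weekly averages is an assumption made for
    this sketch, not a description of the tool's model.
    """
    return mean(window_averages)

now = datetime(2020, 6, 1, 16, 40)
for start, end in baseline_windows(now, window_hours=1, weeks=4):
    print(f"{start:%H:%M} to {end:%H:%M} on {end:%d %B %Y}")
```

Run with the example time of 16:40 on 1 June 2020, the loop yields the same four Monday windows listed above, from 25 May back to 4 May.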

Same-Day Baseline

Depending on the stability of the metric it might be preferable to use the data from that day. For example, if you had a rising and falling value it might be preferable to use just the last 4 to 8 hours of the day for your baseline. Take this interface traffic as an example: the input rate varies, while the output rate is stable, jumps to a sudden plateau, and is then stable again.

[Chart: asgard bits per second]

If this was a weekly pattern the multi-day baseline would be a better option, but if this happens more randomly, using the same-day baseline would generate an initial event on the increase; the event would then clear as the ~8Mbps became normal, and when the value dropped again another alert would be generated.
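The raise, clear and raise-again behaviour can be illustrated with a small simulation. Everything here is hypothetical: the function name, the sample rates, the 4-sample baseline and the 50% tolerance are chosen only to show the effect and are not the tool's settings.

```python
from statistics import mean

def same_day_alerts(values, baseline_len, tolerance_pct):
    """Flag each value that differs from the mean of the preceding
    baseline_len values by more than tolerance_pct percent."""
    flags = []
    for i in range(baseline_len, len(values)):
        baseline = mean(values[i - baseline_len:i])
        change = abs(values[i] - baseline) * 100 / baseline
        flags.append(change > tolerance_pct)
    return flags

# Hypothetical hourly output rate in Mbps: stable, plateau near 8, stable again.
rate = [2, 2, 2, 2, 8, 8, 8, 8, 8, 8, 2, 2]
print(same_day_alerts(rate, baseline_len=4, tolerance_pct=50))
```

The flags are raised while the plateau is new, clear once the recent samples are all around 8 Mbps, and are raised again when the rate drops back.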

Delta Baseline

The delta baseline is only concerned with the amount of change from the baseline. For example, from a sample of data from the last 4 hours we might see that the average of a metric is 100. We then take the current value, for example the spike of 145 below, and calculate the change as a percentage, which would be a change of 45%, resulting in a Critical event level.

[Chart: amor numproc]

The delta baseline configuration then allows the level of the event to be defined based on the percentage of change; with the defaults, the 45% change above would result in a Critical event. The list below is one way to visualize the default configuration.

  • 10 – Warning
  • 20 – Minor
  • 30 – Major
  • 40 – Critical
  • 50 – Fatal

If the change is below 10% the level will be Normal, between 10% and 20% Warning, and so on up to 50% and over, which is considered Fatal.

In practice this spike was brief, and using the 15 minute threshold period (current is the average of the last 15 minutes) the value for calculating change would be 136 and the resulting change would be 36%, so a Major event. The threshold period dampens the spikes to remove brief changes and lets you see changes which last longer.
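The arithmetic above can be checked with a short Python sketch. The threshold values come from the default list above; the function names are hypothetical, and treating a drop the same as a rise (by taking the absolute change) is an assumption of this sketch.

```python
# Default levels as listed above: minimum percentage change -> event level.
LEVELS = [(50, "Fatal"), (40, "Critical"), (30, "Major"), (20, "Minor"), (10, "Warning")]

def percent_change(baseline, current):
    """Size of the change relative to the baseline, as a percentage.

    Using the absolute value (so that drops count too) is an assumption.
    """
    return abs(current - baseline) * 100 / baseline

def event_level(change_pct, levels=LEVELS):
    """Return the highest level whose minimum the change reaches."""
    for minimum, level in levels:
        if change_pct >= minimum:
            return level
    return "Normal"

baseline = 100  # average of the metric over the last 4 hours
for current in (145, 136):
    change = percent_change(baseline, current)
    print(current, change, event_level(change))
```

The raw spike of 145 maps to Critical, while the 15-minute average of 136 maps to Major, matching the worked example.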

Installing the Baseline Tool

Copy the file to the server and run the following; upgrading is the same process.

tar xvf Baseline-X.Y.tgz
cd Baseline/
sudo ./install_baseline.sh

Working with the Dynamic Baseline and Thresholding Tool

The Dynamic Baseline and Threshold Tool includes various configuration options so that you can tune the algorithm to learn differently depending on the metric being used. The tool comes with several metrics already configured. The system requires that stats modelling is completed for the metric you want baselined, as this is how the NMIS API extracts statistical information from the performance database.

Conclusion

For more information about the installation and configuration steps required to implement opCharts’ Dynamic Baseline and Thresholding Tool, see our documentation.

Virtual reality surfing a winner at Opmantek Industry Awards

Enabling surfing coaches and their trainees to use virtual reality (VR) to overcome traditional obstacles in the sport’s training environment has won a team of Griffith Sciences students the top gong at the Opmantek Industry Awards.

The awards recognise industry collaboration, and each nominated project worked with real clients – in this case Surfing Australia’s High Performance Centre and Griffith’s Ideas Lab.

VR Wave team members Jake Ballard, Matthew Faust, Francesco Mennella, Mauro Oliveri, Dominic Rochin and Christopher Wood, who study computer science and information technology at Griffith, were commended on their achievement and innovative thinking.

“Our team was tasked with overcoming two problems surfing coaches face – unpredictable surf conditions and the limitations of traditional coaching methods,” said Dominic.

“To overcome these problems, the project goal was to create a virtual reality wave-selection tool aimed at improving wave identification and selection skills of surfers.”

Wave selection is key, as it has a significant impact on the score surfers can expect during competition. Traditional coaching methods do not allow for instant feedback (as the surfer is in the water), which can limit the usability of feedback and its translation into improved performance.

To solve this problem, the VR Wave team created a virtual reality surfing simulation that enables better coaching communication to improve the surfers’ wave selection.

The team of six was awarded $2500 for their win.

The judging panel for the awards included Opmantek President and Chairman Danny Maher, Gold Coast Innovation Hub CEO Sharon Hunneybell, Australian Computer Society Queensland Branch Executive Committee Member Sharon Singh and Griffith’s Head of the School of Information and Communication Technology Professor Paulo de Souza.

“It is so good to see the talent of all the students flourishing when they go and work with industry, working on practical projects exactly as they would in the workforce,” said Professor de Souza.

Professor de Souza also thanked the award sponsor, Opmantek.

The Opmantek Industry Awards are hosted by Griffith, in collaboration with Opmantek. Opmantek is a multi-award-winning software company operating in the field of intelligent network management and IT audit software.

OPMANTEK WELCOMES NASA ABOARD WITH NEW AGREEMENT

SAN FRANCISCO, CA – (July 07, 2021) – Opmantek Software, one of the world’s leading providers of automated network management software, welcomes NASA onboard with a recent agreement to provide software and support to the Artemis program. In a move that will support the progress of humankind’s space exploration, this agreement will ensure that NASA has the right software to achieve the mission outcomes.

“We are so excited to provide a piece of the complex infrastructure that will help NASA land on the Moon,” said Craig Nelson, CEO of Opmantek. “To think that Opmantek, just like many other suppliers, is part of this mission is fantastic. This mission, just like the Apollo program, will make history and it’s something we can talk to our kids about.”

Opmantek Software is well known for its ability to scale, replace or consolidate older systems and its flexibility to deliver network visibility and automation regardless of the size, location and type of hardware and software infrastructure. It’s validated by MSPs, telecommunications companies, ISPs worldwide and any operation where monitoring the IT Network is critical to successful operations.

Request a demonstration of Opmantek Software or talk to a network engineer about your specific network management, automation or audit requirements.

About Opmantek:
Opmantek is an industry-leading software company operating in the field of Intelligent Network Management, Network Process Automation and IT Audit. Opmantek software manages some of the world’s most complex IT environments, including some of the world’s largest telecommunications carriers and Managed Service Providers. Learn more about Opmantek at www.opmantek.com

About NASA:
The National Aeronautics and Space Administration is America’s civil space program and the global leader in space exploration. The agency has a diverse workforce of just under 18,000 civil servants, and works with many more U.S. contractors, academia, and international and commercial partners to explore, discover, and expand knowledge for the benefit of humanity.

NASA also leads a Moon to Mars exploration approach, which includes working with U.S. industry, international partners, and academia to develop new technology, and send science research and soon humans to explore the Moon on Artemis missions that will help prepare for human exploration of the Red Planet. Learn more about NASA at https://www.nasa.gov

Opmantek Innovation Awards 2021

What is it?

Fostering innovation is the catalyst to business growth. Opmantek are proud to support our industry’s growth by sponsoring Griffith University’s Industry award. This award focuses on encouraging students to apply their skills and innovative ideas to real-world situations, and businesses that are in need of bright solutions.

Building a diverse and forward-thinking team in today’s climate has never been more important. This event hosted by Griffith’s School of ICT will showcase the emerging talent graduating from Griffith University.

Registration

If you’re looking for qualified staff with project experience, or have an interest in entrepreneurship or the ICT field, register your attendance.

The 2021 Opmantek Awards will be held on 1 November 2021 from 4 pm – 6 pm AEST, via Teams.

Previous Winners

Team name: Octadoc

The Project: The goal of the project was to re-design an existing online tool, Octadoc. Octadoc allows a GP to record clinic notes more efficiently by streamlining the process. The team re-designed the existing tool to be user friendly and to incorporate the ability to use different templates or create new templates in such a way that they are organised and shareable with other users.

Team name: Jupiter 305

The Project: Jupiter 305’s project was called the ‘Virtual Space Tour’, a working prototype that aimed to enhance the learning material for 7th grade science students through virtual reality.

Team Name: Vietnamese Agriculture Decision Support App (VADSA)

The Project: VADSA were tasked with developing an Android app that would help farmers, specifically in the Mekong Delta, predict farming outcomes based on environmental conditions. VADSA used professional sources and node collection to run a prediction algorithm that gave suggestions on suitable farming techniques to maximise the probability of successful crop generation. This was all packaged on a mobile device for a farmer to use in the field.

Team name: Vision VR

The Project: Vision VR utilized the unique capabilities of Virtual Reality to tackle the complex issue of Health and Safety for factory and production line workers. In conjunction with TAFE Coomera, the team built an accurate virtual training space for their client, packaging company Orora, to use in training employees to familiarize themselves with the machinery they will use in their day to day operations. The high level of detail and various training modes allow for guided training and assessment of employees’ knowledge of safety features and protocols, and move toward a safer overall operation for both companies and employees.

Introducing Programmable Button Actions

Opmantek has long believed that Operational Process Automation is one of the foundational pillars that a successful network management strategy is built upon. One key piece of this is to ensure that actions are undertaken in a consistent manner each time, with no variance from what is outlined as the standard protocol. opEvents has introduced programmable button actions that help organisations replicate troubleshooting actions and escalation procedures, further solidifying opEvents as a technical service desk.

The buttons use the same pipeline as scripts in EventActions, but now operators have the ability to manually kick off an action for an event. One of the most common actions will be to create a ticket in your issue tracking system; in our case we will create a Jira ticket.

[Screenshot: opEvents programmable buttons]

Configuration

To start, create the following file: omk/conf/table_schemas/opEvents_action-buttons.json. This file must be valid JSON or the buttons will fail to render. If it is not, you should see an error in opEvents.log.

[
  {
    "description": "Example Events Button Action",
    "label": "Create Ticket",
    "fa_icon": "fas fa-jira",
    "script": "create_ticket",
    "tags": ["ticket"]
  }
]

Then add the following policy to omk/conf/EventActions.json or omk/conf/EventActions.nmis, which triggers show_button.tag().

EventActions.json:

"policy": {
  "5": {
    "IF": "event.any",
    "THEN": "show_button.ticket()",
    "BREAK": "true"
  }
}

EventActions.nmis:

%hash = (
  'policy' => {
    '5' => {
      IF => 'event.any',
      THEN => 'show_button.ticket()',
      BREAK => 'true'
    },
  }
);

These are the supported keys and how they change the operation and look of the button.
| Key | Type | Required | Description |
| --- | --- | --- | --- |
| script | string | Yes | Name of the script defined in EventActions.json. |
| label | string | Yes | Label which the button will display to the user. |
| description | string | Optional | Tool-tip help text to be displayed when you mouse over the button. |
| tags | array[string] | Optional | If no tags are defined the button will show on all events. If tags are defined the button will only show on events which have been tagged with show_button.tag_name(). |
| run_once | boolean | Optional | If set to true the button will look for the script.script_name key on the event and, if it is found, the button will disable itself. This allows manual actions to be triggered only once. It will not influence any defined EventActions.json operations. |
| fa_icon | string | Optional | Icon to be displayed from the Font Awesome library shipped with opEvents, for example "fas fa-table-tennis". |
| class | string | Optional | Defines a CSS class to colour the button; see Notes on Button Classes below for a list of supported types. |
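How the tags and run_once keys interact can be sketched in Python. This is not opEvents code: the function `button_state`, the dictionary layout of the event, and the rule that any single matching tag is enough are all assumptions made to illustrate the descriptions above.

```python
def button_state(button, event):
    """Return 'hidden', 'disabled' or 'enabled' for one button on one event.

    Both arguments are plain dicts. event["tags"] stands in for the tags
    set by show_button.tag_name(), and event["script"] for the
    script.script_name keys; this layout is assumed for illustration.
    """
    tags = button.get("tags", [])
    # A button with tags only shows on events carrying one of those tags.
    if tags and not set(tags) & set(event.get("tags", [])):
        return "hidden"
    # run_once disables the button once its script is recorded on the event.
    if button.get("run_once") and button["script"] in event.get("script", {}):
        return "disabled"
    return "enabled"

button = {"label": "Create Ticket", "script": "create_ticket",
          "tags": ["ticket"], "run_once": True}
untagged = {"tags": [], "script": {}}
tagged = {"tags": ["ticket"], "script": {}}
already_run = {"tags": ["ticket"], "script": {"create_ticket": 1}}
for event in (untagged, tagged, already_run):
    print(button_state(button, event))
```

A button with no tags would return "enabled" for every event, since the first check is skipped.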

Notes on Font Awesome

In opEvents-3.2.2 we are shipping the library 5.12.1.
In opEvents-2.6.1 we are shipping the library 5.8.2.