This is the 17th article in the “Real Words or Buzzwords?” series about how real words become empty words and stifle technology progress, also published on SecurityInfoWatch.com.
By Ray Bernard, PSP, CHS-III
While many people equate high availability with good user experience, many more factors are likely to impact that user experience than server or network uptime.
Real Words or Buzzwords?
The Bi-Weekly Article Series
#1 Proof of the buzzword that killed tech advances in the security industry—but not other industries.
#2 Next Generation (NextGen): A sure way to tell hype from reality.
#3 Customer Centric: Why all security industry companies aren't customer centric.
#4 Best of Breed: What it should mean to companies and their customers.
#5 Open: An openness scale to rate platforms and systems
#6 Network-friendly: It's much more than network connectivity.
#7 Mobile first: Not what it sounds like.
#8 Enterprise Class (Part One): To qualify as an Enterprise Class system today is worlds beyond what it was yesterday.
#9 Enterprise Class (Part Two): Enterprise Class must be more than just a top-level label.
#10 Enterprise Class (Part Three): Enterprise Class must be 21st century technology.
#11 Intuitive: It’s about time that we had a real-world testable definition for “intuitive”.
#12 State of the Art: A perspective for right-setting our own thinking about technologies.
#13 True Cloud (Part One): Fully evaluating cloud product offerings.
#14 True Cloud (Part Two): Examining the characteristics of 'native-cloud' applications.
#15 True Cloud (Part Three): Due diligence in testing cloud systems.
#16 IP-based, IP-enabled, IP-capable, or IP-connectable?: A perspective for right-setting our own thinking about technologies.
#17 Five Nines: Many people equate high availability with good user experience, yet many more factors are critically important.
#18 Robust: Words like “robust” must be followed by design specifics to be meaningful.
#19 Serverless Computing – Part 1: Why "serverless computing" is critical for some cloud offerings.
#20 Serverless Computing – Part 2: Why full virtualization is the future of cloud computing.
#21 Situational Awareness – Part 1: What products provide situational awareness?
#22 Situational Awareness – Part 2: Why system designs are incomplete without situational awareness.
#23 Situational Awareness – Part 3: How mobile devices change the situational awareness landscape.
#24 Situational Awareness – Part 4: Why situational awareness is a must for security system maintenance and acceptable uptime.
#25 Situational Awareness – Part 5: We are now entering the era of smart buildings and facilities. We must design integrated security systems that are much smarter than those we have designed in the past.
#26 Situational Awareness – Part 6: Developing modern day situational awareness solutions requires moving beyond 20th century thinking.
#27 Situational Awareness – Part 7: Modern day incident response deserves the help that modern technology can provide but doesn’t yet. Filling this void is one of the great security industry opportunities of our time.
#28 Unicity: Security solutions providers can spur innovation by envisioning how the Unicity concept can extend and strengthen physical access into real-time presence management.
#29 The API Economy: Why The API Economy will have a significant impact on the physical security industry moving forward.
#30 Future-Proof: What does Future-Proof mean in an era of managed services, continuous delivery, and ever-accelerating technology advancement?
#33 Software-Defined: Cloud-computing technology, with its many software-defined elements, is bringing self-scaling real-time performance capabilities to physical security system technology.
#34 High-Performance: How the right use of "high-performance" can accelerate the adoption of truly high-performing emerging technologies.
#35 Erasure Coding: Why RAID drive arrays don’t work anymore for video storage, and why Erasure Coding does.
#36 Presence Control: Anyone responsible for access control management or smart building experience must understand and apply presence control.
#37 Internet+: The Internet has evolved into much more than the information superhighway it was originally conceived to be.
#38 Digital Twin: Though few in physical security are familiar with the concept, it holds enormous potential for the industry.
#39 Fog Computing: Though commonly misunderstood, the concept of fog computing has become critically important to physical security systems.
#40 Scale - Part 1: Although many security-industry thought leaders have advocated that we should be “learning from IT,” there is still insufficient emphasis on learning about IT practices, especially for large-scale deployments.
#41 Scale - Part 2: Why the industry has yet to fully grasp what the ‘Internet of Things’ means for scaling physical security devices and systems.
#42 Cyberspace - Part 1: Thought to be an outdated term by some, understanding ‘Cyberspace’ and how it differs from ‘Cyber’ is paramount for security practitioners.
#43 Cyber-Physical Systems - Part 1: We must understand what it means that electronic physical security systems are cyber-physical systems.
#44 Cyberspace - Part 2: Thought to be an outdated term by some, understanding ‘Cyberspace’ and how it differs from ‘Cyber’ is paramount for security practitioners.
#45 Artificial Intelligence, Machine Learning and Deep Learning: Examining the differences in these technologies and their respective benefits for the security industry.
#46 VDI – Virtual Desktop Infrastructure: At first glance, VDI doesn’t seem to have much application to a SOC deployment. But a closer look reveals why it is actually of critical importance.
#47 Hybrid Cloud: The definition of hybrid cloud has evolved, and it’s important to understand the implications for physical security system deployments.
#48 Legacy: How you define ‘legacy technology’ may determine whether you get to update or replace critical systems.
#49 H.264 - Part 1: Examining the terms involved in camera stream configuration settings and why they are important.
#50 H.264 - Part 2: A look at the different H.264 video frame types and how they relate to intended uses of video.
More to come about every other week.
I was checking out the ID badge issuance process, when suddenly the browser displayed a system error page, with details about the database error and the software code involved. The menu was still at the top of the page, so I navigated to the first page of the badge process, which listed the personnel in the demo account, but this time all of the photos were the same: a placeholder image that said, “No Photo”. Navigating around, I saw that anything having to do with Personnel showed the No Photo image, and seemed to be missing some parts of the personnel data previously displayed. Whatever I did had changed the state of something in the system, and it wasn’t correcting itself.
This occurred at 11:30 PM, and I was concerned that I might have taken some portion of their system offline. There was no tech support phone number to call. So, making notes and taking screenshots, I emailed the head of marketing (it was a website issue) and the head of sales (since this was the demo account that sales uses).
The next morning the problem was still there, and I had received no feedback from my email. So, I called the head of marketing, who said, “Don’t worry about it. It’s a completely separate computer just for demos. It’s not our actual system. It’s not really up to date.” I was speechless, so she continued.
“Our actual system is on Microsoft Azure, and that guarantees 99.999% uptime. So, we wouldn’t have that same kind of problem. Don’t worry about it.” So, I didn’t. I just crossed that company off my list of candidate cloud system vendors and went on with other work. There were so many things wrong with the responses that it would have taken way too long to write an educational note back about them.
The #1 question I should have been asked was, “What prompted you to try our demo system?” I thought for sure the sales manager would follow up on a potentially hot prospect. After all, I took the time to write them an explanatory note and provide screenshots. How many other people ran into that problem and just closed the browser, mentally writing off the company and the product?
What’s Wrong with 99.999% Uptime?
More recently, I have had several sales people mention high availability and state that Microsoft Azure guarantees them “five nines” of uptime, referring to their SaaS (Software as a Service) offering. There are several things wrong with this thinking.
- No Guarantee for SaaS. SaaS vendors receive an uptime guarantee for the Platform as a Service (PaaS) or Infrastructure as a Service (IaaS) services they subscribe to. A cloud infrastructure provider can’t possibly guarantee that your cloud SaaS application won’t fail, so they don’t. Cloud application developers know this, and most (but evidently not all) sales folks do too.
- It’s Not 99.999%. In its Service Level Agreement (SLA) for Virtual Machines, Microsoft Azure states (with my comment in brackets): “For all Virtual Machines that have two or more instances deployed in the same Availability Set [redundant virtual servers running on different physical hardware], we guarantee you will have Virtual Machine Connectivity to at least one instance at least 99.95% of the time. For any Single Instance Virtual Machine using premium storage for all Operating System Disks and Data Disks, we guarantee you will have Virtual Machine Connectivity of at least 99.9%.” This means that allowable downtime per month is about 22 minutes for 99.95%, and 44 minutes for 99.9%.
- Uptime Itself is not Guaranteed. The guarantee provides a service credit if the monthly uptime percentage target is not met. For Virtual Machines in an Availability Set, less than 99.95% gets you a 10% credit, less than 99% gets you a 25% credit, and less than 95% gets you a 100% credit, with the credit being applicable to the next month’s billing.
- Downtime Excludes Maintenance. Continuing about Azure, for “periods of Downtime related to network, hardware, or Service maintenance or upgrades impacting Single Instances . . . we will publish notice or notify you at least five (5) days prior to the commencement of such Downtime.” Scheduled downtime is usually short (like a reboot), so it should rarely be a significant issue.
- Read the SLA details. Outside of the top five cloud providers, Service Level Agreement (SLA) terms can vary more than you would expect. Some SLAs state that downtime must be continuous, meaning that four separate 11-minute outages do not count as 44 minutes of downtime. They count as zero downtime under the 99.9% uptime target, because none of the outages lasted 44 minutes. (That’s not how Azure does it, but some other providers calculate it that way.)
- Don’t Brag About Infrastructure Uptime. Making a big deal about uptime sends the message that your service didn’t use to have high uptime. Besides, it’s a brag about something that most companies have in common with their competitors, so its value is minimal at best. Just mention uptime, and brag instead about your application’s features.
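The downtime arithmetic behind these points is easy to check for yourself. The short Python sketch below is only an illustration, not any provider's actual SLA formula: the function names are made up for this example, a 30-day month is assumed (a 31-day month allows slightly more, about 44.6 minutes at 99.9%), and the list of outages is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(uptime_pct, days=30):
    """Downtime budget in minutes for an uptime target over `days` days.

    Assumption: the SLA period is a 30-day month unless stated otherwise.
    """
    return (1 - uptime_pct / 100) * days * MINUTES_PER_DAY


def counted_downtime_minutes(incidents, min_continuous=0.0):
    """Total the outage durations (in minutes) that count against the SLA.

    `min_continuous` models an SLA that ignores any outage shorter than
    a given continuous length; 0 means every outage counts.
    """
    return sum(d for d in incidents if d >= min_continuous)


def meets_target(uptime_pct, incidents, days=30, min_continuous=0.0):
    """True if the counted downtime stays within the downtime budget."""
    counted = counted_downtime_minutes(incidents, min_continuous)
    return counted <= allowed_downtime_minutes(uptime_pct, days)


# Monthly downtime budgets for three common targets.
for target in (99.9, 99.95, 99.999):
    budget = allowed_downtime_minutes(target)
    print(f"{target}%: {budget:.1f} min per 30-day month")

# Hypothetical month with four separate 11-minute outages.
outages = [11, 11, 11, 11]
budget = allowed_downtime_minutes(99.9)
print(meets_target(99.9, outages))                         # cumulative counting
print(meets_target(99.9, outages, min_continuous=budget))  # continuous-only counting
```

Under these assumptions, 99.9% allows about 43.2 minutes a month and 99.95% about 21.6 minutes, while a true “five nines” target would allow less than half a minute. The same four outages miss the 99.9% target when downtime is added up, yet pass when only continuous outages as long as the whole budget are counted, which is why the counting rule in an SLA matters as much as the headline percentage.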
Infrastructure Uptime is Generally Good and Improving
With a big-name cloud provider, high availability for infrastructure is typically very good, and it keeps improving. For any cloud infrastructure provider, information should be available on their overall record of service, including how well they meet uptime targets. Vendors of cloud-based security applications get no real benefit from talking at length about infrastructure uptime. Customers assume it will be good (unless it’s the vendor’s own in-house data center). Customers are more concerned about how they can use the cloud application to improve their security-effectiveness or cost-effectiveness. Providing case study examples around these two factors will typically have a much higher sales ROI than talking about cloud data center technical details.
The User Experience is What Counts
The reason I started this article with a story of a cloud system problem is that many people I have talked to equate high availability with good user experience, when many more factors are likely to impact user experience than server or network uptime.
For example, one of the five essential cloud computing characteristics is “resource pooling using a multi-tenant model”, illustrated nicely on WhatIsCloud.com. Multitenancy allows several cloud system users to share the same underlying IT resource or its instance while each remains unaware that it may be used by others. The primary consideration I hear discussed is data isolation: one cloud system customer cannot access the data of another cloud system customer, which is important. However, application performance is also an important part of the picture.
I have used a few client cloud-based systems whose responsiveness varied greatly depending upon the time of day or on some other invisible factor. This violates the principle that resource sharing should have no impact on users. With one system I checked out, some functions were not available due to a database issue, according to the error information provided, which suggested retrying after a short wait. Retrying worked, and I couldn’t tell if the system had recovered from a coding error in the application, or if the database resource had reached 100% utilization and simply wasn’t available. I checked out the same function again a few weeks later, running lots of searches to see if the problem would recur, and it didn’t. In fact, the system seemed consistently more responsive than before.
That’s the result we should expect from a cloud-based system: continual improvement in features and performance.
My point is that multitenancy means more than just data isolation. It also means engineering the system so that the multitenancy aspect of the system has no negative impacts on the user experience. And customers have a right to expect that.
Ray Bernard, PSP, CHS-III, is the principal consultant for Ray Bernard Consulting Services (RBCS), a firm that provides security consulting services for public and private facilities (www.go-rbcs.com). He is the author of the Elsevier book Security Technology Convergence Insights, available on Amazon. Mr. Bernard is a Subject Matter Expert Faculty of the Security Executive Council (SEC) and an active member of the ASIS International member councils for Physical Security and IT Security.