
Nutanix 5.0 Features Overview (Beyond Marketing) – Part 1

This is a 4-part blog series.

Five months after releasing AOS 4.7, Nutanix is announcing another major release with dozens of new features and enhancements. AOS 5.0 carries the internal engineering codename ‘Asterix’. The velocity with which Nutanix releases features and improvements is comparable only to what we see from AWS and the consumer space, where apps are updated at an amazing pace to users’ delight. This blog post covers features being made available with forthcoming releases of the Nutanix software; if you missed previous release announcements, read them here:

These are the features covered in this blog post:

  • Cisco UCS B-Series Blade Servers Support
  • Acropolis Affinity and Anti-affinity
  • Acropolis Dynamic Scheduling (DRS++)
  • REST API 2.0 and 3.0
  • Support for XenServer TechPreview
  • Network Visualization
  • What-if analysis for New workloads and Allocation-based forecasting
  • Native Self-Service Portal
  • Snapshots – Self Service Restore UI
  • Network Partner Integration Framework
  • Metro Availability Witness
  • VM Flash Mode Improvements
  • Acropolis File Services GA (ESXi and AHV)
  • Acropolis Block Services (CHAP authentication)
  • Oracle VM and Oracle Linux Certified for AHV
  • SAP NetWeaver stack Certified for AHV
  • …more on Part 2

Over the next few weeks the PM and PMM teams will launch a series of blog posts with more in-depth information about features in AOS 5.0. The first one is already up: Ten Things you need to know about Nutanix Acropolis File Services by Shubhika Taneja.

Disclaimer: Any future product or roadmap information is intended to outline general product directions, and is not a commitment, promise or legal obligation for Nutanix to deliver any material, code, or functionality. This information should not be used when making a purchasing decision. Further, note that Nutanix has not determined whether separate fees will be charged for any future product enhancements or functionality which may ultimately be made available, and may choose to charge separate fees for the delivery of any product enhancements or functionality which are ultimately made available.

For official information on features and timeframe refer to the official Nutanix Press Release (here).

Now that we have the legal disclaimer out of the way… let’s get into it!

Platform

Cisco UCS B-Series Blade Servers Support

Today during .NEXT EMEA, Nutanix announced upcoming support for Cisco UCS B-Series blade servers, in addition to the previously announced support for C-Series rackmount servers.


Today, UCS B200-M4 blades are limited to 3.2TB of all-flash raw storage. In many cases this limited flash is inadequate for storage capacity requirements, which is why Cisco and the other hyperconverged manufacturers have limited their solutions to rackmount servers.

Nutanix solves the storage capacity deficiency of B-Series blades by enabling customers to add storage-only nodes to their B-Series blade clusters. Once released, a pair of all-flash C240-M4SX storage-only nodes can be added to the cluster, with up to 24 1.6TB SSDs per node. Nutanix’s unique ability to mix and match nodes with different compute-to-storage ratios makes this possible.

The storage-only nodes enable UCS customers to optimize the balance between blades and C240s, and to scale storage independently of compute. Rather than requiring a large investment in a new storage array as the older unit reaches capacity, customers can expand their environments incrementally as needed.

Since the storage-only nodes run Nutanix AHV, no additional virtualization software license fees are required. The AHV storage-only nodes can be mixed with ESXi nodes in the same cluster.

AMF (Application Mobility Fabric)

Acropolis Affinity and Anti-affinity

VM-Host strict affinity

At times administrators may need to ensure that certain workloads run on the same host. Many companies, for example, have an application running inside a VM whose licensing is tied to a specific host. Administrators can create VM-to-host affinity rules to make sure these virtual machines are not migrated to other hosts. (A toy placement model after the list below illustrates the HA behavior.)

  • Acropolis supports this type of VM during HA and maintenance mode:
    • In reserved-mode HA, resources are reserved so the VM can restart on another host in its affinity list; Acropolis will not allow the VM to power on if the reservation cannot be guaranteed.
    • In best-effort HA, Acropolis will leave a VM powered off if it cannot restart it on another host in its affinity list.
    • In maintenance mode, Acropolis will stop a VM evacuation if it cannot migrate the VM to another host in its affinity list.
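To make that behavior concrete, here is a toy placement model in Python. It is purely an illustration of the strict-affinity constraint, not Nutanix code; all host and VM structures are invented for the example:

```python
# Illustrative model (not Nutanix code) of how a strict VM-host affinity
# rule constrains HA placement: a VM may only restart on hosts in its
# affinity list that have enough free resources.

def pick_restart_host(vm, hosts):
    """Return a host the VM may restart on, or None (VM stays off)."""
    candidates = [
        h for h in hosts
        if h["name"] in vm["affinity_hosts"]       # strict affinity
        and h["free_mem_gb"] >= vm["mem_gb"]       # reservation check
    ]
    # Best-effort HA: if no affinity host can take the VM, it is not
    # restarted elsewhere -- the rule is never violated.
    return min(candidates, key=lambda h: h["free_mem_gb"], default=None)

hosts = [
    {"name": "host-1", "free_mem_gb": 8},
    {"name": "host-2", "free_mem_gb": 64},
    {"name": "host-3", "free_mem_gb": 32},
]
licensed_vm = {"name": "oracle-vm", "mem_gb": 16,
               "affinity_hosts": {"host-1", "host-2"}}

print(pick_restart_host(licensed_vm, hosts))  # -> the host-2 entry
```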

VM-VM preferential anti-affinity

At times certain virtual machines should not run on the same host. For example, most organizations want at least one domain controller available at all times, so they create VM-to-VM anti-affinity rules stating that those virtual machines must run on different hosts, even if performance would be better by combining them.

  • What this means:
    • The VMs carry a preferential anti-affinity policy among themselves.
    • The policy may be violated if the scheduler cannot honor the rule during placement.
    • If DRS cannot resolve the violation, an alert is generated.

Affinity explanation from http://www.virtualizationadmin.com/blogs/lowe/news/affinity-and-anti-affinity-explained.html

Acropolis Dynamic Scheduling (DRS++)

System administrators are familiar with the DRS concept: DRS balances computing workloads against available resources in a virtualized environment, and it is part of most virtualization stacks nowadays.

DRS is closely related to capacity planning, except that capacity planning takes a longer-horizon view, while DRS optimizes within a constrained capacity environment over a much shorter time window.

At first sight AHV Dynamic Scheduling is not much different from existing DRS implementations, but Nutanix factors compute, memory, and storage performance into placement decisions. Nutanix administrators now have peace of mind knowing that the AHV DRS feature will power on VMs on the nodes that minimize the likelihood of resource (CPU, memory, and storage I/O) contention, and will proactively resolve (or generate recommendations to resolve) any transient contention for CPU, memory, or storage I/O.

REST API 2.0 and 3.0

Major changes have been made to the Nutanix REST API for this release, including API versioning, backward compatibility, API sanitization, and standardization. Furthermore, a new REST 3.0 is appearing as part of the platform.

REST 3.0 is a scale-out, intent-based API gateway with a built-in load balancer. Instead of exposing implementation details of the underlying schema (which may change), REST 3.0 exposes higher-order concepts that capture what the user intends to accomplish for a given use case.

By mapping to the user’s intention (what they are actually trying to accomplish), Nutanix can tailor each API endpoint to just the set of parameters that makes sense for a given operation. This removes the burden of implementing Nutanix-specific business logic from the callers and places it inside Nutanix, where it belongs.
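As a hedged illustration of the intent style, the sketch below (Python, using the requests library) shows roughly what a v3 intent call looks like: you POST the desired state of a VM and the platform converges toward it. The payload is simplified, and the Prism Central address, credentials, and cluster UUID are placeholders:

```python
# Minimal sketch of an intent-based v3 call: describe the desired VM,
# not the implementation steps. Payload simplified; the address,
# credentials and cluster UUID below are placeholders.
import requests

PRISM = "https://prism-central.example.com:9440"
AUTH = ("admin", "secret")  # placeholder credentials

vm_intent = {
    "metadata": {"kind": "vm"},
    "spec": {
        "name": "web-01",
        "resources": {
            "num_sockets": 2,
            "memory_size_mib": 4096,
        },
        "cluster_reference": {
            "kind": "cluster",
            "uuid": "<cluster-uuid>",  # placeholder
        },
    },
}

# POST the desired state; the gateway returns a task that converges
# the cluster toward that intent.
resp = requests.post(
    f"{PRISM}/api/nutanix/v3/vms",
    json=vm_intent,
    auth=AUTH,
    verify=False,  # lab only; use proper certificates in production
)
resp.raise_for_status()
print(resp.json()["status"]["state"])
```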

A new Nutanix API portal is also available to help developers quickly get up to speed with both the older APIs and the new REST 3.0 intent specification. The portal provides examples in Python, Java, Go, and PowerShell, and can be accessed at http://developer.nutanix.com or https://nuapi.github.io/docs.

Support for XenServer Tech Preview

It’s something of a recap announcement, but Nutanix now offers support for Citrix applications, including XenApp, XenDesktop, NetScaler VPX, and NetScaler, running on Citrix XenServer on the Nutanix platform. Starting with AOS 5.0, XenServer customers will be able to run XenServer 7 as a Tech Preview on the Nutanix platform.

Read the press release here.

Prism

Network Visualization

If networks are misconfigured, an application’s VMs can stop working or suffer degraded performance. For example, a VLAN misconfiguration can prevent applications from talking to each other, while configuration mismatches such as MTU or link-speed mismatches can degrade performance through excessive packet drops.

What makes network problems hard to troubleshoot is that a misconfiguration in any switch along a network path can cause trouble, and to track it down administrators need a global view of the network configuration.

This is exactly what Network Visualization is intended to solve. It provides a view of the entire network: from each individual VM, to the virtual switches, to the host physical NICs, to the TOR switches, and so on. It displays the configuration of network elements, such as VLAN configuration, in an intuitive and easy-to-use interface, and it allows administrators to easily navigate the network, for example by grouping information by user, project, or host.

Nutanix uses LLDP and/or SNMP to discover and validate the network topology. SNMP is used to discover configuration information from switches; for example, the VLAN info for each port is gathered via SNMP, along with network statistics. Once the topology, configuration, and statistics have been retrieved from the virtual and physical network elements, Nutanix presents the information in an easy-to-use interface. (This first release works only with AHV.)
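To give a flavor of the discovery technique, here is a small Python sketch using the pysnmp library to walk a switch’s Q-BRIDGE-MIB port-VLAN table. It illustrates the kind of SNMP query involved, not Nutanix’s actual implementation; the switch address and community string are placeholders:

```python
# Illustrative SNMP walk of a switch's port-to-VLAN (PVID) table,
# the kind of query used to validate per-port VLAN configuration.
# Not Nutanix code; address and community string are placeholders.
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, nextCmd)

DOT1Q_PVID = "1.3.6.1.2.1.17.7.1.4.5.1.1"  # Q-BRIDGE-MIB::dot1qPvid

for (err_ind, err_stat, _, var_binds) in nextCmd(
        SnmpEngine(),
        CommunityData("public"),                          # placeholder
        UdpTransportTarget(("tor-switch.example.com", 161)),
        ContextData(),
        ObjectType(ObjectIdentity(DOT1Q_PVID)),
        lexicographicMode=False):                         # stop at table end
    if err_ind or err_stat:
        print("SNMP error:", err_ind or err_stat.prettyPrint())
        break
    for oid, pvid in var_binds:
        port = oid.prettyPrint().rsplit(".", 1)[-1]       # last sub-id = port
        print(f"bridge port {port} -> untagged VLAN {pvid}")
```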

What-if analysis for New workloads and Allocation-based forecasting

Pay as you go

  • How many new VMs can a cluster support?
  • If a new SQL Server is needed a month from now, will the cluster cope?
  • If the current workload increases, will the cluster be OK?
  • Given a set of workloads, what kind of new cluster do I need?

What-if analysis is a way to specify future workloads along with the time at which they will arrive. You can describe a workload in terms of existing VMs, for example as 10 additional instances of an existing VM; as a percentage change of the existing workload (supporting both expansion and shrinkage); or as one of the pre-defined common workloads.

For example, you can specify your workload as a business-critical, medium-sized OLTP SQL Server workload, and the what-if tool will estimate its size. The sizing estimates are accurate because the tool is integrated with the Nutanix Sizer, the same tool used for initial deployment recommendations. The what-if analysis supports several pre-defined workloads, such as SQL Server, VDI, Splunk, and XenApp.

Nutanix already provides the runway component view, which uses capacity planning algorithms to predict the runway for each resource and the overall runway for the cluster. Building on that, the what-if analysis can recommend which nodes to add and the dates on which to add them, so that the runway extends all the way to the target runway.

Once you add the workloads and the hardware and the system produces its recommendations, whatever is shown in the what-if UI can be used as a starting plan to be tweaked and tuned. For example, you can adjust the start dates of the various hardware recommendations to fit budget constraints and see how the runway changes, and similarly adjust workload start times; maybe some lower-priority workload can come later. You can tune the plan until you reach the workload and hardware schedule that is optimal for you.
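The arithmetic at the heart of a runway estimate is simple. The rough linear sketch below is my own illustration, not the actual Nutanix algorithm (which models workloads in far more detail); all figures are invented:

```python
# Rough linear illustration of "runway": days until a resource hits
# capacity given current usage and growth, and the effect of adding a
# new workload at a future date. The real tool uses richer models.

def runway_days(capacity, used, daily_growth):
    """Days until `used` reaches `capacity` at a constant growth rate."""
    if daily_growth <= 0:
        return float("inf")
    return max((capacity - used) / daily_growth, 0)

cluster_tib, used_tib, growth = 80.0, 52.0, 0.10  # TiB, TiB, TiB/day
print(f"current runway: {runway_days(cluster_tib, used_tib, growth):.0f} days")

# What-if: a new SQL Server workload landing in 30 days consumes 6 TiB
# up front and grows 0.02 TiB/day thereafter.
used_at_day_30 = used_tib + 30 * growth + 6.0
print(f"runway with new workload: 30 + "
      f"{runway_days(cluster_tib, used_at_day_30, growth + 0.02):.0f} days")
```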

Native Self-Service Portal

AOS 4.6 introduced OpenStack support for AHV, offering drivers for Nova, Cinder, Glance, and Neutron. While OpenStack enjoys broad market adoption and works flawlessly with Nutanix, it is not a native Nutanix solution and cannot leverage many of the advanced Nutanix capabilities, because OpenStack was built to work with all types of underlying infrastructure.

The Nutanix native self-service portal is integrated into Prism and provides access to IT resources and policies with secure tenant-based access. The portal enables tenants to deploy applications without IT intervention, letting organizations give developers and tenants an AWS-like self-service experience.

Admin Portal

  • Create/Manage Projects
  • Create/Add users and groups
  • Assign Resources
  • Assign Actions
  • Run Show-back reports

Tenant Portal

  • Deploy Applications from a Catalog (VM Template, vDisk, Images from Docker Hub, App Templates)
  • Monitor Applications
  • Monitor Resource Usage

Snapshots – Self Service Restore UI

Nutanix AOS 5.0 finally brings user-based file-level restore into the Prism UI for VM users. The feature allows users to recover files and folders on their own VMs securely, with zero intervention from administrators.

Network Partner Integration Framework (Service Insertion, Chaining and Orchestration)

Today at the .NEXT conference in Vienna, Nutanix also announced a new extended networking framework with support for a combination of network connectivity services and network packet processing services.

The combination of service insertion, chaining, and webhooks delivers a wide range of functionality that networking and security partners can build on.

Some of the possible use-cases under current development with partners are:

  • Automated network provisioning workflows that track workload provisioning workflows on Nutanix, such as on-demand provisioning of VLANs on partner switches:
    • When an application (a set of VMs) is launched on Nutanix, the corresponding physical network switches are automatically configured with the appropriate networking policies for that workload.
    • When an application is decommissioned on Nutanix, the corresponding network policies are automatically removed from the physical network switches.
    • When a VM on Nutanix is live-migrated to a different node in the Nutanix cluster (possibly connected to a different port on the same TOR switch, or to a different switch altogether), the network configuration on both the old and the new switch is altered to reflect the change.
  • A “VM-centric” operational view of the network, built by the switch vendor partner from information gleaned from Nutanix and presented to the network admin managing the partner switches, for faster and more accurate troubleshooting. Network admins can perform faster root-cause analysis by tracing the paths, flows, and statistics corresponding to a VM name, tag, or label. This intelligence is built into the physical network database through information Nutanix provides about a VM’s properties (the VM name and associated labels, along with the VM’s IP and MAC addresses).
  • Support for topology discovery (mapping between Nutanix nodes and the corresponding TOR switch ports) through LLDP.


Single Network Packet Processing (NPP) Service insertion

NPP is a network framework supporting cluster-wide service insertion and the enablement of network services running on an AHV cluster. NPP provides support for:

  • A workflow for partner service image and plugin registration
  • Service deployment, cluster-wide or onto a subset of nodes in the cluster
  • Network-level insertion, supporting both bump-in-the-wire and tap insertion modes
  • Guest VM life-cycle event notification to the partner service through plugin invocation
  • Notifications for relevant VM properties, both native properties (IP and MAC addresses) and metadata properties (labels, categories, and names)
  • Selective redirection of traffic to the service (scoped to a subset of vNICs on guest VMs)

Packet Processing Service Chain Framework

Nutanix networking partners now have the ability to inspect, modify, or drop packets as they flow through the AHV network. The service chain framework auto-configures AHV virtual switches to redirect packets to packet-processor VM images and services provided by Nutanix partners. The services available to build upon are:

  • Inline processors – Allows the processor to modify or drop the packets as they flow through a virtual switch.
  • Tap processors – Allows the processor to inspect the packets as they flow through a virtual switch.
  • Processor chains – Allows multiple processors, from the same or different vendors, to be chained together to provide different services.

Webhooks based Event Notification (Network Orchestration)

Nutanix networking partners will be able to receive a webhook notification whenever a given event occurs on a cluster, host, or VM, enabling immediate action. For example, a networking partner that applies policy rules to packet inspection may want to be alerted whenever a VM’s network VLAN is modified or the VM is live-migrated to a different host. Through webhooks the partner can implement very advanced heuristics and workflows to automate the entire datacenter.
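As a sketch of what a consumer of these notifications might look like, here is a minimal Python webhook receiver. The event payload fields used below are hypothetical placeholders; the real schema is defined by the Nutanix webhooks API:

```python
# Minimal webhook receiver sketch for cluster event notifications.
# The JSON payload fields below are hypothetical placeholders; the
# real schema comes from the Nutanix webhooks API documentation.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class EventHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        event = json.loads(body)
        # e.g. react to a live migration by re-applying port policy
        if event.get("event_type") == "VM.MIGRATE":        # hypothetical field
            vm = event.get("entity", {}).get("name", "?")  # hypothetical field
            print(f"{vm} migrated; re-sync switch port policy here")
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), EventHandler).serve_forever()
```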

Watch some of the preliminary partner integration demos.

Brocade

Mellanox

Distributed Storage Fabric (DSF)

Metro Availability Witness

Nutanix Metro Availability does an outstanding job of failing over entire datacenters with a single click. However, some customers feel that the lack of automatic failover during a site or network failure can become a pronounced problem, particularly when business-critical applications are in use or no IT staff is available to execute the DR procedure.

Previously Nutanix didn’t offer automatic failover because it wasn’t possible to differentiate between a site failure and a network partition. AOS 5.0 addresses the issue with a Witness VM that resides outside the failure domain. The Witness VM communicates with each Metro site over a network separate from the Metro inter-site network and helps make the automatic failover decision for Metro Availability. The Witness VM also avoids split-brain scenarios by providing automatic leader election between Metro clusters.

VM Flash Mode Improvements

VM Flash Mode is back in the Prism UI, and it has been improved! VM Flash Mode enables administrators to pin a particular VM running latency-sensitive, mission-critical applications to the SSD tier of a hybrid system. The improvements deliver consistent latency and IOPS (all-flash performance) for key VMs in a hybrid system, QoS tiering for service providers, and higher overall IOPS. I wrote about VM Flash Mode here if you are interested in more details.

Acropolis File Services (AFS)

Acropolis File Services is now GA (ESXi and AHV)

Acropolis File Services (AFS) is an integral and native component of DSF, removing the need for Windows file server VMs or external NAS arrays such as NetApp or EMC Isilon. AFS was a Tech Preview in AOS 4.6 and 4.7; with AOS 5.0 the feature is GA for both the ESXi and AHV hypervisors and may be used in production with full Nutanix support.

Acropolis File Services (Async-DR)

AFS now provides native data protection using native NOS Async-DR. VMs and Volume Groups are protected by a Protection Domain, with all DR-related operations, snapshot schedules, and policies applied to the Protection Domain itself.

Acropolis File Services (AFS Quota)

AFS now provides hard and soft quota limits with configurable email warnings. With a hard limit, the quota cannot be exceeded: if an operation such as a file write would push usage past the limit, the operation fails. With a soft limit, an alert is sent to the recipient, but data writes are permitted. (A toy sketch of these semantics follows below.)

A quota policy is a set of rules specifying the quota to be applied to a user or group: the quota limit (size in GB), the quota type (hard or soft), and whether to notify the user of quota events. A policy can be applied to a single user or to all users of a specified group, including AD groups, with a default policy covering users that match no user or group policy.
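Here is that toy sketch of hard-versus-soft quota semantics, with invented names and sizes (illustrative only, not AFS code):

```python
# Toy model (not AFS code) of hard vs. soft quota semantics: hard
# quotas reject writes that would exceed the limit, soft quotas allow
# the write but trigger a notification.

class QuotaExceeded(Exception):
    pass

def check_write(used_gb, write_gb, limit_gb, hard, notify):
    if used_gb + write_gb > limit_gb:
        if hard:
            raise QuotaExceeded("write rejected: hard quota reached")
        notify(f"soft quota exceeded: {used_gb + write_gb:.1f}/{limit_gb} GB")
    return used_gb + write_gb

alerts = []
used = check_write(9.5, 1.0, 10, hard=False, notify=alerts.append)
print(used, alerts)  # 10.5, warning recorded; write permitted

try:
    check_write(9.5, 1.0, 10, hard=True, notify=alerts.append)
except QuotaExceeded as e:
    print(e)  # write fails under a hard quota
```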


Acropolis File Services (Access Based Enumeration – ABE)

AFS access-based enumeration displays only the files and folders that a user has permission to access. If a user does not have Read (or equivalent) permission for a folder, Windows automatically hides the folder from the user’s view. ABE controls the user’s view of the shared folder based on READ access privileges, including:

  • Showing only those file system objects the user has access to in FIND (directory enumeration) responses.
  • Hiding sensitive file or folder titles for which users do not have READ access.
  • A share-level configuration parameter (“hide unreadable”).
  • Special handling for top-level folders on HOME shares.

Acropolis File Services (Performance and Scale)

AFS has received optimizations to support more than 500 connections per FSVM with 4 vCPUs/16GB. A small 3-node AFS cluster configuration can now scale up to 60 million files/directories per file server.

Acropolis File Services (Performance Optimization Recommendations)

Due to the distributed nature of AFS, some nodes (FSVMs) may be under pressure while other FSVMs sit idle. By redistributing the load across the nodes or adding extra resources, AFS can serve clients with better performance. AFS uses a number of metrics to measure consumption and determine the load-balancing resolution, including the CPU load average, the number of SMB connections, a fixed limit based on memory configuration, and the read/write bandwidth of Volume Groups.

The possible resolutions can be (a toy decision sketch follows the list):

  • Moving Volume Groups: moving some volume groups away from the “hot” FSVM to reduce its load.
  • Scale out: creating new FSVMs to take over volume groups when the existing FSVMs are busy.
  • Scale up: adding CPU and memory resources to all FSVMs.
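The sketch below is a toy decision function for this metric-driven choice, with invented thresholds and metric names; it is not the actual AFS logic:

```python
# Toy illustration (not AFS code) of choosing a load-balancing
# resolution from per-FSVM metrics: prefer moving volume groups off a
# hot FSVM, scale out when every FSVM is busy, scale up when all FSVMs
# are at their connection limit.

def recommend(fsvms, hot=0.80):
    load = [f["cpu_load"] for f in fsvms]
    if all(l > hot for l in load):
        return "scale out: add FSVMs"        # no idle FSVM left to absorb load
    if max(load) > hot:
        busiest = max(fsvms, key=lambda f: f["cpu_load"])
        idlest = min(fsvms, key=lambda f: f["cpu_load"])
        return f"move volume groups from {busiest['name']} to {idlest['name']}"
    if all(f["smb_connections"] >= f["conn_limit"] for f in fsvms):
        return "scale up: add CPU/memory to all FSVMs"
    return "no action"

fsvms = [
    {"name": "fsvm-1", "cpu_load": 0.92, "smb_connections": 480, "conn_limit": 500},
    {"name": "fsvm-2", "cpu_load": 0.35, "smb_connections": 120, "conn_limit": 500},
    {"name": "fsvm-3", "cpu_load": 0.41, "smb_connections": 150, "conn_limit": 500},
]
print(recommend(fsvms))  # -> move volume groups from fsvm-1 to fsvm-2
```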

Once a recommendation is generated, a “Load Balancing” button appears in the recommendation column of the file server tab, and the administrator can override the recommendation:

  • A volume-group move can be overridden with a scale-up.
  • A scale-out can be overridden with a scale-up.
  • A scale-up recommendation cannot be overridden.

Once the user has chosen a load balancing action, a task will be created to perform the action.

Acropolis Block Services (Scale-Out SAN)

Acropolis Block Services provides highly available, scalable, high-performance iSCSI block storage to guests. ABS builds atop the Acropolis Volume Group services that have been available since the AOS 4.5 release. Volume Groups provide block storage and are particularly important for enterprise applications that do not support NFS datastores, or that require block storage shared across deployment instances. Use cases include Microsoft Exchange on ESXi, Windows 2008 guest clustering, Microsoft SQL Server 2008 clustering, and Oracle RAC.

Acropolis Block Services (CHAP authentication)

  • Challenge-Handshake Authentication Protocol: a shared “secret” is known to the authenticator and the peer.
  • CHAP protects against replay attacks by the peer through an incrementally changing identifier and a variable challenge value. Both the client and the server must know the plaintext of the secret, although it is never sent over the network (see the sketch below).
  • Mutual CHAP: the client also authenticates the target. With this level of security, the target and the initiator authenticate each other, with a separate secret set for each target and each initiator.
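The challenge-response at the heart of CHAP (RFC 1994) is easy to demonstrate: the response is a one-way hash over an identifier, the shared secret, and a random challenge, so the secret itself never crosses the wire. A minimal Python illustration:

```python
# CHAP challenge-response per RFC 1994: response = MD5(id + secret + challenge).
# The shared secret never travels over the network; only the challenge
# and the hash do. (iSCSI CHAP works the same way.)
import hashlib, os

secret = b"shared-secret-known-to-both-sides"

# Authenticator (e.g. the iSCSI target) sends an identifier and a random challenge.
chap_id, challenge = bytes([0x01]), os.urandom(16)

# Peer (the initiator) proves knowledge of the secret without revealing it.
response = hashlib.md5(chap_id + secret + challenge).digest()

# Authenticator recomputes the hash and compares.
expected = hashlib.md5(chap_id + secret + challenge).digest()
print("authenticated" if response == expected else "rejected")
# Mutual CHAP simply repeats the exchange in the other direction with a
# second secret, so the initiator also authenticates the target.
```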

Other ABS Improvements:

  • Dynamic Load Balancing
  • Flash Mode for Volume Groups
  • IP-based initiator whitelisting
  • Initiator management
  • Wider client support: RHEL 7, OL 7, ESXi 6
  • Online LUN resize

Workload Certifications

Nutanix has also announced that AHV is now certified for Oracle VM and Oracle Linux with ABS, and also for the SAP NetWeaver stack. This is major news for all the enterprise customers that have been wanting to move business-critical applications to the Nutanix platform and were waiting for Oracle and SAP support.

Today Nutanix also announced native AHV 1-click microsegmentation; however, this is not a feature being released with the upcoming releases. For official information on features and timeframes refer to the official Nutanix Press Release (here).

That’s a long list of features, but that’s not all… I will soon release the second part of this blog series with many more new features. Stay tuned!

This article was first published by Andre Leibovici (@andreleibovici) at myvirtualcloud.net

