Archive | 2014

Upgrading my FAS3270 OnTap 8.2.1 to 8.3RC1 in my lab

Well, it’s time! This morning the first release candidate of OnTap 8.3 was released! Happy day!

I figured I would post about the upgrade process. A raw, uncut look at how the hell you do it! Maybe you have never seen it. Maybe you just want something to copy and paste. Well, here it is!

Click to continue reading “Upgrading my FAS3270 OnTap 8.2.1 to 8.3RC1 in my lab”

4 Comments Continue Reading →

Storage Field Day #6: Livin La Pure’a Vida

Last year, I went to Costa Rica. The greeting they have for everyone is Pura Vida: “Pure Life”. That has a new meaning for me now.

As you may know, recently, I was a delegate for Storage Field Day 6 #SFD6. I had the pleasure of heading to Pure Storage’s HQ. Read on to see what I found out!

Click to continue reading “Storage Field Day #6: Livin La Pure’a Vida”

20 Comments Continue Reading →

My NS0-504 Study Notes – 4 SCALABLE SAN IMPLEMENTATION TESTING AND TROUBLESHOOTING

SCALABLE SAN IMPLEMENTATION TESTING AND TROUBLESHOOTING

Please click here to go to the main NS0-504 Study Notes blog post to see all sections and additional information

4.1 Be able to create an acceptance test plan.

 
4.2 Test host to storage connectivity

  • Possible failures in connectivity?
    • Improper zoning
    • NPIV not enabled on switches

 
4.3 Test LUN availability during failover scenarios (multipathing).

 
4.4 Test controller failover scenarios (multipath HA).

Leave a comment Continue Reading →

My NS0-504 Study Notes – 2 SCALABLE SAN IMPLEMENTATION PLAN CREATION

Please click here to go to the main NS0-504 Study Notes blog post to see all sections and additional information

2. SCALABLE SAN IMPLEMENTATION PLAN CREATION

2.1 Verify and plan for dual power feeds for all components.

  • DUAL is the key word
    • Dual Feeds
    • Dual PDUs
    • Dual Power Supplies
  • Controllers and shelves all have dual power supplies. Power should be plugged into A and B circuits. Each leg of power should be able to handle the load of the entire environment (N+N)
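That N+N requirement can be sanity-checked with simple arithmetic. A minimal Python sketch, where all wattages and the circuit rating are made-up illustration numbers (pull real figures from the Site Requirements Guide for your models):

```python
# Hypothetical wattages for illustration only.
components_watts = {
    "controller_pair": 800,
    "disk_shelf_1": 400,
    "disk_shelf_2": 400,
}

def nn_power_ok(components, circuit_capacity_watts):
    """N+N: either power leg alone must be able to carry the full load."""
    return sum(components.values()) <= circuit_capacity_watts

# A 20A/208V circuit derated to 80% continuous load is roughly 3328 W.
print(nn_power_ok(components_watts, 3328))  # True: 1600 W fits on one leg
```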
  • Refer to Site Requirements Guide for more information, and power/temp specs.
  • Verify that Controllers/Shelves were ordered with the proper plug style for your nation/voltage/outlet.  (For example NEMA L5-20  vs IEC c13-c14. See common cable types here. )

2.2 Be able to create cabinet diagrams or be able to read and interpret a cabinet diagram. Diagrams should include the cabinet’s storage systems and switches with all connections shown.

  • Core info
    • Rack Location
    • Cabinet U positions of servers
    • Switch Placement
  • Utilize Visio or NetApp Synergy to mock up the layout and available space.
  • If the diagram maps exact ports to exact switches, keep an eye on what may be trunked, that FC isn't plugged into the wrong ports, etc.

 

2.3 Create a connectivity diagram. Be able to read and interpret a connectivity diagram.

  • Fibre Channel
    • Port names/Locations
      • Initiator vs Target port setting
      • FC vs UTA
    • FC Switch port connectivity
      • WWPN
      • Port Locations
      • Host name, Zone name, Aliases
  • IP
    • Switch name / port
    • VLANs
    • Trunks / LACP
    • VIF names
  • See section 3 for more detailed information

 

2.4 Plan storage controller configuration.

  •  Important details
    • Cluster naming, automatic node names vs manually “renamed”
    • SVMs you'll need, and the protocols they will use
    • DNS Server, default domains

2.5 Plan host configuration.

  • Protocols needed?
  • Multipathing software installed?
  • Snapdrive and Host Utilities installed (FC or iSCSI)?
  • Verify paths
  • HBAnywhere (Emulex) or SANsurfer (Qlogic) for FC installed?

2.6 Create a Snapshot plan.

  • What’s the customer’s RPO (Recovery Point Objective)?
    • It is the maximum tolerable period in which data might be lost from an IT service due to a major incident. The RPO gives systems designers a limit to work to. (Wikipedia)
  • What’s the customer’s RTO (Recovery Time Objective)?
    • The targeted duration of time and a service level within which a business process must be restored after a disaster (or disruption) in order to avoid unacceptable consequences associated with a break in business continuity. (Wikipedia)
  • Expected Space Reservations needed?
    • What’s the expected rate of change? Plan for the needed space within the volumes for this times the number of snapshots retained
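A rough sketch of that space math in Python (the 2% daily change rate and 14-snapshot retention below are hypothetical example values, not recommendations):

```python
def snapshot_reserve_gb(volume_gb, daily_change_rate, snapshots_retained):
    """Space needed inside the volume: rate of change x snapshot count."""
    return volume_gb * daily_change_rate * snapshots_retained

# 500 GB volume, 2% daily change, 14 daily snapshots retained
print(snapshot_reserve_gb(500, 0.02, 14))  # 140.0
```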

 

2.7 Plan Ethernet switch configuration.

  • Remember- at least 2 different types of switches
    • Management
      • Flat lights-out management switch config. Single VLAN. (See section 3 for example)
    • Cluster Interconnect
      • These have specific configurations based on the number of cluster nodes. (See section 3 for example)
    • Data (management may be on these normal switches)
      • VLANs needed?  Any routing?
        • iSCSI
        • NFS
        • Management / iLO?
        • User desktops
        • Servers (Web/App/DB tiers)
      • MTU?
        • Jumbo: 9000 MTU?  (needed for FCoE)
        • Normal: 1500 MTU?
      • QoS?

 

2.8 Plan zoning configuration.

2.9 Plan iSCSI configuration.

  • Gather all IQNs; which igroups will they belong to?
  • Add multiple IP addresses, one for each LIF to the discovery and static bindings
  • Enable MCS (Multiple Connections per Session)
  • iSCSI boot from SAN needed? (May influence dynamic vs static persistence in iSCSI discovery, and also host LUN IDs used)
  • Use VLANs to ISOLATE Traffic!
  • CHAP or IPSEC needed?

 

1 Comment Continue Reading →

My NS0-504 Study Notes – 1 SCALABLE SAN SOLUTION ASSESSMENT

Please click here to go to the main NS0-504 Study Notes blog post to see all sections and additional information

1. SCALABLE SAN SOLUTION ASSESSMENT

1.1 Ensure that all prerequisites for the installation of NetApp system and switches (if needed) are met and that the required information to configure NetApp systems is collected.

Site Information

List the networks used for Cluster Management IP, Node Management IP, and Service Processor IP.

 

1. Networks
   Record for each network: Subnet, Netmask, Gateway, Purpose

2. DNS
   Record: DNS Domain Names, Name Server IP Addresses

Cluster Information

1. Licenses
   Record: SO Number, Serial Number, Cluster S/N, Model, Protocol Name, License Key

2. Cluster
   Record: Cluster Administrator’s (username “admin”) Password, Cluster Name, Cluster Mgmt Port, Cluster Mgmt IP

3. Node(s)
   Record for each node: Location, Node Name, Node Mgmt Port, Node Mgmt IP

 

I. Post Cluster Setup Wizard
  • The Cluster Setup Wizard will name nodes CLUSTERNAME-XX, with XX being consecutive numbers from 01 to 24. To rename via the clustershell ::>

node rename -node OLDNAME -newname NEWNAME

 

II. Service Processor Configuration
  • After the Cluster Setup Wizard, at a minimum and before leaving the site, the Service Processor must be configured and reachable. Configure via the clustershell ::>

system node service-processor network modify -address-type IPv4/IPv6 -node NODENAME -enable true/false -dhcp v4/none -ip-address X.X.X.X -netmask X.X.X.X -gateway X.X.X.X

Record for each node: Node Name, Address Type (IPv4/IPv6), Enable (true/false), DHCP (v4/none), IP Address

1.1.1 Collect NetApp storage system configuration information.

Various NetApp information:

  • Maximum number of nodes in a SAN cluster in CDOT 8.1 is 4, in CDOT 8.1.2 it went up to 6, and in CDOT 8.2 it went up to 8.

 

1.1.2 Collect Switch configuration information.

1.1.3 Gather power information such as circuit availability, wiring in place, etc…
1.1.4 Collect Host configuration information.
1.1.5 Collect Application configuration and requirements.

  • Application Names / Software Version
  • OS / Version
  • IOPS needed, peak / average

1.1.6 Collect potential DEDUPE information.

 

1.1.7 Collect backup and retention information.

  • Hourly?
  • Daily?
  • Weekly?
  • Nightly?
  • Monthly?
  • Years kept?
  • Distance requirements?

 


1.2 List a detailed inventory of SAN components including:
1.2.1 NetApp storage system configuration details

1.2.2 Host details

  • FC Supported OSes: ( netapp.com )
    • Windows®, VMware, Solaris™, Oracle® Enterprise Linux®, Red Hat Linux, SUSE Linux, AIX, HP/UX, NetWare, and OpenVMS
  • Verify Host Utility Kit installation
    • HUK sets timeouts
    • HUK lets the SAN see basic OS-side data (filesystem paths, etc.)
  • Verify SnapDrive installation
    • Allows easy access to basic SAN functionality from the mounting OS (create snaps, mount volumes, create volumes)

1.2.3 FC switch details

[Image: four types of switches supported]

 

1.2.4 Ethernet switch details

  • “sh run” on most switches
  • “sh cdp neighbor” to see what other devices are connected

 

1.2.5 Current zoning configuration

  • On the NetApp:  fcp adapter show
  • On Netapp: system node run -node {nodename|local} fcp topology 
  • Brocade FOS common commands:
    • zoneshow
      • Displays zone configuration (“Zone Info” in the WebTools GUI)
    • cfgshow
      • Displays the ‘zoneset’
    • portCfgShow
      • Shows port configuration
    • portshow
      • Shows port status, including NPIV information
    • portloginshow
      • Displays port login status of devices attached to the specified port
    • cfgActvShow
      • Displays the ACTIVE zone configuration

 

  • Cisco MDS 9k command quick reference
    • show zoneset active
      • Displays the active zone set for each VSAN, including which devices have logged in and are communicating
    • show zone
      • Displays zone mappings
    • show zone vsan X
      • Displays zone mapping ONLY for vsan X
    • show fcalias
      • Displays alias mappings
    • show flogi database
      • Displays the devices that have logged into the fabric. Retrieve pWWNs for aliases and zoning from this database
    • show fcns database
      • Displays device name server registration information per VSAN

 

1.2.6 Current iSCSI implementation details
1.2.7 CHAP settings
1.2.8 IPSEC configuration details
1.2.9 Snapshot configuration details
1.2.10 Current data layout (aggregates, raid groups, volumes)
1.2.11 Consider listing out system names, IP addresses, current zoning configuration, OS versions, OS patch levels, driver versions and firmware versions


1.3 Ensure that the solution design and the hardware provisioned do not fall short of the customer’s requirements and expectations.
1.3.1 Validate requirements with the customer. Consider the following:
1.3.1.1 Sizing needs

  • Total Capacity, expected growth
  • Bandwidth throughput
    • Per end host and application
    • Per SAN for all hosts
    • Per SAN for backups
    • For replication
  • IOPs requirements
    • Perform simple IOPS calculations per existing application and per existing SAN
    • Look at the IOPS differences between daytime, backup-window, and nightly processing
  • Future needs
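A simple IOPS roll-up for the sizing above might look like this. The workload names and numbers are invented for illustration, and the 20% headroom factor is a common sizing habit, not a NetApp rule:

```python
# Hypothetical per-application IOPS figures.
workloads = [
    {"name": "exchange", "avg": 2000, "peak": 3500},
    {"name": "sql", "avg": 1500, "peak": 4000},
    {"name": "fileshare", "avg": 500, "peak": 800},
]

def required_iops(workloads, headroom=1.2):
    """Sum the worst of avg/peak per application, then add headroom."""
    return sum(max(w["avg"], w["peak"]) for w in workloads) * headroom

print(round(required_iops(workloads)))  # 9960
```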

1.3.1.2 Connectivity needs
1.3.1.3 Zoning types

  • single initiator zoning

1.3.1.4 Expected level of functionality
1.3.1.5 Performance requirements
1.3.1.6 Solution requirements being provided by a third party

1 Comment Continue Reading →

My NS0-504 Study Notes – 3 SCALABLE SAN IMPLEMENTATION AND CONFIGURATION TASKS

Please click here to go to the main NS0-504 Study Notes blog post to see all sections and additional information

3 SCALABLE SAN IMPLEMENTATION AND CONFIGURATION TASKS

3.1 Prepare site for installation.

3.1.1 Be able to review implementation flowchart with customer and assign task areas.

 

3.1.2 Verify site infrastructure including: dual power, floor space, floor loading plan, HVAC.

Sample cDot Site Survey Questions (not including networking) (from an old cDot Survey-to-Quote document)

Cabinets:

Questions to help get the equipment racked and powered properly.

  • Does the customer plan on using NetApp Cabinets?
    • If Yes, what PDUs does the customer want in the cabinets?
      • What country will these racks be installed in?
      • Will redundant power be able to be configured?
    • If No, what is the height in RackU of the customer’s cabinets?
      • Will customer provide redundant power for the equipment?
  • Does the customer want NetApp to supply PDU-to-Head/Shelf power cords?
    • If so, what length?
    • If not, country-specific power cords will be included.
  • Does the customer want NetApp Universal Rail kits?
    • If so, 2 post or 4 post?
Cables:

Questions to help gauge cable lengths to connect cluster elements together.

  • How does the customer plan on running cables between racks?
    • Down, under the floor and back up?
    • Up, through ladder racks/trays and back down?
    • Through cabinet sides (not recommended)?
    • What cable length(s) will be required?
  • How will cabinets be set up on the datacentre floor?
    • All adjacent with no gaps or other equipment in between?
    • Separated by an aisle or other equipment?
    • If not all adjacent, please build a diagram of the cabinet layout to facilitate cable-length calculation with PS

 

Questions to help gauge SFPs and cable lengths to connect cluster to host/client side networks.

Clusters with NAS, iSCSI & FCoE protocols:

  • What kind and quantity of Ethernet ports does the customer require?
    • 10GigE with optical SFP+
    • 10Gig CNA with optical SFP+
    • 10GigE bare cage
    • 10Gig CNA bare cage
    • 1GigE with optical SFP
    • 1GigE with RJ45 connector
  • Does the customer want NetApp to supply cables for any of the above?
    • If so, what length(s)?
  • For bare cage 10GigE & 10GigE CNA, does the customer want copper 10Gig cables?
    • If so, what length(s)?

Clusters with FC protocol:

  • What kind and quantity of FC ports does the customer require?
    • 8Gbps with optical SFP
    • 4Gbps with optical SFP
    • 8Gbps bare cage
    • 4Gbps bare cage
  • Does the customer want NetApp to supply cables for any of the above?
    • If so, what length(s)?

 

  • Read the NetApp Site Requirements Guide
  • Review the BTU and Ton conversion of the heat output of the controllers and disk shelves.
  • Know the NEMA vs IEC power cables and voltages
    • C13 – C14
    • NEMA L5-15/L5-20
    • C19-C20
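The BTU/Ton conversion mentioned above is simple arithmetic, using the standard constants of about 3.412 BTU/hr per watt and 12,000 BTU/hr per ton of cooling. A quick sketch:

```python
WATTS_TO_BTU_HR = 3.412   # 1 watt is about 3.412 BTU/hr
BTU_HR_PER_TON = 12000    # 1 ton of cooling = 12,000 BTU/hr

def heat_load_tons(total_watts):
    """Convert equipment power draw into tons of cooling required."""
    return total_watts * WATTS_TO_BTU_HR / BTU_HR_PER_TON

# A hypothetical 5 kW rack:
print(round(heat_load_tons(5000), 2))  # 1.42
```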

3.1.3 Validate equipment move path to installation location.

  • This sounds stupid, but I have seen racks fall over or bust plastic wheels.
  • Walk it!

3.1.4 Validate logistics plan for staging and installation of equipment.

  • Will this be built in a lab room? In the production rack? Etc.

3.1.5 Verify Ethernet cabling plan and availability of cable supports.

Valid 1gb Administration Switches

  • Cisco 2960
  • NetApp CN1601 (may not have existed at the time of test creation)

Valid 10gb Cluster Interconnect Switches

  • Cisco 5010 – 12 or 18 node clusters
  • Cisco 5020 – Up to 32 nodes (not that that’s possible; realistically 24 nodes)
  • Cisco 5548UP
  • Cisco 5596UP – Up to 40 nodes
  • Cisco CN1610 (rebranded Broadcom BCM53716-16FE or larger port count)

 

Sample cabling plan:

  • Netapp CN1610 Cabling.

Overview Cabling Diagram

[Image: esxi51_ucsm2_Clusterdeploy-002]

Sample Cabling Guide of a FlexPod – For illustration only. Don’t use in modern production.

[Image: esxi51_ucsm2_Clusterdeploy-003]

 

3.1.6 Verify fiber cabling plan and availability of cable supports.

[Image: arch1]

3.2 Following the rack diagram, install systems and FC switches.

Example Rack Diagram

[Image: example_rack_diagram]

3.3 Storage System Configuration Tasks.

3.3.1 Data ONTAP 8.1.1 Cluster-Mode Setup Tasks

 

3.3.2 Storage Provisioning and Vserver Setup Tasks

  • vserver create
  • protocol configuration
  • volume create
  • Adding to namespace
  • lun create
  • NFS: Export policy creation

 

  • Other tasks
    • Lif Migrate
    • ARL Aggregate Relocate
    • DataMotion for Volumes

3.3.3 FC, FCoE, and iSCSI Connectivity Tasks

 

  • With FC we are almost always deploying a “switched fabric” in a core-edge topology.
    • Core-edge topology: In this design, storage is always put in the core, and hosts are always attached at the edge. This design is effective because SAN traffic flows are typically not peer to peer but instead many to one (hosts to storage). (Definition from Cisco MDS documentation)
    • Edge-core-edge topology: This common design (storage edge to core to host edge) is used when a core-edge design provides insufficient scalability and an additional edge tier is needed to handle the large number of devices. (Definition from Cisco MDS documentation)
[Image: Topologies]

 

FC cable comparison (because they always have annoying distance questions, not like we couldn’t just google this when we need to)

Comparison (from Wikipedia)

Minimum modal bandwidth is listed at 850 nm / 1310 nm.

  • OM1 (62.5/125), 200/500 MHz·km: 100BASE-FX up to 2000 m; 1000BASE-SX 275 m; 10GBASE-SR 33 m; 40 Gb and 100 Gb not supported
  • OM2 (50/125), 500/– MHz·km: 100BASE-FX up to 2000 m; 1000BASE-SX 550 m; 10GBASE-SR 82 m; 40 Gb and 100 Gb not supported
  • OM3 (50/125, laser optimized), 1500/2000 MHz·km: 100BASE-FX up to 2000 m; 1000BASE-SX 550 m; 10GBASE-SR 300 m; 40 Gb 100 m (330 m with QSFP+ eSR4); 100 Gb 100 m
  • OM4 (50/125, laser optimized), 3500/4700 MHz·km: 100BASE-FX up to 2000 m; 1000BASE-SX 1000 m; 10GBASE-SR 400 m; 40 Gb 150 m (550 m with QSFP+ eSR4); 100 Gb 150 m

Fibre Channel loop speeds/Distance from Siemons

Connection Speed and Distance by Cable Category
Type Speed Distance
OM2 1Gb/s 500m/1,640’
OM3 1Gb/s 500m/1,640’
OM2 2Gb/s 300m/900’
OM3 2Gb/s 500m/1,640’
OM2 4Gb/s 150m/492’
OM3 4Gb/s 270m/886’
OM2 8Gb/s 50m/164’
OM3 8Gb/s 150m/492’
Twinax copper 8Gb/s 15m max
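Those distance limits are easy to encode as a lookup for checking a planned cable run. A small Python sketch using the values from the Siemon table above (verify against current vendor specs before relying on them):

```python
# Max supported run lengths in meters, taken from the table above.
FC_MAX_DISTANCE_M = {
    ("OM2", 1): 500, ("OM2", 2): 300, ("OM2", 4): 150, ("OM2", 8): 50,
    ("OM3", 1): 500, ("OM3", 2): 500, ("OM3", 4): 270, ("OM3", 8): 150,
}

def cable_ok(cable_type, speed_gb, run_length_m):
    """Check a planned cable run against the supported FC distance."""
    return run_length_m <= FC_MAX_DISTANCE_M[(cable_type, speed_gb)]

print(cable_ok("OM3", 8, 120))  # True
print(cable_ok("OM2", 8, 120))  # False: OM2 tops out at 50 m at 8 Gb/s
```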
  • Optical SAS between controller and shelves: the point-to-point (QSFP-to-QSFP) path of any multimode cable cannot exceed 150 meters for OM4 and 100 meters for OM3. Though this isn’t supported on 8.1.1.
  • Important note! If you are using patch panels with a different core thickness (62.5/125 instead of 50/125), you should match the patch-panel fiber thickness to the end host or SAN. A transition between 62.5/125 and 50/125 may result in a loss of signal strength. Read this for more information. ( thefoa.org )

[Image: 62.5/125 vs 50/125 core mismatch]

3.3.4 LUN Connectivity Tasks

  • ALUA information for pathing preferences is gathered by the host sending a SCSI INQUIRY (or the newer REPORT_TARGET_PORT_GROUPS) command. (From the NetApp Knowledge Base)
    • The storage system implements four states for a LUN:
      • Active/Optimized
      • Active/Non-Optimized
      • Unavailable
      • Transitioning
    • These map to the following existing Data ONTAP terms:
      • Local/Fast/Primary
      • Partner/Proxy/Slow/Secondary
      • Cluster IC is down, path is not functional
      • Path is transitioning to another state
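The state-to-term mapping above can be written as a small Python lookup. The helper function is purely illustrative (host MPIO stacks do this path selection automatically), and the LIF names are made up:

```python
# ALUA states as reported to the host, mapped to the traditional
# Data ONTAP path terms listed above.
ALUA_TO_ONTAP = {
    "Active/Optimized": "Local/Fast/Primary",
    "Active/Non-Optimized": "Partner/Proxy/Slow/Secondary",
    "Unavailable": "Cluster IC is down, path is not functional",
    "Transitioning": "Path is transitioning to another state",
}

def preferred_paths(path_states):
    """Hosts should send I/O down Active/Optimized paths first."""
    return [p for p, s in path_states.items() if s == "Active/Optimized"]

paths = {"lif_a": "Active/Optimized", "lif_b": "Active/Non-Optimized"}
print(preferred_paths(paths))  # ['lif_a']
```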

3.3.5 Configure FC and Ethernet switches

 

3.3.6 Host Configuration Tasks

3.3.7 Virtualized Environment/Platforms: SAN Best Practices.

3.3.8 FCoE and Unified Connect Enabling Technologies

  • Relies on DataCenter Bridging (DCB)
    • Data center bridging (DCB) is a collection of extensions to the existing Ethernet standard that provides a lossless transport layer for FCoE traffic.
      • Per-priority pause (priority-based flow control)
        • Enables a device to selectively inhibit the transmission of frames based on user-defined priorities.
      • Enhanced transmission selection
        • Allows administrators to allocate bandwidth on a percentage basis to different priorities.
      • Congestion notification
        • Transmits congestion information.
      • DCB Exchange (DCBX) protocol
        • Exchanges connection information with directly connected peers and detects misconfigurations.
    • SAN Admin Guide Troubleshooting, pages 70–71, is a great resource for seeing this in action
      • Default priority is 3 for FCoE traffic, with 50% bandwidth
        • Page 20 of TR-3894 states this.
      • Default priority is 0 for IP traffic.

 

3.3.9 FCoE and Unified Connect Hardware

3.3.10 FCoE and Unified Connect Configuration

To move:

  • Must read: FCoE End to End Deployment (TR-3800 Older – 2011, w/ Qlogic)
  • Configuring CNA/UTA Ports
    • ucadmin
    • fcp config
  • FCoE Overview for Clustered Ontap
  • FC and FCoE Zoning
    • Best practices and recommendations
      • Netapp recommends “Single Initiator Zoning”
        • A zone should include a SINGLE INITIATOR and ALL Targets the initiator is connecting to.
      • Zoning should be based on World Wide Port Name (wwpn)
        • Change an HBA card, update the zone. Change a server, update the zone.
      • 50/125 Recommended.
      • Short Wave SFPs required to connect to onboard FC.
      • Should be used when you have 4 or more hosts connected (really always)
      • Orange cable is typically OM2; “laser optimized” fiber is needed for anything but short lengths at 10Gb.
      • OM3 and OM4 are aqua (blue-green), recommended for 8Gb FC and 10Gb Ethernet.
  •  Why Zone?
    • Reduces CROSS TALK between initiator HBAs
    • Reduces paths to an available port
    • Increases Security
    • Shortens troubleshooting times
  • Zoning
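The single-initiator zoning recommendation above can be sketched as a small zone generator: one zone per host HBA, containing that initiator's WWPN plus every target WWPN it needs. The WWPNs and names below are made up for illustration:

```python
# Hypothetical WWPNs for one host HBA and two storage target ports.
initiators = {"host1_hba0": "10:00:00:00:c9:aa:bb:01"}
targets = {"node1_0c": "50:0a:09:83:00:00:00:01",
           "node2_0c": "50:0a:09:83:00:00:00:02"}

def single_initiator_zones(initiators, targets):
    """Build one WWPN-based zone per initiator, with all its targets."""
    return {
        f"z_{name}": [wwpn] + list(targets.values())
        for name, wwpn in initiators.items()
    }

zones = single_initiator_zones(initiators, targets)
print(len(zones["z_host1_hba0"]))  # 3: one initiator + two targets
```

Since zones are WWPN-based, swapping an HBA card means updating only that one zone, which is the maintenance upside the notes call out.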

 

1 Comment Continue Reading →

My NS0-504 Study Notes – NetApp Definitions

Commonly used definitions according to Netapp (Taken from the TR-4056 – Best Practices with Exchange and Cluster Mode)

Click to continue reading “My NS0-504 Study Notes – NetApp Definitions”

1 Comment Continue Reading →

My NS0-504 NCIE-SAN Cluster Mode Study Notes

I need to renew my NCIE-SAN Cluster Mode NS0-504 this year, so I figure NetApp Insight 2014 is the place to do it. Why? Because it’s free Yo!

Much of this I already know, but to err on the side of caution, and not be too overconfident, I will be stepping through as many of the Study Guide lists as possible, doing some hands-on examples.

Click to continue reading “My NS0-504 NCIE-SAN Cluster Mode Study Notes”

11 Comments Continue Reading →

Warning, New Exploit: Dealing with SHELLSHOCK on Linux & SAN Vendor links

Quick Warning:

To those that run your own webservers, and Mac OSX users: if you haven’t already heard, there is a critical exploit out called SHELLSHOCK that exploits a flaw in “bash”, the primary command line of Unix-type operating systems (Linux, *BSD, Mac OSX). A specially crafted variable can be used to execute a command.

This exploit can also be triggered remotely by making a special request to most webservers that run on linux or *bsd.

Click to continue reading “Warning, New Exploit: Dealing with SHELLSHOCK on Linux & SAN Vendor links”

2 Comments Continue Reading →

My @NetappInsightUS 2014 Tentative Schedule & Tips!

I’ll have an action-packed week coming up in October at NetApp Insight @NetappInsightUS, followed up by a trip to Sunnyvale for Storage Field Day 6 #SFD6, where I will be a delegate.

Here is my tentative list of sessions and #NetappATeam events I’ll be attending! There’s still time to sign up if you are interested!

If you can’t find me at the sessions below, you can find me at the Sigma Derby machine in the MGM Casino!

Click to continue reading “My @NetappInsightUS 2014 Tentative Schedule & Tips!”

3 Comments Continue Reading →