Monday, December 29, 2014

Solaris LUN Online Report

If you are using Fibre Channel storage with Solaris in a multipath configuration, you might want to check and confirm the status of all the paths on the Solaris client before fabric or array maintenance.  The following script uses luxadm to report on the status of each path for a Fibre Channel device.

#!/usr/bin/perl
# List every Fibre Channel LUN that luxadm can probe and report the state of each path.
@luns = `/usr/sbin/luxadm probe | grep Logical | sed -e 's/.*\://g'|grep rdsk`;
foreach $lun (@luns) {
        chomp($lun);
        # Strip the /dev/rdsk/ prefix for a shorter display name.
        $lun2 = $lun;
        $lun2 =~ s/\/dev\/rdsk\///g;
        print "Disk:$lun2\t";
        # luxadm display prints one State line per path to the device.
        @luxadm = `/usr/sbin/luxadm display $lun`;
        $pathcount = 0;
        foreach $luxadm (@luxadm) {
                chomp($luxadm);
                if ($luxadm =~ /State/) {
                        $luxadm =~ s/State//g;
                        $luxadm =~ s/^\s+//;
                        print "Path$pathcount:$luxadm\t";
                        $pathcount++;
                }
        }
        print "\n";
}

The output from the script will look something like the output below:

#perl pathfinder.pl
Disk:c6t60060E80054337000000433700000526d0s2    Path0:ONLINE    Path1:ONLINE    Path2:ONLINE    Path3:ONLINE
Disk:c6t60060E80054337000000433700000527d0s2    Path0:ONLINE    Path1:ONLINE    Path2:ONLINE    Path3:ONLINE
Disk:c6t60060E80054337000000433700000301d0s2    Path0:ONLINE    Path1:ONLINE    Path2:ONLINE    Path3:ONLINE
Disk:c6t60060E80054337000000433700000300d0s2    Path0:ONLINE    Path1:ONLINE    Path2:ONLINE    Path3:ONLINE
Disk:c6t60060E80054337000000433700000278d0s2    Path0:ONLINE    Path1:ONLINE    Path2:ONLINE    Path3:ONLINE
Disk:c6t60060E80054337000000433700000277d0s2    Path0:ONLINE    Path1:ONLINE    Path2:ONLINE    Path3:ONLINE
Disk:c6t60060E80054337000000433700000275d0s2    Path0:ONLINE    Path1:ONLINE    Path2:ONLINE    Path3:ONLINE
#

Cleanup Shared Memory Segments Solaris

If you have ever used an application in Solaris that uses shared memory and that application has a tendency to not clean up those memory segments properly on shutdown (SAP comes to mind), then this little Perl script is what you have been waiting for.

All this script does is take certain fields from the ipcs command output and then iterate through them to determine whether the memory is still actively in use by a process or whether it can safely be purged.  I recommend testing this out with the $memclean line commented out to gain a good understanding before you remove the comment and allow the cleanup (#$memclean = `ipcrm -m $shmem`;).  This script was tested on Solaris 10.


#!/usr/bin/perl
# Pull the segment ID, creator PID and last-attach PID for each shared memory segment.
@sms = `ipcs -pm|grep "^m"|awk {'print \$2":"\$7":"\$8'}`;
foreach $sms (@sms) {
        chomp($sms);
        ($shmem,$cpid,$lpid) = split(/:/,$sms);
        # echo $? returns 0 if grep found the PID in the process table, 1 if it did not.
        $cpids=` ps -ef|grep $cpid|grep -v grep >/dev/null 2>&1;echo \$?`;
        $lpids=` ps -ef|grep $lpid|grep -v grep >/dev/null 2>&1;echo \$?`;
        chomp($cpids,$lpids);
        # Only when neither the creator nor the last-attach PID is running is the segment safe to purge.
        if (($cpids eq "1") && ($lpids eq "1")) {
                $message = "Memory can be reclaimed";
                #$memclean = `ipcrm -m $shmem`;
        } else {
                $message = "Memory active";
        }
        print "$shmem - $cpid - $lpid - $cpids - $lpids - $message\n";
}

The output from the script will look similar to the following:

# perl mem_clean.pl
587203562 - 10885 - 17891 - 0 - 0 - Memory active
922746991 - 9728 - 10885 - 0 - 0 - Memory active
150995435 - 9728 - 10885 - 0 - 0 - Memory active
150995432 - 9728 - 17891 - 0 - 0 - Memory active
150995398 - 17421 - 17891 - 1 - 0 - Memory active
150995396 - 13421 - 13891 - 1 - 1 - Memory can be reclaimed
150995387 - 9728 - 10885 - 0 - 0 - Memory active
150995382 - 9728 - 10885 - 0 - 0 - Memory active
150995380 - 9728 - 10885 - 0 - 0 - Memory active
150995379 - 9728 - 10885 - 0 - 0 - Memory active
150995377 - 9728 - 17891 - 0 - 0 - Memory active
150995374 - 9728 - 17891 - 0 - 0 - Memory active
150995371 - 9727 - 10886 - 1 - 0 - Memory active
117440821 - 9728 - 10885 - 0 - 0 - Memory active
117440819 - 9728 - 10885 - 0 - 0 - Memory active
#

Saturday, December 20, 2014

Cleaning Up OpenStack Instances in Redhat Satellite or Spacewalk

When using OpenStack with instances that I wanted to have registered with Redhat Satellite or Spacewalk, I was left wondering what would happen to all those registered hosts once they were terminated in OpenStack?

If I chose to do nothing, the answer was that I would be left with orphaned hosts in Redhat Satellite or Spacewalk, and over time this could lead to higher license costs if leveraging support for Redhat Linux, or just pure database bloat from having all these previously used instances still referenced in my database.

This issue bothered me and I wanted a mechanism that would clean up instances once they were terminated, but the question was how to go about it?

Well I soon realized that OpenStack keeps a record of all the instances it ever created and/or terminated.  It was the terminated part that would be a key component of what I wanted to accomplish.  I figured if I could mine out the data on deleted instances, I could cross check those against Redhat Satellite or Spacewalk.

The Perl script below does just that.  I have it run every 24 hours out of cron.  It first goes into the OpenStack nova database and scrapes the instances table for any instances that were marked deleted in the last 24 hours.  Any instances it finds are put into an array that I then enumerate through using the spacecmd tools, checking within Satellite or Spacewalk to see if each host is registered.  If the host is registered, I then remove it, given that it is no longer a valid host that is up and running.

#!/usr/bin/perl
# Clear the cached spacecmd session so we always authenticate fresh.
$cmd=`rm -r -f /root/.spacecmd/spacewalk.schmaustech.com`;
# Yesterday's date; the TZ offset trick makes date print the previous day.
$yestdate=`TZ=CDT+24 /bin/date +%y-%m-%d`;
#$yestdate=`TZ=CDT /bin/date +%Y-%m-%d`;
chomp($yestdate);
# Pull any nova instances that were marked deleted yesterday.
@delhosts=`mysql -e "select hostname,uuid,deleted_at from nova.instances where deleted_at is not null"|grep $yestdate`;
foreach $delhost (@delhosts) {
        ($hostname,$uuid) = split(/\s+/,$delhost);
        # Satellite/Spacewalk stores the instance UUID without dashes.
        $uuid2 = $uuid;
        $uuid2 =~ s/-//g;
        @cmdout=`spacecmd -q system_details $hostname.schmaustech.com`;
        foreach $cmd (@cmdout) {
                chomp($cmd);
                # Only delete the host if its registered UUID matches the terminated instance.
                if ($cmd =~ /$uuid2/) {
                        $message = "Removing from satellite hostname: $hostname with UUID: $uuid...\n";
                        $cmdtmp = `logger $message`;
                        $cmdtmp = `spacecmd -y -q system_delete $hostname.schmaustech.com`;
                }
        }
}
exit;

Configuring DVR in OpenStack Juno

Before Juno, when we deployed OpenStack in production, there was always a painful point: the single l3-agent node, which caused two issues, a performance bottleneck and a single point of failure (albeit there were some non-standard ways around this issue).  Juno now comes with new Neutron features that provide an HA L3-agent and Distributed Virtual Router (DVR).

DVR distributes East-West traffic via virtual routers running on the compute nodes.  Virtual routers on the compute nodes also handle North-South floating IP traffic locally for VMs running on the same node.  However, if a floating IP is not in use, VM-originated external SNAT traffic is still handled centrally by the virtual router on the controller/network node.  These aspects spread the network traffic load across your compute nodes and your network controller nodes, thus distributing network performance.

The HA L3 agent provides virtual router HA via VRRP.  A virtual gateway IP is always available from one of the controller/network nodes, thus eliminating the single point of failure.

The following blog will walk through a complete DVR configuration in Juno.  In this example we used RHEL7 with Red Hat's RDO for Juno.

The host configuration is 3 nodes: one management node and two compute nodes.  Each node has a data interface for access to the node itself and a bridge interface for the floating-ip network that allows instances access outside of their private subnet to the physical network.

I ran through a standard packstack install specifying GRE tunnels for the connectivity between my management and compute nodes.  Be aware that the current version of DVR only supports GRE or VXLAN tunnels, as VLANs are not yet supported.  I then configured the setup as if I was using standard neutron networking for a multi-tenant setup, that is, all my instances would route traffic through the l3-agent running on the management node (similar to the behavior in Icehouse and Havana).  Once I confirmed this legacy setup was working, I then moved on to changing it to use DVR on the compute nodes.


On the management node where the neutron server runs, edit the following files: neutron.conf, l3_agent.ini, ml2_conf.ini and ovs_neutron_plugin.ini.

In /etc/neutron/neutron.conf

Edit the lines to state the following by either adding or uncommenting them:

router_distributed = True
dvr_base_mac = fa:16:3f:00:00:00

Note:  When creating a network as admin, one can override the distributed router by using the following flag:  "--distributed False"

In /etc/neutron/l3_agent.ini

Edit the line to state the following:

agent_mode = dvr_snat

Note:  This will provide the SNAT translation for any instances that do not get assigned a floating-ip.  They will route through the central l3-agent on the management node if they need outside access but will not have a floating-ip associated.  Given that the l3-agent on the management node can be HA in Juno, this will still not be a single point of failure.  However we are not covering that topic in this article.

In /etc/neutron/plugins/ml2/ml2_conf.ini

Edit the line to state the following:

mechanism_drivers = openvswitch, l2population

In /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini

Edit or add the lines to state the following:

l2_population = True
enable_distributed_routing = True

On each of the compute nodes do the following steps:

Make the ml2 plugin directory, copy over the ml2_conf.ini from the neutron node and set up the softlink:

mkdir /etc/neutron/plugins/ml2
rsync -av root@ctl1:/etc/neutron/plugins/ml2 /etc/neutron/plugins
cd /etc/neutron
ln -s plugins/ml2/ml2_conf.ini plugin.ini

Copy over the metadata_agent.ini from the neutron server node:

rsync -av root@ctl1:/etc/neutron/metadata_agent.ini /etc/neutron

In /etc/neutron/l3_agent.ini

Edit the line to state the following:

agent_mode = dvr

In /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini

Edit or add the lines to state the following:

l2_population = True

enable_distributed_routing = True

One final step on the compute node is to associate the br-ex bridge with the physical interface on the compute node that will bridge the floating-ip's to the physical VLAN (substitute your interface name in the command below):

ovs-vsctl add-port br-ex <physical-interface>

Restart the openstack services on the management node.

Restart the openstack services on the compute node as well.  Also ensure you start the l3-agent and metadata service on the compute node.
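As a rough guide on RHEL7/RDO Juno, that amounts to something like the following (unit names can vary between distributions, so treat this as a sketch and adjust it to match the services actually installed on your nodes):

On the management node:

systemctl restart neutron-server neutron-l3-agent neutron-openvswitch-agent neutron-metadata-agent

On each compute node:

systemctl restart neutron-openvswitch-agent
systemctl enable neutron-l3-agent neutron-metadata-agent
systemctl start neutron-l3-agent neutron-metadata-agent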

If you plan on using Horizon to spin up instances and associate floating-ip's, you will need to make the following edit in the Horizon code, as there is a bug:  https://bugs.launchpad.net/horizon/+bug/1388305.  Without the code update, you will not see a list of valid ports to associate the floating-ip to on the instance.  The association does, however, work from the CLI without modification.

Edit the following file:  /usr/share/openstack-dashboard/openstack_dashboard/api/neutron.py

Find the line:

p.device_owner == 'network:router_interface'

And replace it with:

p.device_owner == 'network:router_interface'   or p.device_owner == 'network:router_interface_distributed'

Restart the httpd service.

Once you have followed the steps above you should be able to spin up an instance and associate a floating-ip to it and that instance will be accessible via the compute node l3-agent.   You can confirm a proper namespace is setup by running the following on the compute node:

ip netns

fip-4a7697ba-c29c-4a19-9b92-2a9194e1d6de
qrouter-6b4a2758-3aa7-4603-9fcd-f86f05d0c62

The fip is the floating-ip namespace and the qrouter is just like the namespaces previously seen on a network management node.  You can use ip netns exec commands to explore those namespaces and further troubleshoot should the configuration not be working.
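For example, using the namespace names from the ip netns output above, the following will show the interfaces and routes inside each namespace:

ip netns exec fip-4a7697ba-c29c-4a19-9b92-2a9194e1d6de ip addr
ip netns exec qrouter-6b4a2758-3aa7-4603-9fcd-f86f05d0c62 ip route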

Another way to confirm traffic is coming to your instance directly on the compute node is to use tcpdump and sniff on the physical network interface that is bridging to the physical network for the floating-ip network.  Then while running tcpdump you can ping your instance from another host somewhere on your network and you will see the packets in the tcpdump.
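A minimal sketch of that check, assuming the bridged physical interface is eth2 and the instance's floating-ip is 10.63.10.193 (substitute your own interface and address):

tcpdump -i eth2 -nn icmp and host 10.63.10.193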

DVR promises to provide a convenient way of distributing network traffic load to the compute nodes hosting the instances, and helps alleviate the bottleneck at the neutron management node.

Thursday, November 27, 2014

Lookup Tenant of Floating IP Address in OpenStack

Let's say your security team is doing routine scanning and they find that a few of your OpenStack instances running in your cloud are not passing the security test. What do you do?

You whip up a quick and dirty bash script that takes the floating ip address as an argument and then provides the name of the tenant that ip address belongs to:

#!/bin/bash
# Look up the floating ip ID for the address passed as the first argument.
FLOAT=`neutron floatingip-list |grep $1|awk -F '|' {'print $2'}`
# Pull the tenant ID that owns the floating ip, then show the tenant details.
TENANT=`neutron floatingip-show $FLOAT|grep tenant|awk -F '|' {'print $3'}`
keystone tenant-get $TENANT

Sample run:

 
 ./float2tenant.sh 10.63.10.193
+-------------+----------------------------------+
|   Property  |              Value               |
+-------------+----------------------------------+
| description | This is a sample project         |
|   enabled   |               True               |
|      id     | 981690ddbe5347bda5c73415134d6664 |
|     name    |            Project 1             |
+-------------+----------------------------------+

Friday, May 16, 2014

Faking Out Ceph-Deploy in OpenStack

I wanted to build a functional Ceph deployment for testing but did not have hardware to use.   So I decided I would use my instances in OpenStack.   The image I used for this configuration was the stock RHEL 6.5 cloud image from Red Hat.   However, when I went to do a ceph-deploy install on my monitor server, I ran into this:

[root@ceph-mon ceph]# ceph-deploy install ceph-mon
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.2): /usr/bin/ceph-deploy install ceph-mon
[ceph_deploy.install][DEBUG ] Installing stable version firefly on cluster ceph hosts ceph-mon
[ceph_deploy.install][DEBUG ] Detecting platform for host ceph-mon ...
[ceph-mon][DEBUG ] connected to host: ceph-mon
[ceph-mon][DEBUG ] detect platform information from remote host
[ceph_deploy][ERROR ] UnsupportedPlatform: Platform is not supported:

It didn't really say what platform it thought this was that was unsupported, but I knew that Red Hat 6.5 was supported, so it really did not make any sense.   What I discovered though was that the following file was missing from my cloud image:

/etc/redhat-release

So I manually added it:

vi /etc/redhat-release
Red Hat Enterprise Linux Server release 6.5 (Santiago)

Then when I reran ceph-deploy it detected a supported platform:
 
[root@ceph-mon ceph]# ceph-deploy install ceph-mon
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.2): /usr/bin/ceph-deploy install ceph-mon
[ceph_deploy.install][DEBUG ] Installing stable version firefly on cluster ceph hosts ceph-mon
[ceph_deploy.install][DEBUG ] Detecting platform for host ceph-mon ...
[ceph-mon][DEBUG ] connected to host: ceph-mon
[ceph-mon][DEBUG ] detect platform information from remote host
[ceph-mon][DEBUG ] detect machine type
[ceph_deploy.install][INFO  ] Distro info: Red Hat Enterprise Linux Server 6.5 Santiago
[ceph-mon][INFO  ] installing ceph on ceph-mon
[ceph-mon][INFO  ] Running command: yum clean all


 

Cleaning Up Expired Tokens in OpenStack Keystone

Keystone is an OpenStack project that provides Identity, Token, Catalog and Policy services for use specifically by projects in the OpenStack family.  When a client obtains a token from Keystone, that token has a validity period before it expires.  However even after it is marked expired, it is kept in the MySQL database of OpenStack.  This can create issues if your environment is passing out a lot of tokens and can cause the token table to grow.

To prevent this infinite growth, you can implement the following command in a cron to clean up the expired tokens within the MySQL DB:

keystone-manage token_flush
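For example, a cron entry in /etc/cron.d style that flushes tokens hourly as the keystone user might look like the following (adjust the schedule, user and path to suit your deployment):

0 * * * * keystone /usr/bin/keystone-manage token_flush >/dev/null 2>&1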

Thursday, May 15, 2014

OpenStack Cinder: VNX Volume Cleanup

I recently had an issue where I had multiple Cinder volumes in OpenStack (Havana) that were stuck in a deleting state.   Unfortunately the trick of putting them back into an Available state and trying to delete them again did not work.    However I was able to come up with a solution to get the job completed and restore consistency.

In my example, my Cinder volumes were being provisioned on an EMC VNX.   So the first step I needed to take was to validate whether the volumes themselves had been removed from the VNX.

Cleanup on VNX:

1) Obtain the volume ID from either OpenStack Dashboard and/or CLI.
2) Log into Unisphere on the VNX that contains the volume pool where the volumes/luns for Cinder are being provisioned.
3) Select the volume pool and show the luns associated with the volume pool.
4) Filter on the luns using the volume ID obtained in step one.
5) Delete the lun.

Now that we have removed the reference on the VNX, we can continue with the cleanup on the OpenStack side within the database itself.  This involves editing three tables in the cinder MySQL database.

1) Obtain the volume ID from either the OpenStack Dashboard and/or the CLI.  Make note of the volume size as well.  You will also need to obtain the project/tenant ID that the volume belongs to.
2) Log in to the OpenStack management controller that runs the MySQL DB.
3) Run the mysql command to access MySQL.  Note your deployment may require a password and hostname.
4) Select the cinder database with the following syntax:

 mysql>use cinder;

5) Next check if the volume id resides in the volume_admin_metadata table:

mysql> select * from volume_admin_metadata where volume_id="";

6) Delete the volume id if it does:

 mysql> delete from volume_admin_metadata where volume_id="";

7) Next check if the volume id resides in the volumes table:

 mysql> select * from volumes where id="";

8) Delete the volume id if it does:

mysql> delete from volumes where id="";

9) Next update the quota_usages table and reduce the values for the quota_usages fields for that project.  First get a listing to see where things are at:

mysql> select * from quota_usages where project_id="";

10) Then execute the update.  Depending on your setup you will have to update multiple fields from the output in step 9.  In the example below, since I was clearing out all volumes for a given project tenant, I was able to get away with the following update:

mysql> update quota_usages set in_use='0' where project_id="";

However, in cases where you are removing just one volume, you will need to specify both the project_id and the resource type in the WHERE clause of your MySQL syntax to match the right in_use row.  The new in_use value will be either the current gigabytes total minus the removed volume's size, or the current volume count minus one.
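A minimal sketch of that single-volume case, assuming a hypothetical 100GB volume and that your quota_usages table tracks the volumes and gigabytes resources as separate rows:

mysql> update quota_usages set in_use=in_use-1 where project_id="<project_id>" and resource="volumes";
mysql> update quota_usages set in_use=in_use-100 where project_id="<project_id>" and resource="gigabytes";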

Once I completed this, my system was back in sync and the volumes stuck in Deleting status were gone.

Sunday, May 11, 2014

OpenStack Havana - Recovering Services Account When Deleted

I was working with one of my colleagues who had accidentally deleted the services account within OpenStack.   Unfortunately, if this happens, it tends to break your setup in a really big way.   Opening a case with Red Hat, whose OpenStack distribution we were using, led to no results, but I managed to reverse engineer where the services account resided and reestablish it.  Here are the steps I took:

Symptoms:

1) In the web GUI the user gets "Oops something went wrong!" when trying to login.   The user can get a valid token at the command line (keystone token-get) but authorization fails.
2) openstack-status shows the following:

                == Glance images ==
                Request returned failure status.
                Invalid OpenStack Identity credentials.
                ==Nova managed services ==
                ERROR: Unauthorized (HTTP 401)
                == Nova networks ==
                ERROR: Unauthorized (HTTP 401)
                == Nova instance flavors ==
                ERROR: Unauthorized (HTTP 401)
                == Nova instances ==
                ERROR: Unauthorized (HTTP 401)

Resolution:

Create New Services Project:

Create new "services" project/tenant via CLI (keystone tenant-create).
Obtain new "services" project/tenant ID via CLI (keystone tenant-list).
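For example, with the Havana-era keystone CLI (the description here is just a placeholder):

keystone tenant-create --name services --description "OpenStack services tenant"
keystone tenant-list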

Determine OLD_SERVICES_ID:

Determine the old project/tenant ID of the services project by looking at the default_project_id of the following users (nova, glance, neutron, heat, cinder) in the user table of the keystone database.   Their default_project_id should all be the same and was the ID of the previous services project that was removed.


Edit MySQL Database:

use keystone;
update user set default_project_id="NEW_SERVICES_ID" where default_project_id="OLD_SERVICES_ID";
use ovs_neutron;
update networks set tenant_id="NEW_SERVICES_ID" where tenant_id="OLD_SERVICES_ID";
update subnets set tenant_id="NEW_SERVICES_ID" where tenant_id="OLD_SERVICES_ID";
update securitygroups set tenant_id="NEW_SERVICES_ID" where tenant_id="OLD_SERVICES_ID";

Wednesday, October 09, 2013

PowerShell & Isilon's REST API: Part 3 Creating Folder


In my previous posts I dabbled with Isilon's REST API and Perl.   In the following I provide an example of how to create a folder on the Isilon cluster using the REST API via PowerShell.   I have to say I am starting to like PowerShell, but I still have a lot of learning to do before I will feel as comfortable as I do in Perl.

The following example connects to the Isilon cluster and creates the folder schmaus101 in the following namespace path: /namespace/ifs/data/benz/.

### Ignore SSL Cert Errors ###
add-type @"
using System.Net;
using System.Security.Cryptography.X509Certificates;
    public class IDontCarePolicy : ICertificatePolicy {
    public IDontCarePolicy() {}
    public bool CheckValidationResult(
        ServicePoint sPoint, X509Certificate cert,
        WebRequest wRequest, int certProb) {
        return true;
    }
}
"@
[System.Net.ServicePointManager]::CertificatePolicy = new-object IDontCarePolicy
### Username & Password ###
$username = 'admin'
$upassword = 'password'
### Build Up Authentication String ###
$auth = $username + ':' + $upassword
$Encoded = [System.Text.Encoding]::UTF8.GetBytes($auth)
$EncodedPassword = [System.Convert]::ToBase64String($Encoded)
### Build Up HTTP Headers ###
$headers = @{}
#$headers."Accept" = "application/json"
$headers."Authorization"="Basic $($EncodedPassword)"
$headers."x-isi-ifs-target-type" = "container"
### Base HTTP URL ###
$baseurl = 'https://10.88.82.106:8080'
$resource = '/namespace/ifs/data/benz/schmaus101'
$url = $baseurl + $resource
### Run REST PUT to Create Directory ###
$dir = Invoke-RestMethod -Uri $url -Headers $headers -Method Put -ContentType "application/json"

Once the folder is created, one could go back and add a size quota and a share off of the folder to complete the automated provisioning sequence.  However I will cover those in my next PowerShell musing post.

Monday, October 07, 2013

Perl and Isilon's REST API: Part 2 Creating Directory

In my last post you saw how I was able to do a directory listing on an Isilon cluster using the Isilon REST API.   In this posting I am providing an example of how you can create a directory on the Isilon cluster using the Isilon REST API.

The simple script below shows how I can create the directory schmaus8 (as an example) under the /ifs/data path on the cluster:
use REST::Client;
use JSON;
use Data::Dumper;
use MIME::Base64;
use IO::Socket::SSL qw( SSL_VERIFY_NONE );
$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME}=0;
my $username = 'admin';
my $password = 'password$';
my $headers = {Accept => 'application/json', Authorization => 'Basic ' . encode_base64($username . ':' . $password), 'x-isi-ifs-target-type' => 'container'};
my $client = REST::Client->new();
$client->getUseragent()->ssl_opts( SSL_verify_mode => 0 );
$client->setHost('https://10.88.82.106:8080');
$client->PUT('/namespace/ifs/data/schmaus8','schmaus8',$headers);
print $client->responseContent() . "\n";
exit;

If successful, the script above will produce no output:

C:\TEMP>perl isi-file-create.pl
C:\TEMP>

This script could be altered to provide input for the path you need to have created.  You could also join the previous script I provided to first check and see if the directory exists before you try to create it.
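A quick sketch of that idea, assuming the namespace path is passed as the first argument and that a GET against a directory that does not exist comes back with something other than a 200 (error handling kept to a bare minimum; the host, username and password are placeholders):

use REST::Client;
use MIME::Base64;
$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME}=0;
my $path = shift or die "Usage: $0 /namespace/ifs/data/yourdir\n";
my $username = 'admin';
my $password = 'password';
my $headers = {Accept => 'application/json', Authorization => 'Basic ' . encode_base64($username . ':' . $password), 'x-isi-ifs-target-type' => 'container'};
my $client = REST::Client->new();
$client->getUseragent()->ssl_opts( SSL_verify_mode => 0 );
$client->setHost('https://10.88.82.106:8080');
# First check whether the directory already exists.
$client->GET($path,$headers);
if ($client->responseCode() == 200) {
        print "$path already exists\n";
} else {
        # Create it with the same PUT call used in the script above.
        $client->PUT($path,'',$headers);
        print "Create returned HTTP " . $client->responseCode() . "\n";
}
exit;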

Wednesday, October 02, 2013

Perl and Isilon's REST API: Part 1 Listing Files In Directory Path


I have been programming in Perl for about 15 years now so anytime I have to write something I usually resort to Perl.   Such was the case when I heard about Isilon and its REST API.   However I soon learned that there is very limited information on Perl and Isilon.

Combing the web I found EMC has two great manuals that cover the REST API, the Platform and Namespace manuals (not their official names, but that provides hints on what to look for).   However, these manuals had limited examples and virtually nothing for Perl.   The internet proved to be weak on the subject as well.  I did manage to find some examples with curl, Python and some read-only type PowerShell scripts.  However, I knew that whatever I was going to build would at some point need the ability to GET data and PUT/POST data with the REST API.

In the process of my search I did come across the Perl module REST::Client.   Seeking out examples of how that module was used with other REST APIs, and combining those with the EMC manuals for Isilon's REST API, I managed to put together my first Perl script.

The following Perl script will list out the files in a directory path on the Isilon cluster:

use REST::Client;
use JSON;
use Data::Dumper;
use MIME::Base64;
use IO::Socket::SSL qw( SSL_VERIFY_NONE );
$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME}=0;
my $username = 'admin';
my $password = 'secret';
my $headers = {Accept => 'application/json', Authorization => 'Basic ' . encode_base64($username . ':' . $password), 'Content-type' => 'application/json'};
my $client = REST::Client->new();
$client->getUseragent()->ssl_opts( SSL_verify_mode => 0 );
$client->setHost('https://10.88.82.106:8080');
$client->GET('/namespace/ifs/data/benz',$headers);
print $client->responseContent() . "\n";
exit;

Here is the output when run from a Windows host:

C:\TEMP>perl isi-file-list.pl
{"children":[{
   "name" : "schmaus3"
}
,{
   "name" : "schmaus1"
}
,{
   "name" : "schmaus2"
}
]}

This is a basic script that we could then add argument parameters to in order to fill in the host, username, password and path details.   We could also write something to parse the JSON response and make it look nice and neat.   That is for a future version, but the core of the example shows you how to do the request in Perl.
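As a rough illustration of the parsing piece, the JSON module already loaded at the top of the script can decode the response into a Perl structure.  Something like the following fragment (a sketch, assuming the same "children" layout shown in the output above) could replace the final print:

use JSON;
# Turn the raw JSON body into a Perl hash reference.
my $data = decode_json($client->responseContent());
# Walk the children array and print one entry name per line.
foreach my $child (@{$data->{children}}) {
        print $child->{name} . "\n";
}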

Next post will be how you can create a directory on the Isilon cluster.

Perl & Netapp Data Protection Manager Suspend & Resume Datasets

Suppose you're using Netapp Data Protection Manager and you need to either suspend or resume many datasets all at once. Doing this through the UI can be cumbersome given you have to do each and every one individually. However, the two little Perl scripts below that I wrote allow you to suspend or resume all of them automatically. These scripts should be run on your DFPM host:

To suspend the datasets:


# Suspend every dataset returned by dfpm dataset list.
chomp(@list = `dfpm dataset list`);
foreach $list (@list) {
                $list =~ s/^\s+//;
                ($id) = split(/\s+/,$list);
                print "$id\n";
                $cmd = `dfpm dataset suspend $id`;
                print "$cmd\n";
}

To resume the datasets:

# Resume every dataset returned by dfpm dataset list.
chomp(@list = `dfpm dataset list`);
foreach $list (@list) {
                $list =~ s/^\s+//;
                ($id) = split(/\s+/,$list);
                print "$id\n";
                $cmd = `dfpm dataset resume $id`;
                print "$cmd\n";
}
 
Now you could add an argument option and merge both scripts, executing the correct command depending on whether the argument was suspend or resume, as sketched below.
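A minimal sketch of that merged version, assuming dfpm prints the numeric dataset ID in the first column (hypothetical and untested against a live DFPM host):

#!/usr/bin/perl
# Usage: dataset_toggle.pl suspend|resume
my $action = shift;
die "Usage: $0 suspend|resume\n" unless defined $action && $action =~ /^(suspend|resume)$/;
chomp(my @list = `dfpm dataset list`);
foreach my $line (@list) {
        $line =~ s/^\s+//;
        my ($id) = split(/\s+/, $line);
        # Skip the column header or any line that does not start with a numeric ID.
        next unless $id =~ /^\d+$/;
        print "$id\n";
        print `dfpm dataset $action $id`;
}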

Saturday, June 23, 2012

HDS Device Manager HiRDB Cluster Setup Error

I was recently setting up a Hitachi Device Manager and Tuning Manager cluster on Windows 2008.  During the configuration phase I ran into the following error when trying to set up the HiRDB database on the shared cluster disk.
 
HOSTNAME c:\Program Files (x86)\HiCommand\Base\bin\hcmdsdbclustersetup /createcluster /databasepath D:\database /exportpath E:\empty2 /auto
 
KAPM06577-E An attempt to acquire a value has failed.
 
This error has to do with the C:\Program Files (x86)\HiCommand\Base\conf\cluster.conf file.  The hcmdsdbclustersetup command references this file, which you create before you run the cluster setup command.  Specifically, it looks for the mode line, which can be either online or standby.  However, if that line is not there, has a typo, or the file is missing completely, the cluster setup command will fail.

In my case I had a typo in the file.  Once I corrected the typo the command completed successfully.
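For reference, a minimal sketch of what the cluster.conf on the active node might contain; the mode line is the one the command checks for, while the host entries below are from memory of the HDS cluster documentation, so verify the exact key names against your Device Manager cluster guide:

mode=online
virtualhost=<cluster virtual hostname>
onlinehost=<active node hostname>
standbyhost=<standby node hostname>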

Wednesday, May 09, 2012

Netapp Interconnect Troubleshooting


In a Netapp metro cluster configuration there may sometimes be the need, for troubleshooting, to reset the cluster interconnect and/or see the status.  If the interconnect is experiencing issues it could prevent the filer from failing over.   The following command details provide some insight on how to view the interconnect and reset it.

filer> priv set diag
Warning: These diagnostic commands are for use by NetApp
         personnel only.

filer*> ic ?
Usage:
        ic status
        ic stats performance
        ic stats error [ -v ]
        ic ood
        ic ood stats
        ic ood stats performance show [ -v ] [nv_vi_|vi_if_]
        ic ood status
        ic ood stats error show
        ic dumpq
        ic nvthruput [-bwmsrndkc:g:] [-c count]
        ic reset nic [0|1]
        ic reset link
        ic dump hw_stats
        ic ifthruput [chan]

filer*> ic status
        Link 0: up
        Link 1: up
        cfo_rv connection state :                       CONNECTED
        cfo_rv nic used :                               1
        cfo_rv2 connection state :                      CONNECTED
        cfo_rv2 nic used :                              1

To reset one of the interconnect links from the filer side:
   
filer*> ic reset nic 0

Hidden Options In Netapp


If you have ever wondered what all the hidden options are on a Netapp filer, you are not alone.  To view all of the options, both visible and hidden, you only need to do the following:

filer>priv set advanced
filer*>registry walk options

A list of all the options both visible and hidden will be displayed.

Monday, March 26, 2012

Using Netapp DFM and Perl to Manage Netapp Filer /etc Files

Sometimes you have to manage the /etc/passwd and /etc/group files on your Netapp filer, and seemingly the only options available are to use rdfile and wrfile, or a text editor like vi via an NFS mount or Notepad++ via a CIFS share.   None of these appeal to me when trying to create something that a less technical person could use to manipulate these files.

Below is the rough framework that could be used to build a full-fledged file manipulator for Netapp files under the /etc directory.  In the example below we are looking at the /etc/passwd file, however it could be expanded to manipulate any file on the filer through DFM.   Further, you could use Win32::GUI or Perl/Tk to provide a GUI for the script as opposed to running it via the command line.

The breakdown of the script is as follows:


Standard Perl interpreter line.  In this example we are on Windows.

     #!/Perl64/bin/perl

This section of the script is a basic variable assignment of what I want my new line to be in the /etc/passwd file.  However you could have an input prompt here and/or have it read from a file.
     $newentry = "Your passwd entry\n";

This line grabs the existing /etc/passwd file and loads it into a perl array called rpasswd.  It is using the DFM command set to run rdfile on the filer.

     chomp (@rpasswd = `dfm run cmd -t 120 faseslab1 rdfile /etc/passwd`);

  This section cleans up the rpasswd values, keeping only the lines that match Stdout in the DFM output and pushing them into a second array called passwd.

    foreach $line (@rpasswd) {
                     if ($line =~ /^Stdout:\s+/) {
                                     $line =~ s/^Stdout:\s+//g;
                                     push(@passwd,$line);
                     }
     }

This line  places the new entry into the passwd array.

     push (@passwd,$newentry);

This line backs up the existing passwd file using DFM and mv command.

     $result = `dfm run cmd -t 120 faseslab1 mv /etc/passwd /etc/passwd.bak`;
This loop writes out the new passwd file by iterating through the passwd array and using DFM wrfile to append each line.

     foreach $line (@passwd) {
                     $result = `dfm run cmd -t 120 faseslab1 wrfile -a /etc/passwd $line`;
     }
     exit;

 Again, this is a rough example, but it gives you the idea of what can be done using Perl and DFM.

Monday, August 15, 2011

Determine Core Count on Solaris 10 Hosts

This question comes up time and time again: how does one get the correct core count on a Solaris 10 host?  Below is the one-line answer:

echo "`hostname` has `kstat cpu_info |grep core_id|uniq|wc -l` cores"

Thursday, August 04, 2011

Solaris 10 & What Process is Using Port

The other day I had a request to find out what process was using a specific port on a Solaris 10 server.  I came up with this little gem to do the work and provide the PID of the process using the port.

get_port.sh

#!/bin/bash
# Report which PID has the given port open by checking every process with pfiles.
if [ $# -lt 1 ]
then
        echo "Usage: ./get_port.sh port#"
        exit
fi

echo "Searching for PID using port... "
for i in `ls /proc`
do
        # pfiles lists the open sockets for the process; match on AF_INET and the port number.
        pfiles $i | grep AF_INET | grep $1
        if [ $? -eq 0 ]
        then
                echo The port is in use by PID: $i
        fi
done

Tuesday, July 26, 2011

Sun Cluster 3.2 & SCSI Reservation Issues



If you have worked with luns and Sun Cluster 3.2, you may have discovered that if you ever want to remove a lun from a system, it may not be possible because of the scsi3 reservation that Sun Cluster places on the disks.  The example scenario below walks you through how to overcome this issue and proceed as though Sun Cluster is not even installed.

Example:  We had a 100GB LUN off of a Hitachi disk array that we were using in a metaset controlled by Sun Cluster. We had removed the resource from the Sun Cluster configuration and removed the device with cfgadm/devfsadm, however when the storage admin attempted to remove the LUN ID from the Hitachi array zone, the Hitachi array indicated the LUN was still in use.  From the Solaris server side it did not appear to be in use, however Sun Cluster had set the scsi3 reservations on the disk.

Clearing the Sun Cluster scsi reservation steps:

1) Determine what DID device the lun is mapped to using /usr/cluster/bin/scdidadm -L
2) Disable failfast on the DID device using /usr/cluster/lib/sc/scsi -c disfailfast -d /dev/did/rdsk/DID
3) Release the DID device using  /usr/cluster/lib/sc/scsi -c release -d /dev/did/rdsk/DID
4) Scrub the reserve keys from the DID device using  /usr/cluster/lib/sc/scsi -c scrub -d /dev/did/rdsk/DID
5) Confirm reserve keys are removed using /usr/cluster/lib/sc/scsi -c inkeys -d /dev/did/rdsk/DID
6) Remove lun from zone on machine or whatever procedure you were trying to complete.

Configuring Persistent Bindings on Solaris 10

If you have tape devices attached to your Solaris 10 host and you often find that after a reboot of the host, the tape devices are no longer in the same order they were before, you can use the following Perl script to configure the /etc/devlink.tab file to make the tape devices persist.  Script is below:


#!/usr/bin/perl
#################################################################
# This script maps fiber attached tape drives to persistently
# bind to the same device across reboots.
# (C) 2011 Benjamin Schmaus
#################################################################
use strict;
my($junk,$path,$devices,$dev,$file);
my(@devices,@file);
# Back up the current devlink.tab before rewriting it.
my $date = `date +%m%d%Y`;
chomp($date);
$file = `/usr/bin/cp /etc/devlink.tab /etc/devlink.tab.$date`;
# Drop any existing tape bindings; they get regenerated below.
@file = `cat /etc/devlink.tab`;
@file = grep !/type=ddi_byte:tape/, @file;
open (FILE,">/etc/devlink.tab.new");
print FILE @file;
close (FILE);
# Map each current tape device node back to its physical st@ path.
@devices = `ls -l /dev/rmt/*cbn|awk {'print \$9 \$11'}`;
open (FILE,">>/etc/devlink.tab.new");
foreach $devices (@devices) {
                chomp($devices);
                ($dev,$path) = split(/\.\.\/\.\./,$devices);
                $dev =~ s/cbn//g;
                $dev =~ s/\/dev\/rmt\///g;
                $path =~ s/:cbn//g;
                ($junk,$path) = split(/st\@/,$path);
                print FILE "type=ddi_byte:tape;addr=$path;\trmt/$dev\\M0\n";
}
close (FILE);
$file = `/usr/bin/mv /etc/devlink.tab.new /etc/devlink.tab`;
exit;

Saturday, April 16, 2011

Comparing DAS, NAS, iSCSI, SAN


Purpose:

The purpose of this document is to briefly explain the different types of storage options available and their advantages and disadvantages.

Storage Use Considerations Factors:

  • Available budget
  • Data security requirements
  • Network infrastructure
  • Data availability requirements, etc.

Storage Types:

  • Direct Attached Storage (DAS),
  • Network Attached Storage (NAS),
  • Storage Area Networks (SAN).

Direct Attached Storage:

Direct Attached Storage is a system of hard drives addressed directly via system buses within the computer (IDE, SCSI); the network interface is managed by the operating system. As these buses can only bridge short distances within the decimeter range, DAS solutions are limited to the respective computer casing. Depending on the bus type, DAS systems are also restricted to a relatively small number of drives - Wide-SCSI achieves a maximum of 16 directly addressable drives. Due to these limitations and the need for more flexible storage, the importance of DAS is declining. Although DAS capacity in terms of terabytes is still growing by 28% annually, the need for storage is increasingly being covered by networked storage like NAS and iSCSI systems.

Network Attached Storage:

NAS systems are generally computing-storage devices that can be accessed over a computer network (usually TCP/IP), rather than directly being connected to the computer (via a computer bus such as SCSI). This enables multiple computers to share the same storage space at once, which minimizes overhead by centrally managing hard disks. NAS devices become logical file system storage for a local area network. NAS was developed to address problems with direct attached storage, which included the effort required to administer and maintain “server farms”, and the lack of scalability, reliability, availability, and performance. They can deliver significant ease of use, provide heterogeneous data sharing and enable organizations to automate and simplify their data management.



NAS Application Uses (Low Performance):

  • File/Print server
  • Application specific server
  • Video Imaging
  • Graphical image store
  • Centralized heterogeneous file sharing
  • File system mirroring
  • Snap shot critical data
  • Replacement of traditional backup methods
  • Medical imaging
  • CAD/CAM
  • Portable centralized storage for offsite projects
  • Onsite repository for backup data

Advantages of NAS:

  • NAS systems offer a number of advantages:
  • Heterogeneous OS support. Users running different types of machines (PC, Apple iMac, etc.) and running different types of operating systems (Windows, Unix, Mac OS, etc.) can share files.
  • Easy to install and manage. NAS appliances are “plug-and-play” meaning that very little installation and configuration is required beyond connecting them to the LAN.
  • NAS appliances can be administrated remotely, i.e. from other locations.
  • Less administration overhead than that required for a Unix or Windows file server.
  • Leverages existing network architecture since NAS are on LANs.
  • NAS server OSs are smaller, faster, and optimized for the specialized task of file serving and are therefore undemanding in terms of processing power.
  • A NAS appliance is a standalone file server and can free up other servers to run applications. Compared to iSCSI an additional host server is not necessary.
  • Compared to iSCSI, NAS appliances already include integrated mechanisms for backup, data synchronization and data replication.

Disadvantages of NAS:

  • Heavy use of NAS will clog up the shared LAN negatively affecting the users on the LAN. Therefore NAS is not suitable for data transfer intensive applications.
  • Somewhat inefficient since data transfer rides on top of standard TCP/IP protocol.
  • Cannot offer any storage service guarantees for mission critical operations since NAS operates in a shared environment.
  • NAS is shared storage. As with other shared storage, system administrators must enforce quotas without which a few users may hog all the storage at the expense of other users.
  • NAS is less flexible than a traditional server.
  • Most database systems such as Oracle or Microsoft Exchange are block-based and are therefore incompatible with file-based NAS servers (except for SQL).

Storage Area Network:

Storage Area Networks (SAN), which also include iSCSI, are distinguished from other forms of network storage by using a block based protocol and generally run over an independent, specialized storage network. Data traffic on these networks is very similar to those used for internal disk drives, like ATA and SCSI. With the exception of SAN file systems and clustered computing, SAN storage is still a one-to-one relationship. That is, each device (or Logical Unit Number (LUN)) on the SAN is “owned” by a single computer (or initiator). SANs tend to increase storage capacity utilization, since multiple servers can share the same growth reserve. Other benefits include the ability to allow servers to boot from the SAN itself. This allows for a quick and easy replacement of faulty servers since the SAN can be reconfigured so that a replacement server can use the LUN of the faulty server.


iSCSI/SAN Application Uses(High Performance):

  • Offer power users disk space on demand
  • Databases (Oracle, MS-SQL, MySQL)
  • Video Imaging
  • Graphical image store
  • File system mirroring
  • Snap shot critical data
  • Replacement of traditional backup methods
  • Medical imaging
  • CAD/CAM
  • Onsite repository for backup data
 
Advantages of iSCSI:

  • Ease of scaling disk storage. With iSCSI the disks are remote from the server, therefore adding a new disk just requires the use of disk manager, or if replacing the whole server, re-mapping the data to the server using iSCSI. With iSCSI you can easily create huge storage pools with volumes in the range of several tera- or petabytes.
  • In comparison to NAS, which provides a file-level interface, iSCSI provides a block level interface and are therefore compatible with database applications such as Oracle or Microsoft Exchange, that also use a block level interface. Leverages existing network architecture since iSCSI are on LANs.
  • iSCSI storage appliances can be seamlessly integrated into existing SAN environments, since it also runs on block level storage.
  • iSCSI can provide significant benefits for providing failover for high availability configurations. iSCSI allows IP based replication, mirroring and clustering of data and offers integrated MPIO (Multi-Path I/O).
  • iSCSI can also be configured as a particularly flexible DAS system – the local SCSI bus is so to speak extended by the network.
  • As iSCSI is an underlying technology to the OS and uses the native file system of the applications, it is fully compatible with all file systems.



Disadvantages of iSCSI:

  • The demands of accommodating SCSI commands and SCSI data packets in TCP/IP packets require extensive hardware resources: CPU performance should be at least that of a 3 GHz Pentium processor, Gigabit Ethernet (GbE) should accordingly be used as a network interface, and the RAM requirement is also significant.
  • For sharing iSCSI targets with multiple initiators, additional server or specific (client) software for shared data access is required. Known providers of SAN data sharing software are Adic, Sanbolic, IBM, Polyserve, Dataplow and SGI.
  • As iSCSI is an underlying technology to the OS and application, anything that an organization currently has can be used. On the other hand this means that extra licenses for OS and software applications might be needed.
  • Comparing to NAS, an iSCSI Target is not a standalone device and an additional host server is necessary.
  • For sharing centralized storage pool disk among heterogeneous OS requires additional sharing software.
  • In iSCSI appliances mechanisms for backup, data synchronization and data replication are not integrated and must be configured. Comparing to NAS, iSCSI behaves like a local hard drive in the network.
 
Advantages of SAN:

  • A SAN will have all the advantages of iSCSI.
  • A SAN has higher throughput since it runs over dedicated fibre channel topology and not the LAN.
  • A SAN will reduce the overhead that iSCSI places on system resources.

Disadvantages of SAN:

  • Implementing a SAN infrastructure will cost more than NAS or iSCSI due to the additional equipment needed to build out the fibre channel topology.

Summary:

  • Direct Attached Storage is probably not going to feasibly allow us to grow our disk capacity in the future.
  • Network Attached Storage (NAS) is the obvious choice for a storage solution wherever the main focus is on storing and archiving files and shared access to these over a network – even from different client operating systems. Small and medium-sized businesses, typing pools, legal or agency offices, and even end users with large amounts of multimedia files will find an affordable storage solution for their needs in NAS.
  • For storing database systems - other than SQL-based database systems - on a network, NAS is however not a feasible solution. For requirements of this type the industry has developed Storage Area Network (SAN) technology, which can often be implemented using iSCSI components. Advantages of iSCSI: an IP-based SAN allows administrators to use their familiar management tools and security mechanisms and rely on their existing know-how. However, iSCSI only makes sense in connection with a fast LAN infrastructure: at a throughput of approximately 120 MByte/s, the performance of 1 Gbit Ethernet will be sufficient for database applications for approximately 100 users (data volume: approx. 15 MByte/s); only high-end storage systems will require a 10 GbE infrastructure. iSCSI is somewhat inefficient since data transfer rides on top of the standard TCP/IP protocol; in contrast, a Fibre Channel SAN uses protocols designed especially for data transfer (though the advantage disappears if a server on the LAN is used to provide a file interface to the SAN).

Mapping Global Zone Name Inside Solaris 10 Zone

Log into global zone where local zone(s) reside.

Using zonecfg, configure the following for each zone:
# zonecfg -z (zone)
add fs
set type=lofs
set options=ro
set special=/etc/nodename
set dir=/etc/globalname
end
verify
commit
exit

Create mount point within local zone directory structure:
# touch /zones/(zone)/root/etc/globalname

Mount lofs file system manually:
# mount -F lofs -o ro /etc/nodename /zones/(zone)/root/etc/globalname (or path where the root of the zone resides)

Confirm local zone can access file:
# zlogin (zone) cat /etc/globalname

Change /tmp size on Solaris 10 Zone


This blog describes the steps on how to change the tmp size on an existing Solaris 10 zone.

First, from the global zone where the local zone resides, log into the local zone using zlogin:

# zlogin (zone)
As root edit the /etc/vfstab:
# vi /etc/vfstab
Find the line in vfstab for the /tmp filesystem:
swap - /tmp tmpfs - yes size=512mb (this could be any value, 512 is example)
Change the value of size=512mb to the requested value (in MB):
swap - /tmp tmpfs - yes size=2048mb
Save the vfstab and exit back to the global zone.

To make the changes take effect the zone must be stopped and booted:

# zoneadm -z (zone) halt
# zoneadm -z (zone) boot
Log back into the local zone and confirm the change by reviewing df -h output:
# zlogin (zone)
# df -h|grep /tmp
swap                   2.0G     0K   2.0G     0%    /tmp

Sunday, October 17, 2010

Chinook CT80c for Apple IIc

I have seen a few discussions on the internet about the elusive Chinook CT80c for the Apple IIc, but in all those discussions there never seem to be any photographs of the device.   I figured now was the time to display some shots of the device that finally provided an external hard drive to the closed-system Apple IIc.

The Chinook CT80c, like its smaller cousins the CT20c and CT40c, was an external hard drive that connected to an Apple IIc or IIc+ computer using Apple's Smartport protocol.  The drive could be daisy chained along with 3.5" UniDisk drives and 5.25" drives, albeit the 5.25" drives needed to be last in the chain.

This is an external view of the CT80c.  The case was made of a sturdy all-aluminum shell.  There were two LEDs, one for power and one for hard drive activity.  My hard drive activity light burned out so I replaced it with a yellow LED.
This is the back side of the CT80c.  I originally had serial number 0100102, but that drive was DOA.  Chinook gladly sent me another drive before I even returned the first.
This is the hard drive side of the inside of the CT80c.   They used Conner drives, and this one is a CP3100, which is actually a 104MB drive, not an 80MB drive.   So I actually ended up with 20MB more space once I partitioned it.
This is the circuit board side of the CT80c.  Notice that it has a 6502 processor, a Hyundai 8K x 8-bit CMOS SRAM (120ns), and a Zilog Z0538010PSC SCSI processor.
Here is a closer shot of the circuit board.

Monday, February 22, 2010

Expire List of Tapes in Netbackup

When working in a large Netbackup environment, there often comes a time when you need to expire a large amount of tapes all at once.

This was exactly the scenario that happened to me when a company I consulted at changed their retention periods.  The change of retention periods from 1 year to 1 month meant they wanted to free up all the tapes that contained images more than a month old.

The solution to the problem was a script that basically does the following:

1) Reads in a list of media id's from a file.
2) Determine which media server the media id is assigned.
3) Expire the media id from the Netbackup catalog.

The script is here:

#!/usr/bin/perl
$master="hostname of master server";
# Read in the list of media ids to expire, one per line.
open DATA, "cat /home/schmaubj/media.list|";
while (<DATA>) {
 $mediaid = $_;
 chomp($mediaid);
 # Ask Netbackup which media server owns this media id.
 open DATA2, "/usr/openv/netbackup/bin/admincmd/bpmedialist -U -m $mediaid|";
 while (<DATA2>) {
  $line = $_;
  chomp($line);
  if ($line =~ /Server Host/) {
   ($junk,$mhost) = split(/=/,$line);
   chomp($mhost);
   $mhost =~ s/ *$//;
   $mhost =~ s/^ *//;
  }
 }
 close (DATA2);
 print "Media ID: $mediaid\n";
 print "Media Server Host: $mhost\n";
 print "Expiring now...\n";
 # Expire the media id from the Netbackup catalog.
 $expire=`/usr/openv/netbackup/bin/admincmd/bpexpdate -force -d 0 -m $mediaid -host $mhost -M $master`;
}
close (DATA);