Archive for May, 2014

In my Folsom OpenStack setup installed by DevStack, it takes more than 5 minutes to delete a 50 GB volume.

Why does it take so long?

When a volume is deleted, OpenStack wipes the disk space that was allocated to the volume. The space is wiped by writing zeros over the whole volume (dd if=/dev/zero). This is done as a security measure: if the volume were deleted without being wiped, its data could be visible to volumes created later from the same space. When the volume is large (say 50 GB), this wiping takes a long time.

What are the workarounds?

The default behavior can be changed by editing the Cinder configuration in the /etc/cinder/cinder.conf file.

Skip clearing the volume

Setting volume_clear to none makes OpenStack skip clearing the volume.

# Method used to wipe old volumes (valid options are: none,
# zero, shred) (string value)
volume_clear=none
Setting volume_clear=zero will overwrite the volume with zeros.
If volume_clear=shred, OpenStack will use the Linux "shred -n3" command to wipe the volume, which securely overwrites it 3 times.
Clear only the first part of the volume
If you set volume_clear_size to a value other than zero, OpenStack clears only that many MiB from the beginning of the space allocated to the volume. This is useful when the volume contains encrypted data, because destroying the beginning of the encrypted data is enough to prevent the rest from being decrypted.
# Size in MiB to wipe at start of old volumes. 0 => all
# (integer value)
volume_clear_size=100
Please note that if volume_clear=none, this configuration parameter is ignored.
Delete only the blocks that data has been written to

Setting lvm_type to thin makes OpenStack wipe only the blocks that data has actually been written to, since thin-provisioned LVM keeps track of which blocks are in use.

# Type of LVM volumes to deploy; (default or thin) (string
# value)
lvm_type=thin
The OpenStack volume deletion code is pasted here for more clarification.
def clear_volume(self, volume):
    """unprovision old volumes to prevent data leaking between users."""

    vol_path = self.local_path(volume)
    size_in_g = volume.get('size')
    size_in_m = FLAGS.volume_clear_size

    if not size_in_g:
        return

    if FLAGS.volume_clear == 'none':
        return

    LOG.info(_("Performing secure delete on volume: %s") % volume['id'])

    if FLAGS.volume_clear == 'zero':
        if size_in_m == 0:
            return self._copy_volume('/dev/zero', vol_path, size_in_g)
        else:
            clear_cmd = ['shred', '-n0', '-z', '-s%dMiB' % size_in_m]
    elif FLAGS.volume_clear == 'shred':
        clear_cmd = ['shred', '-n3']
        if size_in_m:
            clear_cmd.append('-s%dMiB' % size_in_m)
    else:
        LOG.error(_("Error unrecognized volume_clear option: %s"),
                  FLAGS.volume_clear)
        return

    clear_cmd.append(vol_path)
    self._execute(*clear_cmd, run_as_root=True)

If you have installed OpenStack using DevStack, you will be able to create volumes or volume snapshots only until the total capacity reaches 10 GB. You won’t be able to create a volume larger than 10 GB, and you won’t be able to create more volumes once the total reaches 10 GB.

This article shows how to increase this volume capacity so that you can create more and larger volumes.

View the volume groups by executing the vgs command. You can see that the volume group "stack-volumes" is 10 GB in size, so volumes can only be created up to a total of 10 GB.

 $vgs
  VG            #PV #LV #SN Attr   VSize   VFree
  stack-volumes   3   1   0 wz--n-  10.00g 10.00g
  stratos1        1   2   0 wz--n- 931.09g 48.00m

Let’s create a new partition so that we can increase the capacity of the volume group. A file named "cinder-volumes" with a size of 50 GB is created and associated with the loop device /dev/loop3. Then the device is partitioned using fdisk.

dd if=/dev/zero of=cinder-volumes bs=1 count=0 seek=50G
losetup /dev/loop3 cinder-volumes
fdisk /dev/loop3

And at the fdisk prompt, enter the following commands:

n
p
1
ENTER
ENTER
t
8e
w

Create a physical volume with the above device.

root@stratos1:~# pvcreate /dev/loop3
  Physical volume "/dev/loop3" successfully created

Extend the volume group (stack-volumes) by adding the newly created device.

root@stratos1:~# vgextend stack-volumes  /dev/loop3
  Volume group "stack-volumes" successfully extended

Let’s see the details of the available physical volumes. You will see the new device listed.

root@stratos1:~# pvs
  PV         VG            Fmt  Attr PSize   PFree
  /dev/loop0 stack-volumes lvm2 a-    10.01g 10.01g
  /dev/loop3 stack-volumes lvm2 a-    50.00g 50.00g

Now check the details of the volume groups by executing the vgdisplay command. You will see that there is more free space in the volume group "stack-volumes" (60 GB, since we added 50 GB more).

root@stratos1:~# vgdisplay
  --- Volume group ---
  VG Name               stack-volumes
  System ID             
  Format                lvm2
  Metadata Areas        3
  Metadata Sequence No  303
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               1
  Max PV                0
  Cur PV                3
  Act PV                3
  VG Size               60.00 GiB
  PE Size               4.00 MiB
  Total PE              23040
  Alloc PE / Size       7680 / 30.00 GiB
  Free  PE / Size       15360 / 60.00 GiB
  VG UUID               bM4X5R-hC3V-zY5F-ZMVI-s7dz-Kpiu-tPQ2Zt

Now you will be able to create more and larger volumes.

What is Auto Scaling?

An auto scaling system automatically scales up, spawning additional service instances during peak times, and automatically scales down, shutting down the extra service instances during off-peak times.

Auto Scaling parameters

The Apache Stratos PaaS framework provides a built-in auto scaling capability based on three parameters: Requests In Flight, Load Average and Memory Consumption. These three parameters are described below.
Requests In Flight (RIF)
Requests In Flight is the number of requests that have not yet been served by the Load Balancer (LB). When the LB receives a request, it increments the RIF count by one and forwards the request to one of the service instances. When the service instance responds, the LB forwards the response back to the client that originally sent the request and decrements the RIF count by one. Hence the RIF count denotes the number of requests that are pending to be served. A high RIF count means a large number of requests are queued but not yet served, which is a good indication that the system needs additional resources to serve the incoming requests.
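
To make the bookkeeping concrete, below is a minimal Java sketch of how a load balancer could maintain the RIF count. The class and method names are hypothetical illustrations, not the actual Stratos load balancer code.

import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of RIF bookkeeping; the real Stratos LB implementation differs.
public class RequestsInFlightCounter {

    private final AtomicInteger requestsInFlight = new AtomicInteger(0);

    // Called when the LB receives a request, before forwarding it to a service instance.
    public void onRequestReceived() {
        requestsInFlight.incrementAndGet();
    }

    // Called after the LB forwards the instance's response back to the client.
    public void onResponseSent() {
        requestsInFlight.decrementAndGet();
    }

    // The value published for aggregation: requests received but not yet answered.
    public int current() {
        return requestsInFlight.get();
    }
}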

Load Average

The Load Average value is the CPU usage of the service instances. A high Load Average value means the CPU usage or load of the service instances is high for some reason; it may be because the instance is serving many requests, or because the CPU is being consumed by some other process. Regardless of the reason, high CPU usage indicates that the instances or nodes are not in a healthy condition to serve more incoming requests. In that case, additional resources or instances should be added to handle future requests; it is time to scale up.

The formula used to calculate Load Average is: Load Average = (loadAvg / cores) * 100
In the above formula, loadAvg refers to the load average of the instance during the last minute, and cores refers to the number of CPU cores of the instance.

To be more specific, below is the Java code used to calculate the load average.

import java.lang.management.ManagementFactory;

double loadAvg = (double) ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
// assume system cores = available cores to the JVM
int cores = ManagementFactory.getOperatingSystemMXBean().getAvailableProcessors();
double loadAvgPercentage = (loadAvg / cores) * 100;

The loadAvg above is the same as the first of the three load average numbers reported by the Linux shell commands "uptime" and "top" (2.82 in the output below).
$ uptime
18:24:39 up 4 days,  4:08,  8 users,  load average: 2.82, 2.63, 2.17
A detailed article on the Linux load average can be found here: http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages

Memory Consumption

Memory Consumption is the average memory usage of the instances. Memory consumption may grow due to a high number of threads processing incoming requests. High memory consumption on the nodes indicates that the system needs more nodes to serve incoming requests.

The formula used to calculate Memory Consumption is: Memory Consumption = (usedMemory / totalMemory) * 100
In the above formula, usedMemory refers to the amount of memory currently in use and totalMemory refers to the total amount of memory available in the instance.
The Java code to calculate the memory consumption is pasted below for better understanding.

import java.lang.management.ManagementFactory;
import com.sun.management.OperatingSystemMXBean;

private static final int MB = 1024 * 1024;
OperatingSystemMXBean osBean = (OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();
double totalMemory = (double) (osBean.getTotalPhysicalMemorySize() / MB);
double usedMemory = (double) (totalMemory - (osBean.getFreePhysicalMemorySize() / MB));
double memoryConsumption = (usedMemory / totalMemory) * 100;

totalMemory and usedMemory in the above formula correspond to the total and used values in the output of the Linux free command.

$ free -m
total       used       free     shared    buffers     cached
Mem:          7687       6362       1324          0        177       1970
-/+ buffers/cache:       4214       3472
Swap:        15623       1376      14247

Accordingly, Memory Consumption = (6362/7687) * 100 ≈ 83

Auto Scaling Policy

When the value of requests in flight, load average or memory consumption is high, the system should scale up (add more nodes), and when it is low, the system should scale down (remove the additional nodes). The question is when a value is considered HIGH and when it is considered LOW. Stratos uses a policy called the Auto Scaling Policy to make this decision. DevOps can deploy multiple auto scaling policies, and the user selects one at the time of subscription. The auto scaling policy describes the threshold values for each parameter that decide whether it is high or low.
A sample auto scaling policy is shown below.

{
  "id": "autoscale-policy-1",
  "loadThresholds": {
    "requestsInFlight": {
      "average": 50
    },
    "memoryConsumption": {
      "average": 70
    },
    "loadAverage": {
      "average": 70
    }
  }
}
The auto scaling policy can be deployed via the Stratos UI, the Stratos CLI, or by invoking the REST service hosted in the Stratos Manager, as shown below.

curl -X POST -H "Content-Type: application/json" -d @policyJson -k -u admin:admin https://<SM_HOST>:9443/stratos/admin/policy/autoscale

This article is an extension of the article titled "Auto Scaling with Apache Stratos", which explained the basics of Stratos auto scaling. This article explains how the Stratos components work together to perform auto scaling.

Let’s consider a simple scenario where a user has subscribed to a PHP cartridge. Assume there are two PHP instances currently running in the system. Figure 1 depicts such an environment. Stratos components such as the AS (Auto Scaler), CC (Cloud Controller) and CEP (Complex Event Processor) are shown inside the blue box. The PHP instances and the load balancer instance are shown in the green box.

Every PHP instance and load balancer instance has a Stratos Cartridge Agent running inside it, which communicates with the Stratos system.

The underlying operations of auto scaling are listed below.

Figure 1: How Auto Scaling works in Apache Stratos

  1. PHP instances periodically send their Memory Consumption and Load Average values to the CEP via the Thrift protocol.
  2. The load balancer instance periodically sends the Requests In Flight (RIF) count of the PHP cluster to the CEP.
  3. The CEP aggregates the Memory Consumption, Load Average and RIF values and publishes the aggregated results to the "Summarized Health Stats" topic of the Message Broker.
  4. The Auto Scaler receives the message carrying the aggregated results, since it has subscribed to the above topic.
  5. The scaling rules, which run periodically inside the Auto Scaler, compare the received aggregated results with the auto scaling policy (a minimal sketch of this comparison is shown after this list).

    If the received values are above the threshold values of the auto scaling policy, the AS decides to scale up; otherwise it keeps calm.

    As peak time approaches and a large number of users send requests to your PHP apps:

    • The RIF count of the LB increases, since there are more requests.
    • The PHP instances have to serve more requests, which creates more threads and thus increases the Memory Consumption of the nodes.
    • The more requests there are, the higher the Load Average becomes.

    This causes the current aggregated results to exceed the threshold values in the auto scaling policy.

    When the AS decides to scale up, it asks the Cloud Controller (CC) to start a new PHP instance.

  6. The CC spawns a new PHP instance.
  • The newly created PHP instance becomes a member of the same PHP cluster.
  • The load balancer adds the new PHP instance to its cluster and starts forwarding user requests to the new member as well.
  • The new PHP instance starts sending its Memory Consumption and Load Average details to the CEP.
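
To make step 5 concrete, below is a minimal Java sketch of the kind of threshold comparison the scaling rules perform. The class and method names are hypothetical, not the actual Auto Scaler code (which uses rule-based evaluation and richer statistics), and the threshold constants are taken from the sample auto scaling policy shown earlier.

// Hypothetical illustration of the threshold check performed by the scaling rules.
public class ScalingDecision {

    // Threshold values taken from the deployed auto scaling policy (see the JSON above).
    static final double RIF_THRESHOLD = 50;
    static final double MEMORY_THRESHOLD = 70;
    static final double LOAD_AVERAGE_THRESHOLD = 70;

    // Aggregated values received from the CEP via the "Summarized Health Stats" topic.
    public static boolean shouldScaleUp(double avgRequestsInFlight,
                                        double avgMemoryConsumption,
                                        double avgLoadAverage) {
        return avgRequestsInFlight > RIF_THRESHOLD
                || avgMemoryConsumption > MEMORY_THRESHOLD
                || avgLoadAverage > LOAD_AVERAGE_THRESHOLD;
    }

    public static void main(String[] args) {
        // Peak-time sample: RIF and memory consumption exceed the thresholds, so scale up.
        System.out.println(shouldScaleUp(120, 85, 40));  // true
        // Off-peak sample: everything is below the thresholds, so no new instance is needed.
        System.out.println(shouldScaleUp(5, 30, 10));    // false
    }
}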