Ceph

Ceph: Health WARN – too many PGs per OSD

Ceph health (or status) reported the warning "too many PGs per OSD". How do we solve this?

health HEALTH_WARN
too many PGs per OSD (320 > max 300)

What does this warning mean?

The average number of PGs per OSD has exceeded the warning threshold (the default is 300):

 => Average PGs per OSD = Total number of PGs in all pools / Total number of OSDs

If this average is higher than the default (i.e. 300), the Ceph monitor reports the warning.
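You can recompute this average yourself. The commands below are only a rough sketch, assuming a pre-Luminous cluster where 'ceph osd pool ls detail' prints a "pg_num <N>" field for each pool:

$ ceph osd pool ls detail | awk '{for (i = 1; i <= NF; i++) if ($i == "pg_num") total += $(i + 1)} END {print total}'  // sum of pg_num across all pools
$ ceph osd stat   // shows the total number of OSDs

Divide the first number by the second to get the average the monitor is warning about.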

How to solve/suppress this warning message:

Use injectargs to set "mon_pg_warn_max_per_osd" to 0. This change is temporary and lasts only until the Ceph monitors restart.

# ceph tell mon.* injectargs  "--mon_pg_warn_max_per_osd 0" 

To make the change persistent, add the line below to ceph.conf and restart the Ceph monitors:

mon_pg_warn_max_per_osd = 0
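A minimal sketch of how this usually looks in ceph.conf (assuming the option is kept in the [global] section; a [mon] section works as well), followed by a monitor restart on systemd-based installs:

[global]
# suppress the 'too many PGs per OSD' health warning
mon_pg_warn_max_per_osd = 0

# systemctl restart ceph-mon.target   // restart the monitors (systemd-based installs)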

Ceph

Ceph: How to change OSD nearfull and full ratio

Here is a quick way to change the OSD nearfull and full ratios:

# ceph pg set_nearfull_ratio 0.88   // Will change the nearfull ratio to 88%
# ceph pg set_full_ratio 0.92       // Will change the full ratio to 92%


You can also set these values using "injectargs", but sometimes the new configuration is not actually injected:

For example:

# ceph tell mon.* injectargs "--mon_osd_nearfull_ratio .88"
# ceph tell mon.* injectargs "--mon_osd_full_ratio .98"
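To confirm whether the injection actually took effect, you can read the running values back from the monitor's admin socket. This is just a convenience check; run it on the monitor host, where <id> is the monitor name (e.g. its hostname):

$ ceph daemon mon.<id> config show | grep full_ratio   // shows the currently loaded nearfull/full ratios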


Openstack

OpenStack: What’s new in OpenStack Ocata

Please refer to the OpenStack Ocata release notes for the version and other details:

Release notes: https://releases.openstack.org/ocata/index.html
Documentation: https://docs.openstack.org/ocata/

Thanks a lot to all the PTLs and developers who helped with this release.

Nova (OpenStack Compute Service) – Version – 15.0.0

  1. VM placement changes: The Nova filter scheduler will now use the Placement API to filter compute nodes based on CPU/RAM/Disk capacity.
  2. High availability: Nova now uses Cells v2 for all deployments; deployments are currently single-cell, and the next release, Pike, will support multi-cell clouds.
  3. Neutron is now the default networking option.
  4. Upgrade capabilities: Use the new ‘nova-status upgrade check’ CLI command to see what’s required to upgrade to Ocata.
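As a quick illustration (assuming the Ocata nova 15.x code is already installed on the node running the API services), the check is a single command that reports which upgrade checks pass or fail:

# nova-status upgrade check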

Keystone (OpenStack Identity Service) – Version – 11.0.0

  1. Per-user Multi-Factor-Auth rules (MFA rules): You can now specify multiple forms of authentication before Keystone will issue a token.  For example, some users might just need a password, while others might have to provide a time-based one time password and an additional form of authentication.
  2. Auto-provisioning for federated identity: When a user logs into a federated system, Keystone will dynamically create that user and assign a role; previously, the user had to log into that system independently, which was confusing to users.
  3. Validate an expired token: Finally, no more failures due to long-running operations such as uploading a snapshot. Each project can specify whether it will accept expired tokens, and just HOW expired those tokens can be.

Swift (OpenStack Object Storage) – Version – 2.11.0

  1. Improved compatibility: Byteorder information is now included in Ring files to support machines with different endianness.
  2. More flexibility: You can now configure the base URL for static web.  You can also set the “filename” parameter in TempURLs and validate those TempURLs against a common prefix.
  3. More data: If you’re dealing with large objects, you can now use multi-range GETs and HTTP 416 responses.

Cinder (OpenStack Block Storage) – Version – 10.0.0

  1. Active/Active HA: Cinder can now run in Active/Active clustered mode, preventing concurrent operation conflicts. Cinder will also handle mid-processing service failures better than in past releases.
  2. New attach/detach APIs: If you’ve been confused about how to attach and detach volumes to and from VMs, you’re not alone. The Ocata release saw the Cinder team refactor these APIs in preparation for adding the ability to attach a single volume to multiple VMs, expected in an upcoming release.

Glance (OpenStack Image Service) – Version – 14.0.0

  1. Image visibility:  Users can now create “community” images, making them available for everyone else to use. You can also specify an image as “shared” to specify that only certain users have access.

Neutron (OpenStack Networking Service) – Version -10.0.0

  1. Support for Routed Provider Networks in Neutron: You can now use the Nova GRP (Generic Resource Pools) API to publish networks in IPv4 inventory.  Also, the Nova scheduler uses this inventory as a hint to place instances based on IPv4 address availability in routed network segments.
  2. Resource tag mechanism: You can now create tags for subnet, port, subnet pool and router resources, making it possible to do things like map different networks in different OpenStack clouds in one logical network or tag provider networks (i.e. High-speed, High-Bandwidth, Dial-Up).

Heat (OpenStack Orchestration Service) – Version -8.0.0

  1. Notification and application workflow: Use the new  OS::Zaqar::Notification to subscribe to Zaqar queues for notifications, or the OS::Zaqar::MistralTrigger for just Mistral notifications.

Horizon (OpenStack Dashboard) – Version – 11.0.0

  1. Easier profiling and debugging:  The new Profiler Panel uses the os-profiler library to provide profiling of requests through Horizon to the OpenStack APIs so you can see what’s going on inside your cloud.
  2. Easier Federation configuration: If Keystone is configured with Keystone to Keystone (K2K) federation and has service providers, you can now choose Keystone providers from a dropdown menu.

Telemetry (Ceilometer) – Version – 8.0.0

  1. Better instance discovery:  Ceilometer now uses libvirt directly by default, rather than nova-api.

Telemetry (Gnocchi)

  1. Dynamically resample measures through a new API.
  2. New collectd plugin: Store metrics generated by collectd.
  3. Store data on Amazon S3 with new storage driver.

Dragonflow (Distributed SDN Controller)

  1. Better support for modern networking: Dragonflow now supports IPv6 and distributed sNAT.
  2. Live migration: Dragonflow now supports live migration of VMs.

Kuryr (Container Networking)

  1. Neutron support: Neutron networking is now available to containers running inside a VM.  For example, you can now assign one Neutron port per container.
  2. More flexibility with driver-based support: Kuryr-libnetwork now allows you to choose between ipvlan, macvlan or Neutron vlan trunk ports or even create your own driver. Also, Kuryr-kubernetes has support for ovs hybrid, ovs native and Dragonflow.
  3. Container Networking Interface (CNI):  You can now use the Kubernetes CNI with Kuryr-kubernetes.
  4. More platforms: The controller now handles Pods on bare metal, handles Pods in VMs by providing them Neutron subports, and provides services with LBaaSv2.

Vitrage (Root Cause Analysis Service) – Version -1.5.0

  1. A new collectd datasource: Use this fast system statistics collection daemon, with plugins that collect different metrics. From Ifat Afek: “We tested the DPDK plugin, that can trigger alarms such as interface failure or noisy neighbors. Based on these alarms, Vitrage can deduce the existence of problems in the host, instances and applications, and provide the RCA (Root Cause Analysis) for these problems.”
  2. New “post event” API: This general-purpose API allows easy integration of new monitors into Vitrage.
  3. Multi Tenancy support: A user will only see alarms and resources which belong to that user’s tenant.

Ironic (Bare Metal Service) – Version – 7.0.0

  1. Easier, more powerful management: A revamp of how drivers are composed, “dynamic drivers” enable users to select a “hardware type” for a machine rather than working through a matrix of hardware types. Users can independently change the deploy method, console manager, RAID management, power control interface and so on. Ocata also brings the ability to do soft power off and soft reboot, and to send non-maskable interrupts through both ironic and nova’s API.

TripleO (Deployment Service) – Version – 3.1.0

  1. Easier per-service upgrades: Perform step-by-step tasks as batched/rolling upgrades or in parallel. All roles, including custom roles, can be upgraded this way.
  2. Composable High-Availability architecture: Services managed by Pacemaker such as galera, redis, VIPs, haproxy, cinder-volume, rabbitmq, cinder-backup, and manila-share can now be deployed in multiple clusters, making it possible to scale-out the number of nodes running these services.

OpenStackAnsible (Ansible Playbooks and Roles for Deployment)

  1. Additional support: OpenStack-Ansible now supports CentOS 7, as well as integration with Ceph.

Puppet OpenStack (Puppet Modules for Deployment)

  1. New modules and functionality: The Ocata release includes new modules for puppet-ec2api, puppet-octavia, puppet-panko and puppet-watcher. Also, existing modules support configuring the [DEFAULT]/transport_url configuration option. This change makes it possible to support AMQP providers other than rabbitmq, such as zeromq.

Barbican (Key Manager Service) – Version- 4.0.0

  1. Testing:  Barbican now includes a new Tempest test framework.

Congress (Governance Service) – Version – 5.0.0

  1. Network address operations:  The policy language has been enhanced to enable users to specify network policy use cases.
  2. Quick start:  Congress now includes a default policy library so that it’s useful out of the box.

Monasca (Monitoring) – Version – 1.4.0

  1. Completion of Logging-as-a-Service:  Kibana support and integration is now complete, enabling you to push/publish logs to the Monasca Log API, and the logs are authenticated and authorized using Keystone and stored scoped to a tenant/project, so users can only see information from their own logs.
  2. Container support:  Monasca now supports monitoring of Docker containers, and is adding support for the Prometheus monitoring solution. Upcoming releases will also see auto-discovery and monitoring of applications launched in a Kubernetes cluster.

Trove (Database as a Service) – Version – 7.0.0

  1. Multi-region deployments: Database clusters can now be deployed across multiple OpenStack regions.

Mistral (Taskflow as a Service) – Version – 4.0.0

  1. Multi-node mode: You can now deploy the Mistral engine in multi-node mode, providing the ability to scale out.

Rally (Benchmarking as a Service)

  1. Expanded verification options:  Whereas previous versions enabled you to use only Tempest to verify your cluster, the newest version of Rally enables you to use other forms of verification, which means that Rally can actually be used for the non-OpenStack portions of your application and infrastructure. (You can find the full release notes here.)

Zaqar (Message Service) – Version – 4.0.0

  1. Storage replication:  You can now use Swift as a storage option, providing built-in replication capabilities.

Octavia (Load Balancer Service) –  Version – 0.10.0

  1. More flexibility for Load Balancer as a Service:  You may now use Neutron host-routes and custom MTU configurations when configuring LBaaS.

Solum (Platform as a Service) – Version – 5.1.0

  1. Responsive deployment:  You may now configure deployments based on Github triggers, which means that you can implement CI/CD by specifying that your application should redeploy when there are changes.

Tricircle (Networking Automation Across Neutron Service) – Version – 5.1.0

  1. DVR support in local Neutron:  The East-West and North-South bridging networks have been combined into a single North-South bridging network, making it possible to support DVR in local Neutron.

Kolla (Container Based Deployment) – Version – 4.0.0

  1. Dynamic volume provisioning: Kolla-Kubernetes by default uses Ceph for stateful storage, and with Kubernetes 1.5, support was added for Ceph and dynamic volume provisioning as requested by claims made against the API server.

Freezer (Backup, Restore, and Disaster Recovery Service) – Version – 4.0.0

  1. Block incremental backups:  Ocata now includes the Rsync engine, enabling these incremental backups.

Senlin (Clustering Service) – Version – 3.0.0

  1. Generic Event/Notification support: In addition to its usual capability of logging events to a database, Senlin now enables you to add the sending of events to a message queue and to a log file, enabling dynamic monitoring.

Watcher (Infrastructure Optimization Service) – Version -0.32.0

  1. Multiple-backend support: Watcher now supports metrics collection from multiple backends.

Cloudkitty (Rating Service) – Version 5.0.0

  1. Easier management:  CloudKitty now includes a Horizon wizard and hints on the CLI to determine the available metrics. Also, Cloudkitty is now part of the unified OpenStack client.

Manila (Shared File System Service) – Version 5.0.0

  1. Quota usage details: The quota-sets API can now show user- and tenant-specific usage.
  2. Purge sub-command: A purge sub-command has been added to purge soft-deleted rows.

Other useful links on Ocata release:

  1. Webinar talking about 157 new features.
  2. 53 new things to look for in OpenStack Ocata.
Ceph

Ceph: What is ‘sortbitwise’ flag?

After upgrading to the Ceph Jewel release, users should set the ‘sortbitwise’ flag to enable the new internal object sort order.

  • “sortbitwise” is a change in the internal sorting algorithm.
  • It is exposed to the end user only because of legacy (pre-Jewel) compatibility.
  • On the latest Ceph installations it should always be ‘on’.
$ ceph osd set sortbitwise   // To set the sortbitwise
$ ceph osd unset sortbitwise // to unset the sortbitwise
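To check whether the flag is currently set, it appears in the flags line of the OSD map; a quick way to look (on Jewel or later clusters) is:

$ ceph osd dump | grep flags   // 'sortbitwise' should be listed among the flags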
Ceph, Openstack, Rados GW, Telemetry

Ceph RGW usage with OpenStack Telemetry

We assume that Ceph RGW (i.e. RADOS Gateway) and OpenStack Keystone are already integrated and working. If not, you can follow the Ceph documentation here.
You can make sure everything works properly with:

# source openrc // with admin tenant
# swift -V 2.0 -A http://localhost:5000/v2.0  stat
             Account: v1
           Containers: 1
              Objects: 174
                 Bytes: 26824860
X-Account-Bytes-Used-Actual: 27140096
             X-Trans-Id: tx00000000000000000000d-0056b23cdc-19cf4-default
           Content-Type: text/plain; charset=utf-8
          Accept-Ranges: bytes

Make sure usage logs are enabled on the radosgw by setting rgw_enable_usage_log = true. You will also need a RadosGW admin user; I created one as follows:

# radosgw-admin user create --uid admin --display-name "admin user" --caps "usage=read,write;metadata=read,write;users=read,write;buckets=read,write"
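The create command prints the generated access and secret keys in its JSON output; if you need to look them up again later, 'radosgw-admin user info' shows the same details (using the 'admin' uid created above):

# radosgw-admin user info --uid admin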

In /etc/ceilometer/ceilometer.conf, uncomment and configure the following (the radosgw service type goes under the [service_types] section):

[service_types]
radosgw = object-store

[rgw_admin_credentials]
access_key = <admin user access_key>
secret_key = <admin user secret>

The rgw Ceilometer module depends on requests-aws to access the Admin Ops API. Install it and restart the Ceilometer service:

# pip install requests-aws
# openstack-service restart ceilometer

After a while, the meters should show up

(keystone_admin)# ceilometer meter-list
+---------------------------------+-------+---------+---------------------------------------------+---------+----------------------------------+
| Name                            | Type  | Unit    | Resource ID                                 | User ID | Project ID                       |
+---------------------------------+-------+---------+---------------------------------------------+---------+----------------------------------+
| image                           | gauge | image   | d230f54e-6877-451e-bd6c-ca1c460b9041        | None    | ff4d8e6c7e244acb82476fb6acba1833 |
| image.size                      | gauge | B       | d230f54e-6877-451e-bd6c-ca1c460b9041        | None    | ff4d8e6c7e244acb82476fb6acba1833 |
| radosgw.api.request             | gauge | request | 4ddbab4fb6b445fb8195dd08b9cc05a9            | None    | 4ddbab4fb6b445fb8195dd08b9cc05a9 |
| radosgw.api.request             | gauge | request | 8ce73b1b38ab4945b294d691dfd86101            | None    | 8ce73b1b38ab4945b294d691dfd86101 |
| radosgw.api.request             | gauge | request | ff4d8e6c7e244acb82476fb6acba1833            | None    | ff4d8e6c7e244acb82476fb6acba1833 |
| radosgw.containers.objects      | gauge | object  | 4ddbab4fb6b445fb8195dd08b9cc05a9/container1 | None    | 4ddbab4fb6b445fb8195dd08b9cc05a9 |
| radosgw.containers.objects.size | gauge | B       | 4ddbab4fb6b445fb8195dd08b9cc05a9/container1 | None    | 4ddbab4fb6b445fb8195dd08b9cc05a9 |
| radosgw.objects                 | gauge | object  | 4ddbab4fb6b445fb8195dd08b9cc05a9            | None    | 4ddbab4fb6b445fb8195dd08b9cc05a9 |
| radosgw.objects                 | gauge | object  | 8ce73b1b38ab4945b294d691dfd86101            | None    | 8ce73b1b38ab4945b294d691dfd86101 |
| radosgw.objects                 | gauge | object  | ff4d8e6c7e244acb82476fb6acba1833            | None    | ff4d8e6c7e244acb82476fb6acba1833 |
| radosgw.objects.containers      | gauge | object  | 4ddbab4fb6b445fb8195dd08b9cc05a9            | None    | 4ddbab4fb6b445fb8195dd08b9cc05a9 |
| radosgw.objects.containers      | gauge | object  | 8ce73b1b38ab4945b294d691dfd86101            | None    | 8ce73b1b38ab4945b294d691dfd86101 |
| radosgw.objects.containers      | gauge | object  | ff4d8e6c7e244acb82476fb6acba1833            | None    | ff4d8e6c7e244acb82476fb6acba1833 |
| radosgw.objects.size            | gauge | B       | 4ddbab4fb6b445fb8195dd08b9cc05a9            | None    | 4ddbab4fb6b445fb8195dd08b9cc05a9 |
| radosgw.objects.size            | gauge | B       | 8ce73b1b38ab4945b294d691dfd86101            | None    | 8ce73b1b38ab4945b294d691dfd86101 |
| radosgw.objects.size            | gauge | B       | ff4d8e6c7e244acb82476fb6acba1833            | None    | ff4d8e6c7e244acb82476fb6acba1833 |
+---------------------------------+-------+---------+---------------------------------------------+---------+----------------------------------+

Get the samples for the radosgw.objects meters on tenant ID 4ddbab4fb6b445fb8195dd08b9cc05a9 (admin)

(keystone_admin)# ceilometer sample-list -q resource_id=4ddbab4fb6b445fb8195dd08b9cc05a9 -m radosgw.objects
+----------------------------------+-----------------+-------+--------+--------+----------------------------+
| Resource ID                      | Name            | Type  | Volume | Unit   | Timestamp                  |
+----------------------------------+-----------------+-------+--------+--------+----------------------------+
| 4ddbab4fb6b445fb8195dd08b9cc05a9 | radosgw.objects | gauge | 87.0   | object | 2016-02-03T17:21:50.201000 |
| 4ddbab4fb6b445fb8195dd08b9cc05a9 | radosgw.objects | gauge | 0.0    | object | 2016-02-03T16:39:15.050000 |
+----------------------------------+-----------------+-------+--------+--------+----------------------------+
Uncategorized

Ceph: How to remove objects from a pool

How can we remove objects from a pool, without removing the pool itself?
We can use “rados -p <pool-name> cleanup --prefix <prefix>” to remove all objects with a specific prefix.

First, list all the objects in the pool using the command below:

$ rados -p <pool-name> ls

For example, if you want to clean up the ‘rados bench write’ test objects, you can use the commands below:

$ rados -p <pool-name> cleanup --prefix benchmark
$ rados -p rbdbench cleanup --prefix benchmark // will remove all objects prefixed with benchmark

You can also remove all the objects from a pool as shown below, but note that this will delete every object in the pool, so use it with care.

$ for i in $(rados -p <pool-name> ls); do echo $i; rados -p <pool-name> rm $i; done
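Another option (in recent versions of the rados CLI) is the purge sub-command, which removes every object in a pool in one shot; as above, double-check the pool name before running it:

$ rados purge <pool-name> --yes-i-really-really-mean-it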
Uncategorized

Ceph: What is data scrubbing?

Data scrubbing is:

   - An error checking and correction method, or
   - A routine check to ensure that the data on file systems is in pristine condition and has no errors.

In Ceph, data integrity is of primary concern because of the huge amounts of data being read and written daily.

A simple example of scrubbing is a file system check done with tools like ‘e2fsck’ on ext2/3/4 or ‘xfs_repair’ on XFS.

Ceph also performs a daily scrubbing as well as a weekly scrubbing (which is called deep scrubbing).
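Scrubbing normally runs on Ceph's own schedule, but it can also be triggered by hand; a minimal sketch, assuming you already know the placement group or OSD you want to check:

$ ceph pg scrub <pg-id>         // light scrub of a single placement group
$ ceph pg deep-scrub <pg-id>    // deep scrub of a single placement group
$ ceph osd scrub <osd-id>       // scrub all placement groups on a given OSD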

NOTE: Btrfs is one of the file systems that can schedule an internal scrubbing automatically, ensuring that corruption is detected and preventive measures are taken automatically. Since Btrfs can maintain multiple copies of data, once it finds an error in the primary copy, it can check for a good copy (if mirroring is used) and replace it.