OneFS: How to reset the CELOG database and clear events in OneFS 7.x

Summary: The Clusterwide Event Log (CELOG) provides a single location for logging events that happen on the cluster and provides a single point from which notifications about the events are generated, including sending alert emails and SNMP traps. Events are used to maintain a picture of cluster health for various cluster components. ...

Article Content


Instructions

BEFORE YOU START: This procedure is for clusters running OneFS 7.x. Do not perform these steps on OneFS 8.x.

If a root cause analysis (RCA) is required, capture a full set of logs and investigate the CELOG problem fully.

Introduction

The Clusterwide Event Log (CELOG) provides a single location for logging events that happen on the cluster and provides a single point from which notifications about the events are generated, including sending alert emails and SNMP traps. Events are used to maintain a picture of cluster health for various cluster components.

This article describes how to manually reset and clear the CELOG database. This can be useful when the system is not automatically clearing historical cluster events.

IMPORTANT!
If your cluster is experiencing multiple repeating events regarding the same issue, do not perform this procedure until you have confirmed that those events are resolved.

Procedure

IMPORTANT!
This procedure does not work if your cluster is in SmartLock compliance mode, because the compadmin user does not have the privileged access needed to run the rm commands described in the procedure.

NOTE
If the /var partition is full, new CELOG database files cannot be created and the procedure below will fail. If you think this might be an issue, see the following article:

  • Event notification: The /var partition is near capacity (95% used), see LKB 000169344.
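
As a rough way to check the precondition in this note, the following sketch reads the capacity column of POSIX-format df output and compares it with the 95% threshold. The function names parse_df_pct, var_usage_pct, and var_has_room are hypothetical helpers, not OneFS commands, and the sketch assumes that df -P is available on the node. It checks only the node on which it runs.

```shell
# Hypothetical helpers (not part of OneFS); assumes POSIX "df -P" output.

# Read "df -P" output on stdin and print the capacity of the first
# filesystem listed, without the trailing percent sign.
parse_df_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print the percentage used on the filesystem that holds the given path
# (default: /var) on the local node.
var_usage_pct() {
    df -P "${1:-/var}" | parse_df_pct
}

# Succeed only when usage is below the threshold (default: 95).
var_has_room() {
    pct=$(var_usage_pct "${1:-/var}")
    [ -n "$pct" ] && [ "$pct" -lt "${2:-95}" ]
}
```

For example, running var_has_room /var 95 || echo "/var is at or near capacity" prints a warning on a node that still needs attention.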

When the /var partition is no longer at or near capacity, perform the following procedure:

  1. Open an SSH connection to any node in the cluster and log in using the "root" account.
  2. To gather diagnostic information for Isilon Technical Support, run the following commands, in order, where <SR> is the Service Request number that is open for this issue, if there is one. If a service request is not open, you can use any other identifiable name, such as your company name, to identify the directory location of the saved files:
     
    mkdir -p /ifs/.ifsvar/db/celog /ifs/data/Isilon_Support/<SR> /ifs/data/Isilon_Support/celog_backups

    isi_for_array -sX 'gcore -c /ifs/data/Isilon_Support/<SR>/$(hostname)_$(date +"%Y-%m-%dT%H.%M.%S")_isi_celog_monitor.core $(pgrep isi_celog_monitor)'

    isi_for_array -sX 'gcore -c /ifs/data/Isilon_Support/<SR>/$(hostname)_$(date +"%Y-%m-%dT%H.%M.%S")_isi_celog_coalescer.core $(pgrep isi_celog_coalescer)'

    isi_for_array -sX 'gcore -c /ifs/data/Isilon_Support/<SR>/$(hostname)_$(date +"%Y-%m-%dT%H.%M.%S")_isi_celog_notification.core $(pgrep isi_celog_notifi)' ;sleep 120

  3. Reset the CELOG database by running the following commands, in order. Alternatively, you can run the script listed below the commands.
    1. Disable CELOG services by running the following three commands:
       
      isi services -a celog_coalescer disable

      isi services -a celog_monitor disable

      isi services -a celog_notification disable

    2. Stop all CELOG processes that might be lingering on the cluster:

      isi_for_array -sX 'pkill isi_celog_'
       
    3. Create a backup of the CELOG database:

      mv -vf /ifs/.ifsvar/db/celog/* /ifs/data/Isilon_Support/celog_backups/
       
    4. Clear the CELOG database by running the following three commands:
       
      isi_for_array -sX 'rm -f /var/db/celog/*'

      isi_for_array -sX 'rm -f /var/db/celog_master/*.db'

      rm -f /ifs/.ifsvar/db/celog/*.db

       
    5. Enable CELOG services by running the following three commands:
       
      isi services -a celog_coalescer enable

      isi services -a celog_monitor enable

      isi services -a celog_notification enable
       
  4. Verify that the CELOG processes restarted:

    isi_for_array -sX "pgrep celog | wc -l | sed 's/[^ 3]/FAIL/'"

    The output should display a value of 3 for each node. If the output is anything other than 3 for any node, one or more of the CELOG processes did not start on that node. Wait 120 seconds and try again. If one or more processes still do not start, contact Isilon Technical Support.
     
  5. Send a test event to verify that CELOG is working properly:

    isi events sendtest

    This should generate a test event that will be listed in the output of the isi events command.
     
  6. Run the isi events command and verify that the test event is listed. If not, wait 120 seconds and try again. If it is still not listed, contact Isilon Technical Support.
     
  7. Gather cores and logs by running the following two commands, where <SR> is the Service Request number or other identifiable name:
     
    isi_gather_info --local -f /ifs/data/Isilon_Support/<SR>

    isi_gather_info
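
The command sequence in step 3 is order-sensitive: services are disabled and lingering processes stopped before the database is backed up and cleared, and services are enabled again last. A minimal sketch of a dry-run helper that prints those commands in order for review, without running any of them, might look like the following. The function name celog_reset_plan is hypothetical, and this is not the script that step 3 refers to; the commands themselves are copied from the steps above.

```shell
# Hypothetical dry-run helper (not part of OneFS): prints the step 3
# commands in order so they can be reviewed. Nothing is executed.
celog_reset_plan() {
    # The quoted delimiter keeps the shell from expanding anything below.
    cat <<'EOF'
isi services -a celog_coalescer disable
isi services -a celog_monitor disable
isi services -a celog_notification disable
isi_for_array -sX 'pkill isi_celog_'
mv -vf /ifs/.ifsvar/db/celog/* /ifs/data/Isilon_Support/celog_backups/
isi_for_array -sX 'rm -f /var/db/celog/*'
isi_for_array -sX 'rm -f /var/db/celog_master/*.db'
rm -f /ifs/.ifsvar/db/celog/*.db
isi services -a celog_coalescer enable
isi services -a celog_monitor enable
isi services -a celog_notification enable
EOF
}
```

Running celog_reset_plan prints the 11 commands so that they can be checked against the procedure before anything is run on the cluster.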

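The check in step 4 and the wait-and-retry guidance in steps 4 and 6 can be sketched as two small helpers. The sketch assumes that isi_for_array prefixes each output line with the node name followed by a colon; the names celog_failed_nodes and retry_once_after are hypothetical and are not OneFS commands.

```shell
# Hypothetical helpers (not part of OneFS).

# Read the step 4 output on stdin, assumed to look like "node-1:      3",
# and print the name of every node whose value is not exactly 3.
celog_failed_nodes() {
    awk -F':' '{ v = $2; gsub(/[ \t]/, "", v); if (v != "3") print $1 }'
}

# Run a command; if it fails, wait the given number of seconds and
# run it one more time. The exit status is that of the last attempt.
retry_once_after() {
    delay=$1
    shift
    "$@" && return 0
    sleep "$delay"
    "$@"
}
```

For example, piping the output of the step 4 command into celog_failed_nodes lists only the nodes on which the process count is not 3. An empty result means that every node reported 3.
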
Article Properties


Affected Product

Isilon

Last Published Date

15 Mar 2024

Version

4

Article Type

How To