Oracle Database In Our Hand: Oracle 11g Kernel Parameter Settings

Open the /etc/sysctl.conf file and add or edit lines similar to the following example:

fs.aio-max-nr = 1048576

fs.file-max = 6815744

kernel.shmall = 2097152

kernel.shmmax = 4294967295

kernel.shmmni = 4096

kernel.sem = 250 32000 100 128

net.ipv4.ip_local_port_range = 9000 65500

net.core.rmem_default = 262144

net.core.rmem_max = 4194304

net.core.wmem_default = 262144

net.core.wmem_max = 1048576

Setting SHMMAX Parameter

This parameter defines the maximum size in bytes of a single shared memory segment that a Linux process can allocate in its virtual address space. Since the SGA is made up of shared memory, SHMMAX can limit the size of the SGA. SHMMAX should be slightly larger than the SGA size. If SHMMAX is too small, you can get error messages similar to this one:

ORA-27123: unable to attach to shared memory segment

To determine the maximum size of a shared memory segment, run:

# cat /proc/sys/kernel/shmmax

2147483648

The default shared memory limit for SHMMAX can be changed in the proc file system without reboot:

# echo 2147483648 > /proc/sys/kernel/shmmax

Alternatively, you can use sysctl(8) to change it:

# sysctl -w kernel.shmmax=2147483648

To make a change permanent, add the following line to the file /etc/sysctl.conf (your setting may vary). This file is used during the boot process.

# echo "kernel.shmmax=2147483648" >> /etc/sysctl.conf
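The rule that SHMMAX should be at least as large as the SGA can be checked with a small helper. This is only a sketch: the function name shmmax_covers_sga and the 1.5 GB SGA size are assumptions made for illustration, not values from Oracle.

```shell
# shmmax_covers_sga SGA_BYTES SHMMAX_BYTES
# Hypothetical helper: succeeds (exit 0) if SHMMAX is at least the SGA size.
shmmax_covers_sga() {
    [ "$2" -ge "$1" ]
}

sga_bytes=$((1536 * 1024 * 1024))   # assumed SGA size of 1.5 GB, for illustration only
shmmax=2147483648                    # value shown by cat /proc/sys/kernel/shmmax above

if shmmax_covers_sga "$sga_bytes" "$shmmax"; then
    echo "SHMMAX is large enough for the SGA"
else
    echo "SHMMAX is smaller than the SGA; raise kernel.shmmax"
fi
```

On a live system the second argument could be read with cat /proc/sys/kernel/shmmax instead of being hard-coded.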

Setting SHMMNI Parameter

This parameter sets the system-wide maximum number of shared memory segments.
Oracle recommends that SHMMNI be at least 4096 for Oracle 10g. For Oracle 9i on x86 the recommended minimum setting is lower. Since these recommendations are minimum settings, it is best to always set it to at least 4096 for 9i and 10g databases on x86 and x86-64 platforms.

To determine the system wide maximum number of shared memory segments, run:

# cat /proc/sys/kernel/shmmni

4096

The default shared memory limit for SHMMNI can be changed in the proc file system without reboot:

# echo 4096 > /proc/sys/kernel/shmmni

Alternatively, you can use sysctl(8) to change it:

# sysctl -w kernel.shmmni=4096

To make a change permanent, add the following line to the file /etc/sysctl.conf. This file is used during the boot process.

# echo "kernel.shmmni=4096" >> /etc/sysctl.conf

Setting SHMALL Parameter

This parameter sets the total number of shared memory pages that can be used system wide. Since a single segment can be as large as SHMMAX bytes, SHMALL should always be at least ceil(SHMMAX/PAGE_SIZE).

If you are not sure what the default PAGE_SIZE is on your Linux system, you can run the following command:

$ getconf PAGE_SIZE

4096
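The ceil(SHMMAX/PAGE_SIZE) rule can be worked out with shell integer arithmetic. The sketch below uses a made-up function name, min_shmall, and feeds it the two SHMMAX values that appear in this post together with the 4096-byte page size reported above.

```shell
# min_shmall SHMMAX_BYTES PAGE_SIZE_BYTES
# Prints ceil(SHMMAX / PAGE_SIZE) using integer arithmetic only.
min_shmall() {
    shmmax=$1
    page_size=$2
    echo $(( (shmmax + page_size - 1) / page_size ))
}

min_shmall 4294967295 4096   # SHMMAX from the sample sysctl.conf -> 1048576
min_shmall 2147483648 4096   # SHMMAX from the cat example        -> 524288
```

Both results are below the SHMALL value of 2097152 used in this post, so that setting leaves room for either SHMMAX.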

To determine the system wide maximum number of shared memory pages, run:

# cat /proc/sys/kernel/shmall

2097152

The default shared memory limit for SHMALL can be changed in the proc file system without reboot:

# echo 2097152 > /proc/sys/kernel/shmall

Alternatively, you can use sysctl(8) to change it:

# sysctl -w kernel.shmall=2097152

To make a change permanent, add the following line to the file /etc/sysctl.conf. This file is used during the boot process.

# echo "kernel.shmall=2097152" >> /etc/sysctl.conf

Removing Shared Memory

Sometimes after an instance crash you may have to remove Oracle's shared memory segment(s) manually.

To see all shared memory segments that are allocated on the system, execute:

$ ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x8f6e2129 98305      oracle     600        77694523   0
0x2f629238 65536      oracle     640        2736783360 35
0x00000000 32768      oracle     640        2736783360 0          dest

In this example you can see that three shared memory segments have been allocated. The output also shows that shmid 32768 is an abandoned shared memory segment from a past ungraceful Oracle shutdown. Status "dest" means that this memory segment is marked to be destroyed. To find out more about this shared memory segment you can run:

$ ipcs -m -i 32768

Shared memory Segment shmid=32768

uid=500 gid=501 cuid=500 cgid=501

mode=0640 access_perms=0640

bytes=2736783360 lpid=3688 cpid=3652 nattch=0

att_time=Sat Oct 29 13:36:52 2005

det_time=Sat Oct 29 13:36:52 2005

change_time=Sat Oct 29 11:21:06 2005

To remove the shared memory segment, pass its shmid to ipcrm (older releases also accept the deprecated form ipcrm shm 32768):

$ ipcrm -m 32768
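Before removing anything, it can help to list only the segments that look abandoned, that is, those with status "dest" and no attached processes. The following is a minimal sketch: the function name list_orphaned_shm is made up, and it assumes the column order key, shmid, owner, perms, bytes, nattch, status shown in the ipcs -m output above.

```shell
# list_orphaned_shm: reads `ipcs -m` output on stdin and prints the shmid of
# every segment whose status column is "dest" and whose nattch is 0.
# Assumes the column order key, shmid, owner, perms, bytes, nattch, status.
list_orphaned_shm() {
    awk '$1 ~ /^0x/ && $6 == 0 && $7 == "dest" { print $2 }'
}

# Sample rows copied from the ipcs -m output above; prints 32768.
list_orphaned_shm <<'EOF'
0x8f6e2129 98305 oracle 600 77694523 0
0x2f629238 65536 oracle 640 2736783360 35
0x00000000 32768 oracle 640 2736783360 0 dest
EOF
```

On a live system the input would come from ipcs -m | list_orphaned_shm. The sketch deliberately only prints the ids, so each one can be inspected with ipcs -m -i before it is removed.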

Another way to remove shared memory is to use Oracle's sysresv utility. Here are a few self-explanatory examples of how to use sysresv:

Checking Oracle's IPC resources:

$ sysresv

IPC Resources for ORACLE_SID "orcl" :

Shared Memory

ID KEY

No shared memory segments used

Semaphores:

ID KEY

No semaphore resources used

Oracle Instance not alive for sid "orcl"

Instance is up and running:

$ sysresv -i

IPC Resources for ORACLE_SID "orcl" :

Shared Memory:

ID KEY

2818058 0xdc70f4e4

Semaphores:

ID KEY

688128 0xb11a5934

Oracle Instance alive for sid "orcl"

SYSRESV-005: Warning

Instance maybe alive - aborting remove for sid "orcl"

Instance has crashed and resources were not released:

$ sysresv -i

IPC Resources for ORACLE_SID "orcl" :

Shared Memory:

ID KEY

32768 0xdc70f4e4

Semaphores:

ID KEY

98304 0xb11a5934

Oracle Instance not alive for sid "orcl"

Remove ipc resources for sid "orcl" (y/n)?y

Done removing ipc resources for sid "orcl"

Setting Semaphores

Semaphores can be described as counters that are used to synchronize processes, or threads within a process, that access shared resources such as shared memory. System V semaphores are allocated in semaphore sets, where each member of a set is a counting semaphore. So when an application requests semaphores, the kernel hands them out in sets. The maximum number of semaphores per set is defined by the kernel parameter SEMMSL.

To see all semaphore settings, run:

$ ipcs -ls

The SEMMSL Parameter

This parameter defines the maximum number of semaphores per semaphore set.

NOTE:
If a database gets thousands of concurrent connections and the init.ora parameter PROCESSES is therefore very large, then SEMMSL should be larger as well. Metalink Note:187405.1 and Note:184821.1 say the following about SEMMSL: "The SEMMSL setting should be 10 plus the largest PROCESSES parameter of any Oracle database on the system". Even though these notes talk about 9i databases, this SEMMSL rule also applies to 10g databases. I've seen low SEMMSL settings cause problems for 10g RAC databases, where Oracle recommended increasing SEMMSL and calculating it according to the rule in these notes.
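The "10 plus the largest PROCESSES parameter" rule is simple enough to sketch as a shell function. The name recommended_semmsl and the three PROCESSES values are hypothetical and only serve as an illustration.

```shell
# recommended_semmsl PROCESSES...
# Prints 10 plus the largest PROCESSES value given on the command line.
recommended_semmsl() {
    max=0
    for p in "$@"; do
        if [ "$p" -gt "$max" ]; then
            max=$p
        fi
    done
    echo $((max + 10))
}

# Hypothetical PROCESSES settings of three databases on one host; prints 810.
recommended_semmsl 150 800 300
```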

The SEMMNI Parameter

This parameter defines the maximum number of semaphore sets for the entire Linux system.

The SEMMNS Parameter

This parameter defines the total number of semaphores (not semaphore sets) for the entire Linux system. A semaphore set can have more than one semaphore, and as the semget(2) man page explains, setting SEMMNS higher than SEMMSL * SEMMNI has no effect. The maximum number of semaphores that can be allocated on a Linux system is therefore the lesser of SEMMNS and (SEMMSL * SEMMNI).
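As a worked example of the "lesser of" rule, the sketch below computes the effective system-wide semaphore limit. The function name max_semaphores is made up, and the second call uses a hypothetical SEMMNI of 100 to show the case where the product is the smaller number.

```shell
# max_semaphores SEMMSL SEMMNS SEMMNI
# Prints the lesser of SEMMNS and SEMMSL * SEMMNI.
max_semaphores() {
    by_sets=$(( $1 * $3 ))
    if [ "$2" -lt "$by_sets" ]; then
        echo "$2"
    else
        echo "$by_sets"
    fi
}

max_semaphores 250 32000 128   # values from the sample sysctl.conf: 250 * 128 = 32000 -> 32000
max_semaphores 250 32000 100   # hypothetical SEMMNI of 100: 250 * 100 = 25000 -> 25000
```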

The SEMOPM Parameter

This parameter defines the maximum number of semaphore operations that can be performed per semop(2) system call (semaphore call). The semop(2) function provides the ability to do operations for multiple semaphores with one semop(2) system call. Since a semaphore set can have the maximum number of SEMMSL semaphores per semaphore set, it is often recommended to set SEMOPM equal to SEMMSL.

Setting Semaphore Parameters

To determine the values of the four described semaphore parameters, run:

# cat /proc/sys/kernel/sem

250 32000 32 128

These values represent SEMMSL, SEMMNS, SEMOPM, and SEMMNI.

Alternatively, you can run:

# ipcs -ls

All four described semaphore parameters can be changed in the proc file system without reboot:

# echo 250 32000 100 128 > /proc/sys/kernel/sem

Alternatively, you can use sysctl(8) to change it:

# sysctl -w kernel.sem="250 32000 100 128"

To make the change permanent, add or change the following line in the file /etc/sysctl.conf. This file is used during the boot process.

# echo "kernel.sem=250 32000 100 128" >> /etc/sysctl.conf
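Appending with >> adds a second line if the key is already present in /etc/sysctl.conf. One way to "add or change" a setting in a single step is sketched below. The function name set_sysctl_conf is hypothetical, and the demonstration works on a scratch file rather than the real /etc/sysctl.conf.

```shell
# set_sysctl_conf KEY VALUE [FILE]
# Replaces an existing "KEY = ..." line in FILE, or appends one if none exists.
# FILE defaults to /etc/sysctl.conf and must already exist.
set_sysctl_conf() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    tmp=$(mktemp) || return 1
    awk -v k="$key" -v v="$value" '
        {
            name = $0
            sub(/=.*/, "", name)             # text before the first "="
            gsub(/[ \t]/, "", name)          # ignore whitespace around the key
            if (index($0, "=") && name == k) {
                if (!seen) print k " = " v   # keep one updated line only
                seen = 1
                next
            }
            print                            # copy all other lines unchanged
        }
        END { if (!seen) print k " = " v }   # key was absent: append it
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

conf=$(mktemp)                               # scratch file standing in for /etc/sysctl.conf
echo "kernel.sem = 250 32000 32 128" > "$conf"
set_sysctl_conf kernel.sem "250 32000 100 128" "$conf"
set_sysctl_conf kernel.shmmni 4096 "$conf"
cat "$conf"
rm -f "$conf"
```

Commented-out lines are left alone because their key starts with "#". After editing the real file as root, sysctl -p loads the new values without a reboot.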

Setting File Handles

The maximum number of file handles specifies the maximum number of open files on a Linux system.

Oracle recommends setting the system-wide file handle limit to at least 65536 for 9i R2 and 10g R1/R2 on x86 and x86-64 platforms. The sample /etc/sysctl.conf at the top of this post uses the larger value 6815744 for 11g.

To determine the maximum number of file handles for the entire system, run:

$ cat /proc/sys/fs/file-max

To determine the current usage of file handles, run:

$ cat /proc/sys/fs/file-nr

1154 133 8192

The file-nr file displays three parameters:
  - Total allocated file handles
  - Number of currently used file handles (2.4 kernel) or number of allocated but unused file handles (2.6 kernel)
  - Maximum file handles that can be allocated (see also /proc/sys/fs/file-max)
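Because the meaning of the second field changed between kernel versions, the number of handles actually in use on a 2.6 kernel has to be derived. A small sketch follows, with a made-up function name and the sample numbers from above treated as if they came from a 2.6 kernel.

```shell
# file_handles_in_use ALLOCATED UNUSED
# On a 2.6 kernel the second file-nr field counts allocated-but-unused handles,
# so handles in use = allocated - unused.
file_handles_in_use() {
    echo $(( $1 - $2 ))
}

# First two fields of the sample file-nr output above (1154 133 8192); prints 1021.
file_handles_in_use 1154 133
```

On a live system the fields can be read with read allocated unused max < /proc/sys/fs/file-nr and passed to the function.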

The kernel dynamically allocates file handles whenever a file handle is requested by an application but the kernel does not free these file handles when they are released by the application. The kernel recycles these file handles instead. This means that over time the total number of allocated file handles will increase even though the number of currently used file handles may be low.

The maximum number of file handles can be changed in the proc file system without reboot:

# echo 65536 > /proc/sys/fs/file-max

Alternatively, you can use sysctl(8) to change it:

# sysctl -w fs.file-max=65536

To make the change permanent, add or change the following line in the file /etc/sysctl.conf. This file is used during the boot process.

# echo "fs.file-max=65536" >> /etc/sysctl.conf

Changing Network Kernel Settings

In RAC, Oracle uses UDP as the default protocol on Linux for interprocess communication, such as cache fusion buffer transfers between instances. Starting with Oracle 10g, however, the network settings should be adjusted for standalone databases as well.

Oracle recommends setting the default and maximum send buffer size (SO_SNDBUF socket option) and receive buffer size (SO_RCVBUF socket option) to 256 KB. The receive buffers are used by TCP and UDP to hold received data until the application reads it. With TCP, the receive buffer cannot overflow because the sender is not allowed to send data beyond the advertised window. UDP has no such flow control, so datagrams that do not fit in the receive buffer are discarded, and a fast sender can overwhelm the receiver. Note that the sample /etc/sysctl.conf at the top of this post uses larger maximum values for 11g (for example net.core.rmem_max = 4194304); the commands in this section use the 256 KB values.

The default and maximum buffer sizes can be changed with sysctl(8) without reboot:

# sysctl -w net.core.rmem_default=262144 # Default setting in bytes of the socket receive buffer

# sysctl -w net.core.wmem_default=262144 # Default setting in bytes of the socket send buffer

# sysctl -w net.core.rmem_max=262144 # Maximum socket receive buffer size which may be set by using the SO_RCVBUF socket option

# sysctl -w net.core.wmem_max=262144 # Maximum socket send buffer size which may be set by using the SO_SNDBUF socket option

To make the change permanent, add the following lines to the /etc/sysctl.conf file, which is used during the boot process:

net.core.rmem_default=262144

net.core.wmem_default=262144

net.core.rmem_max=262144

net.core.wmem_max=262144

To improve failover performance in a RAC cluster, consider changing the following IP kernel parameters as well:

net.ipv4.tcp_keepalive_time

net.ipv4.tcp_keepalive_intvl

net.ipv4.tcp_retries2

net.ipv4.tcp_syn_retries

Changing these settings may be highly dependent on your system, network, and other applications. For suggestions, see Metalink Note:249213.1 and Note:265194.1.

Setting Shell Limits for the Oracle User

Most shells like Bash provide control over various resources like the maximum allowable number of open file descriptors or the maximum number of processes available to a user.

To see all shell limits, run:

$ ulimit -a

For more information on ulimit for the Bash shell, see man bash and search for ulimit.

Limiting Maximum Number of Open File Descriptors for the Oracle User

After /proc/sys/fs/file-max has been changed (see Setting File Handles), there is still a per-user limit on the number of open file descriptors:

$ su - oracle

$ ulimit -n

1024

To change this limit, edit the /etc/security/limits.conf file as root and change the following lines, or add them if they do not exist:

oracle soft nofile 4096

oracle hard nofile 63536

The "soft limit" in the first line defines the number of file handles or open files that the oracle user will have after login. If the oracle user gets error messages about running out of file handles, then the oracle user can raise the limit, as in this example up to 63536 (the "hard limit"), by executing the following command:

$ ulimit -n 63536

You can set the "soft" and "hard" limits higher if necessary.

To see the current limit of the maximum number of processes for the oracle user, run:

$ su - oracle

$ ulimit -u

Note the ulimit options are different for other shells.

To change the "soft" and "hard" limits for the maximum number of processes for the oracle user, add the following lines to the /etc/security/limits.conf file:

oracle soft nproc 2047

oracle hard nproc 16384
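After logging in again as the oracle user, the new limits can be verified from the shell. The following sketch assumes Bash (ulimit options differ in other shells), and the helper name limit_at_least is made up for illustration.

```shell
# limit_at_least CURRENT REQUIRED
# Succeeds if a ulimit value meets the required minimum; "unlimited" always passes.
limit_at_least() {
    [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

# Compare the hard open-files limit of the current shell with the
# hard nofile value from the limits.conf lines above.
if limit_at_least "$(ulimit -Hn)" 63536; then
    echo "hard nofile limit is sufficient"
else
    echo "hard nofile limit is below 63536"
fi
```

The same check works for the process limit by passing "$(ulimit -Hu)" and 16384.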

Memory Usage and Page Cache

Checking Memory Usage

To determine the size and usage of memory, you can enter the following command:

$ grep MemTotal /proc/meminfo

Alternatively, you can use the free(1) command to check the memory:

$ free

             total       used       free     shared    buffers     cached
Mem:       4040360    4012200      28160          0     176628    3571348
-/+ buffers/cache:     264224    3776136
Swap:      4200956      12184    4188772

In this example the total amount of available memory is 4040360 KB. 264224 KB are used by processes and 3776136 KB are free for other applications. Don't get confused by the first line which shows that 28160KB are free! If you look at the usage figures you can see that most of the memory use is for buffers and cache since Linux always tries to use RAM to the fullest extent to speed up disk operations. Using available memory for buffers (file system metadata) and cache (pages with actual contents of files or block devices) helps the system to run faster because disk information is already in memory which saves I/O. If space is needed by programs or applications like Oracle, then Linux will free up the buffers and cache to yield memory for the applications. So if your system runs for a while you will usually see a small number under the field "free" on the first line.
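The two figures on the "-/+ buffers/cache" line follow directly from the first line of the free output. As a quick worked check, using the numbers from the example above (the variable names are only illustrative):

```shell
# Figures (in KB) from the "Mem:" line of the free output above.
used=4012200
free_kb=28160
buffers=176628
cached=3571348

# Memory really used by processes: used minus buffers and cache.
echo "used by processes: $(( used - buffers - cached )) KB"              # 264224 KB

# Memory available to applications: free plus reclaimable buffers and cache.
echo "available to applications: $(( free_kb + buffers + cached )) KB"   # 3776136 KB
```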

Tuning Page Cache

Page Cache is a disk cache which holds data of files and executable programs, i.e. pages with actual contents of files or block devices. Page Cache (disk cache) is used to reduce the number of disk reads. To control the percentage of total memory used for page cache in RHEL 3, the following kernel parameter can be changed:

# cat /proc/sys/vm/pagecache

1 15 30

The above three values are usually good for database systems. It is not recommended to set the third value very high like 100 as it used to be with older RHEL 3 kernels. This can cause significant performance problems for database systems. If you upgrade to a newer kernel like 2.4.21-37, then these values will automatically change to "1 15 30" unless it's set to different values in /etc/sysctl.conf. For information on tuning the pagecache kernel parameter, I recommend reading the excellent article Understanding Virtual Memory. Note this kernel parameter does not exist in RHEL 4.

The pagecache parameters can be changed in the proc file system without reboot:

# echo "1 15 30" > /proc/sys/vm/pagecache

Alternatively, you can use sysctl(8) to change it:

# sysctl -w vm.pagecache="1 15 30"

To make the change permanent, add the following line to the file /etc/sysctl.conf. This file is used during the boot process.

# echo "vm.pagecache=1 15 30" >> /etc/sysctl.conf

Swap Space

General

In some cases it's good for the swap partition to be used. For example, long running processes often access only a subset of the page frames they obtained. This means that the swap partition can safely be used even if memory is available, because system memory may be better used as disk cache to improve overall system performance. In fact, in the 2.6 kernel, i.e. RHEL 4, you can define a threshold when processes should be swapped out in favor of I/O caching. This can be tuned with the /proc/sys/vm/swappiness kernel parameter. The default value of /proc/sys/vm/swappiness is 60, which means that applications and programs that have not done a lot lately can be swapped out. Higher values will provide more I/O cache and lower values will wait longer to swap out idle applications.

Depending on the system profile you may see that swap usage slowly increases with system uptime. To display swap usage you can run the free(1) command or you can check the /proc/meminfo file. When the system uses swap space it will sometimes not decrease afterward. This saves I/O if memory is needed and pages don't have to be swapped out again when the pages are already in the swap space. However, if swap usage gets close to 80% - 100% (your threshold may be lower if you use a large swap space), then a closer look should be taken at the system, see also Checking Swap Space Size and Usage. Depending on the size of your swap space, you may want to check swap activity with vmstat or sar if swap allocation is lower than 80%. But these numbers really depend on the size of the swap space. The actual numbers of swapped pages per timeframe from vmstat or sar are the important numbers. Constant swapping should be avoided at all cost.

Note, never add a permanent swap file to the system due to the performance impact of the filesystem layer.

Checking Swap Space Size and Usage

You can check the size and current usage of swap space by running any of the following commands:

$ grep SwapTotal /proc/meminfo

$ cat /proc/swaps

$ free

Swap usage may slowly increase as shown above but should stop at some point. If swap usage continues to grow steadily or is already large, then one of the following choices may need to be considered:
- Add more RAM or reduce the size of the SGA
- Increase the size of the swap space
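To judge whether swap usage is getting close to the 80% - 100% range mentioned earlier, the usage can be expressed as a percentage. This is a minimal sketch with a made-up function name; it is fed the total and free figures from the Swap line of the free example in the Checking Memory Usage section.

```shell
# swap_used_percent TOTAL_KB FREE_KB
# Prints swap usage as a percentage with one decimal place.
swap_used_percent() {
    awk -v total="$1" -v free="$2" 'BEGIN {
        if (total == 0) { print "0.0"; exit }   # no swap configured
        printf "%.1f\n", (total - free) * 100 / total
    }'
}

# Swap: 4200956 total, 4188772 free (12184 KB used); prints 0.3
swap_used_percent 4200956 4188772
```

On a live system the two arguments correspond to the SwapTotal and SwapFree lines of /proc/meminfo.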

If you see constant swapping, then you need to either add more RAM or reduce the size of the SGA. Constant swapping should be avoided at all cost. You can check current swap activity using the following commands:

$ vmstat 3 100

procs                      memory      swap          io     system         cpu
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 1  0      0 972488   7148  20848    0    0   856     6  138    53  0  0 99  0
 0  1      0 962204   9388  20848    0    0   747     0 4389  8859 23 24 11 41
 0  1      0 959500  10728  20848    0    0   440   313 1496  2345  4  7  0 89
 0  1      0 956912  12216  20848    0    0   496     0 2294  4224 10 13  0 77
 1  1      0 951600  15228  20848    0    0   997   264 2241  3945  6 13  0 81
 0  1      0 947860  17188  20848    0    0   647   280 2386  3985  9  9  1 80
 0  1      0 944932  19304  20848    0    0   705     0 1501  2580  4  9  0 87

The fields si and so show the amount of memory swapped in from disk and swapped out to disk per second, respectively. If the server shows continuous swap activity, then more memory should be added or the SGA size should be reduced.

To check the history of swap activity, you can use the sar command. For example, to check swap activity from Oct 12th:

# ls -al /var/log/sa | grep "Oct 12"

-rw-r--r-- 1 root root 2333308 Oct 12 23:55 sa12

-rw-r--r-- 1 root root 4354749 Oct 12 23:53 sar12

# sar -W -f /var/log/sa/sa12

Linux 2.4.21-32.0.1.ELhugemem (rac01prd) 10/12/2005

12:00:00 AM pswpin/s pswpout/s

12:05:00 AM 0.00 0.00

12:10:00 AM 0.00 0.00

12:15:00 AM 0.00 0.00

12:20:00 AM 0.00 0.00

12:25:00 AM 0.00 0.00

12:30:00 AM 0.00 0.00

...

The fields pswpin and pswpout show the total number of pages brought in and out per second, respectively.

If the server shows sporadic swap activity, or swap activity for a short period of time at certain intervals, then you can either add more swap space or add RAM. If swap usage is already very large (don't confuse this with constant swapping), then I would add more RAM.
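To tell sporadic from constant swapping in a long vmstat run, it helps to count how many samples show any swap traffic at all. The sketch below does this with awk. The function name count_swapping_samples is made up, and the si and so columns are located by name in the header line because their position can differ between vmstat versions.

```shell
# count_swapping_samples: reads vmstat output on stdin and prints how many
# samples have a non-zero si or so value.
count_swapping_samples() {
    awk '
        !si {                                   # still looking for the header line
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si = i
                if ($i == "so") so = i
            }
            next
        }
        ($si + 0) > 0 || ($so + 0) > 0 { n++ }  # sample with swap-in or swap-out
        END { print n + 0 }
    '
}

# Header and first two samples from the vmstat output shown earlier; prints 0.
count_swapping_samples <<'EOF'
r b swpd free buff cache si so bi bo in cs us sy id wa
1 0 0 972488 7148 20848 0 0 856 6 138 53 0 0 99 0
0 1 0 962204 9388 20848 0 0 747 0 4389 8859 23 24 11 41
EOF
```

On a live system this could be run as vmstat 3 100 | count_swapping_samples. A count close to the number of samples points to constant swapping, while a small count points to sporadic activity.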


Thursday, October 4, 2012
