Oracle 11g Kernel Parameter Settings
Open the /etc/sysctl.conf file and add or edit lines similar to the following
example:
fs.aio-max-nr = 1048576
fs.file-max = 6815744
kernel.shmall = 2097152
kernel.shmmax = 4294967295
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048586
Setting SHMMAX Parameter
This parameter defines the maximum size in bytes of a single shared memory
segment that a Linux process can allocate in its virtual address space. Since the
SGA consists of shared memory, SHMMAX can potentially limit the size of the SGA.
SHMMAX should be slightly larger than the SGA size. If SHMMAX is too small, you
can get error messages similar to this one:
ORA-27123: unable to attach to shared memory segment
To determine the maximum size of a shared memory segment, run:
# cat /proc/sys/kernel/shmmax
2147483648
The default shared memory limit for SHMMAX can be changed in the proc file
system without reboot:
# echo 2147483648 > /proc/sys/kernel/shmmax
Alternatively, you can use sysctl(8) to change it:
# sysctl -w kernel.shmmax=2147483648
To make a change permanent, add the following line to the file /etc/sysctl.conf
(your setting may vary). This file is used during the boot process.
# echo "kernel.shmmax=2147483648" >> /etc/sysctl.conf
Setting SHMMNI Parameter
This parameter sets the system-wide maximum number of shared memory segments.
Oracle recommends that SHMMNI be at least 4096 for Oracle 10g. For Oracle 9i on
x86 the recommended minimum setting is lower. Since these recommendations are
minimum settings, it's best to always set it to at least 4096 for 9i and 10g
databases on x86 and x86-64 platforms.
To determine the system-wide maximum number of shared memory segments, run:
# cat /proc/sys/kernel/shmmni
4096
The default shared memory limit for SHMMNI can be changed in the proc file
system without reboot:
# echo 4096 > /proc/sys/kernel/shmmni
Alternatively, you can use sysctl(8) to change it:
# sysctl -w kernel.shmmni=4096
To make a change permanent, add the following line to the file /etc/sysctl.conf.
This file is used during the boot process.
# echo "kernel.shmmni=4096" >> /etc/sysctl.conf
Setting SHMALL Parameter
This parameter sets the total number of shared memory pages that can be used
system-wide. Hence, SHMALL should always be at least ceil(shmmax/PAGE_SIZE).
If you are not sure what the default PAGE_SIZE is on your Linux system, you can
run the following command:
$ getconf PAGE_SIZE
4096
To determine the system-wide maximum number of shared memory pages, run:
# cat /proc/sys/kernel/shmall
2097152
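As a minimal sketch of the ceil(shmmax/PAGE_SIZE) rule above, the following
shell function computes the smallest SHMALL value (in pages) that can hold one
segment of SHMMAX bytes. The function name min_shmall is made up for this
illustration; the page size defaults to whatever getconf PAGE_SIZE reports.

```shell
# min_shmall SHMMAX_BYTES [PAGE_SIZE]
# Print ceil(SHMMAX_BYTES / PAGE_SIZE) using integer arithmetic.
# Hypothetical helper, not part of any Oracle or Linux tool.
min_shmall() {
    shmmax=$1
    page_size=${2:-$(getconf PAGE_SIZE)}
    # Adding (page_size - 1) before dividing rounds the result up.
    echo $(( ($shmmax + $page_size - 1) / $page_size ))
}

min_shmall 2147483648 4096   # prints 524288
min_shmall 4294967295 4096   # prints 1048576
```

The SHMALL value of 2097152 shown above is larger than both results, so it
satisfies the rule for either SHMMAX setting used in this article.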
The default shared memory limit for SHMALL can be changed in the proc file
system without reboot:
# echo 2097152 > /proc/sys/kernel/shmall
Alternatively, you can use sysctl(8) to change it:
# sysctl -w kernel.shmall=2097152
To make a change permanent, add the following line to the file /etc/sysctl.conf.
This file is used during the boot process.
# echo "kernel.shmall=2097152" >> /etc/sysctl.conf
Removing Shared Memory
Sometimes after an instance crash you may have to remove Oracle's shared memory
segment(s) manually.
To see all shared memory segments that are allocated on the system, execute:
$ ipcs -m
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x8f6e2129 98305      oracle     600        77694523   0
0x2f629238 65536      oracle     640        2736783360 35
0x00000000 32768      oracle     640        2736783360 0          dest
In this example you can see that three shared memory segments have been
allocated. The output also shows that shmid 32768 is an abandoned shared memory
segment from a past ungraceful Oracle shutdown. Status "dest" means that this
memory segment is marked to be destroyed. To find out more about this shared
memory segment you can run:
$ ipcs -m -i 32768
Shared memory Segment shmid=32768
uid=500 gid=501 cuid=500 cgid=501
mode=0640 access_perms=0640
bytes=2736783360 lpid=3688 cpid=3652 nattch=0
att_time=Sat Oct 29 13:36:52 2005
det_time=Sat Oct 29 13:36:52 2005
change_time=Sat Oct 29 11:21:06 2005
To remove the shared memory segment, copy the shmid and execute:
$ ipcrm shm 32768
Newer versions of ipcrm deprecate this syntax in favor of ipcrm -m 32768.
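To pick removal candidates out of a long ipcs listing, a small filter can help.
The following is only a sketch: the function name unattached_shm is made up, and
it assumes the column layout shown above (key, shmid, owner, perms, bytes,
nattch, status). A segment with nattch 0 is not necessarily abandoned, so review
the list and make sure the instance is really down before removing anything.

```shell
# unattached_shm OWNER
# Read `ipcs -m` output on stdin and print the shmid of every segment
# owned by OWNER that has no attached processes (nattch = 0).
# Hypothetical helper; it only lists segments and removes nothing.
unattached_shm() {
    awk -v owner="$1" '$1 ~ /^0x/ && $3 == owner && $6 == 0 { print $2 }'
}

# Example use on a live system:
#   ipcs -m | unattached_shm oracle
```

Applied to the listing above, the filter prints 98305 and 32768, while the
segment with 35 attached processes is skipped.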
Another approach to remove shared memory is to use Oracle's sysresv utility.
Here are a few self-explanatory examples of how to use sysresv:
Checking Oracle's IPC resources:
$ sysresv
IPC Resources for ORACLE_SID "orcl" :
Shared Memory
ID              KEY
No shared memory segments used
Semaphores:
ID              KEY
No semaphore resources used
Oracle Instance not alive for sid "orcl"
$
Instance is up and running:
$ sysresv -i
IPC Resources for ORACLE_SID "orcl" :
Shared Memory:
ID              KEY
2818058         0xdc70f4e4
Semaphores:
ID              KEY
688128          0xb11a5934
Oracle Instance alive for sid "orcl"
SYSRESV-005: Warning
        Instance maybe alive - aborting remove for sid "orcl"
$
Instance has crashed and resources were not released:
$ sysresv -i
IPC Resources for ORACLE_SID "orcl" :
Shared Memory:
ID              KEY
32768           0xdc70f4e4
Semaphores:
ID              KEY
98304           0xb11a5934
Oracle Instance not alive for sid "orcl"
Remove ipc resources for sid "orcl" (y/n)?y
Done removing ipc resources for sid "orcl"
$
Semaphores can be described as counters which are used to provide
synchronization between processes, or between threads within a process, for
shared resources like shared memory. System V semaphores support semaphore sets
where each one is a counting semaphore. So when an application requests
semaphores, the kernel releases them in sets. The number of semaphores per set
can be defined through the kernel parameter SEMMSL.
To see all semaphore settings, run:
# ipcs -ls
NOTE:
If a database gets thousands of concurrent connections and the init.ora
parameter PROCESSES is therefore very large, then SEMMSL should be larger as
well. Note what Metalink Note:187405.1 and Note:184821.1 have to say regarding
SEMMSL: "The SEMMSL setting should be 10 plus the largest PROCESSES parameter of
any Oracle database on the system". Even though these notes talk about 9i
databases, this SEMMSL rule also applies to 10g databases. I've seen low SEMMSL
settings be an issue for 10g RAC databases, where Oracle recommended increasing
SEMMSL and calculating it according to the rule mentioned in these notes.
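The rule quoted above is simple arithmetic. As a sketch, the following shell
function takes the PROCESSES values of all databases on the system and prints
10 plus the largest one. The function name recommended_semmsl and the sample
values are made up for illustration.

```shell
# recommended_semmsl PROCESSES...
# Print 10 plus the largest PROCESSES value given on the command line.
# Hypothetical helper illustrating the rule from the Metalink notes.
recommended_semmsl() {
    max=0
    for p in "$@"; do
        if [ "$p" -gt "$max" ]; then
            max=$p
        fi
    done
    echo $(( $max + 10 ))
}

recommended_semmsl 150 800 300   # prints 810
```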
The SEMMNI Parameter
This parameter defines the maximum number of semaphore sets for the entire
Linux system.
The SEMMNS Parameter
This parameter defines the total number of semaphores (not semaphore sets) for
the entire Linux system. A semaphore set can have more than one semaphore, and
as the semget(2) man page explains, values greater than SEMMSL * SEMMNI make
it irrelevant. The maximum number of semaphores that can be allocated on a
Linux system is the lesser of SEMMNS and (SEMMSL * SEMMNI).
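The "lesser of" rule can be checked with a few lines of shell. This is a sketch
with a made-up function name, max_semaphores. It takes its arguments in the
order SEMMSL, SEMMNS, SEMOPM, SEMMNI, which is the order used by
/proc/sys/kernel/sem, so the output of that file can be passed in directly.

```shell
# max_semaphores SEMMSL SEMMNS SEMOPM SEMMNI
# Print the lesser of SEMMNS and SEMMSL * SEMMNI.
# SEMOPM (third argument) is accepted only to match the kernel.sem order.
max_semaphores() {
    semmsl=$1 semmns=$2 semmni=$4
    product=$(( $semmsl * $semmni ))
    if [ "$semmns" -lt "$product" ]; then
        echo "$semmns"
    else
        echo "$product"
    fi
}

max_semaphores 250 32000 100 128   # prints 32000
# On a live system: max_semaphores $(cat /proc/sys/kernel/sem)
```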
The SEMOPM Parameter
This parameter defines the maximum number of semaphore operations that can be
performed per semop(2) system call (semaphore call). The semop(2) function can
perform operations on multiple semaphores with one system call. Since a
semaphore set can have at most SEMMSL semaphores, it is often recommended to set
SEMOPM equal to SEMMSL.
To determine the values of the four semaphore parameters described above, run:
# cat /proc/sys/kernel/sem
250 32000 32 128
These values represent SEMMSL, SEMMNS, SEMOPM, and SEMMNI.
Alternatively, you can run:
# ipcs -ls
All four described semaphore parameters can be changed in the proc file system
without reboot:
# echo 250 32000 100 128 > /proc/sys/kernel/sem
Alternatively, you can use sysctl(8) to change it:
# sysctl -w kernel.sem="250 32000 100 128"
To make the change permanent, add or change the following line in the file
/etc/sysctl.conf. This file is used during the boot process.
# echo "kernel.sem=250 32000 100 128" >> /etc/sysctl.conf
The maximum number of file handles specifies the maximum number of open files
on a Linux system. Oracle recommends that the file handles for the entire system
be set to at least 65536 for 9i R2 and 10g R1/2 on x86 and x86-64 platforms.
To determine the maximum number of file handles for the entire system, run:
$ cat /proc/sys/fs/file-max
To determine the current usage of file handles, run:
$ cat /proc/sys/fs/file-nr
1154 133 8192
The file-nr file displays three parameters:
- Total allocated file handles
- Number of currently used file handles (2.4 kernel); number of currently
unused file handles (2.6 kernel)
- Maximum file handles that can be allocated (see also /proc/sys/fs/file-max)
The kernel dynamically allocates file handles whenever a file handle is
requested by an application, but the kernel does not free these file handles
when they are released by the application. The kernel recycles these file
handles instead. This means that over time the total number of allocated file
handles will increase even though the number of currently used file handles may
be low.
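To turn the three file-nr fields into a usage figure, a short awk filter is
enough. The sketch below uses a made-up function name, file_handle_usage, and
assumes the 2.6 kernel meaning of the second field (allocated but unused
handles), so handles in use are the first field minus the second.

```shell
# file_handle_usage
# Read one /proc/sys/fs/file-nr line on stdin and print how many file
# handles are in use relative to the maximum.
# Assumption: 2.6 kernel semantics (field 2 = allocated but unused handles).
file_handle_usage() {
    awk '{
        used = $1 - $2
        printf "%d of %d file handles in use (%.1f%%)\n", used, $3, 100 * used / $3
    }'
}

echo "1154 133 8192" | file_handle_usage
# On a live system: file_handle_usage < /proc/sys/fs/file-nr
```

For the sample values above, 1154 - 133 = 1021 handles are in use out of 8192.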
The maximum number of file handles can be changed in the proc file system
without reboot:
# echo 65536 > /proc/sys/fs/file-max
Alternatively, you can use sysctl(8) to change it:
# sysctl -w fs.file-max=65536
To make the change permanent, add or change the following line in the file
/etc/sysctl.conf. This file is used during the boot process.
# echo "fs.file-max=65536" >> /etc/sysctl.conf
Changing Network Kernel Settings
Oracle now uses UDP as the default protocol on Linux for interprocess
communication, such as cache fusion buffer transfers between the instances. But
starting with Oracle 10g, network settings should be adjusted for standalone
databases as well.
Oracle recommends setting the default and maximum send buffer size (SO_SNDBUF
socket option) and receive buffer size (SO_RCVBUF socket option) to 256 KB. The
receive buffers are used by TCP and UDP to hold received data until the
application reads it. With TCP, this buffer cannot overflow because the sending
party is not allowed to send data beyond the buffer size window. UDP has no such
flow control, so datagrams are discarded if they don't fit in the receive
buffer. A fast sender can therefore overwhelm the receiver.
The default and maximum window size can be changed in the proc file system
without reboot:
# sysctl -w net.core.rmem_default=262144   # Default setting in bytes of the socket receive buffer
# sysctl -w net.core.wmem_default=262144   # Default setting in bytes of the socket send buffer
# sysctl -w net.core.rmem_max=262144       # Maximum socket receive buffer size which may be set by using the SO_RCVBUF socket option
# sysctl -w net.core.wmem_max=262144       # Maximum socket send buffer size which may be set by using the SO_SNDBUF socket option
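To verify that the four buffer settings are at least 256 KB, the current values
can be piped through a small check. This is a sketch only: the function name
check_net_buffers is made up, and it expects "key = value" lines as printed by
sysctl(8).

```shell
# check_net_buffers [MIN]
# Read "key = value" lines on stdin and report every net.core socket
# buffer setting whose value is below MIN (default 262144 bytes).
# Exit status is 1 if any value is too low, 0 otherwise.
# Hypothetical helper; it reads values only and changes nothing.
check_net_buffers() {
    awk -F'[ \t]*=[ \t]*' -v min="${1:-262144}" '
        $1 ~ /^net\.core\.[rw]mem_(default|max)$/ && $2 + 0 < min + 0 {
            printf "%s is %d, below %d\n", $1, $2, min
            low = 1
        }
        END { exit low }'
}

# Example use on a live system:
#   sysctl net.core.rmem_default net.core.wmem_default \
#          net.core.rmem_max net.core.wmem_max | check_net_buffers
```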
To make the change permanent, add the following lines to the /etc/sysctl.conf
file, which is used during the boot process:
net.core.rmem_default=262144
net.core.wmem_default=262144
net.core.rmem_max=262144
net.core.wmem_max=262144
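The echo "..." >> /etc/sysctl.conf commands used throughout this article append
a new line every time they are run, so repeating one leaves duplicate entries in
the file. The following sketch avoids that by dropping any existing line for the
key before appending the new one. The function name set_sysctl_conf is made up
for illustration, and the function only edits the file; it does not change the
running kernel.

```shell
# set_sysctl_conf FILE KEY VALUE
# Remove existing "KEY=..." or "KEY = ..." lines from FILE, then append
# "KEY = VALUE". Commented-out lines are left untouched.
# Hypothetical helper; run it as root when FILE is /etc/sysctl.conf.
set_sysctl_conf() {
    file=$1 key=$2 value=$3
    [ -e "$file" ] || : > "$file"
    tmp=$(mktemp) || return 1
    awk -v k="$key" '{
        line = $0
        sub(/^[ \t]+/, "", line)                 # ignore leading whitespace
        n = index(line, "=")
        name = (n > 0) ? substr(line, 1, n - 1) : ""
        sub(/[ \t]+$/, "", name)                 # ignore spaces before "="
        if (name != k) print $0                  # keep all other lines
    }' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"    # overwrite in place to keep owner and permissions
    rm -f "$tmp"
}

# Example use:
#   set_sysctl_conf /etc/sysctl.conf net.core.rmem_max 262144
#   sysctl -p    # load the settings from /etc/sysctl.conf
```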
To improve failover performance in a RAC cluster, consider changing the
following IP kernel parameters as well:
net.ipv4.tcp_keepalive_time
net.ipv4.tcp_keepalive_intvl
net.ipv4.tcp_retries2
net.ipv4.tcp_syn_retries
Changing these settings may be highly dependent on your system, network, and
other applications. For suggestions, see Metalink Note:249213.1 and
Note:265194.1.
Most shells like Bash provide control over various resources like the maximum
allowable number of open file descriptors or the maximum number of processes
available to a user. To see all shell limits, run:
$ ulimit -a
For more information on ulimit for the Bash shell, see man bash and search for
ulimit.
To see the current limit of the maximum number of open file descriptors for the
oracle user, run:
$ su - oracle
$ ulimit -n
1024
$
To change this limit, edit the /etc/security/limits.conf file as root and add or
change the following lines:
oracle soft nofile 4096
oracle hard nofile 63536
The "soft limit" in the first line defines the number of file handles or open
files that the oracle user will have after login. If the oracle user gets error
messages about running out of file handles, then the oracle user can increase
the number of file handles, as in this example, up to 63536 ("hard limit") by
executing the following command:
$ ulimit -n 63536
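Asking for more than the hard limit makes ulimit fail. As a sketch, the following
function caps the requested value at the current hard limit before setting the
soft limit. The function name set_soft_nofile is made up, and the sketch assumes
a shell whose ulimit built-in supports the -H, -S, and -n options, as Bash does.

```shell
# set_soft_nofile TARGET
# Set the soft limit for open files in the current shell to TARGET,
# capped at the hard limit so that the call does not fail.
# Hypothetical helper; assumes ulimit supports -H, -S and -n (e.g. Bash).
set_soft_nofile() {
    target=$1
    hard=$(ulimit -Hn)
    # "unlimited" means there is no hard cap to compare against.
    if [ "$hard" != "unlimited" ] && [ "$target" -gt "$hard" ]; then
        target=$hard
    fi
    ulimit -Sn "$target"
}

# Example use:
#   set_soft_nofile 63536
#   ulimit -n
```

The new limit applies only to the current shell and to processes started from
it.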
You can set the "soft" and "hard" limits higher if necessary.
To see the current limit of the maximum number of processes for the oracle
user, run:
$ su - oracle
$ ulimit -u
Note that the ulimit options are different for other shells.
To change the "soft" and "hard" limits for the maximum number of processes for
the oracle user, add the following lines to the /etc/security/limits.conf file:
oracle soft nproc 2047
oracle hard nproc 16384
To check the amount of physical memory, run:
$ grep MemTotal /proc/meminfo
Alternatively, you can use the free(1) command to check the memory:
$ free
             total       used       free     shared    buffers     cached
Mem:       4040360    4012200      28160          0     176628    3571348
-/+ buffers/cache:     264224    3776136
Swap:      4200956      12184    4188772
$
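The "-/+ buffers/cache" line in the output above is derived from the "Mem:"
line: buffers and cache are subtracted from "used" and added to "free". The
sketch below reproduces that arithmetic. The function name app_memory is made
up, and it assumes the older free(1) column layout shown here (total, used,
free, shared, buffers, cached); newer versions of free print different columns.

```shell
# app_memory
# Read free(1) output on stdin and recompute the "-/+ buffers/cache"
# figures from the "Mem:" line, in KB.
# Assumption: columns are total used free shared buffers cached.
app_memory() {
    awk '$1 == "Mem:" {
        printf "used by applications: %d KB\n", $3 - $6 - $7
        printf "available to applications: %d KB\n", $4 + $6 + $7
    }'
}

echo "Mem: 4040360 4012200 28160 0 176628 3571348" | app_memory
```

For the sample values this gives 264224 KB used and 3776136 KB available, the
same figures free reports on its second line.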
In this example the total amount of available memory is 4040360 KB. 264224 KB
are used by processes and 3776136 KB are free for other applications. Don't get
confused by the first line, which shows that 28160 KB are free! If you look at
the usage figures you can see that most of the memory use is for buffers and
cache, since Linux always tries to use RAM to the fullest extent to speed up
disk operations. Using available memory for buffers (file system metadata) and
cache (pages with actual contents of files or block devices) helps the system
run faster because disk information is already in memory, which saves I/O. If
space is needed by programs or applications like Oracle, then Linux will free up
the buffers and cache to yield memory for the applications. So if your system
runs for a while you will usually see a small number under the field "free" on
the first line.
Tuning Page Cache
Page Cache is a disk cache which holds data of files and executable programs,
i.e. pages with actual contents of files or block devices. Page Cache (disk
cache) is used to reduce the number of disk reads. To control the percentage of
total memory used for page cache in RHEL 3, the following kernel parameter can
be changed:
# cat /proc/sys/vm/pagecache
1 15 30
The above three values are usually good for database systems. It is not
recommended to set the third value very high, like 100, as it used to be with
older RHEL 3 kernels. This can cause significant performance problems for
database systems. If you upgrade to a newer kernel like 2.4.21-37, then these
values will automatically change to "1 15 30" unless they are set to different
values in /etc/sysctl.conf. For information on tuning the pagecache kernel
parameter, I recommend reading the excellent article Understanding Virtual
Memory. Note that this kernel parameter does not exist in RHEL 4.
The pagecache parameters can be changed in the proc file system without reboot:
# echo "1 15 30" > /proc/sys/vm/pagecache
Alternatively, you can use sysctl(8) to change it:
# sysctl -w vm.pagecache="1 15 30"
To make the change permanent, add the following line to the file
/etc/sysctl.conf. This file is used during the boot process.
# echo "vm.pagecache=1 15 30" >> /etc/sysctl.conf
General
In some cases it's good for the swap partition to be used. For example, long
running processes often access only a subset of the page frames they obtained.
This means that the swap partition can safely be used even if memory is
available, because system memory may be better used for disk cache to improve
overall system performance. In fact, in the 2.6 kernel, i.e. RHEL 4, you can
define a threshold at which processes should be swapped out in favor of I/O
caching. This can be tuned with the /proc/sys/vm/swappiness kernel parameter. The
default value of /proc/sys/vm/swappiness is 60, which means that applications and
programs that have not done a lot lately can be swapped out. Higher values will
provide more I/O cache and lower values will wait longer to swap out idle
applications.
Depending on the system profile you may see that swap usage slowly increases
with system uptime. To display swap usage you can run the free(1) command or
check the /proc/meminfo file. Once the system has used swap space, swap usage
will sometimes not decrease afterward. This saves I/O if memory is needed again,
because pages that are already in the swap space don't have to be swapped out a
second time. However, if swap usage gets close to 80% - 100% (your threshold may
be lower if you use a large swap space), then take a closer look at the system;
see also Checking Swap Space Size and Usage. Depending on the size of your swap
space, you may want to check swap activity with vmstat or sar even if swap
allocation is lower than 80%. But these numbers really depend on the size of the
swap space. The actual numbers of swapped pages per timeframe from vmstat or sar
are the important numbers. Constant swapping should be avoided at all cost.
Note: never add a permanent swap file to the system, because of the performance
impact of the filesystem layer.
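Checking Swap Space Size and Usage
You can check the size and current usage of swap space by running one of the
following commands:
$ grep SwapTotal /proc/meminfo
$ cat /proc/swaps
$ free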
Swap usage may slowly increase as shown above but should stop at some point. If
swap usage continues to grow steadily or is already large, then one of the
following choices may need to be considered:
- Add more RAM or reduce the size of the SGA
- Increase the size of the swap space
If you see constant swapping, then you need to either add more RAM or reduce
the size of the SGA. Constant swapping should be avoided at all cost. You can
check current swap activity using the following commands:
$ vmstat 3 100
   procs                      memory      swap          io     system         cpu
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 1  0      0 972488   7148  20848    0    0   856     6  138    53  0  0 99  0
 0  1      0 962204   9388  20848    0    0   747     0 4389  8859 23 24 11 41
 0  1      0 959500  10728  20848    0    0   440   313 1496  2345  4  7  0 89
 0  1      0 956912  12216  20848    0    0   496     0 2294  4224 10 13  0 77
 1  1      0 951600  15228  20848    0    0   997   264 2241  3945  6 13  0 81
 0  1      0 947860  17188  20848    0    0   647   280 2386  3985  9  9  1 80
 0  1      0 944932  19304  20848    0    0   705     0 1501  2580  4  9  0 87
The fields si and so show the amount of memory paged in from disk and paged out
to disk, respectively. If the server shows continuous swap activity, then more
memory should be added or the SGA size should be reduced. To check the history
of swap activity, you can use the sar command. For example, to check swap
activity from Oct 12th:
# ls -al /var/log/sa | grep "Oct 12"
-rw-r--r--    1 root     root      2333308 Oct 12 23:55 sa12
-rw-r--r--    1 root     root      4354749 Oct 12 23:53 sar12
# sar -W -f /var/log/sa/sa12
Linux 2.4.21-32.0.1.ELhugemem (rac01prd)        10/12/2005
12:00:00 AM  pswpin/s pswpout/s
12:05:00 AM      0.00      0.00
12:10:00 AM      0.00      0.00
12:15:00 AM      0.00      0.00
12:20:00 AM      0.00      0.00
12:25:00 AM      0.00      0.00
12:30:00 AM      0.00      0.00
...
The fields pswpin and pswpout show the number of pages swapped in and out per
second, respectively.
If the server shows sporadic swap activity, or swap activity for a short period
of time at certain intervals, then you can either add more swap space or RAM. If
swap usage is already very large (don't confuse it with constant swapping),
then I would add more RAM.
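To tell sporadic from constant swapping in a longer vmstat run, the si and so
columns can be counted instead of read by eye. The following is a sketch with a
made-up function name, swapping_samples. It assumes the vmstat column layout
shown earlier, where si is the 7th field and so is the 8th; other vmstat
versions may order the columns differently.

```shell
# swapping_samples
# Read vmstat output on stdin and count the samples in which si or so
# is non-zero. Header lines are skipped because they do not start
# with a number.
# Assumption: si is field 7 and so is field 8, as in the sample output.
swapping_samples() {
    awk '$1 ~ /^[0-9]+$/ {
             total++
             if ($7 > 0 || $8 > 0) active++
         }
         END { printf "%d of %d samples show swap activity\n", active, total }'
}

# Example use on a live system:
#   vmstat 3 100 | swapping_samples
```

Keep in mind that the first line vmstat prints reports averages since boot, so
it says little about current activity.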