I had some fun yesterday with some odd performance problems. So I did a run with oprofile and got this:
Which seemed to me like an awful lot of page faults, especially on a system that wasn't
swapping at all.
Alexander Rubin and I got to talking about it, and I learned a few new things. The first thing is that
machinery for doing
Back to the problem at hand... the following was in the my.cnf:
As I understand it, if read_buffer set to over 256K, it uses
allocation. Actually - this is a libc malloc thing and is tunable, but defaults to 256k. From the manual:
Well, it just so happens that this is a PHP app and is not using any sort of persistent connections, so the app is connecting and disconnecting 60 - 100 times a second. Each of those connections is going to have to authenticate against (at least) the mysql.user table. On the other hand, that _should_ be cached, so it shouldn't be a problem. However, lowering the read buffer is quite easy ... and it'll be interesting.
So, theory to the fire - I reduced read_buffer to 128k (the default) and then got this output from oprofile:
Which is more what we would expect.
The results on the system as a whole were that with read buffer set to 5M, extra 3-5% usr and 6-12% sys cpu time taken on an 8 cpu box (as reported by top) or, to look at it another way, about 100% of a single CPU taken by mysqld.
With read_buffer reduced to 128k, the CPU usage dropped to just about nothing.
561612 25.0417 /lib64/tls/libc-2.3.4.so memset
429457 19.1491 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux clear_page
214268 9.5540 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux do_page_fault
144293 6.4339 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux do_no_page
94410 4.2097 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux buffered_rmqueue
64998 2.8982 /lib64/tls/libc-2.3.4.so memcpy
59565 2.6559 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux __down_read_trylock
59369 2.6472 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux handle_mm_fault
47312 2.1096 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux free_hot_cold_page
39161 1.7462 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux release_pages
39140 1.7452 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux unmap_vmas
27200 1.2128 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux __alloc_pages
21520 0.9596 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux
__pagevec_lru_add_active
20772 0.9262 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux page_add_anon_rmap
19852 0.8852 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux page_remove_rmap
16424 0.7323 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux bad_range
11993 0.5348 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux __might_sleep
Which seemed to me like an awful lot of page faults, especially on a system that wasn't
swapping at all.
Alexander Rubin and I got to talking about it, and I learned a few new things. The first thing is that
mmap() appears as page_faults, because the machinery for doing
mmap() and for doing swap is actually the same. Fun.Back to the problem at hand... the following was in the my.cnf:
read_buffer=5M
As I understand it, if read_buffer set to over 256K, it uses
mmap() instead of malloc() for memory allocation. Actually - this is a libc malloc thing and is tunable, but defaults to 256k. From the manual:
Each thread that does a sequential scan allocates a buffer of this size (in bytes) for each table it scans.In this case, the server is pretty much handling click logging, so all of the queries are pure inserts. Why then would the read buffer get used at all?
Well, it just so happens that this is a PHP app and is not using any sort of persistent connections, so the app is connecting and disconnecting 60 - 100 times a second. Each of those connections is going to have to authenticate against (at least) the mysql.user table. On the other hand, that _should_ be cached, so it shouldn't be a problem. However, lowering the read buffer is quite easy ... and it'll be interesting.
So, theory to the fire - I reduced read_buffer to 128k (the default) and then got this output from oprofile:
2220 15.7760 /lib64/tls/libc-2.3.4.so memset
785 5.5785 /lib64/tls/libc-2.3.4.so memcpy
608 4.3206 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux thread_return
398 2.8283 /usr/libexec/mysqld yyparse(void*)
337 2.3948 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux system_call
234 1.6629 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux do_gettimeoffset_hpet
197 1.3999 /lib64/tls/libc-2.3.4.so _int_malloc
183 1.3005 /lib64/tls/libpthread-2.3.4.so pthread_mutex_lock
181 1.2862 /lib64/tls/libpthread-2.3.4.so pthread_mutex_unlock
181 1.2862 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux clear_page
176 1.2507 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux wake_futex
143 1.0162 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux sys_tgkill
138 0.9807 /usr/libexec/mysqld yylex(void*, void*)
130 0.9238 /usr/libexec/mysqld update_sys_var_str(sys_var_str*, _pthread_rwlock_t*, set_var*)
126 0.8954 /jbd (no symbols)
123 0.8741 /usr/libexec/mysqld my_strntoul_ucs2
106 0.7533 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux __might_sleep
106 0.7533 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux do_futex
104 0.7391 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux tcp_recvmsg
Which is more what we would expect.
The results on the system as a whole were that with read buffer set to 5M, extra 3-5% usr and 6-12% sys cpu time taken on an 8 cpu box (as reported by top) or, to look at it another way, about 100% of a single CPU taken by mysqld.
With read_buffer reduced to 128k, the CPU usage dropped to just about nothing.
0 Comments