RE: [cobalt-users] The server encountered an internal error or misconfiguration and was unable to complete your request



I think we have 256 MB on the RaQ3. We're running vBulletin, so we've got MySQL and PHP. My guess is that there is a memory leak somewhere, since the server doesn't get that many hits. Does anyone know of any tutorials on tracking down memory issues on RaQ3s?
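
One way to narrow this down, offered as a minimal sketch rather than a tested RaQ3 recipe: capture `ps` output at two points between reboots and compare the resident memory (RSS) per command name. The script below assumes each snapshot has three columns (PID, RSS in KB, command), as `ps -eo pid,rss,comm` prints on Linux systems with procps. Whether the RaQ3's `ps` accepts those options is an assumption, and the script can be run on any machine with Python 3 against saved snapshots. The function names `parse_ps` and `rss_growth` and the sample figures are hypothetical, for illustration only.

```python
from collections import defaultdict


def parse_ps(text):
    """Sum resident memory (RSS, in KB) per command name.

    Expects lines of the form "PID RSS COMMAND"; the header line and
    any malformed lines are skipped.
    """
    totals = defaultdict(int)
    for line in text.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3 or not fields[1].isdigit():
            continue  # header or malformed line
        totals[fields[2].strip()] += int(fields[1])
    return dict(totals)


def rss_growth(before, after):
    """Return (command, delta_kb) pairs, largest growth first."""
    names = set(before) | set(after)
    deltas = [(n, after.get(n, 0) - before.get(n, 0)) for n in names]
    return sorted(deltas, key=lambda d: (-d[1], d[0]))


if __name__ == "__main__":
    # Made-up snapshots for illustration only; not data from this server.
    first = """  PID   RSS COMMAND
  341  1200 sshd
  400  9000 httpd
  401  9100 httpd
  500 20000 mysqld
"""
    second = """  PID   RSS COMMAND
  341  1200 sshd
  400 15000 httpd
  401 16000 httpd
  500 21000 mysqld
"""
    for name, delta in rss_growth(parse_ps(first), parse_ps(second)):
        print(f"{name:10s} {delta:+d} KB")
```

A command whose total keeps climbing from one snapshot to the next (httpd or mysqld, for instance) is the first place to look. If every total stays flat, the pressure is more likely coming from the number of processes than from a leak in any one of them.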

Thanks,
Paul


At 02:27 PM 10/24/2002 +0300, you wrote:
>Paul,
>
>Re-booting a server every 6 hours!! I think it is time to upgrade the memory.
>
>The errors shown in your logs can be caused by lack of memory and swap space.
>
>See the message below:
>http://list.cobalt.com/pipermail/cobalt-security/2001-December/004025.html
>
>Find out what is running and what services you do not need.  Use the top 
>command.
>
>Al-Juhani
>aljuhani@xxxxxxxxx
>
>
>
>>===== Original Message From cobalt-users@xxxxxxxxxxxxxxx =====
>>Well the site mysteriously came back after I asked the ISP to check it. After telnet was turned on they were able to telnet into port 110 and 80 (which I didn't realize was possible). I took a look at the messages file and I see the following messages right before the system became unavailable:
>>
>>Oct 21 18:10:22 www init: Switching to runlevel: 6
>>Oct 21 18:10:29 www getty[587]: exiting on TERM signal
>>Oct 21 18:16:18 www sshd[341]: Received signal 15; terminating.
>>Oct 21 18:21:55 www kernel: Unable to load interpreter /lib/ld-linux.so.2
>>Oct 21 19:26:08 www last message repeated 23 times
>>Oct 21 20:30:49 www kernel: pd...
>>Oct 21 20:30:49 www kernel: VM: do_try_to_free_pages failed for httpd...
>>Oct 21 20:30:51 www last message repeated 21 times
>>Oct 21 20:30:51 www kernel: VM: do_try_to_free_pages failed for mysqld...
>>Oct 21 20:30:51 www last message repeated 7 times
>>Oct 21 20:30:51 www kernel: VM: do_try_to_free_pages failed for klogd...
>>Oct 21 20:30:51 www kernel: VM: do_try_to_free_pages failed for httpd...
>>Oct 21 20:30:53 www last message repeated 307 times
>>Oct 21 20:30:53 www kernel: Unable to load interpreter /lib/ld-linux.so.2
>>
>>and just before it was reachable...
>>
>>Oct 23 22:31:21 www kernel:   Receiver lock-up workaround activated.
>>Oct 23 22:31:28 www kernel: portmap: RPC call returned error 111
>>Oct 23 22:31:28 www kernel: RPC: task of released request still queued!
>>Oct 23 22:31:28 www kernel: RPC: (task is on xprt_pending)
>>Oct 23 22:31:33 www kernel: portmap: RPC call returned error 111
>>Oct 23 22:31:33 www kernel: RPC: task of released request still queued!
>>Oct 23 22:31:33 www kernel: RPC: (task is on xprt_pending)
>>Oct 23 22:31:33 www kernel: lockd_up: makesock failed, error=-111
>>Oct 23 22:31:38 www kernel: portmap: RPC call returned error 111
>>Oct 23 22:31:38 www kernel: RPC: task of released request still queued!
>>Oct 23 22:31:38 www kernel: RPC: (task is on xprt_pending)
>>Oct 23 22:31:38 www rpc.statd[247]: unable to register (SM_PROG, SM_VERS, udp).
>>Oct 23 22:31:39 www modprobe: modprobe: Can't locate module block-major-22
>>Oct 23 22:31:39 www modprobe: modprobe: Can't locate module block-major-22
>>Oct 23 22:31:39 www modprobe: modprobe: Can't locate module block-major-33
>>Oct 23 22:31:39 www modprobe: modprobe: Can't locate module block-major-33
>>Oct 23 22:31:39 www modprobe: modprobe: Can't locate module block-major-34
>>Oct 23 22:31:39 www modprobe: modprobe: Can't locate module block-major-34
>>Oct 23 22:31:39 www modprobe: modprobe: Can't locate module block-major-8
>>Oct 23 22:31:39 www last message repeated 4 times
>>Oct 23 22:31:39 www modprobe: modprobe: Can't locate module block-major-13
>>Oct 23 22:31:39 www modprobe: modprobe: Can't locate module block-major-13
>>Oct 24 04:31:40 www sshd[325]: Server listening on 0.0.0.0 port 23.
>>Oct 24 04:31:47 www PAM_pwdb[368]: (su) session opened for user postgres by (uid=0)
>>Oct 24 04:31:48 www PAM_pwdb[368]: (su) session closed for user postgres
>>Oct 24 04:34:07 www sshd[643]: Accepted password for admin from <n.n.n.n> port <n> ssh2
>>Oct 24 04:34:08 www PAM_pwdb[666]: (sshd) session opened for user admin by (uid=110)
>>Oct 24 04:34:12 www PAM_pwdb[681]: (su) session opened for user root by admin(uid=110)
>>
>>I'm worried that this might happen to us again. Do any of these messages ring any bells for anyone? The server had memory problems a long time ago, until I set it up to reboot every 6 hours. I realize that is suboptimal, but I wasn't sure how to go about diagnosing where the memory problems were coming from.
>>
>>
>>At 07:29 PM 10/23/2002 -0400, you wrote:
>>>I attempted to access a RaQ3 this morning and was unable to reach it over HTTP. I tried SSH and got a "connection refused" error.
>>>
>>>I asked our ISP to reboot the machine and they did.
>>>
>>>
>>>I fired up the admin console in a browser (it uses https) and the buttons on the left display, but the main page has the following error message:
>>>
>>>The server encountered an internal error or misconfiguration and was unable to complete your request
>>>
>>>I can click on the "Control Panel" button on the left, which brings up the list of services running. I turned on telnet and tried to connect via telnet, and I get a connection refused.
>>>
>>>Clicking on "Site Management" on the left tells me
>>>"No Virtual Sites were found with this search, try narrowing your search criteria."
>>>
>>>There were two sites on the machine.
>>>
>>>I clicked on the "Maintenance" button on the left, rebooted the system.
>>>
>>>System reboots, I still can't get in via telnet, ssh or regular http.
>>>
>>>We installed RaQ3-All-Security-4.0.1-14997.pkg a month ago and the system has been rebooted successfully since then.
>>>
>>>We have been getting the following Postgres error messages
>>>
>>>NOTICE:  Index pg_class_relname_index: NUMBER OF INDEX' TUPLES (56) IS NOT THE SAME AS HEAP' (55)
>>>NOTICE:  Index pg_class_oid_index: NUMBER OF INDEX' TUPLES (56) IS NOT THE SAME AS HEAP' (55)
>>>
>>>for a few months now but I hadn't found anything in the archives that addressed how to fix it.
>>>
>>>Clicking on the "System Status" button also results in the error message about "internal error or misconfiguration".
>>>
>>>One possibility is that Postgres has gone south but that doesn't explain to me why telnet and ssh aren't working.
>>>
>>>I'm desperately looking for ideas on how we might be able to get back into the box and secondly for ideas on what might have happened.
>>>
>>>Thanks,
>>>Paul
>>>