[cobalt-users] FIX - can't su to root, email stopped working, gui stopped working, postgres database is down, virtual sites disappeared
- From: CBartkowiak (Raven) <lists@xxxxxxxxxxxxxxx>
- Date: Fri Mar 1 16:07:10 2002
- List-id: Mailing list for users to share thoughts on Cobalt products. <cobalt-users.list.cobalt.com>
Forgive the long subject line, it's so this will show up in a variety
of searches.
Yesterday my RaQ went nucking futs. I'm gonna post the symptoms, the
fix (found via the archives), some bitching, and a little secret that
I never previously wanted to divulge.
First, the symptoms:
Suddenly I couldn't su to root from admin. Checked /etc/group,
everything was fine, admin was still in the wheel group. Changing the
password from the GUI didn't help - it was still telling me incorrect
password when I tried to su, even though I knew I was using the
correct password.
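If you'd rather script the wheel check than eyeball the file, here's a minimal sketch in Python. The function name `members_of` and the sample data are made up for illustration; it only assumes the standard name:password:gid:members layout of /etc/group.

```python
def members_of(group_text, group_name):
    """Return the member list for group_name from /etc/group-style text."""
    for line in group_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Standard layout: name:password:gid:comma,separated,members
        fields = line.split(":")
        if len(fields) >= 4 and fields[0] == group_name:
            return [member for member in fields[3].split(",") if member]
    return []


# Made-up sample data, not copied from a real RaQ.
SAMPLE_GROUP_FILE = "root:x:0:root\nwheel:x:10:root,admin\n"
print("admin" in members_of(SAMPLE_GROUP_FILE, "wheel"))
```

On a live box you would pass in `open("/etc/group").read()` instead of the sample string. If admin shows up in wheel and su still refuses the password, group membership isn't your problem, which is where I ended up.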
No one could get their email via POP. It was telling them incorrect
username/password. They could, however, login to FTP and get their
mail from the Neomail interface with that same username and password.
Then the GUI started having serious problems. Hitting it with a new
browser, it would look fine. Change pages and I'd get an "internal
server error" page, with logs showing premature end of script
headers. Hit it with a new browser, and again it would look fine
until I changed pages.
And the RaQ kept sending me emails about the postgres database being
down, which of course I never saw until I got the damn thing fixed
because I couldn't get my mail from POP and was too busy to screw
with webmail.
Sounds like a hacker's work, right? Wrong.
The RaQ has been just about falling over each morning while
processing the logs. It's not like it's a horrendous job, either -
there are only 82 sites on this box and none of them are what you'd
really call high-traffic.
But, given how the RaQ handles things, it got done processing the
logs and had no memory left for anything else (it's got 256MB in it),
and the postgres db got corrupted.
Which caused all hell to break loose.
The way I figured it out was that people could still get in through
ftp. I know ftp works off of the /etc/passwd file. Since the Cobalt
"handles" the entire POP process with its pop-before-relay, that
means that it was probably dependent on the postgresql database to
get the usernames and passwords rather than using what everything
else does... the passwd file.
The GUI, of course, runs completely off of the postgres db
(proprietary software, don't ya know), which would explain why it
(when it *did* give me a page) was showing no virtual sites and no
users listed.
The entire time, the websites were serving up fine, the machine was
still *receiving* email fine (showing that it wasn't a problem with
sendmail), and ftp was working fine.
Which all led me back to the postgres database. No matter how many
times I restarted it, even rebooted the machine, it would not stay
up. This led me to digging through the archives for a way to "fix"
the postgres db. I found an exact step-by-step posted back in
January by Andrew, and if you need it, you can find it here:
http://list.cobalt.com/pipermail/cobalt-users/2002-January/059762.html
The thing is, you have to be root to do these things, and I could not
su to root from admin.
Here's the little secret I never wanted to divulge before.
I've known for quite a long time now that there is a backdoor built
into the Cobalt SSH2 package. I found it out of pure curiosity one
day. Perhaps other people have too, but just didn't want to say
anything because it *is* such a glaring hole but makes a nice saving
grace if you ever find yourself in the situation I was in yesterday.
If you've got the Cobalt SSH2 package installed on your machine, you
can shell to your machine and login AS ROOT. When the login prompt
comes up, where you would normally type "admin", just type "root" (no
quotes). Give it your password and you're in.
For those of you who like the thought of having this in place should
you ever find yourself without root access, enjoy.
For those of you who are security minded, check your sshd config
file. You'll find "PermitRootLogin yes" in there somewhere. Just
change the "yes" to "no" and restart sshd.
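For anyone who wants to check a pile of machines, here's a rough Python sketch that pulls the PermitRootLogin setting out of config text. It assumes plain "Keyword value" lines with # comments; the function name `permit_root_login` is made up, and I'm deliberately not guessing where the Cobalt SSH2 package keeps its config file, so you feed it the text yourself.

```python
def permit_root_login(config_text):
    """Return the first PermitRootLogin value found, lowercased, or None.

    Assumes whitespace-separated "Keyword value" lines; commented-out
    lines (starting with #) are ignored.
    """
    for line in config_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        parts = stripped.split(None, 1)
        if len(parts) == 2 and parts[0].lower() == "permitrootlogin":
            return parts[1].strip().lower()
    # Not set explicitly; the daemon's built-in default applies.
    return None


# Made-up sample config text for illustration.
SAMPLE_CONFIG = "# sample only\nPort 22\nPermitRootLogin yes\n"
print(permit_root_login(SAMPLE_CONFIG))
```

A result of None means the setting isn't in the file at all, in which case whatever default your sshd was built with is what counts, so look that up before assuming you're safe.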
There's the secret. If everyone knew about it and no one was saying
anything for whatever reasons... sorry.
So using this method I was able to get in as root, wipe out the
postgres files, rebuild them and run meta-verify on them to restore
the sites and users to the database. Everything immediately started
working perfectly again.
That's the method, the fix, and the secret. Now the bitch.
I'm sick of getting emails every day about how "var is very close to
being full". It's nowhere near close to being full. There's hardly
anything in there. There are, however, symlinks to all of the mail,
which is kept in the /home partition.
The stupid swatch program counts all symlinked data when it's
figuring up totals - resulting in an annoying daily email, a false
alarm red flashing light whenever I go into the GUI, and probably a
lot of scared webmasters out there wondering what the hell they've
missed when they do a df and don't get anything that would be cause
to worry.
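To show the effect I'm describing, here's a small Python sketch that totals a directory tree two ways: once ignoring symlinks and once following them. This is NOT how swatch does it (I haven't seen its internals); `tree_size` is a made-up name, and it sums apparent file sizes rather than allocated disk blocks.

```python
import os


def tree_size(root, follow_symlinks=False):
    """Sum apparent file sizes (bytes) under root.

    With follow_symlinks=False, symlinked directories are not descended
    into and symlinked files are skipped. With follow_symlinks=True,
    everything the links point at is counted as if it lived under root.
    Note: following links can loop forever on cyclic symlinks.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=follow_symlinks):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) and not follow_symlinks:
                continue  # do not count data that lives elsewhere
            try:
                total += os.stat(path).st_size
            except OSError:
                pass  # broken symlink or file vanished mid-walk
    return total
```

Point it at a directory that holds little besides symlinks into another partition and the two totals will be wildly different. The lower number is the one that matches what df tells you about the partition.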
The SSH2 backdoor either needs to be publicized, or fixed. It's a
security hole. I just publicized it, y'all can decide whether you
want to fix it or not.
The RaQ choking on the logs with 256MB of RAM on 82 sites that only
received 24,463 total "web hits" last week (according to Jens'
TrafficLight program) is just ridiculous.
And the swatch not knowing how to correctly judge disk usage (not
having the correct command passed to it to ignore symlinked data)
when the symlinks were put there by Cobalt and the swatch was as well
- the oversight or non-common-sense of the matter just blows my mind.
Anyway, thought I would put this out for anyone else who was having
the same problems listed in the subject line - hopefully there's a
fix in this email for you.
--
CarrieB