[Gllug] CentOS and RHES

Jason Clifford jason at ukpost.com
Wed Oct 26 09:16:29 UTC 2005


On Wed, 26 Oct 2005, Karanbir Singh wrote:

> I still have no idea as to what broke on the machine. Do you have some 
> bugzilla numbers? Would really appreciate some info on the problem on 
> the machine itself, that way at least someone can try to get to the bottom 
> of this.

The biggest problem was a change in the behaviour of mysql with bdb table 
support enabled: it required far too much memory, so malloc failed at 
mysql start time and killed mysqld.

The solution, at least the temporary one as it disables part of the site, 
was to turn off bdb support in mysql.

I didn't raise a bugzilla ticket as it's not my server, I don't use bdb 
tables in mysql and I have no intention of chasing this up.

My involvement in this came about when my client phoned me claiming that a 
yum update overnight had killed the web services and that yum was giving 
error messages indicating that it had become corrupted on his local system 
(the error message was that a local file could not be opened even though 
the problem was actually that it could not access the file on a remote 
server).

> >>the mirror.centos.org network was all updated at the time - but was 
> >>running slow ( we were pretty much maxing out on the networks that we 
> >>have access to ), most of the major external mirrors were also updated, 
> >>while there were indeed some third party mirrors that were out of sync - 
> >>it should not affect yum's running.
> > 
> > That is simply not true.
> 
> Can you give me the server names you were looking at ? I'd be happy to 
> look through the rsync logs to check what time they picked up the 
> /centos/4*/ trees.

No I cannot as I didn't keep a record of them. My interest at the time was 
to solve my client's immediate problem and get his website (and thus his 
livelihood!) back online.

Your checking logs a week or so later isn't going to help anyone, is it?

All I am asking is that for future releases you have a release management 
system in place that ensures that all systems which are resolved to by 
mirror.centos.org have the version in place and available before you tell 
yum to roll it out. 
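
As a rough illustration of the kind of pre-release check requested above, 
the following Python sketch resolves a round-robin name to all of its 
addresses and asks each one for a file that should exist once the release 
has synced. It is only a sketch under stated assumptions: the function 
names are made up for this example, the path in the usage comment is a 
placeholder for whatever file marks a complete tree, and it checks HTTP 
only.

```python
import socket
import urllib.request


def resolve_all(hostname):
    """Return the sorted IPv4 addresses that hostname currently resolves to."""
    infos = socket.getaddrinfo(hostname, 80, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def has_file(address, hostname, path, timeout=10):
    """Send HEAD for path to one address, using hostname as the Host header.

    Returns True only if the final response status is 200.
    """
    request = urllib.request.Request(
        "http://%s%s" % (address, path),
        headers={"Host": hostname},
        method="HEAD",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError are subclasses of OSError, so this covers
        # refused connections, timeouts and 4xx/5xx responses alike.
        return False


def stale_mirrors(addresses, hostname, path, check=has_file):
    """Return the addresses that do not yet serve path."""
    return [a for a in addresses if not check(a, hostname, path)]


# Hypothetical usage (placeholder path, performs real network requests):
#   host = "mirror.centos.org"
#   missing = stale_mirrors(resolve_all(host), host,
#                           "/centos/4/os/i386/repodata/repomd.xml")
#   if missing:
#       print("do not announce yet, still syncing:", missing)
```

The check function is passed in as a parameter so the logic can be 
exercised without touching the network, and so an FTP or rsync probe could 
be swapped in for mirrors that do not serve HTTP.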

It's in your own interests (those of the whole CentOS team and community) 
to get this sorted out - not mine.

> If you look at how yum works, and how $releasever is calculated, you 
> will find it's not centos/4.2/os/ it's looking at but centos/4/os/ ! At 
> release time all machines had a valid /4/ repository structure. So what 
> exactly were you looking for on the 4.2 symlink?
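
To make the quoted point concrete: yum substitutes $releasever and 
$basearch into each repository's baseurl, so the directory it requests 
depends on the value $releasever takes. The Python sketch below is not 
yum's code; the baseurl template and the value "4" are assumptions taken 
from the claim quoted above, and expand_baseurl is a hypothetical helper.

```python
from string import Template

# Assumed baseurl template in the style of a CentOS repo file.
BASEURL = "http://mirror.centos.org/centos/$releasever/os/$basearch/"


def expand_baseurl(template, releasever, basearch):
    """Substitute yum-style $variables into a baseurl template."""
    return Template(template).substitute(releasever=releasever, basearch=basearch)


# The directory requested if $releasever is "4", as the quoted message says:
print(expand_baseurl(BASEURL, "4", "i386"))
# The directory that would be requested if it were "4.2" instead:
print(expand_baseurl(BASEURL, "4.2", "i386"))
```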

Look, the files were simply not there on at least 2 servers. No amount of 
weaseling on your part will change that fact and I don't have the time or 
the inclination to debate this with you further.

Either fix the problem for future releases or don't, but don't waste 
everyone's time trying to claim there was no problem. You guys fubar'd 
the management of this release, but that's the past and you can remedy the 
damage your reputation suffered as a result if you address the issue. If 
you prefer to argue and claim there was no problem, it's your reputation 
you're harming.

> Also, mirror.centos.org has NO ftp access, so once again , where were 
> you looking ?

mirror.centos.org is not a single system. You do not operate all of the 
mirrors - I'd be surprised if the CentOS team directly manages more than a 
couple of them. Those who do operate them set the protocol support 
policies and oddly enough most large mirrors offer FTP access.

> > Even better you could seek to amend yum so that version upgrades don't 
> > happen as the result of a simple "yum update" without gaining specific 
> > consent from the operator of the system. Version upgrades are significant 
> > events - particularly on an Enterprise system - and should always require 
> > the operator of the system to give prior consideration to major changes.
> 
> errr.. 4.2 is not a Version update. It's just EL4 with Update2 rolled in. 
> Exactly how RHEL works ( they call it 4U2 ). Redhat document the 
> Quarterly(ish) update policy on their website, maybe you could scan 
> through that for more info on exactly how this works.

Oh please stop trying to claim that nothing went wrong.

It certainly was a version upgrade - hence the reason it went from version 
4.1 to 4.2. It included significant changes which altered the way packages 
worked, required the installation of extra packages to support the same 
packages already on the system and broke a major system component (mysql 
as detailed above).

Assuming that you simply recompiled the SRPM for mysql, the problem was 
caused by Red Hat. That's not the fault of the CentOS team, then, but it 
does reflect on your distribution's claims to be enterprise ready.

> A move from CentOS3 to CentOS4 might be considered a version upgrade.

It might. It might also be called a major version upgrade. 

It doesn't change the fact that changing from *version 4.1* to *version 
4.2* is also a version upgrade and also made significant changes.

The fact of the matter is that there were significant changes including 
the addition of new software and a change to the behaviour of mysql. These 
deserved proper planning in any enterprise environment and my suggestion 
was made to offer a way you could improve CentOS. Take it or don't, I 
lose nothing either way.

> I wonder if there was a caching proxy sitting between that machine and 
> the internet causing issues ? (just a guess.. )

No. It's a straightforward connection from our colo facility. There are 
no proxy servers in the way.

Jason Clifford
-- 
UKFSN.ORG		     Finance Free Software while you surf the 'net
http://www.ukfsn.org/	       2Mb ADSL Broadband from just £14.98 / month 
http://www.linuxadsl.co.uk/	     ADSL Routers from just £21.98
