[Gllug] Modern Fault finding techniques
James Hawtin
oolon at ankh.org
Fri Nov 18 12:23:43 UTC 2011
Aaron Trevena wrote:
>
> That's a good open-ended discussion question in interviews and
> appropriate for both developers and sysadmins.
>
>
I didn't have a problem with the question; it seems a very real-world one.
> If it's not an app dev or app support role I wouldn't expect somebody
> to talk through troubleshooting an app aside from some basic ballpark
> stuff, but I would expect a sysadmin to be able to help isolate the
> problem so that the right team are (re-)assigned the trouble ticket.
>
>
And I did isolate it: the problem was a table in a database that was
running slow because it was very big. However, I isolated it using
commands I knew would be available on any Unix system, as I could not
make assumptions about any monitoring tools available.
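
The post does not say which commands were used, so the following is only
a rough sketch of a tool-independent, host-level first pass, written with
nothing but Python's standard library. The function names (first_pass,
snapshot) and the thresholds are hypothetical and chosen for illustration;
they are not taken from the thread.

```python
import os
import shutil


def first_pass(load1, cpus, disk_used_fraction, load_factor=1.0, disk_limit=0.9):
    """Return coarse hints from basic host figures.

    The thresholds are illustrative assumptions, not recommendations.
    """
    hints = []
    if load1 > cpus * load_factor:
        hints.append("1-minute load exceeds CPU count: check run queue and I/O wait")
    if disk_used_fraction > disk_limit:
        hints.append("filesystem nearly full")
    return hints


def snapshot(path="/"):
    """Collect the figures first_pass() needs from the local Unix host."""
    load1, _, _ = os.getloadavg()      # 1-, 5- and 15-minute load averages
    usage = shutil.disk_usage(path)    # total, used and free bytes for path
    return load1, os.cpu_count() or 1, usage.used / usage.total


if __name__ == "__main__":
    hints = first_pass(*snapshot())
    if not hints:
        hints = ["nothing obvious at host level: look at the application or database"]
    for hint in hints:
        print(hint)
```

If the host itself looks healthy, as in the slow-table case described
above, an empty result is the cue to move on to the application or the
database.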
> There are plenty of system level things that could be problematic that
> a sysadmin would be in a better position to spot : heavy IO, flakey
> network connections to other systems the application uses - not just
> databases - there could be nosql applications, memcached, mogilefs etc
> that are causing problems, some clustering apps support multiple
> fallback handling so the app won't see errors, but under the cover a
> web page request could be making multiple attempts to reach a resource
> on a different load balanced machine, there are also fun things like
> the number of apache processes rapidly increasing because part of the
> application is waiting for a slow database query or an overloaded
> resource elsewhere.
>
That was not my question. I know what might be the problem; what I want
to know is what "modern/better" fault-finding techniques people use.
> The thing is, system administration can and should be more than just
> making sure the operating system on a given machine is running
> smoothly without any applications running - and system administrators
> are part of a team, working with developers and support and hopefully
> providing tools or knowledge that enables them to resolve or prevent
> problems - not just putting their hands up and saying "oh that's a
> software/hardware/application/support problem - not my area.. I'm off
> on a fag break" ;)
>
What makes you think I don't? I'm just trying to find out what other
people would do, to improve myself.
James
--
Gllug mailing list - Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug