EmailDiscussions.com  


FastMail Forum All posts relating to FastMail.FM should go here: suggestions, comments, requests for help, complaints, technical issues etc.

Old 9 May 2003, 09:02 PM   #1
leisuresuitlary
Senior Member
 
Join Date: Mar 2003
Location: UK
Posts: 168
Question Mechanics of a Server Outage

I do not have the greatest understanding of IT so if this is an obvious question forgive me!

Why am I seeing so many server outages on FM? FM offers a great service but the number of outages is an unsatisfactory point.

I do not seem to remember any before the accounts were transferred to the new servers a few months ago (I am certainly not saying this is the reason, but making an observation based on my own experience).

I remember two yesterday and now there is another.

If I understood why this happens it might alleviate the urge to beat my head on the desk which I currently have.

Would someone be kind enough to explain in non-technojargon why this happens, why it seems to happen quite often (user in UK and N Europe) and what might be done to reduce the problem.

Larry

Old 9 May 2003, 10:39 PM   #2
ukhobbes
Junior Member
 
Join Date: Oct 2002
Location: Dundee, Scotland
Posts: 25
Re: Mechanics of a Server Outage

Quote:
Originally posted by leisuresuitlary
I do not have the greatest understanding of IT so if this is an obvious question forgive me!
You are forgiven. Go in peace..
Quote:
Why am I seeing so many server outages on FM? FM offers a great service but the number of outages is an unsatisfactory point.
In what way are they unsatisfactory? Do you know of another email service that has 100% uptime, or with the same level of communication and support..? If you look at the stats, the uptime of FM is more than satisfactory.
Quote:
If I understood why this happens it might alleviate the urge to beat my head on the desk which I currently have.
I wonder if your urge to beat your head on the desk is indicative of a deeper mental issue?
Quote:
Would someone be kind enough to explain in non-technojargon why this happens, why it seems to happen quite often (user in UK and N Europe)
Because we live in a world where machines break down and people make mistakes... yes, even Jeremy Almighty...
Quote:
...what might be done to reduce the problem.
Make machines that never break, and make people who never make mistakes...

You're welcome,
Richard

Old 9 May 2003, 10:44 PM   #3
Si1
Cornerstone of the Community
 
Join Date: Apr 2002
Location: UK
Posts: 590
Re: Re: Mechanics of a Server Outage

Quote:
Originally posted by ukhobbes
In what way are they unsatisfactory? Do you know of another email service that has 100% uptime, or with the same level of communication and support..? If you look at the stats, the uptime of FM is more than satisfactory.
There have been a few short periods of downtime recently, and this could be disconcerting, understandably, for a new user who has not had the benefit of seeing the overall reliability of Fastmail over a longer period of time.
Old 9 May 2003, 10:49 PM   #4
ukhobbes
Junior Member
 
Join Date: Oct 2002
Location: Dundee, Scotland
Posts: 25
Re: Re: Re: Mechanics of a Server Outage

Quote:
Originally posted by Si1
There have been a few short periods of downtime recently, and this could be disconcerting, understandably, for a new user who has not had the benefit of seeing the overall reliability of Fastmail over a longer period of time.
Yes, that's true. But, not true of the gentleman who began the thread, it seems.

BTW, my reply was kind of tongue-in-cheek, and shouldn't be taken too seriously...

Richard
Old 9 May 2003, 11:10 PM   #5
ukhobbes
Junior Member
 
Join Date: Oct 2002
Location: Dundee, Scotland
Posts: 25
A reply devoid of humour

Quote:
Originally posted by leisuresuitlary

Would someone be kind enough to explain in non-technojargon why this happens, why it seems to happen quite often (user in UK and N Europe) and what might be done to reduce the problem.

Larry
OK, here's a humour-cleansed version of my reply:

My point about hardware failure still stands. Hard drives, fans, etc. all have a limited lifespan and, although the manufacturer can give an *average* lifespan, the average does not mean the device will not fail 5 minutes after installation. I believe MailSnare had a hard drive failure not so long ago. It's the way things are... Buying good quality stuff, ensuring the environment is appropriate, etc. helps to reduce the failure rate.

Also, every part of the hardware and software was designed and written by humans. If software is written by other humans, we will not necessarily know where the imperfections are. We will only find out when the software crashes or behaves in a way that was not predicted. The only way this can be improved is by good design, rigorous testing, and fixing of the bugs.

I'm afraid that your view that downtime happens "quite often" is a perception. Whether 3 faults occur within 3 days or are spread out over a month, there were still 3 faults. We cannot predict when faults will occur. Faults being bunched together does not *necessarily* mean that there is a deeper problem (although that may be the case). It may simply be a case of Bad Luck(tm).
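
To put a rough number on the Bad Luck(tm) point, here is a minimal sketch in Python. It assumes 3 faults that are independent and equally likely to happen at any moment in a 30-day month, which is a simplification and not a measurement of FM's actual outages. The function name and its parameters are made up for illustration. It estimates how often at least two of those faults land within 3 days of each other purely by chance.

```python
import random


def chance_of_bunching(n_faults=3, period_days=30.0, window_days=3.0,
                       trials=100_000, seed=0):
    """Estimate how often at least two of n_faults independent, randomly
    timed faults fall within window_days of each other (hypothetical model)."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    bunched = 0
    for _ in range(trials):
        # Drop the faults at random moments in the period, then sort them.
        times = sorted(rng.uniform(0, period_days) for _ in range(n_faults))
        # Gaps between consecutive faults.
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        if min(gaps) <= window_days:
            bunched += 1
    return bunched / trials


if __name__ == "__main__":
    print(f"Chance of at least two faults within 3 days: {chance_of_bunching():.2f}")
```

Under those assumptions the estimate comes out at a little under one half, so seeing two outages close together is not, on its own, strong evidence of a deeper problem.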

Richard
Old 10 May 2003, 12:04 AM   #6
mrg
Member
 
Join Date: Aug 2002
Posts: 45
Also, more often than not I only become aware of an outage when I read about it on the forum, and I check my email a lot.

I'd guess that just about every small hiccup gets reported publicly here, together with the FM team's response. As long as the response is fast and appropriate I wouldn't worry too much. With a lot of providers you never know about the outages unless you discover them yourself, and even then you don't always know if the problem was at your side, or at the server.
Old 10 May 2003, 12:28 AM   #7
leisuresuitlary
Senior Member
 
Join Date: Mar 2003
Location: UK
Posts: 168
Richard

I found your humorous reply to my post very amusing, I was barely able to contain myself.

I do not know of another e-mail service with the same level of communication and support as offered by FM. As for 100% uptime I wouldn’t have a clue, but if you know what the uptime of other mail providers is then would you tell me as it would be beneficial to have stats rather than a perception-based reply, which as you pointed out does not really cut the mustard.

I think that the reason I wished to beat my head on the desk was more due to frustration at not being able to access my mail rather than due to a deep rooted psychological problem, but if the latter really is the case then my perception may indeed be impaired.

I completely accept your point that machines break at unpredictable times and that human beings are not perfect.

However, I do not feel that this explains why I used the most common e-mail provider for several years and never encountered an ‘outage’, whereas there have been quite a number in FM recently. I would also have mentioned outages I have seen before this time, but to be honest I can't remember exactly how many there were or when they occurred. I can tell you there have been others, though, and of the web sites that I regularly use I feel that FM is down more than other major sites.

I think that your comment about up-time should be clarified. There are two types of downtime; planned and unplanned. To say that the total uptime is satisfactory may be true in terms of the total time. However if the downtime is made up of short bursts of unplanned outage this is more disruptive to the user and more costly to the company than planned downtime of which users are notified in advance.

I think that reducing the frequency of these short outages should take priority over the numbers published by the bean-counters about total outage time. If FM's total outage time is really low, the possibility that this is because problems are cured after they occur rather than before the event should be entertained. I for one would be willing to accept more frequent planned outages if this allowed unplanned downtime to be reduced.

Larry
Old 10 May 2003, 12:49 AM   #8
David
Ultimate Contributor
 
Join Date: Dec 2001
Location: Canada.
Posts: 10,355
Having worked for more than twenty years on computerized building control systems I would like to add my two cents worth to this thread. Computerized systems for all applications are similar in a way; they all rely on a central computer (or a number of central computers) to get the job done.

In the controls industry, systems that are deemed essential are always designed with built-in redundancy. This redundancy usually consists of a number of standby (or slave) machines that are "always" communicating with a master machine. Any computer that can take down a whole network will be configured to have some kind of backup. Data between machines on a master/slave network is synchronized 24/7. A smart box will switch machines if the online machine fails (for any reason) and an alarm will be sent out. Before a new system is put into service, redundancy is extensively tested; this happens when a system is first commissioned and on a monthly basis after that.

Yes, hardware can break down, but one failure (on one machine) does not have to take down a whole network of machines. Building control systems and email systems are similar in a way. They are a bunch of computers that are networked together and talk to the real world.

So, how does this work in a real world application? Well, it works very well. I frequently make calls to buildings where central computers have failed, but systems will still be online and working and will have seen no interruption of service. Having said this, I would like to add that these systems are very expensive and it might not be possible to sell a system configured this way and remain competitive.

--david
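
For readers who find code easier to follow than prose, the master/standby arrangement described above can be sketched roughly like this in Python. It is a toy model under stated assumptions, not how any real building control or email system is implemented: the `Machine` and `FailoverPair` classes, their fields, and the alarm messages are all hypothetical names chosen for illustration.

```python
class Machine:
    """A hypothetical server that holds a copy of the data."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.data = {}


class FailoverPair:
    """Plays the role of the 'smart box': sends work to the online machine,
    keeps the standby synchronized, and switches over on failure."""

    def __init__(self, master, standby):
        self.online = master
        self.standby = standby
        self.alarms = []  # stands in for paging an engineer

    def _check(self):
        # Switch machines if the online one has failed, and raise an alarm.
        if not self.online.healthy:
            if not self.standby.healthy:
                raise RuntimeError("both machines are down")
            self.alarms.append(
                f"{self.online.name} failed; switched to {self.standby.name}"
            )
            self.online, self.standby = self.standby, self.online

    def write(self, key, value):
        self._check()
        self.online.data[key] = value
        if self.standby.healthy:
            self.standby.data[key] = value  # keep the standby in sync

    def read(self, key):
        self._check()
        return self.online.data[key]


if __name__ == "__main__":
    pair = FailoverPair(Machine("master"), Machine("standby"))
    pair.write("inbox", ["hello"])
    pair.online.healthy = False      # simulate a hardware failure
    print(pair.read("inbox"))        # still served, now by the standby
    print(pair.alarms)
```

The point of the sketch is only that a read after the simulated failure still succeeds, because the data was copied to the standby on every write. Real systems also have to deal with resynchronizing a repaired machine, which is left out here.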
Old 10 May 2003, 01:01 AM   #9
mcowger
Cornerstone of the Community
 
Join Date: Sep 2002
Location: SF, CA
Posts: 700
One thing I have noticed when people say they can't access Fastmail is that FM is often not really down, but unreachable from their network location.

The internet is HUGE, and vastly complex. The worldwide routing tables (which we carry in our router) are enormous. So, routes go down, and you can't access parts of the internet, including FM. So, that's one problem which I bet is common.

The other one, as others have mentioned, is software and hardware failure. From the notes that Rob, Jeremy, Onno, etc. post, I don't feel like FM has a lot of hardware failure. Maybe they do.

That leaves software. Yes, some of these problems can be mitigated by server redundancy. Here at my college, we have our mail server (iPlanet) clustered across 3 Suns to ensure availability, but that takes over $100,000 worth of equipment and knowledge of Sun Cluster software.

Basically, my point is this. I bet FM has fewer problems than most other providers given what they offer. You only don't see it with places like Hotmail because they have some ungodly number of servers load balanced behind an Arrowpoint or something, so you never see one go down. FM I am sure doesn't have that kind of money, so you will see a failure of software or hardware affect users.

What's my point?
1. These failures come in clusters, so we often don't see any for months
2. They aren't major in terms of data loss (yes, if you rely on it for business email it's important to YOU, of course!)
3. They really are short.
4. They aren't common
5. They are to be expected. If you want near 100% uptime, then you need to be going out and buying a Sun Cluster ($50K), an admin to support it ($50K/yr), and service contracts for those machines ($15K/yr).
Old 10 May 2003, 01:26 AM   #10
snsh
Cornerstone of the Community
 
Join Date: Dec 2002
Location: Boston
Posts: 611
My 2-cents: I think a lot of us would like FastMail to have 99.999% scheduled uptime, or 5 minutes downtime a year. As I understand it, "five nines" is the threshold demanded for most server applications.

Regarding Fastmail, FM's mail-queue and ISP seem to have a 99.999+% uptime -- incoming mail almost never bounces. However, outgoing mail has issues (spamcop), and server uptime seems more like 99.9% which is fine for government work, but too low for business power users.
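
The arithmetic behind those percentages is easy to check. The short Python sketch below converts an uptime percentage into allowed downtime per year. The function name is made up for illustration, and it assumes a 365-day year with no distinction between planned and unplanned downtime.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(uptime_percent):
    """Minutes of downtime per year permitted by a given uptime percentage."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


for pct in (99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(pct)
    print(f"{pct}% uptime allows {minutes:.1f} minutes ({minutes / 60:.2f} hours) per year")
```

So "five nines" works out to a little over 5 minutes a year, while 99.9% allows close to 9 hours a year.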

I'm not a computer reliability engineer, but my guess is that it's unrealistic to expect FM to maintain 99.999% uptime when there are frequent updates/upgrades to the software/hardware, and a quickly growing userbase. It's a tradeoff.

What can be done about this? The options I can think of are:
1) tolerate the 99.9% uptime
2) host critical accounts on a dedicated server
3) host noncritical accounts on a beta server
4) give in and pay spamcop the $1000 or whatever it takes to keep FM off their blacklist
Old 10 May 2003, 01:34 AM   #11
leisuresuitlary
Senior Member
 
Join Date: Mar 2003
Location: UK
Posts: 168
Thanks for the replies guys. They help me to understand the situation. As so often is the case frustration is caused through lack of knowledge.

For me at least these comments are good enough to explain the situation and what the solutions are, and to let everyone decide for themselves whether they are worth implementing.

Larry
Old 10 May 2003, 01:40 AM   #12
Edwin
 Administrator 
 
Join Date: Aug 2001
Location: UK
Posts: 3,118
Here's my $0.02 on the (infrequent - this must be stressed) outages.

A) People often start 5-6 threads on the same outage, not all of which necessarily get munged into a single thread if they start to veer off along different conversational directions. So you have to be very careful to distinguish e.g. 3 threads chatting about 1 outage, and 3 threads chatting about 3 separate outages.

B) What has happened a few times in the past (and I think that the recent outage is similar) is that something breaks that has never broken before! The Fastmail team rush to fix the problem, patch the bug or what-have-you... and they're generally on the job very quickly, I can tell you!

Unfortunately, as with most very complex software installations, there are all kinds of dependencies within the Fastmail code, and what can happen in practice is a fix in one part of the code exposes or creates a new vulnerability (in a kind of domino knock-on effect) that takes the server down again within a few hours.

So it's only when the FM tech folks have navigated their way through all the knock-on issues that the server will stabilize itself again for a period of weeks or months.

The above two observations, taken together, go a long way towards explaining why, on the rare occasion when Fastmail "fails", it either fails a number of times in relatively quick succession or appears to have done upon a casual perusal of the threads on this Forum.
Old 10 May 2003, 01:59 AM   #13
snsh
Cornerstone of the Community
 
Join Date: Dec 2002
Location: Boston
Posts: 611
E-

How about making a sticky FM outages thread?
Old 10 May 2003, 02:01 AM   #14
mcowger
Cornerstone of the Community
 
Join Date: Sep 2002
Location: SF, CA
Posts: 700
Quote:
Originally posted by snsh
E-

How about making a sticky FM outages thread?
I agree with the concept, but the moderators & Edwin would have to be super careful about keeping it on topic!
Old 10 May 2003, 02:13 AM   #15
Edwin
 Administrator 
 
Join Date: Aug 2001
Location: UK
Posts: 3,118
Quote:
Originally posted by snsh
E-

How about making a sticky FM outages thread?
That would IMO give 100% the wrong impression!

Most of the time, the most recent outages thread is buried about 5 pages into the site (because outages are infrequent) and that's how it should be.
All times are GMT +9. The time now is 04:32 AM.

 
