dw2

3 January 2011

Some memorable alarm bugs I have known

Filed under: Apple, Psion, usability — David Wood @ 12:24 pm

Here’s how the BBC website broke the news:

iPhone alarms hit by New Year glitch

A glitch on Apple’s iPhone has stopped its built-in alarm clock going off, leaving many people oversleeping on the first two days of the New Year.

Angry bloggers and tweeters complained that they had been late for work, and were risking missing planes and trains.

My first reaction was incredulity.  How could a first-class software engineering company like Apple get such basic functionality wrong?

I remember being carefully instructed, during my early days as a young software engineer with PDA pioneer Psion, that alarms were paramount.  Whatever else your mobile device might be doing at the time – however busy or full or distracted it might be – alarms had to go off when they became due.  Users were depending on them!

For example, even if the battery was too low, when the time came, to power the audio clip that a user had selected for an alarm, Psion’s EPOC operating system would default to a rasping sound that could be played with less voltage, but which was still loud enough that the user would notice.

Further, the startup sequence of a Psion device would take care to pre-allocate sufficient resources for an alarm notifier – both in the alarm server, and in the window server that would display the alarm.  There must be no risk of running out of memory and, therefore, not being able to operate the alarm.

However, as I thought more, I remembered various alarm bugs in Psion devices.

Note: I’ve probably remembered some of the following details wrong – but I think the main gist of the stories is correct.

Insisting on sounding ALL the alarms

The first was from before I started at Psion, but was a legend that was often discussed. It applied to the alarm functionality in the Psion Organiser II.

On that device, all alarms were held in a queue, and for each alarm, there was a record of whether it had been sounded.  When the device powered up, one of the first things it would do was to check that queue for the first alarm that had not been sounded.  If it was overdue, it should be sounded immediately.  Once that alarm was acknowledged by the user, the same process should be repeated – find the next alarm that had not been sounded…

But the snag in this system became clear when the user manually advanced the time on the device (for example, on changing timezone, or, more dramatically, restoring the correct time after a system restart).  If a user had set a number of alarms, the device would insist on playing them all, one by one.  The user had no escape!
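
The behaviour described above can be sketched in a few lines of Python. This is only an illustration of the logic as recalled in the story, not the Organiser II code: the Alarm class, the alarms_to_sound function and the minute-based clock are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    due: int              # minutes since an arbitrary starting point (assumed unit)
    text: str
    sounded: bool = False


def alarms_to_sound(queue, now):
    """Return, in order, every alarm the power-up check would insist on sounding."""
    played = []
    while True:
        # Find the first alarm that has not yet been sounded.
        pending = next(
            (a for a in sorted(queue, key=lambda a: a.due) if not a.sounded),
            None,
        )
        if pending is None or pending.due > now:
            break
        pending.sounded = True        # the user acknowledges it...
        played.append(pending.text)   # ...and the search simply starts again
    return played


# Thirty daily alarms, then the clock is advanced by a month in one step.
queue = [Alarm(due=day * 1440 + 420, text=f"Wake up, day {day}") for day in range(1, 31)]
backlog = alarms_to_sound(queue, now=31 * 1440)
print(len(backlog), "alarms to acknowledge, one by one")
```

Nothing in the loop asks whether an alarm that is weeks overdue is still worth sounding, which is exactly why the user had no escape.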

Buffer overflow (part one)

The next story on my list came to a head on a date something like the 13th of September 1989.  The date is significant – it was the first Wednesday (the day with the longest name) with a two-digit day-in-month in September (the month with the longest name).  You can probably guess how this story ends.

At that time, Psion engineers were creating the MC400 laptop – a device that was in many ways ahead of its time.  (You can see some screenshots here – though none of these shots feature the MC Alarms application.  My contribution to that software, by the way, included the Text Processor application, as well as significant parts of the UI framework.)

On the day in question, several of the prototype MC400 devices stopped working.  They’d all been working fine over the previous month or so.  Eventually we spotted the pattern – they all had alarms due, but the text for the date overflowed the pre-allocated memory storage that had been set aside to compose that text as it was displayed on the screen.  Woops.
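
To see why that particular day was the trigger, here is a small Python sketch that searches for the first date whose text reaches the longest possible length. The layout "Wednesday 13 September" is an assumption made for the example; the real format of the MC Alarms text is not something I can reconstruct, and the function names are hypothetical.

```python
from datetime import date, timedelta

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def date_text(d):
    # Assumed layout: day name, day-in-month, month name.
    return f"{DAYS[d.weekday()]} {d.day} {MONTHS[d.month - 1]}"


# Longest text this layout can ever produce:
# longest day name + space + two digits + space + longest month name.
WORST_CASE = len(max(DAYS, key=len)) + 1 + 2 + 1 + len(max(MONTHS, key=len))


def first_worst_case_on_or_after(start):
    """Walk forward a day at a time until the text hits the worst-case length."""
    d = start
    while len(date_text(d)) < WORST_CASE:
        d += timedelta(days=1)
    return d


trigger = first_worst_case_on_or_after(date(1989, 1, 1))
print(trigger, "->", date_text(trigger), f"({WORST_CASE} characters)")
```

A buffer sized by trying out a handful of ordinary dates would pass every test until that one day arrived; sizing it from the worst case, as WORST_CASE does here, avoids the surprise.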

“The kind of bug that other operating systems can only dream about”

Some time around 1991 I made a rash statement, which entered into Psion’s in-house listing of ill-guarded comments: “This is the kind of bug that other operating systems can only dream about”.  It was another alarms bug – this time in the Psion Series 3 software system.

It arose when the user had an Agenda file on a memory card (these cards were known, at the time, as SSDs – Solid State Disks), but had temporarily removed the card.  When the time came to sound an alarm from the Agenda, the alarm server requested the Agenda application to tell it when the next Agenda alarm would be due.  This required the Agenda application to read data from the memory card.  Because the file was already marked as “open”, the File Server in the operating system tried to display a low-level message on the screen – similar to the “Abort, Retry, Ignore?” message that users of MS-DOS might remember.  This required action from the Window Server, but the Window Server was temporarily locked, waiting for a reply from the Alarm Server.  The Alarm Server was in turn locked, waiting for the File Server – which, alas, was waiting (as previously mentioned) for the Window Server.  Deadlock.

Well, that’s as much as I can recall at the moment, but I do remember it being said at the time that the deadlock chain actually involved five interconnecting servers, so I may have forgotten some of the subtleties.  Either way, the result was that the entire device would freeze.  The only sign of life was that the operating system would still emit keyclicks when the user pressed keys – but the Window Server was unable to process these keys.
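
The three-way wait I do remember can be written down as a "waits for" table and checked for a cycle. The Python below is only a sketch of that idea: find_wait_cycle is a hypothetical helper, and the table covers just the three servers named above, not the five that were reportedly involved.

```python
def find_wait_cycle(waits_for, start):
    """Follow 'X is waiting for Y' links from start; return the cycle, or None."""
    path = []
    current = start
    while current is not None and current not in path:
        path.append(current)
        current = waits_for.get(current)
    if current is None:
        return None                     # the chain ended: somebody can make progress
    # The chain came back to a server already on the path: that loop is the deadlock.
    return path[path.index(current):] + [current]


waits_for = {
    "Window Server": "Alarm Server",
    "Alarm Server": "File Server",
    "File Server": "Window Server",
}

cycle = find_wait_cycle(waits_for, "File Server")
print(" -> ".join(cycle))
```

Each individual wait is reasonable on its own; it is only when the links are laid end to end that the loop becomes visible.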

In practice, this bug would tend to strike unsuspecting users who had opened an SSD door at the time the alarm happened to be due – even the SSD door on the other side of the device (an SSD could be inserted on each side).  The hardware was unable to read from one SSD, even if it was still in place, if the other door happened to be open.  As you can imagine, this defect took some considerable time to track down.

“Death city Arizona”

At roughly the same time, an even worse alarms-related bug was uncovered.  In this case, the only way out was a cold reset, which lost all data in internal memory.  The recipe to obtain the bug went roughly as follows:

  • Supplement the built-in data of cities and countries, by defining a new city, which would be your home town
  • Observe that the operating system created a file “World.wld” somewhere on internal memory, containing the details of all the cities whose details you had added or edited
  • Find a way to delete that file
  • Restart the device.

In those days of limited memory, every extra server was viewed as an overhead to be avoided if possible.  For this reason, the Alarm Server and the World Server coexisted inside a single process, sharing as many resources as possible.  The Alarm Server managed the queue of alarms, from all different applications, and the World Server looked after access to the set of information about cities and countries.  For fast access during system startup, the World Server stored some information about the current home city.  But if the full information about the home city couldn’t be retrieved (because, for example, the user had deleted the World.wld file), the server went into a tailspin, and crashed.  The lower level operating system, noticing that a critical resource had terminated, helpfully restarted it – with identical conclusions.  Result: the lower priority applications and servers never had a chance to start up.  The user was left staring at a blank screen.
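
The restart loop can be simulated in Python as follows. Everything here is a stand-in: start_world_server, boot and the dictionary used as a file system are hypothetical, and the max_restarts limit exists only so that the simulation terminates (the real device, as described, had no such limit).

```python
class HomeCityMissing(Exception):
    """Raised when the cached home city cannot be found in the user's city file."""


def start_world_server(files, home_city):
    # Hypothetical startup step: look up full details for the cached home city.
    cities = files.get("World.wld")
    if cities is None or home_city not in cities:
        raise HomeCityMissing(home_city)
    return cities[home_city]


def boot(files, home_city, max_restarts=5):
    """Restart the critical server whenever it dies, as the lower-level OS did."""
    for attempt in range(1, max_restarts + 1):
        try:
            start_world_server(files, home_city)
            return f"started after {attempt} attempt(s)"
        except HomeCityMissing:
            continue   # nothing has changed, so the next attempt fails identically
    return "blank screen"


print(boot({"World.wld": {"Hometown": (51.5, -0.1)}}, "Hometown"))
print(boot({}, "Hometown"))
```

Restarting a failed component only helps when something is different the second time around; here the missing file is just as missing on every attempt.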

Buffer overflow (part two)

The software that composed the text to appear on the screen, when an alarm sounded, used the EPOC equivalent of “print with formatting”.  For example, a “%d” in the text would be replaced by a numerical value, depending on other parameters passed to the function.  Here, the ‘%’ character has a special meaning.

But what if the text supplied by the user itself contains a ‘%’ character?  For example, the alarm text might be “Revision should be 50% complete by today”.  Well, in at least some circumstances, the software went looking for another parameter passed to it, where none existed.  As you can imagine, all sorts of unintended consequences could result – including memory overflows.
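
Python's own "%" formatting makes a convenient stand-in for showing the mistake and one common fix. This is an analogy rather than the EPOC code: compose_unsafe and compose_safe are hypothetical names, and Python reports the missing parameter as an error instead of reading whatever happens to be in memory.

```python
def compose_unsafe(hour, minute, user_text):
    # The user's text becomes part of the format string,
    # so any '%' inside it is interpreted as a formatting directive.
    return ("%02d:%02d " + user_text) % (hour, minute)


def compose_safe(hour, minute, user_text):
    # The user's text is passed as a parameter, so it is copied through untouched.
    return "%02d:%02d %s" % (hour, minute, user_text)


text = "Revision should be 50% complete by today"

print(compose_safe(9, 30, text))
try:
    compose_unsafe(9, 30, text)
except (TypeError, ValueError) as err:
    print("unsafe version failed:", err)
```

The general rule is that text supplied by the user should only ever be a parameter to the formatting function, never part of the format itself.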

Alarms not sounding!

Thankfully, the bugs above were all caught by in-house testing, before the device in question was released to customers.  We had a strong culture of fierce internal testing.  The last one, however, did make it into the outside world.  It impacted users who had the temerity to do the following:

  • Enter a new alarm in their Agenda
  • Switch the device off, before it had sufficient time to complete all its processing of which alarm would be the next to sound.

This problem hit users who accumulated a lot of data in their Agenda files.  In such cases, the operating system could take a non-negligible amount of time to reliably figure out what the next alarm would be.  So the user had a chance to power down the device before it had completed this calculation.  Given the EPOC focus on keeping the device in a low-power state as much as possible, the “Off” instruction was heeded quickly – too quickly in this case.  If the device had nothing else to do before that alarm was due, and if the user didn’t switch on the device for some other reason in the meantime, it wouldn’t get the chance to work out that it should be sounding that alarm.
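
The race can be modelled with a toy Device class in Python. The class, its fields and the two switch-off methods are all hypothetical, and the second method is simply one plausible remedy (finish the pending calculation before powering down); I am not claiming it is how the problem was eventually fixed.

```python
class Device:
    def __init__(self):
        self.entries = []        # alarm times entered by the user (minutes, assumed unit)
        self.wake_time = None    # the time the device is currently set to wake at
        self.dirty = False       # True while the next alarm still needs recalculating

    def add_alarm(self, when):
        self.entries.append(when)
        self.dirty = True        # the slow recalculation is only scheduled here

    def recalculate(self):
        self.wake_time = min(self.entries) if self.entries else None
        self.dirty = False

    def switch_off_immediately(self):
        # Heeds "Off" at once: whatever wake time was set before is all there is.
        return self.wake_time

    def switch_off_after_pending_work(self):
        # Completes the outstanding calculation first, then powers down.
        if self.dirty:
            self.recalculate()
        return self.wake_time


hasty, careful = Device(), Device()
hasty.add_alarm(480)
careful.add_alarm(480)
print("hasty wakes at:", hasty.switch_off_immediately())
print("careful wakes at:", careful.switch_off_after_pending_work())
```

The trade-off is the one mentioned above: the careful version keeps the device powered for a little longer after the user has asked it to switch off.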

Final thoughts re iPhone alarms

Psion put a great deal of thought into alarms:

  • How to implement them efficiently
  • How to ensure that users never missed alarms
  • How to provide the user with a great alarm experience.

For example, when an alarm becomes due on a Psion device, the sound starts quietly, and gradually gets louder.  If the user fails to acknowledge the alarm, the entire sequence repeats, after about one minute, then after about three minutes, and so on.  When the user does acknowledge the alarm, they have the option to stop it, silence it, or snooze it.  Pressing the snooze button adds another five minutes to the time before the alarm will sound again.  Pressing it three times, therefore, adds 15 minutes, and so on.  (And as a touch of grace: if you press the snooze button enough times, it emits a short click, and resets the time delay to five minutes – useful for sleepyheads who are too tired to take a proper look at the device, but who have enough of a desire to monitor the length of the snooze!)
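
The snooze rule is simple enough to sketch in Python. The five-minute step comes from the description above; the point at which the delay wraps round is not something I have stated, so the wrap_after value of 60 minutes below is purely an assumption, as are the class and method names.

```python
class Snooze:
    STEP = 5  # minutes added by each press of the snooze button

    def __init__(self, wrap_after=60):
        # wrap_after is an assumed ceiling, chosen only for illustration.
        self.wrap_after = wrap_after
        self.minutes = 0

    def press(self):
        """Return (snooze minutes, clicked) after one press of the snooze button."""
        if self.minutes + self.STEP > self.wrap_after:
            self.minutes = self.STEP
            return self.minutes, True    # short click: back to five minutes
        self.minutes += self.STEP
        return self.minutes, False


snooze = Snooze()
for _ in range(3):
    state = snooze.press()
print(state)
```

Because the click is the only feedback needed, the user can keep pressing with eyes closed until they hear it, and then count presses from a known starting point.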

So it’s surprising to me that Apple, with its famous focus on user experience, seem to have given comparatively little thought to the alarms on the iPhone.  When my wife started using an iPhone in the middle of last year, she found much in it to enchant her – but the alarms were far from delightful.  It seems that the default alarms sound only once, with a rather pathetic little noise that is easy to miss.  And when we looked, we couldn’t find options to change this behaviour.  I guess the iPhone team has other things on its mind!

9 Comments »

  1. The Psion devices were the best alarm clocks I ever owned. In fact, my S3 had an extended life by about 5 years on the strength of that functionality alone. I awoke to Psion alarms from about 1993 to 2003 and so reliable were they that I remember only ever having one failure- and I remember it because it was so exceptional.

    In every alarm clock I’ve used since I’ve lamented the lack of thought of the snooze functionality. Seriously.

    Looking back on it, having solid and reliable alarm functionality really reinforced the feeling that it was a stable device that lasted weeks on a set of batteries and never, ever crashed.

    Comment by Neil Brewitt — 3 January 2011 @ 12:49 pm

    • Hi Neil,

      I awoke to Psion alarms from about 1993 to 2003…

      I have awoken to Psion alarms from 1988 to 2011 (so far) 🙂

      …and so reliable were they that I remember only ever having one failure- and I remember it because it was so exceptional

      Sorry to hear about the failure!

      Comment by David Wood — 3 January 2011 @ 1:31 pm

  2. I had this discussion with a friend earlier today who insisted that alarm functionality on a device was easy and it should be impossible to “get it wrong”…

    Throughout the years working on phones, I came across all sorts of alarm issues. Memory leaks every time an alarm went off. Strange behaviour if the phone happened to be turned off when an alarm was due to go off. Even stranger behaviour if you changed timezone on the device with an alarm set.

    And then there was a perennial debate about whether an alarm should make a noise if you have the phone set to silent. Having a silent alarm isn’t a great deal of use – but equally if you’ve turned your phone into silent mode, the last thing you want it to do is make unexpected noise…

    Despite having spent years working with mobile devices, I must confess that I never rely on my phone as an alarm clock. If I’m at home, I have a clock radio and if I’m in a hotel I always use the alarm clock provided or book an alarm call…

    Comment by Dan McNeil — 3 January 2011 @ 2:07 pm

  3. @Dan McNeil, very good questions, though I think one reasonable answer to the alarm issues you raise is this: “check with the user”.

    – when they change time zone / home city, show them at that point what your default action will be re- any upcoming alarms – “(Alarm at 8pm will now be 8pm Paris time)” – and let them change it if it’s wrong (“no, I wanted it UK time / in four hours exactly”).

    – when they set “silent”, then if there’s an alarm upcoming in some reasonable timeframe, again, show them your default action (eg “it’ll appear on-screen but be silent, no vibrate”) and let them override it: “no, ring too!” (it’s my “I have to leave the meeting” alarm) or “ok don’t ring, but vibrate!” (“it’s in my pocket, no-one will know!”)

    In this biz we get a bit too obsessed IMHO with the code doing the “right thing” automatically. But often there’s no “one right thing”. Users don’t mind in the slightest being asked, in such situations – indeed, they *like* it. It shows the device being intelligent about what it can and can’t know. All we have to do is think through the various user scenarios, use the likeliest defaults, and cover the other situations by informing the user and letting them change things.

    Most tech products have rather too little UI, ISTM, because of too little thought-through UX. Eg when you set an iPhone alarm it really would be nice if, like Psions did and Androids do, it would say “Alarm will ring in 8hrs 30mins” (say), to confirm you got the setting right.

    PS, iPhones don’t turn themselves on to ring alarms, if the iphone is properly powered off – come on you hardware guys! David Tupman, you can do it! 😉

    PPS The idea of making a soft beep when the Psion spacebar-snooze “wrapped round” back to 5 minutes was: it’s dark, the alarm rings, you figure you can survive a 15-minute snooze (say), you tap the spacebar a few times, but with outstretched arm, it’s easy to be unsure as to whether a tap registered, or (conversely) you got a double-tap. If you have to turn the light (or backlight) on, and look at the screen, you’re moving a long way from sleep towards waking up, ie no snooze possible. The first time (with lights on) you do this and wrap round and encounter the beep, you hopefully realise that in future you can just tap spacebar a few more times, from under the duvet, eyes closed and lights out, and get back to the “5 minute” state and then set the snooze you wanted. Eyes closed, index finger only, minimal brain exertion.

    Now that “solution” might or might not be so great, but that’s the level of UX we tried to design for at Psion. (Though in tomorrow’s post, I’ll tell you how hard it was, IIRC, to persuade the techies to add the one line of code to make that soft beep… ;^)

    Comment by Nick Healey — 4 January 2011 @ 11:31 am

  4. Re: Insisting on sounding ALL the alarms

    I recently had the exact same problem with my Apple iPod Touch which kept losing all battery charge. Each time it was revived, the clock would be reset to January 1st 1970. When resetting to the correct time, many “recent” calendar reminders (perhaps a month’s worth, I never checked) would appear and have to be dismissed before the device could be used for anything else.

    Comment by Trevor Raynsford — 12 January 2011 @ 12:42 am

    • Hi Trevor,

      Interesting story!

      many “recent” calendar reminders (perhaps a month’s worth, I never checked) would appear and have to be dismissed before the device could be used for anything else

      You may have been lucky that it was only a month’s worth…

      Comment by David Wood — 12 January 2011 @ 7:52 am

  5. I appreciate and I totally agree !!!
    That’s one of the reasons why, even having an iPhone 4s 64gb, I keep being totally dependent on a MC218 (Ericsson improved version of a Psion machine) !!!
    Pls (!) kindly keep me informed asap if you discover an iPhone app that looks as a decent replacement for the Psion Agenda and Contacts …
    Best regards from Rome, Italy
    🙂
    Giorgio

    Comment by Giorgio — 23 June 2012 @ 7:22 am

    • Hi Giorgio,

      I’m impressed by the Contacts app on the (Android) Samsung Galaxy Note that acts as my smartphone these days. On the other hand, I’m still an avid user of the Psion 5mx Agenda app. I accept the overhead of carrying not one, but two devices with me wherever I go: the Samsung Galaxy Note and the Psion Series 5mx. The extra burden of an additional device is one that I’m more than happy to embrace.

      Comment by David Wood — 23 June 2012 @ 8:12 am

      • Hi David,

        thanks for your prompt answer !
        🙂
        I also …. almost (!) happily, accept the burden of always carrying the iPhone and the MC 218, but I’m quite unhappy that the two relevant “Contact” databases are … never “aligned” …
        In fact the one I keep relying upon, the “Psion” one, is far better than the iPhone’s one … but I don’t know a decent way to transfer it to the latter one …
        :-((
        PS: this thread made me remember how excited I was, while visiting the Psion plant in Greenford, guided by the kindest Dan Dhanjal (Quality Manager) ….

        Comment by Giorgio — 23 June 2012 @ 3:04 pm

