Here’s how the BBC website broke the news:
iPhone alarms hit by New Year glitch
A glitch on Apple’s iPhone has stopped its built-in alarm clock going off, leaving many people oversleeping on the first two days of the New Year.
Angry bloggers and tweeters complained that they had been late for work, and were risking missing planes and trains.
My first reaction was incredulity. How could a first-class software engineering company like Apple get such basic functionality wrong?
I remember being carefully instructed, during my early days as a young software engineer with PDA pioneer Psion, that alarms were paramount. Whatever else your mobile device might be doing at the time – however busy or full or distracted it might be – alarms had to go off when they became due. Users were depending on them!
For example, even if, when the time came, the battery was too low to play the audio clip that a user had selected for an alarm, Psion’s EPOC operating system would fall back to a rasping sound that could be played at a lower voltage, but which was still loud enough for the user to notice.
Further, the startup sequence of a Psion device would take care to pre-allocate sufficient resources for an alarm notifier – both in the alarm server, and in the window server that would display the alarm. There must be no risk of running out of memory and, therefore, not being able to operate the alarm.
However, as I thought more, I remembered various alarm bugs in Psion devices.
Note: I’ve probably remembered some of the following details wrong – but I think the main gist of the stories is correct.
Insisting on sounding ALL the alarms
The first was from before I started at Psion, but was a legend that was often discussed. It applied to the alarm functionality in the Psion Organiser II.
On that device, all alarms were held in a queue, and for each alarm, there was a record of whether it had been sounded. When the device powered up, one of the first things it would do was to check that queue for the first alarm that had not been sounded. If it was overdue, it should be sounded immediately. Once that alarm was acknowledged by the user, the same process should be repeated – find the next alarm that had not been sounded…
But the snag in this system became clear when the user manually advanced the time on the device (for example, on changing timezone, or, more dramatically, restoring the correct time after a system restart). If a user had set a number of alarms, the device would insist on playing them all, one by one. The user had no escape!
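The catch-up logic is easy to sketch. What follows is a minimal illustration in Python, not anything resembling the Organiser II's real code; the names `Alarm` and `overdue_alarms`, and the sample alarms, are invented for the purpose.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Alarm:
    due: datetime
    text: str
    sounded: bool = False


def overdue_alarms(queue, now):
    """Yield, in due order, each unsounded alarm that is overdue at `now`.

    Each alarm is marked as sounded as it is yielded, mimicking the
    'sound it, wait for acknowledgement, find the next one' loop.
    """
    for alarm in sorted(queue, key=lambda a: a.due):
        if not alarm.sounded and alarm.due <= now:
            alarm.sounded = True
            yield alarm


# Hypothetical alarms, invented for illustration.
queue = [
    Alarm(datetime(1988, 3, 1, 9, 0), "Dentist"),
    Alarm(datetime(1988, 3, 2, 9, 0), "Team meeting"),
    Alarm(datetime(1988, 3, 3, 9, 0), "Pay rent"),
]

# The clock is advanced by hand past every alarm: all three get played.
played = [a.text for a in overdue_alarms(queue, datetime(1988, 4, 1))]
print(played)
```

Nothing in the loop distinguishes "overdue by a minute" from "overdue because the clock just jumped forward a month", which is exactly why the user had to sit through every alarm.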
Buffer overflow (part one)
The next story on my list came to a head on a date something like the 13th of September 1989. The date is significant – it was the first Wednesday (the day with the longest name) with a two-digit day-in-month in September (the month with the longest name). You can probably guess how this story ends.
At that time, Psion engineers were creating the MC400 laptop – a device that was in many ways ahead of its time. (You can see some screenshots here – though none of these shots feature the MC Alarms application. My contribution to that software, by the way, included the Text Processor application, as well as significant parts of the UI framework.)
On the day in question, several of the prototype MC400 devices stopped working. They’d all been working fine over the previous month or so. Eventually we spotted the pattern – they all had alarms due, but the text for the date overflowed the pre-allocated memory storage that had been set aside to compose that text as it was displayed on the screen. Whoops.
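To see why the bug lay dormant for so long, here is a small Python sketch that searches 1989 for the first date whose text is too long for a fixed buffer. The date layout and the 26-character capacity are assumptions picked for illustration; I am not claiming they match the MC400's real format or buffer size.

```python
from datetime import date, timedelta

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

BUFFER_CAPACITY = 26  # hypothetical size, chosen only to illustrate


def format_date(d):
    """Compose date text such as 'Wednesday 13 September 1989'."""
    return f"{DAYS[d.weekday()]} {d.day} {MONTHS[d.month - 1]} {d.year}"


def first_overflow(start, days):
    """Return the first date whose text exceeds BUFFER_CAPACITY, or None."""
    for offset in range(days):
        d = start + timedelta(days=offset)
        if len(format_date(d)) > BUFFER_CAPACITY:
            return d
    return None


culprit = first_overflow(date(1989, 1, 1), 365)
print(culprit, len(format_date(culprit)))  # prints: 1989-09-13 27
```

With these assumed numbers, every date from January onwards fits, including Wednesday 6 September, and the first failure arrives on the 13th. A buffer sized by testing against "today's date" can pass for months before the longest case turns up.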
“The kind of bug that other operating systems can only dream about”
Some time around 1991 I made a rash statement, which entered into Psion’s in-house listing of ill-guarded comments: “This is the kind of bug that other operating systems can only dream about”. It was another alarms bug – this time in the Psion Series 3 software system.
It arose when the user had an Agenda file on a memory card (these cards were known, at the time, as SSDs – Solid State Disks), but had temporarily removed the card. When the time came to sound an alarm from the Agenda, the alarm server requested the Agenda application to tell it when the next Agenda alarm would be due. This required the Agenda application to read data from the memory card. Because the file was already marked as “open”, the File Server in the operating system tried to display a low-level message on the screen – similar to the “Abort, Retry, Ignore” message that users of MS-DOS might remember. This required action from the Window Server, but the Window Server was temporarily locked, waiting for a reply from the Alarm Server. The Alarm Server was in turn locked, waiting for the File Server – which, alas, was waiting (as previously mentioned) for the Window Server. Deadlock.
Well, that’s as much as I can recall at the moment, but I do remember it being said at the time that the deadlock chain actually involved five interconnecting servers, so I may have forgotten some of the subtleties. Either way, the result was that the entire device would freeze. The only sign of life was that the operating system would still emit keyclicks when the user pressed keys – but the Window Server was unable to process these keys.
In practice, this bug would tend to strike unsuspecting users who had opened an SSD door at the time the alarm happened to be due – even the SSD door on the other side of the device (an SSD could be inserted on each side). The hardware was unable to read from one SSD, even if it was still in place, if the other door happened to be open. As you can imagine, this defect took some considerable time to track down.
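The three-server version of the deadlock chain can be written down as a "waits for" table and checked mechanically. This is only a sketch in Python: the function `find_cycle` is my own invention, and it assumes each server waits for at most one other, which is simpler than the real five-server chain.

```python
def find_cycle(waits_for, start):
    """Follow the 'is waiting for' chain from `start`.

    Returns the list of servers forming a cycle, or None if the chain
    ends at a server that is not waiting for anything.
    """
    seen = []
    current = start
    while current is not None:
        if current in seen:
            # We have come back to a server already on the chain.
            return seen[seen.index(current):]
        seen.append(current)
        current = waits_for.get(current)
    return None


waits_for = {
    "Window Server": "Alarm Server",
    "Alarm Server": "File Server",
    "File Server": "Window Server",
}
print(find_cycle(waits_for, "Window Server"))
# prints: ['Window Server', 'Alarm Server', 'File Server']
```

Any non-empty result means that no server in the list can ever make progress, which matches what the user saw: keyclicks, and nothing else.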
“Death city Arizona”
At roughly the same time, an even worse alarms-related bug was uncovered. In this case, the only way out was a cold reset, which lost all data in internal memory. The recipe to obtain the bug went roughly as follows:
- Supplement the built-in data of cities and countries, by defining a new city, which would be your home town
- Observe that the operating system created a file “World.wld” somewhere on internal memory, containing the details of all the cities whose details you had added or edited
- Find a way to delete that file
- Restart the device.
In those days of limited memory, every extra server was viewed as an overhead to be avoided if possible. For this reason, the Alarm Server and the World Server coexisted inside a single process, sharing as many resources as possible. The Alarm Server managed the queue of alarms, from all different applications, and the World Server looked after access to the set of information about cities and countries. For fast access during system startup, the World Server stored some information about the current home city. But if the full information about the home city couldn’t be retrieved (because, for example, the user had deleted the World.wld file), the server went into a tailspin, and crashed. The lower level operating system, noticing that a critical resource had terminated, helpfully restarted it – with identical conclusions. Result: the lower priority applications and servers never had a chance to start up. The user was left staring at a blank screen.
Buffer overflow (part two)
The software that composed the text to appear on the screen, when an alarm sounded, used the EPOC equivalent of “print with formatting”. For example, a “%d” in the text would be replaced by a numerical value, depending on other parameters passed to the function. Here, the ‘%’ character has a special meaning.
But what if the text supplied by the user itself contains a ‘%’ character? For example, the alarm text might be “Revision should be 50% complete by today”. Well, in at least some circumstances, the software went looking for another parameter passed to it, where none existed. As you can imagine, all sorts of unintended consequences could result – including memory overflows.
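The same trap exists in any language with printf-style formatting. Here is a minimal Python sketch; the function names `compose_unsafe` and `compose_safe` are hypothetical, and the EPOC routine itself is not shown. Python's `%` operator checks its arguments and raises an exception, whereas a C-style routine without such checks may simply read whatever happens to be next in memory.

```python
def compose_unsafe(user_text):
    """Treat the user's text as the format string itself (the bug)."""
    return user_text % ()


def compose_safe(user_text):
    """Pass the user's text as an argument to a fixed format string."""
    return "%s" % (user_text,)


text = "Revision should be 50% complete by today"

print(compose_safe(text))

try:
    compose_unsafe(text)
except (TypeError, ValueError) as err:
    # The '%' in the user's text was read as the start of a format
    # specifier, and there was no argument to satisfy it.
    print("formatting failed:", err)
```

The general rule is that text supplied by the user should only ever be an argument to a format string, never the format string itself.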
Alarms not sounding!
Thankfully, the bugs above were all caught by in-house testing, before the device in question was released to customers. We had a strong culture of fierce internal testing. The last one, however, did make it into the outside world. It impacted users who had the temerity to do the following:
- Enter a new alarm in their Agenda
- Switch the device off, before it had sufficient time to complete all its processing of which alarm would be the next to sound.
This problem hit users who accumulated a lot of data in their Agenda files. In such cases, the operating system could take a non-negligible amount of time to reliably figure out what the next alarm would be. So the user had a chance to power down the device before it had completed this calculation. Given the EPOC focus on keeping the device in a low-power state as much as possible, the “Off” instruction was heeded quickly – too quickly in this case. If the device had nothing else to do before that alarm was due, and if the user didn’t switch on the device for some other reason in the meantime, it wouldn’t get the chance to work out that it should be sounding that alarm.
Final thoughts re iPhone alarms
Psion put a great deal of thought into alarms:
- How to implement them efficiently
- How to ensure that users never missed alarms
- How to provide the user with a great alarm experience.
For example, when an alarm becomes due on a Psion device, the sound starts quietly, and gradually gets louder. If the user fails to acknowledge the alarm, the entire sequence repeats, after about one minute, then after about three minutes, and so on. When the user does acknowledge the alarm, they have the option to stop it, silence it, or snooze it. Pressing the snooze button adds another five minutes to the time before the alarm will sound again. Pressing it three times, therefore, adds 15 minutes, and so on. (And as a touch of grace: if you press the snooze button enough times, it emits a short click, and resets the time delay to five minutes – useful for sleepyheads who are too tired to take a proper look at the device, but who have enough of a desire to monitor the length of the snooze!)
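The snooze arithmetic can be sketched in a few lines of Python. The five-minute step comes from the description above; the 60-minute ceiling at which the delay wraps round is purely an assumption for illustration, as is the function name `press_snooze`.

```python
SNOOZE_STEP_MINUTES = 5
MAX_SNOOZE_MINUTES = 60  # hypothetical ceiling, assumed for illustration


def press_snooze(current_delay):
    """Return (new_delay, clicked) after one press of the snooze button."""
    new_delay = current_delay + SNOOZE_STEP_MINUTES
    if new_delay > MAX_SNOOZE_MINUTES:
        # Wrap round to the minimum delay, with an audible click.
        return SNOOZE_STEP_MINUTES, True
    return new_delay, False


delay = 0
for _ in range(3):
    delay, clicked = press_snooze(delay)
print(delay)  # prints: 15
```

The click on wrap-round is what lets a half-asleep user keep track of the delay without looking at the screen.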
So it’s surprising to me that Apple, with its famous focus on user experience, seems to have given comparatively little thought to the alarms on the iPhone. When my wife started using an iPhone in the middle of last year, she found much in it to enchant her – but the alarms were far from delightful. It seems that the default alarms sound only once, with a rather pathetic little noise that is easy to miss. And when we looked, we couldn’t find options to change this behaviour. I guess the iPhone team has other things on its mind!