10

In this context, an 'exception' is an undesirable scenario, which could be: a code-level signal (like SIGSEGV), incorrect ways of launching an app (like launching a command-line app as a daemon) etc.

For a command-line app, the way to report exceptions to the user is by outputting to stderr - no doubts here.
For a GUI app using GTK, an error window displayed using GTK's MessageDialog can be used. But what if the MessageDialog fails, either due to unstable state of the app (SIGSEGV or SIGBUS may not have any recovery) or the API itself failed... in that case, how can a GUI app inform the user?
Finally, a daemon... A daemon needs to inform user either due to a code-level exception (signals) or an external exception - user could launch a command-line app as a daemon, which is not a desirable way of launch, since a command-line app would've exited after its task is completed, but a daemon is expected to run for a long time. The command-line app could detect it was launched as a daemon and inform the user that it was launched incorrectly, but output to stderr does nothing here... how can a command-line app launched as daemon inform user that it was launched incorrectly?

The main question is, how can each of these apps communicate with the user in the above mentioned scenarios? What is Linux's recommendation?

PS: I'm new to Linux and app development in Linux.

4
  • 2
    Put an event into the system's logs, too. Start by reading man logger rsyslog, and follow the "See Also" refs. Commented May 24 at 22:01
  • 3
    FWIW: Linux strives for compatibility with another family of operating systems whose owners pay big $$$ to use a certain name that ends with "nix." Back when the ancient ancestors of those *nix operating systems roamed the Earth, there was no GUI, and there were no "apps," and one of the guiding principles was that pretty much any program could be execd by some other program. Programmers, therefore, were strongly encouraged to encode most of what the "user" (who might actually be another program) needed to know about why a program failed in the process's exit status. Commented May 24 at 23:18
  • 2
    The usual answer is "poorly". I think there's already a good variety of options in the answers here, but in general I would say that the bar is low. Commented May 25 at 18:54
  • 1
    At the very least, always report the error to stderr, and exit with non-0 exit status. You may (or may not) be able to do more, but those two are crucial. Commented May 27 at 13:34

5 Answers

15

First of all, not all errors are equal.

"The username and password you are using to authenticate were rejected by the remote server" is quite different than "The machine is out of memory, so the kernel decided to forcefully kill this program to free resources".

The first one is a "normal error"; the second cannot be caught by your own program, and there's no need to attempt to communicate it to the user. The kernel logs it extensively, and the messages can be read with dmesg.

For a GUI app using GTK, an error window displayed using GTK's MessageDialog can be used. But what if the MessageDialog fails, either due to unstable state of the app (SIGSEGV or SIGBUS may not have any recovery) or the API itself failed... in that case, how can a GUI app inform the user?

You print that on stderr, and hope the user will have launched it from console from which it can be read. In fact, one of the errors you may find in such MessageDialog would be that there's no X11 server running, so no graphical dialog at all is possible.

incorrect ways of launching an app (like launching a command-line app as a daemon)

I suspect that instead of launching as a daemon (which for an existing program would involve quite explicit steps such as creating a systemd unit for that program) you actually mean running it in the background (which you can generally do by appending an &).

Note that in Linux it's not considered "wrong" to launch a program in background. For example, I opened the current web browser from a console using firefox&. This opens firefox, and -since it would otherwise block the terminal- does so in the background.

If the program is able to do its task (e.g. curl or wget), it should do so regardless of whether it's running in the background or in the foreground. There is one scenario where it would be problematic: the program needs input from the user while it is in the background. In that case, when it tries to read from the terminal it will receive SIGTTIN, which stops it by default, and it is resumed once it is put in the foreground. The program doesn't need to handle that specifically; job control does it automatically.

As for daemons, the normal way of daemons giving back feedback is to log the messages using syslog (3). This can then be redirected to certain files or remote systems by the syslog daemon.
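A minimal sketch of that syslog(3) route, assuming a daemon that wants to log a single message. The helper names report_priority and daemon_report are made up for illustration; openlog, syslog and closelog are the standard calls.

```c
#include <syslog.h>

/* Hypothetical helper: map our own "fatal or not" flag to a syslog priority. */
int report_priority(int fatal)
{
    return fatal ? LOG_ERR : LOG_WARNING;
}

/* Hypothetical helper: send one message to the system log.
   ident is the name that will prefix the log entry. */
void daemon_report(const char *ident, int fatal, const char *msg)
{
    openlog(ident, LOG_PID, LOG_DAEMON);       /* tag entries with our PID */
    syslog(report_priority(fatal), "%s", msg); /* never pass msg as the format */
    closelog();
}
```

A long-running daemon would normally call openlog once at startup rather than around every message; the open/close pair here only keeps the sketch self-contained.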

However, the way of writing daemons has also evolved (see daemon (7)). While traditional daemons had to perform a long dance, a systemd daemon can just act as a normal program, with the init system performing the daemon-specific steps, and daemon logging can be done simply by writing to stderr when setting StandardError=syslog in the service unit file.

11

due to unstable state of the app (SIGSEGV or SIGBUS may not have any recovery)

I think it is still possible in a SIGSEGV handler to fork+exec a "crash handler" process (which then would initialize GTK anew and be able to display a normal dialog box), like how Firefox seems to do. But it's rarely done.
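A rough sketch of that idea, not how Firefox actually does it. It assumes a separate reporter executable exists at a path you supply (crash_reporter_path and install_crash_handler are hypothetical names), and it relies on fork being usable from a signal handler, which is not guaranteed on every libc.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical path to a separate crash-reporter executable. */
static const char *crash_reporter_path;

static void on_crash(int signo)
{
    /* The process state is suspect here, so do as little as possible. */
    pid_t pid = fork();
    if (pid == 0) {
        char *const argv[] = { (char *)crash_reporter_path, NULL };
        execv(crash_reporter_path, argv);
        _exit(127);                 /* exec failed */
    }
    /* SA_RESETHAND has already restored the default action, so the
       re-raised signal terminates this process once the handler returns. */
    raise(signo);
}

/* Returns 0 on success, -1 if a handler could not be installed. */
int install_crash_handler(const char *reporter_path)
{
    struct sigaction sa;

    crash_reporter_path = reporter_path;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_crash;
    sa.sa_flags = SA_RESETHAND;     /* one shot: default action afterwards */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGBUS, &sa, NULL);
}
```

The reporter runs as a fresh process image, so it can initialize GTK from scratch and show a dialog while the crashed process still dies from the original signal.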

Some distributions have automatic crash reporting GUIs (like ABRT on Fedora or Apport on Ubuntu). Usually though it's just systemd-coredump writing the backtrace to the system log, without a UI notification at all.

Finally, a daemon... A daemon needs to inform user either due to a code-level exception (signals) or an external exception - user could launch a command-line app as a daemon, which is not a desirable way of launch, since a command-line app would've exited after its task is completed, but a daemon is expected to run for a long time. The command-line app could detect it was launched as a daemon and inform the user that it was launched incorrectly

No, that's not for the program to decide. There is nothing incorrect about running a short-lived task this way. For that matter it doesn't really know whether it was launched "as a daemon", nor is there a practical difference between that and "as a Cron job" (whose entire purpose is to run short-lived tasks) – and in the systemd equivalent those are exactly the same thing, only with different start conditions.

So if the user decides to run your short-lived task with & in background (because it's not short-lived enough and they want to do other things while it's running), or as a systemd service (because they want to place a memory limit), or as a Cron job, you don't notify them about anything.

Or even better, you accommodate that, by adapting the program's output so that it avoids interactive frills in such situations (e.g. progress bars or spinners when !isatty()) and produces output suitable for writing to a log file.
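A small sketch of the isatty() check, assuming a program that reports progress as a percentage. The function names and the exact output formats are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Returns 1 when the stream is attached to a terminal, 0 otherwise. */
int stream_is_interactive(FILE *stream)
{
    return isatty(fileno(stream)) == 1;
}

/* Hypothetical progress reporter: redraw one line in place on a terminal,
   but emit occasional plain lines when writing to a pipe or log file. */
void report_progress(FILE *out, int interactive, int percent)
{
    if (interactive)
        fprintf(out, "\r[%3d%%]", percent);
    else if (percent % 25 == 0)
        fprintf(out, "progress: %d%%\n", percent);
}
```

Typical use would be to call stream_is_interactive(stderr) once at startup and pass the result to every report_progress call.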


There is also the distinction between processes run as user daemons vs system daemons. The former can still connect to the user's GUI session (though there is no guarantee there'll be one!) and use e.g. D-Bus to send a notification popup. System services, on the other hand, can't communicate with the user's GUI at all (much like on Windows with "session 0 isolation").

1
  • 1
    Another case, in my opinion more important (though I am biased by my own experience), in which a program should still work without a tty: ssh [host] [command] does not by default allocate a tty to the command in question. This is a frequent problem with sudo, for example (but this might be due to security concerns in this specific case). Commented May 25 at 8:41
11

I'm going to try and directly answer the questions [I think] you're actually asking.

How should a program handle UNIX signals?

Generally speaking, it shouldn't.

UNIX signals are a somewhat dated communication method, but they're pretty carefully constructed to handle a certain number of situations fairly well. In most cases, unless you know you have a specific need to handle a signal, you should just let the default behavior happen, and not write a handler for it at all.

The bash shell (and probably other shells, but I don't know) will already report to the user if a program exited due to a signal, so for example if it receives SIGSEGV, then Segmentation fault will be printed. So there's no point in the program itself trying to catch that and display a message, because this is already accounted for. Sure, if you're running a GUI program and not running it from a terminal, then messages like this will typically go missing, but for better or worse, this has become standard fare for almost all programs on Linux by now, so users who need to investigate an odd crash will be used to having to retry something from a terminal to see any error messages if the program mysteriously exited before.

Sometimes, you'll want to handle signals specifically, for example, you might want to handle SIGINT and do some cleanup before exiting. If you do this, it's strongly recommended that after receiving the signal and doing your cleanup, you restore the signal's default action and then use the raise function with the same signal, so that your process still terminates from that signal after its custom cleanup. This is because shell behavior is different when a process exits with a signal, especially if it was called from a shell script rather than directly, so if you exited with a zero or non-zero exit status in response to receiving a signal, you interfere with that intended behavior.
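The cleanup-then-raise pattern could look roughly like this. All function names are hypothetical; note that the default action has to be restored before raise, otherwise the handler would simply run again.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler, polled by the main loop. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    got_signal = signo;     /* just record it; clean up outside the handler */
}

/* Returns 0 on success, -1 on failure (same as sigaction). */
int install_cleanup_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Returns the signal number that arrived, or 0 if none did. */
int pending_signal(void)
{
    return got_signal;
}

/* Call after cleanup: restore the default action and re-raise, so the
   parent (e.g. the shell) sees termination by the signal itself. */
void die_by_signal(int signo)
{
    signal(signo, SIG_DFL);
    raise(signo);
}
```

The main loop would check pending_signal(), remove its temporary files when it returns non-zero, and then call die_by_signal with that value instead of calling exit.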

Sometimes, you'll want to handle signals that don't mean exiting afterwards. SIGWINCH for example tells command-line programs that the size of a terminal window has changed. SIGUSR1 and SIGUSR2 have the unfortunate default behavior of terminating the process but can be used for anything the program author likes, for example dd responds to SIGUSR1 by printing progress information (and not terminating).

But in all these cases, if you need to handle a particular signal, you'll know you need to.

How should a program report 'normal' errors?

In exactly the same way it would report a 'normal' success.

A command-line program asked to run in 'verbose' mode will typically produce some 'success' messages on stderr (so as to not interfere with its output on stdout); so its 'normal' errors should also go to stderr (whether verbose or not). A GUI program will typically report successes either in message-boxes or just by updating some label or other non-interactable widget, or displaying a closeable notification widget, or the like; so it should display its 'normal' errors in exactly the same way.

A daemon will typically handle requests from somewhere - that could mean something like HTTP requests, or it could mean arbitrary connections to a socket or D-bus address or whatnot - and on success will typically reply with something to indicate success, so therefore on error it should reply in the same way but with something to indicate failure.

Importantly, the error should only cancel that particular 'user action' and the program should generally stay running. Of course, if the error concerns the startup of the whole program, or it's a non-interactive command-line program, then the 'user action' here is 'run the whole program', meaning that in this case the program should exit. An interactive command-line program or a GUI program, however, would typically stay running and let the user input more actions after displaying an error, and a daemon would typically reject the errored request and then stay running to take further requests.

How should a daemon report 'abnormal' errors?

By writing to stderr.

systemd will automatically know how to put this in the relevant log, and the system administrator can configure where that goes. And for the odd systems not using systemd or something broadly equivalent to it, it's generally the job of whoever writes a system-specific wrapper for the daemon to take what ends up on stderr and log it properly.

Either way, in general, the process actually 'doing the work' in a daemon should simply write errors to stderr, and not try and implement some more complicated form of logging itself.
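As a sketch of how little code that needs: systemd by default interprets a leading "<N>" on each stdout/stderr line as a syslog priority, so a daemon can mark severity without any logging library. The helper name log_line is made up; the priority values come from syslog.h.

```c
#include <stdio.h>
#include <syslog.h>   /* only for the LOG_ERR, LOG_WARNING, ... constants */

/* Hypothetical helper: write one log line with a "<N>" priority prefix.
   Returns the fprintf result (negative on error). */
int log_line(FILE *out, int priority, const char *msg)
{
    return fprintf(out, "<%d>%s\n", priority, msg);
}
```

For example, log_line(stderr, LOG_ERR, "cannot open database") would show up in the journal at error priority; on a system that does not parse the prefix, the line is still readable as plain text.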

(Of course, errors about a particular request - which are not 'abnormal' errors - should not be written to stderr, they should be reported as the response to the request as above. If you write every request-level error to stderr, it means if someone can make a request that results in that error, someone can just repeatedly make that request to spam your log file, which isn't good.)

And finally...

What is Linux's recommendation?

Linux is a kernel. Linux doesn't provide recommendations on this kind of thing, AFAIK.

On the other hand, Linux distros may well have recommendations. Certain shells and desktop environments may have recommendations. systemd has some recommendations. And there are various other conventions that have been established over time that you'll find outside of any particular project. Many of these will agree with each other, but sometimes, recommendations will disagree with each other.

Ultimately, you should either find one good source for this kind of thing and stick to it (for example, if you have a particular desktop environment in mind, then follow their conventions) or research a number of different sources and consider them all yourself.

9

Modern desktop environments implement the notification dbus API, concretely org.freedesktop.Notifications.Notify. You can interact with that directly via dbus (which is an RPC standard used throughout Linux applications, so there's bindings for every practical programming language), or call tools to do it for you. For example, you might run

notify-send --urgency critical "Coffee goes bad!" "You forgot your coffee and now it's slowly evaporating on the coffee machine's hot plate."

from a shell script.

From a Gtk application written in C, you'd use GDBus as the dbus library. For a Qt application, you'd use QtDBus. For a modern C++ application, sdbus-c++; in Python you'd use python-sdbus; the list goes on. A somewhat useful list of dbus language bindings can be found at https://www.freedesktop.org/wiki/Software/DBusBindings/.
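If pulling in a dbus library is too much, a C program can also take the "call tools to do it for you" route. The sketch below assumes notify-send is installed and on PATH; notify_user is a hypothetical name, and the return convention is my own choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run notify-send and wait for it.
   Returns its exit status (127 if it could not be executed),
   128 + signal number if it was killed, or -1 if fork/waitpid failed. */
int notify_user(const char *summary, const char *body)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execlp("notify-send", "notify-send", "--urgency", "critical",
               summary, body, (char *)NULL);
        _exit(127);     /* notify-send missing or not executable */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return 128 + WTERMSIG(status);
}
```

A non-zero result means no popup was shown (no notification daemon, no session bus, or no notify-send), so the caller should still report the error through stderr or the log.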

0

You always have an option of exiting the program with an error code, and this is a good option in situations where handling the error is not safe, such as after a SIGSEGV, or if a memory allocation fails.

A lot of shell commands even use a function like

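#include <stdlib.h>

/* Allocate size bytes, or abort the program; never returns NULL. */
void *xmalloc(size_t size)
{
    void *const ret = malloc(size);
    if(!ret)
        abort();    /* out of memory: no recovery path is attempted */
    return ret;
}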

The assumption behind this is that out-of-memory situations that are detected by malloc are really rare, and implementing proper handlers here is both a waste of effort, and difficult to test, as it is more likely for the program to eat a SIGKILL (on Linux) or SIGBUS (a few other Unix systems) than to have the kernel explicitly reject an allocation.

The function of the check here is not to provide an error handling path, but to prove to the compiler that the return value of this function will never be a NULL pointer (which allows further optimizations down the line).

For any other error, how you handle it depends on the context in which your application is to be used: a daemon writes to syslog, a command line application prints an error message, removes temporary and output files, and exits with a nonzero status, and a GUI application displays a dialog.
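For the command-line case, a sketch of "print a message, remove the output file, exit nonzero" might be as small as this. The helper name fail and the message format are assumptions for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: report an error on stderr, delete a partially
   written output file (if any), and return the status main should exit with. */
int fail(const char *prog, const char *what, const char *partial_output)
{
    fprintf(stderr, "%s: %s\n", prog, what);
    if (partial_output != NULL)
        remove(partial_output);     /* don't leave a truncated result behind */
    return EXIT_FAILURE;
}
```

In main this would be used as return fail(argv[0], "cannot read input", out_path); so the caller's shell or script sees the non-zero status.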

Command line applications should not suddenly show a GUI to complain about an error, because there is an expectation that they will remain noninteractive.

Handling fatal errors and collecting crash telemetry is something only a few programs do on Unix. For one, it gives the program an aura of flakiness: if the authors invested several hours into generating crash reports, crashes must be a common occurrence. In addition, the legality of the data collection hinges on an End User License Agreement, which is uncommon in the free software world, and difficult to get right if your users are distributed across the entire world.
