Below are some thoughts about the design of statistical dialogs. They are by no means definitive, or even correct.
There is definite value in resisting the urge to cram every possible eventuality or user need into a single dialog. A UI should be as simple and streamlined as possible, with commonly used elements at the forefront and little-used tweaks hidden (perhaps in a sub-dialog).
That said, going too simple can paradoxically increase complexity. If you narrow a single dialog to one very specific function, for example loading an SPSS dataset, then you will be forced, later down the road, to create a dialog for each function that was left out (e.g. for Stata, SAS, and csv data files). This can lead to a proliferation of menu items where only one was needed (Load Data).
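The Load Data idea can be sketched as a single entry point that picks a loader from the file extension. This is only an illustration in Python: `load_data`, `LOADERS`, and the `load_*` stubs are hypothetical names, and the stubs return a description instead of actually reading a file.

```python
from pathlib import Path

# Hypothetical loader stubs. A real dialog would call the appropriate
# import routine for each format instead of returning a description.
def load_spss(path):
    return f"SPSS dataset from {path}"

def load_stata(path):
    return f"Stata dataset from {path}"

def load_sas(path):
    return f"SAS dataset from {path}"

def load_csv(path):
    return f"CSV dataset from {path}"

# One table maps file extensions to loaders, so supporting a new format
# means adding an entry here rather than adding a menu item.
LOADERS = {
    ".sav": load_spss,
    ".dta": load_stata,
    ".sas7bdat": load_sas,
    ".csv": load_csv,
}

def load_data(path):
    """Single entry point behind a hypothetical 'Load Data' menu item."""
    suffix = Path(path).suffix.lower()
    loader = LOADERS.get(suffix)
    if loader is None:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
    return loader(path)
```

The user sees one menu item, and the format-specific work stays behind it.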
Think about a user sitting down to use your software. What are they trying to accomplish? What is their goal?
A function call is not a goal. Users don't sit down to do a t-test. A t-test is the chosen procedure to accomplish the task of comparing two distributions. t-tests, Mann-Whitney, Kolmogorov-Smirnov, etc., all relate to the same task of comparing two distributions, so they should be put in the same dialog.
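To make the task-versus-procedure distinction concrete, here is a minimal sketch of one task-level function that offers several procedures. The names `compare_two_distributions` and `PROCEDURES` are made up for illustration, and to stay within the Python standard library the sketch computes only the test statistics, not p-values.

```python
from statistics import mean, variance

def welch_t(x, y):
    """Welch's t statistic (unequal variances) for two samples."""
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    return (mean(x) - mean(y)) / se

def mann_whitney_u(x, y):
    """Mann-Whitney U for x: pairs with x > y count 1, ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)

def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between ECDFs."""
    def ecdf(sample, v):
        return sum(a <= v for a in sample) / len(sample)
    return max(abs(ecdf(x, v) - ecdf(y, v)) for v in list(x) + list(y))

# The task is "compare two distributions"; the procedures are options
# within that one task, like checkboxes in a single dialog.
PROCEDURES = {
    "t-test": welch_t,
    "Mann-Whitney": mann_whitney_u,
    "Kolmogorov-Smirnov": ks_statistic,
}

def compare_two_distributions(x, y, procedures=("t-test",)):
    """Run every selected procedure on the same pair of samples."""
    return {name: PROCEDURES[name](x, y) for name in procedures}
```

Calling `compare_two_distributions(x, y, procedures=PROCEDURES)` runs all three on the same data, which is roughly what a user gets by ticking every box in a single dialog.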
It is better to cover one task very well than to cover 10 tasks poorly. It is a matter of 10 minutes to create a dialog that makes a scatter plot of two variables. But it is something quite different to add options for:
If the user is likely to want to do the same action on several variables, don't make them go through the dialog once for every variable. Make the dialog such that the action can be applied to multiple variables. If the action is an analysis, format the results into a nice table (see multi.test).
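As a rough sketch of the multiple-variable pattern (this is not how multi.test is implemented), the Python below applies one action to every selected variable and lays the results out as a single plain-text table. `describe`, `run_on_variables`, and `format_table` are hypothetical names, and descriptive statistics stand in for a real analysis.

```python
from statistics import mean, stdev

def describe(values):
    """Example action: a few descriptive statistics for one variable."""
    return {"n": len(values), "mean": mean(values), "sd": stdev(values)}

def run_on_variables(action, data, variables):
    """Apply one action to every selected variable in a single pass."""
    return {name: action(data[name]) for name in variables}

def format_table(results):
    """Lay the per-variable results out as one aligned plain-text table."""
    if not results:
        return ""
    columns = list(next(iter(results.values())))
    header = f"{'variable':<12}" + "".join(f"{c:>10}" for c in columns)
    lines = [header, "-" * len(header)]
    for name, stats in results.items():
        cells = "".join(
            f"{stats[c]:>10.2f}" if isinstance(stats[c], float) else f"{stats[c]:>10}"
            for c in columns
        )
        lines.append(f"{name:<12}{cells}")
    return "\n".join(lines)
```

The user selects the variables once, and `print(format_table(run_on_variables(describe, data, selected)))` produces one table with a row per variable.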
Format results into easy to read tables.
Test your GUI on multiple platforms in multiple consoles.
You are not the owner of R, the user is.
Add help buttons. Feel free to add pages to this manual for your dialogs. The password for editing is the primes between 4 and 12 with no spaces. I hate spammers.
Don't be SPSS prior to 2006. Let the user resize your dialog.
If a dialog does not remember its settings from the last time it was run, it is nearly useless. Data analysis is an iterative process, and it is very rare that the user will specify exactly the right set of options the first time.
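A minimal sketch of dialog memory, in Python and independent of any GUI toolkit. The class `DialogState` and its method names are hypothetical; the only point is that the dialog opens with the last-run settings when there are any, and with the defaults otherwise.

```python
import copy

class DialogState:
    """Hypothetical settings holder for one dialog."""

    def __init__(self, defaults):
        self._defaults = copy.deepcopy(defaults)
        self._last_used = None

    def initial_values(self):
        """Values to show when the dialog opens: last run if any, else defaults."""
        source = self._last_used if self._last_used is not None else self._defaults
        return copy.deepcopy(source)

    def record_run(self, settings):
        """Call when the user presses Run, so the next opening starts from here."""
        self._last_used = copy.deepcopy(settings)

    def reset(self):
        """Back a Reset button: forget the last run and return to defaults."""
        self._last_used = None
```

Copying the settings on the way in and out keeps later edits in the open dialog from silently changing what was remembered.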
In the creation of a statistical GUI the analysis philosophy of the author is necessarily imposed on the software. By choosing what to include, what options to make default, and where to place those options, the author guides the default behavior of the user. Below are some decisions that I made that have a bearing on how Deducer is used. Many of these are open to debate.
Humans are visual creatures and best understand data when it is presented in a visual manner. This helps both in the understanding of results, and in the diagnosis of possible assumption violations.
Standard 'exact' p-values are slightly conservative. The mid p-value is a minor modification that maintains an alpha level closer to the nominal level.
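As a worked illustration, the sketch below computes both p-values for a one-sided (upper-tail) exact binomial test. The exact p-value is P(X >= k), while the mid p-value counts only half the probability of the observed outcome: P(X > k) + 0.5 P(X = k). The function names are mine, and the binomial setting is just a convenient example.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def upper_tail_p_values(k, n, p0):
    """Return (exact, mid) p-values for observing k or more successes."""
    point = binom_pmf(k, n, p0)                                   # P(X = k)
    above = sum(binom_pmf(i, n, p0) for i in range(k + 1, n + 1))  # P(X > k)
    exact = above + point
    mid = above + 0.5 * point
    return exact, mid
```

For 8 successes in 10 trials under p0 = 0.5, the exact p-value is 56/1024 (about 0.0547) and the mid p-value is 33.5/1024 (about 0.0327).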
R uses type II sums of squares, whereas most other packages (SAS, SPSS) use type III.
If you have a small sample size, you have no power to detect even major violations. If you have a large sample size, you will almost surely find a statistically significant violation, even if the magnitude is small.
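A back-of-the-envelope illustration, using skewness as a stand-in for an assumption check. Under normality the standard error of the sample skewness is roughly sqrt(6/n) for large n, so the same skewness yields very different z-scores at different sample sizes. The function name is hypothetical and the approximation is crude for small n, so treat the numbers as indicative only.

```python
from math import sqrt

def skewness_z(skewness, n):
    """Approximate z-score for a sample skewness under normality,
    using the large-sample standard error sqrt(6 / n)."""
    return skewness / sqrt(6.0 / n)

# A large skewness in a tiny sample versus a slight one in a huge sample.
major_small_n = skewness_z(1.0, 10)
minor_large_n = skewness_z(0.1, 10_000)
```

With these inputs, a skewness of 1.0 at n = 10 gives z of about 1.3, short of the usual 1.96 cutoff, while a skewness of 0.1 at n = 10,000 gives z of about 4.1.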