Using a computer in the statistical part of the research
Using a computer (or statistical methods) does not make the research
any better or more reliable than it already is. It is very often
claimed that because the computer (or a statistical method) tells us
something, the result must be reliable. The computer (like
statistical research methods) is just a tool for calculating
something (like a calculator), and it is not always even necessary.
For example, simple statistics (such as the mean, median and standard
deviation) are often easier to calculate with a calculator when the
sample is small.
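
As a rough illustration of how little machinery these simple statistics need, here is a minimal Python sketch for a small sample. The data values and the name `sample` are invented for the example; only the standard `statistics` module is assumed.

```python
import statistics

# Hypothetical small sample (invented values, for illustration only)
sample = [4.0, 7.0, 5.0, 6.0, 8.0]

mean = statistics.mean(sample)      # arithmetic mean
median = statistics.median(sample)  # middle value of the sorted sample
sd = statistics.stdev(sample)       # sample standard deviation (n - 1 in the denominator)

print(f"mean={mean:.2f} median={median:.2f} sd={sd:.2f}")
```

With five observations the same figures can be worked out by hand or with a calculator in about the same time, which is the point made above.
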
The researcher must know the theory of the statistical method he/she
is using. The computer program must be understood as a tool for
calculating known formulas. Programs do not usually warn when
the assumptions of a test are not met (for example, normally distributed data).
They also give very little help in selecting the correct statistical
method. A program may guide the researcher towards the statistical
methods that it happens to offer, and these are not always the correct
ones. Do not consume what you get, but ask for what you need
(first think, then do).
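
Since the program will not usually check assumptions, the researcher can make a crude check before running a test that assumes normality. The sketch below computes the moment-based skewness of a sample and returns a warning when it is large. The function names, the data and the limit of 1.0 are all assumptions made for this illustration; this is not a formal normality test and does not replace looking at the data.

```python
import statistics

def sample_skewness(data):
    """Moment-based skewness: mean cubed deviation divided by sd cubed."""
    n = len(data)
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)  # population sd, matching the moment formula
    if sd == 0:
        raise ValueError("all values are equal; skewness is undefined")
    return sum((x - mean) ** 3 for x in data) / (n * sd ** 3)

def warn_if_skewed(data, limit=1.0):
    """Return a warning string if |skewness| exceeds the (arbitrary) limit, else None."""
    g1 = sample_skewness(data)
    if abs(g1) > limit:
        return f"skewness {g1:.2f}: a test assuming normality may be unsuitable"
    return None

# Invented example data
symmetric = [2, 4, 5, 6, 8]        # symmetric around 5
skewed = [1, 1, 1, 2, 2, 3, 30]    # one very large value

print(warn_if_skewed(symmetric))
print(warn_if_skewed(skewed))
```

The first call returns `None` and the second returns a warning. A histogram or a normal probability plot would usually tell more than a single number.
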
There are several programs that the researcher can use for the
statistical part of the research or for presenting the results. The
programs vary greatly in level. As a general
observation, small, inexpensive and easy programs are limited in
use and inflexible, while expensive and
large programs are often difficult to use. There is no single best
statistical program. The choice of program depends on which
statistical methods will be used (and which programs are available).
Because the statistical results must be written up in the report,
statistical programs include some tabulating, drawing and word
processing features. Sometimes the data must be manipulated,
and for that some statistical programs have database features.
These capabilities therefore also affect the choice of program.
The following grouping of programs is partly artificial (like
all categories):
- Programming languages
  - for example C, Fortran, APL, Pascal, Basic, ...
  - sometimes necessary for manipulating data and for very special
    methods that cannot be found in statistical programs
  - they demand programming skills and a good knowledge of
    statistical theory
- Mathematical "tool programs"
  - for example IMSL, NAG, Matlab, Mathematica, Maple, ...
  - like programming languages, but they offer ready-made functions
    and features
- Large statistical programs
  - for example SPSS, SAS, BMDP, Statistica, Survo, ...
  - the first versions were made for mainframe computers and required
    their own command language
  - the new Windows versions are quite easy to use, but
    "sophisticated" use demands knowledge of the command
    language
  - offer many statistical methods (statistics, tests and analyses)
  - limited possibilities to program one's own statistical methods
  - include report-making, graphical and database features
  - usually expensive (from 10 000 mk)
- Small statistical programs
  - for example Quickstat, TILO, PATO, Statsgraphics, ...
  - menu-driven programs that are easy to use
  - suitable for simple data analyses
  - the menus restrict how the statistical methods can be used
  - few data handling and multivariate capabilities
  - because they are easy to use, they can easily lead to
    misuse of statistical methods
  - inexpensive or free (for example, available on the internet: garbo.uwasa.fi/pc/stat)
- Special programs
  - for example GLIM, Lisrel, Gauss, Statxact, Solo Power Analysis,
    Shazam, time series programs, ...
  - made for particular statistical methods
  - usually not very easy to use, but easier than writing one's own
    program
- Other programs
  - graphical programs
    - many statistical programs include graphical features
    - the most widely used graphics programs are made for "business
      graphics", not for scientific presentation
  - spreadsheet programs
    - many statistics and simple tests can be calculated
    - graphics
    - data can be moved to a statistical program and vice versa
  - word processors
    - the "all in one" idea: data entry, database, data
      manipulation, statistical methods, graphics, tables and text
      processing could all be in one program. Programs are moving that
      way, but their size (and complexity) grows exponentially
      (tens of megabytes of disk space and heavy demands on the CPU)
    - it is easy to move the results of a statistical program to a
      word processor
  - statistical expert systems
    - programs that give advice on choosing the correct statistical
      method, using the chosen method correctly and explaining the result
    - the small expert programs are not good, because they simplify
      the problems too much and take only a narrow view of solving
      a problem. This is because an expert's knowledge is difficult
      to build into a program
    - this is one potential (and needed) direction of progress for
      statistical programs
  - learning programs
    - made in universities so that students can learn statistics
      by themselves
    - can be found on the internet (for example garbo.uwasa.fi/pc/education),
      or you can bring me a disk to copy them onto
    - the latest development is to put the material on the internet
      for everyone to read (for example http://math.montana.edu/~umsfjban/STAT438/Stat438.html)
What can be done with a statistical program
When selecting a statistical program, the researcher must
consider which of the following features he/she needs:
- Common features
  - transforming data
  - statistical help and information
  - easy or difficult to use: menus or a command language
- Manipulating data
  - editing data
  - transforming variables
- Statistical analysis
  - statistics, tests, analyses
  - the default output should give only the most commonly used results,
    with the possibility of getting more results on
    request
- Making reports
- Graphics
- Printing and transforming the files (data, graphics, tables)
- Possibility to write one's own programs and macros (flexibility to
  build one's own solutions)
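
The requirement under "Statistical analysis" that the default output show only the most commonly used results, with more available on request, can be sketched in a few lines of Python. The function name `summarize`, its `extended` parameter and the choice of which statistics count as "default" are hypothetical, chosen only for this illustration.

```python
import statistics

def summarize(data, extended=False):
    """Return the most commonly used statistics; add more only on request."""
    result = {
        "n": len(data),
        "mean": statistics.mean(data),
        "sd": statistics.stdev(data),  # sample standard deviation
    }
    if extended:
        # Extra results that are shown only when explicitly asked for
        result.update({
            "median": statistics.median(data),
            "min": min(data),
            "max": max(data),
        })
    return result

data = [4.0, 7.0, 5.0, 6.0, 8.0]  # invented values
print(summarize(data))
print(summarize(data, extended=True))
```

A sketch like this also shows the last point in the list: when a program allows user-written functions or macros, the researcher can decide what the default output contains.
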