"False"
Syllabus:

Programming in statistics, 7.5 Credits

Swedish name: Statistisk programmering

This syllabus is valid: 2011-07-04 to 2013-09-29 (a newer version of the syllabus exists)

Course code: 2ST035

Credit points: 7.5

Education level: Second cycle

Main Field of Study and progress level: Statistics: Second cycle, has only first-cycle course/s as entry requirements

Grading scale: Three-grade scale

Responsible department: Department of Statistics

Contents

There are many commercial programs for statistical computation on the market today. Such software can solve most standard statistical problems. In statistics, however, there is often a demand for specific analyses that cannot be done within the framework of existing statistical software. Knowledge of statistical programming is then a necessity. The purpose of the course is to provide an introduction to such statistical programming. The course is based on the statistical work and programming environment R, which implements the statistical programming language S. This environment is well on its way to becoming a de facto standard for professional statisticians. The program is freely available under the GPL license. The student learns how to download, install and use the program. The Emacs editor is introduced as a convenient tool for programming.

The first part of the course gives an introduction to R as a general statistical program. The student learns how to enter data, directly or from other programs, how data are organized in R, and how data can be analyzed using already available tools. The incremental mode of work in R is emphasized, i.e. the results of one analysis can be used as input to further analyses of the same problem, e.g. using graphical illustrations. A general introduction to different data types and their representation in the computer and in R is given. The data type factor, which is used for analyses involving categorical variables, is introduced. The concept of a function is studied, both in its mathematical sense and as a special type of R object. The important difference between a script and a function is emphasized.

The latter part of the course considers programming in the context of common statistical problems. The main focus here is on simulation and optimization. The specific techniques studied are numerical computation of expected values, maximum likelihood, bootstrapping and the EM algorithm. All of these techniques belong to the area of computer-intensive statistical methods. Finally, a brief overview is given of how C and FORTRAN functions can be included in R functions.

Expected learning outcomes

After completing the course, the student should:

- have basic knowledge of using the computer as a tool for statistical analysis,
- know and be able to use the basic structures of the statistical programming language S (implemented in R or S-PLUS) in their own programs,
- be able to perform stochastic simulation from simple probability models,
- be able to perform simple statistical analyses using bootstrap techniques,
- be able to make numerical computations of expected values,
- be able to perform statistical analyses based on maximum likelihood using numerical methods,
- be able to illustrate statistical models and results of statistical surveys graphically.

Required Knowledge

At least 75 credits in statistics and/or mathematical statistics, or equivalent. Proficiency in English equivalent to Swedish upper secondary course English B (IELTS (Academic) with a minimum overall score of 6.5 and no individual score below 5.5; TOEFL PBT (Paper-based Test) with a minimum score of 575 and a minimum TWE score of 4.5; or TOEFL iBT (Internet-based Test) with a minimum score of 90 and a minimum score of 20 on the Writing Section). Where the language of instruction is Swedish, applicants must prove proficiency in Swedish to the level required for basic eligibility for higher studies.

Form of instruction

A major part of the course is laboratory-based and is given as computer lessons, computer exercises and supervised computer work. The course also includes lectures and seminars. There are several compulsory assignments to be completed and handed in.

Examination modes

The examination is based upon assignments. Reports on the assignments should be handed in or presented on predetermined dates. For the grade G (Pass), all assignments must be satisfactorily reported and approved. The grades used are: VG (Pass with distinction), G (Pass), and U (Fail). Further information can also be obtained from the student counsellor.

Crediting previous courses

A student may request an assessment of whether a previous course, in whole or in part, can be credited. For more information about these rules please visit www.umu.se/studentcentrum/regler_riktlinjer/index.html (note that the information is only available in Swedish).

Literature

Valid from: 2011 week 27

Broström Göran
Statistical Programming in R
Umeå universitet, statistiska institutionen
Mandatory

An Introduction to R
Venables W. N., Smith D. M., the R Development Core Team

http://ftp.sunet.se/pub/lang/CRAN/doc/manuals/R-intro.pdf

A first course in statistical programming with R
Braun John, Murdoch Duncan James
Cambridge, N.Y. : Cambridge University Press : 2007 : 163 p.
ISBN: 978-0-521-87265-2 (hbk.)
Mandatory

S programming
Venables W. N. (William N.), Ripley Brian D.
New York : Springer : cop. 2000 : x, 264 p.
ISBN: 0-387-98966-8 (alk. paper)

Chambers John M.
Software for data analysis : programming with R
New York, N.Y. : Springer : cop. 2008 : 498 p.
ISBN: 978-0-387-75935-7 (hbk.)