Description

Introduction to data science

Module title: Introduction to data science
Module code: GEO1419
Academic year: 2021/2
Credits: 15
Module staff

Dr Jo Browse (Convenor)

Duration: Term | 1 | 2 | 3
Duration: Weeks

11

Number students taking module (anticipated)

100

Description - summary of the module content

Module description

This module will give you practical insights into how scientists address fundamental questions and hypotheses using data. We start with simple toolkits to describe data and move on to more advanced ways of comparing data and describing data trends. Once you finish the module you will be competent in managing and handling data using the statistical programming language R, and you will know how to critique different methods commonly used in scientific data analysis.

This module uses a combination of lectures, group discussions, supervised practical classes, online (ELE) teaching resources, and help sessions to provide you with the support necessary for achieving the learning objectives. Weekly lectures provide a synoptic overview of the techniques covered in each week’s practical class, while group discussions will evaluate and critique the use of these techniques in published scientific studies. Each practical class is led by lecturing staff and support staff. Staff are also accessible through an online (ELE) discussion forum to answer your queries. The emphasis is placed upon learning how to apply statistical techniques to answer research questions in geography, environmental science and marine science using numerical data of various forms. As such, data from a range of environmental applications are provided in the practical classes for analysis. Weekly summative assessment, completed online, gives you the opportunity to evaluate your progress, since these tests cover the same techniques (with different data) as those learned in the lectures and practical sessions. Practical classes also focus on developing essential IT skills, with R used for data manipulation and analysis, allowing you to learn key transferable skills in data science.

Module aims - intentions of the module

This module aims to introduce you to the use of data-centred quantitative analysis techniques in research. The module will establish the purpose and scope of statistical analysis methods, focusing on analytical tests and their execution. We follow the ‘scientific method’ through from first principles (hypothesis development, distribution testing) to hypothesis testing. We ask you to think about the underlying principles of data collection, sampling and hypothesis-driven research. We use computers to assist us in the aggregation, analysis and presentation of data.

Through lectures, assisted practical classes, and group discussion you will be encouraged to evaluate and critique statistical methods as one of a suite of analytical techniques available to researchers. Assisted practical classes complement the lecture series and will provide you with key transferable skills in data handling which will increase your future employability. You will undertake an independent research project during which you will quantitatively explore an unseen dataset using the skills acquired during the module. These skills are relevant for a range of different careers, from environmental management and assessment through to energy policy.

Intended Learning Outcomes (ILOs)

ILO: Module-specific skills

On successfully completing the module you will be able to...

  • 1. Describe and critique a range of approaches to collecting data
  • 2. Critique poor statistical or data collection techniques across geographical and environmental fields
  • 3. Calculate and understand the use of basic descriptive statistics including the mean, median, mode, standard deviation and coefficient of variation
  • 4. Discuss the limitations associated with different descriptive statistics in your own and others' work
  • 5. Apply appropriate techniques to determine whether data are normally distributed and explain the role of Gaussian distributions in statistical approaches
  • 6. Explain the difference between parametric and non-parametric tests
  • 7. Choose the correct statistical test for different data distributions
  • 8. Use the statistical programming language R to apply appropriate statistical tests in order to answer research questions
  • 9. Understand statistical significance and interpret p-values
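
As a rough illustration of the descriptive statistics named in ILO 3, the sketch below computes each one for a small sample. The module itself uses R, where `mean()`, `median()` and `sd()` are the corresponding base functions; Python's standard library is used here purely for illustration, and the sample values are hypothetical rather than module data.

```python
import statistics

# Hypothetical sample (e.g. daily rainfall in mm); not module data.
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]

mean = statistics.mean(sample)      # arithmetic mean
median = statistics.median(sample)  # middle value of the sorted data
mode = statistics.mode(sample)      # most frequent value
sd = statistics.stdev(sample)       # sample standard deviation (n - 1 denominator)
cv = sd / mean * 100                # coefficient of variation, as a percentage

print(f"mean={mean:.2f} median={median:.2f} mode={mode:.2f} sd={sd:.2f} cv={cv:.1f}%")
```

The coefficient of variation expresses the standard deviation relative to the mean, so it is only meaningful for data measured on a ratio scale with a non-zero mean.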

ILO: Discipline-specific skills

On successfully completing the module you will be able to...

  • 10. Describe essential facts and theory across data management and analysis in geography and the natural sciences
  • 11. Identify critical questions from the literature and synthesise research-informed examples into written work
  • 12. Identify and implement, with some guidance, appropriate methodologies and theories for addressing a specific research problem in geography and the natural sciences
  • 13. With guidance, deploy established techniques of data science including collection, analysis and management within geography and the natural sciences
  • 14. Describe and begin to evaluate approaches to the development of research questions in geography and the natural sciences with reference to primary literature, reviews and research articles

ILO: Personal and key skills

On successfully completing the module you will be able to...

  • 15. Develop, with guidance, a logical and reasoned argument with sound conclusions
  • 16. Communicate ideas, principles and theories using a variety of formats in a manner appropriate to the intended audience
  • 17. Collect and interpret appropriate data and undertake straightforward research tasks with guidance
  • 18. Evaluate own strengths and weaknesses in relation to professional and practical skills identified by others
  • 19. Reflect on learning experiences and summarise personal achievements

Syllabus plan

There will be several key themes covered in this module as follows:

  • Lecture: Introduction to module and overview of subject
  • Lecture: Introduction to research design
  • Descriptive statistics; central tendency and dispersion
  • Practical: Calculate and display descriptive statistics showing key data attributes (e.g. in Excel)
  • Lecture: Theoretical frequency distributions
  • Practical: Exploring frequency distributions using computer software
  • Lecture: Parametric inferential statistics
  • Practical: Parametric hypothesis testing
  • Lecture: Counts and frequencies, non-parametric techniques
  • Practical: Non-parametric hypothesis testing
  • Lecture: Correlation analysis
  • Practical: Exploring correlation
  • Lecture: Linear regression
  • Practical: Modeling data trends
  • Lecture: Transforming data and alternative distributions

Learning and teaching

Learning activities and teaching methods (given in hours of study time)

Scheduled Learning and Teaching Activities: 30
Guided independent study: 120
Placement / study abroad: 0

Details of learning activities and teaching methods

Category | Hours of study time | Description
Scheduled Learning and Teaching | 10 | Group discussion
Scheduled Learning and Teaching | 20 | Practicals
Guided Independent Study | 120 | Additional research, reading and preparation for module assessments and group discussions

Assessment

Formative assessment

Form of assessment | Size of the assessment (e.g. length / duration) | ILOs assessed | Feedback method
Short answer questions during lectures and practical sessions | Ongoing throughout the module | 1-16, 18-19 | Oral

Summative assessment (% of credit)

Coursework: 70
Written exams: 0
Practical exams: 30

Details of summative assessment

Form of assessment | % of credit | Size of the assessment (e.g. length / duration) | ILOs assessed | Feedback method
Weekly tests | 30 | Not applicable | 1-17 | Model answers
Statistics project | 70 | 1000 words | 1-17 | Written

Re-assessment

Details of re-assessment (where required by referral or deferral)

Original form of assessment | Form of re-assessment | ILOs re-assessed | Timescale for re-assessment
Weekly tests | Not applicable | Not applicable | Not applicable
Statistics project | Statistics project | 1-17 | August Assessment Period

Re-assessment notes

Deferral – if you miss an assessment for certificated reasons judged acceptable by the Mitigation Committee, you will normally either be deferred in the assessment or be granted an extension. The weekly tests are not deferrable because of their cumulative and practical nature. The mark given for a re-assessment taken as a result of deferral will not be capped and will be treated as if it were your first attempt at the assessment.

Referral – if you have failed the module overall (i.e. a final overall module mark of less than 40%) you will be required to complete a further statistics project. The mark given for a re-assessment taken as a result of referral will count for 100% of the final mark and will be capped at 40%.
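
The weighting and referral rules above can be sketched as simple arithmetic. The function names and example marks below are hypothetical, and the sketch assumes the overall mark is a plain weighted average of the two components; actual mark calculation and rounding follow the University's regulations.

```python
PASS_MARK = 40.0  # overall module pass mark stated in the referral note


def overall_mark(weekly_tests: float, project: float) -> float:
    """Weighted module mark: weekly tests 30%, statistics project 70%."""
    return 0.30 * weekly_tests + 0.70 * project


def referred_mark(reassessment_project: float) -> float:
    """After referral the new project counts for 100% and is capped at 40%."""
    return min(reassessment_project, PASS_MARK)


# Hypothetical marks: a first attempt below the pass mark, then a referral.
first_attempt = overall_mark(weekly_tests=55.0, project=30.0)
if first_attempt < PASS_MARK:
    final = referred_mark(reassessment_project=62.0)
else:
    final = first_attempt
print(f"first attempt: {first_attempt:.1f}, final mark: {final:.1f}")
```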

Resources

Indicative learning resources - Basic reading

  • Grolemund, Garrett, Hands-on programming with R, first edition, Sebastopol, Calif.: O'Reilly, 2014.
  • Matloff, Norman S., The art of R programming: a tour of statistical software design, San Francisco: No Starch Press, 2011.
  • Rogerson, Peter A., Statistical methods for geography, London: SAGE, 2001.

Module has an active ELE page

Key words search

Data, data science, analysis, statistics, hypothesis testing, scientific method

Credit value: 15
Module ECTS

7.5

Module pre-requisites

None

Module co-requisites

None

NQF level (module)

4

Available as distance learning?

No

Origin date

03/02/2021

Last revision date

03/02/2021