Introduction Master Thesis Data Science (2020)

  1. What is it?
  2. When is it?
  3. How does it work?
  4. Questions

Master Thesis Data Science

  • Coordinator: Maarten Marx (Informatics Institute, UvA). In 2020-21, Cristian Rodriquez Rivero will take over.
  • 18 ECTS = 12 weeks full time (periods 5 and 6)
  • Both internal and external projects
    • External projects have UvA or VU supervisor and external day-to-day supervisor
    • Internal projects can be done both at UvA and VU
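
To make the "18 ECTS = 12 weeks full time" figure concrete, here is a minimal sketch of the workload arithmetic. It assumes the usual Dutch convention of 28 study hours per ECTS, which is not stated in these slides; the variable names are illustrative only.

```python
# Assumption (not from the slides): 1 ECTS corresponds to 28 study hours.
HOURS_PER_ECTS = 28

# From the slides: the thesis is 18 ECTS, done in 12 weeks.
THESIS_ECTS = 18
THESIS_WEEKS = 12

total_hours = THESIS_ECTS * HOURS_PER_ECTS
hours_per_week = total_hours / THESIS_WEEKS

print(f"{total_hours} hours in total, {hours_per_week:.0f} hours per week")
# prints: 504 hours in total, 42 hours per week
```

Under that assumption the thesis is slightly more than a standard 40-hour working week, which is why it is planned as a full-time activity.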

Information on the web

Thesis project steps

  1. Thesis fair
  2. Match with a company
    • get (access to) data and the assignment/task
  3. Explore data and convince yourself that the task can be done with this data.
  4. Create Thesis Design (Help provided)
    • Research questions
    • Data description
    • Exploratory Data Analysis (EDA)
    • Ground your work in existing literature
  5. With your fantastic thesis design find an internal supervisor at UvA or VU.
  6. Thesis project period
    • Midterm: supervisor can decide to stop the project.
    • Hand in thesis
    • Defense

Warning

Do not underestimate step 3

Explore data and convince yourself that the task can be done with this data.

If the company cannot or will not give you access to the data before the project starts (ideally before Christmas), you are strongly advised to find another project.

In previous years we had many examples where

  • first 4-6 weeks were lost because data was "difficult to get"
  • quality of the data turned out to be too low for the assignment/research questions

Time line

  1. November all external projects available
  2. Information Night
    • Friday the 6th of November from 17:00-19:00
    • This event will be held as a webinar.
    • The webinar will inform students about the expectations for the upcoming Thesis Fair and for the thesis itself. It consists of a presentation about the thesis, the Thesis Fair and voting for projects, and ends with a Q&A session.
  3. November 6: students can start to vote for projects
  4. 26th of November online Thesis Fair
    • speeddates and chance meetings
    • Students can meet people from the companies with external projects
  5. December Student-project matchmaking
    • students get in touch with external advisors
    • students with own project find advisor
  6. January-February THESIS DESIGN PERIOD
    • students prepare solid plan, research questions and obtain (and if needed, clean) data
    • students find an internal (UvA or VU) supervisor fitting their project
    • the supervisor must accept the thesis design, otherwise you cannot start the project
  7. March
    • rest from all those thesis worries, or
    • second chance for failed thesis designs/projects
  8. April-June Full time work on thesis project
    • With external projects this most often includes being at the company several days a week

More Time line

  1. Second half of block 4:
    • Hand in project data through datanose
    • Supervisor accepts thesis design: this is a go/no-go moment
  2. Block 5 and 6: Full time work on thesis project
    • weekly meeting with your UvA/VU supervisor
    • most often (partly) interning at an external organization

When       What
week 1     Mon: introduction; Fri: planning, milestones, risk analysis
week 1     Students set up private GitHub account, invite supervisors and fill in Google forms
week 4     Hand in related work section and logbook
week 6     Hand in mid-term results via BB (go/no-go point)
week 8     Student finds second reader
week 11    Hand in thesis via datanose (7 days before defense)
week 12    Thesis defense (last week of academic year)
week 13    Thesis defense (for delayonists)
week 14    Thesis defense (for even worse delayonists)

And if you do not make it in time?

  • Last resort is to defend in the last week of August
  • Not a good idea: maximum grade is 7.
    • This keeps the comparison fair with your peers who did the same work in two months less time.

No-go points and reasons

There are two moments when your supervisor can decide to stop the thesis project.

This means that you cannot graduate this academic year.

Insufficient thesis design

  • Supervisor believes that this is no solid basis for your project:
    • data is not (yet) available
    • insufficient research question(s)
    • no clear methodological research plan

Insufficient progress at midterm moment

What about delays?

  • Not needed, and not recommended.
  • Your thesis is 18 ECTS, and that is what it should be.
  • Quit your job for these 12 weeks, quit everything else.
  • During summer, you will receive no or hardly any supervision.
  • You must defend before end of August, or enroll and pay again.
  • All these delays are bad for you, your holiday, your family and friends, your grade, your supervisors
    • but most of all: the quality of your thesis does not improve by delaying it
    • neither does your grade! (capped at 7 for defenses in August)

Internship documents

Some companies like to draft some form of contract. UvA has a template for that at http://www.uva.nl/shared-content/studentensites/fgw/fgw-gedeelde-content/nl/az/stage-in-de-master/stappenplan-stage-lopen/3.-stageovereenkomst/stageovereenkomst.html in both Dutch and English.

Thesis project period

  • You work mostly alone on your thesis project.

Meetings (this depends on the supervisor)

  • "Daily" with external supervisor
  • Weekly with internal supervisor

Requirements on the thesis

Thesis project github repo

  • Each thesis is accompanied by a private GitHub repository which contains
    • clear folder structure
    • daily or weekly logbook
    • all documents
      • thesis design
      • mid term report
      • thesis (drafts and final)
    • all created software and data (or links to the data, if it is too big)
    • You give your supervisors (write) access to this git repository
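
The requirements above can be sketched as a small script that sets up a starting folder structure. This is only one possible layout: every folder and file name below (`docs/thesis_design`, `logbook.md`, and so on) is a hypothetical choice for illustration, not a prescribed structure, so check with your supervisor what they expect.

```python
from pathlib import Path

# Hypothetical layout: names are illustrative assumptions, not requirements.
LAYOUT = {
    "docs/thesis_design": "Thesis design",
    "docs/midterm_report": "Mid-term report",
    "docs/thesis": "Thesis drafts and final version",
    "src": "All software created for the project",
    "data": "Data, or links to the data if it is too big for the repository",
}


def scaffold(root: Path) -> list:
    """Create the folders in LAYOUT under root and return their paths."""
    created = []
    for relative_path, description in LAYOUT.items():
        folder = root / relative_path
        folder.mkdir(parents=True, exist_ok=True)
        # git does not track empty folders, so put a short README in each one.
        (folder / "README.md").write_text(description + "\n", encoding="utf-8")
        created.append(folder)

    # Daily or weekly logbook; never overwrite an existing one.
    logbook = root / "logbook.md"
    if not logbook.exists():
        logbook.write_text("# Logbook\n\n", encoding="utf-8")
    return created
```

Calling `scaffold(Path("my-thesis"))` inside a freshly cloned repository creates the folders and an empty logbook; committing and pushing them, and inviting your supervisors, remain manual steps.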

Defense

  1. Presentation (15 min)
  2. Questions (defense) (15 min)
  3. Grading (15 min)

Presentation

Advice: follow Kent Beck's summary.

  1. State the problem
  2. Say why it's an interesting problem
  3. Say what your solution achieves
  4. Say what follows from your solution

Defense 2

  • Defense is public: everybody is welcome.
  • Whole defense takes 45 minutes.
  • Defenses can be combined, e.g. 2 or 3 presentation-question sessions one after the other, followed by grading and (individual) feedback.