This blog post covers an open-source timetable parsing project I released a couple of months ago. It is available at https://timetable.josephduffy.co.uk and the source is available on GitHub. The post won't go too in-depth on the technical side of the project; instead it tells the story of how I discovered it was possible.
Since starting my studies at the University of Huddersfield I've always wanted an easy way to see my timetable on my phone. The timetable available on the website isn't responsive and relies on POST data to display future weeks' timetables, two things that don't work well on mobile, especially when the page is kept open in the background.
To get around this I would manually add each of my lectures and practicals to my calendar. These events could be set as recurring, however they would often need removing on specific days (such as during holidays) or have different information on another date, such as a room change. All of this eventually led me to think about the famous XKCD Automation comic, so I started work on a method of automatically adding my timetable to my calendar.
Automating
My first idea for how to automate the process was to create a Google Chrome extension. To try and figure out if this was possible I loaded up my timetable.
Here I noticed that the URL didn't have any information about the timetable. My next thought was that my student number must be stored in a cookie, so I opened up the Developer Tools to inspect the request. To my surprise it was a POST request... with the student number as part of the form data.
But surely it would just be using that to validate that I was the user I said I was, right? I loaded up my favourite HTTP utility, DHC, and created a basic request to load my own timetable.
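A minimal sketch of that kind of request is shown here. The URL and the form field name are made up for illustration, since the post doesn't give the real ones, and this is not the project's actual code. It only builds the options for a form-encoded POST whose body carries nothing but a student number.

```javascript
// Hypothetical endpoint and field name: the real ones are not given in this post.
const TIMETABLE_URL = "https://timetable.example.invalid/student";

// Build the options for a form-encoded POST carrying only the student number.
function buildTimetableRequest(studentNumber) {
  const body = new URLSearchParams({ studentNumber: String(studentNumber) });
  return {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: body.toString(),
  };
}

// Usage (Node 18+ ships a global fetch):
// const response = await fetch(TIMETABLE_URL, buildTimetableRequest("1234567"));
// const html = await response.text();
```

The point is how little the request contains: no cookie, no session, just one form field.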
It worked! Just to double check I messaged one of my friends explaining the situation and asking him for his student number. He sent me the number and, again, it worked! My first thought was that I was happy that I'd found a way that I might be able to easily scrape the data I needed. My second thought was that it was a bit worrying that by only knowing someone's student number you could find out where they were likely to be. Despite this I started thinking of how I could truly automate this.
EaaS - Exploitation as a Service
Since it was so easy to access the data I thought it'd be a good idea to make the service available to others. Creating a Google Chrome extension would prevent some users from using the service and could make it a little harder to get the timetable into a user's calendar. The calendar would also not automatically update. The overall user experience would be worse.
Look at the state of this place!
Getting the current week's timetable is easy, but what about future weeks? To figure this out I loaded up my timetable again and changed the value in the "Week beginning" dropdown.
As you can see, there's a bit more going on this time. However, having worked with ASP.NET before, I could see that it wouldn't be too hard to make the request work. So I made another request:
Now we have ~~where people will be for the rest of the academic year~~ all of my future timetables!
Any application that can be written in JavaScript, will eventually be written in JavaScript
As per Atwood's Law, any application that can be written in JavaScript, will eventually be written in JavaScript, so naturally I turned towards Node.js.
I stuck with Express and found jsdom, a lovely framework for working with a DOM on the server. This would then allow me to pull the information and traverse the DOM on the server side. It might not handle errors too well but my timetable's markup doesn't appear to have changed since I started University, so it'll do.
Since I've got the DOM on the server side, to get future timetables I can simply take the values from the hidden inputs __VIEWSTATE and __EVENTVALIDATION and send them with the request. Simple!
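The idea can be sketched roughly as follows. To keep the example dependency-free it pulls the hidden inputs out with a regular expression instead of jsdom, so treat it as a stand-in for the approach described above, not as the project's code. The function names and any extra form fields (such as the week selector) are hypothetical.

```javascript
// Read the value attribute of a named <input> from raw HTML.
// A regex stand-in for a real DOM lookup; returns null if the input is absent.
function readHiddenInput(html, name) {
  const tag = html.match(new RegExp(`<input[^>]*name="${name}"[^>]*>`, "i"));
  if (!tag) return null;
  const value = tag[0].match(/value="([^"]*)"/i);
  return value ? value[1] : "";
}

// Copy the ASP.NET state fields from the previous response into a new
// form-encoded body, then add whatever fields the next request needs.
function buildPostbackBody(html, extraFields) {
  const body = new URLSearchParams();
  for (const name of ["__VIEWSTATE", "__EVENTVALIDATION"]) {
    const value = readHiddenInput(html, name);
    if (value === null) throw new Error(`Missing hidden input ${name}`);
    body.set(name, value);
  }
  for (const [key, value] of Object.entries(extraFields)) {
    body.set(key, value);
  }
  return body.toString();
}
```

URLSearchParams takes care of percent-encoding, which matters because these values are typically Base64 and contain characters like + and =.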
Who doesn't love a good RFC?
Now came the tedious part: extracting the data and converting it to a format that calendar applications will understand. I've created single calendar events in the past, but never a full calendar with lots of extra fields, such as the VALARM. Overall the iCalendar specification is rather long and complicated, but it's easy enough to focus on the parts needed for the project. Primarily I had to ensure that events would trigger at the right time, independent of daylight saving time, which means adding ;TZID=Europe/London to all event dates.
Add a couple of options for adding alarms prior to events and set the correct Content-Type headers and you're set! There were a few kinks to work out but I've been using it for a few weeks now and love it.
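To make the pieces concrete, here is a small sketch of building such a calendar by hand. It is an illustration, not the project's code: the function names, the event object shape and the PRODID are all invented. It also cuts corners that a real feed should not: it doesn't escape commas or semicolons in text values, doesn't fold long lines, and omits the VTIMEZONE component that a strict reading of RFC 5545 expects alongside a TZID.

```javascript
// Format a local wall-clock time as an iCalendar DATE-TIME (YYYYMMDDTHHMMSS).
function formatLocal({ year, month, day, hour, minute }) {
  const pad = (n, width = 2) => String(n).padStart(width, "0");
  return `${pad(year, 4)}${pad(month)}${pad(day)}T${pad(hour)}${pad(minute)}00`;
}

// Build the lines of one VEVENT, with an optional display alarm before it.
function buildEvent(event, alarmMinutes) {
  const lines = [
    "BEGIN:VEVENT",
    `UID:${event.uid}`,
    `DTSTAMP:${event.stamp}`, // UTC timestamp, e.g. 20150101T000000Z
    `DTSTART;TZID=Europe/London:${formatLocal(event.start)}`,
    `DTEND;TZID=Europe/London:${formatLocal(event.end)}`,
    `SUMMARY:${event.summary}`,
    `LOCATION:${event.location}`,
  ];
  if (alarmMinutes > 0) {
    lines.push(
      "BEGIN:VALARM",
      "ACTION:DISPLAY",
      `DESCRIPTION:${event.summary}`,
      `TRIGGER:-PT${alarmMinutes}M`, // negative duration: fire before the start
      "END:VALARM"
    );
  }
  lines.push("END:VEVENT");
  return lines;
}

// Wrap the events in a VCALENDAR; iCalendar lines end with CRLF.
function buildCalendar(events, alarmMinutes = 0) {
  const lines = ["BEGIN:VCALENDAR", "VERSION:2.0", "PRODID:-//example//timetable sketch//EN"];
  for (const event of events) lines.push(...buildEvent(event, alarmMinutes));
  lines.push("END:VCALENDAR");
  return lines.join("\r\n") + "\r\n";
}

// Headers to send with the response so clients treat it as a calendar.
const CALENDAR_HEADERS = { "Content-Type": "text/calendar; charset=utf-8" };
```

Because the start and end times are written as local times tagged with Europe/London, a 09:15 lecture stays at 09:15 on both sides of a clock change.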
What you don't know can't hurt you
Before I released the code or created the website I spoke to one of my lecturers to ask whether he knew if I was breaking any rules. Apparently he (along with other members of staff) had told the team responsible for the timetable website about the security issue, and they had decided not to do anything about it and essentially ignore the problem. That's up to them, but personally I think it's a little creepy that someone could make a website where people can view the timetable of anyone at the University. But who'd do that?