intro to websched
- To: tvdevel at Bleb <tvdevel@xxxxxxxx>
- Subject: intro to websched
- From: Geraint Edwards <gedge-lists@xxxxxxxx>
- Date: Mon, 21 Aug 2006 13:23:24 +0100
- Organisation: Caerdydd, Cymru / Cardiff, Wales
- User-agent: Mutt/22.214.171.124i
Just joined, so here's my - or, rather, websched's - intro.
For years (probably over ten years), I've used and developed
<websched>, a CGI program, written in perl, which presents events
(television programmes as well as cinema, theatre, gigs, etc) to
the viewer, in a rather large (700k) HTML page. According to
user preferences, it shows TV programmes in bigger/smaller fonts,
so you don't miss that crucial repeat of Big Brother. ;-)
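As an illustration only, preference-driven font sizing of this kind could be sketched as below. This is Python rather than websched's own Perl, and the rule list, the function names and the use of a <font> tag are all assumptions made for the example, not taken from the script.

```python
import html
import re

# Hypothetical preference rules: (pattern, font size change). First match wins.
RULES = [
    (re.compile(r"big brother", re.IGNORECASE), +2),
    (re.compile(r"\bnews\b", re.IGNORECASE), -1),
]


def font_delta(title, rules=RULES):
    """Return the font size adjustment for a title (0 if no rule matches)."""
    for pattern, delta in rules:
        if pattern.search(title):
            return delta
    return 0


def render(title, rules=RULES):
    """Return the HTML-escaped title, wrapped in a <font> tag if a rule matched."""
    delta = font_delta(title, rules)
    safe = html.escape(title)
    if delta == 0:
        return safe
    return '<font size="%+d">%s</font>' % (delta, safe)
```

Keeping the rules as an ordered list means a viewer can put the most specific patterns first and fall through to broader ones.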
The script runs locally on my Unix box, but may well work on a
Windows box with perl, lynx (or similar) and (perhaps, optionally)
a web server (happy to help make it more portable). It downloads
pages from various sources across the net (using lynx), parses
them and collates the results into a personalised entertainment schedule.
An (inactive) example of the output can be seen here:
(it seems to look best in Firefox). For telly listings, the
format of each line is:
Day StartTime EndTime ResizeFont Length-indicator Title(OptionalEpisode) Channel
(If channel is marked in red, there is a regional clash, for those
of us that get multiple regions - e.g. BBC2 versus BBC2Wales).
The script uses listings from many sites, so I assume that the
page that it produces is probably subject to personal-use-only
clauses, so I do not allow public access to run the script off my
web server, but I have shared the script with a few friends so
It's heavily biased towards my location (Cardiff venues, Welsh
telly channels). This is not hard to change, but it is also
non-trivial, particularly to the perl-challenged. Listings are
typically downloaded as rendered pages (lynx's text output, not raw HTML), and I have
developed what could be called a format-description syntax which
grabs the event details by parsing the rendered listings.
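The actual format-description syntax is not shown in this post, so the following is only a rough sketch of the general idea: a short description string with named fields is compiled into a regular expression, which is then run over each line of the rendered text. The `{start}`/`{end}`/`{title}` placeholders and the function names are invented for the example.

```python
import re

# Hypothetical field patterns; the real websched syntax is not shown in the post.
FIELD_PATTERNS = {
    "start": r"\d{1,2}[:.]\d{2}",
    "end": r"\d{1,2}[:.]\d{2}",
    "title": r".+?",
}


def _literal(text):
    """Escape literal text, letting any run of whitespace match flexibly."""
    return r"\s+".join(re.escape(part) for part in re.split(r"\s+", text))


def compile_format(description):
    """Turn a description such as '{start}-{end} {title}' into a compiled regex."""
    parts = []
    pos = 0
    for m in re.finditer(r"\{(\w+)\}", description):
        parts.append(_literal(description[pos:m.start()]))
        name = m.group(1)
        parts.append("(?P<%s>%s)" % (name, FIELD_PATTERNS[name]))
        pos = m.end()
    parts.append(_literal(description[pos:]))
    return re.compile(r"^\s*" + "".join(parts) + r"\s*$")


def parse_listing(text, description):
    """Return one dict per line of rendered text that fits the description."""
    pattern = compile_format(description)
    events = []
    for line in text.splitlines():
        m = pattern.match(line)
        if m:
            events.append(m.groupdict())
    return events
```

Lines that do not fit the description (page headers, adverts, blank lines) are simply skipped, which is one way a parser like this can tolerate noise in a rendered page.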
It may not be 100% accurate, but I reckon it's better than 99%.
The code is a mixture of really old and relatively new, so it
could use a major tidy up and update. But, hey, it works, and
it's only tv! ;-)
Anyway, back to the present: I was sourcing a handful of
listings from sky.com, but that site recently changed format, which
made it useless (or a very complicated programming job!) as far
as my parsing system goes. So, I found a few other sources for
the channels that had been sourced from Sky (I prefer to get them
from the publisher themselves, where possible), but could not
find sources for two of them: Sky Two and Eurosport GB 2.
Now, thanks to Andrew for providing the site/info, my script is
happily fed with Sky Two's XML listings, and has thus been
updated with an XML parser (wow, a lot easier than parsing
rendered HTML pages!) and a page-fetch delaying algorithm (as
per the house rules).
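To show why the XML route is so much easier, here is a small sketch using Python's standard xml.etree.ElementTree. The element and attribute names in SAMPLE are assumptions invented for the example (the real feed's layout is not given in this post), as are the sample titles.

```python
import xml.etree.ElementTree as ET

# Hypothetical listing layout; the real feed's element names may differ.
SAMPLE = """
<channel id="sky_two">
  <programme>
    <title>Example Show</title>
    <start>1900</start>
    <end>2000</end>
  </programme>
  <programme>
    <title>Another Show</title>
    <start>2000</start>
    <end>2100</end>
  </programme>
</channel>
"""


def parse_channel(xml_text):
    """Return a list of event dicts, one per <programme> element."""
    root = ET.fromstring(xml_text)
    channel = root.get("id", "")
    events = []
    for prog in root.iter("programme"):
        events.append({
            "channel": channel,
            "title": (prog.findtext("title") or "").strip(),
            "start": (prog.findtext("start") or "").strip(),
            "end": (prog.findtext("end") or "").strip(),
        })
    return events
```

Compared with matching patterns against rendered text, the field boundaries here come straight from the markup, so there is nothing to guess.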
- download from multiple sites in parallel (delay optional, per source)
- new sources can often be added "easily" (if the website format is regular)
- pages are cached, only being updated after N hours (configurable per channel)
- multi-level user-configurable highlighting (lowlighting?) for telly programmes (regexps)
- make my PC send reminders to my set-top box (using infra-red) (!)
- popup reminders on my PC (not that bothered)
- make the viewer's list of channels browser-selectable
- tidy the rendered-page parser to produce XML
(may be handy to someone else - e.g. for ITV channels,
which I've not got the lack-of-XML problem with)
next time I need a diversion from the day job:
- try to find Eurosport GB 2 listings again!
(No rush, as the Tour de France isn't on at the moment.)
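For anyone curious about the caching and fetch-delay items in the list above, a minimal sketch (in Python, not websched's own code) might look like this. The default of 12 hours, the function names and the data shapes are all assumptions for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical default; the post only says the lifetime is configurable per channel.
DEFAULT_MAX_AGE_HOURS = 12


def needs_refresh(fetched_at, now, max_age_hours=DEFAULT_MAX_AGE_HOURS):
    """True if a page was never fetched or its cached copy is at least N hours old."""
    if fetched_at is None:
        return True
    return now - fetched_at >= timedelta(hours=max_age_hours)


def schedule_fetches(pages, start, delays):
    """Plan fetch times for (site, page) pairs.

    Different sites start at the same time (so they can run in parallel),
    while pages from the same site are spaced by that site's delay in seconds.
    """
    next_slot = {}
    plan = []
    for site, page in pages:
        when = next_slot.get(site, start)
        plan.append((when, site, page))
        next_slot[site] = when + timedelta(seconds=delays.get(site, 0))
    return plan
```

A site with no entry in `delays` gets no spacing at all, which matches the "delay optional, per source" idea.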
Hope someone finds this useful. Must get to work.
Geraint A. Edwards (aka "Gedge")