Live bus tracker (aka an exercise in scraping and API calls)

S0ULphIRE

My city runs free bus services inside the city called CATs (Central Area Transit).
They have a "Live" time service where you can check to see when the next bus is due at your chosen stop. - http://www.transperth.wa.gov.au/Timetables/Live-Perth-CAT-Times
But their app is unreliable and sometimes takes up to 5 mins to refresh, which kind of defeats the purpose of having a "live" tracker... and this is the "new and improved" version too lol

So I decided to try and make my own tracker that I can put on our company intranet: quick, reliable checks so you know when to head out the door, instead of getting to the stop and having to wait 5-15 mins.

Unfortunately it's a closed API, but with a little help from Fiddler you can see the call to the API being made, so all you have to do is reproduce it!

I used PowerShell to start with, but may move it to Python if I end up deciding to make a little screen/button/Arduino build next to the exit door.

Code:
#Get our initial data from the page (cookies, viewstates, etc) and store it all in a session variable I decided to call Cookies.
$r=Invoke-WebRequest -Uri 'http://www.transperth.wa.gov.au/Timetables/Live-Perth-CAT-Times' -SessionVariable Cookies

#Dump $r and check which form or field we need to access to get the required data to pass to the API. In this case, Fiddler (by Telerik) revealed we need the "RequestVerificationToken"
#So find where that data is in $r and save it so we can pass it.
$requestToken = $r.InputFields.FindByName("__RequestVerificationToken").Value

#Fiddler also showed that we needed the TabId and ModuleId set - I've just hardcoded for now, but may need to find where they're generated using the same procedure as above.
$postParams = @{RequestVerificationToken=$requestToken;Referer='http://www.transperth.wa.gov.au/Timetables/Live-Perth-CAT-Times';TabId="249";ModuleId="1478"}

#POST our data to transperth's API page, using the fields above in the header, and not forgetting to include our SessionVariable that contains our cookie data!
$response = Invoke-WebRequest -Uri 'http://www.transperth.wa.gov.au/DesktopModules/CatLiveTimesMap/API/CatLiveTimesMapApi/GetLiveCatInfo' -WebSession $Cookies -Method POST -Headers $postParams

#Bam. JSON received. Now we just parse it and grab what we want.
$catTime = $response.Content | ConvertFrom-Json
$secondsLeft = $catTime.data.Stops | where {$_.Name -eq "Lord Street"} | select -ExpandProperty ETA

$ETA = [timespan]::FromSeconds($secondsLeft)

Write-Host "The bus is due in $($ETA.Minutes) mins and $($ETA.Seconds) seconds"
 
Yay APIs!

I'm the "API guy" where I work, trying to promote better application architecture & usage of APIs.
 
Yup, so much easier than the alternatives!

Rewrote in Python too just in case :p I think I'll go ahead and build the little screen just for funsies. Yes I'm bored at work.

Code:
import requests
from bs4 import BeautifulSoup as bs

###------STATIC VARIABLES-----###
mainURL="http://www.transperth.wa.gov.au/Timetables/Live-Perth-CAT-Times"
API = 'http://www.transperth.wa.gov.au/DesktopModules/CatLiveTimesMap/API/CatLiveTimesMapApi/GetLiveCatInfo'
s=requests.Session()
###----------------------------###

###-------MAIN FUNCTION--------###
def main():
    response = SessionData()
    ETA = GetEta(response.json())
    NotifyUser(ETA)
###----------------------------###

###---------FUNCTIONS----------###
def SessionData():
    # Load the page first so the session picks up its cookies and the anti-forgery token.
    r=s.get(mainURL)
    soup=bs(r.content, "html.parser")  # name the parser explicitly; bs4 warns otherwise
    # Look the token up by field name rather than assuming it is the last <input> on the page.
    Token=soup.find("input", {"name":"__RequestVerificationToken"})["value"]
    headers={"User-Agent":"Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36",
             "RequestVerificationToken":Token,
             "Referer":"http://www.transperth.wa.gov.au/Timetables/Live-Perth-CAT-Times",
             "TabId":"249",
             "ModuleId":"1478"}
    s.headers.update(headers)
    return s.post(API)

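def GetEta(data):
    # Returns (minutes, seconds) for Lord Street, or None if the stop isn't in the response.
    for x in data['data']['Stops']:
        if x['Name'] == 'Lord Street':
            Time = divmod(x['ETA'],60)
            return Time
    return None

def NotifyUser(ETA):
    if ETA is None:
        print("Couldn't find Lord Street in the response.")
        return
    print("Your bus will arrive in",ETA[0],"minutes and",ETA[1],"seconds.")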

###----------------------------###
if __name__ == "__main__":
    main()
 
Now make it into your own, easier to use rest service lol.

Or does the Python script already expose it as a web service?
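
As posted, the script only prints to the console, so it doesn't expose anything over HTTP yet. Here's a minimal sketch of how it could, using only the standard library's http.server. The names make_handler, serve and fetch_eta, the port 8000 and the JSON shape are all made up for illustration. fetch_eta stands for any callable that returns (minutes, seconds), for example a wrapper around the GetEta/SessionData functions above.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(fetch_eta):
    """Build a request handler that serves whatever fetch_eta() returns.

    fetch_eta is a hypothetical callable returning (minutes, seconds).
    """

    class EtaHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            try:
                minutes, seconds = fetch_eta()
                status, payload = 200, {"minutes": minutes, "seconds": seconds}
            except Exception as exc:  # upstream hiccup: report it instead of crashing
                status, payload = 502, {"error": str(exc)}
            body = json.dumps(payload).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, fmt, *args):
            pass  # keep the console quiet

    return EtaHandler


def serve(fetch_eta, port=8000):
    """Serve the ETA as JSON on the given port until interrupted."""
    HTTPServer(("", port), make_handler(fetch_eta)).serve_forever()
```

Something like serve(lambda: GetEta(SessionData().json())) would then answer every GET with a small JSON object. It calls Transperth on every request, so a short cache in front of fetch_eta would be kinder if lots of people poll it.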

 
MY. GOD. I'm at screen punching levels right now. I decided it'd be easier to host on our intranet if I just rewrote it as a PHP plugin instead of Python, but for some reason it works and works and works... then fails and fails and fails and fails. Then works again.

Snipped to relevant bits. $contents contains a JSON string without fail. So I try to parse it. That's where **** hits the fan. I've even stolen someone's special "json_clean_decode" function in an attempt to sanitize the JSON string juuuust in case it's got something weird in there, but still it'll work 5-20 times in a row...then magically doing a var_dump on $decodedResult returns NULL. WHY.

Code:
$contents = file_get_contents($apiURL, false, $context);

$decodedResult = json_clean_decode($contents);

var_dump($decodedResult);

function json_clean_decode($json, $assoc = True, $depth = 512, $options = 0) {
    // search and remove comments like /* */ and //
    $json = preg_replace("#(/\*([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*+/)|([\s\t]//.*)|(^//.*)#", '', $json);
   
    if(version_compare(phpversion(), '5.4.0', '>=')) {
        $json = json_decode($json, $assoc, $depth, $options);
    }
    elseif(version_compare(phpversion(), '5.3.0', '>=')) {
        $json = json_decode($json, $assoc, $depth);
    }
    else {
        $json = json_decode($json, $assoc);
    }
    return $json;
}
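
json_decode returns NULL when it can't parse what it was given, so the intermittent NULL suggests the body isn't always the JSON it looks like. In PHP, json_last_error_msg() (5.5+) reports the reason. Candidates worth ruling out, all guesses rather than a diagnosis: an empty body, an HTML error page, a leading BOM, or a compressed response. The comment-stripping regex could also eat part of a valid payload if a string value ever contains whitespace followed by //. Since there's already a Python version upthread, here's a rough sketch of the same check in Python. decode_or_explain is a made-up helper name.

```python
import json


def decode_or_explain(raw: bytes):
    """Return (data, None) on success or (None, reason) on failure."""
    if not raw or not raw.strip():
        return None, "empty body"
    try:
        text = raw.decode("utf-8-sig")  # utf-8-sig also drops a leading BOM
    except UnicodeDecodeError as exc:
        # Compressed or otherwise binary bodies usually end up here.
        return None, f"not UTF-8 ({exc.reason}); first bytes: {raw[:16]!r}"
    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        # Show where parsing stopped and what the body starts with.
        return None, f"{exc.msg} at char {exc.pos}; starts with: {text[:40]!r}"
```

Logging the reason (and the raw body) on the failing runs should show whether the string really is the same JSON every time.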
 
Here's a bit I did in a quick C# console app:

Code:
string token;
using (WebClient wClient = new WebClient())
{
	string html = wClient.DownloadString("http://www.transperth.wa.gov.au/Timetables/Live-Perth-CAT-Times");
	var m = Regex.Match(html, "<input name=\"__RequestVerificationToken\" type=\"hidden\" value=\"(?<Token>.*?)\"", RegexOptions.IgnoreCase);
	token = m.Groups["Token"].ToString();
}


var client = new RestClient("http://www.transperth.wa.gov.au/");
var request = new RestRequest("DesktopModules/CatLiveTimesMap/API/CatLiveTimesMapApi/GetLiveCatInfo", Method.POST);

var moduleParam = new RestSharp.Parameter();
moduleParam.ContentType = "text/plain";
moduleParam.Type = ParameterType.HttpHeader;
moduleParam.Name = "ModuleId";
moduleParam.Value = "1478";
request.Parameters.Add(moduleParam);

var tabParam = new RestSharp.Parameter();
tabParam.ContentType = "text/plain";
tabParam.Type = ParameterType.HttpHeader;
tabParam.Name = "TabId";
tabParam.Value = "249";
request.Parameters.Add(tabParam);

var verifToken = new RestSharp.Parameter();
verifToken.ContentType = "text/plain";
verifToken.Type = ParameterType.HttpHeader;
verifToken.Name = "RequestVerificationToken";
verifToken.Value = token;
request.Parameters.Add(verifToken);

var response = client.Execute(request) as IRestResponse;
Console.WriteLine(response.Content);

Didn't try parsing the JSON because I didn't feel like writing another class to map the response onto :lol:. Would have done it with Newtonsoft's JSON.NET, though.
 
Welp, irony of ironies, I've decided to screw JSON and just do a string search :p Seems that PHP's "json_decode" is a delicate little flower that likes to **** up whenever it wants, so ya know what, I'm not even gonna use ya! Take that!
Code:
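$stop = "\"Lord Street\",\"Description\":\"Yellow\",\"ETA\":";
$pos = strpos($contents, $stop);
if ($pos === false) {
    echo "Couldn't find Lord Street in the response\n";
} else {
    // Skip past the search string itself (no hardcoded 43), then keep digits only.
    $ETA = substr($contents, $pos + strlen($stop), 4);
    $numETA = (int) preg_replace("/[^0-9]/", "", $ETA);
    echo "Your cat is due in " . floor($numETA / 60) . " minutes and " . ($numETA % 60) . " seconds\n";
}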

Dirty as hell but works 100% of the time, until Transperth decide to change their json layout, buuut I don't think that's too likely given how slow they are to change anything at all.
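
If the string-search route sticks around, a regex makes it a bit less dependent on exact offsets and on the Description value. A rough Python sketch, assuming "Name" comes before "ETA" inside each stop object and that ETA is a whole number of seconds (that's what the search string above implies, not something confirmed from the API). eta_seconds is a made-up helper name.

```python
import re


def eta_seconds(contents: str, stop_name: str):
    """Return the ETA (in seconds) that follows the given stop name, or None."""
    pattern = (
        r'"Name"\s*:\s*"%s"'    # the stop's Name field
        r'[^{}]*?'              # anything else inside the same object
        r'"ETA"\s*:\s*(\d+)'    # its ETA, digits only
    ) % re.escape(stop_name)
    match = re.search(pattern, contents)
    return int(match.group(1)) if match else None


# Made-up payload in the field order the PHP search string implies.
sample = '{"data":{"Stops":[{"Name":"Lord Street","Description":"Yellow","ETA":312}]}}'
minutes, seconds = divmod(eta_seconds(sample, "Lord Street"), 60)
print(f"Your cat is due in {minutes} minutes and {seconds} seconds")
```

It still breaks if the layout changes, but it returns None instead of reading garbage when the stop isn't there.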

Ah well, added to intranet and it does what it's supposed to.
Next up, making a little screen to put next to the exit door
 