Yesterday I was watching a show with a friend and I tried to manually track it in Trakt – but it wasn’t there! That is, the show was there but the 2 most recent seasons were missing. I found the show’s website and they had a complete listing of the episodes, with descriptions and air dates. How can I get this data into TMDB and TVDB so that I can track it in Trakt?
Both sites are missing a “bulk add” feature – likely to prevent flooding the site with bad data. You can enter episodes one at a time though, so I decided to take a peek at how the frontend was sending this to the backend. I found the two POST requests and they seemed simple enough to replicate. We’ll walk through it here.
Scrobbling
I’m a big fan of quantified self (QS) and media consumption. Last.fm introduced “scrobbling” so that music playback data could be pulled to a central location. Insights can be gained from data like this.
Spotify has their year-end round-up, which is possible since they have all the data – consumption goes through them. “Spotify Wrapped” is a fun way to show a user their data.
On the video side, services like XBMC and Plex can be set up to scrobble to Trakt. Trakt does its own version of the “Wrapped” year-end report. It’s also useful for pushing data back out: your watched data (progress through a TV show or movie) can be pushed to some clients, so it can be restored after a reinstall or synced across devices.
Trakt pulls its own data of what movies/shows/episodes exist in the universe from other data sources like TVDB and TMDB. Occasionally shows, seasons, or episodes are missing from these sources. Users can contribute to correct, update, or otherwise fill in the gaps in this data.
Prepare our Tools
I’m using just basic built-in browser tools. I’ll use Chrome here, but all the major browsers have decent enough developer tools. We’ll want to make sure the console logs and network requests are tracked AND persist across page navigation.
- Open developer tools
- Open Console, hit gear icon on second row (far right), check “Preserve log”
- Open Network, check “Preserve log” on second row
We’ll leave the developer tools open so they can record new network requests.
1) Getting the Data
The show’s website has tiles for each episode, with name, number, description and air date. I didn’t see a JSON file getting loaded in the network traffic on page load, so it looks like scraping from the HTML itself is necessary.
In the next section, we’ll see that TMDB takes these fields: episode_number, name, overview, air_date. We can structure the data with those property names.
Run the code in the console:
var exportEpisodes = [];
document
  .querySelectorAll(".content.mosaic.active .component-wrapper > div.row > div.columns")
  .forEach((ep, epnum) => {
    // defaults, using TMDB's property names
    var buildEp = {
      id: null,
      episode_number: epnum + 1,
      name: "",
      overview: "",
      air_date: null, // e.g. "2021-10-09T04:00:00.000Z"
      runtime: null,
      locked_fields: "",
    };
    // the episode name is the part of the tile title after " | "
    var titleParts = ep.querySelector("h4.mosaic-title").innerText.split(" | ");
    buildEp.name = (titleParts[1] || "").trim() || buildEp.name;
    // the caption holds the description, then "Original Air Date: <date>"
    var caParts = ep.querySelector(".mosaic-caption").innerText.split("Original Air Date: ");
    buildEp.overview = caParts[0].trim();
    // toJSON() returns null for an unparseable date, so the default is kept
    buildEp.air_date = new Date(caParts[1]).toJSON() || buildEp.air_date;
    exportEpisodes.push(buildEp);
  });
JSON.stringify(exportEpisodes); // copy this for use later
Here, we initialize with some default empty values, grab innerText, and do some light processing for formatting purposes.
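Since scraping is fragile, it can be worth checking the result before submitting anything. Here is a minimal sketch of such a check; findEpisodeProblems and the sample data are made up for illustration and are not part of either site.

```javascript
// Hypothetical helper: list problems in scraped episodes before submitting them.
function findEpisodeProblems(episodes) {
  const problems = [];
  const seen = new Set();
  episodes.forEach((ep) => {
    const label = `episode ${ep.episode_number}`;
    if (!ep.name) problems.push(`${label}: missing name`);
    if (!ep.overview) problems.push(`${label}: missing overview`);
    // air_date stays null when the date text could not be parsed
    if (!ep.air_date) problems.push(`${label}: missing air_date`);
    if (seen.has(ep.episode_number)) problems.push(`${label}: duplicate number`);
    seen.add(ep.episode_number);
  });
  return problems;
}

// Made-up sample data in the same shape as exportEpisodes
const sampleEpisodes = [
  { episode_number: 1, name: "Pilot", overview: "It begins.", air_date: "2021-10-09T04:00:00.000Z" },
  { episode_number: 2, name: "", overview: "No title was scraped.", air_date: null },
];
console.log(findEpisodeProblems(sampleEpisodes));
// [ 'episode 2: missing name', 'episode 2: missing air_date' ]
```

In the console you would call findEpisodeProblems(exportEpisodes) and fix the selectors if the list is not empty.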
2) Bulk Add to TMDB
TMDB has a form for a single episode that does a POST through JavaScript when you submit. We can see this in the network log.
Based on that, I played around with fetch() but the server didn’t like the data I was sending. The headers of my initial test run with the form said it was made with XMLHttpRequest, so I switched to that (more complicated but more control), by way of jQuery’s $.ajax.
On that same page with the form, enter this in the console. Add the output from step (1) above to the parse function. This will run a separate POST request for each episode – if you have a lot, it’s possible you could get rate limited.
var newEps = JSON.parse(`...the stringify output from step 1...`);
// only run on a season edit page, i.e. a path like /tv/<show>/season/<n>/edit
if (document.location.pathname.match(/\/tv\/[^\/]+\/season\/\d+\/edit/i)) {
  postNextQueueItem(newEps);
}

function postNextQueueItem(queue) {
  // pop() takes from the end of the array, so the last episode is posted first
  var nextItem = queue.pop();
  if (!nextItem) return;
  console.log(`Adding episode ${nextItem.episode_number}`);
  // swap the trailing "edit" in the path for the endpoint the form posts to
  var fetchUrl = document.location.origin + document.location.pathname.slice(0, -4) + "remote/episodes?timezone=America/New_York&translate=false";
  $.ajax({
    url: fetchUrl,
    method: 'POST',
    dataType: 'json',
    data: { data: JSON.stringify(nextItem) }
  }).fail(function(e) {
    console.error('There was a problem.', e);
  }).done(function(response) {
    if (response.success) {
      console.log(`Added episode ${nextItem.episode_number}`, response);
      // only move on once the server confirms the previous episode
      postNextQueueItem(queue);
    }
  });
}
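If rate limiting does become a problem, one option is to pause between requests. This is only a sketch under my own assumptions: postSequentially is a hypothetical helper, the one-second default delay is a guess rather than a documented limit, and fakeSend stands in for the real request so the example runs anywhere.

```javascript
// Hypothetical helper: send items one at a time, pausing between requests.
// `send` is any function that takes an item and returns a Promise of the response.
async function postSequentially(items, send, delayMs = 1000) {
  const added = [];
  for (const item of items) {
    const response = await send(item);
    // stop at the first response that does not report success
    if (!response || !response.success) break;
    added.push(item);
    // wait before the next request (the delay value is a guess)
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return added;
}

// Stand-in for the real request
const fakeSend = async (ep) => ({ success: true, episode_number: ep.episode_number });
postSequentially([{ episode_number: 1 }, { episode_number: 2 }], fakeSend, 10)
  .then((added) => console.log(`Added ${added.length} episodes`));
```

In the console, send would wrap the same $.ajax call used above, and the items would be the parsed episodes.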
3) Bulk Add to TVDB
The first things I noticed: the entire page is a form, so we’ll need to persist network requests in the dev tools to see the form submission. Multiple episodes can be submitted with one request, but no more than 25 at a time.
// we'll run this from the season episode list page, e.g. https://thetvdb.com/series/some-series-name/seasons/official/9
var newEps = JSON.parse(`...the stringify output from step 1...`);
var fData = new FormData();
newEps.forEach(ep => {
  fData.append('number[]', ep.episode_number);
  fData.append('name[]', ep.name);
  fData.append('overview[]', ep.overview);
  // keep only the YYYY-MM-DD part; send an empty value when no date was scraped
  fData.append('date[]', ep.air_date ? String(ep.air_date).substring(0, 10) : '');
  fData.append('runtime[]', '');
});
As seen in the original form request, each episode is just appended with the same field name, so there is no index or anything to differentiate – only the order.
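To see that ordering in action, here is a tiny standalone example with made-up values. FormData is built into browsers (and into Node 18 or later), and getAll() returns every value stored under a repeated key in the order it was appended.

```javascript
// Two made-up episodes appended under the same field names
const demo = new FormData();
[{ n: 1, name: "First" }, { n: 2, name: "Second" }].forEach((ep) => {
  demo.append("number[]", ep.n);
  demo.append("name[]", ep.name);
});

// Values come back as strings, in insertion order
console.log(demo.getAll("number[]")); // [ '1', '2' ]
console.log(demo.getAll("name[]"));   // [ 'First', 'Second' ]
```

So the first number lines up with the first name, the second with the second, and so on.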
The form has a bunch of extra fields that we don’t need to fill in, but the site’s original request included them, so we’ll just add them as empty values.
['companytypes[]', 'companynames[]', 'companyids[]', 'companydetails[]'].forEach(field => {
  fData.append(field, '');
});
// one empty content rating per country code
[
  'arg', 'arm', 'aus', 'bra', 'khm', 'can', 'chl', 'col', 'hrv', 'dnk',
  'ecu', 'slv', 'fin', 'fra', 'deu', 'grc', 'hkg', 'hun', 'isl', 'ind',
  'idn', 'irl', 'isr', 'ita', 'ltu', 'mys', 'mex', 'mar', 'nld', 'nzl',
  'nor', 'per', 'phl', 'pol', 'prt', 'rou', 'rus', 'sgp', 'svk', 'svn',
  'zaf', 'kor', 'esp', 'twn', 'tha', 'tur', 'ukr', 'usa', 'ven', 'gbr',
  'jpn', 'swe'
].forEach(code => {
  fData.append(`contentratings[${code}]`, '');
});
Now we can submit. I played with this a bit because it wasn’t accepting the input. It seems jQuery’s automatic handling of the request data (serializing it and setting the content type) was getting in the way, so we can just disable both with processData: false and contentType: false.
$.ajax({
  url: document.location.href + '/savebulkadd',
  type: 'POST',
  data: fData,
  contentType: false, // let the browser set the multipart content type
  processData: false, // send the FormData as-is instead of serializing it
  cache: false
}).fail(function(e) {
  console.error('[savebulkadd] error', e);
}).done(function(e) {
  if (e.success) {
    console.log('[savebulkadd] done - success', e);
  } else {
    console.log('[savebulkadd] done', e);
  }
});
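Since no more than 25 episodes can go in one request, a longer season would need to be split into batches first. A minimal sketch, where chunk is a hypothetical helper and the episodes are placeholders:

```javascript
// Hypothetical helper: split an array into batches of at most `size` items.
function chunk(items, size = 25) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// 60 placeholder episodes -> batches of 25, 25 and 10
const placeholderEps = Array.from({ length: 60 }, (_, i) => ({ episode_number: i + 1 }));
console.log(chunk(placeholderEps).map((batch) => batch.length)); // [ 25, 25, 10 ]
```

Each batch would then get its own FormData and its own savebulkadd request, built the same way as above.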
Conclusion
It can take some trial and error to get the right settings to replicate their add-episode requests, but once you do it’s not much work to do bulk submissions. The browser developer tools are great for reverse engineering this stuff from the network requests. Now I’m off to do some scrobbling!