
Missing Data for 2018, 2019

alecwinshel wrote 4 years ago: 1

I'm working on a summer project for my Projects in Programming class using the TV Maze API. I'm trying to recreate this infographic developed by FX Networks Research that shows the growth of programming across Broadcast, Pay Cable, Basic Cable, and OTT Services in the past decade. The data I found here closely matches theirs, and it all seemed perfect! When it came to 2018, though, there was a sharp drop-off. Although more scripted shows were released in 2019 than in 2018, the information I downloaded from here suggests otherwise.

The TV Maze data suggests there were 6,110 scripted US episodes in 2017 but only 2,071 scripted US episodes in 2018. It doesn't seem like that can be true! Is there missing data? A change in the TV Maze methodology? Let me know if you can help me out.


gazza911 wrote 4 years ago: 1

TVMaze relies on its users to enter data (some of which can be imported from other sources by those with sufficient permissions), so it's possible that some shows/episodes are missing.

Could you please explain which endpoints you used to gather the data, so that I can verify whether they would have provided the data you need?

alecwinshel wrote 4 years ago: 1

I wrote a Python script to parse through all episode numbers in the database (140 at a time) using:

http://api.tvmaze.com/episodes/#?embed=show

It's such a rich database prior to 2018 that it's hard to understand the drop-off. I would have thought that recent years would be even more completely filled out.


JuanArango wrote 4 years ago: 1

alecwinshel wrote:
I wrote a Python script to parse through all episode numbers in the database (140 at a time) using:

http://api.tvmaze.com/episodes/#?embed=show

It's such a rich database prior to 2018 that it's hard to understand the drop-off. I would have thought that recent years would be even more completely filled out.

A question about the scripted episodes.

Are those episodes that aired in that year or episodes that were added in that year? That makes a huge difference.

alecwinshel wrote 4 years ago: 1

JuanArango wrote:
A question about the scripted episodes.

Are those episodes that aired in that year or episodes that were added in that year? That makes a huge difference.

Only scripted US episodes that aired in that calendar year - not the time that they were added.
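
For anyone following along, the counting rule described here can be sketched roughly as below. This is only an illustration, not the actual script: it assumes each episode was fetched with embed=show, so the show sits under `_embedded.show`, and that the `airdate`, `type`, and `network.country.code` fields look the way TVMaze's JSON normally does. The helper names and the `sample` records are made up.

```python
from collections import Counter


def is_scripted_us(show):
    """True if the show is typed "Scripted" and its network is in the US.

    Assumes the usual TVMaze show fields; shows without a network
    (for example web-only shows) are treated as non-US here.
    """
    network = show.get("network") or {}
    country = network.get("country") or {}
    return show.get("type") == "Scripted" and country.get("code") == "US"


def count_by_air_year(episodes):
    """Count scripted US episodes per calendar year of their airdate."""
    counts = Counter()
    for ep in episodes:
        show = (ep.get("_embedded") or {}).get("show") or {}
        airdate = ep.get("airdate") or ""  # expected as "YYYY-MM-DD"
        year = airdate[:4]
        if len(year) == 4 and year.isdigit() and is_scripted_us(show):
            counts[int(year)] += 1
    return counts


def _ep(airdate, show_type, country_code):
    """Build a made-up episode record shaped like an embed=show response."""
    network = {"country": {"code": country_code}} if country_code else None
    return {"airdate": airdate,
            "_embedded": {"show": {"type": show_type, "network": network}}}


sample = [
    _ep("2017-03-01", "Scripted", "US"),
    _ep("2018-05-10", "Scripted", "US"),
    _ep("2018-06-01", "Reality", "US"),   # wrong type, skipped
    _ep("", "Scripted", "US"),            # no airdate, skipped
    _ep("2018-07-04", "Scripted", None),  # no network, skipped
]
yearly = count_by_air_year(sample)
```

Episodes with an empty airdate or a show without a network fall out of the count silently in this sketch, which is worth checking when a single year looks unexpectedly low.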


JuanArango wrote 4 years ago: 1

alecwinshel wrote:
Only scripted US episodes that aired in that calendar year - not the time that they were added.

I find the huge difference very weird. The only person who can answer this is probably david :)

alecwinshel wrote 4 years ago: 1

JuanArango wrote:
I find the huge difference very weird. The only person who can answer this is probably david :)

Dang - okay! Hope he catches this thread.


david wrote 4 years ago: 1

alecwinshel wrote:
I wrote a Python script to parse through all episode numbers in the database (140 at a time) using:

http://api.tvmaze.com/episodes/#?embed=show

It's such a rich database prior to 2018 that it's hard to understand the drop-off. I would have thought that recent years would be even more completely filled out.

First of all, this sounds like an awfully inefficient method compared to using the schedule or shows/:id/episodes endpoints.
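
As a rough sketch of the schedule-based approach, assuming the schedule endpoint takes `country` and `date` query parameters, one request per calendar day covers a whole year in a few hundred calls instead of one call per episode ID. The function names are made up for illustration, and `fetch_json` is defined but not called here; the API is rate limited, so a real run should pause between requests.

```python
import json
import urllib.request
from datetime import date, timedelta

BASE = "http://api.tvmaze.com"


def schedule_urls(year, country="US"):
    """Yield one schedule URL for every calendar day of the given year."""
    day = date(year, 1, 1)
    while day.year == year:
        yield f"{BASE}/schedule?country={country}&date={day.isoformat()}"
        day += timedelta(days=1)


def fetch_json(url):
    """Download and decode one JSON response (performs a network call)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


urls_2018 = list(schedule_urls(2018))
```

Each response should be a list of the episodes airing that day with their show attached, so the per-day results can be filtered and tallied without fetching shows separately.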


david wrote 4 years ago: 1

I crunched the numbers on my end for you, only considering episodes from US broadcast networks (so ignoring web channels like Netflix). That gives me 6172 episodes in 2017 and 5067 in 2018. So it sounds like you were on the right track for 2017, but something broke in your 2018 calculations.
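
To illustrate the network versus web channel distinction, here is a minimal sketch that assumes a show record carries a `network` object for broadcast and cable shows and a `webChannel` object for streaming ones. It is not the query actually run on the server side; the function names and the sample shows are hypothetical.

```python
from collections import Counter


def channel_kind(show):
    """Classify a show as "network", "web", or "unknown".

    Assumption: a show that lists a network is counted as a network show
    even if it also lists a web channel.
    """
    if show.get("network"):
        return "network"
    if show.get("webChannel"):
        return "web"
    return "unknown"


def tally_kinds(shows):
    """Count how many shows fall into each channel kind."""
    return Counter(channel_kind(show) for show in shows)


# Made-up show records for illustration only.
shows = [
    {"network": {"country": {"code": "US"}}, "webChannel": None},
    {"network": None, "webChannel": {"name": "Netflix"}},
    {"network": None, "webChannel": None},
]
kinds = tally_kinds(shows)
```

Running the same split over a full year of downloaded episodes would show how much of the total comes from web channels, which a network-only count leaves out.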
