Hi,
First of all, I applaud the effort here to create a new set of data for TV information. Great job on picking Elasticsearch for the API, too.
Secondly, apologies if these issues have been raised before, but the forum search doesn't work that well ;-)
My suggestions:
- The API offers an endpoint for scraping data en masse for those looking to build their own cache. This won't scale very well (for you or me) once there are 100k+ shows. Would it be possible for you to provide a dump of the database in tar.gz format or something similar? It would be quicker for us to download and process, and it would also be a lot easier on your servers. My suggestion would be to drop it on S3 or even Dropbox automatically.
- Enhance the show search to support passing in a premiere year. See [1] below for more info on this.
- Enhance the show search to support passing in the show's country. See [2] below for details.
[1] A complicated example is the TV show "The Bridge", which is shown on the BBC. However, the show is listed as "Bron / Broen", which is the name it was given in Sweden/Denmark, where it originally aired. If you search the API using "The Bridge" it is listed, but a lesser-known US TV show comes first. Whilst I understand why that happens, as it's doing a fuzzy match on the title, it could be made more accurate by applying weight to extra parameters. If I were able to pass in the premiere year (2011), for example, that could act as a direct filter to get rid of results that aren't applicable.
[2] Similarly to [1], if I were able to pass in SE as the show's country of origin, it could act as a filter or as additional weight to help find the right show a little more easily.
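To make [1] and [2] a bit more concrete, here is a rough Python sketch of what such a request might look like from the client side. The `year` and `country` parameter names are made up for illustration only; they are not existing API parameters as far as I know, and the final naming would of course be up to you.

```python
from urllib.parse import urlencode

# Existing search endpoint; only the "q" parameter is known to exist.
SEARCH_URL = "http://api.tvmaze.com/search/shows"


def build_search_url(title, year=None, country=None):
    """Build a show-search URL.

    `year` and `country` are HYPOTHETICAL parameters illustrating the
    suggestion; the API does not necessarily accept them today.
    """
    params = {"q": title}
    if year is not None:
        params["year"] = year        # proposed: premiere year, e.g. 2011
    if country is not None:
        params["country"] = country  # proposed: country of origin, e.g. "SE"
    return SEARCH_URL + "?" + urlencode(params)
```

For the example in [1], `build_search_url("the bridge", year=2011, country="SE")` would produce a single request that carries both hints alongside the title.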
N.B. I know that the filters can in theory be applied client-side, but in [1] the real show actually appears quite low in the rankings (position 5) and with low confidence (see http://api.tvmaze.com/search/shows?q=the%20bridge). Also, in this case, where the language is different, I would not feel confident picking result #5 client-side, as the only kind of naive check I can do is to see if the names match up, and here they don't. I guess one solution would be to list the AKAs in the search results too.
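For reference, the client-side workaround I have in mind looks roughly like the Python sketch below. It assumes each search result is an object with a `show` field carrying a `premiered` date string and a nested `network.country.code`; that shape is my assumption about the response, not a documented contract, and `filter_results` is just a name I picked.

```python
def filter_results(results, year=None, country=None):
    """Keep search results matching the premiere year and/or country code.

    Assumed (not guaranteed) shape of each result:
      {"score": ..., "show": {"name": ..., "premiered": "YYYY-MM-DD",
                              "network": {"country": {"code": "SE"}}}}
    Missing or null fields are treated as "no match" for that filter.
    """
    kept = []
    for result in results:
        show = result.get("show") or {}
        premiered = show.get("premiered") or ""
        network = show.get("network") or {}
        code = (network.get("country") or {}).get("code")

        # Drop results whose premiere date does not start with the given year.
        if year is not None and not premiered.startswith(str(year)):
            continue
        # Drop results whose network country code differs from the given one.
        if country is not None and code != country:
            continue
        kept.append(result)
    return kept
```

This only helps if the right show is somewhere in the returned results, and it means fetching and discarding results that a server-side filter would never have sent in the first place.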