[personal profile] mellowtigger
Just in time for folk going to CONvergence to have trivia for dazzling the natives! And how appropriate that their webpage shows the MST:3K profile. *laugh* Suppose there was a government experiment in which a person born right now was forced to watch television every single day for the rest of their lives. Is there already enough Hollywood content to meet the need?

Short answer: Yes, but only if you require the subject to also view non-English films.

I've pondered this matter before, but last week at Bear Coffee I said out loud that I was thinking about downloading IMDB data and determining for myself whether there was already more video/film content than a person could ever watch. I finally got around to following through on the commitment. :)

It turns out that IMDB is a huge mess that somehow, somehow continues to work. It is a collection of flat text files (yes, that's right), each devoted to a particular set of data. Records do not have key fields (yes, really). The primary field is just the name of the movie, which the end user can enter with any combination of double and single quotes that they'd like. (Really, such a headache for a programmer trying to do string processing.) The other two fields in the "running-times" table are supposed to indicate the number of minutes and the number of episodes, respectively. But there is no data validation, because users can enter anything they want, just like a wiki. Minutes might be listed as a whole number "60", or as minutes and seconds "20:47", or as a range "28-29", or the field might contain whatever text notes a person thought to be useful information. (Yes, the internet movie so-called database really is this bad.) Oh, and the field that's supposed to say how long a movie is in minutes, well, that's also the place where they dump the country-of-origin information. *boggle*
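For the curious, here is a rough Python sketch of the kind of cleanup that field needs. It is not my OOBasic macro, just an illustration. The function name `parse_running_time`, the assumed "Country:" prefix form, and the choice to average a range are all assumptions made for the example, and the sample strings are made up to match the shapes described above.

```python
import re
from typing import Optional, Tuple


def parse_running_time(raw: str) -> Tuple[Optional[str], Optional[float]]:
    """Return (country, minutes) from one messy running-time field.

    Either value is None when it cannot be determined.
    """
    text = raw.strip()
    country = None

    # Assumption: country of origin, when present, is a leading
    # "Name:" prefix that starts with a letter (so "20:47" is not
    # mistaken for a country).
    prefix = re.match(r"^([A-Za-z][A-Za-z .\-]*):\s*(.*)$", text)
    if prefix:
        country = prefix.group(1).strip()
        text = prefix.group(2).strip()

    # Minutes and seconds, e.g. "20:47".
    m = re.fullmatch(r"(\d+):(\d{2})", text)
    if m:
        return country, int(m.group(1)) + int(m.group(2)) / 60

    # A range, e.g. "28-29". Averaging the ends is an arbitrary choice.
    m = re.fullmatch(r"(\d+)\s*-\s*(\d+)", text)
    if m:
        return country, (int(m.group(1)) + int(m.group(2))) / 2

    # A plain whole number of minutes, e.g. "60".
    if re.fullmatch(r"\d+", text):
        return country, float(text)

    # Free-text notes or anything else: give up on this record.
    return country, None


if __name__ == "__main__":
    for sample in ["60", "20:47", "28-29", "USA:90", "about an hour"]:
        print(repr(sample), "->", parse_running_time(sample))
```

Anything that comes back with `None` for minutes would land in the "unable to clean up" pile rather than being guessed at.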

Nevertheless, I managed to import it into an OpenOffice database. I produced the following stats:
890,100 KB memory needed by OpenOffice to import the data using my OOBasic macro
466,181 records processed
 19,926 records that I was unable to clean up well enough to determine the minutes
446,255 records left with countable data

29,506,703 minutes total
491,778 hours total
35,127 days total (allowing 14 hours per day for continuous viewing)
96 years total (which exceeds average lifespan for both males and females)

If I limit consideration to entries that either do not specify a country of origin or specifically mention the USA, UK, Canada, or Australia, then I assume I'm looking mostly at just the English-language movies. Those results are as follows:
22,427,409 minutes English
373,790 hours English
26,699 days English (at 14 hours per day)
73 years total English

Which makes it doable, but just barely: 73 years of English-only viewing squeaks by within the average lifespan, so the subject would run out of material. For Americans, the current average male lives 75 years and the average female lives 80 years. All of my numbers are underestimates, I should point out. I'm not convinced that my macro for cleaning up the episode-count data did a very good job. It looks like most everything was counted as a movie with only 1 episode, rather than allowing for TV series which may have had multiple episodes. My totals may be revised upwards if I ever decide to further clean up the awful IMDB table.
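The conversion from minutes to years can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name `viewing_time` is made up for the example, and the 365-day year is an assumption (the 14 hours per day comes from the figures above).

```python
def viewing_time(total_minutes, hours_per_day=14, days_per_year=365):
    """Convert total minutes of content into whole hours, days, and years.

    days_per_year=365 is an assumption; fractions are dropped at the end.
    """
    hours = total_minutes / 60
    days = hours / hours_per_day
    years = days / days_per_year
    return int(hours), int(days), int(years)


if __name__ == "__main__":
    print(viewing_time(29_506_703))  # all entries: (491778, 35127, 96)
    print(viewing_time(22_427_409))  # English only: (373790, 26699, 73)

    # Compare against the average lifespans quoted in the post.
    for label, lifespan in [("male", 75), ("female", 80)]:
        _, _, english_years = viewing_time(22_427_409)
        print(label, "outlives the English-only content:", lifespan > english_years)
```

Changing `hours_per_day` shows how sensitive the answer is to the viewing schedule: at 12 hours a day, the same English-only content would stretch to about 85 years.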

Where's Joel when you need him?
