Audience fragmentation is going from bad to worse.
This fragmentation is wrecking effective campaign reach and creating a massive frequency imbalance.
Audience re-aggregation will be key for brand advertisers to maintain scale.
TV is not going to the web. The web is going to television.
The Huntington copy is one of eleven surviving copies printed on vellum, and one of three such copies in the United States. An additional thirty-six copies printed on paper also survive.
Our claim to the world's largest actionable set of TV viewing data, at 75 terabytes, would be hard for anyone to challenge. The fact that we link schedule information, set-top box data and ratings data makes it even more difficult to challenge. The most interesting discovery was that we're 3x larger than Nielsen's biggest single-instance transactional datastore. (Netezza has similar kinds of multiplying factors as our data storage scheme, Hadoop.)

The Numbers:
Wal-Mart: 1 petabyte (800 million transactions/day across 7000 stores globally) (3) (This is probably in a combination of HP Neoview and Teradata.)
Yahoo!: 700 terabytes (1) (Doesn't include their Hadoop cluster, which is approx. 15 petabytes.)
Australian Bureau of Statistics: 250 terabytes (1)
AT&T: 250 terabytes (1)
AC Nielsen: Largest single instances: Netezza: 20 terabytes, Oracle: 10 terabytes (500 terabytes TOTAL in Netezza, 45 terabytes in Oracle). Most are distributed databases with client data. (1)(2)
Adidas: 13 terabytes

Largest Hadoop cluster (4):
Facebook: 30 petabytes of storage

---------------------------------------------
The fine print
NOTES:

(1) From Oracle F1Q10 Earnings Call, September 16, 2009, 5:00 pm ET, Transcript (Charles E. Phillips Jr.)
Yahoo!: 700 terabytes
Australian Bureau of Statistics: 250 terabytes
AT&T: 250 terabytes
AC Nielsen: 45-terabyte data [mart], they called it
Adidas: 13 terabytes

(2) DBMS2, September 29, 2009: What Nielsen really uses in data warehousing DBMS
In its latest earnings call, Oracle made a reference to The Nielsen Company that was, to put it politely, rather confusing. I just plopped down in a chair next to Greg Goff, who evidently runs data warehousing at Nielsen, and had a quick chat. Here’s the real story.
The Nielsen Company has over half a petabyte of data on Netezza in the US. This installation is growing.
The Nielsen Company indeed has 45 terabytes or whatever of data on Oracle in its European (Customer) Information Factory. This is not particularly growing.
Nielsen’s Oracle data warehouse has been built up over the past 9 years. It’s not new. It’s certainly not on Exadata, nor planned to move to Exadata.
These are not single-instance databases. Nielsen’s biggest single Netezza database is 20 terabytes or so of user data, and its biggest single Oracle database is 10 terabytes or so.
Much (most?) of the rest of the installations are customer data marts and the like, based in each case on the “big” central database. (That’s actually a classic data mart use case.) Greg said that Netezza’s capabilities to spin out those databases seemed pretty good.
That 10 terabyte Oracle data warehouse instance requires a lot of partitioning effort and so on in the usual way.
Nielsen has no immediate plans to replace Oracle with Netezza.
Nielsen actually has 800 terabytes or so of Netezza equipment. Some of that is kept more lightly loaded, for performance.

(3) Stair, Principles of Information Systems, 2009, p 181.

(4) Dhruba Borthakur, who is the Hadoop Engineer for Facebook. 30 petabytes in December 2010. This is really interesting.... http://www.facebook.com/note.php?note_id=468211193919
In May 2010:
The Datawarehouse Hadoop cluster at Facebook has become the largest known Hadoop storage cluster in the world. Here are some of the details about this single HDFS cluster:
21 PB of storage in a single HDFS cluster
2000 machines
12 TB per machine (a few machines have 24 TB each)
1200 machines with 8 cores each + 800 machines with 16 cores each
32 GB of RAM per machine
15 map-reduce tasks per machine
That's a total of more than 21 PB of configured storage capacity! This is larger than the previously known Yahoo!'s cluster of 14 PB. Here are the cluster statistics from the HDFS cluster at Facebook:
Two reasons for light viewing:
Modality. People have busy lives.
Fragmentation to lower measured networks.

The heaviest viewers watch 3X the volume of television of the average viewer.
The lightest viewers watch 5% the volume of television of the average viewer.
60% of the television audience accounts for 90% of television viewing (and therefore ad impressions). Call them the Heavier Viewers.
The remaining 40% of the viewers account for only 10% of total attention to television. These Lighter Viewers’ attention to television generates less than 1/10 the volume of impressions that a Heavier Viewer does.
Without careful planning based on the best possible data resource, every 12 impressions an advertiser buys will yield one unit of reach against the 40% of the audience that are Lighter Viewers.
The ratio of Heavier Viewer viewing to Lighter Viewer viewing varies by network. Networks with a relatively greater share of viewing attributable to heavier viewers will tend to accumulate audience more slowly than networks with a lower share of viewing attributable to heavier viewers. All else equal, impressions on networks with more heavier-viewer viewing will create more frequency and less reach than impressions on networks with less heavier-viewer viewing.
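The reach-versus-frequency point can be illustrated with a rough sketch. Everything specific here is an assumption made for illustration, not taken from the data behind these notes: a population of 100 viewers (60 heavier, 40 lighter), 200 impressions, a hypothetical "network B" where heavier viewers account for 70% of viewing, and a simple model in which each impression lands independently on a viewer in proportion to that viewer's share of viewing.

```python
def expected_reach(heavy_viewing_share, impressions, n_heavy=60, n_light=40):
    """Expected number of distinct viewers hit at least once.

    Assumes each impression lands independently on one viewer, with
    probability proportional to that viewer's share of total viewing.
    """
    # Per-viewer chance of receiving any single impression.
    p_heavy = heavy_viewing_share / n_heavy
    p_light = (1.0 - heavy_viewing_share) / n_light
    # A viewer is reached unless every impression misses them.
    reached_heavy = n_heavy * (1.0 - (1.0 - p_heavy) ** impressions)
    reached_light = n_light * (1.0 - (1.0 - p_light) ** impressions)
    return reached_heavy + reached_light


IMPRESSIONS = 200  # hypothetical campaign size

# Network A: heavier viewers supply 90% of viewing (the split in the notes).
reach_a = expected_reach(0.90, IMPRESSIONS)
# Network B: hypothetical network with only 70% from heavier viewers.
reach_b = expected_reach(0.70, IMPRESSIONS)

# Average frequency = impressions delivered per viewer reached.
freq_a = IMPRESSIONS / reach_a
freq_b = IMPRESSIONS / reach_b

print(f"A: reach {reach_a:.1f}, frequency {freq_a:.2f}")
print(f"B: reach {reach_b:.1f}, frequency {freq_b:.2f}")
```

Under these assumptions the network with the heavier-viewer skew ends up with lower reach and higher average frequency for the same number of impressions, which is the direction the notes describe.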
SYFY, 2010.02.28 7:00:00PM to 2010.10.14 12:30PM
10645 observations for 514 stations
Sometimes easy to spot
Files corrupted
What about inconsistency in field-level data?
Possibly a logging problem at the STB level?
Possibly an aggregation problem?
Learning the difference between “bank” of a river vs “bank” as a place where you put your money.
In search we called this the “Madonna problem”: Madonna the religious icon vs Madonna the pop culture icon.
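One simple way to picture this kind of disambiguation is keyword overlap with the surrounding context, loosely in the spirit of the Lesk approach. The sketch below is only an illustration: the sense labels and cue-word lists are hand-made and hypothetical, and nothing here is claimed to be the method actually used on the TV data.

```python
import re

# Hypothetical, hand-made cue words for each sense of an ambiguous term.
SENSES = {
    "bank": {
        "river": {"river", "water", "shore", "fishing", "mud"},
        "finance": {"money", "deposit", "loan", "account", "cash"},
    },
    "madonna": {
        "religious": {"church", "painting", "icon", "child", "altar"},
        "pop": {"album", "concert", "tour", "singer", "song"},
    },
}


def disambiguate(word, sentence):
    """Pick the sense whose cue words overlap most with the sentence."""
    context = set(re.findall(r"[a-z]+", sentence.lower()))
    senses = SENSES[word]
    # Ties fall back to the first sense listed for the word.
    return max(senses, key=lambda sense: len(senses[sense] & context))


print(disambiguate("bank", "We sat on the bank of the river."))
print(disambiguate("bank", "I put my money in the bank."))
print(disambiguate("madonna", "Madonna announced a new concert tour."))
```

A real system would learn the cue words from data rather than list them by hand, but the overlap idea is the same.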
Nielsen has Over The Air, Analog, Digital.
Imputed Nielsen’s numbers.
The first chart plots, for women aged 18-54 (F18-54), their view time as a fraction of view time for all TV viewers in week 2 against the same fraction in week 1 (two weeks in January). The data is for three markets: Philadelphia in blue, Atlanta in red and Chicago in green. Each point represents a zip code in one of these markets. The second chart is similar but for men 18-54 (M18-54).
The distance of a point from the diagonal line represents the variation from one week to the next for that zip code. The separation along the diagonal line represents the varying fraction of adult women between the zip codes. For example, if there had been no change from the first week to the second, all points would lie on the diagonal.
We see strong overlap of all three markets; they can't be separated in these views. However, we see significant spread in the fraction of the F18-54 group and the M18-54 group between the zip codes that compose these markets. Women appear to show more geographic variation in their viewing habits.
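The geometry of these charts can be sketched as follows. Each zip code is a point (week 1 fraction, week 2 fraction); its signed perpendicular distance from the diagonal measures week-to-week change, and its position along the diagonal reflects the overall level of the fraction. The zip labels and view-time figures below are made up for illustration and are not from the charts.

```python
import math


def view_fraction(group_view_time, total_view_time):
    """Group view time as a fraction of all view time in a zip code."""
    return group_view_time / total_view_time


def diagonal_offset(week1_fraction, week2_fraction):
    """Signed perpendicular distance of (week1, week2) from the line y = x."""
    return (week2_fraction - week1_fraction) / math.sqrt(2)


def diagonal_position(week1_fraction, week2_fraction):
    """Position of the point's projection along the line y = x."""
    return (week1_fraction + week2_fraction) / math.sqrt(2)


# Hypothetical data: (group view time, total view time) per week, in hours.
zip_codes = {
    "zip_a": {"week1": (300.0, 1000.0), "week2": (310.0, 1000.0)},
    "zip_b": {"week1": (150.0, 1000.0), "week2": (150.0, 1000.0)},
}

for name, weeks in zip_codes.items():
    f1 = view_fraction(*weeks["week1"])
    f2 = view_fraction(*weeks["week2"])
    print(name, round(diagonal_offset(f1, f2), 4), round(diagonal_position(f1, f2), 4))
```

A zip code with no week-to-week change, like the hypothetical zip_b, has an offset of zero and sits exactly on the diagonal.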