2. INFORMATION PRINTED ON PAPER

Paper is the primary medium for the formal publication of information, although it represents only 0.01% of new information recorded in all media. Information is recorded on paper in three distinct genres: books; serials (including newspapers, mass market and trade periodicals, scholarly periodicals and newsletters) and documents printed or copied in offices. There are interesting differences among nations in the proportion of printed information in each of these genres: the United States produces far more office documents than other nations, the European Union produces more books and serials, and the rest of the world produces relatively more newspapers and mass market periodicals.

I. World Paper Printed Information Storage Capacity

A tree can produce about 80,500 sheets of paper, so about 786 million trees are required to produce the world's annual paper supply. The UNESCO Statistical Handbook for 1999 estimates that paper production provides 1,510 sheets of paper per inhabitant of the world on average, although in fact the inhabitants of North America consume 11,916 sheets of paper each (24 reams), and inhabitants of the European Union consume 7,280 sheets of paper annually (15 reams), according to the ENST report. At least half of this paper is used in printers and copiers to produce office documents.

Table 2.1: World Paper Production 1997 and 2001

Production Statistics | Million Metric Tons (1997) | Capacity in Terabytes (1997) | Million Metric Tons (2001) | Capacity in Terabytes (2001)
Printing and Writing Paper (World) | 90.0 | 540,000 | 94.8 | 568,800
Newsprint (World) | 36.0 | 432,000 | 37.8 | 453,600

Source: How much information 2003

World output of printing and writing paper was 94.8 million metric tons in 2001, and newsprint production was about 37.8 million metric tons. At a conversion of 6 GB of data per metric ton of paper, the printing and writing paper produced in 2001 has a net storage capacity of 568,800 TB, were every bit of it used for information recording. At a conversion of 12 GB of data per metric ton of newsprint, the newsprint produced in 2001 has a net storage capacity of 453,600 TB on the same assumption. If all of the writing paper and newsprint produced in 2001 were used to store printed information, this would be equivalent to about 1,022,400 TB, assuming each page was converted into one large image of 130 KB. This represents a 2% capacity rise from 1999.
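The capacity figures above follow from a single multiplication, which can be sketched as below. This is a minimal illustration, assuming decimal units (1 TB = 1,000 GB); the function and constant names are hypothetical and not part of the original study.

```python
# Conversion factors quoted in the text (decimal units assumed: 1 TB = 1,000 GB).
GB_PER_TON_WRITING_PAPER = 6
GB_PER_TON_NEWSPRINT = 12


def capacity_tb(million_metric_tons, gb_per_ton):
    """Storage capacity in terabytes for a tonnage given in millions of metric tons."""
    total_gb = million_metric_tons * 1_000_000 * gb_per_ton
    return total_gb / 1_000


# 2001 world production figures from Table 2.1.
writing_2001 = capacity_tb(94.8, GB_PER_TON_WRITING_PAPER)
newsprint_2001 = capacity_tb(37.8, GB_PER_TON_NEWSPRINT)
total_2001 = writing_2001 + newsprint_2001

print(round(writing_2001), round(newsprint_2001), round(total_2001))
```

The same function applied to the U.S. tonnages (22.166 and 5.77 million metric tons) gives the U.S. capacities discussed in the next subsection.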

U.S. Paper Printed Information Storage Capacity

Total U.S. paper and paperboard production for 2001 was 80.66 million metric tons, or 25.4% of the world's total. This percentage figure has fallen from previous years. At the time of the last study, U.S. production figures were closer to 30% of the world total. The U.S. produced about 22.166 million metric tons of printing and writing paper and approximately 5.77 million tons of newsprint.

Using the conversion rate of 6 GB of data per metric ton of paper, the printing and writing paper produced in the U.S. in 2001 has a net storage capacity of 132,996 TB, should every bit of it be used for information recording. Using the conversion of 12 GB of data per metric ton of newsprint, the newsprint produced in the U.S. in 2001 has a net storage capacity of 69,240 TB on the same assumption.

In 1999, the U.S. produced 23.8 million metric tons of printing and writing paper and 6.4 million metric tons of newsprint.

Table 2.2: United States Paper Production

Production Statistics | Million Metric Tons (1999) | Capacity in Terabytes (1999) | Million Metric Tons (2001) | Capacity in Terabytes (2001)
Printing and Writing Paper (U.S.) | 23.8 | 142,800 | 22.166 | 132,996
Newsprint (U.S.) | 6.4 | 76,800 | 5.77 | 69,240

Source: How much information 2003

If all of the writing paper and newsprint produced in the U.S. in 2001 were used to store printed information, this would be equivalent to about 202,236 TB, assuming each page was converted into one large image of 130 KB. This represents an 8% capacity decrease from 1999.

II. GENRES OF ORIGINAL INFORMATION STORED ON PAPER

A. World Flow

In this section, we examine the global flows and stock of original print media. In "World Flow", we discuss the various forms of original information produced around the world and in the United States. This refers to all forms of written information created worldwide, excluding copies, and stored in some form for posterity. The single largest component of print media flow is office documents. The term broadly refers to all paper storage of printed and written materials in various forms throughout the world. This is not surprising, as much original information is created yearly in offices and other institutions for their own (and sometimes public) consumption. Newspapers are a very significant source of original flow information, comprising the single largest source of the public consumption of information, followed by mass-market magazines and books. There has been a slight increase in the worldwide consumption of newspapers, despite the phenomenal increase in the use of the Internet in information gathering.

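Table 2.3: World Original Print Information Flow (2003)

Media Type (Data Sources and Year Data Cited) | Unique Items per Year | Conversion Factor | Total Terabytes (Annual Worldwide)
Books (UNESCO [???]) | 950,000 [???] books | Scanned image (600 dpi): 39 MB/book (300 page average) | 39 TB
Books (UNESCO [???]) | 950,000 [???] books | Digital compression: 7.8 MB/book | 8 TB
Books (UNESCO [???]) | 950,000 [???] books | Plain text: 0.75 MB/book | 0.8 TB
Newspapers (ISSN 2001) | 25,276 publications | Scanned image (600 dpi): 500 KB/page (30 page average) | 138.4 TB
Newspapers (ISSN 2001) | 25,276 publications | Digital compression: 100 KB/page | 27.7 TB
Newspapers (ISSN 2001) | 25,276 publications | Plain text: 10 KB/page | 2.7 TB
Mass Market and Trade Periodicals (Ulrich’s 2001) | 80,000 publications | Scanned image (600 dpi): 13 MB/issue (96 page average) | 52 TB
Mass Market and Trade Periodicals (Ulrich’s 2001) | 80,000 publications | Digital compression: 2.6 MB/issue | 10 TB
Mass Market and Trade Periodicals (Ulrich’s 2001) | 80,000 publications | Plain text: 0.25 MB/issue | 1 TB
Scholarly Periodicals (Ulrich’s 2001) | 37,609 publications | Scanned image (600 dpi): 27 MB/issue (208 page average) | 6.0 TB
Scholarly Periodicals (Ulrich’s 2001) | 37,609 publications | Digital compression: 5.4 MB/issue | 1.3 TB
Scholarly Periodicals (Ulrich’s 2001) | 37,609 publications | Plain text: 0.5 MB/issue | 0.1 TB
Newsletters (Ulrich’s 2001) | 40,000 publications | Scanned image (600 dpi): 1.6 MB/item (12 page average) | 0.9 TB
Newsletters (Ulrich’s 2001) | 40,000 publications | Digital compression: 0.3 MB/item | 0.2 TB
Newsletters (Ulrich’s 2001) | 40,000 publications | Plain text: 0.03 MB/item | 0.02 TB
Archiveable, original office documents (National Archives) | 10^7 * 1,075 pages (10.75 Bn pgs) | Scanned image (600 dpi): 130 KB/page | 1397.5 TB
Archiveable, original office documents (National Archives) | 10^7 * 1,075 pages (10.75 Bn pgs) | Digital compression: 26 KB/page | 279.5 TB
Archiveable, original office documents (National Archives) | 10^7 * 1,075 pages (10.75 Bn pgs) | Plain text: 2.5 KB/page | 27.9 TB
Totals | - | Scanned | 1634 TB
Totals | - | Compressed | 327 TB
Totals | - | Text | 33 TB

Source: How much information 2003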

These estimates are based upon trade publications. However, the ENST study has made much higher estimates for serials and office documents.

Serials. The ISSN register (see International Standard Serial Number (ISSN) Statistics), used here only for newspapers, includes some 671,988 registered serials, 98% of which are paper based; we have used the Ulrich numbers because they eliminate duplication, and our goal is to estimate the number of original titles. The ENST numbers estimate that the U.S. produces 17.6% of world paper serial publications, and the European Union 44.5%. (ENST also estimates that the U.S. produces 17% of world books, the EU 42%, and the rest of the world 41%; they agree with our estimate of book production.)

Office Documents. Our estimate of printed office document production is based upon the practices of the U.S. National Archives and Records Administration (NARA), which archives about 2% of office documents produced by the Federal Government. The ENST study uses a methodology based upon working assumptions from the production of printing and copier paper: if 20% of printer paper contains original information (not copies), and if half of this is archived, then the world produces 2,238 TB of office documents. Interestingly, they estimate that 662 TB (29%) was produced by the EU and 775 TB (35%) by the U.S. Our 2% estimate is based upon a very narrow definition of "archival documents," since NARA essentially archives paper forever only if it is thought to have historical value; the ENST estimate that 10% of all paper is archived original information is very broad. We know of no empirical study that identifies the archival practices of different kinds of institutions (e.g., government, business, education) in different countries, so this issue remains on the research agenda.

NOTE ON OFFICE DOCUMENTS CONVERSIONS

To estimate the amount of information generated by offices, we looked at the statistics of the Federal Government, which is the single largest employer in the United States, with 1.9 million civilian workers and 1.5 million individuals in the armed services as of 2000. The Federal Government, in total, employs about 2.3% of the nation's workforce.

The National Archives in Washington D.C. retains 2% of what the government produces, across a range of media. NARA retains only what is deemed to be of some permanent historical value. Document types include correspondence, registers, reports, forms, treaties, case files, and log books. The perceived value determines how long a record will be retained--some will be kept indefinitely, while others are retained for no more than 6 months. An effort is made to prevent duplicating records but there is still some degree of overlap. The current textual archival holdings, as of October 2001 occupy a total of 21.5 million cubic feet. If we use the rule of thumb that 200 pieces of paper fit in one cubic foot (in an archive), then we arrive at a total of 4.3 billion pages.

Dividing 4.3 billion pages by the number of years NARA has existed (60 years) gives a rough collection rate of about 72 million pages per year.

The current accession rate, however, appears to be much higher. Each year, Federal agencies submit about 4,000 items and about 75% of these (3,000) are processed for archival. Although the Archives does not publish statistics on the average size of these items, it is known that NARA adds a total of 500,000 cubic feet of mostly paper-based records each year. As previously noted, in archives, 1 cubic foot can hold 200 pieces of paper, so the total annual accession rate is therefore about 100 million pages per year.

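If the Federal Government represents 2.3% of the nation's workforce, then one could estimate that United States organizations as a whole produce more than 4 billion archiveable pages each year, equivalent to about 560 terabytes as scanned images at 130 KB per page. The worldwide estimate of 10.75 billion pages in Table 2.3 corresponds to about 1,400 terabytes.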
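The chain of estimates in this note can be retraced with a few lines of arithmetic. The sketch below is illustrative only: it assumes decimal units (1 TB = 10^9 KB), and the names are hypothetical rather than taken from the study.

```python
# Inputs quoted in the note above.
PAGES_PER_CUBIC_FOOT = 200             # archive rule of thumb used in the text
ANNUAL_ACCESSION_CUBIC_FEET = 500_000  # NARA's yearly intake of mostly paper records
FEDERAL_SHARE_OF_WORKFORCE = 0.023     # Federal Government as 2.3% of the workforce
KB_PER_SCANNED_PAGE = 130              # scanned-image conversion factor


def pages_to_tb(pages, kb_per_page):
    """Convert a page count to terabytes (assumes 1 TB = 1e9 KB)."""
    return pages * kb_per_page / 1e9


# Federal accession rate in pages per year.
federal_pages = ANNUAL_ACCESSION_CUBIC_FEET * PAGES_PER_CUBIC_FOOT

# Scale up from the federal workforce share to the whole country.
national_pages = federal_pages / FEDERAL_SHARE_OF_WORKFORCE

print(f"Federal: {federal_pages / 1e6:.0f} million pages per year")
print(f"National: {national_pages / 1e9:.2f} billion pages per year")
print(f"U.S. scanned (4.3 Bn pages): {pages_to_tb(4.3e9, KB_PER_SCANNED_PAGE):.0f} TB")
print(f"World scanned (10.75 Bn pages): {pages_to_tb(10.75e9, KB_PER_SCANNED_PAGE):.1f} TB")
```

The scaling step assumes that other employers archive paper at the same per-worker rate as the Federal Government, which is the working assumption of the note rather than an established fact.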

Table 2.4: World Original Print Information Flow by Percent (2003)

Media Type | Total Terabytes (scanned) | Percent of total
Books | 39 TB | 2.4%
Newspapers | 138.4 TB | 8.5%
Mass Market Periodicals | 52 TB | 3.2%
Scholarly Periodicals | 6 TB | 0.37%
Newsletters | 0.9 TB | 0.05%
Office Documents | 1,397 TB | 85.5%
Total | 1,634 TB | 100%

Source: How much information 2003

B. United States Flow

The U.S. print information flow is dominated by office documents, but unlike for global data, the second most prevalent source of print information is not mass-market periodicals, but books. This is partly because the United States is a key player in the worldwide publishing industry, and partly because of its high educational levels and its local markets for books. Developing nations create proportionally more information in mass distribution channels such as magazines and newspapers than in books.

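Table 2.5: U.S. Original Print Information Flow (2003)

Media Type (Sources and Year Cited) | Unique Items per Year | Conversion Factor | Total Terabytes (Annual U.S.)
Books (ISBN) | 141,901 books | Scanned image (600 dpi): 39 MB/book (300 page average) | 5.5 TB
Books (ISBN) | 141,901 books | Digital compression: 7.8 MB/book | 1.1 TB
Books (ISBN) | 141,901 books | Plain text: 0.75 MB/book | 0.1 TB
Newspapers (Newspaper Association of America) | 10,170 publications | Scanned image (600 dpi): 500 KB/page (30 page average) | 13.5 TB
Newspapers (Newspaper Association of America) | 10,170 publications | Digital compression: 100 KB/page | 2.8 TB
Newspapers (Newspaper Association of America) | 10,170 publications | Plain text: 10 KB/page | 0.3 TB
Mass Market and Trade Periodicals (Bowker 2001) | 16,615 publications | Scanned image (600 dpi): 13 MB/issue (96 page average) | 3.5 TB
Mass Market and Trade Periodicals (Bowker 2001) | 16,615 publications | Digital compression: 2.6 MB/issue | 0.7 TB
Mass Market and Trade Periodicals (Bowker 2001) | 16,615 publications | Plain text: 0.25 MB/issue | 0.07 TB
Scholarly Periodicals (Bowker 2001) | 10,632 publications | Scanned image (600 dpi): 27 MB/issue (208 page average) | 1.6 TB
Scholarly Periodicals (Bowker 2001) | 10,632 publications | Digital compression: 5.4 MB/issue | 0.3 TB
Scholarly Periodicals (Bowker 2001) | 10,632 publications | Plain text: 0.5 MB/issue | 0.03 TB
Newsletters (Bowker 2001) | 11,581 publications | Scanned image (600 dpi): 1.6 MB/item (12 page average) | 0.3 TB
Newsletters (Bowker 2001) | 11,581 publications | Digital compression: 0.3 MB/item | 0.06 TB
Newsletters (Bowker 2001) | 11,581 publications | Plain text: 0.03 MB/item | 0.006 TB
Archiveable, original office documents (National Archives 2001) | 4.3 Bn pgs | Scanned image (600 dpi): 130 KB/page | 559 TB
Archiveable, original office documents (National Archives 2001) | 4.3 Bn pgs | Digital compression: 26 KB/page | 112 TB
Archiveable, original office documents (National Archives 2001) | 4.3 Bn pgs | Plain text: 2.5 KB/page | 11.6 TB
Totals | - | Scanned | 583.4 TB
Totals | - | Compressed | 117 TB
Totals | - | Text | 12 TB

Source: How much information 2003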

Interestingly, the information flow figures for the U.S. represent over 35% of the total original worldwide print information flow. While the U.S. is indeed the global leader in production and storage of information, these flow figures are largely influenced by office document use in the U.S. If the office documents component is removed from the equation, the U.S. accounts for slightly over 10% of the world's original information flow in print. This underlines the high use of paper in U.S. offices as compared to that in nations around the world.
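The two shares quoted above can be checked directly from the scanned-image totals in Tables 2.3 and 2.5. The following is a small sketch with hypothetical variable names, not code from the study.

```python
# Scanned-image totals in terabytes, taken from Tables 2.3 (world) and 2.5 (U.S.).
world_total_tb, world_office_tb = 1634.0, 1397.5
us_total_tb, us_office_tb = 583.4, 559.0

# U.S. share of all original print flow.
share_all = us_total_tb / world_total_tb

# U.S. share once office documents are removed from both totals.
share_without_office = (us_total_tb - us_office_tb) / (world_total_tb - world_office_tb)

print(f"U.S. share, all print: {share_all:.1%}")
print(f"U.S. share, excluding office documents: {share_without_office:.1%}")
```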

There have been some changes in the creation of new information between our previous study and the current one, but most of the changes have been fairly small except in the area of office documents. Contrary to notions of paperless offices floated in the late 80s and early 90s, the consumption of office paper has gone up substantially in recent years, especially following the move from dot matrix printers to laser and inkjet printers. Paper use in offices has risen further with the increasing speed of laser printing coupled with its decreasing cost. Each year, almost 500 billion copies are produced on copiers in the United States; nearly 15 trillion copies are produced on copiers, printers, and multi-function machines. (Source: Xerox PARC).

An increase in the amount of information on newsprint is seen; the proportion of this increase is slightly higher worldwide than in the U.S., possibly an outcome of an expansion of media markets in developing nations.

C. Rate of Change

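Table 2.6: World Flow: Comparison of 1999 Data with 2001 Data

Media Type | Format | World 1999 (TB) | World 2001 (TB) | U.S. 1999 (TB) | U.S. 2001 (TB)
Books | Scanned Images | 39 | 39 | 3 | 5.5
Books | Compressed | 8 | 8 | 0.5 | 1.1
Books | Plain Text | 1 | 1 | 0.05 | 0.1
Newspapers | Scanned Images | 124 | 138.4 | 13 | 13.5
Newspapers | Compressed | 25 | 27.7 | 3 | 2.8
Newspapers | Plain Text | 2.5 | 2.7 | 0.3 | 0.3
Mass Mkt Periodical | Scanned Images | 52 | 52 | 4.2 | 3.5
Mass Mkt Periodical | Compressed | 10 | 10 | 2.6 | 0.7
Mass Mkt Periodical | Plain Text | 1 | 1 | 0.26 | 0.07
Scholarly Periodical | Scanned Images | 9 | 6 | 2 | 1.6
Scholarly Periodical | Compressed | 2 | 1.3 | 0.5 | 0.3
Scholarly Periodical | Plain Text | 0.2 | 0.1 | 0.04 | 0.03
Newsletters | Scanned Images | 0.8 | 0.8 | 0.2 | 0.3
Newsletters | Compressed | 0.2 | 0.2 | 0.04 | 0.06
Newsletters | Plain Text | 0.02 | 0.02 | 0.004 | 0.006
Office Documents | Scanned Images | 975 | 1397.5 | 390 | 559
Office Documents | Compressed | 195 | 279.5 | 78 | 112
Office Documents | Plain Text | 19 | 26.9 | 7.5 | 11.6
Totals | Scanned Images | 1200 | 1637 | 421 | 583
Totals | Compressed | 240 | 327 | 84 | 117
Totals | Plain Text | 24 | 33 | 8.2 | 12

Source: How much information 2003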

However, there are other trends that are not shown in this table. Part of the rise in the amount of office documents consists of printouts of Internet information. Similarly, the amount of information now available online, such as news, has not severely affected the distribution of newspapers. What it has possibly done is alter the readership of the average newspaper. Consumption studies may show that the number of pages, and possibly even the types of news sections, read by the average newsreader has changed since the Internet came into use.

Consumption of major journals may have also shifted to online formats, as more are published online, as documented in Ulrich's periodicals directories from 1998, 2000, and 2002.

Increasing Shift to New Media by Scholarly Periodicals


Table 2.7: Increase of Digital Versions of Scholarly Periodicals

Year | Number of Periodicals With Internet Websites | Number of Periodicals With CD-ROM versions
1997 | 8,672 | 2,903
1999 | 14,757 | 4,625
2001 | 27,083 | 5,577

Source: Raw data from Ulrich's International Periodical Directory; Chart and Table, How much information 2003

Efforts to place contents online need no explanation. However, the data above is more interesting for the fact that the rate at which publishers have shifted to the use of CD-ROMs for digital storage has been slower than the rate at which they have replicated their print contents online. There was a gap of about 3.7% between the total number of online and the total number of CD-ROM versions of the print magazines; this has risen to over 13% in 4 years. Publishers clearly see creating and developing an Internet readership strategy as vital to the future of scholarly periodicals.

III. THE STOCK OF PRINTED INFORMATION - COPIES

In this segment we examine the amount of information existing in multiple copies of the original material discussed above. This dimension represents the use of paper as a distribution medium for the consumption of information.

A. Books

1. The flow of books.

World. We calculate that the total world flow of book copies was approximately 4.0 billion books, extrapolating once again from U.S. production figures [???]. This is equivalent to about 32,000 TB of digital data. In our last study, there were 2.75 billion books sold worldwide, which was equivalent to about 22,000 TB of digital data. This represents growth of over 45% since 1999.

United States. About 1.62 billion books were purchased in the United States in 2001, according to Barrie Rappaport at IPSOS-NPD. Using the 8 MB/book estimate, this is equivalent to 12,960 TB. In our last study, there were 1.6 billion books sold in the United States, which was equivalent to about 12,800 TB of digital data. This represents a growth of 1.25% since 1999.
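The book flow figures are copy counts multiplied by the 8 MB/book estimate. A minimal sketch of that arithmetic follows, assuming decimal units (1 TB = 10^6 MB); the helper names are hypothetical. At 8 MB per book, 4.0 billion copies come to 32,000 TB and 1.62 billion copies to 12,960 TB.

```python
MB_PER_BOOK = 8  # compressed-image estimate used in the text


def copies_to_tb(copies, mb_per_copy=MB_PER_BOOK):
    """Convert a number of book copies to terabytes (assumes 1 TB = 1e6 MB)."""
    return copies * mb_per_copy / 1e6


def growth(old, new):
    """Fractional growth from old to new."""
    return (new - old) / old


world_2001_tb = copies_to_tb(4.0e9)
world_1999_tb = copies_to_tb(2.75e9)
us_2001_tb = copies_to_tb(1.62e9)
us_1999_tb = copies_to_tb(1.6e9)

print(f"World: {world_1999_tb:,.0f} TB to {world_2001_tb:,.0f} TB "
      f"({growth(2.75e9, 4.0e9):.1%} growth)")
print(f"U.S.:  {us_1999_tb:,.0f} TB to {us_2001_tb:,.0f} TB "
      f"({growth(1.6e9, 1.62e9):.2%} growth)")
```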

2. The stock of books.

United States. In March 2003, 2,078,051 book titles were available and for sale in some form. In all there were 4,123,094 book titles, including "out of print" books, according to the Books in Print records provided by Andrew Grabois of R.R. Bowker. These numbers include only books made for commercial publication, not books or other materials meant for private circulation. During our last study in 1999, Books in Print listed about 1.66 million titles for sale and 3.2 million titles in all. This marks a 25% increase in books available for purchase and about a 30% increase in the number of listed books. Increasingly, out-of-print titles are available from publishers through 'just in time' printing, which may account for some of the increase. These listings represent only a fraction of the total number of books ever written. The growth of the Books in Print universe also indicates that the count of books being retroactively catalogued is changing constantly. If one wished to more fully address the universe of book titles in the United States, including government publications, books that are no longer in print, and some books that were written and submitted for copyright but never published commercially, one could look to the holdings of the larger national libraries and copyright repositories. For example, the Library of Congress print media collection includes 26 million titles (208 terabytes).

World. To estimate the international stock of books currently available for purchase, we extrapolate from the United States production figures. The U.S. engages in the world's largest trade in printed products, producing about 40% of the world's printed material, according to the U.S. Industry and Trade Outlook 2000. However, in the past, the U.S. has not been as dominant, though production was also significantly less. It could be speculated then that somewhat less than 40% of the worldwide stock of books exists in the U.S.; more specifically, this share is probably between 15% and 35%, more likely toward the lower end.

Using the same ballpark estimates, we can also estimate the worldwide stock of books (including those out of print). The national library and copyright repository of the United States, the Library of Congress, contains about 26 million books. Therefore, the world stock of original books might be between 74 million books and 175 million books (the lower figure if the U.S. represents 35% of the world's published books, the upper figure if the U.S. is closer to 15%).
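The range above comes from dividing the Library of Congress holdings by the assumed U.S. share of the world's books. A short sketch, with hypothetical names, shows the calculation; it yields roughly 74 million and 173 million, in line with the rounded range quoted in the text.

```python
LIBRARY_OF_CONGRESS_TITLES = 26_000_000  # holdings figure quoted in the text


def world_stock(us_titles, us_share):
    """Estimate world titles if the U.S. stock is the given share of the world stock."""
    return us_titles / us_share


low = world_stock(LIBRARY_OF_CONGRESS_TITLES, 0.35)   # U.S. holds 35% of world stock
high = world_stock(LIBRARY_OF_CONGRESS_TITLES, 0.15)  # U.S. holds 15% of world stock

print(f"{low / 1e6:.1f} million to {high / 1e6:.1f} million titles")
```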

NOTE ON eBooks

eBooks can be defined most broadly as books available in digital format. Within this definition, there are several sub-categories: books that are written purely for electronic media, books that are published in both print and electronic media, out-of-print books that have been digitized for archival purposes, and books that are out of print but are printed or supplied in electronic format. At the time of our last study, eBooks seemed to be a technology to consider closely for the future. In 2001, only 180,000 e-books were sold worldwide (roughly one ten-thousandth of U.S. book sales alone). At the most basic level, eBooks did not do well because of a preference for paper. But a number of other factors also influenced the growth of this technology.

B. Newspapers

Worldwide, at least 436.2 million copies of newspapers were sold daily in 2001, according to the World Association of Newspapers (WAN). From this, we calculate that at least 159.14 billion copies of newspapers were sold worldwide during the year. This represents 2,387,100 terabytes of data annually. The figure is an "at least" calculation, because it excludes data from several print material producing nations including China, Russia, Mexico and Pakistan. Closer examination of the statistics shows that the rise in worldwide newspaper sales has occurred almost exclusively in developing nations.
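The terabyte figure for newspaper copies is consistent with the conversion used elsewhere in this report: 30 pages per newspaper at 500 KB per scanned page, or 15 MB per copy. The sketch below is a hedged reconstruction of that arithmetic (decimal units assumed, names hypothetical), not a calculation published by the WAN.

```python
PAGES_PER_NEWSPAPER = 30
KB_PER_SCANNED_PAGE = 500
MB_PER_COPY = PAGES_PER_NEWSPAPER * KB_PER_SCANNED_PAGE / 1_000  # 15 MB per copy


def copies_to_tb(copies):
    """Convert a number of newspaper copies to terabytes (assumes 1 TB = 1e6 MB)."""
    return copies * MB_PER_COPY / 1e6


# Annual copies implied by the daily circulation figure.
annual_copies_from_daily = 436.2e6 * 365

world_tb = copies_to_tb(159.14e9)  # annual copies as quoted in the text
us_tb = copies_to_tb(17.4e9)       # U.S. annual copies quoted later in this section

print(f"Daily circulation x 365: {annual_copies_from_daily / 1e9:.2f} billion copies")
print(f"World: {world_tb:,.0f} TB per year")
print(f"U.S.:  {us_tb:,.0f} TB per year")
```

Multiplying the daily figure by 365 gives about 159.2 billion copies, slightly above the 159.14 billion quoted, so the published figure presumably reflects rounding of the daily circulation.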

Table 2.8: Daily Circulation (Worldwide - Millions), 1997-2001

Region | 1997 | 1998 | 1999 | 2000 | 2001 | % change
USA | 58.7 | 56.1 | 55.9 | 55.9 | 55.5 | -2.03
Japan | 72.7 | 72.4 | 72.2 | 71.9 | 71.7 | -1.38
EU | 82.2 | 81.4 | 80.5 | 80.1 | 78.7 | -4.3
Other | 204.5 | 204.1 | 212.4 | 228.3 | 230.3 | 12.57
Total | 416.2 | 414.2 | 421.2 | 434.2 | 436.2 | 4.81

Source: World Association of Newspapers. Note: Figures exclude Chile, Lithuania, Mali, Mexico, Pakistan, Russia, Serbia, South Korea, Taiwan

Table 2.9: Average Circulation Per Title (Worldwide – ‘000s)

Region | 1997 | 1998 | 1999 | 2000 | 2001 | % change
USA | 37.6 | 37.7 | 37.7 | 37.9 | 37.8 | -0.11
Japan | 667 | 670.5 | 662.8 | 653.8 | 676.4 | 3.48
EU | 71.7 | 71.5 | 71.1 | 71 | 69.5 | -2.94
Other | 72.2 | 73.7 | 69.2 | 67 | 68.8 | -4.78
Total | 74.3 | 74.7 | 72.7 | 71.3 | 72 | -3.08

Source: World Association of Newspapers. Note: Figures exclude Chile, Lithuania, Mali, Mexico, Pakistan, Russia, Serbia, South Korea, Taiwan

It is evident that the growth in newspapers has been away from developed economies, which seem to have a universal stagnation or fall in the sales of newspapers. The downward trend has been slow, yet the fall in circulation has been accompanied by a slight fall in the number of newspapers in existence.

The effects of both Internet technology and of the growing reach of education and access to media are seen in the worldwide newspaper sales trends.

Table 2.10: Selected Annual Copies Sold (Worldwide – Millions)

Country | Sales (Million) | Percent Change (5-years) | Last Year of Available Data
Belgium | 484 | -3.8 | 2001
Canada | 1753 | 2.3 | 2001
Costa Rica | 104 | 15.3 | 2001
Croatia | 173 | -3.1 | 2001
Czech Republic | 508 | -18.6 | 2001
Denmark | 519 | -9.6 | 2001
Estonia | 72 | 2.1 | 2001
Finland | 87 | -11.2 | 2001
Germany | 7368 | -5.6 | 2001
Greece | 194 | -6.1 | 2001
Hungary | 515 | -2 | 2001
India | 10893 | 19.8 | 2001
Ireland | 183 | 6.4 | 2001
Italy | 2096 | 1.4 | 2001
Latvia | 73 | -7.6 | 2001
Luxembourg | 36 | 9.1 | 2001
Netherlands | 1338 | -8.9 | 2001
New Zealand | 238 | -5.1 | 2001
Norway | 750 | -2.3 | 2001
Slovakia | 132 | -50.9 | 2001
South Africa | 282 | -3 | 2001
Spain | 1556 | 17 | 2001
Sri Lanka | 164 | 0.8 | 2001
Sweden | 1233 | -3.8 | 2001
Switzerland | 796 | -2.5 | 2001
Turkey | 1061 | -33.4 | 2001
United States | 17485 | -2 | 1999

Source: World Association of Newspapers.

Thus nations with higher access to the Internet have seen stagnations in newspaper circulation in the 5-year period between 1997 and 2001, while nations like India and China with rapidly expanding economies and consumer bases have seen fairly significant increases in newspaper circulation. Exact percentages are not available but China has seen an increase of roughly 20% in the circulation of dailies since 1997. This trend may also be partly attributed to urbanization and the increase in the number of young professionals (and households) in many developing nations.

According to the WAN, Ireland was the only developed nation with a significant positive increase in annual newspaper sales; growth was dominated by East European, South American and Asian nations. By contrast, most developed nations saw negative growth, with Turkey heading the list for the most significant fall in newspaper sales. Only two Asian nations, Sri Lanka and Mongolia, were on the negative growth list, which was otherwise dominated by West European nations.

U.S. Flow: Newspapers

According to the World Association of Newspapers (WAN), approximately 55.5 million copies of newspapers were sold daily in the U.S. in 2001. This represents a very slight decrease over a 5-year period. Similarly, the total flow of newspapers has decreased since 1997, but remains in the range of 17.4 billion copies sold yearly. This represents 261,000 terabytes of data annually. The number of copies sold has changed at a slightly different rate from the number of newspapers in circulation, suggesting either that some newspapers have merged, or simply that some newspapers ceased publishing and their readers moved to other publications.

Table 2.11: Selected Annual United States Newspaper Readership Trends

Measure | 1997 | 1998 | 1999 | 2000 | 2001 | 5 yr Change
Number of Newspapers | 1509 | 1489 | 1483 | 1476 | 1468 | -2.72
Number of Copies Sold (Million) | 58.7 | 56.1 | 55.9 | 55.9 | 55.5 | -2.03
Copies Per Publication ('000s) | 37.6 | 37.7 | 37.7 | 37.9 | 37.8 | -0.11

Source: World Association of Newspapers (WAN)

C. Mail

Table 2.12 below, with facts about postal mail in the United States, is from Odlyzko (2000), supplemented by the most recent U.S. Postal Service Annual Report (2002). About half of all mail is currently first class and about half is junk mail. If we assume 2 pages per piece of mail and digitize it at 15 KB per page, 2002 U.S. mail is about 6.2 petabytes per year. This represents an increase of about 0.3 petabytes over the 2000 study.

The U.S. Postal Service handles approximately 40 percent of the world's card and letter mail volume. Next largest: Japan with 6 percent. The total volume of mail worldwide equals 15.6 petabytes.
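The mail estimate can be sketched as follows, assuming decimal units (1 PB = 10^12 KB) and using hypothetical names. With the 2001 volume from Table 2.12 (207,463 million pieces) the result is about 6.2 PB; with the 2002 volume (202,822 million pieces) it is closer to 6.1 PB, so the figures in the text appear to rest on a volume of roughly 207 billion pieces.

```python
PAGES_PER_PIECE = 2
KB_PER_PAGE = 15
KB_PER_PB = 1e12               # decimal units assumed
US_SHARE_OF_WORLD_MAIL = 0.40  # "approximately 40 percent" in the text


def mail_petabytes(pieces_millions):
    """Digitized size in petabytes of a yearly mail volume given in millions of pieces."""
    pages = pieces_millions * 1e6 * PAGES_PER_PIECE
    return pages * KB_PER_PAGE / KB_PER_PB


us_2001 = mail_petabytes(207_463)
us_2002 = mail_petabytes(202_822)

# Scale the U.S. figure up by its share of world card and letter mail.
world_estimate = us_2001 / US_SHARE_OF_WORLD_MAIL

print(f"U.S. 2001: {us_2001:.2f} PB, U.S. 2002: {us_2002:.2f} PB")
print(f"World estimate: {world_estimate:.1f} PB")
```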

Table 2.12: Statistics about U.S. mail service, from Odlyzko (2000).

Year | Cost (millions) | Cost/GDP (percent) | Pieces (millions) | Mail per Person
1790 | 0.032 | 0.02 | 0.8 | 0.20
1800 | 0.214 | 0.05 | 3.9 | 0.73
1810 | 0.496 | 0.09 | 7.7 | 1.07
1820 | 1.161 | 0.18 | 14.9 | 1.55
1830 | 1.933 | 0.21 | 29.8 | 2.32
1840 | 4.718 | 0.28 | 79.9 | 4.68
1850 | 5.213 | 0.20 | 155.1 | 6.66
1860 | 14.87 | 0.39 | - | -
1870 | 24.00 | 0.33 | - | -
1880 | 36.54 | 0.35 | - | -
1890 | 66.26 | 0.51 | 4,005 | 63.7
1900 | 107.7 | 0.58 | 7,130 | 93.8
1910 | 230.0 | 0.65 | 14,850 | 161
1920 | 454.3 | 0.50 | - | -
1930 | 803.7 | 0.89 | 27,887 | 227
1940 | 807.6 | 0.81 | 27,749 | 211
1950 | 2,223 | 0.78 | 45,064 | 299
1960 | 3,874 | 0.77 | 63,675 | 355
1970 | 7,876 | 0.81 | 84,882 | 418
1980 | 19,412 | 0.70 | 106,311 | 469
1990 | 40,490 | 0.70 | 166,301 | 669
1998 | 57,778 | 0.68 | 197,943 | 733
1999 | 60,418 | - | 201,644 | -
2000 | 62,284 | - | 207,882 | 739
2001 | 63,425 | - | 207,463 | 717
2002 | 63,761 | - | 202,822 | 738

Source: The history of communications and its implications for the Internet, A. M. Odlyzko.

IV. CONVERSION FACTORS AND ASSUMPTIONS

Printing / writing paper production. We estimate that one ream (500 sheets) of standard grade 8 1/2 X 11" paper would weigh 5 pounds. Therefore each metric ton (2,204 pounds) equals about 440 reams or 220,000 sheets. Multiplying by 26 KB per page results in 6 GB per metric ton.

Newsprint production. We estimate that storage requirements would be about 2 times that for printing / writing paper: 12 GB per metric ton. Newspapers tend to contain more words and graphics per page, requiring, on average, 1 MB per scanned page. Furthermore, newsprint is thinner and lighter, so each metric ton contains more individual sheets.
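The 6 GB per metric ton factor for printing and writing paper can be retraced from the ream weight given above. The sketch below uses only the figures stated in the text, with decimal units assumed (1 GB = 10^6 KB) and hypothetical names; it gives about 5.7 GB, which the text rounds to 6 GB.

```python
POUNDS_PER_METRIC_TON = 2_204
POUNDS_PER_REAM = 5       # estimated weight of 500 sheets of 8 1/2 x 11" paper
SHEETS_PER_REAM = 500
KB_PER_PAGE = 26          # compressed page image, as used in the text

reams_per_ton = POUNDS_PER_METRIC_TON / POUNDS_PER_REAM   # about 440 reams
sheets_per_ton = reams_per_ton * SHEETS_PER_REAM          # about 220,000 sheets

# One page per sheet is assumed here, following the text.
gb_per_ton = sheets_per_ton * KB_PER_PAGE / 1e6

# The text takes newsprint to need about twice the storage per ton.
newsprint_gb_per_ton = 2 * round(gb_per_ton)

print(f"{reams_per_ton:.0f} reams, {sheets_per_ton:,.0f} sheets, {gb_per_ton:.2f} GB per metric ton")
print(f"Newsprint: about {newsprint_gb_per_ton} GB per metric ton")
```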

Books. We estimate 300 pages per book, for a total of 39 MB apiece. (Source: Robert M. Hayes, UCLA, "The Economics of Digital Libraries" www.usp.br/sibi/economics.html)

Newspapers. We estimate 30 pages per newspaper, then multiply that number by the total units produced every year. (The page count is low, to reflect the number of small and non-daily newspapers published around the world.) A double-truck (center fold) full broadsheet is 24 X 36 inches. Because a newspaper would be scanned at higher resolution and contains detailed graphics, a double-truck would require about 1 megabyte (uncompressed) and a single full broadsheet page (18 X 24 inches) would require about 0.5 MB.

Mass Market Periodicals. We estimate 96 pages per periodical per issue, averaging larger and smaller sized publications. (Similar calculation by: Robert M. Hayes, UCLA, "The Economics of Digital Libraries" www.usp.br/sibi/economics.html)

Newsletters. We estimate 150 pages per newsletter per year, based on a ballpark 12 pages per issue averaged over all newsletters. (Source: Oxbridge Directory of Newsletters - 1997)

Office documents. The estimate above is limited to documents that an organization might retain permanently such as documents comparable to those retained by the National Archives in Washington D.C., which estimates that they retain 2% of U.S. government documents produced each year. This study does not account for the documents generated and disposed of yearly without recording.

Table 2.13: Conversion Assumptions

Average # of Pages per Issue / Item | Storage Format | Average File Size Per Page | Conversion Factor Per Issue / Item (rounded)
300 (Books) | Scanned TIFF (600 dpi) | 130 KB | 39 MB
300 (Books) | Compressed | 26 KB | 7.8 MB
300 (Books) | Plain text | 2.5 KB | 0.75 MB
208 (Scholarly Periodicals) | Scanned TIFF (600 dpi) | 130 KB | 27 MB
208 (Scholarly Periodicals) | Compressed | 26 KB | 5.4 MB
208 (Scholarly Periodicals) | Plain text | 2.5 KB | 0.5 MB
96 (Mass Market Periodicals) | Scanned TIFF (600 dpi) | 130 KB | 13 MB
96 (Mass Market Periodicals) | Compressed | 26 KB | 2.6 MB
96 (Mass Market Periodicals) | Plain text | 2.5 KB | 0.25 MB
12 (Newsletters) | Scanned TIFF (600 dpi) | 130 KB | 1.6 MB/item
12 (Newsletters) | Compressed | 26 KB | 0.3 MB/item
12 (Newsletters) | Plain text | 2.5 KB | 0.03 MB/item

Source: Raw data from Bowker Inc., May 2003. Conversion factors to bytes from the ArchiveBuilders web page. Compilation by How much information 2003
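Each conversion factor in Table 2.13 is the average page count multiplied by the per-page file size. The following sketch reproduces that step under the assumption of decimal units (1 MB = 1,000 KB); the dictionary and function names are hypothetical. The table's published values are rounded, so the unrounded products differ slightly (for example, 96 pages at 130 KB gives 12.48 MB against the table's 13 MB).

```python
# Per-page sizes and average page counts from Table 2.13.
KB_PER_PAGE = {
    "Scanned TIFF (600 dpi)": 130,
    "Compressed": 26,
    "Plain text": 2.5,
}
PAGES_PER_ITEM = {
    "Books": 300,
    "Scholarly Periodicals": 208,
    "Mass Market Periodicals": 96,
    "Newsletters": 12,
}


def mb_per_item(pages, kb_per_page):
    """Size of one issue or item in megabytes (assumes 1 MB = 1,000 KB)."""
    return pages * kb_per_page / 1_000


for item, pages in PAGES_PER_ITEM.items():
    for storage_format, kb in KB_PER_PAGE.items():
        print(f"{item:24s} {storage_format:24s} {mb_per_item(pages, kb):6.2f} MB")
```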

Table 2.14: Academic and Scholarly Periodicals (Worldwide)

Frequency | Titles | Multiplier | Total Publications
Annual | 4885 | (*1) | 4,885
Biennial | 246 | (*0.5) | 123
Bimonthly | 3990 | (*6) | 23,940
Daily | 27 | (*365) | 9,855
Irregular | 7057 | (*4) | 28,228
Monthly | 7248 | (*12) | 86,976
Quarterly | 9814 | (*4) | 39,256
Semiannually | 4147 | (*2) | 8,294
Weekly | 195 | (*52) | 10,140
TOTAL | | | 211,697

Source: Raw data from Bowker Inc., May 2003
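The totals in Tables 2.14 through 2.17 are obtained by multiplying the number of titles at each publication frequency by the number of issues per year. A minimal sketch of that calculation for Table 2.14 is given below; the names are hypothetical, and the multiplier of 4 for irregular titles is the table's own working assumption.

```python
# Issues per year assumed for each publication frequency (the "Multiplier" column).
ISSUES_PER_YEAR = {
    "Annual": 1, "Biennial": 0.5, "Bimonthly": 6, "Daily": 365, "Irregular": 4,
    "Monthly": 12, "Quarterly": 4, "Semiannually": 2, "Weekly": 52,
}

# Title counts from Table 2.14 (academic and scholarly periodicals, worldwide).
SCHOLARLY_TITLES_WORLD = {
    "Annual": 4885, "Biennial": 246, "Bimonthly": 3990, "Daily": 27, "Irregular": 7057,
    "Monthly": 7248, "Quarterly": 9814, "Semiannually": 4147, "Weekly": 195,
}


def total_publications(titles, issues_per_year):
    """Sum of titles times issues per year over all frequencies."""
    return sum(count * issues_per_year[freq] for freq, count in titles.items())


total = total_publications(SCHOLARLY_TITLES_WORLD, ISSUES_PER_YEAR)
print(f"{total:,.0f} issues per year")
```

Swapping in the title counts from Tables 2.15 to 2.17 reproduces their totals in the same way.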

Table 2.15: Mass Market and Trade Periodicals (United States)

Frequency | Titles | Multiplier | Total Publications
Annual | 2354 | (*1) | 2,354
Biennial | 130 | (*0.5) | 65
Bimonthly | 2294 | (*6) | 13,764
Daily | 287 | (*365) | 104,755
Irregular | 627 | (*4) | 2,508
Monthly | 6834 | (*12) | 82,008
Quarterly | 2659 | (*4) | 10,636
Semiannually | 484 | (*2) | 968
Weekly | 946 | (*52) | 49,192
TOTAL | | | 266,250

Source: Raw data from Bowker Inc., May 2003

Table 2.16: Academic and Scholarly Periodicals (United States)

Frequency | Titles | Multiplier | Total Publications
Annual | 1268 | (*1) | 1,268
Biennial | 86 | (*0.5) | 43
Bimonthly | 1089 | (*6) | 6,534
Daily | 11 | (*365) | 4,015
Irregular | 1474 | (*4) | 5,896
Monthly | 1962 | (*12) | 23,544
Quarterly | 3450 | (*4) | 13,800
Semiannually | 1210 | (*2) | 2,420
Weekly | 82 | (*52) | 4,264
TOTAL | | | 61,784

Source: Raw data from Bowker Inc., May 2003

Table 2.17: Newsletters (United States)

Frequency | Titles | Multiplier | Total Publications
Annual | 653 | (*1) | 653
Biennial | 38 | (*0.5) | 19
Bimonthly | 1210 | (*6) | 7,260
Daily | 220 | (*365) | 80,300
Irregular | 588 | (*4) | 2,352
Monthly | 4670 | (*12) | 56,040
Quarterly | 2714 | (*4) | 10,856
Semiannually | 498 | (*2) | 996
Weekly | 990 | (*52) | 51,480
TOTAL | | | 209,956

Source: Raw data from Bowker Inc., May 2003


Release date: October 27, 2003. © 2003 Regents of the University of California